Update: MapD rebranded as OmniSci in 2018.
In my previous blog post I looked at benchmarking how fast MapD can query 1.1 billion taxi trips made in New York City over the course of six years. In that post I used a machine with 8 Nvidia Tesla K80 GPU cards and a host of other extremely high-end components. Though the query speeds were by miles the fastest I've ever seen, a machine of that caliber will easily run into the mid-5 figures in terms of cost.
In this blog post I'll be looking at running the same benchmark on a machine that should run at 1/10th the cost. I will avoid quoting the specific prices of each of the components used in my MapD-related blogs as they are constantly falling and would quickly date these posts.
And though these aren't apples-for-apples comparisons, I've benchmarked BigQuery, Elasticsearch, Presto on EMR and Dataproc, PostgreSQL and Redshift using the same dataset as I will in this post.
A Modest Machine Up & Running
For this benchmark I'll be using a 6-core (12-thread) Xeon E5-1650 v3 clocked at 3.5 GHz with 192 GB of RAM and 2 x 1 TB Samsung 840 EVO SSDs set up in a software-based RAID 0 configuration, giving 1.7 TB of usable capacity and around 1 GB/s of sequential read performance, all running on CentOS 7.2.1511.
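If you want to sanity-check the RAID array's sequential read speed, something along these lines should give a rough number. The /dev/md0 device path is an assumption on my part; /proc/mdstat will show the actual name of the software RAID device.
$ cat /proc/mdstat        # confirm the name of the RAID 0 device (assumed to be md0 below)
$ sudo hdparm -t /dev/md0 # timed buffered sequential reads, should report around 1 GB/s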
The GPU cards in question are 4 x Nvidia GeForce GTX TITAN X cards. They each come with 12,288 MB of GDDR5 memory that has a bandwidth capability of 336.5 GB/s and 6.144 teraflops of single-precision performance.
The following is the output from Nvidia's system management interface showing various diagnostics for the four GPUs.
$ nvidia-smi
+------------------------------------------------------+
| NVIDIA-SMI 352.93 Driver Version: 352.93 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... Off | 0000:02:00.0 Off | N/A |
| 28% 59C P8 17W / 250W | 12199MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX TIT... Off | 0000:03:00.0 Off | N/A |
| 28% 59C P8 16W / 250W | 12165MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX TIT... Off | 0000:04:00.0 Off | N/A |
| 28% 60C P8 17W / 250W | 12165MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX TIT... Off | 0000:05:00.0 Off | N/A |
| 22% 48C P8 16W / 250W | 10378MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 988 G /usr/bin/Xorg 17MiB |
| 0 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 12074MiB |
| 1 988 G /usr/bin/Xorg 17MiB |
| 1 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 12040MiB |
| 2 988 G /usr/bin/Xorg 17MiB |
| 2 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 12040MiB |
| 3 988 G /usr/bin/Xorg 17MiB |
| 3 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 10253MiB |
+-----------------------------------------------------------------------------+
Loading 1.1 Billion Trips into MapD
To start, I'll download the 104 GB of CSV data I created in my Billion Taxi Rides in Redshift blog post. This data sits in 56 GZIP files and decompresses into around 500 GB of raw CSV data.
$ cd /raidStorage/mark/
$ vi urls.txt
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaa.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xab.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xac.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xad.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xae.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaf.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xag.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xah.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xai.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaj.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xak.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xal.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xam.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xan.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xao.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xap.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaq.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xar.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xas.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xat.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xau.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xav.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaw.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xax.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xay.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaz.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xba.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbb.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbc.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbd.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbe.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbf.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbg.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbh.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbi.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbj.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbk.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbl.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbm.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbn.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbo.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbp.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbq.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbr.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbs.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbt.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbu.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbv.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbw.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbx.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xby.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbz.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xca.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xcb.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xcc.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xcd.csv.gz
$ cat urls.txt | \
xargs -n 1 -P 6 \
wget
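Before decompressing anything it's worth checking that all of the files arrived intact. The following is a quick sanity check; gzip -t test-decompresses each archive without writing anything to disk.
$ ls trips_x*.csv.gz | wc -l                   # should report 56 files
$ du -ch trips_x*.csv.gz | tail -n 1           # should total roughly 104 GB
$ ls trips_x*.csv.gz | xargs -n 1 -P 6 gzip -t # any corrupt archive will be reported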
I'll then create a schema for my trips table. The fragment size controls how many records go into each fragment, which in turn determines how MapD spreads the data across the GPUs. For this exercise I'll use a fragment size of 100,000,000 as there are only 4 cards. It's possible to set this value a lot lower than I have here; fingers crossed, in a future post I can go into further detail on this setting.
$ vi create_trips_table.sql
CREATE TABLE trips (
trip_id INTEGER,
vendor_id VARCHAR(3) ENCODING DICT,
pickup_datetime TIMESTAMP,
dropoff_datetime TIMESTAMP,
store_and_fwd_flag VARCHAR(1) ENCODING DICT,
rate_code_id SMALLINT,
pickup_longitude DECIMAL(14,2),
pickup_latitude DECIMAL(14,2),
dropoff_longitude DECIMAL(14,2),
dropoff_latitude DECIMAL(14,2),
passenger_count SMALLINT,
trip_distance DECIMAL(14,2),
fare_amount DECIMAL(14,2),
extra DECIMAL(14,2),
mta_tax DECIMAL(14,2),
tip_amount DECIMAL(14,2),
tolls_amount DECIMAL(14,2),
ehail_fee DECIMAL(14,2),
improvement_surcharge DECIMAL(14,2),
total_amount DECIMAL(14,2),
payment_type VARCHAR(3) ENCODING DICT,
trip_type SMALLINT,
pickup VARCHAR(50) ENCODING DICT,
dropoff VARCHAR(50) ENCODING DICT,
cab_type VARCHAR(6) ENCODING DICT,
precipitation SMALLINT,
snow_depth SMALLINT,
snowfall SMALLINT,
max_temperature SMALLINT,
min_temperature SMALLINT,
average_wind_speed SMALLINT,
pickup_nyct2010_gid SMALLINT,
pickup_ctlabel VARCHAR(10) ENCODING DICT,
pickup_borocode SMALLINT,
pickup_boroname VARCHAR(13) ENCODING DICT,
pickup_ct2010 VARCHAR(6) ENCODING DICT,
pickup_boroct2010 VARCHAR(7) ENCODING DICT,
pickup_cdeligibil VARCHAR(1) ENCODING DICT,
pickup_ntacode VARCHAR(4) ENCODING DICT,
pickup_ntaname VARCHAR(56) ENCODING DICT,
pickup_puma VARCHAR(4) ENCODING DICT,
dropoff_nyct2010_gid SMALLINT,
dropoff_ctlabel VARCHAR(10) ENCODING DICT,
dropoff_borocode SMALLINT,
dropoff_boroname VARCHAR(13) ENCODING DICT,
dropoff_ct2010 VARCHAR(6) ENCODING DICT,
dropoff_boroct2010 VARCHAR(7) ENCODING DICT,
dropoff_cdeligibil VARCHAR(1) ENCODING DICT,
dropoff_ntacode VARCHAR(4) ENCODING DICT,
dropoff_ntaname VARCHAR(56) ENCODING DICT,
dropoff_puma VARCHAR(4) ENCODING DICT
) WITH (FRAGMENT_SIZE=100000000);
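As a back-of-the-envelope check on the fragment size above: 1.1 billion rows at 100 million rows per fragment works out to around 11 fragments which, as I understand it, MapD then spreads across the 4 cards, leaving each GPU holding two or three fragments.
$ echo $((1100000000 / 100000000)) # number of ~100M-row fragments for 1.1 billion rows
11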
I'll create two environment variables with my credentials for MapD.
$ read MAPD_USERNAME
$ read MAPD_PASSWORD
$ export MAPD_USERNAME
$ export MAPD_PASSWORD
The following will create the table schema using the mapdql CLI tool.
$ mapdql mapd \
-u $MAPD_USERNAME \
-p $MAPD_PASSWORD \
< create_trips_table.sql
I'll then check that the table has been created:
$ echo "\t" | mapdql mapd -u $MAPD_USERNAME -p $MAPD_PASSWORD
User mapd connected to database mapd
trips
User mapd disconnected from database mapd
MapD doesn't support loading CSV data from GZIP files at this time, so I'll decompress the CSV files before loading them.
$ gunzip trips_x*.csv.gz
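The gunzip above works through the archives one at a time. The same xargs trick used for the downloads should work here as well if you'd rather decompress several files in parallel; this is optional and just a sketch.
$ ls trips_x*.csv.gz | \
    xargs -n 1 -P 6 \
    gunzip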
With the table and files in place I'll load the 500 GB of CSV data into MapD.
$ for filename in *.csv; do
echo "COPY trips
FROM '/raidStorage/mark/$filename'
WITH (header='false');" | \
mapdql \
mapd \
-u $MAPD_USERNAME \
-p $MAPD_PASSWORD
done
The above completed in 86 minutes and 24 seconds.
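At this point a quick row count is a good sanity check; if all 56 files loaded cleanly the following should come back at around 1.1 billion rows.
$ echo "SELECT COUNT(*) FROM trips;" | \
    mapdql \
        mapd \
        -u $MAPD_USERNAME \
        -p $MAPD_PASSWORD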
Benchmarking MapD
The times quoted below are the lowest query times seen during a series of runs.
$ mapdql \
mapd \
-u $MAPD_USERNAME \
-p $MAPD_PASSWORD
\timing on
The following completed in 0.036 seconds.
SELECT cab_type,
count(*)
FROM trips
GROUP BY 1;
The following completed in 0.131 seconds.
SELECT passenger_count,
avg(total_amount)
FROM trips
GROUP BY 1;
The following completed in 0.439 seconds.
SELECT passenger_count,
extract(year from pickup_datetime),
count(*)
FROM trips
GROUP BY 1,
2;
The following completed in 0.964 seconds.
SELECT passenger_count,
extract(year from pickup_datetime),
cast(trip_distance as int),
count(*)
FROM trips
GROUP BY 1,
2,
3
ORDER BY 2,
4 desc;
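As noted above, the times quoted are the lowest seen across several runs of each query. If you'd rather script those repeat runs than re-issue the queries by hand, something along these lines would do; the query1.sql to query4.sql file names are hypothetical and would each contain one of the four queries above.
$ for run in 1 2 3 4 5; do
      for query_file in query1.sql query2.sql query3.sql query4.sql; do
          (echo "\timing on"; cat $query_file) | \
              mapdql \
                  mapd \
                  -u $MAPD_USERNAME \
                  -p $MAPD_PASSWORD
      done
  done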
It's fantastic to see that I've been able to use a machine that costs 1/10th of the one used in the 8 x Tesla K80 benchmark and still see query times within 33% of those witnessed previously.
Despite the initial outlay for the hardware I think this is a very cost-effective setup. For the cost of a month on some hosted BI services you could own this machine outright and depreciate its cost over three years.
To top that off, I have yet to find another BI system capable of query speeds within an order of magnitude of what MapD has managed to deliver. The 0.036 second query time seen with the first query is around 43x faster than what I've been able to achieve with CPU-based clusters to date. It will be interesting to see if there is a CPU-based setup that can come within an order of magnitude of the query speeds seen here.
Over the past couple of decades GPU manufacturers have consistently released faster products at ever-improving price-to-performance ratios, which leads me to believe the performance gap over CPUs will continue to widen.