Update: MapD rebranded as OmniSci in 2018.
In my previous blog post I looked at benchmarking how fast MapD can query 1.1 billion taxi trips made in New York City over the course of six years. In that post I used a machine with 8 Nvidia Tesla K80 GPU cards and a host of other extremely high-end components. Though the query speeds were by miles the fastest I've ever seen, a machine of that caliber will easily run into the mid-5 figures in terms of cost.
In this blog post I'll be looking at running the same benchmark on a machine that should run at 1/10th the cost. I will avoid quoting the specific prices of each of the components used in my MapD-related blogs as they are constantly falling and would quickly date these posts.
And though these aren't apples-for-apples comparisons, I've benchmarked BigQuery, Elasticsearch, Presto on EMR and Dataproc, PostgreSQL and Redshift using the same dataset as I will in this post.
A Modest Machine Up & Running
For this benchmark I'll be using a 6-core (12-thread) Xeon E5-1650 v3 clocked at 3.5 GHz with 192 GB of RAM and 2 x 1 TB Samsung 840 EVO SSDs set up in a software-based RAID 0 configuration, giving 1.7 TB of usable capacity and around 1 GB/s of sequential read performance, all running on CentOS 7.2.1511.
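If you want to sanity-check the RAID array's sequential read speed, something along these lines should give a rough number. The /dev/md0 device path is an assumption on my part; /proc/mdstat will show the actual name of the software RAID device.
$ cat /proc/mdstat        # confirm the name of the RAID 0 device (assumed to be md0 below)
$ sudo hdparm -t /dev/md0 # timed buffered sequential reads, should report around 1 GB/s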
The GPU cards in question are 4 x Nvidia GeForce GTX TITAN X cards. They each come with 12,288 MB of GDDR5 memory that has a bandwidth capability of 336.5 GB/s and 6.144 teraflops of single-precision performance.
The following is the output from Nvidia's system management interface showing various diagnostics for the four GPUs.
$ nvidia-smi
+------------------------------------------------------+
| NVIDIA-SMI 352.93 Driver Version: 352.93 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... Off | 0000:02:00.0 Off | N/A |
| 28% 59C P8 17W / 250W | 12199MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX TIT... Off | 0000:03:00.0 Off | N/A |
| 28% 59C P8 16W / 250W | 12165MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX TIT... Off | 0000:04:00.0 Off | N/A |
| 28% 60C P8 17W / 250W | 12165MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX TIT... Off | 0000:05:00.0 Off | N/A |
| 22% 48C P8 16W / 250W | 10378MiB / 12287MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 988 G /usr/bin/Xorg 17MiB |
| 0 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 12074MiB |
| 1 988 G /usr/bin/Xorg 17MiB |
| 1 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 12040MiB |
| 2 988 G /usr/bin/Xorg 17MiB |
| 2 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 12040MiB |
| 3 988 G /usr/bin/Xorg 17MiB |
| 3 1326 C+G /home/mapd/prod/mapd/bin/mapd_server 10253MiB |
+-----------------------------------------------------------------------------+
Loading 1.1 Billion Trips into MapD
To start, I'll download the 104 GB of CSV data I created in my Billion Taxi Rides in Redshift blog post. This data sits in 56 GZIP files and decompresses into around 500 GB of raw CSV data.
$ cd /raidStorage/mark/
$ vi urls.txt
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaa.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xab.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xac.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xad.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xae.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaf.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xag.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xah.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xai.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaj.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xak.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xal.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xam.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xan.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xao.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xap.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaq.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xar.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xas.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xat.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xau.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xav.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaw.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xax.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xay.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xaz.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xba.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbb.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbc.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbd.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbe.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbf.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbg.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbh.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbi.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbj.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbk.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbl.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbm.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbn.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbo.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbp.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbq.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbr.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbs.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbt.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbu.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbv.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbw.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbx.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xby.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xbz.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xca.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xcb.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xcc.csv.gz
https://s3-eu-west-1.amazonaws.com/<s3_bucket>/csv/trips_xcd.csv.gz
$ cat urls.txt | \
xargs -n 1 -P 6 \
wget
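Before decompressing anything it's worth checking that all of the files arrived intact. The following is a quick sanity check; gzip -t test-decompresses each archive without writing anything to disk.
$ ls trips_x*.csv.gz | wc -l                   # should report 56 files
$ du -ch trips_x*.csv.gz | tail -n 1           # should total roughly 104 GB
$ ls trips_x*.csv.gz | xargs -n 1 -P 6 gzip -t # any corrupt archive will be reported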
I'll then create a schema for my trips table. The fragment size controls how many records go into each fragment, which in turn determines how MapD spreads the data across the GPUs. For this exercise I'll use a fragment size of 100,000,000 as there are only 4 cards. It's possible to set this value a lot lower than I have here; fingers crossed, in a future post I can go into further detail on this setting.
$ vi create_trips_table.sql
CREATE TABLE trips (
trip_id INTEGER,
vendor_id VARCHAR(3) ENCODING DICT,
pickup_datetime TIMESTAMP,
dropoff_datetime TIMESTAMP,
store_and_fwd_flag VARCHAR(1) ENCODING DICT,
rate_code_id SMALLINT,
pickup_longitude DECIMAL(14,2),
pickup_latitude DECIMAL(14,2),
dropoff_longitude DECIMAL(14,2),
dropoff_latitude DECIMAL(14,2),
passenger_count SMALLINT,
trip_distance DECIMAL(14,2),
fare_amount DECIMAL(14,2),
extra DECIMAL(14,2),
mta_tax DECIMAL(14,2),
tip_amount DECIMAL(14,2),
tolls_amount DECIMAL(14,2),
ehail_fee DECIMAL(14,2),
improvement_surcharge DECIMAL(14,2),
total_amount DECIMAL(14,2),
payment_type VARCHAR(3) ENCODING DICT,
trip_type SMALLINT,
pickup VARCHAR(50) ENCODING DICT,
dropoff VARCHAR(50) ENCODING DICT,
cab_type VARCHAR(6) ENCODING DICT,
precipitation SMALLINT,
snow_depth SMALLINT,
snowfall SMALLINT,
max_temperature SMALLINT,
min_temperature SMALLINT,
average_wind_speed SMALLINT,
pickup_nyct2010_gid SMALLINT,
pickup_ctlabel VARCHAR(10) ENCODING DICT,
pickup_borocode SMALLINT,
pickup_boroname VARCHAR(13) ENCODING DICT,
pickup_ct2010 VARCHAR(6) ENCODING DICT,
pickup_boroct2010 VARCHAR(7) ENCODING DICT,
pickup_cdeligibil VARCHAR(1) ENCODING DICT,
pickup_ntacode VARCHAR(4) ENCODING DICT,
pickup_ntaname VARCHAR(56) ENCODING DICT,
pickup_puma VARCHAR(4) ENCODING DICT,
dropoff_nyct2010_gid SMALLINT,
dropoff_ctlabel VARCHAR(10) ENCODING DICT,
dropoff_borocode SMALLINT,
dropoff_boroname VARCHAR(13) ENCODING DICT,
dropoff_ct2010 VARCHAR(6) ENCODING DICT,
dropoff_boroct2010 VARCHAR(7) ENCODING DICT,
dropoff_cdeligibil VARCHAR(1) ENCODING DICT,
dropoff_ntacode VARCHAR(4) ENCODING DICT,
dropoff_ntaname VARCHAR(56) ENCODING DICT,
dropoff_puma VARCHAR(4) ENCODING DICT
) WITH (FRAGMENT_SIZE=100000000);
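As a back-of-the-envelope check on the fragment size above: 1.1 billion rows at 100 million rows per fragment works out to around 11 fragments which, as I understand it, MapD then spreads across the 4 cards, leaving each GPU holding two or three fragments.
$ echo $((1100000000 / 100000000)) # number of ~100M-row fragments for 1.1 billion rows
11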
I'll create two environment variables with my credentials for MapD.
$ read MAPD_USERNAME
$ read MAPD_PASSWORD
$ export MAPD_USERNAME
$ export MAPD_PASSWORD
The following will create the table schema using the mapdql CLI tool.
$ mapdql mapd \
-u $MAPD_USERNAME \
-p $MAPD_PASSWORD \
< create_trips_table.sql
I'll then check that the table has been created:
$ echo "\t" | mapdql mapd -u $MAPD_USERNAME -p $MAPD_PASSWORD
User mapd connected to database mapd
trips
User mapd disconnected from database mapd
MapD doesn't support loading CSV data from GZIP files at this time, so I'll decompress the CSV files before loading them.
$ gunzip trips_x*.csv.gz
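The gunzip above works through the archives one at a time. The same xargs trick used for the downloads should work here as well if you'd rather decompress several files in parallel; this is optional and just a sketch.
$ ls trips_x*.csv.gz | \
    xargs -n 1 -P 6 \
    gunzip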
With the table and files in place I'll load the 500 GB of CSV data into MapD.
$ for filename in *.csv; do
echo "COPY trips
FROM '/raidStorage/mark/$filename'
WITH (header='false');" | \
mapdql \
mapd \
-u $MAPD_USERNAME \
-p $MAPD_PASSWORD
done
The above completed in 86 minutes and 24 seconds.
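At this point a quick row count is a good sanity check; if all 56 files loaded cleanly the following should come back at around 1.1 billion rows.
$ echo "SELECT COUNT(*) FROM trips;" | \
    mapdql \
        mapd \
        -u $MAPD_USERNAME \
        -p $MAPD_PASSWORD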
Benchmarking MapD
The times quoted below are the lowest query times seen during a series of runs.
$ mapdql \
mapd \
-u $MAPD_USERNAME \
-p $MAPD_PASSWORD
\timing on
The following completed in 0.036 seconds.
SELECT cab_type,
count(*)
FROM trips
GROUP BY 1;
The following completed in 0.131 seconds.
SELECT passenger_count,
avg(total_amount)
FROM trips
GROUP BY 1;
The following completed in 0.439 seconds.
SELECT passenger_count,
extract(year from pickup_datetime),
count(*)
FROM trips
GROUP BY 1,
2;
The following completed in 0.964 seconds.
SELECT passenger_count,
extract(year from pickup_datetime),
cast(trip_distance as int),
count(*)
FROM trips
GROUP BY 1,
2,
3
ORDER BY 2,
4 desc;
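As noted above, the times quoted are the lowest seen across several runs of each query. If you'd rather script those repeat runs than re-issue the queries by hand, something along these lines would do; the query1.sql to query4.sql file names are hypothetical and would each contain one of the four queries above.
$ for run in 1 2 3 4 5; do
      for query_file in query1.sql query2.sql query3.sql query4.sql; do
          (echo "\timing on"; cat $query_file) | \
              mapdql \
                  mapd \
                  -u $MAPD_USERNAME \
                  -p $MAPD_PASSWORD
      done
  done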
It's fantastic to see that I've been able to use a machine that costs 1/10th of the one used in the 8 x Tesla K80 benchmark and still see query times within 33% of those witnessed previously.
Despite the initial outlay for the hardware I think this is a very cost-effective setup. For the cost of a month on some hosted BI services you could own this machine outright and depreciate its cost over three years.
To top that off, I have yet to find another BI system capable of query speeds within an order of magnitude of what MapD has managed to deliver. The 0.036 second query time seen with the first query is around 43x faster than what I've been able to achieve with CPU-based clusters to date. It will be interesting to see if there is a CPU-based setup that can come within an order of magnitude of the query speeds seen here.
Over the past couple of decades GPU manufacturers have consistently released faster products at ever-improving price-to-performance ratios, which leads me to believe the performance gap over CPUs will continue to widen.