
Posted on Mon 13 November 2017 under Databases

1.1 Billion Taxi Rides with BrytlytDB 2.0 & 2 GPU-Powered p2.16xlarge EC2 Instances

During the summer I benchmarked BrytlytDB, a GPU-powered database. The software performed well and since then version 1.0 has sat in seventh position on my benchmarks recap board. This week I've had the chance to benchmark version 2.0, this time on two p2.16xlarge AWS EC2 instances. Previously, a number of manual steps were needed to get the software up and running, but those steps have largely been removed thanks to an AWS Marketplace page where you can spin up a BrytlytDB cluster on the instance sizes and counts of your choice with a few clicks.

For those unfamiliar with BrytlytDB, it's a GPU-powered database built on top of PostgreSQL. Joins, stored procedures, ODBC (important for Tableau users) and the PostgreSQL wire protocol all work as expected. When queries are run, their execution plans are put together by PostgreSQL's regular codebase before being passed off to BrytlytDB to execute on one or more of Nvidia's CUDA-powered GPUs. The GPUs don't need to live on the same machine either; BrytlytDB, as this benchmark will show, can work with a cluster of machines hosting their own discrete GPUs connected via a regular network.

In this benchmark I'll see how well the new version 2.0 of BrytlytDB runs on 32 Tesla K80 GPUs spread across two EC2 instances when querying 1.1 billion taxi trips. I'll be using the same dataset I've used to benchmark Amazon Athena, BigQuery, ClickHouse, Elasticsearch, EMR, kdb+/q, MapD, PostgreSQL, Redshift and Vertica. I have a single-page summary of all these benchmarks for comparison.

A GPU-Powered AWS EC2 Cluster

For this benchmark I'll be using two p2.16xlarge EC2 instances running Ubuntu 16.04.2 LTS in Amazon Web Services' us-east-1a region in Northern Virginia. Each machine has 8 Nvidia K80 cards which have 2 GPUs each, 64 virtual CPUs and 732 GB of memory. There's also 20 Gbit/s of networking capacity available to each instance.

Below are the specifications of the compute capabilities available on each of the EC2 instances.

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               1230.230
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4600.08
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-15,32-47
NUMA node1 CPU(s):     16-31,48-63
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq monitor est ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt ida

This is the layout of the GPUs available on one of the two EC2 instances.

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.98                 Driver Version: 384.98                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:0F.0 Off |                    0 |
| N/A   65C    P0    57W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:00:10.0 Off |                    0 |
| N/A   54C    P0    73W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:00:11.0 Off |                    0 |
| N/A   73C    P0    61W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:00:12.0 Off |                    0 |
| N/A   55C    P0    71W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  Tesla K80           Off  | 00000000:00:13.0 Off |                    0 |
| N/A   71C    P0    60W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  Tesla K80           Off  | 00000000:00:14.0 Off |                    0 |
| N/A   56C    P0    74W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  Tesla K80           Off  | 00000000:00:15.0 Off |                    0 |
| N/A   74C    P0    61W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   7  Tesla K80           Off  | 00000000:00:16.0 Off |                    0 |
| N/A   57C    P0    71W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   8  Tesla K80           Off  | 00000000:00:17.0 Off |                    0 |
| N/A   65C    P0    59W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   9  Tesla K80           Off  | 00000000:00:18.0 Off |                    0 |
| N/A   53C    P0    72W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|  10  Tesla K80           Off  | 00000000:00:19.0 Off |                    0 |
| N/A   72C    P0    61W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|  11  Tesla K80           Off  | 00000000:00:1A.0 Off |                    0 |
| N/A   57C    P0    71W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|  12  Tesla K80           Off  | 00000000:00:1B.0 Off |                    0 |
| N/A   70C    P0    59W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|  13  Tesla K80           Off  | 00000000:00:1C.0 Off |                    0 |
| N/A   57C    P0    74W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|  14  Tesla K80           Off  | 00000000:00:1D.0 Off |                    0 |
| N/A   74C    P0    58W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|  15  Tesla K80           Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   57C    P0    71W / 149W |  11240MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

The first machine has an IP address of 52.91.174.189 and acts as the Coordinator and host to the first 16 data nodes.

The second machine has an IP address of 34.235.138.233 and hosts the second set of 16 data nodes.

TCP port 5432 is open between the two machines for communication via PostgreSQL's wire protocol, TCP port 7777 is open for global transaction manager communication and TCP ports 15432 through 20000 are open for data node communication.
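For reference, the port openings described above could be expressed as security group rules with the AWS CLI. This is a sketch only; the security group ID below is a hypothetical placeholder, and the Marketplace setup may manage these rules for you.

```shell
# Hypothetical security group ID shared by both instances
SG=sg-0123456789abcdef0

# PostgreSQL wire protocol between the two machines
aws ec2 authorize-security-group-ingress --group-id "$SG" \
    --protocol tcp --port 5432 --source-group "$SG"

# Global transaction manager communication
aws ec2 authorize-security-group-ingress --group-id "$SG" \
    --protocol tcp --port 7777 --source-group "$SG"

# Data node communication
aws ec2 authorize-security-group-ingress --group-id "$SG" \
    --protocol tcp --port 15432-20000 --source-group "$SG"
```

Using `--source-group` rather than a CIDR keeps the ports open only between members of the same security group.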

In addition to the 32 GB EBS boot volume on each EC2 instance there are six 500 GB General Purpose SSDs attached, which offer a baseline of 100 IOPS each and can burst to 3,000 IOPS if need be. These drives are set up in a RAID 0 configuration on each instance, giving each instance close to 3 TB of storage capacity in addition to the boot drive. The Marketplace AMI sets up the RAID array automatically.

$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
xvda    202:0    0   32G  0 disk
└─xvda1 202:1    0   32G  0 part  /
xvdb    202:16   0  500G  0 disk
└─md0     9:0    0    3T  0 raid0 /usr/local/data
xvdc    202:32   0  500G  0 disk
└─md0     9:0    0    3T  0 raid0 /usr/local/data
xvdd    202:48   0  500G  0 disk
└─md0     9:0    0    3T  0 raid0 /usr/local/data
xvde    202:64   0  500G  0 disk
└─md0     9:0    0    3T  0 raid0 /usr/local/data
xvdf    202:80   0  500G  0 disk
└─md0     9:0    0    3T  0 raid0 /usr/local/data
xvdg    202:96   0  500G  0 disk
└─md0     9:0    0    3T  0 raid0 /usr/local/data

BrytlytDB Up & Running

The BrytlytDB software will be installed and running when the EC2 instances launch but since I'll be using the two machines as a cluster I'll register the second instance's data nodes on the first instance using the brytlyt PostgreSQL user account.

$ vi register.sql
CREATE NODE d16 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15432);
CREATE NODE d17 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15433);
CREATE NODE d18 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15434);
CREATE NODE d19 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15435);
CREATE NODE d20 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15436);
CREATE NODE d21 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15437);
CREATE NODE d22 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15438);
CREATE NODE d23 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15439);
CREATE NODE d24 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15440);
CREATE NODE d25 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15441);
CREATE NODE d26 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15442);
CREATE NODE d27 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15443);
CREATE NODE d28 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15444);
CREATE NODE d29 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15445);
CREATE NODE d30 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15446);
CREATE NODE d31 WITH (HOST='34.235.138.233', TYPE='datanode', PORT=15447);
SELECT pgxc_pool_reload();
$ psql -U brytlyt -f register.sql

Once that's run I can check that the data nodes have all registered.

SELECT * FROM pgxc_node;
node_name | node_type | node_port |   node_host    | nodeis_primary | nodeis_preferred |   node_id
----------+-----------+-----------+----------------+----------------+------------------+-------------
c0        | C         |      5432 | localhost      | f              | f                |   546080270
d0        | D         |     15432 | 127.0.0.1      | f              | f                | -2088056588
d1        | D         |     15433 | 127.0.0.1      | f              | f                |   342786568
d2        | D         |     15434 | 127.0.0.1      | f              | f                |   823431008
d3        | D         |     15435 | 127.0.0.1      | f              | f                |   907287964
d4        | D         |     15436 | 127.0.0.1      | f              | f                | -1005442993
d5        | D         |     15437 | 127.0.0.1      | f              | f                | -2088175618
d6        | D         |     15438 | 127.0.0.1      | f              | f                |   653419373
d7        | D         |     15439 | 127.0.0.1      | f              | f                | -2114583294
d8        | D         |     15440 | 127.0.0.1      | f              | f                | -1181003975
d9        | D         |     15441 | 127.0.0.1      | f              | f                |  1701757799
d10       | D         |     15442 | 127.0.0.1      | f              | f                |     2938326
d11       | D         |     15443 | 127.0.0.1      | f              | f                |  -340513757
d12       | D         |     15444 | 127.0.0.1      | f              | f                |   899853678
d13       | D         |     15445 | 127.0.0.1      | f              | f                | -1529184863
d14       | D         |     15446 | 127.0.0.1      | f              | f                |   297400288
d15       | D         |     15447 | 127.0.0.1      | f              | f                | -2050394442
d16       | D         |     15432 | 34.235.138.233 | f              | f                |  -951284347
d17       | D         |     15433 | 34.235.138.233 | f              | f                |   940304836
d18       | D         |     15434 | 34.235.138.233 | f              | f                |  1871723707
d19       | D         |     15435 | 34.235.138.233 | f              | f                |  1351481950
d20       | D         |     15436 | 34.235.138.233 | f              | f                |  -958755150
d21       | D         |     15437 | 34.235.138.233 | f              | f                |  1823411759
d22       | D         |     15438 | 34.235.138.233 | f              | f                |  1397545445
d23       | D         |     15439 | 34.235.138.233 | f              | f                |    68184193
d24       | D         |     15440 | 34.235.138.233 | f              | f                |  -965766358
d25       | D         |     15441 | 34.235.138.233 | f              | f                |  1469379654
d26       | D         |     15442 | 34.235.138.233 | f              | f                | -2015650531
d27       | D         |     15443 | 34.235.138.233 | f              | f                |    14563042
d28       | D         |     15444 | 34.235.138.233 | f              | f                |  1604721461
d29       | D         |     15445 | 34.235.138.233 | f              | f                |  -557897158
d30       | D         |     15446 | 34.235.138.233 | f              | f                |  1910249780
d31       | D         |     15447 | 34.235.138.233 | f              | f                | -1701917292

Loading 1.1 Billion Trips Into BrytlytDB

I'll first define the table schema for the 1.1 billion taxi trip records.

$ psql -U brytlyt
CREATE FOREIGN TABLE trips (
   trip_id                 INTEGER,
   vendor_id               VARCHAR(3),

   pickup_datetime         DATE,
   dropoff_datetime        DATE,

   store_and_fwd_flag      VARCHAR(1),
   rate_code_id            INTEGER,
   pickup_longitude        DOUBLE PRECISION,
   pickup_latitude         DOUBLE PRECISION,
   dropoff_longitude       DOUBLE PRECISION,
   dropoff_latitude        DOUBLE PRECISION,
   passenger_count         INTEGER,
   trip_distance           DOUBLE PRECISION,
   fare_amount             DOUBLE PRECISION,
   extra                   DOUBLE PRECISION,
   mta_tax                 DOUBLE PRECISION,
   tip_amount              DOUBLE PRECISION,
   tolls_amount            DOUBLE PRECISION,
   ehail_fee               DOUBLE PRECISION,
   improvement_surcharge   DOUBLE PRECISION,
   total_amount            DOUBLE PRECISION,
   payment_type            VARCHAR(10),
   trip_type               INTEGER,
   pickup                  VARCHAR(50),
   dropoff                 VARCHAR(50),

   cab_type                VARCHAR(6),

   precipitation           INTEGER,
   snow_depth              INTEGER,
   snowfall                INTEGER,
   max_temperature         INTEGER,
   min_temperature         INTEGER,
   average_wind_speed      INTEGER,

   pickup_nyct2010_gid     INTEGER,
   pickup_ctlabel          VARCHAR(10),
   pickup_borocode         INTEGER,
   pickup_boroname         VARCHAR(13),
   pickup_ct2010           VARCHAR(6),
   pickup_boroct2010       VARCHAR(7),
   pickup_cdeligibil       VARCHAR(1),
   pickup_ntacode          VARCHAR(4),
   pickup_ntaname          VARCHAR(56),
   pickup_puma             VARCHAR(4),

   dropoff_nyct2010_gid    INTEGER,
   dropoff_ctlabel         VARCHAR(10),
   dropoff_borocode        INTEGER,
   dropoff_boroname        VARCHAR(13),
   dropoff_ct2010          VARCHAR(6),
   dropoff_boroct2010      VARCHAR(7),
   dropoff_cdeligibil      VARCHAR(1),
   dropoff_ntacode         VARCHAR(4),
   dropoff_ntaname         VARCHAR(56),
   dropoff_puma            VARCHAR(4)
) SERVER gm_fdw_server
  OPTIONS (max_size '35000000',
           is_random 'false',
           index '10,24')
  DISTRIBUTE BY roundrobin;

Note in the above that there are indices on the passenger count and cab type columns. Also, the timestamp columns use a DATE column type; I've been told DATETIME fields will be supported in a future version of the software. This is important to take into consideration as there will be data loss: only the dates, and not the times, of the journeys can be queried.

Also note that for several columns I had to use the larger INTEGER type instead of the SMALLINT type I'd normally use, due to the current level of column type support in the software.

For this benchmark I've downloaded and decompressed the 500 GB of CSV data I created in my Billion Taxi Rides in Redshift blog post onto the /usr/local/data folder on the first EC2 instance. The following command will load the data onto the two EC2 instances. It took around 50 minutes to complete.

count=0

for filename in /usr/local/data/xa*; do
    psql -c "EXECUTE DIRECT ON ( d$count )
                'SELECT load_data(''trips'',
                                  ''$filename'',
                                  '','',
                                  1000000,
                                  ''0'',
                                  '''',
                                  ''overwrite'')'" 2>&1 &
    psql -c "EXECUTE DIRECT ON ( d$(($count+16)) )
                'SELECT load_data(''trips'',
                                  ''$filename'',
                                  '','',
                                  1000000,
                                  ''0'',
                                  '''',
                                  ''overwrite'')'" 2>&1 &
    (( count++ ))
done

Benchmarking BrytlytDB 2.0

The times quoted below are the lowest query times seen during a series of runs. As with all my benchmarks, I use the lowest query time as a way of indicating "top speed".
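This "top speed" approach can be scripted as well. Below is a minimal bash sketch of the idea: run a command several times and keep the lowest wall-clock time. The psql invocation in the comment is illustrative; any command works.

```shell
# best_of: run a command N times and print the lowest elapsed time in ms
best_of () {
    local n=$1; shift
    local best=
    for _ in $(seq "$n"); do
        local start=$(date +%s%N)   # nanoseconds since epoch (GNU date)
        "$@" > /dev/null 2>&1
        local end=$(date +%s%N)
        local ms=$(( (end - start) / 1000000 ))
        if [ -z "$best" ] || [ "$ms" -lt "$best" ]; then
            best=$ms
        fi
    done
    echo "$best"
}

# e.g. best_of 5 psql -U brytlyt -c "SELECT count(*) FROM trips;"
best_of 3 sleep 0.05
```

Note this measures end-to-end wall time, including client startup overhead, so it will read slightly higher than psql's own \timing output.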

$ psql -U brytlyt
\timing on

The following completed in 0.009 seconds.

SELECT cab_type,
       count(*)
FROM trips
GROUP BY cab_type;

The following completed in 0.027 seconds.

SELECT passenger_count,
       avg(total_amount)
FROM trips
GROUP BY passenger_count;

The following completed in 0.287 seconds.

SELECT passenger_count,
       extract(year from pickup_datetime) AS pickup_year,
       count(*)
FROM trips
GROUP BY passenger_count,
         pickup_year;

The following completed in 0.428 seconds.

SELECT passenger_count,
       extract(year from pickup_datetime) AS pickup_year,
       cast(trip_distance as int) AS distance,
       count(*) AS the_count
FROM trips
GROUP BY passenger_count,
         pickup_year,
         distance
ORDER BY pickup_year,
         the_count desc;

BrytlytDB is now the fastest database I've benchmarked. Query 1 can run 111 times in a second; no other database I've benchmarked has been able to perform at that rate on any hardware I've been given access to. The 84x speed-up in query 1's performance compared to version 1.0 of the software is absolutely incredible.
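The 111-runs-per-second figure follows directly from the 0.009-second query time: a 9 ms query fits roughly 111 executions into one second.

```shell
# 1,000 ms in a second divided by 9 ms per query (integer division)
echo $(( 1000 / 9 ))
```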

BrytlytDB is built on top of PostgreSQL and supports all of its features. I thought that this would somehow make it difficult to optimise, but I was wrong. It's incredible to think that the fastest OLAP setup I've come across is effectively PostgreSQL running on a few GPUs.

I'm genuinely impressed to see the installation process more or less disappear as well. This does a lot to lower the friction of getting people onto GPU-powered databases. If spot instances are used on AWS, the cost of querying billions of rows of data could be no more than a few dollars an hour.

Thank you for taking the time to read this post. I offer both consulting and hands-on development services to clients in North America and Europe. If you'd like to discuss how my offerings can help your business please contact me via LinkedIn.

Copyright © 2014 - 2024 Mark Litwintschik. This site's template is based off a template by Giulio Fidente.