I investigate how fast a 50-node Dataproc cluster queries the metadata of 1.1 billion taxi trips.
I investigate the performance impact of ORC file sizes on Presto query times using Google Cloud's Dataproc service.
I look at speeding up Presto queries over 1.1 billion records on a 10-node Dataproc cluster.
I investigate how fast a small Dataproc cluster can query over a billion records using Presto.
Copyright © 2014 - 2021 Mark Litwintschik. This site's template is based on a template by Giulio Fidente.