Category: Spark

1.1 Billion Taxi Rides with Spark 2.2 & 3 Raspberry Pi 3 Model Bs

I investigate how fast Spark 2.2 can query 1.1 billion taxi journeys using a cluster of three Raspberry Pis.


Analysing Petabytes of Websites

I demonstrate how to extract analytical data from petabytes' worth of websites collected by Common Crawl.


1.1 Billion Taxi Rides on AWS EMR 5.3.0 & Spark 2.1.0

I investigate how fast an 11-node Spark 2.1.0 cluster can query over a billion records.


A Billion Taxi Rides on Amazon EMR running Spark

I investigate how fast a small AWS EMR cluster can query over a billion records using Spark.


Recommendation Engine built using Spark and Python

An end-to-end guide to building a film recommendation engine.

Copyright © 2014 - 2017 Mark Litwintschik.