Category: Spark

1.1 Billion Taxi Rides with Spark 2.2 & 3 Raspberry Pi 3 Model Bs

I investigate how fast Spark 2.2 can query 1.1 billion taxi journeys using a cluster of three Raspberry Pis.

Analysing Petabytes of Websites

I demonstrate how to extract analytical data from petabytes' worth of websites collected by Common Crawl.

1.1 Billion Taxi Rides on AWS EMR 5.3.0 & Spark 2.1.0

I investigate how fast an 11-node Spark 2.1.0 cluster can query over a billion records.

A Billion Taxi Rides on Amazon EMR running Spark

I investigate how fast a small AWS EMR cluster can query over a billion records using Spark.

Recommendation Engine built using Spark and Python

An end-to-end guide to building a film recommendation engine.

Copyright © 2014 - 2022 Mark Litwintschik. This site's template is based on a template by Giulio Fidente.