Category: Data Science

Working with Data Feeds

This tutorial converts Wikipedia's XML dump of its English-language site into CSV, JSON, Avro and ORC file formats and analyses the data using ClickHouse.


Popular Airline Passenger Routes

Scraping 29K Wikipedia pages to find the most popular commercial airline passenger routes.


Linting UK Postcodes

Parsing and linting UK postcodes is rife with edge cases.

Copyright © 2014 - 2017 Mark Litwintschik. This site's template is based on a template by Giulio Fidente.