This tutorial covers converting Wikipedia's XML dump of its English-language site into CSV, JSON, Avro and ORC file formats, as well as analysing the data using ClickHouse.
Scraping 29K Wikipedia pages to find the most popular commercial airline passenger routes.
Parsing and linting UK postcodes is rife with edge cases.
Copyright © 2014 - 2017 Mark Litwintschik. This site's template is based off a template by Giulio Fidente.