Cost-effective, mobile-friendly file sharing using two Go-based offerings.
I walk through tracking changes in rich documents using Git.
I walk through installing and running Snappy's S2 extension.
I walk through running an AWS S3-compatible storage service on HDFS.
I compare the decompression times of various DEFLATE implementations.
I look at various aspects of lossless compression.
I review the Hadoop-focused book "Architecting Modern Data Platforms".
I walk through setting up Apache Airflow to use Dask.distributed and PostgreSQL, log to AWS S3, and create user accounts and plugins.
A guide to running Airflow and Jupyter Notebook with Hadoop 3, Spark & Presto.
I review an early release of Martin Kleppmann's book "Designing Data-Intensive Applications".
I walk through setting up a data pipeline for currency exchange rates using Airflow, PostgreSQL and Redis.
Copyright © 2014 - 2022 Mark Litwintschik. This site's template is based off a template by Giulio Fidente.