Category: File and Data Management

File Sharing with Caddy & MinIO

Cost-effective, mobile-friendly file sharing using two Go-based offerings.

Track changes in Excel, Word, PowerPoint, PDFs, Images & Videos with Git

I walk through tracking changes in rich documents using Git.

Faster Compression with Snappy's S2 Extension

I walk through installing and running Snappy's S2 extension.

MinIO: A Bare Metal Drop-In for AWS S3

I walk through running an AWS S3-compatible storage service on HDFS.

Faster ZIP Decompression

I compare the decompression times of various DEFLATE implementations.

Minimalist Guide to Lossless Compression

I look at various aspects of lossless compression.

"Architecting Modern Data Platforms" Book Review

I review the Hadoop-focused book "Architecting Modern Data Platforms".

Customising Airflow: Beyond Boilerplate Settings

I walk through setting up Apache Airflow to use Dask.distributed and PostgreSQL, log to AWS S3, and create user accounts and plugins.

Python & Big Data: Airflow & Jupyter Notebook with Hadoop 3, Spark & Presto

A guide to running Airflow and Jupyter Notebook with Hadoop 3, Spark & Presto.

A Review of "Designing Data-Intensive Applications"

I review an early release of Martin Kleppmann's book "Designing Data-Intensive Applications".

Building a Data Pipeline with Airflow

I walk through setting up a data pipeline for currency exchange rates using Airflow, PostgreSQL and Redis.

Copyright © 2014 - 2022 Mark Litwintschik.