
Posted on Tue 10 November 2015 under Python

Faster Testing with RAM Drives

When writing unit tests in Django using django.test.TestCase, the database will be flushed after each test. If you're not using an in-memory database, this will create a lot of overhead writing to disk.

PostgreSQL is a popular database used in many Django projects, but its default behaviour is to write data to disk, which makes it much slower than running SQLite in memory. The default SQLite3 driver will run in memory, but it won't be able to test PostgreSQL-specific aspects of your models (such as foreign key integrity being on by default when loading fixtures). MySQL supports an in-memory backend, but among other things it does not support foreign keys, blob / text columns or transactions. Using a RAM drive as the storage area for the test database is a good way to test all applicable aspects of PostgreSQL and still get the performance of an in-memory database.

Set up a RAM Drive

The following commands were all run on Ubuntu Server 15.10.

To start, I'll create a 512MB RAM drive.

$ sudo mkdir -p /media/pg_ram_drive
$ sudo mount -t tmpfs -o size=512M tmpfs /media/pg_ram_drive/

I'll confirm the drive is listed among the mount points on my system.

$ mount | grep pg_ram_drive
tmpfs on /media/pg_ram_drive type tmpfs (rw,relatime,size=524288k)
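
The size mount reports (524288k) is the requested 512M expressed in KiB. The following is a minimal sketch of that conversion, assuming only the binary k/m/g suffixes; the helper name tmpfs_size_to_bytes is hypothetical, and percentage sizes are deliberately not handled.

```python
# Hypothetical helper: convert a tmpfs size option such as "512M" to bytes.
# Assumes binary multipliers (k = 1024, m = 1024^2, g = 1024^3).
_SUFFIXES = {'k': 1024, 'm': 1024 ** 2, 'g': 1024 ** 3}


def tmpfs_size_to_bytes(size):
    """Convert a size string with an optional k/m/g suffix to bytes."""
    size = size.strip().lower()
    if size and size[-1] in _SUFFIXES:
        return int(size[:-1]) * _SUFFIXES[size[-1]]
    return int(size)  # no suffix: the value is already in bytes
```

Calling tmpfs_size_to_bytes('512M') and tmpfs_size_to_bytes('524288k') should return the same number of bytes, which is a quick way to confirm the mount picked up the size that was asked for.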

Benchmark comparison between an SSD & the RAM drive

The following benchmarks the RAM drive to see what sort of write performance it offers over the SSD drive on my system.

$ dd if=/dev/zero \
      of=/tmp/benchmark \
      conv=fdatasync \
      bs=4k \
      count=100000 \
      && rm -f /tmp/benchmark
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 0.801915 s, 511 MB/s
$ dd if=/dev/zero \
      of=/media/pg_ram_drive/benchmark \
      conv=fdatasync \
      bs=4k \
      count=100000 \
      && rm -f /media/pg_ram_drive/benchmark
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 0.129084 s, 3.2 GB/s

The SSD managed to write at 511 MB/s while the RAM drive was roughly 6.2x faster at 3.2 GB/s.
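
The speedup can be recomputed from the byte count and elapsed times that dd printed, which gives a ratio of roughly 6.2x. This is only a sketch of the arithmetic; the constant and function names are my own, and the figures are copied from the two runs above.

```python
# Figures copied from the two dd runs above.
BYTES_WRITTEN = 409600000
SSD_SECONDS = 0.801915
RAM_SECONDS = 0.129084


def throughput_mb_per_s(num_bytes, seconds):
    """Return throughput in decimal megabytes per second, as dd reports it."""
    return num_bytes / seconds / 1e6


ssd = throughput_mb_per_s(BYTES_WRITTEN, SSD_SECONDS)
ram = throughput_mb_per_s(BYTES_WRITTEN, RAM_SECONDS)
speedup = ram / ssd  # the same ratio as SSD_SECONDS / RAM_SECONDS

print('SSD: %.0f MB/s, RAM drive: %.0f MB/s, speedup: %.1fx' % (ssd, ram, speedup))
```

Since both runs wrote the same number of bytes, the speedup depends only on the two elapsed times.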

Install PostgreSQL and set up the RAM Drive Tablespace

The following will install the PostgreSQL 9.4 packages we need:

$ sudo apt update
$ sudo apt install \
    postgresql-server-dev-9.4 \
    postgresql-client-9.4 \
    postgresql-contrib-9.4 \
    libpq-dev

I'll then give PostgreSQL's user ownership of the RAM drive, create the regular Django project's database (which will sit on my SSD), create the tablespace for the RAM drive and create a user account for Django to use. The account is created as a superuser, which includes permission to create and drop databases.

$ sudo chown -R postgres /media/pg_ram_drive/
$ sudo -u postgres psql
postgres=# CREATE DATABASE django;
postgres=# CREATE TABLESPACE ram_disk LOCATION '/media/pg_ram_drive';
postgres=# CREATE USER django WITH SUPERUSER PASSWORD 'django';

There should be a folder on the RAM drive now with a PG_ prefix that looks something like the following:

$ sudo find /media/pg_ram_drive
/media/pg_ram_drive
/media/pg_ram_drive/PG_9.4_201409291 # This folder should be empty
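
The same check can be scripted. The sketch below lists the PG_-prefixed subdirectories under a tablespace location; the function name tablespace_version_dirs is hypothetical, and it assumes the user running it has permission to read the directory.

```python
import os


def tablespace_version_dirs(location):
    """List the PG_-prefixed subdirectories found directly under location."""
    return sorted(
        name for name in os.listdir(location)
        if name.startswith('PG_')
        and os.path.isdir(os.path.join(location, name))
    )
```

On the system above, tablespace_version_dirs('/media/pg_ram_drive') would be expected to return a single entry once CREATE TABLESPACE has run, and an empty list if the tablespace was never created.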

Testing a Django project on the RAM drive

I'll install all the packages needed to test an example Django project:

$ sudo apt install \
    python-virtualenv \
    python-pip \
    python-dev \
    git-core

In a previous blog post I created a project where a Django model is tested. I'll run those tests on the RAM drive.

$ git clone https://github.com/marklit/meetup-testing.git
$ virtualenv venv
$ source venv/bin/activate
$ pip install -r meetup-testing/requirements.txt
$ pip install psycopg2
$ cd meetup-testing

This code base follows a convention where any settings that need to be overridden go in a base/local_settings.py file, which is not kept in the git repo. In this file I'll set both the regular and test database settings.

The most important setting is the DEFAULT_TABLESPACE attribute which should be the name of the RAM disk tablespace that was created in PostgreSQL.

$ vi base/local_settings.py
import sys


DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'HOST': 'localhost',
        'PORT': 5432,
        'NAME': 'django',
        'USER': 'django',
        'PASSWORD': 'django',
        'TEST': {
            'NAME': 'django_test',
        },
    },
}

if 'test' in sys.argv:
    DEFAULT_TABLESPACE = 'ram_disk'

SECRET_KEY = 'a' * 21

Now when the tests run they're using the RAM drive.

$ python manage.py test
Creating test database for alias 'default'...
.
----------------------------------------------------------------------
Ran 1 test in 0.027s

OK
Destroying test database for alias 'default'...

This code base only has one test but on projects with a lot of tests there should be a significant decrease in the amount of time it takes to run the test suite.
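
The 'test' in sys.argv check in local_settings.py can be pulled into a small function so the choice of tablespace is easy to verify on its own. This is a sketch rather than part of the project: default_tablespace and RAM_TABLESPACE are hypothetical names, it looks only at the arguments after the script name, and it assumes tests are started through manage.py test, so other test runners would not trigger it.

```python
import sys

RAM_TABLESPACE = 'ram_disk'  # the name given to CREATE TABLESPACE earlier


def default_tablespace(argv):
    """Return the RAM drive tablespace for test runs, else an empty string."""
    # manage.py receives the subcommand as an argument, e.g. "manage.py test".
    # An empty string leaves Django on the database's default tablespace.
    return RAM_TABLESPACE if 'test' in argv[1:] else ''


DEFAULT_TABLESPACE = default_tablespace(sys.argv)
```

Keeping the condition in one function means regular runs such as manage.py runserver or manage.py migrate keep their tables on the SSD, while only test runs are pointed at the RAM drive.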

Make the RAM drive permanent

To make sure the RAM drive is available after the system restarts, add the following line to your /etc/fstab file:

tmpfs           /media/pg_ram_drive  tmpfs   defaults,noatime,mode=1777 0 0

After a reboot the drive will be mounted and empty. When you run Django's test runner it'll create a test database from scratch on the drive again.

$ sudo reboot
...
$ python manage.py test
Creating test database for alias 'default'...
.
----------------------------------------------------------------------
Ran 1 test in 0.059s

OK
Destroying test database for alias 'default'...
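
One detail of the fstab line is worth checking: unlike the earlier mount command, it carries no size option, in which case tmpfs defaults to a limit of half of the physical RAM. The sketch below splits the line into the six fields named in the fstab(5) manual page and tests for an explicit size. The parser is hypothetical and simplified; it requires all six fields to be present and ignores comment lines.

```python
from collections import namedtuple

# Field names follow the fstab(5) manual page.
FstabEntry = namedtuple(
    'FstabEntry', 'fs_spec fs_file fs_vfstype fs_mntops fs_freq fs_passno')


def parse_fstab_line(line):
    """Split one fstab line into its six whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError('expected 6 fields, got %d' % len(fields))
    spec, path, vfstype, options, freq, passno = fields
    return FstabEntry(
        spec, path, vfstype, options.split(','), int(freq), int(passno))


def has_size_limit(entry):
    """Return True if the mount options include an explicit size= setting."""
    return any(opt.startswith('size=') for opt in entry.fs_mntops)


entry = parse_fstab_line(
    'tmpfs           /media/pg_ram_drive  tmpfs   defaults,noatime,mode=1777 0 0')
```

Here has_size_limit(entry) is False. Adding size=512M to the options column would keep the same limit as the drive mounted by hand.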

Copyright © 2014 - 2024 Mark Litwintschik. This site's template is based off a template by Giulio Fidente.