More Flexible Test Databases in Flask

Published: 2022-11-07

Category: Code | Tags: database, flask, unittest


I've been longing for an easier way to manage test data in Flask. Specifically, when running automated tests, I wanted an easier way to populate a database with some known values which would then be used in the tests themselves. This turned out to be trickier than I thought, but I learned a bunch along the way and I'll share that process in detail here.

Why I needed test data

I tend to focus on integration tests - I'm interested in how the application takes in requests and returns a response. Having test data in my database allows me to define test results easily. I know what types of responses I should be getting from each route and dynamically loaded data from a JSON file allows me to quickly define those results over and over.

Up until this point, I would create database objects like normal, using a model constructor:

import unittest

from myapp.extensions import db  # assumed location of the SQLAlchemy instance
from myapp.models import Event, User


class MyTestClass(unittest.TestCase):
  def setUp(self):
    user = User(name="My name", email="myname@example.com")
    user2 = User(name="Another name", email="another@example.com")

    event = Event(title="Some event")

    db.session.add_all([user, user2, event])
    db.session.commit()

  # the rest of the tests

class AnotherTestClass(unittest.TestCase):
  def setUp(self):
    # do the same thing...
    pass

The problem with this is that it is extremely repetitive. Each test (or each TestCase instance) has its own database declarations which have to be loaded when the test is run. That means I'm either typing each record for each test or I'm copy/pasting items in between tests. If my routes ever change I have to change each instance of the test as a result, which is no fun.

Using libraries

I came across two libraries, but neither really solved my problem, each for a different reason.

Flask Fixtures is a library which allows you to run unit tests based on JSON representations of your data. It takes in a list of JSON files and then populates an in-memory SQLite database. I tried this method, but the library hasn't been updated in several years and didn't play well with Flask's application factory pattern.

I had Factory Boy recommended, and while tempting, I needed to have consistent data in memory to run tests against. That said, I'll probably come back to Factory Boy for generating large data sets where I have more freedom in how I test.

I like the pattern of using JSON to populate a test database on the fly. I ended up writing my own, much simplified, version of Flask Fixtures.

JSON structure

I followed the pattern in Flask Fixtures because it provides a clear, extensible way of loading data into the application.

[
  {
    "table": "user",
    "records": [
      {
        "id": 1,
        "name": "Admin",
        "email": "admin@example.com",
        "usertype_id": 1,
        "location_id": 1
      }
    ]
  }
]

Each file can be expanded as necessary, adding new items or new files to expand the test database scope on the fly. These files live inside /test/fixtures in my project tree.
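Because every fixture file shares this shape, a small sanity check can catch a malformed file before it ever reaches the database. The following is only a sketch and not part of the loader described in this post: `validate_fixture` is a hypothetical helper name, and the rules it enforces are assumptions drawn from the structure shown above.

```python
import json


def validate_fixture(text):
    """Parse fixture JSON and return a {table_name: record_count} summary.

    Raises ValueError if the document does not match the expected shape.
    """
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("fixture must be a list of table entries")

    summary = {}
    for entry in data:
        # Each entry names a table and carries a list of row dictionaries.
        if not isinstance(entry, dict) or "table" not in entry or "records" not in entry:
            raise ValueError("each entry needs 'table' and 'records' keys")
        if not isinstance(entry["records"], list):
            raise ValueError("'records' must be a list")
        summary[entry["table"]] = len(entry["records"])
    return summary


# Inline stand-in for the contents of a fixture file.
example = """
[
  {
    "table": "user",
    "records": [
      {"id": 1, "name": "Admin", "email": "admin@example.com"}
    ]
  }
]
"""

print(validate_fixture(example))
```

Running a check like this over each file in the fixtures directory would surface a missing bracket or a misspelled key as a clear error instead of a confusing database failure.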

Dynamically loading test data

Instead of defining database records at the start of each test, I now define records in JSON files which can be loaded on demand within a test or set of tests. The biggest change in my app structure was to handle application context appropriately.

A new Loader class is instantiated with the current application instance, the database, and a list of fixtures to load into SQLite. It only runs within the current application context, so I can control when loading happens within individual tests, even loading data after the setUp function has run.

import json
import os
import unittest

from sqlalchemy import Table

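from app import create_app  # assumed location of the application factory
from app.extensions import db
from config import TestConfig  # assumed location of the test configuration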

class Loader(object):
    """
    Reusable class for loading fixture data into test databases.
    Initialize with an in-context application and database engine.
    """

    def __init__(self, app, db, fixtures):
        self.app = app
        self.connection = db.engine.connect()
        self.fixtures = fixtures
        self.metadata = db.metadata

    def load(self):
        for filename in self.fixtures:
            filepath = os.path.join(self.app.config["FIXTURES_DIR"], filename)
            with open(filepath) as file_in:
                self.data = json.load(file_in)
                self.load_from_file()

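    def load_from_file(self):
        # A fixture file is a list of table entries, so load every entry
        # rather than only the first one.
        for entry in self.data:
            table = Table(entry["table"], self.metadata)
            self.connection.execute(table.insert(), entry["records"])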


class MyTest(unittest.TestCase):
    def create(self):
        self.app = create_app(TestConfig)

        # Build the database structure in the application context
        with self.app.app_context():
            db.init_app(self.app)
            db.create_all()
        return self.app

    def setUp(self):
        self.app = self.create()

        # Set up the application context manually to build the database
        # and test client for requests. Keep a reference so tearDown can pop it.
        self.ctx = self.app.app_context()
        self.ctx.push()

        self.client = self.app.test_client()

        # Include any data to be loaded into the database
        fixtures = [
            "events.json",
            "users.json",
        ]

        # Now that we're in context, we can load the database.
        loader = Loader(self.app, db, fixtures)
        loader.load()

    def tearDown(self):
        db.session.remove()
        db.drop_all()
        # Pop the context pushed in setUp so contexts don't pile up across tests.
        self.ctx.pop()
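The loader reads its fixture directory from the FIXTURES_DIR config key, which has to be defined somewhere in the test configuration. A minimal sketch of what that configuration might look like follows. Everything in it is an assumption: the in-memory SQLite URI, the directory layout, and the hypothetical `fixture_path` helper, which exists only to show how the path is resolved.

```python
import os


class TestConfig:
    """Hypothetical test configuration; every value here is an assumption."""

    TESTING = True
    # In-memory SQLite database, discarded when the test process ends.
    SQLALCHEMY_DATABASE_URI = "sqlite:///:memory:"
    # Fixture directory, resolved against the current working directory
    # (assumed to be the project root when the tests run).
    FIXTURES_DIR = os.path.join(os.getcwd(), "test", "fixtures")


def fixture_path(config, filename):
    # Mirrors the os.path.join call inside Loader.load().
    return os.path.join(config.FIXTURES_DIR, filename)


print(fixture_path(TestConfig, "users.json"))
```

Keeping the directory in config rather than hard-coding it in the loader means a different project layout only needs a one-line change.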

Main takeaways

My biggest frustration was figuring out application context. This update to my test runner included moving to an application factory pattern, so I had to rethink how everything ran from the ground up. In the main application, context is handled by the create_app function and I didn't have to think about what context was active. In the tests, that has to be done manually with each instance. Moving the app context into setUp ensured only one context was used at a given time.

Being self-taught, I do my best to apply best practice principles like "don't repeat yourself" (DRY). The repetition in my old setup became especially noticeable as the number of tests increased, and I'm happy with this solution. There's still some boilerplate for each test case and one of my goals is to wrap that up in a unittest.TestCase subclass so I can simply inherit the boilerplate rather than type it out. I'm still working on the best way to do that for my use case.
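One possible shape for that base class is sketched below. It is not the finished solution. To stay runnable on its own it uses the standard library's sqlite3 module in place of Flask and SQLAlchemy, and the names `FixtureTestCase`, `fixture_sources`, `schema`, and `load_fixture` are all hypothetical. The idea it illustrates is that each test class only declares which fixtures it needs and inherits the rest.

```python
import json
import sqlite3
import unittest


class FixtureTestCase(unittest.TestCase):
    """Hypothetical base class: subclasses only list the fixtures they need."""

    # Maps a fixture name to its JSON text; a real project would read files.
    fixture_sources = {}
    # Names of the fixtures to load before each test.
    fixtures = []
    # Schema statements run before loading; stands in for db.create_all().
    schema = []

    def setUp(self):
        self.conn = sqlite3.connect(":memory:")
        # Close the connection even if a later setUp step or the test fails.
        self.addCleanup(self.conn.close)
        for statement in self.schema:
            self.conn.execute(statement)
        for name in self.fixtures:
            self.load_fixture(json.loads(self.fixture_sources[name]))

    def load_fixture(self, data):
        # Table and column names come from trusted fixture files, so they
        # are interpolated directly; the values use bound parameters.
        for entry in data:
            table = entry["table"]
            for record in entry["records"]:
                columns = ", ".join(record)
                placeholders = ", ".join("?" for _ in record)
                self.conn.execute(
                    f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})',
                    list(record.values()),
                )
        self.conn.commit()


class UserTests(FixtureTestCase):
    schema = ['CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT, email TEXT)']
    fixture_sources = {
        "users.json": '[{"table": "user", "records": '
        '[{"id": 1, "name": "Admin", "email": "admin@example.com"}]}]'
    }
    fixtures = ["users.json"]

    def test_admin_is_loaded(self):
        row = self.conn.execute('SELECT name FROM "user" WHERE id = 1').fetchone()
        self.assertEqual(row[0], "Admin")
```

In the Flask version, setUp would presumably create the app, push the context, and call the Loader with `self.fixtures`, while the subclass body would shrink to the `fixtures` list and the tests themselves.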

This is probably one of the more complex problems I've had to solve on my own. The application itself is just a layer to interact with database records, so the logic itself isn't too complex. Writing my own module to handle the automated work was new and I'm happy with the result. I'm hoping to be able to expand on it and eventually (maybe?) package it up into something I can import and use in some other projects. But that's another task for another day.

Comments are always open. You can get in touch by sending me an email at brian@ohheybrian.com