TL;DR: RegreSQL brings PostgreSQL’s regression testing methodology to your application queries, catching both correctness bugs and performance regressions before they reach production.
The common problem with production changes is the ever-present “aha” moment when things suddenly slow down or outright crash. SQL is not easy to test, and practice varies widely when it comes to testing queries. Some people pretend to fix the problem by hiding it behind an ORM. Others consider SQL “just glue code” that does not deserve systematic testing. Most settle for integration tests that verify application-layer functions, never actually testing whether their queries will survive the next schema change or index modification.
PostgreSQL itself has a robust regression testing suite that has been preventing disasters in core development for decades. The database knows how to test SQL systematically; we just don’t apply the same techniques to our own queries. Enter RegreSQL, a tool originally created by Dimitri Fontaine, author of The Art of PostgreSQL (an excellent book for understanding and mastering PostgreSQL as a database system), designed to bring the same regression testing framework to our application queries.
I’ve been trying to use it for a while, but gave up several times due to missing features and limitations. Until now. I decided to spend the time necessary to pick the project up and take it to the next level.
Introduction
The promise of RegreSQL starts with the biggest strength, and perceived weakness, of SQL queries: they’re just strings. Unless you use something like sqlc (for Go), PG'OCaml, or Rust’s SQLx, which give you compile-time checking, your queries are only validated when they are executed. In the best case that means a (usually slow) test suite or integration tests; in the worst case, it means production. ORMs are another possibility, sidestepping SQL entirely (but more on that later).
But even with compile-time checking, you’re only covering one class of problems: schema mismatches. What about behavior changes after a schema migration, or performance regressions? How do you know whether your optimizations actually made things faster or just shifted the problem elsewhere?
This is where RegreSQL comes in. Instead of trying to convert SQL into something else, RegreSQL embraces the “SQL as strings” reality and applies the same testing methodology that PostgreSQL itself uses: regression testing. You write (or generate – continue reading) your SQL queries, provide input data, and RegreSQL verifies that future changes do not break those expectations.
The features don’t stop there though – it tracks performance baselines, detects common query plan regressions (like sequential scans), and gives you a framework for systematic experimentation with schema and query changes.
Basic Regression Testing
Enough with the theory. Let’s get straight into the action and see what a sample run of RegreSQL looks like:
$ regresql test
Connecting to 'postgres://radim:password123@192.168.139.28/cdstore_test'… ✓
Running regression tests...
✓ album-by-artist_list-albums-by-artist.1.json (0.00s)
✓ album-by-artist_list-albums-by-artist.2.json (0.00s)
✓ album-tracks_list-tracks-by-albumid.2.json (0.00s)
✓ album-tracks_list-tracks-by-albumid.1.json (0.00s)
✓ artist_top-artists-by-album.1.json (0.00s)
✓ genre-topn_genre-top-n.top-1.json (0.00s)
✓ genre-topn_genre-top-n.top-3.json (0.00s)
✓ genre-tracks_tracks-by-genre.json (0.00s)
Results: 8 passed, 0 failed, 8 skipped (0.00s)
In this example, based on the Chinook database (as originally used in the book The Art of PostgreSQL), RegreSQL scans the current directory (or the one provided via -C /path/to/project) for *.sql files and runs all the queries it finds against the configured PostgreSQL connection.
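On disk, such a project looks roughly as follows. The directory names are taken from paths that appear later in this post; the name of the expected-results directory is an assumption based on the original tool, so treat this as a sketch:

.
├── artist.sql              # your *.sql query files
├── album-by-artist.sql
└── regresql/
    ├── plans/              # parameter bindings per query
    ├── expected/           # expected result sets (assumption)
    ├── baselines/          # performance baselines
    └── fixtures/           # test data definitions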
A single file can contain one or several SQL queries, like the following example:
-- name: top-artists-by-album
-- Get the list of the N artists with the most albums
SELECT
    artist.name,
    count(*) AS albums
FROM
    artist
    LEFT JOIN album USING (artist_id)
GROUP BY
    artist.name
ORDER BY
    albums DESC
LIMIT :n;
The query syntax supports both positional arguments (e.g. $1, as known from the libpq library) and, preferably, psql-style variables (:varname). Each identified query (not each file) is executed 0..N times, based on a number of predefined plans, and verified for expected results: the returned data must match the expected data. Support for managing SQL files in application code is available separately via https://github.com/boringSQL/queries (Go only for now).
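For instance, the genre-topn output above comes from two plans, top-1 and top-3. A plan file supplying the :n variable for them might look like this (a sketch inferred from the test names in the output; the file path is illustrative):

# regresql/plans/genre-topn.yaml
"top-1":
  n: 1
"top-3":
  n: 3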
That is what the basic RegreSQL tool has to offer: change your schema, refactor a query, run regresql test, and immediately see what’s broken. Your test suite can now catch regressions before they are committed or shipped. The current version builds on top of this with a nicer console formatter in place of the TAP-style output, as well as JUnit, JSON, and GitHub Actions formatters for better integration into your CI/CD pipelines.
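As a sketch of what that integration can look like, here is a minimal GitHub Actions job; how regresql is installed and which output formatter is selected are deliberately left open, since only the regresql test command itself appears in this post:

name: sql-regression
on: [pull_request]
jobs:
  regresql:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: password123
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      # install regresql and load the test schema here, then:
      - run: regresql test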
Performance Regression Testing
Basic regression testing catches correctness issues: wrong results, broken queries, schema mismatches. But it leaves out another whole class of production issues: performance regressions. As unbelievable as it may sound, queries get deployed without proper indexes, or their performance changes over time. A simple query, whether handwritten SQL or ORM-generated, can go from milliseconds to seconds. You add an index that helps one query but does nothing for another. You modify a condition and accidentally force a sequential scan over millions of rows. This is where the pain starts.
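A classic instance of that last case: wrapping an indexed column in a function call quietly defeats the index. A generic sketch (not from the Chinook schema):

-- before: can use a btree index on orders (order_date)
SELECT * FROM orders WHERE order_date >= '2024-01-01';

-- after: the function call on the column forces a sequential scan
SELECT * FROM orders WHERE date_trunc('day', order_date) >= '2024-01-01';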
RegreSQL addresses this by tracking performance baselines alongside correctness. Capturing a baseline is a single command:
$ regresql baseline
Connecting to 'postgres://appuser:password123@192.168.139.28/cdstore_test'… ✓
Creating baselines directory: regresql/baselines
Creating directory 'regresql/baselines'
Creating baselines for queries:
./
Created baseline: album-by-artist_list-albums-by-artist.1.json
Created baseline: album-by-artist_list-albums-by-artist.2.json
Created baseline: album-tracks_list-tracks-by-albumid.1.json
Created baseline: album-tracks_list-tracks-by-albumid.2.json
Created baseline: artist_top-artists-by-album.1.json
Created baseline: genre-topn_genre-top-n.top-1.json
Created baseline: genre-topn_genre-top-n.top-3.json
Created baseline: genre-tracks_tracks-by-genre.json
Baselines have been created successfully!
Baseline files are stored in: regresql/baselines
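The baseline files themselves are plain JSON. As an illustration only (the field names below are my assumption, not the tool’s documented format), a baseline for one of the queries above would record something like:

{
  "query": "artist_top-artists-by-album",
  "plan": "1",
  "total_cost": 35.70
}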
The test command not only checks for regressions against the captured baselines, it also detects common bad patterns in query execution plans. For now it warns about sequential scans, both on their own and inside nested loop joins, as well as about multiple sort operations. I believe this alone can provide valuable insight and prevent accidents in production. This is also where much of the further development of RegreSQL will happen.
To demonstrate this, let’s review the test output with baselines in place:
Connecting to 'postgres://appuser:password123@192.168.139.28/cdstore_test'… ✓
Running regression tests...
✓ album-by-artist_list-albums-by-artist.1.json (0.00s)
✓ album-by-artist_list-albums-by-artist.2.json (0.00s)
✓ album-by-artist_list-albums-by-artist.1.cost (22.09 <= 22.09 * 110%) (0.00s)
⚠️ Sequential scan detected on table 'artist'
Suggestion: Consider adding an index if this table is large or this query is frequently executed
⚠️ Nested loop join with sequential scan detected
Suggestion: Add index on join column to avoid repeated sequential scans
✓ album-by-artist_list-albums-by-artist.2.cost (22.09 <= 22.09 * 110%) (0.00s)
⚠️ Sequential scan detected on table 'artist'
Suggestion: Consider adding an index if this table is large or this query is frequently executed
⚠️ Nested loop join with sequential scan detected
Suggestion: Add index on join column to avoid repeated sequential scans
✓ album-tracks_list-tracks-by-albumid.1.json (0.00s)
✓ album-tracks_list-tracks-by-albumid.2.json (0.00s)
✓ album-tracks_list-tracks-by-albumid.1.cost (8.23 <= 8.23 * 110%) (0.00s)
✓ album-tracks_list-tracks-by-albumid.2.cost (8.23 <= 8.23 * 110%) (0.00s)
✓ artist_top-artists-by-album.1.json (0.00s)
✓ artist_top-artists-by-album.1.cost (35.70 <= 35.70 * 110%) (0.00s)
⚠️ Multiple sequential scans detected on tables: album, artist
Suggestion: Review query and consider adding indexes on filtered/joined columns
✓ genre-topn_genre-top-n.top-1.json (0.00s)
✓ genre-topn_genre-top-n.top-3.json (0.00s)
✓ genre-topn_genre-top-n.top-1.cost (6610.59 <= 6610.59 * 110%) (0.00s)
⚠️ Multiple sequential scans detected on tables: genre, artist
Suggestion: Review query and consider adding indexes on filtered/joined columns
⚠️ Multiple sort operations detected (2 sorts)
Suggestion: Consider composite indexes for ORDER BY clauses to avoid sorting
⚠️ Nested loop join with sequential scan detected
Suggestion: Add index on join column to avoid repeated sequential scans
✓ genre-topn_genre-top-n.top-3.cost (6610.59 <= 6610.59 * 110%) (0.00s)
⚠️ Multiple sequential scans detected on tables: artist, genre
Suggestion: Review query and consider adding indexes on filtered/joined columns
⚠️ Multiple sort operations detected (2 sorts)
Suggestion: Consider composite indexes for ORDER BY clauses to avoid sorting
⚠️ Nested loop join with sequential scan detected
Suggestion: Add index on join column to avoid repeated sequential scans
✓ genre-tracks_tracks-by-genre.json (0.00s)
✓ genre-tracks_tracks-by-genre.cost (37.99 <= 37.99 * 110%) (0.00s)
⚠️ Multiple sequential scans detected on tables: genre, track
Suggestion: Review query and consider adding indexes on filtered/joined columns
Results: 16 passed (0.00s)
As you can see, even though no baseline was exceeded, RegreSQL detects basic bad patterns that should be addressed before the queries can be considered “production ready”.
In some cases, sequential scan detection, or cost baseline tracking itself, may be undesirable and lead to false positives. RegreSQL lets you deal with this through query metadata, as shown below.
-- name: query_name
-- metadata: key1=value1, key2=value2
SELECT ...;
At this point RegreSQL recognizes:
- notest: skip testing the query altogether (not just cost tracking)
- nobaseline: opt out of cost tracking
- noseqscanwarn: keep tracking costs, but disable sequential scan warnings
- difffloattolerance: the allowed cost increase before a test fails (currently defaults to 10%)
-- name: query_name
-- regresql: notest, nobaseline
-- regresql: noseqscanwarn
-- regresql: difffloattolerance:0.25
-- query whose cost can vary by up to 25% without being considered a failure
SELECT ...;
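Applied to the Chinook examples above, one might deliberately silence the warning for the tiny genre lookup table; a sketch reusing the query name from the earlier output:

-- name: genre-top-n
-- regresql: noseqscanwarn
-- the genre table holds a few dozen rows; a sequential scan is expected here
SELECT ...;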
ORM enters the room
ORMs abstract away the SQL, but they still generate it – and the generated SQL may have performance issues that you won’t catch until production. Consider this common scenario: you start with a simple SQLAlchemy query that works fine, then months later add eager loading for related data:
orders = (
    session.query(Order)
    .filter(Order.user_id == user_id)
    .options(
        joinedload(Order.user),
        joinedload(Order.shipping_address),
        selectinload(Order.items),  # NEW: load order items
    )
    .all()
)
That innocent selectinload(Order.items) generates an additional query, and without an index on order_items.order_id, it performs a sequential scan.
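The statement it emits looks roughly like the simplified sketch below (SQLAlchemy’s real output spells out the column list and batches the parameters), and a single index is the fix; the index name is illustrative:

-- approximately what selectinload generates
SELECT * FROM order_items WHERE order_id IN ($1, $2, $3);

-- the fix
CREATE INDEX order_items_order_id_idx ON order_items (order_id);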
RegreSQL can catch this by intercepting ORM-generated SQL using SQLAlchemy’s event system:
from sqlalchemy import event

captured_queries = []

@event.listens_for(engine, "before_cursor_execute")
def capture_sql(conn, cursor, statement, *args):
    captured_queries.append(statement)
Run your ORM code, capture the SQL, save it as a .sql file, and test it with RegreSQL. Performance baseline testing will then flag missing indexes before they reach production. This is currently experimental, but ORM integration is an important area of future RegreSQL development.
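A hypothetical helper for that save-to-.sql step, writing each captured statement in the -- name: format shown earlier (the file and query names are my own):

import os

os.makedirs("queries/captured", exist_ok=True)
for i, stmt in enumerate(captured_queries):
    with open(f"queries/captured/query-{i}.sql", "w") as f:
        f.write(f"-- name: captured-query-{i}\n{stmt}\n")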
Test Data Management
So far we have covered how RegreSQL verifies the correctness of queries and tracks performance regressions. But there is one important prerequisite we have only touched on in passing: every regression test requires consistent, reproducible data. Change the data or its cardinality, and your expected results become meaningless, your performance baselines drift, and your tests fall apart.
Traditional approaches to generating test data all come with problems:
- Database dumps become unmanageable: 500 MB files that you can’t review and can’t understand, that break with every schema migration, and whose data goes stale as production evolves. Which version of your test dump are you even on?
- SQL scripts may be better than dumps, but they are still imperative and difficult to maintain. You end up with INSERT statements scattered across multiple files, manually managing foreign keys, and debugging constraint violations.
- Factories in application code can work great for integration tests, but we’re testing SQL directly. Do you really want to maintain parallel data generation in your application language just for SQL tests?
- A shared test database is synonymous with the classic “works on my machine” problem. State leaks between tests, parallel execution becomes impossible, and debugging is a nightmare.
We need something that is declarative (what data, not how to insert it), reproducible (same data every time), composable (build complex scenarios from simple pieces), and scalable (from 10 rows to 100,000).
This is where the next improvement comes in: RegreSQL’s fixture system. Think of it as infrastructure-as-code for your test data. You describe the data you need in YAML files, and RegreSQL handles the rest: dependencies, cleanup, foreign keys, and even large-scale realistic data generation.
RegreSQL’s fixture system lets you define test data in YAML files stored in regresql/fixtures/. Here is a simple example:
fixture: basic_users
description: a handful of test users
cleanup: rollback
data:
  - table: users
    rows:
      - id: 1
        email: alice@example.com
        name: Alice Anderson
        created_at: 2024-01-15
      - id: 2
        email: bob@example.com
        name: Bob Builder
        created_at: 2024-02-20
To use this fixture in your tests, simply reference it in the query’s plan file (regresql/plans/get-user.yaml):
fixtures:
  - basic_users
"1":
  email: alice@example.com
"2":
  email: bob@example.com
When you run regresql test, the fixture is automatically loaded before the query executes and cleaned up afterward. No manual setup scripts, no state leaking between tests. And it doesn’t stop with static fixtures: when you want to test queries against realistic data volumes, a range of data generators is available, including:
- sequence, random integer, decimal, string, UUID, email, and name generators
- date_between, to generate random timestamps within a range
- foreign key references, to reuse data from other tables’ fixtures
- range, to select values from predefined sources
- Go template support
fixture: realistic_orders
generate:
  - table: customers
    count: 1000
    columns:
      id:
        generator: sequence
        start: 1
      email:
        generator: email
        domain: shop.example.com
      name:
        generator: name
        type: full
      created_at:
        generator: date_between
        start: "2023-01-01"
        end: "2024-12-31"
  - table: orders
    count: 5000
    columns:
      id:
        generator: sequence
        start: 1
      customer_id:
        generator: int
        min: 1
        max: 1000
      amount:
        generator: decimal
        min: 10.00
        max: 999.99
        precision: 2
      order_date:
        generator: date_between
        start: "2023-01-01"
        end: "2024-12-31"
This generates 1,000 customers and 5,000 orders with realistic-looking data: names, emails, dates, and amounts that feel like production.
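With that volume in place, cost baselines and plan warnings become meaningful. A hypothetical query against the generated tables, using only columns defined in the fixture above:

-- name: orders-by-customer
-- without an index on orders.customer_id, the sequential scan warning fires
SELECT id, amount, order_date
FROM orders
WHERE customer_id = :customer_id
ORDER BY order_date DESC;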
Fixtures are also stackable and can be built on top of each other. For example, if you need user fixtures created before order fixtures, simply declare the dependency (a planned improvement, to avoid hard-coding IDs, is support for automatic foreign-key detection). RegreSQL loads fixtures in dependency order and handles cleanup in reverse.
fixture: orders_with_shipping
depends_on:
  - basic_users
data:
  - table: orders
    rows:
      - id: 101
        user_id: 1  # references Alice from basic_users
        total: 99.99
        status: shipped
If the options available for fixtures (manual data or generators) are not enough, you can always fall back to good old SQL-based data generation.
fixture: mixed_setup
description: Combine SQL with YAML and generated data
cleanup: rollback
# SQL executes first (either as a file or inline)
sql:
  - file: sql/setup_schema.sql
  - inline: "INSERT INTO config (key, value) VALUES ('version', '1.0');"
# followed by YAML data
data:
  - table: users
    rows:
      - id: 1
        email: admin@example.com
# and finally generated data
generate:
  - table: orders
    count: 100
    columns:
      id:
        generator: sequence
        start: 1
      user_id:
        generator: int
        min: 1
        max: 1
RegreSQL also provides commands to inspect and verify your fixtures:
# List all available fixtures
regresql fixtures list
# Show fixture details and dependencies
regresql fixtures show realistic_orders
# Validate fixture definitions
regresql fixtures validate
# Show dependency graph
regresql fixtures deps
# Apply fixture manually (for debugging)
regresql fixtures apply basic_users
The fixture system is designed to transform test data from a maintenance burden into a documented, version-controlled process. Your YAML files become a single source of truth for the data needed for your tests, making it easier to understand test scenarios and maintain test data as the application evolves.
The future of RegreSQL
Launching a new open source project is an ambitious undertaking, and RegreSQL is just getting started, even though this fork has already been in use for almost two years. I’m planning further improvements, better documentation, and more tutorials in the coming weeks and months. The project is maintained as part of my boringSQL brand, where it is a key component in the creation of SQL Labs, which I fully expect to provide a foundation for its further development.
At the same time, RegreSQL is an attempt to give back to the welcoming PostgreSQL community, to improve the developer experience a little, and (just maybe) to provide one more argument against the claim that SQL queries are not testable.
RegreSQL is available on GitHub – feel free to open an issue, email me about the project at radim@boringsql.com, or connect on LinkedIn.
