SQL Testing Library for Python

Unit test SQL queries with mock data injection for BigQuery, Snowflake, Redshift, Athena, Trino, and DuckDB. Pytest integration for data engineering and ETL testing.

🎯 Why Use SQL Testing Library for Python?

SQL unit testing in data engineering can be challenging, especially when working with cloud databases like BigQuery, Snowflake, Redshift, and Athena. This Python SQL testing framework addresses critical pain points:

  • Fragile Integration Tests: Traditional SQL tests that depend on live data break when data changes, causing flaky CI/CD pipelines
  • Slow Feedback Loops: Running database tests against full datasets takes too long for continuous integration
  • Database Engine Upgrades: UDF semantics and SQL behavior change between versions (e.g., Athena v2 to v3), causing silent production failures
  • Database Lock-in: SQL tests written for BigQuery don't work on Snowflake or Redshift without rewrites
  • Complex Setup: Each cloud database requires different mocking strategies, credentials, and testing tools

Perfect for Data Engineering Teams

Whether you're building ETL pipelines, validating data transformations, or testing analytics queries, this library provides a unified pytest-based framework for SQL testing across all major cloud databases.

✨ Key Features

🚀 Multi-Database Support

Test your SQL queries across BigQuery, Athena, Redshift, Trino, Snowflake, and DuckDB with a unified API.

🎯 Type-Safe Testing

Use Python dataclasses and Pydantic models for type-safe test data and results.

⚡ Flexible Execution

Automatically switches between CTE injection and physical tables based on query size.
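
For example, a test can opt into physical-table mode explicitly when its mock data is large. This is a minimal sketch reusing the mock table classes from the Quick Example below; the use_physical_tables flag is an assumption here, so confirm the exact option name in the API Reference.

@sql_test(
    mock_tables=[UsersMockTable(many_user_rows)],  # many_user_rows: a large, hypothetical list of User rows
    result_class=UserResult
)
def test_with_physical_tables():
    return TestCase(
        query="SELECT user_id, name FROM users",
        default_namespace="test_db",
        use_physical_tables=True,  # assumed flag: bypass CTE injection, load mocks into temp tables
    )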

🧪 Pytest Integration

Seamlessly integrates with pytest using the @sql_test decorator.

📊 Comprehensive Type Support

Supports primitive types, arrays, decimals, dates, optional values, and struct types (Athena/Trino/BigQuery) across databases.

πŸ” SQL Logging & Debugging

Automatic SQL logging with formatted output, temp table queries, and full error tracebacks for easy debugging.

📋 Quick Example

from dataclasses import dataclass
from sql_testing_library import sql_test, TestCase
from sql_testing_library.mock_table import BaseMockTable
from pydantic import BaseModel

# Shape of the rows injected into the mocked `users` table
@dataclass
class User:
    user_id: int
    name: str
    email: str

# Expected shape of each row in the query result
class UserResult(BaseModel):
    user_id: int
    name: str

# Binds the mock rows to the database/table name the query references
class UsersMockTable(BaseMockTable):
    def get_database_name(self) -> str:
        return "test_db"

    def get_table_name(self) -> str:
        return "users"

@sql_test(
    mock_tables=[
        UsersMockTable([
            User(1, "Alice", "alice@example.com"),
            User(2, "Bob", "bob@example.com")
        ])
    ],
    result_class=UserResult
)
def test_user_query():
    return TestCase(
        query="SELECT user_id, name FROM users WHERE user_id = 1",
        default_namespace="test_db"
    )
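
Run it like any other pytest test; the decorator handles mock injection and result parsing:

pytest test_users.py -v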

πŸ—οΈ Supported Databases

Database    CTE Mode   Physical Tables   Query Size Limit
BigQuery    ✅         ✅                ~1MB
Athena      ✅         ✅                256KB
Redshift    ✅         ✅                16MB
Trino       ✅         ✅                ~16MB
Snowflake   ✅         ✅                1MB
DuckDB      ✅         ✅                No limit

Data Type Support

✅ Supported Types (see the sketch after these lists):

  • String
  • Integer
  • Float
  • Boolean
  • Date
  • Datetime
  • Decimal
  • Arrays
  • Map/Dict types (Dict[K, V])
  • Optional/Nullable types
  • Struct/Record types (Athena/Trino/BigQuery - using dataclasses or Pydantic models)

❌ Not Yet Supported:

  • Struct/Record types for Redshift and Snowflake
  • Nested Arrays (arrays of arrays)
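
A single mock row can mix these types freely. The sketch below is illustrative only — the Address struct and all field names are made up — but every annotation shown maps to a supported type above (structs apply to Athena/Trino/BigQuery only):

from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Dict, List, Optional

@dataclass
class Address:  # hypothetical struct type (Athena/Trino/BigQuery only)
    city: str
    zip_code: str

@dataclass
class Order:  # hypothetical mock row mixing supported types
    order_id: int
    amount: Decimal
    placed_on: date
    updated_at: datetime
    tags: List[str]              # array type
    metadata: Dict[str, str]     # map/dict type
    coupon: Optional[str]        # nullable column
    shipping: Optional[Address]  # struct type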

📚 Documentation

Getting Started

Installation, configuration, and your first test

Database Adapters

Configure and use different database engines

API Reference

Complete reference for all classes and methods

Mocksmith Integration

Generate realistic test data automatically

🚀 Getting Started

Installation

# Install with specific database support
pip install sql-testing-library[bigquery]
pip install sql-testing-library[athena]
pip install sql-testing-library[redshift]
pip install sql-testing-library[trino]
pip install sql-testing-library[snowflake]
pip install sql-testing-library[duckdb]

# Or install with all database adapters
pip install sql-testing-library[all]

Configuration

Create a pytest.ini file in your project root:

[sql_testing]
# Choose your database engine
adapter = bigquery

[sql_testing.bigquery]
project_id = my-test-project
dataset_id = test_dataset
credentials_path = path/to/credentials.json

Write Your First Test

from sql_testing_library import sql_test, TestCase

@sql_test
def test_simple_query():
    return TestCase(
        query="SELECT 1 as value",
        result_class=dict
    )

🤝 Contributing

Contributions are welcome! Please check out our Contributing Guide for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

Built with ❤️ by the data engineering community. Special thanks to all contributors.


🤔 Frequently Asked Questions

How do I unit test SQL queries in Python?

Use SQL Testing Library with pytest to write unit tests for SQL queries. The library injects mock data via CTEs or temporary tables, allowing you to test query logic without accessing real databases. Perfect for testing BigQuery, Snowflake, Redshift, Athena, Trino, and DuckDB queries.

Can I test BigQuery SQL queries without a BigQuery account?

Yes! SQL Testing Library creates temporary mock data within your BigQuery project, so you only need access to a test project. No production data needed. The library works with BigQuery's free tier.

How do I test Snowflake SQL queries locally?

Configure the Snowflake adapter in pytest.ini with your test warehouse credentials. The library creates temporary tables that clean up automatically after each test. Alternatively, use the DuckDB adapter for fully local testing with similar SQL syntax.
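
For the fully local option, a DuckDB setup can be this small. The [sql_testing.duckdb] key shown is an assumption; check the Database Adapters docs for the authoritative option names:

[sql_testing]
adapter = duckdb

[sql_testing.duckdb]
# assumed key name; an in-memory database keeps tests fully local
database = :memory: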

What's the best way to test ETL pipelines?

Use SQL Testing Library to test individual SQL transformations with controlled mock data. This approach is faster and more reliable than end-to-end integration tests. Write pytest tests for each transformation step in your data pipeline.
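
For instance, a single aggregation step can be pinned down with a few mock rows. OrderRow, OrdersMockTable, and OrderTotal below are illustrative names following the Quick Example's pattern:

from dataclasses import dataclass
from pydantic import BaseModel
from sql_testing_library import sql_test, TestCase
from sql_testing_library.mock_table import BaseMockTable

@dataclass
class OrderRow:  # illustrative input row for a mocked `orders` table
    order_id: int
    user_id: int
    amount: float

class OrderTotal(BaseModel):  # illustrative result shape for the aggregation
    user_id: int
    total: float

class OrdersMockTable(BaseMockTable):
    def get_database_name(self) -> str:
        return "test_db"

    def get_table_name(self) -> str:
        return "orders"

@sql_test(
    mock_tables=[
        OrdersMockTable([
            OrderRow(1, 1, 10.0),
            OrderRow(2, 1, 5.0),
            OrderRow(3, 2, 7.5),
        ])
    ],
    result_class=OrderTotal
)
def test_order_totals():
    return TestCase(
        query="SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id",
        default_namespace="test_db"
    )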

Can I use this for data validation testing?

Absolutely! SQL Testing Library is perfect for testing data validation rules, business logic, and quality checks written in SQL. Test assertions, constraints, and transformations with type-safe mock data using Python dataclasses or Pydantic models.

Does this work with dbt?

Yes! You can test dbt SQL models by extracting the compiled SQL and testing it with SQL Testing Library. This provides unit-level testing for your dbt transformations before running full dbt tests.
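
A minimal sketch of that workflow, reusing the Quick Example's mock table. The compiled-SQL path is illustrative and depends on your dbt project layout, and the mock table names must match the relations the compiled model actually references:

from pathlib import Path

# Illustrative path: dbt writes compiled models under target/compiled/<project>/...
COMPILED_MODEL = Path("target/compiled/my_project/models/staging/stg_users.sql")

@sql_test(
    mock_tables=[UsersMockTable([User(1, "Alice", "alice@example.com")])],
    result_class=UserResult
)
def test_stg_users_model():
    return TestCase(
        query=COMPILED_MODEL.read_text(),
        default_namespace="test_db"
    )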

How do I test SQL queries across multiple databases?

Write your test once and use different adapters (BigQuery, Snowflake, Redshift, etc.) by specifying adapter_type in the @sql_test decorator. Perfect for testing query compatibility when migrating between cloud databases.
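
In practice that looks roughly like this, again reusing the Quick Example's classes; the adapter string is assumed to match the adapter names used in pytest.ini:

@sql_test(
    adapter_type="snowflake",  # run this test against Snowflake instead of the configured default
    mock_tables=[UsersMockTable([User(1, "Alice", "alice@example.com")])],
    result_class=UserResult
)
def test_query_on_snowflake():
    return TestCase(
        query="SELECT user_id, name FROM users",
        default_namespace="test_db"
    )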