SQL Testing Library for Python
Unit test SQL queries with mock data injection for BigQuery, Snowflake, Redshift, Athena, Trino, and DuckDB. Pytest integration for data engineering and ETL testing.
Why Use SQL Testing Library for Python?
SQL unit testing in data engineering can be challenging, especially when working with cloud databases like BigQuery, Snowflake, Redshift, and Athena. This Python SQL testing framework addresses critical pain points:
- Fragile Integration Tests: Traditional SQL tests that depend on live data break when data changes, causing flaky CI/CD pipelines
- Slow Feedback Loops: Running database tests against full datasets takes too long for continuous integration
- Database Engine Upgrades: UDF semantics and SQL behavior change between versions (e.g., Athena v2 to v3), causing silent production failures
- Database Lock-in: SQL tests written for BigQuery don't work on Snowflake or Redshift without rewrites
- Complex Setup: Each cloud database requires different mocking strategies, credentials, and testing tools
Perfect for Data Engineering Teams
Whether you're building ETL pipelines, validating data transformations, or testing analytics queries, this library provides a unified pytest-based framework for SQL testing across all major cloud databases.
Key Features
Multi-Database Support
Test your SQL queries across BigQuery, Athena, Redshift, Trino, Snowflake, and DuckDB with a unified API.
Type-Safe Testing
Use Python dataclasses and Pydantic models for type-safe test data and results.
Flexible Execution
Automatically switches between CTE injection and physical tables based on query size.
Pytest Integration
Seamlessly integrates with pytest using the @sql_test decorator.
Comprehensive Type Support
Supports primitive types, arrays, decimals, dates, optional values, and struct types (Athena/Trino/BigQuery) across databases.
SQL Logging & Debugging
Automatic SQL logging with formatted output, temp table queries, and full error tracebacks for easy debugging.
Quick Example
from dataclasses import dataclass
from sql_testing_library import sql_test, TestCase
from sql_testing_library.mock_table import BaseMockTable
from pydantic import BaseModel


@dataclass
class User:
    user_id: int
    name: str
    email: str


class UserResult(BaseModel):
    user_id: int
    name: str


class UsersMockTable(BaseMockTable):
    def get_database_name(self) -> str:
        return "test_db"

    def get_table_name(self) -> str:
        return "users"


@sql_test(
    mock_tables=[
        UsersMockTable([
            User(1, "Alice", "alice@example.com"),
            User(2, "Bob", "bob@example.com")
        ])
    ],
    result_class=UserResult
)
def test_user_query():
    return TestCase(
        query="SELECT user_id, name FROM users WHERE user_id = 1",
        default_namespace="test_db"
    )
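When pytest runs this test, the library substitutes the mock rows for test_db.users, either inline as a CTE or through a temporary table depending on query size, and maps the query output onto UserResult instances.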
Supported Databases
| Database | CTE Mode | Physical Tables | Query Size Limit |
|---|---|---|---|
| BigQuery | ✅ | ✅ | ~1MB |
| Athena | ✅ | ✅ | 256KB |
| Redshift | ✅ | ✅ | 16MB |
| Trino | ✅ | ✅ | ~16MB |
| Snowflake | ✅ | ✅ | 1MB |
| DuckDB | ✅ | ✅ | No limit |
Data Types Support
Supported Types:
- String
- Integer
- Float
- Boolean
- Date
- Datetime
- Decimal
- Arrays
- Map/Dict types (Dict[K, V])
- Optional/Nullable types
- Struct/Record types (Athena/Trino/BigQuery - using dataclasses or Pydantic models)
Not Yet Supported:
- Struct/Record types for Redshift and Snowflake
- Nested Arrays (arrays of arrays)
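As an illustration, a single mock row covering several of the supported types above can be declared with a plain dataclass (the Order table and its columns here are hypothetical):

from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass
class Order:
    order_id: int                 # Integer
    customer: str                 # String
    total: Decimal                # Decimal
    discount: Optional[Decimal]   # Optional/nullable value
    tags: List[str]               # Array
    attributes: Dict[str, str]    # Map/Dict
    shipped: bool                 # Boolean
    ordered_on: date              # Date
    updated_at: datetime          # Datetime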
Documentation
Getting Started
Installation
# Install with specific database support
pip install sql-testing-library[bigquery]
pip install sql-testing-library[athena]
pip install sql-testing-library[redshift]
pip install sql-testing-library[trino]
pip install sql-testing-library[snowflake]
pip install sql-testing-library[duckdb]
# Or install with all database adapters
pip install sql-testing-library[all]
Configuration
Create a pytest.ini file in your project root:
[sql_testing]
adapter = bigquery # Choose your database
[sql_testing.bigquery]
project_id = my-test-project
dataset_id = test_dataset
credentials_path = path/to/credentials.json
Write Your First Test
from sql_testing_library import sql_test, TestCase


@sql_test
def test_simple_query():
    return TestCase(
        query="SELECT 1 as value",
        result_class=dict
    )
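Then run it like any other pytest test, for example with pytest -v from your project root.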
Contributing
Contributions are welcome! Please check out our Contributing Guide for details.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built with ❤️ by the data engineering community. Special thanks to all contributors.
Frequently Asked Questions
How do I unit test SQL queries in Python?
Use SQL Testing Library with pytest to write unit tests for SQL queries. The library injects mock data via CTEs or temporary tables, allowing you to test query logic without accessing real databases. Perfect for testing BigQuery, Snowflake, Redshift, Athena, Trino, and DuckDB queries.
Can I test BigQuery SQL queries without a BigQuery account?
You only need access to a test project, not production data: SQL Testing Library creates temporary mock data within that project, and the library works with BigQuery's free tier.
How do I test Snowflake SQL queries locally?
Configure the Snowflake adapter in pytest.ini with your test warehouse credentials. The library creates temporary tables that are cleaned up automatically after each test. Alternatively, use the DuckDB adapter for fully local testing with similar SQL syntax.
What's the best way to test ETL pipelines?
Use SQL Testing Library to test individual SQL transformations with controlled mock data. This approach is faster and more reliable than end-to-end integration tests. Write pytest tests for each transformation step in your data pipeline.
Can I use this for data validation testing?
Absolutely! SQL Testing Library is perfect for testing data validation rules, business logic, and quality checks written in SQL. Test assertions, constraints, and transformations with type-safe mock data using Python dataclasses or Pydantic models.
Does this work with dbt?
Yes! You can test dbt SQL models by extracting the compiled SQL and testing it with SQL Testing Library. This provides unit-level testing for your dbt transformations before running full dbt tests.
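One way to do this, following the same pattern as the Quick Example above, is to read the model's compiled SQL from dbt's target/ directory; the project name, model, mock table, and columns below are illustrative:

from dataclasses import dataclass
from datetime import date
from pathlib import Path

from pydantic import BaseModel
from sql_testing_library import sql_test, TestCase
from sql_testing_library.mock_table import BaseMockTable


@dataclass
class Order:
    order_id: int
    order_date: date
    amount: float


class DailyRevenue(BaseModel):
    order_date: date
    revenue: float


class OrdersMockTable(BaseMockTable):
    def get_database_name(self) -> str:
        return "analytics"

    def get_table_name(self) -> str:
        return "orders"


# SQL emitted by `dbt compile` for the model under test (path is illustrative)
compiled_sql = Path("target/compiled/my_project/models/daily_revenue.sql").read_text()


@sql_test(
    mock_tables=[OrdersMockTable([
        Order(1, date(2024, 1, 1), 19.99),
        Order(2, date(2024, 1, 1), 5.00),
    ])],
    result_class=DailyRevenue,
)
def test_daily_revenue_model():
    return TestCase(query=compiled_sql, default_namespace="analytics")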
How do I test SQL queries across multiple databases?
Write your test once and use different adapters (BigQuery, Snowflake, Redshift, etc.) by specifying adapter_type in the @sql_test decorator. Perfect for testing query compatibility when migrating between cloud databases.
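For instance, reusing the User, UsersMockTable, and UserResult classes from the Quick Example, the same test can target another engine just by changing the adapter (assuming the matching adapter section is configured in your pytest.ini):

@sql_test(
    adapter_type="snowflake",  # or "bigquery", "athena", "redshift", "trino", "duckdb"
    mock_tables=[
        UsersMockTable([
            User(1, "Alice", "alice@example.com"),
            User(2, "Bob", "bob@example.com")
        ])
    ],
    result_class=UserResult
)
def test_user_query_on_snowflake():
    return TestCase(
        query="SELECT user_id, name FROM users WHERE user_id = 1",
        default_namespace="test_db"
    )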