Getting Started
Table of contents
- Installation
- Configuration
- Writing Your First Test
- Understanding the Basics
- Common Patterns
- Next Steps
- Quick Tips
Installation
The SQL Testing Library supports multiple database backends. Install only what you need:
Install with specific database support
# For BigQuery
pip install sql-testing-library[bigquery]
# For Athena
pip install sql-testing-library[athena]
# For Redshift
pip install sql-testing-library[redshift]
# For Trino
pip install sql-testing-library[trino]
# For Snowflake
pip install sql-testing-library[snowflake]
Install with all database adapters
pip install sql-testing-library[all]
Development installation
If you’re contributing to the library:
# Clone the repository
git clone https://github.com/gurmeetsaran/sqltesting.git
cd sqltesting
# Install with poetry
poetry install --with bigquery,athena,redshift,trino,snowflake,dev
Configuration
The library uses pytest configuration files to manage database connections. Create a pytest.ini
file in your project root:
Basic configuration
[sql_testing]
adapter = bigquery # Choose: bigquery, athena, redshift, trino, or snowflake
Database-specific configuration
BigQuery
[sql_testing.bigquery]
project_id = my-test-project
dataset_id = test_dataset
credentials_path = path/to/service-account-key.json
Athena
[sql_testing.athena]
database = test_database
s3_output_location = s3://my-athena-results/
region = us-west-2
# Optional: if not using default AWS credentials
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
Redshift
[sql_testing.redshift]
host = redshift-cluster.region.redshift.amazonaws.com
database = test_database
user = redshift_user
password = redshift_password
port = 5439 # Optional, defaults to 5439
Trino
[sql_testing.trino]
host = trino-server.example.com
port = 8080 # Optional, defaults to 8080
user = trino_user
catalog = memory # Optional, defaults to 'memory'
schema = default # Optional, defaults to 'default'
http_scheme = http # Optional, use 'https' for secure connections
# For Basic Authentication:
auth_type = basic
password = trino_password
# For JWT Authentication:
# auth_type = jwt
# token = your_jwt_token
Snowflake
[sql_testing.snowflake]
account = account-identifier
user = snowflake_user
database = test_database
schema = PUBLIC # Optional, defaults to 'PUBLIC'
warehouse = compute_wh # Required
role = my_role # Optional
# Authentication (choose one):
# Option 1: Key-pair authentication (recommended for MFA)
private_key_path = /path/to/private_key.pem
# Or use environment variable SNOWFLAKE_PRIVATE_KEY
# Option 2: Password authentication (for accounts without MFA)
password = snowflake_password
Writing Your First Test
1. Create a mock table
from dataclasses import dataclass
from sql_testing_library.mock_table import BaseMockTable
@dataclass
class User:
user_id: int
name: str
email: str
active: bool
class UsersMockTable(BaseMockTable):
def get_database_name(self) -> str:
return "test_db" # Your test database/dataset
def get_table_name(self) -> str:
return "users"
2. Write a test with the @sql_test decorator
from sql_testing_library import sql_test, TestCase
from pydantic import BaseModel
# Define result model
class ActiveUserResult(BaseModel):
user_id: int
name: str
# Create test
@sql_test(
mock_tables=[
UsersMockTable([
User(1, "Alice", "alice@example.com", True),
User(2, "Bob", "bob@example.com", False),
User(3, "Charlie", "charlie@example.com", True)
])
],
result_class=ActiveUserResult
)
def test_active_users():
return TestCase(
query="""
SELECT user_id, name
FROM users
WHERE active = true
ORDER BY user_id
""",
default_namespace="test_db"
)
3. Run your test
# Run all tests
pytest test_users.py
# Run only SQL tests (using the sql_test marker)
pytest -m sql_test
# Exclude SQL tests from your test run
pytest -m "not sql_test"
# Run a specific test
pytest test_users.py::test_active_users -v
# With poetry
poetry run pytest test_users.py
# Combine markers with other pytest options
pytest -m sql_test -v --tb=short
Note: The @sql_test
decorator automatically adds a pytest marker to your tests, making it easy to run or exclude SQL tests from your test suite.
Understanding the Basics
Mock Tables
Mock tables represent your database tables with test data:
- Inherit from
BaseMockTable
- Implement
get_database_name()
andget_table_name()
- Pass data as a list of dataclasses, dictionaries, or objects
Test Cases
The TestCase
class defines your SQL test:
query
: The SQL query to testdefault_namespace
: Database context for unqualified table namesmock_tables
: List of mock tables (can be in decorator or TestCase)result_class
: Class for deserializing results (dataclass or Pydantic)
Result Classes
Define expected results using:
- Python dataclasses
- Pydantic models
- Dict (for simple key-value results)
Common Patterns
Pattern 1: All configuration in decorator
@sql_test(
mock_tables=[users_mock, orders_mock],
result_class=OrderSummary
)
def test_order_summary():
return TestCase(
query="SELECT * FROM orders JOIN users ON orders.user_id = users.id",
default_namespace="test_db"
)
Pattern 2: All configuration in TestCase
@sql_test() # Empty decorator
def test_order_summary():
return TestCase(
query="SELECT * FROM orders JOIN users ON orders.user_id = users.id",
default_namespace="test_db",
mock_tables=[users_mock, orders_mock],
result_class=OrderSummary
)
Pattern 3: Using physical tables for large datasets
@sql_test(
mock_tables=[large_dataset_mock],
result_class=ResultClass,
use_physical_tables=True # Force physical tables
)
def test_large_dataset():
return TestCase(
query="SELECT * FROM large_table",
default_namespace="test_db"
)
Next Steps
- Learn about database adapters
- Explore advanced examples
- Read the API reference
- Debug with SQL logging
- Troubleshooting guide
Quick Tips
- Start Simple: Begin with basic queries and gradually add complexity
- Use Type Hints: Leverage dataclasses and Pydantic for type safety
- Test Incrementally: Test individual CTEs and subqueries separately
- Mock Realistically: Use representative test data that matches production schemas
- Check Query Plans: Use
EXPLAIN
to understand how your queries execute