Database Migrations#

This guide explains how to use database migrations in the smartem-decisions project.

Overview#

We use Alembic for database schema versioning and migrations. Alembic integrates with our SQLModel-based database models to provide:

Schema versioning: Track database changes over time
Automated migration generation: Generate migrations from model changes
Rollback support: Safely revert database changes
Environment-specific deployments: Apply migrations across development, staging, and production

Prerequisites#

Ensure you have the backend dependencies installed:

pip install -e .[backend]

This includes Alembic as a dependency.

Basic Usage#

Initial Database Setup#

For a completely new database, run the following command to apply all migrations from scratch:

python -m alembic upgrade head

This will:

Create the Alembic version tracking table
Apply migration 001: Create user preferences table
Apply baseline migration (6e6302f1ccb6): Create all core SmartEM schema tables (acquisition, grid, gridsquare, micrograph, foilhole, atlas, quality prediction models, etc.)
Apply migration 002: Add performance indexes for acquisition datetime queries
Apply remaining migrations including agent communication tables

Running Migrations#

Apply all pending migrations to bring your database up to the latest schema:

python -m alembic upgrade head

Checking Migration Status#

See the current database version:

python -m alembic current

View migration history:

python -m alembic history --verbose

Rolling Back Migrations#

Downgrade to a specific revision:

python -m alembic downgrade <revision_id>

Rollback one migration:

python -m alembic downgrade -1

Creating New Migrations#

Auto-generating Migrations#

When you modify SQLModel classes in src/smartem_backend/model/database.py, generate a migration automatically:

python -m alembic revision --autogenerate -m "Add new field to Grid model"

Important: Always review auto-generated migrations before applying them. Alembic may not detect all changes (like column renames or complex constraints).

Manual Migrations#

For data migrations or complex schema changes, create an empty migration:

python -m alembic revision -m "Seed initial user data"

Edit the generated file in src/smartem_backend/migrations/versions/ to add your custom logic.

Migration Examples#

Structure Changes#

def upgrade() -> None:
    op.create_table('user_preferences',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('user_id', sa.String(), nullable=False),
        sa.Column('preference_key', sa.String(255), nullable=False),
        sa.Column('preference_value', postgresql.JSONB(), nullable=True),
        sa.PrimaryKeyConstraint('id')
    )

def downgrade() -> None:
    op.drop_table('user_preferences')

Index Changes#

def upgrade() -> None:
    op.create_index(
        'idx_gridsquare_acquisition_datetime_status',
        'gridsquare',
        ['acquisition_datetime', 'status'],
        postgresql_using='btree'
    )

def downgrade() -> None:
    op.drop_index('idx_gridsquare_acquisition_datetime_status')

Data Changes#

from sqlalchemy.sql import table, column

def upgrade() -> None:
    user_preferences = table('user_preferences',
        column('user_id', sa.String),
        column('preference_key', sa.String),
        column('preference_value', sa.JSON)
    )
    
    op.bulk_insert(user_preferences, [
        {
            'user_id': 'system',
            'preference_key': 'default_settings',
            'preference_value': {'theme': 'light', 'page_size': 50}
        }
    ])

def downgrade() -> None:
    op.execute("DELETE FROM user_preferences WHERE user_id = 'system'")

Best Practices#

Migration Safety#

Always backup production data before running migrations
Test migrations on a copy of production data first
Review auto-generated migrations carefully before applying
Write reversible migrations with proper downgrade logic

Schema Changes#

Add columns as nullable first, then make them non-nullable in a separate migration if needed
Use separate migrations for structure and data changes
Include appropriate indexes for performance-critical fields
Document breaking changes in migration comments

Development Workflow#

Create feature branch for database changes
Generate migration after modifying models
Test migration locally (up and down)
Review migration code before committing
Coordinate with team for schema changes affecting multiple developers

Deployment#

Production Deployment#

# 1. Backup database
pg_dump smartem_production > backup_$(date +%Y%m%d_%H%M%S).sql

# 2. Apply migrations
python -m alembic upgrade head

# 3. Verify application works
# 4. Clean up old backups (optional)

Staging Environment#

# Apply migrations to staging first
python -m alembic upgrade head

# Test application functionality
# Run integration tests

Troubleshooting#

Common Issues#

Migration conflicts: When multiple developers create migrations simultaneously:

# Resolve by creating a merge migration
python -m alembic merge <rev1> <rev2> -m "Merge migrations"

Failed migration: If a migration fails partway through:

# Check current state
python -m alembic current

# Fix data/schema issues manually
# Mark migration as complete (if safe)
python -m alembic stamp <revision_id>

Model out of sync: When models don’t match database:

# Generate migration to sync
python -m alembic revision --autogenerate -m "Sync models with database"

Migration-Specific Issues#

Missing baseline schema: If you encounter errors like “relation ‘gridsquare’ does not exist” when running migrations:

This indicates that migration 002 is trying to create indexes on tables that don’t exist yet. This happens when the baseline schema migration is missing or not properly applied. The fix is to ensure the baseline migration (6e6302f1ccb6) is applied before migration 002.

# Solution: Run the complete migration sequence from scratch
python -m alembic downgrade base
python -m alembic upgrade head

Enum type conflicts: If you encounter “type already exists” errors:

This can happen when enum types exist from previous schema creation attempts but don’t match the migration expectations.

# Clean up existing enum types and restart migrations
python -m alembic downgrade base
# Then apply migrations fresh
python -m alembic upgrade head

Database Connection Issues#

The migration system uses your existing database connection configuration from smartem_backend.utils.setup_postgres_connection().

Ensure your environment variables are set:

Database connection parameters
PostgreSQL credentials
Network access to database server

Files and Structure#

├── alembic.ini                              # Alembic configuration
├── src/smartem_backend/migrations/
│   ├── env.py                              # Migration environment
│   ├── script.py.mako                      # Migration template
│   └── versions/                           # Individual migration files
│       ├── 001_add_user_preferences.py
│       ├── 002_add_acquisition_time_index.py
│       └── 003_seed_system_config.py