Files
sneakyklaus/docs/decisions/0005-database-migrations.md
Phil Skentelbery eaafa78cf3 feat: add Participant and MagicToken models with automatic migrations
Implements Phase 2 infrastructure for participant registration and authentication:

Database Models:
- Add Participant model with exchange scoping and soft deletes
- Add MagicToken model for passwordless authentication
- Add participants relationship to Exchange model
- Include proper indexes and foreign key constraints

Migration Infrastructure:
- Generate Alembic migration for new models
- Create entrypoint.sh script for automatic migrations on container startup
- Update Containerfile to use entrypoint script and include uv binary
- Remove db.create_all() in favor of migration-based schema management

This establishes the foundation for implementing stories 4.1-4.3, 5.1-5.3, and 10.1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 16:23:47 -07:00

12 KiB

0005. Database Migrations with Alembic

Date: 2025-12-22

Status

Accepted

Context

Sneaky Klaus uses SQLite as its database (see ADR-0001). As the application evolves, the database schema needs to change to support new features. There are two primary approaches to managing database schema:

  1. db.create_all(): SQLAlchemy's create_all() method creates tables based on current model definitions. Simple but has critical limitations:

    • Cannot modify existing tables (add/remove columns, change types)
    • Cannot migrate data during schema changes
    • No version tracking or rollback capability
    • Unsafe for databases with existing data
  2. Schema Migrations: Tools like Alembic track schema changes as versioned migration files:

    • Supports incremental schema changes (add columns, modify constraints, etc.)
    • Enables data migrations during schema evolution
    • Provides version tracking and rollback capability
    • Safe for production databases with existing data

Key considerations:

  • Phase 1 (v0.1.0) established Admin and Exchange models
  • Phase 2 (v0.2.0) adds Participant and MagicToken models
  • Future phases will continue evolving the schema
  • Self-hosted deployments may have persistent data from day one
  • Users may skip versions or upgrade incrementally

The question is: when should we start using proper database migrations?

Decision

We will use Alembic for all database schema changes starting from Phase 2 (v0.2.0) onward.

Specifically:

  1. Alembic is already configured in the codebase (alembic.ini, migrations/ directory)
  2. An initial migration already exists for Admin and Exchange models (created in Phase 1)
  3. All new models and schema changes will be managed through Alembic migrations
  4. db.create_all() must not be used for schema creation in production environments

Migration Workflow

For all schema changes:

  1. Modify SQLAlchemy models in src/models/
  2. Generate migration: uv run alembic revision --autogenerate -m "description"
  3. Review the generated migration file in migrations/versions/
  4. Test the migration (upgrade and downgrade paths)
  5. Commit the migration file with model changes
  6. Apply in deployments: uv run alembic upgrade head

Naming Conventions

Migration messages should be:

  • Descriptive and imperative: "Add Participant model", "Add email index to Participant"
  • Under 80 characters
  • Use lowercase except for model/table names
  • Examples:
    • "Add Participant and MagicToken models"
    • "Add withdrawn_at column to Participant"
    • "Create composite index on exchange_id and email"

Testing Requirements

Every migration must be tested for:

  1. Upgrade path: alembic upgrade head succeeds
  2. Downgrade path: alembic downgrade -1 and alembic upgrade head both succeed
  3. Schema correctness: Database schema matches SQLAlchemy model definitions
  4. Application compatibility: All tests pass after migration

Handling Existing Databases

For databases created with db.create_all() before migrations were established:

Option 1 - Stamp (preserves data):

uv run alembic stamp head

This marks the database as being at the current migration version without running migrations.

Option 2 - Recreate (development only):

rm data/sneaky-klaus.db
uv run alembic upgrade head

This creates a fresh database from migrations. Only suitable for development.

Removing db.create_all()

The db.create_all() call currently in src/app.py should be:

  • Removed from production code paths
  • Only used in test fixtures where appropriate
  • Never used for schema initialization in deployments

Production deployments must use alembic upgrade head for schema initialization and updates.

Automatic Migrations for Self-Hosted Deployments

For self-hosted deployments using containers, migrations must be applied automatically when the container starts. This ensures that:

  • Users pulling new container images automatically get schema updates
  • No manual migration commands required
  • Schema is always in sync with application code
  • First-run deployments get proper schema initialization

Implementation Approach: Container Entrypoint Script

An entrypoint script runs alembic upgrade head before starting the application server. This approach is chosen because:

  • Timing: Migrations run before application starts, avoiding race conditions
  • Separation of concerns: Database initialization is separate from application startup
  • Clear error handling: Migration failures prevent application startup
  • Standard pattern: Common practice for containerized applications with databases
  • Works with gunicorn: Gunicorn workers don't need to coordinate migrations

Entrypoint Script Responsibilities:

  1. Run alembic upgrade head to apply all pending migrations
  2. Log migration status (success or failure)
  3. Exit with error if migrations fail (preventing container startup)
  4. Start application server (gunicorn) if migrations succeed

Implementation:

#!/bin/bash
set -e  # Exit on any error

echo "Running database migrations..."
if uv run alembic upgrade head; then
    echo "Database migrations completed successfully"
else
    echo "ERROR: Database migration failed!"
    echo "Please check the logs above for details."
    exit 1
fi

echo "Starting application server..."
exec gunicorn --bind 0.0.0.0:8000 --workers 2 --threads 4 main:app

Error Handling:

  • Migration failures are logged to stderr
  • Container exits with code 1 on migration failure
  • Container orchestrator (podman/docker compose) will show failed state
  • Users can inspect logs with podman logs sneaky-klaus or docker logs sneaky-klaus

Containerfile Changes:

  • Copy entrypoint script: COPY entrypoint.sh /app/entrypoint.sh
  • Make executable: RUN chmod +x /app/entrypoint.sh
  • Change CMD to use entrypoint: CMD ["/app/entrypoint.sh"]

First-Run Initialization: When no database exists, alembic upgrade head will:

  1. Create the database file (SQLite)
  2. Create the alembic_version table to track migration state
  3. Run all migrations from scratch
  4. Leave database in up-to-date state

Update Scenarios: When updating to a new container image with schema changes:

  1. Container starts and runs entrypoint script
  2. Alembic detects current schema version from alembic_version table
  3. Applies only new migrations (incremental upgrade)
  4. Application starts with updated schema

Development Workflow: For local development (non-containerized), developers continue to run migrations manually:

uv run alembic upgrade head

This gives developers explicit control over when migrations run during development.

Alternative Considered: Application Startup Migrations

Running migrations in src/app.py during Flask application startup was considered but rejected:

  • Race conditions: Multiple gunicorn workers could try to run migrations simultaneously
  • Locking complexity: Would need migration locks to prevent concurrent runs
  • Startup delays: Application health checks might fail during migration
  • Error visibility: Migration failures less visible than container startup failures
  • Not idiomatic: Flask apps typically don't modify their own schema on startup

The entrypoint script approach is simpler, safer, and more aligned with containerized deployment best practices.

Consequences

Positive

  • Safe schema evolution: Can modify existing tables without data loss
  • Version control: Schema changes tracked in git alongside code changes
  • Rollback capability: Can revert problematic schema changes
  • Data migrations: Can transform data during schema changes (e.g., populate new required columns)
  • Production ready: Proper migration strategy from the start avoids migration debt
  • Clear deployment process: alembic upgrade head is explicit and auditable
  • Multi-environment support: Same migrations work across dev, staging, and production

Negative

  • Additional complexity: Developers must learn Alembic workflow
  • Migration review required: Auto-generated migrations must be reviewed for correctness
  • Migration discipline needed: Schema changes require creating and testing migrations
  • Downgrade path maintenance: Must write downgrade logic for each migration
  • Linear migration history: Merge conflicts in migrations can require rebasing

Neutral

  • Learning curve: Alembic has good documentation but requires initial learning
  • Migration conflicts: Multiple developers changing schema simultaneously may need coordination
  • Test database setup: Tests may need to apply migrations rather than using create_all()

Implementation Notes

Phase 2 Implementation

For Phase 2 (v0.2.0), the developer should:

  1. Create Participant and MagicToken models in src/models/
  2. Generate migration:
    uv run alembic revision --autogenerate -m "Add Participant and MagicToken models"
    
  3. Review the generated migration file:
    • Verify all new tables, columns, and indexes are included
    • Check foreign key constraints are correct
    • Ensure indexes are created for performance-critical queries
  4. Test the migration:
    # Test upgrade
    uv run alembic upgrade head
    
    # Test downgrade (optional but recommended)
    uv run alembic downgrade -1
    uv run alembic upgrade head
    
  5. Run application tests to verify compatibility
  6. Commit migration file with model changes

Migration File Structure

Migration files are in migrations/versions/ and follow this structure:

"""Add Participant and MagicToken models

Revision ID: abc123def456
Revises: eeff6e1a89cd
Create Date: 2025-12-22 10:30:00.000000

"""
from alembic import op
import sqlalchemy as sa

# revision identifiers
revision = 'abc123def456'
down_revision = 'eeff6e1a89cd'
branch_labels = None
depends_on = None

def upgrade():
    # Schema changes for upgrade
    op.create_table('participant',
        sa.Column('id', sa.Integer(), nullable=False),
        # ... other columns
    )

def downgrade():
    # Schema changes for rollback
    op.drop_table('participant')

Alembic Configuration

Alembic is configured via alembic.ini:

  • Migration directory: migrations/
  • SQLAlchemy URL: Configured dynamically from Flask config in migrations/env.py
  • Auto-generate support: Enabled

Documentation Updates

The following documentation has been updated to reflect this decision:

  • Phase 2 Implementation Decisions (section 9.1)
  • Data Model v0.2.0 (Migration Strategy section)
  • System Architecture Overview v0.2.0 (Database Layer section)

Alternatives Considered

Continue using db.create_all()

Rejected: While simpler initially, db.create_all() cannot handle schema evolution. Since:

  • Alembic infrastructure already exists in the codebase
  • We expect ongoing schema evolution across multiple phases
  • Self-hosted deployments may have persistent data
  • Production-ready approach prevents migration debt

Starting with Alembic now is the right choice despite the added complexity.

Manual SQL migrations

Rejected: Writing raw SQL migrations is error-prone and doesn't integrate with SQLAlchemy models. Alembic's autogenerate feature significantly reduces migration creation effort while maintaining safety.

Django-style migrations

Rejected: Django's migration system is tightly coupled to Django ORM. Alembic is the standard for SQLAlchemy-based applications and integrates well with Flask.

Defer migrations until schema is stable

Rejected: The schema will evolve continuously as new features are added. Deferring migrations creates migration debt and makes it harder to support existing deployments. Starting with migrations from Phase 2 establishes good patterns early.

References