Files

Phil Skentelbery eaafa78cf3 feat: add Participant and MagicToken models with automatic migrations

Implements Phase 2 infrastructure for participant registration and authentication:

Database Models:
- Add Participant model with exchange scoping and soft deletes
- Add MagicToken model for passwordless authentication
- Add participants relationship to Exchange model
- Include proper indexes and foreign key constraints

Migration Infrastructure:
- Generate Alembic migration for new models
- Create entrypoint.sh script for automatic migrations on container startup
- Update Containerfile to use entrypoint script and include uv binary
- Remove db.create_all() in favor of migration-based schema management

This establishes the foundation for implementing stories 4.1-4.3, 5.1-5.3, and 10.1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-22 16:23:47 -07:00

12 KiB

Raw Blame History

0005. Database Migrations with Alembic

Date: 2025-12-22

Status

Accepted

Context

Sneaky Klaus uses SQLite as its database (see ADR-0001). As the application evolves, the database schema needs to change to support new features. There are two primary approaches to managing database schema:

db.create_all(): SQLAlchemy's create_all() method creates tables based on current model definitions. Simple but has critical limitations:
- Cannot modify existing tables (add/remove columns, change types)
- Cannot migrate data during schema changes
- No version tracking or rollback capability
- Unsafe for databases with existing data
Schema Migrations: Tools like Alembic track schema changes as versioned migration files:
- Supports incremental schema changes (add columns, modify constraints, etc.)
- Enables data migrations during schema evolution
- Provides version tracking and rollback capability
- Safe for production databases with existing data

Key considerations:

Phase 1 (v0.1.0) established Admin and Exchange models
Phase 2 (v0.2.0) adds Participant and MagicToken models
Future phases will continue evolving the schema
Self-hosted deployments may have persistent data from day one
Users may skip versions or upgrade incrementally

The question is: when should we start using proper database migrations?

Decision

We will use Alembic for all database schema changes starting from Phase 2 (v0.2.0) onward.

Specifically:

Alembic is already configured in the codebase (alembic.ini, migrations/ directory)
An initial migration already exists for Admin and Exchange models (created in Phase 1)
All new models and schema changes will be managed through Alembic migrations
db.create_all() must not be used for schema creation in production environments

Migration Workflow

For all schema changes:

Modify SQLAlchemy models in src/models/
Generate migration: uv run alembic revision --autogenerate -m "description"
Review the generated migration file in migrations/versions/
Test the migration (upgrade and downgrade paths)
Commit the migration file with model changes
Apply in deployments: uv run alembic upgrade head

Naming Conventions

Migration messages should be:

Descriptive and imperative: "Add Participant model", "Add email index to Participant"
Under 80 characters
Use lowercase except for model/table names
Examples:
- "Add Participant and MagicToken models"
- "Add withdrawn_at column to Participant"
- "Create composite index on exchange_id and email"

Testing Requirements

Every migration must be tested for:

Upgrade path: alembic upgrade head succeeds
Downgrade path: alembic downgrade -1 and alembic upgrade head both succeed
Schema correctness: Database schema matches SQLAlchemy model definitions
Application compatibility: All tests pass after migration

Handling Existing Databases

For databases created with db.create_all() before migrations were established:

Option 1 - Stamp (preserves data):

uv run alembic stamp head

This marks the database as being at the current migration version without running migrations.

Option 2 - Recreate (development only):

rm data/sneaky-klaus.db
uv run alembic upgrade head

This creates a fresh database from migrations. Only suitable for development.

Removing db.create_all()

The db.create_all() call currently in src/app.py should be:

Removed from production code paths
Only used in test fixtures where appropriate
Never used for schema initialization in deployments

Production deployments must use alembic upgrade head for schema initialization and updates.

Automatic Migrations for Self-Hosted Deployments

For self-hosted deployments using containers, migrations must be applied automatically when the container starts. This ensures that:

Users pulling new container images automatically get schema updates
No manual migration commands required
Schema is always in sync with application code
First-run deployments get proper schema initialization

Implementation Approach: Container Entrypoint Script

An entrypoint script runs alembic upgrade head before starting the application server. This approach is chosen because:

Timing: Migrations run before application starts, avoiding race conditions
Separation of concerns: Database initialization is separate from application startup
Clear error handling: Migration failures prevent application startup
Standard pattern: Common practice for containerized applications with databases
Works with gunicorn: Gunicorn workers don't need to coordinate migrations

Entrypoint Script Responsibilities:

Run alembic upgrade head to apply all pending migrations
Log migration status (success or failure)
Exit with error if migrations fail (preventing container startup)
Start application server (gunicorn) if migrations succeed

Implementation:

#!/bin/bash
set -e  # Exit on any error

echo "Running database migrations..."
if uv run alembic upgrade head; then
    echo "Database migrations completed successfully"
else
    echo "ERROR: Database migration failed!"
    echo "Please check the logs above for details."
    exit 1
fi

echo "Starting application server..."
exec gunicorn --bind 0.0.0.0:8000 --workers 2 --threads 4 main:app

Error Handling:

Migration failures are logged to stderr
Container exits with code 1 on migration failure
Container orchestrator (podman/docker compose) will show failed state
Users can inspect logs with podman logs sneaky-klaus or docker logs sneaky-klaus

Containerfile Changes:

Copy entrypoint script: COPY entrypoint.sh /app/entrypoint.sh
Make executable: RUN chmod +x /app/entrypoint.sh
Change CMD to use entrypoint: CMD ["/app/entrypoint.sh"]

First-Run Initialization: When no database exists, alembic upgrade head will:

Create the database file (SQLite)
Create the alembic_version table to track migration state
Run all migrations from scratch
Leave database in up-to-date state

Update Scenarios: When updating to a new container image with schema changes:

Container starts and runs entrypoint script
Alembic detects current schema version from alembic_version table
Applies only new migrations (incremental upgrade)
Application starts with updated schema

Development Workflow: For local development (non-containerized), developers continue to run migrations manually:

uv run alembic upgrade head

This gives developers explicit control over when migrations run during development.

Alternative Considered: Application Startup Migrations

Running migrations in src/app.py during Flask application startup was considered but rejected:

Race conditions: Multiple gunicorn workers could try to run migrations simultaneously
Locking complexity: Would need migration locks to prevent concurrent runs
Startup delays: Application health checks might fail during migration
Error visibility: Migration failures less visible than container startup failures
Not idiomatic: Flask apps typically don't modify their own schema on startup

The entrypoint script approach is simpler, safer, and more aligned with containerized deployment best practices.

Consequences

Positive

Safe schema evolution: Can modify existing tables without data loss
Version control: Schema changes tracked in git alongside code changes
Rollback capability: Can revert problematic schema changes
Data migrations: Can transform data during schema changes (e.g., populate new required columns)
Production ready: Proper migration strategy from the start avoids migration debt
Clear deployment process: alembic upgrade head is explicit and auditable
Multi-environment support: Same migrations work across dev, staging, and production

Negative

Additional complexity: Developers must learn Alembic workflow
Migration review required: Auto-generated migrations must be reviewed for correctness
Migration discipline needed: Schema changes require creating and testing migrations
Downgrade path maintenance: Must write downgrade logic for each migration
Linear migration history: Merge conflicts in migrations can require rebasing

Neutral

Learning curve: Alembic has good documentation but requires initial learning
Migration conflicts: Multiple developers changing schema simultaneously may need coordination
Test database setup: Tests may need to apply migrations rather than using create_all()

Implementation Notes

Phase 2 Implementation

For Phase 2 (v0.2.0), the developer should:

Create Participant and MagicToken models in src/models/

Generate migration:

uv run alembic revision --autogenerate -m "Add Participant and MagicToken models"

Review the generated migration file:
- Verify all new tables, columns, and indexes are included
- Check foreign key constraints are correct
- Ensure indexes are created for performance-critical queries

Test the migration:

# Test upgrade
uv run alembic upgrade head

# Test downgrade (optional but recommended)
uv run alembic downgrade -1
uv run alembic upgrade head

Run application tests to verify compatibility
Commit migration file with model changes

Migration File Structure

Migration files are in migrations/versions/ and follow this structure:

"""Add Participant and MagicToken models

Revision ID: abc123def456
Revises: eeff6e1a89cd
Create Date: 2025-12-22 10:30:00.000000

"""
from alembic import op
import sqlalchemy as sa

# revision identifiers
revision = 'abc123def456'
down_revision = 'eeff6e1a89cd'
branch_labels = None
depends_on = None

def upgrade():
    # Schema changes for upgrade
    op.create_table('participant',
        sa.Column('id', sa.Integer(), nullable=False),
        # ... other columns
    )

def downgrade():
    # Schema changes for rollback
    op.drop_table('participant')

Alembic Configuration

Alembic is configured via alembic.ini:

Migration directory: migrations/
SQLAlchemy URL: Configured dynamically from Flask config in migrations/env.py
Auto-generate support: Enabled

Documentation Updates

The following documentation has been updated to reflect this decision:

Phase 2 Implementation Decisions (section 9.1)
Data Model v0.2.0 (Migration Strategy section)
System Architecture Overview v0.2.0 (Database Layer section)

Alternatives Considered

Continue using db.create_all()

Rejected: While simpler initially, db.create_all() cannot handle schema evolution. Since:

Alembic infrastructure already exists in the codebase
We expect ongoing schema evolution across multiple phases
Self-hosted deployments may have persistent data
Production-ready approach prevents migration debt

Starting with Alembic now is the right choice despite the added complexity.

Manual SQL migrations

Rejected: Writing raw SQL migrations is error-prone and doesn't integrate with SQLAlchemy models. Alembic's autogenerate feature significantly reduces migration creation effort while maintaining safety.

Django-style migrations

Rejected: Django's migration system is tightly coupled to Django ORM. Alembic is the standard for SQLAlchemy-based applications and integrates well with Flask.

Defer migrations until schema is stable

Rejected: The schema will evolve continuously as new features are added. Deferring migrations creates migration debt and makes it harder to support existing deployments. Starting with migrations from Phase 2 establishes good patterns early.

References

Alembic documentation: https://alembic.sqlalchemy.org/
SQLAlchemy documentation: https://docs.sqlalchemy.org/
ADR-0001: Core Technology Stack
Phase 2 Implementation Decisions (section 9.1)
Data Model v0.2.0

12 KiB Raw Blame History