Phil Skentelbery 3ed77fd45f fix: Resolve database migration failure on existing databases

Fixes critical issue where migration 002 indexes already existed in SCHEMA_SQL,
causing 'index already exists' errors on databases created before v1.0.0-rc.1.

Changes:
- Removed duplicate index definitions from SCHEMA_SQL (database.py)
- Enhanced migration system to detect and handle indexes properly
- Added comprehensive documentation of the fix

Version bumped to 1.0.0-rc.2 with full changelog entry.

Refs: docs/reports/2025-11-24-migration-fix-v1.0.0-rc.2.md

2025-11-24 13:11:14 -07:00

ADR-031: Database Migration System Redesign

Status

Proposed

Context

The v1.0.0-rc.1 release exposed a critical flaw in our database initialization and migration system. The system fails when upgrading existing production databases because:

SCHEMA_SQL represents the current (latest) schema structure
SCHEMA_SQL is executed BEFORE migrations run
Existing databases have old table structures that conflict with SCHEMA_SQL's expectations
The system tries to create indexes on columns that don't exist yet

As a result, the same initialization code behaves differently depending on the age of the database:

Fresh databases work fine (SCHEMA_SQL creates the latest structure)
Existing databases fail (SCHEMA_SQL conflicts with old structure)
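The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the project's actual schema: the `tokens` columns and the `OLD_SCHEMA_SQL` layout are assumptions made for illustration, and only the index statement is taken from this document.

```python
import sqlite3

# Hypothetical stand-in for the "latest" schema; column names are illustrative.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    token_hash TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""

# Hypothetical older layout that predates the token_hash column.
OLD_SCHEMA_SQL = "CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);"


def apply_latest_schema(conn):
    """Run SCHEMA_SQL and report whether it succeeded."""
    try:
        conn.executescript(SCHEMA_SQL)
        return True
    except sqlite3.OperationalError:
        # CREATE TABLE IF NOT EXISTS left the old table untouched, so the
        # index statement refers to a column that was never added.
        return False


fresh = sqlite3.connect(":memory:")
existing = sqlite3.connect(":memory:")
existing.executescript(OLD_SCHEMA_SQL)

fresh_ok = apply_latest_schema(fresh)        # fresh database: schema applies
existing_ok = apply_latest_schema(existing)  # old database: index creation fails
```

The key point is that `CREATE TABLE IF NOT EXISTS` silently skips a table that already exists, so any later statement that assumes the new columns runs against the old structure.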

Decision

Redesign the database initialization system to follow these principles:

SCHEMA_SQL represents the initial v0.1.0 schema, not the current schema
All schema evolution happens through migrations
Existing databases run pending migrations only and never re-execute the schema script
Fresh databases get the initial schema then run ALL migrations

Implementation Strategy

Phase 1: Immediate Fix (v1.0.1)

Remove the conflicting index definitions from SCHEMA_SQL, since the migrations already create them:

# Remove from SCHEMA_SQL:
# CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
# Let migration 002 handle this
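Databases that were already initialized with the old SCHEMA_SQL may still contain the index, so a migration that creates it unconditionally could fail with an "index already exists" error. One possible guard, sketched here under the assumption that the migration runner can execute Python, is to look the index up in `sqlite_master` first. The helper names `index_exists` and `create_index_once` are hypothetical and not part of the project.

```python
import sqlite3


def index_exists(conn, index_name):
    """Return True if an index with this name is recorded in sqlite_master."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None


def create_index_once(conn, index_name, table, column):
    """Create the index only when it is missing; return True if it was created.

    Identifiers are interpolated directly, so callers must pass trusted names.
    """
    if index_exists(conn, index_name):
        return False
    conn.execute(f"CREATE INDEX {index_name} ON {table} ({column})")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT)")
first = create_index_once(conn, "idx_tokens_hash", "tokens", "token_hash")
second = create_index_once(conn, "idx_tokens_hash", "tokens", "token_hash")
```

Writing `CREATE INDEX IF NOT EXISTS` inside the migration itself would have a similar effect; the explicit check is shown because it also lets the runner log or record that the step was skipped.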

Phase 2: Proper Redesign (v1.1.0)

Create INITIAL_SCHEMA_SQL with the v0.1.0 database structure

Modify init_db() logic:

def init_db(app=None):
    # Sketch only: get_connection() and the helpers below are placeholders
    conn = get_connection(app)
    # 1. Check if database exists and has tables
    if database_exists_with_tables(conn):
        # Existing database - only run pending migrations
        run_migrations(conn)
    else:
        # Fresh database - create initial schema, then apply every migration
        conn.executescript(INITIAL_SCHEMA_SQL)
        run_all_migrations(conn)
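A self-contained version of this flow might look like the sketch below. Everything in it is an assumption made for illustration: the table layouts, the migration names, and the `schema_migrations` bookkeeping table are hypothetical, and a single `run_migrations` function serves both branches because a fresh database simply has every migration pending.

```python
import sqlite3

# Hypothetical v0.1.0 structure; the real INITIAL_SCHEMA_SQL must be reconstructed.
INITIAL_SCHEMA_SQL = """
CREATE TABLE notes (id INTEGER PRIMARY KEY, content TEXT NOT NULL);
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);
"""

# Ordered (name, sql) pairs standing in for the project's migration files.
MIGRATIONS = [
    ("001_add_token_hash", "ALTER TABLE tokens ADD COLUMN token_hash TEXT;"),
    ("002_index_token_hash", "CREATE INDEX idx_tokens_hash ON tokens(token_hash);"),
]


def database_has_tables(conn):
    """True if the database already contains at least one table."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' LIMIT 1"
    ).fetchone()
    return row is not None


def run_migrations(conn):
    """Apply migrations that are not yet recorded; return the names applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    done = {r[0] for r in conn.execute("SELECT name FROM schema_migrations")}
    applied = []
    for name, sql in MIGRATIONS:
        if name in done:
            continue
        conn.executescript(sql)
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
        applied.append(name)
    conn.commit()
    return applied


def init_db(conn):
    """Fresh database: initial schema first. Either way, migrations follow."""
    if not database_has_tables(conn):
        conn.executescript(INITIAL_SCHEMA_SQL)
    return run_migrations(conn)


conn = sqlite3.connect(":memory:")
first_run = init_db(conn)   # fresh: initial schema plus every migration
second_run = init_db(conn)  # already up to date: nothing left to apply
```

Note that this sketch assumes existing databases already carry the bookkeeping table; a production database that predates it would need a one-time step to mark the migrations it has effectively received.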

Add explicit schema versioning:

CREATE TABLE schema_info (
    version TEXT PRIMARY KEY,
    upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
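As a rough illustration of how such a table could be used, the sketch below records a version after an upgrade and reads back the latest one. The helper names `record_version` and `current_version` are hypothetical, and `IF NOT EXISTS` is added to the table definition only so the script can be re-run safely.

```python
import sqlite3

SCHEMA_INFO_SQL = """
CREATE TABLE IF NOT EXISTS schema_info (
    version TEXT PRIMARY KEY,
    upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def record_version(conn, version):
    """Insert a row for this version; a version already present is ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO schema_info (version) VALUES (?)", (version,)
    )
    conn.commit()


def current_version(conn):
    """Return the most recently recorded version, or None if the table is empty.

    Ordering by rowid rather than upgraded_at avoids ties, because
    CURRENT_TIMESTAMP only has one-second resolution.
    """
    row = conn.execute(
        "SELECT version FROM schema_info ORDER BY rowid DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_INFO_SQL)
before = current_version(conn)
record_version(conn, "0.1.0")
record_version(conn, "1.1.0")
after = current_version(conn)
```

Keeping one row per version, instead of overwriting a single row, preserves the upgrade history that the rationale below calls auditable.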

Rationale

Why Initial Schema + Migrations?

Predictable upgrade path: Every database follows the same evolution
Testable: Can test upgrades from any version to any version
Auditable: Migration history shows exact evolution path
Reversible: Can potentially support rollbacks
Industry standard: Follows patterns from Rails, Django, Alembic
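The testability claim can be made concrete with a small harness that builds a database at each intermediate step, upgrades it, and compares the result with a database that received every migration. This is a sketch under assumptions: the schema and migrations are hypothetical, and a real suite would more likely start from databases produced by released versions than from replayed migrations.

```python
import sqlite3

# Hypothetical initial schema and migrations, for illustration only.
INITIAL_SCHEMA_SQL = "CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);"
MIGRATIONS = [
    "ALTER TABLE tokens ADD COLUMN token_hash TEXT;",
    "CREATE INDEX idx_tokens_hash ON tokens(token_hash);",
]


def schema_fingerprint(conn):
    """Sorted (type, name, sql) rows describing the objects in the database."""
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master WHERE sql IS NOT NULL"
    ).fetchall()
    return sorted(rows)


def build_at(step):
    """Create a database with the initial schema plus the first `step` migrations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(INITIAL_SCHEMA_SQL)
    for sql in MIGRATIONS[:step]:
        conn.executescript(sql)
    return conn


def check_upgrade_from(step):
    """Upgrade a database from `step` to latest; True if schema and data survive."""
    expected = schema_fingerprint(build_at(len(MIGRATIONS)))
    conn = build_at(step)
    conn.execute("INSERT INTO tokens (token) VALUES ('abc')")
    conn.commit()
    for sql in MIGRATIONS[step:]:
        conn.executescript(sql)
    kept = conn.execute("SELECT COUNT(*) FROM tokens").fetchone()[0] == 1
    return kept and schema_fingerprint(conn) == expected


results = {step: check_upgrade_from(step) for step in range(len(MIGRATIONS) + 1)}
```

Each entry in `results` corresponds to one starting point, so a failing upgrade path shows up as a single `False` value rather than as an error during deployment.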

Why Current Approach Failed

Dual source of truth: Schema defined in both SCHEMA_SQL and migrations
Temporal coupling: SCHEMA_SQL assumes post-migration state
No upgrade path: Can't get from old state to new state
Hidden dependencies: Index creation depends on migration execution

Consequences

Positive

Reliable database upgrades from any version
Clear separation of concerns (initial vs evolution)
Easier to test migration paths
Follows established patterns
Supports future rollback capabilities

Negative

Requires maintaining historical schema (INITIAL_SCHEMA_SQL)
Fresh databases take longer to initialize (run all migrations)
More complex initialization logic
Need to reconstruct v0.1.0 schema

Migration Path

v1.0.1: Quick fix - remove conflicting indexes from SCHEMA_SQL
v1.0.1: Add manual upgrade instructions for production
v1.1.0: Implement full redesign with INITIAL_SCHEMA_SQL
v1.1.0: Add comprehensive migration testing

Alternatives Considered

1. Dynamic Schema Detection

Approach: Detect existing table structure and conditionally apply indexes

Rejected because:

Complex conditional logic
Fragile heuristics
Doesn't solve root cause
Hard to test all paths
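For comparison, the rejected approach would look roughly like the sketch below, which inspects the table before deciding whether to create the index. The helper names are hypothetical; the point is that every index or column that depends on a later schema state needs its own check of this kind.

```python
import sqlite3


def column_exists(conn, table, column):
    """Check PRAGMA table_info for a column; `table` must be a trusted name."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def apply_indexes_conditionally(conn):
    """Run the index statement only if its column is present; return True if run."""
    if column_exists(conn, "tokens", "token_hash"):
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash)"
        )
        return True
    return False


old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT)")
new_db = sqlite3.connect(":memory:")
new_db.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT)")

skipped_on_old = not apply_indexes_conditionally(old_db)
created_on_new = apply_indexes_conditionally(new_db)
```

The old database is left without the index and without the column, which is why this approach avoids the crash but does not give that database a way to reach the new structure.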

2. Schema Snapshots

Approach: Maintain schema snapshots for each version, apply appropriate one

Rejected because:

Maintenance burden
Storage overhead
Complex version detection
Still doesn't provide upgrade path

3. Migration-Only Schema

Approach: No SCHEMA_SQL at all, everything through migrations

Rejected because:

Slower fresh installations
Need to maintain migration 000 as "initial schema"
Harder to see current schema structure
Goes against SQLite's lightweight philosophy

References

Rails Database Migrations
Django Migrations
Alembic Documentation
Production incident: v1.0.0-rc.1 deployment failure
/docs/reports/migration-failure-diagnosis-v1.0.0-rc.1.md

Implementation Checklist

Create INITIAL_SCHEMA_SQL from v0.1.0 structure
Modify init_db() to check database state
Update migration runner to handle fresh databases
Add schema_info table for version tracking
Create migration test suite
Document upgrade procedures
Test upgrade paths from all released versions
