Fixes critical issue where migration 002 indexes already existed in SCHEMA_SQL, causing 'index already exists' errors on databases created before v1.0.0-rc.1.
Changes:
- Removed duplicate index definitions from SCHEMA_SQL (database.py)
- Enhanced migration system to detect and handle indexes properly
- Added comprehensive documentation of the fix
Version bumped to 1.0.0-rc.2 with full changelog entry.
Refs: docs/reports/2025-11-24-migration-fix-v1.0.0-rc.2.md
ADR-031: Database Migration System Redesign
Status
Proposed
Context
The v1.0.0-rc.1 release exposed a critical flaw in our database initialization and migration system. The system fails when upgrading existing production databases because:
- SCHEMA_SQL represents the current (latest) schema structure
- SCHEMA_SQL is executed BEFORE migrations run
- Existing databases have old table structures that conflict with SCHEMA_SQL's expectations
- The system tries to create indexes on columns that don't exist yet
As a result, a single SCHEMA_SQL cannot serve both cases:
- Fresh databases work fine (SCHEMA_SQL creates the latest structure)
- Existing databases fail (SCHEMA_SQL conflicts with old structure)
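The failure mode described above can be reproduced with a minimal sketch using Python's sqlite3 module. The `tokens` table, the `token_hash` column, and the `idx_tokens_hash` index are taken from this document; the old `token` column and the exact table definitions are assumptions made for illustration, not the project's real schema.

```python
import sqlite3

# Hypothetical pre-migration layout: tokens has no token_hash column yet.
OLD_SCHEMA_SQL = """
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);
"""

# Stand-in for a SCHEMA_SQL that describes the latest structure.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (id INTEGER PRIMARY KEY, token_hash TEXT NOT NULL);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""


def apply_schema(conn):
    """Run SCHEMA_SQL; return None on success or the raised error on failure."""
    try:
        conn.executescript(SCHEMA_SQL)
        return None
    except sqlite3.OperationalError as exc:
        return exc


# Fresh database: both the table and the index are created.
fresh = sqlite3.connect(":memory:")
fresh_error = apply_schema(fresh)

# Existing database: CREATE TABLE IF NOT EXISTS is skipped because the old
# table is present, so CREATE INDEX refers to a column that does not exist.
existing = sqlite3.connect(":memory:")
existing.executescript(OLD_SCHEMA_SQL)
existing_error = apply_schema(existing)
```

Here `fresh_error` stays `None` while `existing_error` holds an `sqlite3.OperationalError`, which mirrors the split between fresh and existing databases.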
Decision
Redesign the database initialization system to follow these principles:
- SCHEMA_SQL represents the initial v0.1.0 schema, not the current schema
- All schema evolution happens through migrations
- Migrations run BEFORE schema creation attempts
- Fresh databases get the initial schema then run ALL migrations
Implementation Strategy
Phase 1: Immediate Fix (v1.0.1)
Remove problematic index creation from SCHEMA_SQL since migrations create them:
# Remove from SCHEMA_SQL:
# CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
# Let migration 002 handle this
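Once the index lives only in migration 002, the migration itself can guard against databases where the index was already created by an older SCHEMA_SQL. The sketch below shows one possible way to do that by querying `sqlite_master`; the helper names `index_exists` and `create_index_if_missing` are hypothetical and are not taken from the project's migration code.

```python
import sqlite3


def index_exists(conn, index_name):
    """Return True if an index with this name is recorded in sqlite_master."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None


def create_index_if_missing(conn, index_name, table, column):
    """Create the index only when absent; return True if it was created.

    Identifiers are interpolated directly, so they must come from trusted
    migration code, never from user input.
    """
    if index_exists(conn, index_name):
        return False
    conn.execute(f"CREATE INDEX {index_name} ON {table}({column})")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT)")
first = create_index_if_missing(conn, "idx_tokens_hash", "tokens", "token_hash")
second = create_index_if_missing(conn, "idx_tokens_hash", "tokens", "token_hash")
```

The first call creates the index and the second is a no-op, so the migration can run safely on both old and new databases.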
Phase 2: Proper Redesign (v1.1.0)
- Create INITIAL_SCHEMA_SQL with the v0.1.0 database structure
- Modify init_db() logic:

    def init_db(app=None):
        # 1. Check if database exists and has tables
        if database_exists_with_tables():
            # Existing database - only run migrations
            run_migrations()
        else:
            # Fresh database - create initial schema then migrate
            conn.executescript(INITIAL_SCHEMA_SQL)
            run_all_migrations()

- Add explicit schema versioning:

    CREATE TABLE schema_info (
        version TEXT PRIMARY KEY,
        upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
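The init_db() outline and the schema_info table can be combined into a small runnable sketch. Everything here beyond the names `INITIAL_SCHEMA_SQL`, `init_db`, `run_migrations`, and `schema_info` is an assumption: the placeholder initial schema, the `MIGRATIONS` list, and the `has_tables` helper are hypothetical, and the function takes a connection rather than an app object to stay self-contained.

```python
import sqlite3

# Placeholder for the v0.1.0 structure; the real schema is not shown in this ADR.
INITIAL_SCHEMA_SQL = """
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);
"""

# Hypothetical ordered migrations: (version, SQL).
MIGRATIONS = [
    ("001", "ALTER TABLE tokens ADD COLUMN token_hash TEXT;"),
    ("002", "CREATE INDEX idx_tokens_hash ON tokens(token_hash);"),
]

SCHEMA_INFO_SQL = """
CREATE TABLE IF NOT EXISTS schema_info (
    version TEXT PRIMARY KEY,
    upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def has_tables(conn):
    """True if the database already contains at least one user table."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' LIMIT 1"
    ).fetchone()
    return row is not None


def run_migrations(conn):
    """Apply every migration not yet recorded in schema_info, in order."""
    conn.executescript(SCHEMA_INFO_SQL)
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_info")}
    for version, sql in MIGRATIONS:
        if version in applied:
            continue
        conn.executescript(sql)
        conn.execute("INSERT INTO schema_info (version) VALUES (?)", (version,))
    conn.commit()


def init_db(conn):
    """Fresh database: initial schema, then migrations. Existing: migrations only."""
    if not has_tables(conn):
        conn.executescript(INITIAL_SCHEMA_SQL)
    run_migrations(conn)


conn = sqlite3.connect(":memory:")
init_db(conn)
init_db(conn)  # second call applies nothing: all versions are already recorded
versions = [
    row[0] for row in conn.execute("SELECT version FROM schema_info ORDER BY version")
]
```

One limitation of this sketch: a legacy database whose migrations were applied before schema_info existed has no recorded versions, so it would need a one-time backfill before `run_migrations` could be trusted on it.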
Rationale
Why Initial Schema + Migrations?
- Predictable upgrade path: Every database follows the same evolution
- Testable: Can test upgrades from any version to any version
- Auditable: Migration history shows exact evolution path
- Reversible: Can potentially support rollbacks
- Industry standard: Follows patterns from Rails, Django, Alembic
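The "predictable" and "testable" points above suggest a simple property to check: a database upgraded from any intermediate step should end with the same schema as a fresh one. The following is a minimal sketch of such a check; the schema, the `MIGRATIONS` list, and the helper names are hypothetical and only illustrate the idea.

```python
import sqlite3

# Hypothetical initial schema and migrations, for illustration only.
INITIAL_SCHEMA_SQL = "CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT);"
MIGRATIONS = [
    "ALTER TABLE tokens ADD COLUMN token_hash TEXT;",
    "CREATE INDEX idx_tokens_hash ON tokens(token_hash);",
]


def database_at(step):
    """Build an in-memory database with the first `step` migrations applied."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(INITIAL_SCHEMA_SQL)
    for sql in MIGRATIONS[:step]:
        conn.executescript(sql)
    return conn


def upgrade(conn, from_step):
    """Apply the remaining migrations to a database currently at `from_step`."""
    for sql in MIGRATIONS[from_step:]:
        conn.executescript(sql)


def schema_of(conn):
    """Return the (name, sql) pairs that SQLite stores for the schema, by name."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY name"
    )
    return rows.fetchall()


# Reference: a fresh database with every migration applied.
expected = schema_of(database_at(len(MIGRATIONS)))

# Upgrade from each possible starting step and compare against the reference.
results = []
for start in range(len(MIGRATIONS) + 1):
    conn = database_at(start)
    upgrade(conn, start)
    results.append(schema_of(conn) == expected)
```

A real test suite would start from schema dumps of actual released versions rather than rebuilding each starting point from the same migration list.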
Why Current Approach Failed
- Dual source of truth: Schema defined in both SCHEMA_SQL and migrations
- Temporal coupling: SCHEMA_SQL assumes post-migration state
- No upgrade path: Can't get from old state to new state
- Hidden dependencies: Index creation depends on migration execution
Consequences
Positive
- Reliable database upgrades from any version
- Clear separation of concerns (initial vs evolution)
- Easier to test migration paths
- Follows established patterns
- Supports future rollback capabilities
Negative
- Requires maintaining historical schema (INITIAL_SCHEMA_SQL)
- Fresh databases take longer to initialize (run all migrations)
- More complex initialization logic
- Need to reconstruct v0.1.0 schema
Migration Path
- v1.0.1: Quick fix - remove conflicting indexes from SCHEMA_SQL
- v1.0.1: Add manual upgrade instructions for production
- v1.1.0: Implement full redesign with INITIAL_SCHEMA_SQL
- v1.1.0: Add comprehensive migration testing
Alternatives Considered
1. Dynamic Schema Detection
Approach: Detect existing table structure and conditionally apply indexes
Rejected because:
- Complex conditional logic
- Fragile heuristics
- Doesn't solve root cause
- Hard to test all paths
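For comparison, the rejected detection approach might look roughly like the sketch below, which inspects `PRAGMA table_info` before creating each index. The function names and the `candidates` list are hypothetical; the point is that every future schema change would add another conditional entry to maintain and test.

```python
import sqlite3


def column_names(conn, table):
    """Return the set of column names SQLite reports for `table`."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def create_indexes_conditionally(conn):
    """Create each index only if its column exists; return the names created."""
    # Hypothetical index list; every schema change would need a new entry here.
    candidates = [("idx_tokens_hash", "tokens", "token_hash")]
    created = []
    for index_name, table, column in candidates:
        if column in column_names(conn, table):
            conn.execute(
                f"CREATE INDEX IF NOT EXISTS {index_name} ON {table}({column})"
            )
            created.append(index_name)
    return created


# Old layout: the column is missing, so the index is silently skipped.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT)")
old_created = create_indexes_conditionally(old)

# New layout: the column exists, so the index is created.
new = sqlite3.connect(":memory:")
new.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT)")
new_created = create_indexes_conditionally(new)
```

Note that the old database ends up without the index and without any record of why, which is the kind of hidden state the migration-based design avoids.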
2. Schema Snapshots
Approach: Maintain schema snapshots for each version, apply appropriate one
Rejected because:
- Maintenance burden
- Storage overhead
- Complex version detection
- Still doesn't provide upgrade path
3. Migration-Only Schema
Approach: No SCHEMA_SQL at all, everything through migrations
Rejected because:
- Slower fresh installations
- Need to maintain migration 000 as "initial schema"
- Harder to see current schema structure
- Goes against SQLite's lightweight philosophy
References
- Rails Database Migrations
- Django Migrations
- Alembic Documentation
- Production incident: v1.0.0-rc.1 deployment failure (/docs/reports/migration-failure-diagnosis-v1.0.0-rc.1.md)
Implementation Checklist
- Create INITIAL_SCHEMA_SQL from v0.1.0 structure
- Modify init_db() to check database state
- Update migration runner to handle fresh databases
- Add schema_info table for version tracking
- Create migration test suite
- Document upgrade procedures
- Test upgrade paths from all released versions