StarPunk/docs/decisions/ADR-031-database-migration-system-redesign.md
Phil Skentelbery 3ed77fd45f fix: Resolve database migration failure on existing databases
Fixes critical issue where migration 002 indexes already existed in SCHEMA_SQL,
causing 'index already exists' errors on databases created before v1.0.0-rc.1.

Changes:
- Removed duplicate index definitions from SCHEMA_SQL (database.py)
- Enhanced migration system to detect and handle indexes properly
- Added comprehensive documentation of the fix

Version bumped to 1.0.0-rc.2 with full changelog entry.

Refs: docs/reports/2025-11-24-migration-fix-v1.0.0-rc.2.md
2025-11-24 13:11:14 -07:00


# ADR-031: Database Migration System Redesign
## Status
Proposed
## Context
The v1.0.0-rc.1 release exposed a critical flaw in our database initialization and migration system. The system fails when upgrading existing production databases because:
1. `SCHEMA_SQL` represents the current (latest) schema structure
2. `SCHEMA_SQL` is executed BEFORE migrations run
3. Existing databases have old table structures that conflict with SCHEMA_SQL's expectations
4. The system tries to create indexes on columns that don't exist yet
The result is that the same startup code behaves differently depending on the database's age:
- Fresh databases work fine (SCHEMA_SQL creates the latest structure)
- Existing databases fail (SCHEMA_SQL conflicts with old structure)
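
The failure mode described above can be reproduced with a few lines of `sqlite3`. This is a minimal sketch, not the project's actual schema: the `OLD_SCHEMA` table layout and the contents of `SCHEMA_SQL` here are assumptions chosen only to mirror the situation where an index targets a column that an older table lacks.

```python
import sqlite3

# Hypothetical old-style table, created before `token_hash` existed.
OLD_SCHEMA = "CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);"

# Hypothetical "latest" schema, executed unconditionally at startup.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    token_hash TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""


def try_schema(conn):
    """Run SCHEMA_SQL and return the error message, or None on success."""
    try:
        conn.executescript(SCHEMA_SQL)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


fresh = sqlite3.connect(":memory:")
existing = sqlite3.connect(":memory:")
existing.executescript(OLD_SCHEMA)

fresh_error = try_schema(fresh)        # None: table and index are both created
existing_error = try_schema(existing)  # error: the index targets a missing column
```

`CREATE TABLE IF NOT EXISTS` silently skips the old table, so the stale structure survives and the following `CREATE INDEX` is the statement that fails.
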
## Decision
Redesign the database initialization system to follow these principles:
1. **SCHEMA_SQL represents the initial v0.1.0 schema**, not the current schema
2. **All schema evolution happens through migrations**
3. **On existing databases, migrations run BEFORE any schema creation attempts**
4. **Fresh databases get the initial schema then run ALL migrations**
### Implementation Strategy
#### Phase 1: Immediate Fix (v1.0.1)
Remove the conflicting index definitions from SCHEMA_SQL, since the migrations already create those indexes:
```python
# Remove from SCHEMA_SQL:
# CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
# Let migration 002 handle this
```
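
With the index owned by the migration alone, the migration itself can be made safe to run against databases that already carry the index. The following is one possible sketch of such a guard. The helper names `index_exists` and `create_index_once` are hypothetical and are not taken from the project's migration runner; only the `sqlite_master` catalog query is standard SQLite.

```python
import sqlite3


def index_exists(conn, index_name):
    """Return True if an index with this name is recorded in sqlite_master."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None


def create_index_once(conn, index_name, create_sql):
    """Run create_sql only when the index is absent; report whether it ran."""
    if index_exists(conn, index_name):
        return False
    conn.execute(create_sql)
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT)")
sql = "CREATE INDEX idx_tokens_hash ON tokens(token_hash)"
first = create_index_once(conn, "idx_tokens_hash", sql)   # index is created
second = create_index_once(conn, "idx_tokens_hash", sql)  # skipped, already present
```
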
#### Phase 2: Proper Redesign (v1.1.0)
1. Create `INITIAL_SCHEMA_SQL` with the v0.1.0 database structure
2. Modify `init_db()` logic:
```python
def init_db(app=None):
    # 1. Check if database exists and has tables
    if database_exists_with_tables():
        # Existing database - only run migrations
        run_migrations()
    else:
        # Fresh database - create initial schema then migrate
        conn.executescript(INITIAL_SCHEMA_SQL)
        run_all_migrations()
```
3. Add explicit schema versioning:
```sql
CREATE TABLE schema_info (
    version TEXT PRIMARY KEY,
    upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```
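
The Phase 2 steps can be combined into one self-contained sketch. Everything concrete here is an assumption for illustration: the `notes` table stands in for the real v0.1.0 schema (which this ADR does not list), the `MIGRATIONS` list and its version labels are hypothetical, and the functions take an explicit `conn` rather than the `app` argument used in the outline above.

```python
import sqlite3

# Placeholder standing in for the real v0.1.0 schema.
INITIAL_SCHEMA_SQL = "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT);"

SCHEMA_INFO_SQL = """
CREATE TABLE IF NOT EXISTS schema_info (
    version TEXT PRIMARY KEY,
    upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""

# Hypothetical ordered migrations: (version label, SQL to apply).
MIGRATIONS = [
    ("001", "ALTER TABLE notes ADD COLUMN slug TEXT;"),
    ("002", "CREATE INDEX idx_notes_slug ON notes(slug);"),
]


def database_exists_with_tables(conn):
    """True when the database already holds at least one user table."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' LIMIT 1"
    ).fetchone()
    return row is not None


def run_migrations(conn):
    """Apply migrations not yet recorded in schema_info; return their versions."""
    conn.executescript(SCHEMA_INFO_SQL)
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_info")}
    newly_applied = []
    for version, sql in MIGRATIONS:
        if version in applied:
            continue
        conn.executescript(sql)
        conn.execute("INSERT INTO schema_info (version) VALUES (?)", (version,))
        newly_applied.append(version)
    conn.commit()
    return newly_applied


def init_db(conn):
    """Create the initial schema on a fresh database, then migrate either way."""
    if not database_exists_with_tables(conn):
        conn.executescript(INITIAL_SCHEMA_SQL)
    return run_migrations(conn)
```

Calling `init_db(sqlite3.connect(":memory:"))` applies both migrations, and a second call on the same connection applies none. One caveat the sketch does not handle: a database that predates `schema_info` would have no recorded versions, so its already-applied migrations would need to be backfilled before this runner could be trusted on it.
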
## Rationale
### Why Initial Schema + Migrations?
1. **Predictable upgrade path**: Every database follows the same evolution
2. **Testable**: Can test upgrades from any version to any version
3. **Auditable**: Migration history shows exact evolution path
4. **Reversible**: Can potentially support rollbacks
5. **Industry standard**: Follows patterns from Rails, Django, Alembic
### Why Current Approach Failed
1. **Dual source of truth**: Schema defined in both SCHEMA_SQL and migrations
2. **Temporal coupling**: SCHEMA_SQL assumes post-migration state
3. **No upgrade path**: Can't get from old state to new state
4. **Hidden dependencies**: Index creation depends on migration execution
## Consequences
### Positive
- Reliable database upgrades from any version
- Clear separation of concerns (initial vs evolution)
- Easier to test migration paths
- Follows established patterns
- Supports future rollback capabilities
### Negative
- Requires maintaining historical schema (INITIAL_SCHEMA_SQL)
- Fresh databases take longer to initialize (run all migrations)
- More complex initialization logic
- Need to reconstruct v0.1.0 schema
### Migration Path
1. v1.0.1: Quick fix - remove conflicting indexes from SCHEMA_SQL
2. v1.0.1: Add manual upgrade instructions for production
3. v1.1.0: Implement full redesign with INITIAL_SCHEMA_SQL
4. v1.1.0: Add comprehensive migration testing
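
As a rough illustration of what migration testing in step 4 might check, the sketch below builds a database at every intermediate version, upgrades it, and compares the resulting schema with that of a fresh install. The schema, the `MIGRATIONS` list, and all helper names are hypothetical placeholders, and a real suite would exercise the project's own migration runner instead of applying SQL directly.

```python
import sqlite3

# Placeholder initial schema and migrations, for illustration only.
INITIAL_SCHEMA_SQL = "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT);"
MIGRATIONS = [
    "ALTER TABLE notes ADD COLUMN slug TEXT;",
    "CREATE INDEX idx_notes_slug ON notes(slug);",
]


def schema_snapshot(conn):
    """Return the catalog entries that define the database structure."""
    return conn.execute(
        "SELECT type, name, sql FROM sqlite_master ORDER BY type, name"
    ).fetchall()


def database_at(version_count):
    """Build an in-memory database with the first version_count migrations applied."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(INITIAL_SCHEMA_SQL)
    for sql in MIGRATIONS[:version_count]:
        conn.executescript(sql)
    return conn


def upgrade(conn, from_count):
    """Apply every migration after the first from_count."""
    for sql in MIGRATIONS[from_count:]:
        conn.executescript(sql)


def check_all_upgrade_paths():
    """Verify each starting version converges on the fresh-install schema."""
    expected = schema_snapshot(database_at(len(MIGRATIONS)))
    for start in range(len(MIGRATIONS) + 1):
        conn = database_at(start)
        conn.execute("INSERT INTO notes (body) VALUES ('kept')")
        conn.commit()
        upgrade(conn, start)
        assert schema_snapshot(conn) == expected, f"path from {start} diverged"
        # Existing rows should survive the upgrade untouched.
        assert conn.execute("SELECT body FROM notes").fetchall() == [("kept",)]
    return len(MIGRATIONS) + 1
```

The function returns the number of starting points it checked, which makes it easy to confirm that newly added migrations are actually covered.
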
## Alternatives Considered
### 1. Dynamic Schema Detection
**Approach**: Detect existing table structure and conditionally apply indexes
**Rejected because**:
- Complex conditional logic
- Fragile heuristics
- Doesn't solve root cause
- Hard to test all paths
### 2. Schema Snapshots
**Approach**: Maintain schema snapshots for each version, apply appropriate one
**Rejected because**:
- Maintenance burden
- Storage overhead
- Complex version detection
- Still doesn't provide upgrade path
### 3. Migration-Only Schema
**Approach**: No SCHEMA_SQL at all, everything through migrations
**Rejected because**:
- Slower fresh installations
- Need to maintain migration 000 as "initial schema"
- Harder to see current schema structure
- Goes against SQLite's lightweight philosophy
## References
- [Rails Database Migrations](https://guides.rubyonrails.org/active_record_migrations.html)
- [Django Migrations](https://docs.djangoproject.com/en/stable/topics/migrations/)
- [Alembic Documentation](https://alembic.sqlalchemy.org/)
- Production incident: v1.0.0-rc.1 deployment failure
- `/docs/reports/migration-failure-diagnosis-v1.0.0-rc.1.md`
## Implementation Checklist
- [ ] Create INITIAL_SCHEMA_SQL from v0.1.0 structure
- [ ] Modify init_db() to check database state
- [ ] Update migration runner to handle fresh databases
- [ ] Add schema_info table for version tracking
- [ ] Create migration test suite
- [ ] Document upgrade procedures
- [ ] Test upgrade paths from all released versions