# ADR-031: Database Migration System Redesign

## Status
Proposed

## Context

The v1.0.0-rc.1 release exposed a critical flaw in our database initialization and migration system. The system fails when upgrading existing production databases because:

1. `SCHEMA_SQL` represents the current (latest) schema structure
2. `SCHEMA_SQL` is executed BEFORE migrations run
3. Existing databases have old table structures that conflict with SCHEMA_SQL's expectations
4. The system tries to create indexes on columns that don't exist yet

This creates an impossible situation where:
- Fresh databases work fine (SCHEMA_SQL creates the latest structure)
- Existing databases fail (SCHEMA_SQL conflicts with old structure)

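The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the project's actual schema: the table layouts, the `OLD_SCHEMA_SQL` and `LATEST_SCHEMA_SQL` names, and the `apply_latest_schema` helper are assumptions made for illustration. Only the `idx_tokens_hash` index on `tokens(token_hash)` is taken from this ADR.

```python
import sqlite3

# Hypothetical pre-upgrade table: it lacks the token_hash column.
OLD_SCHEMA_SQL = """
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);
"""

# Hypothetical "latest" schema, written in the style of SCHEMA_SQL.
LATEST_SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    token_hash TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""


def apply_latest_schema(conn: sqlite3.Connection) -> str:
    """Run the latest schema and report 'ok' or the SQLite error text."""
    try:
        conn.executescript(LATEST_SCHEMA_SQL)
    except sqlite3.OperationalError as exc:
        return f"failed: {exc}"
    return "ok"


# Fresh database: CREATE TABLE builds the latest structure, so the index works.
fresh = sqlite3.connect(":memory:")
fresh_result = apply_latest_schema(fresh)

# Existing database: CREATE TABLE IF NOT EXISTS is a no-op on the old table,
# so CREATE INDEX then references a column that is not there yet.
existing = sqlite3.connect(":memory:")
existing.executescript(OLD_SCHEMA_SQL)
existing_result = apply_latest_schema(existing)

print(fresh_result)
print(existing_result)
```

The `IF NOT EXISTS` guards do not help here: they silently skip the table whose structure is out of date, and the index statement that follows still assumes the post-migration columns.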
## Decision

Redesign the database initialization system to follow these principles:

1. **SCHEMA_SQL represents the initial v0.1.0 schema**, not the current schema
2. **All schema evolution happens through migrations**
3. **Migrations run BEFORE schema creation attempts**
4. **Fresh databases get the initial schema then run ALL migrations**

### Implementation Strategy

#### Phase 1: Immediate Fix (v1.0.1)
Remove problematic index creation from SCHEMA_SQL since migrations create them:
```python
# Remove from SCHEMA_SQL:
# CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
# Let migration 002 handle this
```

#### Phase 2: Proper Redesign (v1.1.0)
1. Create `INITIAL_SCHEMA_SQL` with the v0.1.0 database structure
2. Modify `init_db()` logic:
```python
def init_db(app=None):
    # 1. Check if database exists and has tables
    if database_exists_with_tables():
        # Existing database - only run migrations
        run_migrations()
    else:
        # Fresh database - create initial schema then migrate
        conn.executescript(INITIAL_SCHEMA_SQL)
        run_all_migrations()
```

3. Add explicit schema versioning:
```sql
CREATE TABLE schema_info (
    version TEXT PRIMARY KEY,
    upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

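The Phase 2 steps can be combined into one runnable sketch. Everything below is illustrative rather than the project's implementation: the schema text, the `MIGRATIONS` list, and the `has_user_tables` helper are hypothetical, and `init_db` takes a connection instead of `app` so that the example is self-contained. It also assumes an existing database is either at the initial schema or already has a `schema_info` table; bridging databases that predate version tracking is not covered.

```python
import sqlite3

# Hypothetical stand-in for the reconstructed v0.1.0 schema.
INITIAL_SCHEMA_SQL = """
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);
"""

# Hypothetical ordered migrations: (version, SQL script).
MIGRATIONS = [
    ("001", "ALTER TABLE tokens ADD COLUMN token_hash TEXT;"),
    ("002", "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);"),
]

SCHEMA_INFO_SQL = """
CREATE TABLE IF NOT EXISTS schema_info (
    version TEXT PRIMARY KEY,
    upgraded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def has_user_tables(conn):
    """Return True if the database already contains application tables."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "AND name != 'schema_info'"
    ).fetchone()
    return row[0] > 0


def run_migrations(conn):
    """Apply every migration not yet recorded in schema_info.

    Returns the list of versions applied by this call.
    """
    conn.executescript(SCHEMA_INFO_SQL)
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_info")}
    newly_applied = []
    for version, script in MIGRATIONS:
        if version in applied:
            continue
        conn.executescript(script)
        # Record the version only after its script succeeded.
        conn.execute("INSERT INTO schema_info (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied


def init_db(conn):
    """Create the initial schema on a fresh database, then migrate."""
    if not has_user_tables(conn):
        conn.executescript(INITIAL_SCHEMA_SQL)
    return run_migrations(conn)
```

Because both branches end in the same `run_migrations` call, fresh and existing databases share a single upgrade path, and a second `init_db` call on an up-to-date database applies nothing.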
## Rationale

### Why Initial Schema + Migrations?

1. **Predictable upgrade path**: Every database follows the same evolution
2. **Testable**: Can test upgrades from any version to any version
3. **Auditable**: Migration history shows exact evolution path
4. **Reversible**: Can potentially support rollbacks
5. **Industry standard**: Follows patterns from Rails, Django, Alembic

### Why Current Approach Failed

1. **Dual source of truth**: Schema defined in both SCHEMA_SQL and migrations
2. **Temporal coupling**: SCHEMA_SQL assumes post-migration state
3. **No upgrade path**: Can't get from old state to new state
4. **Hidden dependencies**: Index creation depends on migration execution

## Consequences

### Positive
- Reliable database upgrades from any version
- Clear separation of concerns (initial vs evolution)
- Easier to test migration paths
- Follows established patterns
- Supports future rollback capabilities

### Negative
- Requires maintaining historical schema (INITIAL_SCHEMA_SQL)
- Fresh databases take longer to initialize (run all migrations)
- More complex initialization logic
- Need to reconstruct v0.1.0 schema

### Migration Path
1. v1.0.1: Quick fix - remove conflicting indexes from SCHEMA_SQL
2. v1.0.1: Add manual upgrade instructions for production
3. v1.1.0: Implement full redesign with INITIAL_SCHEMA_SQL
4. v1.1.0: Add comprehensive migration testing

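One possible shape for the migration testing planned in v1.1.0 is to check that every upgrade path converges on the same schema as a fresh install and keeps existing rows. The sketch below is an assumption about how such a test could look, not the project's test suite. The schema, the `MIGRATIONS` list, and all helper names are hypothetical.

```python
import sqlite3

# Hypothetical initial schema and migrations, for illustration only.
INITIAL_SCHEMA_SQL = (
    "CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);"
)
MIGRATIONS = [
    "ALTER TABLE tokens ADD COLUMN token_hash TEXT;",
    "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);",
]


def database_at(level):
    """Build an in-memory database with the first `level` migrations applied."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(INITIAL_SCHEMA_SQL)
    for script in MIGRATIONS[:level]:
        conn.executescript(script)
    return conn


def upgrade(conn, from_level):
    """Apply all migrations after `from_level`."""
    for script in MIGRATIONS[from_level:]:
        conn.executescript(script)


def schema_fingerprint(conn):
    """Return the stored schema definitions in a stable order."""
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master ORDER BY type, name"
    )
    return rows.fetchall()


def check_all_upgrade_paths():
    """Upgrade from every migration level and compare with a fresh install.

    Returns the number of paths checked.
    """
    expected = schema_fingerprint(database_at(len(MIGRATIONS)))
    for level in range(len(MIGRATIONS) + 1):
        conn = database_at(level)
        # Stand-in for production data that must survive the upgrade.
        conn.execute("INSERT INTO tokens (token) VALUES ('legacy-row')")
        conn.commit()
        upgrade(conn, level)
        assert schema_fingerprint(conn) == expected, f"mismatch from level {level}"
        assert conn.execute("SELECT COUNT(*) FROM tokens").fetchone()[0] == 1
    return len(MIGRATIONS) + 1
```

Starting each path with a row already in the table matters: a migration that only works on empty tables would pass a fresh-install test and still fail in production.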
## Alternatives Considered

### 1. Dynamic Schema Detection
**Approach**: Detect existing table structure and conditionally apply indexes

**Rejected because**:
- Complex conditional logic
- Fragile heuristics
- Doesn't solve root cause
- Hard to test all paths

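For context, the rejected approach would look roughly like the sketch below, which uses SQLite's `PRAGMA table_info` to probe for a column before creating an index. The helper names are hypothetical and the code is only meant to show the kind of per-object check this alternative requires.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `table` has a column named `column`."""
    # PRAGMA arguments cannot be bound as parameters, so `table` must come
    # from trusted code, never from user input.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return any(row[1] == column for row in rows)


def create_index_if_possible(conn):
    """Create idx_tokens_hash only when its target column is present."""
    if column_exists(conn, "tokens", "token_hash"):
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash)"
        )
        return True
    return False
```

Every index, column, and constraint added in a later release would need its own check of this kind, and each check adds a branch that only runs on some databases, which is where the testing burden listed above comes from.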
### 2. Schema Snapshots
**Approach**: Maintain schema snapshots for each version, apply appropriate one

**Rejected because**:
- Maintenance burden
- Storage overhead
- Complex version detection
- Still doesn't provide upgrade path

### 3. Migration-Only Schema
**Approach**: No SCHEMA_SQL at all, everything through migrations

**Rejected because**:
- Slower fresh installations
- Need to maintain migration 000 as "initial schema"
- Harder to see current schema structure
- Goes against SQLite's lightweight philosophy

## References

- [Rails Database Migrations](https://guides.rubyonrails.org/active_record_migrations.html)
- [Django Migrations](https://docs.djangoproject.com/en/stable/topics/migrations/)
- [Alembic Documentation](https://alembic.sqlalchemy.org/)
- Production incident: v1.0.0-rc.1 deployment failure
- `/docs/reports/migration-failure-diagnosis-v1.0.0-rc.1.md`

## Implementation Checklist

- [ ] Create INITIAL_SCHEMA_SQL from v0.1.0 structure
- [ ] Modify init_db() to check database state
- [ ] Update migration runner to handle fresh databases
- [ ] Add schema_info table for version tracking
- [ ] Create migration test suite
- [ ] Document upgrade procedures
- [ ] Test upgrade paths from all released versions