Fixes critical issue where migration 002 indexes already existed in SCHEMA_SQL, causing 'index already exists' errors on databases created before v1.0.0-rc.1.

Changes:
- Removed duplicate index definitions from SCHEMA_SQL (database.py)
- Enhanced migration system to detect and handle indexes properly
- Added comprehensive documentation of the fix

Version bumped to 1.0.0-rc.2 with full changelog entry.

Refs: docs/reports/2025-11-24-migration-fix-v1.0.0-rc.2.md

# ADR-032: Initial Schema SQL Implementation for Migration System

## Status
Accepted

## Context

As documented in ADR-031, the current database migration system has a critical design flaw: `SCHEMA_SQL` represents the current (latest) schema structure rather than the initial v0.1.0 schema. This causes upgrade failures for existing databases because:

1. The system tries to create indexes on columns that don't exist yet
2. Schema creation happens BEFORE migrations run
3. There's no clear upgrade path from old to new database structures

Phase 2 of ADR-031's redesign requires creating an `INITIAL_SCHEMA_SQL` constant that represents the v0.1.0 baseline schema, allowing all schema evolution to happen through migrations.

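
The first failure mode listed above can be reproduced in isolation. The following is a minimal sketch, not the project's actual schema code: `OLD_TOKENS_SQL` and `LATEST_FRAGMENT_SQL` are hypothetical, simplified fragments, and the `idx_tokens_hash` index name is an assumption made for illustration. Because `CREATE TABLE IF NOT EXISTS` is a no-op when the table is already present, the old column layout survives and the following `CREATE INDEX` fails.

```python
import sqlite3

# Hypothetical, simplified v0.1.0-style tokens table (assumption for illustration).
OLD_TOKENS_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    token TEXT PRIMARY KEY,
    me TEXT NOT NULL
);
"""

# Hypothetical fragment of a "latest" schema that expects a newer column.
LATEST_FRAGMENT_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    token_hash TEXT UNIQUE NOT NULL,
    me TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""


def apply_latest_schema_to_old_db():
    """Return the SQLite error text, or None if the script succeeded."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(OLD_TOKENS_SQL)
        try:
            # CREATE TABLE is skipped (table exists); CREATE INDEX then
            # references a column the old table does not have.
            conn.executescript(LATEST_FRAGMENT_SQL)
        except sqlite3.OperationalError as exc:
            return str(exc)
        return None
    finally:
        conn.close()


error = apply_latest_schema_to_old_db()
print(error)
```

The exact error text depends on the SQLite version, but it names the missing `token_hash` column.
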
## Decision

Create an `INITIAL_SCHEMA_SQL` constant that represents the exact database schema from the initial v0.1.0 release (commit a68fd57). This baseline schema will be used for:

1. **Fresh database initialization**: Create initial schema then run ALL migrations
2. **Existing database detection**: Skip initial schema if tables already exist
3. **Clear upgrade path**: Every database follows the same evolution through migrations

### INITIAL_SCHEMA_SQL Design

Based on analysis of the initial commit (a68fd57), the `INITIAL_SCHEMA_SQL` should contain:

```sql
-- Notes metadata (content is in files)
CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    slug TEXT UNIQUE NOT NULL,
    file_path TEXT UNIQUE NOT NULL,
    published BOOLEAN DEFAULT 0,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP,
    content_hash TEXT
);

CREATE INDEX IF NOT EXISTS idx_notes_created_at ON notes(created_at DESC);
CREATE INDEX IF NOT EXISTS idx_notes_published ON notes(published);
CREATE INDEX IF NOT EXISTS idx_notes_slug ON notes(slug);
CREATE INDEX IF NOT EXISTS idx_notes_deleted_at ON notes(deleted_at);

-- Authentication sessions (IndieLogin)
CREATE TABLE IF NOT EXISTS sessions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_token TEXT UNIQUE NOT NULL,
    me TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL,
    last_used_at TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_sessions_token ON sessions(session_token);
CREATE INDEX IF NOT EXISTS idx_sessions_expires ON sessions(expires_at);

-- Micropub access tokens (original insecure version)
CREATE TABLE IF NOT EXISTS tokens (
    token TEXT PRIMARY KEY,
    me TEXT NOT NULL,
    client_id TEXT,
    scope TEXT,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_tokens_me ON tokens(me);

-- CSRF state tokens (for IndieAuth flow)
CREATE TABLE IF NOT EXISTS auth_state (
    state TEXT PRIMARY KEY,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_auth_state_expires ON auth_state(expires_at);
```

### Key Differences from Current SCHEMA_SQL

1. **sessions table**: Uses `session_token` (plain text) instead of `session_token_hash`
2. **tokens table**: Original insecure structure with plain text tokens as PRIMARY KEY
3. **auth_state table**: No `code_verifier` column (added in migration 001)
4. **No authorization_codes table**: Added in migration 002
5. **No secure token columns**: token_hash, last_used_at, revoked_at added later

### Implementation Architecture

```python
# database.py structure
import sqlite3

INITIAL_SCHEMA_SQL = """
-- V0.1.0 baseline schema (see ADR-032)
-- [SQL content as shown above]
"""

CURRENT_SCHEMA_SQL = """
-- Current complete schema for reference
-- NOT used for database initialization
-- [Current SCHEMA_SQL content - for documentation only]
"""

def init_db(app=None):
    """Initialize database with proper migration handling"""
    # NOTE: db_path and logger are assumed to be resolved from `app`
    # before this point; that lookup is elided in this sketch.

    # 1. Check if database exists and has tables
    if database_exists_with_tables():
        # Existing database - only run migrations
        run_migrations(db_path, logger)
    else:
        # Fresh database - create initial schema then migrate
        conn = sqlite3.connect(db_path)
        try:
            # Create v0.1.0 baseline schema
            conn.executescript(INITIAL_SCHEMA_SQL)
            conn.commit()
            logger.info("Created initial v0.1.0 database schema")
        finally:
            conn.close()

        # Run all migrations to bring to current version
        run_migrations(db_path, logger)
```

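
The `database_exists_with_tables()` helper called above is not defined in this ADR. One possible shape is sketched below, assuming the helper receives the database path as an argument (the call above passes none, so the `db_path` parameter is an assumption) and that "has tables" means at least one user-defined table is listed in `sqlite_master`.

```python
import os
import sqlite3
import tempfile


def database_exists_with_tables(db_path):
    """Return True if db_path is an SQLite file with at least one user table."""
    if not os.path.exists(db_path):
        return False
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        # Ignore SQLite's internal tables such as sqlite_sequence.
        return any(not name.startswith("sqlite_") for (name,) in rows)
    finally:
        conn.close()


# Usage: a missing file is "fresh"; a file with a table is "existing".
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.db")
    before = database_exists_with_tables(path)

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()

    after = database_exists_with_tables(path)
```

Checking for the file first avoids `sqlite3.connect()` creating an empty database as a side effect of the check.
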
### Migration Evolution Path

Starting from INITIAL_SCHEMA_SQL, the database evolves through:

1. **Migration 001**: Add code_verifier to auth_state (PKCE support)
2. **Migration 002**: Secure token storage (complete tokens table rebuild)
3. **Future migrations**: Continue evolution from this baseline

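
The evolution path above can be illustrated with a small, self-contained sketch. It is not the project's `run_migrations()`: the `apply_pending_migrations` function, the `schema_migrations` tracking table, and the column definition used for migration 001 are all assumptions made for illustration, and migration 002 is omitted because its SQL is not given here.

```python
import sqlite3

# Baseline auth_state table, as in the v0.1.0 schema above.
BASELINE_SQL = """
CREATE TABLE IF NOT EXISTS auth_state (
    state TEXT PRIMARY KEY,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL
);
"""

# Ordered list of (name, sql). The column type and default are assumptions.
MIGRATIONS = [
    (
        "001_add_code_verifier",
        "ALTER TABLE auth_state ADD COLUMN code_verifier TEXT NOT NULL DEFAULT ''",
    ),
]


def apply_pending_migrations(conn):
    """Apply migrations not yet recorded; return the names applied this run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "name TEXT PRIMARY KEY, "
        "applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    newly_applied = []
    for name, sql in MIGRATIONS:
        if name in applied:
            continue  # replay detection: already recorded, skip
        conn.executescript(sql)
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
        conn.commit()
        newly_applied.append(name)
    return newly_applied


conn = sqlite3.connect(":memory:")
conn.executescript(BASELINE_SQL)
first_run = apply_pending_migrations(conn)
second_run = apply_pending_migrations(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(auth_state)")]
conn.close()
```

Running the function twice shows the intended behavior: the first call applies migration 001 and the second call finds nothing left to do.
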
## Rationale

### Why This Specific Schema?

1. **Historical accuracy**: Represents the actual v0.1.0 release state
2. **Clean evolution**: All changes tracked through migrations
3. **Testable upgrades**: Can test upgrade path from any version
4. **No ambiguity**: Clear separation between initial and evolved state

### Why Not Alternative Approaches?

1. **Not using migration 000**: Migrations should represent changes, not initial state
2. **Not using current schema**: Would skip migration history for new databases
3. **Not detecting schema dynamically**: Too complex and fragile

## Consequences

### Positive

- **Reliable upgrades**: Any database can upgrade to any version
- **Clear history**: Migration path shows exact evolution
- **Testable**: Can verify upgrade paths in CI/CD
- **Standard pattern**: Follows Rails/Django migration patterns
- **Maintainable**: Single source of truth for initial schema

### Negative

- **Historical maintenance**: Must preserve v0.1.0 schema forever
- **Slower fresh installs**: Must run all migrations on new databases
- **Documentation burden**: Need to explain two schema constants

### Implementation Requirements

1. **Code Changes**:
   - Add `INITIAL_SCHEMA_SQL` constant to `database.py`
   - Modify `init_db()` to use new initialization logic
   - Add `database_exists_with_tables()` helper function
   - Rename current `SCHEMA_SQL` to `CURRENT_SCHEMA_SQL` (documentation only)

2. **Testing Requirements**:
   - Test fresh database initialization
   - Test upgrade from v0.1.0 schema
   - Test upgrade from each released version
   - Test migration replay detection
   - Verify all indexes created correctly

3. **Documentation Updates**:
   - Update database.py docstrings
   - Document schema evolution in architecture docs
   - Add upgrade guide for production systems
   - Update deployment documentation

## Migration Strategy

### For v1.1.0 Release

1. **Implement INITIAL_SCHEMA_SQL** as designed above
2. **Update init_db()** with new logic
3. **Comprehensive testing** of upgrade paths
4. **Documentation** of upgrade procedures
5. **Release notes** explaining the change

### For Existing Production Systems

After v1.1.0 deployment:

1. Existing databases will skip INITIAL_SCHEMA_SQL (tables exist)
2. Migrations run normally to update schema
3. No manual intervention required
4. Full backward compatibility maintained

## Testing Checklist

- [ ] Fresh database gets v0.1.0 schema then migrations
- [ ] Existing v0.1.0 database upgrades correctly
- [ ] Existing v1.0.0 database upgrades correctly
- [ ] All indexes created in correct order
- [ ] No duplicate table/index creation errors
- [ ] Migration history tracked correctly
- [ ] Performance acceptable for fresh installs

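
Two of the checklist items (indexes present, no duplicate creation errors) lend themselves to a quick automated check. The sketch below uses only the `notes` portion of the baseline schema for brevity; the `explicit_indexes` helper and the `EXPECTED_INDEXES` set are hypothetical names introduced for illustration, not part of the existing test suite.

```python
import sqlite3

# Subset of the v0.1.0 baseline schema (notes table only), for brevity.
BASELINE_SUBSET_SQL = """
CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    slug TEXT UNIQUE NOT NULL,
    file_path TEXT UNIQUE NOT NULL,
    published BOOLEAN DEFAULT 0,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP,
    content_hash TEXT
);
CREATE INDEX IF NOT EXISTS idx_notes_created_at ON notes(created_at DESC);
CREATE INDEX IF NOT EXISTS idx_notes_published ON notes(published);
CREATE INDEX IF NOT EXISTS idx_notes_slug ON notes(slug);
CREATE INDEX IF NOT EXISTS idx_notes_deleted_at ON notes(deleted_at);
"""

EXPECTED_INDEXES = {
    "idx_notes_created_at",
    "idx_notes_published",
    "idx_notes_slug",
    "idx_notes_deleted_at",
}


def explicit_indexes(conn):
    """Return the names of explicitly created indexes."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    # Skip the sqlite_autoindex_* entries SQLite creates for UNIQUE constraints.
    return {name for (name,) in rows if not name.startswith("sqlite_")}


conn = sqlite3.connect(":memory:")
conn.executescript(BASELINE_SUBSET_SQL)
conn.executescript(BASELINE_SUBSET_SQL)  # replaying the baseline must not raise
found = explicit_indexes(conn)
conn.close()

assert found == EXPECTED_INDEXES
```

Because every statement uses `IF NOT EXISTS`, applying the baseline a second time is a no-op rather than an error.
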
## References

- ADR-031: Database Migration System Redesign
- Original v0.1.0 schema (commit a68fd57)
- Migration 001: Add code_verifier to auth_state
- Migration 002: Secure tokens and authorization codes
- SQLite documentation on schema management
- Rails/Django migration patterns

## Implementation Notes

**Priority**: HIGH - Required for v1.1.0 release
**Complexity**: Medium - Clear requirements but needs careful testing
**Risk**: Low - Backward compatible, well-understood pattern
**Effort**: 4-6 hours including testing