diff --git a/docs/architecture/database-migration-architecture.md b/docs/architecture/database-migration-architecture.md new file mode 100644 index 0000000..5e660e7 --- /dev/null +++ b/docs/architecture/database-migration-architecture.md @@ -0,0 +1,212 @@ +# Database Migration Architecture + +## Overview +StarPunk uses a dual-strategy database initialization system that combines immediate schema creation (SCHEMA_SQL) with evolutionary migrations. This architecture provides both fast fresh installations and safe upgrades for existing databases. + +## Components + +### 1. SCHEMA_SQL (database.py) +**Purpose**: Define the current complete database schema for fresh installations + +**Location**: `/starpunk/database.py` lines 11-87 + +**Responsibilities**: +- Create all tables with current structure +- Create all columns with current types +- Create base indexes for performance +- Provide instant database initialization for new installations + +**Design Principle**: Always represents the latest schema version + +### 2. Migration Files +**Purpose**: Transform existing databases from one version to another + +**Location**: `/migrations/*.sql` + +**Format**: `{number}_{description}.sql` +- Number: Three-digit zero-padded sequence (001, 002, etc.) +- Description: Clear indication of changes + +**Responsibilities**: +- Add new tables/columns to existing databases +- Modify existing structures safely +- Create indexes and constraints +- Handle breaking changes with data preservation + +### 3. Migration Runner (migrations.py) +**Purpose**: Intelligent application of migrations based on database state + +**Location**: `/starpunk/migrations.py` + +**Key Features**: +- Fresh database detection +- Partial schema recognition +- Smart migration skipping +- Index-only application +- Transaction safety + +## Architecture Patterns + +### Fresh Database Flow +``` +1. init_db() called +2. SCHEMA_SQL executed (creates all current tables/columns) +3. run_migrations() called +4. Detects fresh database (empty schema_migrations) +5. Checks if schema is current (is_schema_current()) +6. If current: marks all migrations as applied (no execution) +7. If partial: applies only needed migrations +``` + +### Existing Database Flow +``` +1. init_db() called +2. SCHEMA_SQL executed (CREATE IF NOT EXISTS - no-op for existing tables) +3. run_migrations() called +4. Reads schema_migrations table +5. Discovers migration files +6. Applies only unapplied migrations in sequence +``` + +### Hybrid Database Flow (Production Issue Case) +``` +1. Database has tables from SCHEMA_SQL but no migration records +2. run_migrations() detects migration_count == 0 +3. For each migration, calls is_migration_needed() +4. Migration 002: detects tables exist, indexes missing +5. Creates only missing indexes +6. Marks migration as applied without full execution +``` + +## State Detection Logic + +### is_schema_current() Function +Determines if database matches current schema version completely. + +**Checks**: +1. Table existence (authorization_codes) +2. Column existence (token_hash in tokens) +3. Index existence (idx_tokens_hash, etc.) + +**Returns**: +- True: Schema is completely current (all migrations applied) +- False: Schema needs migrations + +### is_migration_needed() Function +Determines if a specific migration should be applied. + +**For Migration 002**: +1. Check if authorization_codes table exists +2. Check if token_hash column exists in tokens +3. Check if indexes exist +4. Return True only if tables/columns are missing +5. Return False if only indexes are missing (handled separately) + +## Design Decisions + +### Why Dual Strategy? +1. **Fresh Install Speed**: SCHEMA_SQL provides instant, complete schema +2. **Upgrade Safety**: Migrations provide controlled, versioned changes +3. **Flexibility**: Can handle various database states gracefully + +### Why Smart Detection? +1. **Idempotency**: Same code works for any database state +2. **Self-Healing**: Can fix partial schemas automatically +3. **No Data Loss**: Never drops tables unnecessarily + +### Why Check Indexes Separately? +1. **SCHEMA_SQL Evolution**: As SCHEMA_SQL includes migration changes, we avoid conflicts +2. **Granular Control**: Can apply just missing pieces +3. **Performance**: Indexes can be added without table locks + +## Migration Guidelines + +### Writing Migrations +1. **Never use IF NOT EXISTS in migrations**: Migrations should fail if preconditions aren't met +2. **Always provide rollback path**: Document how to reverse changes +3. **One logical change per migration**: Keep migrations focused +4. **Test with various database states**: Fresh, existing, and hybrid + +### SCHEMA_SQL Updates +When updating SCHEMA_SQL after a migration: +1. Include all changes from the migration +2. Remove indexes that migrations will create (avoid conflicts) +3. Keep CREATE IF NOT EXISTS for idempotency +4. Test fresh installations + +## Error Recovery + +### Common Issues + +#### "Table already exists" Error +**Cause**: Migration tries to create table that SCHEMA_SQL already created + +**Solution**: Smart detection should prevent this. If it fails: +1. Check if migration is already in schema_migrations +2. Verify is_migration_needed() logic +3. Manually mark migration as applied if needed + +#### Missing Indexes +**Cause**: Tables exist from SCHEMA_SQL but indexes weren't created + +**Solution**: Migration system creates missing indexes separately + +#### Partial Migration Application +**Cause**: Migration failed partway through + +**Solution**: Transactions ensure all-or-nothing. Rollback and retry. + +## State Verification Queries + +### Check Migration Status +```sql +SELECT * FROM schema_migrations ORDER BY id; +``` + +### Check Table Existence +```sql +SELECT name FROM sqlite_master +WHERE type='table' +ORDER BY name; +``` + +### Check Index Existence +```sql +SELECT name FROM sqlite_master +WHERE type='index' +ORDER BY name; +``` + +### Check Column Structure +```sql +PRAGMA table_info(tokens); +PRAGMA table_info(authorization_codes); +``` + +## Future Improvements + +### Potential Enhancements +1. **Migration Rollback**: Add down() migrations for reversibility +2. **Schema Versioning**: Add version table for faster state detection +3. **Migration Validation**: Pre-flight checks before application +4. **Dry Run Mode**: Test migrations without applying + +### Considered Alternatives +1. **Migrations-Only**: Rejected - slow fresh installs +2. **SCHEMA_SQL-Only**: Rejected - no upgrade path +3. **ORM-Based**: Rejected - unnecessary complexity for single-user system +4. **External Tools**: Rejected - additional dependencies + +## Security Considerations + +### Migration Safety +1. All migrations run in transactions +2. Rollback on any error +3. No data destruction without explicit user action +4. Token invalidation documented when necessary + +### Schema Security +1. Tokens stored as SHA256 hashes +2. Proper indexes for timing attack prevention +3. Expiration columns for automatic cleanup +4. Soft deletion support \ No newline at end of file diff --git a/docs/decisions/ADR-041-database-migration-conflict-resolution.md b/docs/decisions/ADR-041-database-migration-conflict-resolution.md new file mode 100644 index 0000000..deb34ec --- /dev/null +++ b/docs/decisions/ADR-041-database-migration-conflict-resolution.md @@ -0,0 +1,123 @@ +# ADR-041: Database Migration Conflict Resolution + +## Status +Accepted + +## Context +The v1.0.0-rc.2 container deployment is failing with the error: +``` +Migration 002_secure_tokens_and_authorization_codes.sql failed: table authorization_codes already exists +``` + +The production database is in a hybrid state: +1. **v1.0.0-rc.1 Impact**: The `authorization_codes` table was created by SCHEMA_SQL in database.py +2. **Missing Elements**: The production database lacks the proper indexes that migration 002 would create +3. **Migration Tracking**: The schema_migrations table likely shows migration 002 hasn't been applied +4. **Partial Schema**: The database has tables/columns from SCHEMA_SQL but not the complete migration features + +### Root Cause Analysis +The conflict arose from an architectural mismatch between two database initialization strategies: +1. **SCHEMA_SQL Approach**: Creates complete schema upfront (including authorization_codes table) +2. **Migration Approach**: Expects to create tables that don't exist yet + +In v1.0.0-rc.1, SCHEMA_SQL included the `authorization_codes` table creation (lines 58-76 in database.py). When migration 002 tries to run, it attempts to CREATE TABLE authorization_codes, which already exists. + +### Current Migration System Logic +The migrations.py file has sophisticated logic to handle this scenario: +1. **Fresh Database Detection** (lines 352-368): If schema_migrations is empty and schema is current, mark all migrations as applied +2. **Partial Schema Handling** (lines 176-211): For migration 002, it checks if tables exist and creates only missing indexes +3. **Smart Migration Application** (lines 383-410): Can apply just indexes without running full migration + +However, the production database doesn't trigger the "fresh database" path because: +- The schema is NOT fully current (missing indexes) +- The is_schema_current() check (lines 89-95) requires ALL indexes to exist + +## Decision +The architecture already has the correct solution implemented. The issue is that the production database falls into an edge case where: +1. Tables exist (from SCHEMA_SQL) +2. Indexes don't exist (never created) +3. Migration tracking is empty or partial + +The migrations.py file already handles this case correctly in lines 383-410: +- If migration 002's tables exist but indexes don't, it creates just the indexes +- Then marks the migration as applied without running the full SQL + +## Rationale +The existing architecture is sound and handles the hybrid state correctly. The migration system's sophisticated detection logic can: +1. Identify when tables already exist +2. Create only the missing pieces (indexes) +3. Mark migrations as applied appropriately + +This approach: +- Avoids data loss +- Handles partial schemas gracefully +- Maintains idempotency +- Provides clear logging + +## Consequences + +### Positive +1. **Zero Data Loss**: Existing tables are preserved +2. **Graceful Recovery**: System can heal partial schemas automatically +3. **Clear Audit Trail**: Migration tracking shows what was applied +4. **Future-Proof**: Handles various database states correctly + +### Negative +1. **Complexity**: The migration logic is sophisticated and must be understood +2. **Edge Cases**: Requires careful testing of various database states + +## Implementation Notes + +### Database State Detection +The system uses multiple checks to determine database state: +```python +# Check for tables +table_exists(conn, 'authorization_codes') + +# Check for columns +column_exists(conn, 'tokens', 'token_hash') + +# Check for indexes (critical for determining if migration 002 ran) +index_exists(conn, 'idx_tokens_hash') +``` + +### Hybrid State Resolution +When a database has tables but not indexes: +1. Migration 002 is detected as "not needed" for table creation +2. System creates missing indexes individually +3. Migration is marked as applied + +### Production Fix Path +For the current production issue: +1. The v1.0.0-rc.2 container should work correctly +2. The migration system will detect the hybrid state +3. It will create only the missing indexes +4. Migration 002 will be marked as applied + +If the error persists, it suggests the migration system isn't detecting the state correctly, which would require investigation of: +- The exact schema_migrations table contents +- Which tables/columns/indexes actually exist +- The execution path through migrations.py + +## Alternatives Considered + +### Alternative 1: Remove Tables from SCHEMA_SQL +**Rejected**: Would break fresh installations + +### Alternative 2: Make Migration 002 Idempotent +Use CREATE TABLE IF NOT EXISTS in the migration. +**Rejected**: Would hide partial application issues and not handle the DROP TABLE statement correctly + +### Alternative 3: Version-Specific SCHEMA_SQL +Have different SCHEMA_SQL for different versions. +**Rejected**: Too complex to maintain + +### Alternative 4: Manual Intervention +Require manual database fixes. +**Rejected**: Goes against the self-healing architecture principle + +## References +- migrations.py lines 176-211 (migration 002 detection) +- migrations.py lines 383-410 (index-only creation) +- database.py lines 58-76 (authorization_codes in SCHEMA_SQL) +- Migration file: 002_secure_tokens_and_authorization_codes.sql \ No newline at end of file diff --git a/docs/reports/database-migration-conflict-diagnosis.md b/docs/reports/database-migration-conflict-diagnosis.md new file mode 100644 index 0000000..68d2eb3 --- /dev/null +++ b/docs/reports/database-migration-conflict-diagnosis.md @@ -0,0 +1,191 @@ +# Database Migration Conflict Diagnosis Report + +## Executive Summary +The v1.0.0-rc.2 container is failing because migration 002 attempts to CREATE TABLE authorization_codes, but this table already exists in the production database (created by v1.0.0-rc.1's SCHEMA_SQL). + +## Issue Details + +### Error Message +``` +Migration 002_secure_tokens_and_authorization_codes.sql failed: table authorization_codes already exists +``` + +### Root Cause +**Conflicting Database Initialization Strategies** + +1. **SCHEMA_SQL in database.py (lines 58-76)**: Creates the `authorization_codes` table directly +2. **Migration 002 (line 33)**: Also attempts to CREATE TABLE authorization_codes + +The production database was initialized with v1.0.0-rc.1's SCHEMA_SQL, which created the table. When v1.0.0-rc.2 runs, migration 002 fails because the table already exists. + +## Database State Analysis + +### What v1.0.0-rc.1 Created (via SCHEMA_SQL) +```sql +-- From database.py lines 58-76 +CREATE TABLE IF NOT EXISTS authorization_codes ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + code_hash TEXT UNIQUE NOT NULL, + me TEXT NOT NULL, + client_id TEXT NOT NULL, + redirect_uri TEXT NOT NULL, + scope TEXT, + state TEXT, + code_challenge TEXT, + code_challenge_method TEXT, + created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + expires_at TIMESTAMP NOT NULL, + used_at TIMESTAMP +); + +CREATE INDEX IF NOT EXISTS idx_auth_codes_hash ON authorization_codes(code_hash); +CREATE INDEX IF NOT EXISTS idx_auth_codes_expires ON authorization_codes(expires_at); +``` + +### What Migration 002 Tries to Do +```sql +-- From migration 002 lines 33-46 +CREATE TABLE authorization_codes ( -- NO "IF NOT EXISTS" clause! + -- Same structure as above +); +``` + +The migration uses CREATE TABLE without IF NOT EXISTS, causing it to fail when the table already exists. + +## The Good News: System Already Has the Solution + +The migrations.py file has sophisticated logic to handle this exact scenario: + +### Detection Logic (migrations.py lines 176-211) +```python +def is_migration_needed(conn, migration_name): + if migration_name == "002_secure_tokens_and_authorization_codes.sql": + # Check if tables exist + if not table_exists(conn, 'authorization_codes'): + return True # Run full migration + if not column_exists(conn, 'tokens', 'token_hash'): + return True # Run full migration + + # Check if indexes exist + has_all_indexes = ( + index_exists(conn, 'idx_tokens_hash') and + index_exists(conn, 'idx_tokens_me') and + # ... other index checks + ) + + if not has_all_indexes: + # Tables exist but indexes missing + # Don't run full migration, handle separately + return False +``` + +### Resolution Logic (migrations.py lines 383-410) +When tables exist but indexes are missing: +```python +if migration_name == "002_secure_tokens_and_authorization_codes.sql": + # Create only missing indexes + indexes_to_create = [] + if not index_exists(conn, 'idx_tokens_hash'): + indexes_to_create.append("CREATE INDEX idx_tokens_hash ON tokens(token_hash)") + # ... check and create other indexes + + # Apply indexes without running full migration + for index_sql in indexes_to_create: + conn.execute(index_sql) + + # Mark migration as applied + conn.execute( + "INSERT INTO schema_migrations (migration_name) VALUES (?)", + (migration_name,) + ) +``` + +## Why Is It Still Failing? + +The error suggests the smart detection logic isn't being triggered. Possible reasons: + +1. **Migration Already Marked as Applied**: Check if schema_migrations table already has migration 002 listed +2. **Different Code Path**: The production container might not be using the smart detection path +3. **Transaction Rollback**: An earlier error might have left the database in an inconsistent state + +## Immediate Solution + +### Option 1: Verify Smart Detection Is Working +The system SHOULD handle this automatically. If it's not, check: +1. Is migrations.py line 378 being reached? (migration_count == 0 check) +2. Is is_migration_needed() being called for migration 002? +3. Are the table existence checks working correctly? + +### Option 2: Manual Database Fix (if smart detection fails) +```sql +-- Check current state +SELECT * FROM schema_migrations WHERE migration_name LIKE '%002%'; + +-- If migration 002 is NOT listed, mark it as applied +INSERT INTO schema_migrations (migration_name) +VALUES ('002_secure_tokens_and_authorization_codes.sql'); + +-- Ensure indexes exist (if missing) +CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash); +CREATE INDEX IF NOT EXISTS idx_tokens_me ON tokens(me); +CREATE INDEX IF NOT EXISTS idx_tokens_expires ON tokens(expires_at); +CREATE INDEX IF NOT EXISTS idx_auth_codes_hash ON authorization_codes(code_hash); +CREATE INDEX IF NOT EXISTS idx_auth_codes_expires ON authorization_codes(expires_at); +``` + +## Long-term Architecture Fix + +### Current Issue +SCHEMA_SQL and migrations have overlapping responsibilities: +- SCHEMA_SQL creates authorization_codes table (v1.0.0-rc.1+) +- Migration 002 also creates authorization_codes table + +### Recommended Solution +**Already Implemented!** The smart detection in migrations.py handles this correctly. + +### Why It Should Work +1. When database has tables from SCHEMA_SQL but no migration records: + - is_migration_needed() detects tables exist + - Returns False to skip full migration + - Creates only missing indexes + - Marks migration as applied + +2. The system is designed to be self-healing and handle partial schemas + +## Verification Steps + +1. **Check Migration Status**: + ```sql + SELECT * FROM schema_migrations; + ``` + +2. **Check Table Existence**: + ```sql + SELECT name FROM sqlite_master + WHERE type='table' AND name='authorization_codes'; + ``` + +3. **Check Index Existence**: + ```sql + SELECT name FROM sqlite_master + WHERE type='index' AND name LIKE 'idx_%'; + ``` + +4. **Check Schema Version Detection**: + - The is_schema_current() function should return False (missing indexes) + - This should trigger the smart migration path + +## Conclusion + +The architecture already has the correct solution implemented in migrations.py. The smart detection logic should: +1. Detect that authorization_codes table exists +2. Skip the table creation +3. Create only missing indexes +4. Mark migration 002 as applied + +If this isn't working, the issue is likely: +- A bug in the detection logic execution path +- The production database already has migration 002 marked as applied (check schema_migrations) +- A transaction rollback leaving the database in an inconsistent state + +The system is designed to handle this exact scenario. If it's failing, we need to debug why the smart detection isn't being triggered. \ No newline at end of file