Documents the diagnosis and resolution of database migration detection conflicts

# Database Migration Architecture

## Overview
StarPunk initializes its database with a dual strategy: SCHEMA_SQL creates the complete current schema in one step, and incremental migrations upgrade databases that already exist. This gives fresh installations a fast start and existing databases a safe upgrade path.

## Components

### 1. SCHEMA_SQL (database.py)
**Purpose**: Define the current complete database schema for fresh installations

**Location**: `/starpunk/database.py` lines 11-87

**Responsibilities**:
- Create all tables with current structure
- Create all columns with current types
- Create base indexes for performance
- Provide instant database initialization for new installations

**Design Principle**: Always represents the latest schema version

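A minimal sketch of the idempotent creation that SCHEMA_SQL relies on may help here. The schema below is a hypothetical, heavily trimmed stand-in: only the `tokens` and `authorization_codes` table names and the `token_hash` column come from this document, and `init_schema` and `table_names` are illustrative helper names, not the actual `init_db()` implementation.

```python
import sqlite3

# Hypothetical, heavily trimmed stand-in for SCHEMA_SQL. Only the table names
# and the token_hash column come from this document; the rest is illustrative.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    token_hash TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS authorization_codes (
    id INTEGER PRIMARY KEY,
    code_hash TEXT NOT NULL
);
"""


def init_schema(conn: sqlite3.Connection) -> None:
    """Create any missing tables; tables that already exist are untouched."""
    conn.executescript(SCHEMA_SQL)


def table_names(conn: sqlite3.Connection) -> set[str]:
    """Return the names of all tables currently in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}


conn = sqlite3.connect(":memory:")
init_schema(conn)
init_schema(conn)  # safe to repeat: IF NOT EXISTS turns it into a no-op
tables = table_names(conn)
```

Running the script twice leaves the same two tables in place, which is why the same call can serve both fresh and existing databases.
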
### 2. Migration Files
**Purpose**: Transform existing databases from one version to another

**Location**: `/migrations/*.sql`

**Format**: `{number}_{description}.sql`
- Number: Three-digit zero-padded sequence (001, 002, etc.)
- Description: Clear indication of changes

**Responsibilities**:
- Add new tables/columns to existing databases
- Modify existing structures safely
- Create indexes and constraints
- Handle breaking changes with data preservation

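The naming format above lends itself to a simple discovery step. The following is a sketch under assumptions: `discover_migrations`, the regular expression, and the example filenames are hypothetical and are not taken from `migrations.py`.

```python
import re
import tempfile
from pathlib import Path

# Matches the documented format: three-digit sequence, underscore, description.
MIGRATION_NAME = re.compile(r"^(\d{3})_(.+)\.sql$")


def discover_migrations(directory: Path) -> list[tuple[int, str]]:
    """Return (sequence number, filename) pairs sorted by sequence number."""
    found = []
    for path in directory.glob("*.sql"):
        match = MIGRATION_NAME.match(path.name)
        if match:  # files that do not follow the convention are ignored
            found.append((int(match.group(1)), path.name))
    return sorted(found)


# Demonstration with placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("002_example_b.sql", "001_example_a.sql", "notes.sql"):
        (Path(tmp) / name).write_text("-- placeholder\n")
    migrations = discover_migrations(Path(tmp))
```

Because the sequence number is zero-padded, sorting by the parsed integer and sorting by filename give the same order.
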
### 3. Migration Runner (migrations.py)
**Purpose**: Apply only the migrations a database actually needs, based on its detected state

**Location**: `/starpunk/migrations.py`

**Key Features**:
- Fresh database detection
- Partial schema recognition
- Smart migration skipping
- Index-only application
- Transaction safety

## Architecture Patterns

### Fresh Database Flow
```
1. init_db() called
2. SCHEMA_SQL executed (creates all current tables/columns)
3. run_migrations() called
4. Detects fresh database (empty schema_migrations)
5. Checks if schema is current (is_schema_current())
6. If current: marks all migrations as applied (no execution)
7. If partial: applies only needed migrations
```

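The decision in steps 4 to 7 can be sketched as a pure planning function. The names `Plan` and `plan_fresh_database` and the example filenames are hypothetical; the real `run_migrations()` may structure this differently.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    mark_only: list[str]   # record as applied, do not execute
    candidates: list[str]  # check each one individually before running


def plan_fresh_database(
    migration_count: int, schema_is_current: bool, available: list[str]
) -> Plan:
    """Decide what to do when schema_migrations has no rows yet."""
    if migration_count != 0:
        raise ValueError("not a fresh database: migrations are already recorded")
    ordered = sorted(available)
    if schema_is_current:
        # SCHEMA_SQL already produced everything the migrations would add.
        return Plan(mark_only=ordered, candidates=[])
    # Partial schema: every migration has to be examined on its own.
    return Plan(mark_only=[], candidates=ordered)


files = ["002_example_b.sql", "001_example_a.sql"]
current_plan = plan_fresh_database(0, True, files)
partial_plan = plan_fresh_database(0, False, files)
```

Keeping the decision separate from the database access makes each branch easy to test without creating a database.
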
### Existing Database Flow
```
1. init_db() called
2. SCHEMA_SQL executed (CREATE TABLE IF NOT EXISTS - no-op for existing tables)
3. run_migrations() called
4. Reads schema_migrations table
5. Discovers migration files
6. Applies only unapplied migrations in sequence
```

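Steps 4 to 6 amount to a set difference followed by a sort. The sketch below assumes a `migration_name` column in `schema_migrations`; only the table name and its `id` column appear in this document, so that column, the helper name `pending_migrations`, and the filenames are assumptions.

```python
import sqlite3


def pending_migrations(conn: sqlite3.Connection, available: list[str]) -> list[str]:
    """Return migration filenames not yet recorded, in sequence order."""
    applied = {
        row[0] for row in conn.execute("SELECT migration_name FROM schema_migrations")
    }
    # Zero-padded prefixes mean a plain string sort is also a numeric sort.
    return [name for name in sorted(available) if name not in applied]


conn = sqlite3.connect(":memory:")
# Hypothetical table layout, used only for this demonstration.
conn.execute(
    "CREATE TABLE schema_migrations ("
    "id INTEGER PRIMARY KEY, migration_name TEXT UNIQUE NOT NULL)"
)
conn.execute(
    "INSERT INTO schema_migrations (migration_name) VALUES ('001_example_a.sql')"
)
pending = pending_migrations(
    conn, ["003_example_c.sql", "001_example_a.sql", "002_example_b.sql"]
)
```
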
### Hybrid Database Flow (Production Issue Case)
```
1. Database has tables from SCHEMA_SQL but no migration records
2. run_migrations() detects migration_count == 0
3. For each migration, calls is_migration_needed()
4. Migration 002: detects tables exist, indexes missing
5. Creates only missing indexes
6. Marks migration as applied without full execution
```

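Step 5 of the hybrid flow, creating only the missing indexes, could look roughly like this. The mapping `MIGRATION_002_INDEXES` and both helper names are hypothetical; `idx_tokens_hash` is the only index this document names, and its column list here is an assumption.

```python
import sqlite3

# Hypothetical mapping of index name -> statement for migration 002.
MIGRATION_002_INDEXES = {
    "idx_tokens_hash": "CREATE INDEX idx_tokens_hash ON tokens (token_hash)",
}


def index_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Check sqlite_master for an index with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def create_missing_indexes(
    conn: sqlite3.Connection, indexes: dict[str, str]
) -> list[str]:
    """Create each index that is absent and return the names that were created."""
    created = []
    for name, statement in indexes.items():
        if not index_exists(conn, name):
            conn.execute(statement)
            created.append(name)
    return created


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT)")
first_run = create_missing_indexes(conn, MIGRATION_002_INDEXES)
second_run = create_missing_indexes(conn, MIGRATION_002_INDEXES)
```

The second call creates nothing, so the repair step can run on every startup without raising "index already exists".
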
## State Detection Logic

### is_schema_current() Function
Determines whether the database fully matches the current schema version.

**Checks**:
1. Table existence (authorization_codes)
2. Column existence (token_hash in tokens)
3. Index existence (idx_tokens_hash, etc.)

**Returns**:
- True: Schema is completely current (all migrations applied)
- False: Schema needs migrations

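The three checks map directly onto SQLite's catalog. This is a sketch rather than the actual function: it checks only the one index named above, and the helpers `object_exists` and `column_exists` are hypothetical.

```python
import sqlite3


def object_exists(conn: sqlite3.Connection, kind: str, name: str) -> bool:
    """Look up a table or index by name in sqlite_master."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = ? AND name = ?", (kind, name)
    ).fetchone()
    return row is not None


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """PRAGMA table_info yields one row per column; field 1 is the column name."""
    # The table name is interpolated, so it must come from trusted code only.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def is_schema_current(conn: sqlite3.Connection) -> bool:
    """True only when the table, the column, and the index are all present."""
    return (
        object_exists(conn, "table", "authorization_codes")
        and column_exists(conn, "tokens", "token_hash")
        and object_exists(conn, "index", "idx_tokens_hash")
    )


conn = sqlite3.connect(":memory:")
empty_db = is_schema_current(conn)
conn.executescript(
    "CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT);"
    "CREATE TABLE authorization_codes (id INTEGER PRIMARY KEY);"
)
tables_only = is_schema_current(conn)
conn.execute("CREATE INDEX idx_tokens_hash ON tokens (token_hash)")
complete = is_schema_current(conn)
```
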
### is_migration_needed() Function
Determines whether a specific migration should be applied.

**For Migration 002**:
1. Check if authorization_codes table exists
2. Check if token_hash column exists in tokens
3. Check if indexes exist
4. Return True only if tables/columns are missing
5. Return False if only indexes are missing (handled separately)

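The return rules in steps 4 and 5 can be expressed as a small decision over observed state. The signature here is an assumption: the real `is_migration_needed()` most likely inspects a database connection, whereas this sketch takes a hypothetical `Migration002State` value so the logic can be shown in isolation. `needs_index_repair` is likewise an illustrative name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Migration002State:
    """Observed facts about the database (hypothetical helper type)."""

    has_authorization_codes: bool
    has_token_hash: bool
    missing_indexes: tuple[str, ...] = ()


def is_migration_needed(state: Migration002State) -> bool:
    """True only when a table or column is missing; index gaps do not count."""
    return not (state.has_authorization_codes and state.has_token_hash)


def needs_index_repair(state: Migration002State) -> bool:
    """True when the structure is complete but at least one index is absent."""
    return not is_migration_needed(state) and bool(state.missing_indexes)


hybrid = Migration002State(True, True, ("idx_tokens_hash",))
old_database = Migration002State(False, False, ("idx_tokens_hash",))
fully_current = Migration002State(True, True)
```

The hybrid case is the one that caused trouble: the full migration must be skipped, yet the index still has to be created.
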
## Design Decisions

### Why Dual Strategy?
1. **Fresh Install Speed**: SCHEMA_SQL provides instant, complete schema
2. **Upgrade Safety**: Migrations provide controlled, versioned changes
3. **Flexibility**: Can handle various database states gracefully

### Why Smart Detection?
1. **Idempotency**: Same code works for any database state
2. **Self-Healing**: Can fix partial schemas automatically
3. **No Data Loss**: Never drops tables unnecessarily

### Why Check Indexes Separately?
1. **SCHEMA_SQL Evolution**: As SCHEMA_SQL absorbs changes from migrations, checking indexes on their own avoids conflicts
2. **Granular Control**: Can apply just missing pieces
3. **Performance**: Adding an index does not rewrite the table, so it can be applied on its own

## Migration Guidelines

### Writing Migrations
1. **Never use IF NOT EXISTS in migrations**: Migrations should fail if preconditions aren't met
2. **Always provide a rollback path**: Document how to reverse changes
3. **One logical change per migration**: Keep migrations focused
4. **Test with various database states**: Fresh, existing, and hybrid

### SCHEMA_SQL Updates
When updating SCHEMA_SQL after a migration:
1. Include all changes from the migration
2. Remove indexes that migrations will create (avoid conflicts)
3. Keep `CREATE TABLE IF NOT EXISTS` for idempotency
4. Test fresh installations

## Error Recovery

### Common Issues

#### "Table already exists" Error
**Cause**: A migration tries to create a table that SCHEMA_SQL already created

**Solution**: Smart detection should prevent this. If it fails:
1. Check if migration is already in schema_migrations
2. Verify is_migration_needed() logic
3. Manually mark migration as applied if needed

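Steps 1 and 3 of this recovery can be combined into one guarded insert. This is a sketch only: the `migration_name` column, the helper name `mark_applied`, and the filename are assumptions, so check the real `schema_migrations` layout before adapting it.

```python
import sqlite3


def mark_applied(conn: sqlite3.Connection, name: str) -> bool:
    """Record a migration as applied; return False if it was already recorded."""
    already = conn.execute(
        "SELECT 1 FROM schema_migrations WHERE migration_name = ?", (name,)
    ).fetchone()
    if already:
        return False
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)", (name,)
        )
    return True


conn = sqlite3.connect(":memory:")
# Hypothetical table layout, used only for this demonstration.
conn.execute(
    "CREATE TABLE schema_migrations ("
    "id INTEGER PRIMARY KEY, migration_name TEXT UNIQUE NOT NULL)"
)
first = mark_applied(conn, "002_example.sql")
second = mark_applied(conn, "002_example.sql")
recorded = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()[0]
```

Only mark a migration by hand after confirming that every table, column, and index it would have created is already present.
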
#### Missing Indexes
**Cause**: Tables exist from SCHEMA_SQL but indexes weren't created

**Solution**: Migration system creates missing indexes separately

#### Partial Migration Application
**Cause**: Migration failed partway through

**Solution**: Each migration runs in a transaction, so it is applied completely or not at all. After a failure the changes are rolled back; fix the cause and retry.

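The all-or-nothing behavior can be illustrated with Python's `sqlite3` module. This sketch assumes the connection is opened with `isolation_level=None` so that transactions are controlled explicitly; the function name `apply_migration`, the `migration_name` column, and the example SQL are hypothetical and not taken from `migrations.py`.

```python
import sqlite3


def apply_migration(conn: sqlite3.Connection, name: str, sql: str) -> None:
    """Run one migration and record it, all inside a single transaction."""
    try:
        # BEGIN is part of the script because executescript() commits any
        # transaction that is already open before it runs.
        conn.executescript(f"BEGIN;\n{sql}\n")
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)", (name,)
        )
        conn.execute("COMMIT")
    except sqlite3.Error:
        if conn.in_transaction:
            conn.execute("ROLLBACK")  # undo every statement of this migration
        raise


# isolation_level=None disables the module's implicit transaction handling.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE schema_migrations ("
    "id INTEGER PRIMARY KEY, migration_name TEXT UNIQUE NOT NULL)"
)

apply_migration(conn, "001_example.sql", "CREATE TABLE ok_table (id INTEGER);")

try:
    # The second statement fails, so the first one must be rolled back too.
    apply_migration(
        conn,
        "002_example.sql",
        "CREATE TABLE demo (id INTEGER); CREATE TABLE demo (id INTEGER);",
    )
    failed = False
except sqlite3.Error:
    failed = True

leftover = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'demo'"
).fetchone()[0]
recorded = [r[0] for r in conn.execute("SELECT migration_name FROM schema_migrations")]
```

SQLite treats schema changes as transactional, which is why the half-created `demo` table disappears on rollback and the failed migration is never recorded.
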
## State Verification Queries

### Check Migration Status
```sql
SELECT * FROM schema_migrations ORDER BY id;
```

### Check Table Existence
```sql
SELECT name FROM sqlite_master
WHERE type='table'
ORDER BY name;
```

### Check Index Existence
```sql
SELECT name FROM sqlite_master
WHERE type='index'
ORDER BY name;
```

### Check Column Structure
```sql
PRAGMA table_info(tokens);
PRAGMA table_info(authorization_codes);
```

## Future Improvements

### Potential Enhancements
1. **Migration Rollback**: Add down() migrations for reversibility
2. **Schema Versioning**: Add version table for faster state detection
3. **Migration Validation**: Pre-flight checks before application
4. **Dry Run Mode**: Test migrations without applying

### Considered Alternatives
1. **Migrations-Only**: Rejected - slow fresh installs
2. **SCHEMA_SQL-Only**: Rejected - no upgrade path
3. **ORM-Based**: Rejected - unnecessary complexity for single-user system
4. **External Tools**: Rejected - additional dependencies

## Security Considerations

### Migration Safety
1. All migrations run in transactions
2. Rollback on any error
3. No data destruction without explicit user action
4. Token invalidation documented when necessary

### Schema Security
1. Tokens stored as SHA-256 hashes
2. Proper indexes for timing attack prevention
3. Expiration columns for automatic cleanup
4. Soft deletion support
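
Storing a token as a SHA-256 hash might look like the following. This is a minimal sketch using only the standard library; `hash_token` and `verify_token` are hypothetical names, and how StarPunk actually generates and compares tokens is not described in this document.

```python
import hashlib
import hmac
import secrets


def hash_token(token: str) -> str:
    """Return the hex SHA-256 digest that would be stored in token_hash."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def verify_token(presented: str, stored_hash: str) -> bool:
    """Compare digests in constant time to avoid leaking timing information."""
    return hmac.compare_digest(hash_token(presented), stored_hash)


token = secrets.token_urlsafe(32)  # shown to the client once, never stored
stored = hash_token(token)         # only this value reaches the database
```

With this layout a leaked database exposes digests only, not usable tokens.
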