Phil Skentelbery 2b2849a58d docs: Add database migration architecture and conflict resolution documentation

Documents the diagnosis and resolution of database migration detection conflicts

2025-11-24 13:27:19 -07:00

6.4 KiB

Database Migration Architecture

Overview

StarPunk uses a dual-strategy database initialization system that combines immediate schema creation (SCHEMA_SQL) with evolutionary migrations. This architecture provides both fast fresh installations and safe upgrades for existing databases.

Components

1. SCHEMA_SQL (database.py)

Purpose: Define the current complete database schema for fresh installations

Location: /starpunk/database.py lines 11-87

Responsibilities:

Create all tables with current structure
Create all columns with current types
Create base indexes for performance
Provide instant database initialization for new installations

Design Principle: Always represents the latest schema version
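As a minimal sketch of this idea, the block below uses a heavily trimmed, hypothetical stand-in for SCHEMA_SQL. The column names other than token_hash and the init_schema helper are assumptions made for illustration; the real definition lives in database.py.

```python
import sqlite3

# Hypothetical, trimmed stand-in for SCHEMA_SQL. Only the table names and
# token_hash come from this document; everything else is assumed.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    token_hash TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS authorization_codes (
    id INTEGER PRIMARY KEY,
    code_hash TEXT NOT NULL
);
"""


def init_schema(conn: sqlite3.Connection) -> None:
    """Create every table in its current form; existing tables are left alone."""
    conn.executescript(SCHEMA_SQL)


conn = sqlite3.connect(":memory:")
init_schema(conn)
init_schema(conn)  # a second run is harmless because of IF NOT EXISTS
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    )
]
```

Because every statement is guarded by IF NOT EXISTS, the same script can run against a fresh or an existing database without raising errors.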

2. Migration Files

Purpose: Transform existing databases from one version to another

Location: /migrations/*.sql

Format: {number}_{description}.sql

Number: Three-digit zero-padded sequence (001, 002, etc.)
Description: Clear indication of changes

Responsibilities:

Add new tables/columns to existing databases
Modify existing structures safely
Create indexes and constraints
Handle breaking changes with data preservation
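The naming format above lends itself to simple discovery and ordering. The following is a sketch only: discover_migrations and the example file names are hypothetical, not taken from migrations.py.

```python
import re
import tempfile
from pathlib import Path

# Three-digit number, underscore, description, .sql (the format described above).
MIGRATION_NAME = re.compile(r"^(\d{3})_(.+)\.sql$")


def discover_migrations(directory: Path) -> list[tuple[int, str]]:
    """Return (number, filename) pairs for valid migration files, lowest first."""
    found = []
    for path in directory.glob("*.sql"):
        match = MIGRATION_NAME.match(path.name)
        if match:  # files that do not follow the convention are skipped
            found.append((int(match.group(1)), path.name))
    return sorted(found)


# Usage with throwaway files; these file names are made up for the example.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("002_add_indexes.sql", "001_initial.sql", "10_bad_number.sql"):
        (Path(tmp) / name).write_text("-- placeholder\n")
    migrations = discover_migrations(Path(tmp))
```

Sorting on the parsed integer rather than the raw filename keeps the order correct even if a file is named inconsistently elsewhere in the string.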

3. Migration Runner (migrations.py)

Purpose: Intelligent application of migrations based on database state

Location: /starpunk/migrations.py

Key Features:

Fresh database detection
Partial schema recognition
Smart migration skipping
Index-only application
Transaction safety

Architecture Patterns

Fresh Database Flow

1. init_db() called
2. SCHEMA_SQL executed (creates all current tables/columns)
3. run_migrations() called
4. Detects fresh database (empty schema_migrations)
5. Checks if schema is current (is_schema_current())
6. If current: marks all migrations as applied (no execution)
7. If partial: applies only needed migrations
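Steps 4 to 6 of this flow can be sketched as follows. This is an illustration under assumptions: the schema_migrations column layout, the handle_fresh_database name, and the migration file names are all hypothetical, and the result of is_schema_current() is passed in as a plain boolean to keep the example self-contained.

```python
import sqlite3


def handle_fresh_database(conn, migration_names, schema_is_current):
    """Record every migration as applied if the database is fresh and current.

    Returns True when the migrations were recorded without being executed.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " migration_name TEXT UNIQUE NOT NULL)"  # assumed column layout
    )
    applied = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()[0]
    if applied > 0 or not schema_is_current:
        return False  # not fresh, or only partially current: caller applies what is needed
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)",
            [(name,) for name in migration_names],
        )
    return True


conn = sqlite3.connect(":memory:")
names = ["001_initial.sql", "002_add_indexes.sql"]  # made-up file names
stamped = handle_fresh_database(conn, names, schema_is_current=True)
stamped_again = handle_fresh_database(conn, names, schema_is_current=True)
recorded = [
    row[0]
    for row in conn.execute("SELECT migration_name FROM schema_migrations ORDER BY id")
]
```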

Existing Database Flow

1. init_db() called
2. SCHEMA_SQL executed (CREATE TABLE IF NOT EXISTS, so a no-op for existing tables)
3. run_migrations() called
4. Reads schema_migrations table
5. Discovers migration files
6. Applies only unapplied migrations in sequence
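Steps 4 to 6 of this flow could look roughly like the sketch below. The apply_pending name, the migration_name column, and the example tables are assumptions for illustration, and transaction handling is omitted for brevity.

```python
import sqlite3


def apply_pending(conn, migrations):
    """Apply (name, sql) pairs not yet recorded; expects them sorted by number."""
    applied = {
        row[0] for row in conn.execute("SELECT migration_name FROM schema_migrations")
    }
    ran = []
    for name, sql in migrations:
        if name in applied:
            continue  # already recorded in schema_migrations
        conn.executescript(sql)
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)", (name,)
        )
        conn.commit()
        ran.append(name)
    return ran


# An existing database that already has migration 001 recorded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schema_migrations (id INTEGER PRIMARY KEY, migration_name TEXT UNIQUE NOT NULL);
CREATE TABLE notes (id INTEGER PRIMARY KEY);
INSERT INTO schema_migrations (migration_name) VALUES ('001_initial.sql');
""")
migrations = [
    ("001_initial.sql", "CREATE TABLE notes (id INTEGER PRIMARY KEY);"),
    ("002_add_tags.sql", "CREATE TABLE tags (id INTEGER PRIMARY KEY);"),
]
first_run = apply_pending(conn, migrations)
second_run = apply_pending(conn, migrations)
```

Running the function twice shows the intended behavior: the second call finds nothing left to apply.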

Hybrid Database Flow (Production Issue Case)

1. Database has tables from SCHEMA_SQL but no migration records
2. run_migrations() detects migration_count == 0
3. For each migration, calls is_migration_needed()
4. Migration 002: detects tables exist, indexes missing
5. Creates only missing indexes
6. Marks migration as applied without full execution
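The index-only step (4 and 5 above) might be sketched like this. The index definitions are hypothetical: only idx_tokens_hash is named in this document, and the second index and the code_hash column are invented for the example.

```python
import sqlite3

# Hypothetical index statements standing in for those in migration 002.
MIGRATION_002_INDEXES = {
    "idx_tokens_hash": "CREATE INDEX idx_tokens_hash ON tokens (token_hash)",
    "idx_auth_codes_hash": "CREATE INDEX idx_auth_codes_hash ON authorization_codes (code_hash)",
}


def create_missing_indexes(conn, index_sql):
    """Create only the indexes that sqlite_master does not list yet."""
    present = {
        row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='index'")
    }
    created = []
    for name, sql in index_sql.items():
        if name not in present:
            conn.execute(sql)
            created.append(name)
    conn.commit()
    return created


# Hybrid state: tables exist, one index is already there, one is missing.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT);
CREATE TABLE authorization_codes (id INTEGER PRIMARY KEY, code_hash TEXT);
CREATE INDEX idx_tokens_hash ON tokens (token_hash);
""")
created = create_missing_indexes(conn, MIGRATION_002_INDEXES)
```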

State Detection Logic

is_schema_current() Function

Determines if database matches current schema version completely.

Checks:

Table existence (authorization_codes)
Column existence (token_hash in tokens)
Index existence (idx_tokens_hash, etc.)

Returns:

True: Schema is completely current (all migrations applied)
False: Schema needs migrations
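A minimal sketch of these three checks, assuming SQLite's sqlite_master catalog and PRAGMA table_info are used for detection. The helper names are hypothetical, and the real function may check more indexes than the one shown.

```python
import sqlite3


def _exists(conn, kind, name):
    """True if sqlite_master lists an object of this type and name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = ? AND name = ?", (kind, name)
    ).fetchone()
    return row is not None


def _has_column(conn, table, column):
    # PRAGMA arguments cannot be bound, so only pass trusted, hard-coded names.
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def is_schema_current(conn):
    return (
        _exists(conn, "table", "authorization_codes")
        and _has_column(conn, "tokens", "token_hash")
        and _exists(conn, "index", "idx_tokens_hash")
    )


conn = sqlite3.connect(":memory:")
before = is_schema_current(conn)  # empty database
conn.executescript("""
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT);
CREATE TABLE authorization_codes (id INTEGER PRIMARY KEY);
CREATE INDEX idx_tokens_hash ON tokens (token_hash);
""")
after = is_schema_current(conn)  # all three checks now pass
```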

is_migration_needed() Function

Determines if a specific migration should be applied.

For Migration 002:

Check if authorization_codes table exists
Check if token_hash column exists in tokens
Check if indexes exist
Return True only if tables/columns are missing
Return False if only indexes are missing (handled separately)
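The rules for Migration 002 can be sketched as below. The signature (a connection plus a migration number) is an assumption for illustration; note that index state deliberately plays no part in the result.

```python
import sqlite3


def is_migration_needed(conn, number):
    """Decide whether a migration must run in full (sketch for migration 2 only)."""
    if number != 2:
        return True  # no special detection in this sketch: treat as needed
    has_table = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type='table' AND name='authorization_codes'"
    ).fetchone() is not None
    token_columns = {row[1] for row in conn.execute("PRAGMA table_info(tokens)")}
    # Missing indexes are ignored here; they are created in a separate step.
    return not (has_table and "token_hash" in token_columns)


# Hybrid state: tables and columns exist, indexes do not.
hybrid = sqlite3.connect(":memory:")
hybrid.executescript("""
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT);
CREATE TABLE authorization_codes (id INTEGER PRIMARY KEY);
""")
# Older state: tokens lacks token_hash and authorization_codes is absent.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT)")

needed_on_hybrid = is_migration_needed(hybrid, 2)
needed_on_old = is_migration_needed(old, 2)
```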

Design Decisions

Why Dual Strategy?

Fresh Install Speed: SCHEMA_SQL provides instant, complete schema
Upgrade Safety: Migrations provide controlled, versioned changes
Flexibility: Can handle various database states gracefully

Why Smart Detection?

Idempotency: Same code works for any database state
Self-Healing: Can fix partial schemas automatically
No Data Loss: Never drops tables unnecessarily

Why Check Indexes Separately?

SCHEMA_SQL Evolution: As SCHEMA_SQL absorbs the changes introduced by migrations, checking indexes separately avoids conflicts between the two
Granular Control: Can apply just missing pieces
Performance: Indexes can be added without recreating tables or modifying their data

Migration Guidelines

Writing Migrations

Never use IF NOT EXISTS in migrations: Migrations should fail if preconditions aren't met
Always provide rollback path: Document how to reverse changes
One logical change per migration: Keep migrations focused
Test with various database states: Fresh, existing, and hybrid

SCHEMA_SQL Updates

When updating SCHEMA_SQL after a migration:

Include all changes from the migration
Remove indexes that migrations will create (avoid conflicts)
Keep CREATE TABLE IF NOT EXISTS for idempotency
Test fresh installations

Error Recovery

Common Issues

"Table already exists" Error

Cause: Migration tries to create table that SCHEMA_SQL already created

Solution: Smart detection should prevent this. If it fails:

Check if migration is already in schema_migrations
Verify is_migration_needed() logic
Manually mark migration as applied if needed

Missing Indexes

Cause: Tables exist from SCHEMA_SQL but indexes weren't created

Solution: Migration system creates missing indexes separately

Partial Migration Application

Cause: Migration failed partway through

Solution: Transactions make each migration all-or-nothing. Roll back, fix the cause, and retry.
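The all-or-nothing behavior can be illustrated with Python's sqlite3 module. This is a sketch, not the runner's actual code: apply_atomically is a hypothetical helper, and it assumes the connection is opened with isolation_level=None so that transactions are controlled explicitly.

```python
import sqlite3


def apply_atomically(conn, statements):
    """Run all statements in one transaction; undo everything if any fails."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None disables implicit transactions so BEGIN/COMMIT are ours.
conn = sqlite3.connect(":memory:", isolation_level=None)
try:
    apply_atomically(
        conn,
        [
            "CREATE TABLE example (id INTEGER PRIMARY KEY)",
            "CREATE TABLE example (id INTEGER PRIMARY KEY)",  # fails: table already exists
        ],
    )
    failed = False
except sqlite3.OperationalError:
    failed = True

leftover = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name='example'"
).fetchone()[0]
```

SQLite treats schema changes as transactional, so the first CREATE TABLE is undone when the second statement fails.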

State Verification Queries

Check Migration Status

SELECT * FROM schema_migrations ORDER BY id;

Check Table Existence

SELECT name FROM sqlite_master
WHERE type='table'
ORDER BY name;

Check Index Existence

SELECT name FROM sqlite_master
WHERE type='index'
ORDER BY name;

Check Column Structure

PRAGMA table_info(tokens);
PRAGMA table_info(authorization_codes);
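For repeated checks, the table, index, and column queries above can be wrapped in a small helper. The sketch below is illustrative only; describe_database is a hypothetical name and not part of the codebase.

```python
import sqlite3


def describe_database(conn, tables=("tokens", "authorization_codes")):
    """Collect table names, index names, and per-table column names."""

    def names(kind):
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = ? ORDER BY name", (kind,)
        )
        return [row[0] for row in rows]

    return {
        "tables": names("table"),
        "indexes": names("index"),
        # PRAGMA arguments cannot be bound; pass only trusted table names.
        "columns": {
            table: [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
            for table in tables
        },
    }


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tokens (id INTEGER PRIMARY KEY, token_hash TEXT);
CREATE INDEX idx_tokens_hash ON tokens (token_hash);
""")
snapshot = describe_database(conn)
```

A table that does not exist simply yields an empty column list, which makes missing tables easy to spot in the result.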
