# Migration Failure Diagnosis - v1.0.0-rc.1

## Executive Summary

The v1.0.0-rc.1 container fails at startup because of an **execution-order bug in the database initialization and migration system**. The failure is deterministic rather than a timing-dependent race. The error `sqlite3.OperationalError: no such column: token_hash` occurs when `SCHEMA_SQL` tries to create an index on `token_hash` in an existing `tokens` table that does not have that column yet. Migration 002, which drops and recreates the table with the new structure, has not run at that point.

## Root Cause Analysis

### The Execution Order Problem

1. **Database Initialization** (`init_db()` in `database.py:94-127`)
   - Line 115: `conn.executescript(SCHEMA_SQL)` - Creates initial schema
   - Line 126: `run_migrations()` - Applies pending migrations

2. **SCHEMA_SQL Definition** (`database.py:46-60`)
   - Creates `tokens` table WITH `token_hash` column (lines 46-56)
   - Creates indexes including `idx_tokens_hash` (line 58)

3. **Migration 002** (`002_secure_tokens_and_authorization_codes.sql`)
   - Line 17: `DROP TABLE IF EXISTS tokens;`
   - Lines 20-30: Creates NEW `tokens` table with the same structure as `SCHEMA_SQL`
   - Lines 49-51: Creates indexes again

### The Critical Issue

For an **existing production database** (v0.9.5):

1. Database already has an OLD `tokens` table (without `token_hash` column)
2. `init_db()` runs `SCHEMA_SQL` which includes:
   ```sql
   CREATE TABLE IF NOT EXISTS tokens (
       ...
       token_hash TEXT UNIQUE NOT NULL,
       ...
   );
   CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
   ```
3. The `CREATE TABLE IF NOT EXISTS` is a no-op (table exists)
4. The `CREATE INDEX` tries to create an index on `token_hash` column
5. **ERROR**: Column `token_hash` doesn't exist in the old table structure
6. Container crashes before migrations can run

### Why This Wasn't Caught Earlier

- **Fresh databases** work fine - SCHEMA_SQL creates the correct structure
- **Test environments** likely started fresh or had the new schema
- **Production** has an existing v0.9.5 database with the old `tokens` table structure

## The Schema Evolution Mismatch

### Original tokens table (v0.9.5)

The old structure likely had columns like:

- `token` (plain text - security issue)
- `me`
- `client_id`
- `scope`
- etc.

### New tokens table (v1.0.0-rc.1)

- `token_hash` (SHA256 hash - secure)
- Same other columns

### The Problem

SCHEMA_SQL was updated to match the POST-migration structure, but it runs BEFORE migrations. On an existing database, the index statements therefore reference a column that only exists after migration 002 has run, and migration 002 can never run because startup has already failed.

## Migration System Design Flaw

The current system has a fundamental ordering issue:

1. **SCHEMA_SQL** should represent the INITIAL schema (v0.1.0)
2. **Migrations** should evolve from that base
3. **Current Reality**: SCHEMA_SQL represents the LATEST schema

This works for fresh databases but fails for existing ones that need migration.

## Recommended Fix

### Option 1: Conditional Index Creation (Quick Fix)

Modify SCHEMA_SQL to create the affected indexes conditionally, or remove them from SCHEMA_SQL altogether, since migration 002 creates them anyway.

### Option 2: Fix Execution Order (Better)

1. Run migrations BEFORE attempting schema creation
2. Only use SCHEMA_SQL for truly fresh databases

### Option 3: Proper Schema Versioning (Best)

1. SCHEMA_SQL should be the v0.1.0 schema
2. All evolution happens through migrations
3. Fresh databases run all migrations from the beginning

## Immediate Workaround

For the production deployment:

1. **Manual intervention before upgrade**:
   ```sql
   -- Connect to production database
   -- Manually add the column before v1.0.0-rc.1 starts
   ALTER TABLE tokens ADD COLUMN token_hash TEXT;
   ```

2. **Then deploy v1.0.0-rc.1**:
   - SCHEMA_SQL will succeed (column exists)
   - Migration 002 will drop and recreate the table properly
   - System will work correctly

## Verification Steps

1. Check production database structure:
   ```sql
   PRAGMA table_info(tokens);
   ```

2. Verify migration status:
   ```sql
   SELECT * FROM schema_migrations;
   ```

3. Test with a v0.9.5 database locally to reproduce

## Long-term Architecture Recommendations

1. **Separate Initial Schema from Current Schema**
   - `INITIAL_SCHEMA_SQL` - The v0.1.0 starting point
   - Migrations handle ALL evolution

2. **Migration-First Initialization**
   - Check for existing database
   - Run migrations first if database exists
   - Only apply SCHEMA_SQL to truly empty databases

3. **Schema Version Tracking**
   - Add a `schema_version` table
   - Track the current schema version explicitly
   - Make decisions based on version, not heuristics

4. **Testing Strategy**
   - Always test upgrades from previous production version
   - Include migration testing in CI/CD pipeline
   - Maintain database snapshots for each released version

## Conclusion

This is a **critical architectural issue** in the migration system that affects all existing production deployments. The immediate fix is straightforward, but the system needs architectural changes to prevent similar issues in future releases.

The core principle violated: **SCHEMA_SQL should represent the beginning, not the end state**.
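
To make the failure sequence from "The Critical Issue" and the reordering from "Option 2" concrete, here is a minimal sketch using an in-memory SQLite database. Everything in it is an illustrative assumption: the column lists are abbreviated guesses at the v0.9.5 and v1.0.0-rc.1 layouts, `SCHEMA_SQL` and `MIGRATION_002_SQL` are shortened stand-ins for the real definitions, and `schema_first`, `migration_first`, and `has_user_tables` are hypothetical helpers, not functions from `database.py`.

```python
import sqlite3

# Assumed stand-in for the v0.9.5 table; the real column list is not known here.
OLD_TOKENS_SQL = """
CREATE TABLE tokens (token TEXT, me TEXT, client_id TEXT, scope TEXT);
"""

# Abbreviated stand-in for SCHEMA_SQL (only the parts relevant to the failure).
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    token_hash TEXT UNIQUE NOT NULL, me TEXT, client_id TEXT, scope TEXT
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""

# Abbreviated stand-in for migration 002: drop, recreate, re-index.
MIGRATION_002_SQL = """
DROP TABLE IF EXISTS tokens;
CREATE TABLE tokens (
    token_hash TEXT UNIQUE NOT NULL, me TEXT, client_id TEXT, scope TEXT
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""


def make_old_database() -> sqlite3.Connection:
    """Return an in-memory database that mimics an existing v0.9.5 install."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(OLD_TOKENS_SQL)
    return conn


def schema_first(conn: sqlite3.Connection) -> None:
    """Current order: SCHEMA_SQL first, migrations second."""
    conn.executescript(SCHEMA_SQL)         # fails on an old database
    conn.executescript(MIGRATION_002_SQL)  # never reached in that case


def has_user_tables(conn: sqlite3.Connection) -> bool:
    """True if the database already contains any non-internal table."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchone()
    return row[0] > 0


def migration_first(conn: sqlite3.Connection) -> None:
    """Option 2 sketch: migrate existing databases, use SCHEMA_SQL only when empty."""
    if has_user_tables(conn):
        conn.executescript(MIGRATION_002_SQL)
    else:
        conn.executescript(SCHEMA_SQL)


def column_names(conn: sqlite3.Connection) -> list:
    """Column names of the tokens table, via PRAGMA table_info."""
    return [row[1] for row in conn.execute("PRAGMA table_info(tokens)")]


if __name__ == "__main__":
    try:
        schema_first(make_old_database())
    except sqlite3.OperationalError as exc:
        print("schema-first failed:", exc)

    conn = make_old_database()
    migration_first(conn)
    print("migration-first columns:", column_names(conn))
```

In the real system the existing-database branch would call `run_migrations()` so that every pending migration is applied, not only migration 002, and "empty database" would more likely be decided from the migration tracking table than from a table count.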
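
The quick fix in "Option 1" and the `PRAGMA table_info(tokens)` check from "Verification Steps" can be combined into a small guard. The following is a sketch under assumptions, not the project's code: `table_columns` and `create_token_hash_index_if_possible` are hypothetical helper names, and only the index name `idx_tokens_hash` and the column name `token_hash` come from this report.

```python
import sqlite3


def table_columns(conn: sqlite3.Connection, table: str) -> set:
    """Return the column names of `table` (empty set if the table is missing).

    PRAGMA arguments cannot be bound as parameters, so `table` must be a
    trusted, hard-coded identifier.
    """
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def create_token_hash_index_if_possible(conn: sqlite3.Connection) -> bool:
    """Create idx_tokens_hash only when tokens.token_hash exists.

    Returns False on the old layout, leaving the index to migration 002.
    """
    if "token_hash" not in table_columns(conn, "tokens"):
        return False
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash)"
    )
    return True


if __name__ == "__main__":
    old = sqlite3.connect(":memory:")
    old.execute("CREATE TABLE tokens (token TEXT, me TEXT)")  # assumed old layout
    print("old layout, index created:", create_token_hash_index_if_possible(old))

    new = sqlite3.connect(":memory:")
    new.execute("CREATE TABLE tokens (token_hash TEXT UNIQUE NOT NULL, me TEXT)")
    print("new layout, index created:", create_token_hash_index_if_possible(new))
```

A guard like this only avoids the crash. It does not remove the underlying mismatch between `SCHEMA_SQL` and the pre-migration table, which is why the report ranks it as the quick fix rather than the preferred one.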