Fixes critical issue where migration 002 indexes already existed in SCHEMA_SQL, causing 'index already exists' errors on databases created before v1.0.0-rc.1.

Changes:

- Removed duplicate index definitions from SCHEMA_SQL (database.py)
- Enhanced migration system to detect and handle indexes properly
- Added comprehensive documentation of the fix

Version bumped to 1.0.0-rc.2 with full changelog entry.

Refs: docs/reports/2025-11-24-migration-fix-v1.0.0-rc.2.md

# Migration Failure Diagnosis - v1.0.0-rc.1

## Executive Summary

The v1.0.0-rc.1 container fails at startup because of an **ordering problem in the database initialization and migration system**, not a timing-dependent race. The error `sqlite3.OperationalError: no such column: token_hash` occurs when `SCHEMA_SQL` tries to create an index on `tokens.token_hash` in an existing database whose `tokens` table predates migration 002 and therefore lacks that column. `SCHEMA_SQL` runs before migrations, so migration 002 never gets the chance to drop and recreate the table.

## Root Cause Analysis

### The Execution Order Problem

1. **Database Initialization** (`init_db()` in `database.py:94-127`)
   - Line 115: `conn.executescript(SCHEMA_SQL)` - Creates initial schema
   - Line 126: `run_migrations()` - Applies pending migrations

2. **SCHEMA_SQL Definition** (`database.py:46-60`)
   - Creates `tokens` table WITH `token_hash` column (lines 46-56)
   - Creates indexes including `idx_tokens_hash` (line 58)

3. **Migration 002** (`002_secure_tokens_and_authorization_codes.sql`)
   - Line 17: `DROP TABLE IF EXISTS tokens;`
   - Lines 20-30: Creates NEW `tokens` table with same structure
   - Lines 49-51: Creates indexes again

### The Critical Issue

For an **existing production database** (v0.9.5):

1. Database already has an OLD `tokens` table (without `token_hash` column)
2. `init_db()` runs `SCHEMA_SQL` which includes:
   ```sql
   CREATE TABLE IF NOT EXISTS tokens (
       ...
       token_hash TEXT UNIQUE NOT NULL,
       ...
   );
   CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
   ```
3. The `CREATE TABLE IF NOT EXISTS` is a no-op (table exists)
4. The `CREATE INDEX` tries to create an index on `token_hash` column
5. **ERROR**: Column `token_hash` doesn't exist in the old table structure
6. Container crashes before migrations can run

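The sequence above can be reproduced in isolation with an in-memory database. The sketch below is illustrative only: `OLD_TOKENS_SQL` is a hypothetical stand-in for the v0.9.5 table (its real column list is not shown in this report), and `SCHEMA_SQL` here is an abbreviated stand-in for the `tokens` portion of the real constant in `database.py`.

```python
import sqlite3

# Hypothetical stand-in for the v0.9.5 table: plain-text token, no token_hash.
OLD_TOKENS_SQL = """
CREATE TABLE tokens (
    id INTEGER PRIMARY KEY,
    token TEXT NOT NULL,
    me TEXT NOT NULL
);
"""

# Abbreviated stand-in for the tokens portion of SCHEMA_SQL (assumed columns).
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    token_hash TEXT UNIQUE NOT NULL,
    me TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""


def reproduce() -> str:
    """Return the error text raised when SCHEMA_SQL meets the old table."""
    conn = sqlite3.connect(":memory:")
    try:
        # Simulate an existing pre-migration database.
        conn.executescript(OLD_TOKENS_SQL)
        try:
            # CREATE TABLE IF NOT EXISTS is skipped; CREATE INDEX then fails.
            conn.executescript(SCHEMA_SQL)
        except sqlite3.OperationalError as exc:
            return str(exc)
        return "no error"
    finally:
        conn.close()


if __name__ == "__main__":
    print(reproduce())
```

Because the failure happens inside `executescript`, nothing after it in `init_db()` runs, which is why the migrations never get a chance to repair the table.
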
### Why This Wasn't Caught Earlier

- **Fresh databases** work fine - SCHEMA_SQL creates the correct structure
- **Test environments** likely started fresh or had the new schema
- **Production** has an existing v0.9.5 database with the old `tokens` table structure

## The Schema Evolution Mismatch

### Original tokens table (v0.9.5)
The old structure likely had columns like:
- `token` (plain text - security issue)
- `me`
- `client_id`
- `scope`
- etc.

### New tokens table (v1.0.0-rc.1)
- `token_hash` (SHA256 hash - secure)
- Same other columns

### The Problem
SCHEMA_SQL was updated to match the POST-migration structure, but it runs BEFORE migrations. This creates an impossible situation for existing databases.

## Migration System Design Flaw

The current system has a fundamental ordering issue:

1. **SCHEMA_SQL** should represent the INITIAL schema (v0.1.0)
2. **Migrations** should evolve from that base
3. **Current Reality**: SCHEMA_SQL represents the LATEST schema

This works for fresh databases but fails for existing ones that need migration.

## Recommended Fix

### Option 1: Conditional Index Creation (Quick Fix)
Modify SCHEMA_SQL to use conditional logic or remove problematic indexes from SCHEMA_SQL since migration 002 creates them anyway.

### Option 2: Fix Execution Order (Better)
1. Run migrations BEFORE attempting schema creation
2. Only use SCHEMA_SQL for truly fresh databases

### Option 3: Proper Schema Versioning (Best)
1. SCHEMA_SQL should be the v0.1.0 schema
2. All evolution happens through migrations
3. Fresh databases run all migrations from the beginning

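As a rough illustration of Option 2, the sketch below applies the schema script only when the database has no user tables, and otherwise leaves all structural changes to the migrations. It is not the project's actual `init_db()`: the `is_fresh_database` helper and the `run_migrations` callback signature are assumptions made for this example.

```python
import sqlite3
from typing import Callable


def is_fresh_database(conn: sqlite3.Connection) -> bool:
    """Return True when the database contains no user tables yet."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchone()
    return row[0] == 0


def init_db(
    conn: sqlite3.Connection,
    schema_sql: str,
    run_migrations: Callable[[sqlite3.Connection], None],
) -> None:
    """Apply schema_sql only to an empty database, then run migrations."""
    if is_fresh_database(conn):
        # Safe: nothing exists yet, so the latest schema cannot conflict.
        conn.executescript(schema_sql)
    # Existing databases skip the schema script and are evolved by
    # migrations alone (hypothetical callback supplied by the caller).
    run_migrations(conn)
```

One trade-off of this approach is that any table added to `SCHEMA_SQL` must also be introduced by a migration, otherwise existing databases never receive it.
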
## Immediate Workaround

For the production deployment:

1. **Manual intervention before upgrade**:
   ```sql
   -- Connect to production database
   -- Manually add the column before v1.0.0-rc.1 starts
   ALTER TABLE tokens ADD COLUMN token_hash TEXT;
   ```

2. **Then deploy v1.0.0-rc.1**:
   - SCHEMA_SQL will succeed (column exists)
   - Migration 002 will drop and recreate the table properly
   - System will work correctly

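If the workaround is scripted rather than typed into a SQLite shell, it helps to make it idempotent so that a second run does not fail with a duplicate-column error. The following is a minimal sketch under that assumption; the function names and the `db_path` argument are hypothetical, and the production database should be backed up before running anything like it.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Check a table's columns via PRAGMA table_info (name is at index 1)."""
    # The table name is interpolated directly, so pass only trusted,
    # hard-coded identifiers here.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def add_token_hash_if_missing(db_path: str) -> bool:
    """Add tokens.token_hash when absent; return True if a change was made."""
    conn = sqlite3.connect(db_path)
    try:
        if column_exists(conn, "tokens", "token_hash"):
            return False
        # Same statement as the manual workaround above.
        conn.execute("ALTER TABLE tokens ADD COLUMN token_hash TEXT")
        conn.commit()
        return True
    finally:
        conn.close()
```

If the `tokens` table does not exist at all, the `ALTER TABLE` statement raises `sqlite3.OperationalError`, which is a reasonable signal that the database is not in the state this workaround expects.
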
## Verification Steps

1. Check production database structure:
   ```sql
   PRAGMA table_info(tokens);
   ```

2. Verify migration status:
   ```sql
   SELECT * FROM schema_migrations;
   ```

3. Test with a v0.9.5 database locally to reproduce

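The first two checks can also be combined into one small helper, which is convenient when comparing several database copies. This is only a sketch: `describe_database` is a hypothetical name, and because the column layout of `schema_migrations` is not described in this report, the rows are returned as-is without interpretation.

```python
import sqlite3
from typing import Any, Dict


def describe_database(conn: sqlite3.Connection) -> Dict[str, Any]:
    """Summarize the tokens columns and any recorded migrations."""
    # PRAGMA table_info yields one row per column; index 1 is the name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(tokens)")]

    # Older databases may not have a schema_migrations table yet.
    has_migrations_table = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = 'schema_migrations'"
    ).fetchone() is not None

    migrations = (
        conn.execute("SELECT * FROM schema_migrations").fetchall()
        if has_migrations_table
        else []
    )
    return {
        "token_columns": columns,
        "has_token_hash": "token_hash" in columns,
        "migrations": migrations,
    }
```

A database for which `has_token_hash` is false is one that would hit the startup failure described in this report.
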
## Long-term Architecture Recommendations

1. **Separate Initial Schema from Current Schema**
   - `INITIAL_SCHEMA_SQL` - The v0.1.0 starting point
   - Migrations handle ALL evolution

2. **Migration-First Initialization**
   - Check for existing database
   - Run migrations first if database exists
   - Only apply SCHEMA_SQL to truly empty databases

3. **Schema Version Tracking**
   - Add a `schema_version` table
   - Track the current schema version explicitly
   - Make decisions based on version, not heuristics

4. **Testing Strategy**
   - Always test upgrades from previous production version
   - Include migration testing in CI/CD pipeline
   - Maintain database snapshots for each released version

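To make the version-tracking recommendation concrete, the sketch below records each applied migration in a `schema_version` table and applies only the migrations above the recorded version. The `MIGRATIONS` list, its SQL, and the single-column layout of `schema_version` are all assumptions for illustration; they do not reflect the project's actual migration files.

```python
import sqlite3

# Hypothetical migrations as (target version, SQL script) pairs.
MIGRATIONS = [
    (1, "CREATE TABLE tokens (id INTEGER PRIMARY KEY, token TEXT NOT NULL);"),
    (
        2,
        "DROP TABLE IF EXISTS tokens;"
        "CREATE TABLE tokens ("
        "  id INTEGER PRIMARY KEY,"
        "  token_hash TEXT UNIQUE NOT NULL"
        ");"
        "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);",
    ),
]


def current_version(conn: sqlite3.Connection) -> int:
    """Return the highest recorded schema version, or 0 for a new database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn: sqlite3.Connection) -> int:
    """Apply every migration above the recorded version, in order."""
    version = current_version(conn)
    for target, sql in MIGRATIONS:
        if target > version:
            conn.executescript(sql)
            # Record the version only after the script has succeeded.
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (target,)
            )
            conn.commit()
            version = target
    return version
```

With this shape, a fresh database and a v0.9.5-style database follow the same code path and differ only in their starting version. Note that the script and the version insert are not wrapped in a single transaction here, so a production implementation would need to consider what happens if the process stops between the two.
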
## Conclusion

This is a **critical architectural issue** in the migration system that affects all existing production deployments. The immediate fix is straightforward, but the system needs architectural changes to prevent similar issues in future releases.

The core principle violated: **SCHEMA_SQL should represent the beginning, not the end state**.