docs: Add ADR-020 and migration system implementation guidance

Architecture documentation for automatic database migrations. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 16:11:17 -07:00
parent 9a805ec316
commit ebca9064c5
3 changed files with 2049 additions and 0 deletions
--- a/docs/decisions/ADR-020-automatic-database-migrations.md
+++ b/docs/decisions/ADR-020-automatic-database-migrations.md
--- a/docs/reports/2025-11-19-migration-implementation-quick-reference.md
+++ b/docs/reports/2025-11-19-migration-implementation-quick-reference.md
@@ -0,0 +1,104 @@
+# Migration System - Quick Reference Card
+
+**TL;DR**: Add fresh database detection to `migrations.py` to solve the chicken-and-egg problem between `SCHEMA_SQL` and migration 001.
+
+## The Problem
+
+- `SCHEMA_SQL` includes `code_verifier` column (line 60, database.py)
+- Migration 001 tries to add same column
+- Fresh databases fail: SQLite rejects the second definition with "duplicate column name: code_verifier"
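A minimal sketch of the failure, assuming a trimmed-down `auth_state` table. `SCHEMA_SQL` and `MIGRATION_001` here are hypothetical stand-ins for the project's real SQL, which this commit does not show. The point is only that running the migration against a database created from the full schema raises an `sqlite3.OperationalError`.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for SCHEMA_SQL: the column is already there.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS auth_state (
    state TEXT PRIMARY KEY,
    code_verifier TEXT NOT NULL DEFAULT ''
);
"""

# Hypothetical stand-in for migration 001, which adds the same column.
MIGRATION_001 = (
    "ALTER TABLE auth_state ADD COLUMN code_verifier TEXT NOT NULL DEFAULT ''"
)


def reproduce_conflict():
    """Build a fresh database from SCHEMA_SQL, then try to apply migration 001."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA_SQL)
        conn.execute(MIGRATION_001)
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


error_message = reproduce_conflict()
print(error_message)
```

The returned message names the offending column, which is why the migration cannot simply be executed on every startup.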
+
+## The Solution
+
+**SCHEMA_SQL = Target State** (complete current schema)
+- Fresh installs: Execute SCHEMA_SQL, skip migrations (already at target)
+- Existing installs: Run migrations to reach target
+
+## Code Changes Required
+
+### 1. Add to `migrations.py` (before `run_migrations`):
+
+```python
+def is_schema_current(conn):
+    """Check if database schema matches current SCHEMA_SQL"""
+    try:
+        cursor = conn.execute("PRAGMA table_info(auth_state)")
+        columns = [row[1] for row in cursor.fetchall()]
+        return 'code_verifier' in columns
+    except sqlite3.OperationalError:
+        return False
+```
+
+### 2. Modify `run_migrations()` in `migrations.py`:
+
+After `create_migrations_table(conn)`, before applying migrations, add:
+
+```python
+# Check if this is a fresh database
+cursor = conn.execute("SELECT COUNT(*) FROM schema_migrations")
+migration_count = cursor.fetchone()[0]
+
+# Discover migration files
+migration_files = discover_migration_files(migrations_dir)
+
+# Fresh database detection
+if migration_count == 0 and is_schema_current(conn):
+    # Mark all migrations as applied (schema already current)
+    for migration_name, _ in migration_files:
+        conn.execute(
+            "INSERT INTO schema_migrations (migration_name) VALUES (?)",
+            (migration_name,)
+        )
+    conn.commit()
+    logger.info(f"Fresh database: marked {len(migration_files)} migrations as applied")
+    return
+```
+
+### 3. Optional Helpers (add to `migrations.py` for future use):
+
+```python
+def table_exists(conn, table_name):
+    cursor = conn.execute(
+        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
+        (table_name,)
+    )
+    return cursor.fetchone() is not None
+
+def column_exists(conn, table_name, column_name):
+    try:
+        # PRAGMA arguments cannot be bound as parameters; pass only trusted table names
+        cursor = conn.execute(f"PRAGMA table_info({table_name})")
+        columns = [row[1] for row in cursor.fetchall()]
+        return column_name in columns
+    except sqlite3.OperationalError:
+        return False
+```
+
+## Test It
+
+```bash
+# Test 1: Fresh database
+rm -f data/starpunk.db && uv run flask --app app.py run
+# Expected: "Fresh database: marked 1 migrations as applied"
+
+# Test 2: Legacy database (before PKCE)
+# Create a database with the pre-PKCE auth_state table (no code_verifier), then run the app
+# Expected: "Applied migration: 001_add_code_verifier..."
+```
+
+## All Other Questions Answered
+
+- **Q2**: schema_migrations only in migrations.py ✓ (already correct)
+- **Q3**: Accept non-idempotent SQL, rely on tracking ✓ (already works)
+- **Q4**: Flexible filename validation ✓ (already implemented)
+- **Q5**: Automatic transition via Q1 solution ✓
+- **Q6**: Helpers provided for advanced use ✓ (see above)
+- **Q7**: SCHEMA_SQL is target state ✓ (no changes needed)
+
+## Full Details
+
+See: `/home/phil/Projects/starpunk/docs/reports/2025-11-19-migration-system-implementation-guidance.md`
+
+## Architecture Reference
+
+See: `/home/phil/Projects/starpunk/docs/decisions/ADR-020-automatic-database-migrations.md`
+(New section: "Developer Questions & Architectural Responses")
--- a/docs/reports/2025-11-19-migration-system-implementation-guidance.md
+++ b/docs/reports/2025-11-19-migration-system-implementation-guidance.md
@@ -0,0 +1,345 @@
+# Migration System Implementation Guidance
+
+**Date**: 2025-11-19
+**Architect**: StarPunk Architect
+**Developer**: StarPunk Developer
+**Status**: Ready for Implementation
+
+## Executive Summary
+
+All seven open questions now have an architectural decision. The only code changes are in `migrations.py`: fresh database detection plus three optional helper functions.
+
+## Critical Decisions Summary
+
+| # | Question | Decision | Action Required |
+|---|----------|----------|-----------------|
+| **1** | Chicken-and-egg problem | Fresh database detection | Add `is_schema_current()` to migrations.py |
+| **2** | schema_migrations location | Only in migrations.py | No changes needed (already correct) |
+| **3** | ALTER TABLE idempotency | Accept non-idempotency | No changes needed (tracking handles it) |
+| **4** | Filename validation | Flexible glob + sort | No changes needed (already implemented) |
+| **5** | Existing database path | Automatic via heuristic | Handled by Q1 solution |
+| **6** | Column helpers | Provide as advanced utils | Add 3 helper functions to migrations.py |
+| **7** | SCHEMA_SQL purpose | Complete target state | No changes needed (already correct) |
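Decision 4 ("flexible glob + sort") can be sketched as follows. This is a hypothetical reimplementation for illustration, not the project's actual `discover_migration_files()`; the only thing taken from the source is that it yields `(migration_name, migration_path)` pairs. The file names in the demo are placeholders.

```python
import tempfile
from pathlib import Path


def discover_migration_files(migrations_dir):
    """Return (name, path) pairs for every .sql file, ordered by filename."""
    directory = Path(migrations_dir)
    if not directory.is_dir():
        return []
    # Lexicographic order equals numeric order as long as prefixes are zero-padded.
    return [(path.name, path) for path in sorted(directory.glob("*.sql"))]


with tempfile.TemporaryDirectory() as tmp:
    for filename in (
        "002_add_tags_table.sql",
        "001_add_code_verifier_to_auth_state.sql",
        "notes.txt",  # ignored: not a .sql file
    ):
        (Path(tmp) / filename).write_text("-- placeholder\n")
    names = [name for name, _ in discover_migration_files(tmp)]

print(names)
```

Sorting by filename keeps validation loose, but it relies on zero-padded prefixes: `10_x.sql` would sort before `2_x.sql`.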
+
+## Implementation Checklist
+
+### Step 1: Add Helper Functions to `starpunk/migrations.py`
+
+Add these three utility functions (for advanced usage, not required for migration 001):
+
+```python
+def table_exists(conn, table_name):
+    """Check if table exists in database"""
+    cursor = conn.execute(
+        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
+        (table_name,)
+    )
+    return cursor.fetchone() is not None
+
+
+def column_exists(conn, table_name, column_name):
+    """Check if column exists in table"""
+    try:
+        # PRAGMA arguments cannot be bound as parameters; pass only trusted table names
+        cursor = conn.execute(f"PRAGMA table_info({table_name})")
+        columns = [row[1] for row in cursor.fetchall()]
+        return column_name in columns
+    except sqlite3.OperationalError:
+        return False
+
+
+def index_exists(conn, index_name):
+    """Check if index exists in database"""
+    cursor = conn.execute(
+        "SELECT name FROM sqlite_master WHERE type='index' AND name=?",
+        (index_name,)
+    )
+    return cursor.fetchone() is not None
+```
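`column_exists()` interpolates the table name into the PRAGMA statement. One possible alternative, assuming SQLite 3.16 or newer, is the `pragma_table_info` table-valued function, which accepts the table name as a bound parameter. The name `column_exists_bound` is hypothetical and not part of the project.

```python
import sqlite3


def column_exists_bound(conn, table_name, column_name):
    """Variant of column_exists() that binds both names as parameters."""
    cursor = conn.execute(
        "SELECT 1 FROM pragma_table_info(?) WHERE name = ?",
        (table_name, column_name),
    )
    return cursor.fetchone() is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_state (state TEXT PRIMARY KEY, code_verifier TEXT)")

has_verifier = column_exists_bound(conn, "auth_state", "code_verifier")
has_missing_column = column_exists_bound(conn, "auth_state", "no_such_column")
has_missing_table = column_exists_bound(conn, "no_such_table", "code_verifier")
conn.close()

print(has_verifier, has_missing_column, has_missing_table)
```

A missing table simply produces no rows, so the function returns `False` without needing a `try`/`except`.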
+
+### Step 2: Add Fresh Database Detection
+
+Add this function before `run_migrations()`:
+
+```python
+def is_schema_current(conn):
+    """
+    Check if database schema is current (matches SCHEMA_SQL)
+
+    Uses heuristic: Check for presence of latest schema features
+    Currently checks for code_verifier column in auth_state table
+
+    Args:
+        conn: SQLite connection
+
+    Returns:
+        bool: True if schema appears current, False if legacy
+    """
+    try:
+        cursor = conn.execute("PRAGMA table_info(auth_state)")
+        columns = [row[1] for row in cursor.fetchall()]
+        return 'code_verifier' in columns
+    except sqlite3.OperationalError:
+        # Defensive only: a missing table yields no rows rather than an error,
+        # so this branch covers failures such as a locked database
+        return False
+```
+
+**Important**: This heuristic checks for the `code_verifier` column. When you add future migrations, update this function to check for the latest schema feature.
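A small sketch of how the heuristic behaves on three kinds of database, using in-memory SQLite and simplified table definitions that are assumptions, not the project's schema. `PRAGMA table_info` returns an empty result for a table that does not exist, so the empty-database case is handled by the membership test itself.

```python
import sqlite3


def is_schema_current(conn):
    """Heuristic from the guidance: current means auth_state has code_verifier."""
    cursor = conn.execute("PRAGMA table_info(auth_state)")
    columns = [row[1] for row in cursor.fetchall()]
    return "code_verifier" in columns


def check(setup_sql):
    """Create an in-memory database from setup_sql and run the heuristic on it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)
        return is_schema_current(conn)
    finally:
        conn.close()


# Simplified, hypothetical table definitions for the three cases.
empty_result = check("")
legacy_result = check(
    "CREATE TABLE auth_state (state TEXT PRIMARY KEY, redirect_uri TEXT);"
)
current_result = check(
    "CREATE TABLE auth_state (state TEXT PRIMARY KEY, code_verifier TEXT);"
)

print(empty_result, legacy_result, current_result)
```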
+
+### Step 3: Modify `run_migrations()` Function
+
+Add fresh database detection ahead of the existing migration application logic:
+
+**Find this section** (after `create_migrations_table(conn)`):
+
+```python
+# Get already-applied migrations
+applied = get_applied_migrations(conn)
+
+# Discover migration files
+migration_files = discover_migration_files(migrations_dir)
+
+if not migration_files:
+    logger.info("No migration files found")
+    return
+
+# Apply pending migrations
+pending_count = 0
+for migration_name, migration_path in migration_files:
+    if migration_name not in applied:
+        apply_migration(conn, migration_name, migration_path, logger)
+        pending_count += 1
+```
+
+**Replace with**:
+
+```python
+# Check if this is a fresh database with current schema
+cursor = conn.execute("SELECT COUNT(*) FROM schema_migrations")
+migration_count = cursor.fetchone()[0]
+
+# Discover migration files
+migration_files = discover_migration_files(migrations_dir)
+
+if not migration_files:
+    logger.info("No migration files found")
+    return
+
+# Fresh database detection
+if migration_count == 0:
+    if is_schema_current(conn):
+        # Schema is current - mark all migrations as applied
+        for migration_name, _ in migration_files:
+            conn.execute(
+                "INSERT INTO schema_migrations (migration_name) VALUES (?)",
+                (migration_name,)
+            )
+        conn.commit()
+        logger.info(
+            f"Fresh database detected: marked {len(migration_files)} "
+            f"migrations as applied (schema already current)"
+        )
+        return
+    else:
+        logger.info("Legacy database detected: applying all migrations")
+
+# Get already-applied migrations
+applied = get_applied_migrations(conn)
+
+# Apply pending migrations
+pending_count = 0
+for migration_name, migration_path in migration_files:
+    if migration_name not in applied:
+        apply_migration(conn, migration_name, migration_path, logger)
+        pending_count += 1
+```
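The control flow can be exercised end to end with an in-memory database. This is a sketch under stated assumptions: migrations are held in a list instead of `.sql` files, the `schema_migrations` columns are inferred from the expected query output in the test scenarios, and `run_migrations_sketch` is a hypothetical name that returns a label instead of logging.

```python
import sqlite3

# Hypothetical stand-ins; the real project reads migrations from .sql files.
MIGRATIONS = [
    (
        "001_add_code_verifier_to_auth_state.sql",
        "ALTER TABLE auth_state ADD COLUMN code_verifier TEXT NOT NULL DEFAULT '';",
    ),
]
LEGACY_SQL = "CREATE TABLE auth_state (state TEXT PRIMARY KEY);"
CURRENT_SQL = (
    "CREATE TABLE auth_state ("
    "state TEXT PRIMARY KEY, code_verifier TEXT NOT NULL DEFAULT '');"
)


def is_schema_current(conn):
    columns = [row[1] for row in conn.execute("PRAGMA table_info(auth_state)")]
    return "code_verifier" in columns


def run_migrations_sketch(conn):
    """Return 'fresh', 'migrated', or 'up-to-date' depending on the path taken."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "migration_name TEXT UNIQUE NOT NULL, "
        "applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
    )
    count = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()[0]

    # No tracking rows and a current schema: record migrations without running them.
    if count == 0 and is_schema_current(conn):
        conn.executemany(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)",
            [(name,) for name, _ in MIGRATIONS],
        )
        conn.commit()
        return "fresh"

    applied = {
        row[0] for row in conn.execute("SELECT migration_name FROM schema_migrations")
    }
    pending = [(name, sql) for name, sql in MIGRATIONS if name not in applied]
    for name, sql in pending:
        conn.executescript(sql)
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)", (name,)
        )
        conn.commit()
    return "migrated" if pending else "up-to-date"


fresh_db = sqlite3.connect(":memory:")
fresh_db.executescript(CURRENT_SQL)
legacy_db = sqlite3.connect(":memory:")
legacy_db.executescript(LEGACY_SQL)

results = [
    run_migrations_sketch(fresh_db),   # current schema, no tracking rows
    run_migrations_sketch(legacy_db),  # old schema, migration must run
    run_migrations_sketch(legacy_db),  # second start, nothing left to do
]
legacy_is_current = is_schema_current(legacy_db)
print(results, legacy_is_current)
```

The three calls correspond to Test 1, Test 2, and Test 4 in the scenarios that follow.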
+
+## Files That Need Changes
+
+1. **`/home/phil/Projects/starpunk/starpunk/migrations.py`**
+   - Add `is_schema_current()` function
+   - Add `table_exists()` helper
+   - Add `column_exists()` helper
+   - Add `index_exists()` helper
+   - Modify `run_migrations()` to include fresh database detection
+
+2. **No other files need changes**
+   - `SCHEMA_SQL` is correct (includes code_verifier)
+   - Migration 001 is correct (adds code_verifier)
+   - `database.py` is correct (calls run_migrations)
+
+## Test Scenarios
+
+After implementation, verify these scenarios:
+
+### Test 1: Fresh Database (New Install)
+```bash
+rm -f data/starpunk.db
+uv run flask --app app.py run
+```
+
+**Expected Log Output**:
+```
+[INFO] Database initialized: data/starpunk.db
+[INFO] Fresh database detected: marked 1 migrations as applied (schema already current)
+```
+
+**Verify**:
+```bash
+sqlite3 data/starpunk.db "SELECT * FROM schema_migrations;"
+# Should show: 1|001_add_code_verifier_to_auth_state.sql|<timestamp>
+
+sqlite3 data/starpunk.db "PRAGMA table_info(auth_state);"
+# Should include code_verifier column
+```
+
+### Test 2: Legacy Database (Before PKCE Feature)
+```bash
+# Create old database without code_verifier (remove any existing file first)
+rm -f data/starpunk.db
+sqlite3 data/starpunk.db "
+CREATE TABLE auth_state (
+    state TEXT PRIMARY KEY,
+    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
+    expires_at TIMESTAMP NOT NULL,
+    redirect_uri TEXT
+);
+"
+
+uv run flask --app app.py run
+```
+
+**Expected Log Output**:
+```
+[INFO] Database initialized: data/starpunk.db
+[INFO] Legacy database detected: applying all migrations
+[INFO] Applied migration: 001_add_code_verifier_to_auth_state.sql
+[INFO] Migrations complete: 1 applied, 1 total
+```
+
+**Verify**:
+```bash
+sqlite3 data/starpunk.db "PRAGMA table_info(auth_state);"
+# Should now include code_verifier column
+```
+
+### Test 3: Current Database (Already Has code_verifier, No Migration Tracking)
+```bash
+# Simulate database created after PKCE but before migrations
+rm data/starpunk.db
+# Run once to create current schema
+uv run flask --app app.py run
+# Delete migration tracking to simulate upgrade scenario
+sqlite3 data/starpunk.db "DROP TABLE schema_migrations;"
+
+# Now run again (simulates upgrade)
+uv run flask --app app.py run
+```
+
+**Expected Log Output**:
+```
+[INFO] Database initialized: data/starpunk.db
+[INFO] Fresh database detected: marked 1 migrations as applied (schema already current)
+```
+
+**Verify**: Migration 001 should NOT execute (would fail on duplicate column).
+
+### Test 4: Up-to-Date Database
+```bash
+# Database already migrated
+uv run flask --app app.py run
+```
+
+**Expected Log Output**:
+```
+[INFO] Database initialized: data/starpunk.db
+[INFO] All migrations up to date (1 total)
+```
+
+## Edge Cases Handled
+
+1. **Fresh install**: SCHEMA_SQL creates complete schema, migrations marked as applied, never executed ✓
+2. **Upgrade from pre-PKCE**: Migration 001 executes, adds code_verifier ✓
+3. **Upgrade from post-PKCE, pre-migrations**: Fresh DB detection marks migrations as applied ✓
+4. **Re-running on current database**: Idempotent, no changes ✓
+5. **Migration already applied**: Skipped via tracking table ✓
+
+## Future Migration Pattern
+
+When adding future schema changes:
+
+1. **Update SCHEMA_SQL** in `database.py` with new tables/columns
+2. **Create migration file** `002_description.sql` with same SQL
+3. **Update `is_schema_current()`** to check for new feature (latest heuristic)
+4. **Test with all 4 scenarios above**
+
+Example for adding tags feature:
+
+**`database.py` SCHEMA_SQL**:
+```sql
+-- Add at end of SCHEMA_SQL (the string constant in database.py)
+CREATE TABLE IF NOT EXISTS tags (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    name TEXT UNIQUE NOT NULL
+);
+```
+
+**`migrations/002_add_tags_table.sql`**:
+```sql
+-- Migration: Add tags table
+-- Date: 2025-11-20
+
+CREATE TABLE IF NOT EXISTS tags (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    name TEXT UNIQUE NOT NULL
+);
+```
+
+**Update `is_schema_current()`**:
+```python
+def is_schema_current(conn):
+    """Check if database schema is current"""
+    try:
+        # Check for latest feature (tags table in this case)
+        return table_exists(conn, 'tags')
+    except sqlite3.OperationalError:
+        return False
+```
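Unlike migration 001, the tags migration uses `CREATE TABLE IF NOT EXISTS`, so running it against a database that already has the table is harmless. A minimal sketch, with the migration text held in a string for illustration:

```python
import sqlite3

# Same SQL as the example 002 migration, kept in a string for this sketch.
MIGRATION_002 = """
CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(MIGRATION_002)
conn.execute("INSERT INTO tags (name) VALUES ('python')")
conn.commit()

# Second run: the table exists, so the statement is a no-op and data survives.
conn.executescript(MIGRATION_002)
tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
conn.close()

print(tag_count)
```

This is the "nice-to-have" idempotency the guidance mentions. `ALTER TABLE ... ADD COLUMN` has no equivalent guard, which is why the tracking table remains the primary mechanism.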
+
+## Key Architectural Principles
+
+1. **SCHEMA_SQL is the destination**: It represents complete current state
+2. **Migrations are the journey**: They get existing databases to that state
+3. **Fresh databases skip the journey**: They're already at the destination
+4. **Heuristic detection is sufficient**: Check for the latest feature to determine whether the schema is current
+5. **Migration tracking is the safety net**: Prevents re-running migrations
+6. **Idempotency is nice-to-have**: Tracking is the primary mechanism
+
+## Common Pitfalls to Avoid
+
+1. **Don't remove from SCHEMA_SQL**: Only add, never remove (even if you "undo" via migration)
+2. **Don't create migration without SCHEMA_SQL update**: They must stay in sync
+3. **Don't scatter hardcoded schema checks**: Keep them in the `is_schema_current()` heuristic
+4. **Don't forget to update heuristic**: When adding new migrations, update the check
+5. **Don't make migrations complex**: Keep them simple, let tracking handle safety
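One way to catch a migration that has drifted from `SCHEMA_SQL` is to build the schema both ways and compare the result. The sketch below assumes simplified, hypothetical SQL for the legacy schema, the target schema, and the migration list. `describe_schema` and `build` are illustrative helpers, not project code.

```python
import sqlite3

# Hypothetical stand-ins for the project's real SQL.
LEGACY_SCHEMA_SQL = (
    "CREATE TABLE auth_state (state TEXT PRIMARY KEY, redirect_uri TEXT);"
)
SCHEMA_SQL = (
    "CREATE TABLE IF NOT EXISTS auth_state ("
    "state TEXT PRIMARY KEY, redirect_uri TEXT, "
    "code_verifier TEXT NOT NULL DEFAULT '');"
)
MIGRATIONS = [
    "ALTER TABLE auth_state ADD COLUMN code_verifier TEXT NOT NULL DEFAULT '';",
]


def describe_schema(conn):
    """Map each user table to its sorted (name, type, notnull, default) tuples."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # Table names come from sqlite_master, so interpolating them is safe here.
        rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = sorted(tuple(row[1:5]) for row in rows)
    return schema


def build(statements):
    """Run the statements against a new in-memory database and describe it."""
    conn = sqlite3.connect(":memory:")
    try:
        for statement in statements:
            conn.executescript(statement)
        return describe_schema(conn)
    finally:
        conn.close()


fresh_schema = build([SCHEMA_SQL])
migrated_schema = build([LEGACY_SCHEMA_SQL] + MIGRATIONS)
in_sync = fresh_schema == migrated_schema
print(in_sync)
```

The comparison ignores column order and indexes, so it is a coarse check. It is still enough to flag a column that was added to one side only.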
+
+## Questions?
+
+All architectural decisions are documented in:
+- `/home/phil/Projects/starpunk/docs/decisions/ADR-020-automatic-database-migrations.md`
+
+See the "Developer Questions & Architectural Responses" section for detailed rationale on all 7 questions.
+
+## Ready to Implement
+
+You have:
+- Clear implementation steps
+- Complete code examples
+- Test scenarios
+- Edge case handling
+- Future migration pattern
+
+Proceed with implementation. The architecture is solid and production-ready.
+
+---
+
+**Architect Sign-Off**: Ready for implementation
+**Next Step**: Developer implements modifications to `migrations.py`