Implementation Report: Migration Fix for v1.0.0-rc.2

Date: 2025-11-24
Version: v1.0.0-rc.2
Type: Hotfix
Status: Implemented
Branch: hotfix/1.0.0-rc.2-migration-fix

Summary

Fixed a critical database migration failure that occurred when applying migration 002 to existing databases created with v1.0.0-rc.1 or earlier. The same token indexes were defined in both SCHEMA_SQL and migration 002, so the migration aborted with "index already exists" errors.

Problem Statement

Root Cause

When v1.0.0-rc.1 was released, the SCHEMA_SQL in database.py included index creation statements for token-related indexes:

CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
CREATE INDEX IF NOT EXISTS idx_tokens_me ON tokens(me);
CREATE INDEX IF NOT EXISTS idx_tokens_expires ON tokens(expires_at);

However, these same indexes were also created by migration 002_secure_tokens_and_authorization_codes.sql:

CREATE INDEX idx_tokens_hash ON tokens(token_hash);
CREATE INDEX idx_tokens_me ON tokens(me);
CREATE INDEX idx_tokens_expires ON tokens(expires_at);

Failure Scenario

For databases created with v1.0.0-rc.1:

  1. init_db() runs SCHEMA_SQL, creating tables and indexes
  2. Migration system detects no migrations have been applied
  3. Tries to apply migration 002
  4. Migration fails because indexes already exist (migration uses CREATE INDEX without IF NOT EXISTS)
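
The failure in step 4 can be reproduced in isolation with Python's built-in sqlite3 module. This is a minimal sketch, not the project's code: the tokens table below is a simplified stand-in with only the indexed columns, and try_create is a hypothetical helper added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the real tokens table (assumption: only the
# indexed columns are modelled here).
conn.execute("CREATE TABLE tokens (token_hash TEXT, me TEXT, expires_at TEXT)")

# What SCHEMA_SQL did in v1.0.0-rc.1: create the index up front.
conn.execute("CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash)")


def try_create(sql):
    """Run one statement; return None on success or the SQLite error text."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


# What migration 002 does: create the same index unconditionally.
plain_error = try_create("CREATE INDEX idx_tokens_hash ON tokens(token_hash)")

# The guarded form is a no-op when the index is already present.
guarded_error = try_create(
    "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash)"
)

print("plain CREATE INDEX:", plain_error)
print("with IF NOT EXISTS:", guarded_error)
```

The unguarded statement raises sqlite3.OperationalError because the index name is already taken, while the guarded statement succeeds silently.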

Affected Databases

  • Any database created with v1.0.0-rc.1 where init_db() was called
  • Fresh databases where SCHEMA_SQL ran before migrations could apply

Solution

Phase 1: Remove Duplicate Index Definitions

File: starpunk/database.py

Removed the three index creation statements from SCHEMA_SQL (lines 58-60):

  • CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
  • CREATE INDEX IF NOT EXISTS idx_tokens_me ON tokens(me);
  • CREATE INDEX IF NOT EXISTS idx_tokens_expires ON tokens(expires_at);

Rationale: Migration 002 should be the sole source of truth for these indexes. SCHEMA_SQL should only create tables, not indexes that are managed by migrations.

Phase 2: Smart Migration Detection

File: starpunk/migrations.py

Enhanced the migration system to handle databases where SCHEMA_SQL already includes features from migrations:

  1. Added is_migration_needed() function: Checks database state to determine if a specific migration needs to run
    • Migration 001: Checks if code_verifier column exists
    • Migration 002: Checks if tables exist with correct structure and if indexes exist
  2. Updated is_schema_current() function: Now checks for presence of indexes, not just tables/columns
    • Returns False if indexes are missing (even if tables exist)
    • This triggers the "fresh database with partial schema" path
  3. Enhanced run_migrations() function: Smart handling of migrations on fresh databases
    • Detects when migration features are already in SCHEMA_SQL
    • Skips migrations that would fail (tables already exist)
    • Creates missing indexes separately for migration 002
    • Marks skipped migrations as applied in tracking table
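
The state checks described above could look roughly like the following. This is a hedged sketch rather than the actual contents of starpunk/migrations.py: the helper names (table_exists, index_exists, column_exists, migration_needed) are hypothetical, the auth_state table name is inferred from the migration 001 filename, and the "correct structure" test for migration 002 is assumed to be the presence of the token_hash column.

```python
import sqlite3


def table_exists(conn, name):
    """True if a table with this name is listed in sqlite_master."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def index_exists(conn, name):
    """True if an index with this name is listed in sqlite_master."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def column_exists(conn, table, column):
    """True if the table has the column. PRAGMA arguments cannot be bound,
    so `table` must be a trusted identifier, never user input."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def migration_needed(conn, migration_name):
    """Decide from the live schema whether a migration still has work to do."""
    if migration_name.startswith("001_"):
        # Assumption: code_verifier lives on the auth_state table.
        return not column_exists(conn, "auth_state", "code_verifier")
    if migration_name.startswith("002_"):
        # Assumption: the new tokens table is recognised by its token_hash column.
        new_tables_present = table_exists(
            conn, "authorization_codes"
        ) and column_exists(conn, "tokens", "token_hash")
        return not new_tables_present
    return True  # unknown migrations are treated as needed


# Fresh database: the base schema already contains what both migrations add.
fresh = sqlite3.connect(":memory:")
fresh.executescript(
    """
    CREATE TABLE auth_state (state TEXT, code_verifier TEXT);
    CREATE TABLE tokens (token_hash TEXT, me TEXT, expires_at TEXT);
    CREATE TABLE authorization_codes (code_hash TEXT, expires_at TEXT);
    """
)

# Legacy database: old-style tables, nothing from either migration.
legacy = sqlite3.connect(":memory:")
legacy.executescript(
    """
    CREATE TABLE auth_state (state TEXT);
    CREATE TABLE tokens (token TEXT, me TEXT);
    """
)

for label, db in (("fresh", fresh), ("legacy", legacy)):
    for name in ("001_add_code_verifier_to_auth_state.sql",
                 "002_secure_tokens_and_authorization_codes.sql"):
        print(label, name, "needed:", migration_needed(db, name))
```

Inspecting sqlite_master and PRAGMA table_info keeps the decision tied to the actual schema rather than to version numbers, which is what lets the same code path serve fresh, rc.1, and legacy databases.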

Migration Logic Flow

Fresh Database Init:
1. SCHEMA_SQL creates tables/columns (no indexes for tokens/auth_codes)
2. is_schema_current() returns False (indexes missing)
3. run_migrations() detects fresh database with partial schema
4. For migration 001:
   - is_migration_needed() returns False (code_verifier exists)
   - Skips migration, marks as applied
5. For migration 002:
   - is_migration_needed() returns False (tables exist, no indexes)
   - Creates missing indexes separately
   - Marks migration as applied
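
Step 5 of this flow (skip the migration body, create whatever indexes are missing, then record the migration) might be sketched as below. The function names and the schema_migrations column layout (id, migration_name, applied_at) are assumptions made for illustration. Only the three token indexes are included because their definitions appear in this report; the two auth-code indexes would presumably be handled the same way.

```python
import sqlite3

MIGRATION_002 = "002_secure_tokens_and_authorization_codes.sql"

# Index definitions taken from this report; IF NOT EXISTS keeps them re-runnable.
TOKEN_INDEXES = {
    "idx_tokens_hash": "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash)",
    "idx_tokens_me": "CREATE INDEX IF NOT EXISTS idx_tokens_me ON tokens(me)",
    "idx_tokens_expires": "CREATE INDEX IF NOT EXISTS idx_tokens_expires ON tokens(expires_at)",
}


def missing_indexes(conn, names):
    """Return the subset of index names not yet present in the database."""
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    }
    return [name for name in names if name not in existing]


def skip_migration_002(conn):
    """Create missing token indexes, then mark migration 002 as applied.

    Returns the names of the indexes that had to be created.
    """
    created = missing_indexes(conn, TOKEN_INDEXES)
    for name in created:
        conn.execute(TOKEN_INDEXES[name])
    # INSERT OR IGNORE relies on the UNIQUE constraint assumed below, so a
    # second call does not add a duplicate tracking row.
    conn.execute(
        "INSERT OR IGNORE INTO schema_migrations (migration_name) VALUES (?)",
        (MIGRATION_002,),
    )
    conn.commit()
    return created


# Demo: a fresh database whose base schema has tables but no token indexes.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tokens (token_hash TEXT, me TEXT, expires_at TEXT);
    -- Hypothetical tracking table layout.
    CREATE TABLE schema_migrations (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        migration_name TEXT UNIQUE NOT NULL,
        applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    """
)

first_run = skip_migration_002(conn)   # creates all three indexes
second_run = skip_migration_002(conn)  # nothing left to do
recorded = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()[0]
print(first_run, second_run, recorded)
```

Running the function twice shows the property the fix depends on: the second call creates nothing and leaves a single tracking row, so an rc.1 database that already has the indexes goes through the same path safely.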

Changes Made

File: starpunk/database.py

  • Lines 58-60 removed: Duplicate index creation statements for tokens table

File: starpunk/migrations.py

  • Lines 50-99: Updated is_schema_current() to check for indexes
  • Lines 158-214: Added is_migration_needed() function for smart migration detection
  • Lines 373-422: Enhanced migration application loop with index creation for migration 002

File: starpunk/__init__.py

  • Lines 156-157: Version bumped to 1.0.0-rc.2

File: CHANGELOG.md

  • Lines 10-25: Added v1.0.0-rc.2 entry documenting the fix

Testing

Test Case 1: Fresh Database Initialization

# Create fresh database with current SCHEMA_SQL
init_db(app)

# Verify:
# - Migration 001: Marked as applied (code_verifier in SCHEMA_SQL)
# - Migration 002: Marked as applied with indexes created
# - All 3 token indexes exist: idx_tokens_hash, idx_tokens_me, idx_tokens_expires
# - All 2 auth_code indexes exist: idx_auth_codes_hash, idx_auth_codes_expires

Result: ✓ PASS

  • Created 3 missing token indexes from migration 002
  • Migrations complete: 0 applied, 2 skipped (already in SCHEMA_SQL), 2 total
  • All indexes present and functional

Test Case 2: Legacy Database Migration

# Database from v0.9.x (before migration 002)
# Has old tokens table, no authorization_codes, no indexes

run_migrations(db_path)

# Verify:
# - Migration 001: Applied (added code_verifier)
# - Migration 002: Applied (dropped old tokens, created new tables, created indexes)

Result: Expected to work correctly (migration 002 applies in full)

Test Case 3: Existing v1.0.0-rc.1 Database

# Database created with v1.0.0-rc.1
# Has tokens table with indexes from SCHEMA_SQL
# Has no migration tracking records

run_migrations(db_path)

# Verify:
# - Migration 001: Skipped (code_verifier exists)
# - Migration 002: Skipped (tables exist), indexes already present

Result: Expected to work correctly (detects that the indexes already exist and marks the migrations as applied)

Backwards Compatibility

For Fresh Databases

  • Before fix: Would fail on migration 002 (table already exists)
  • After fix: Successfully initializes with all features

For Existing v1.0.0-rc.1 Databases

  • Before fix: Would fail on migration 002 (index already exists)
  • After fix: Detects indexes exist, marks migration as applied without running

For Legacy Databases (pre-v1.0.0-rc.1)

  • No change: Migrations apply normally as before

Technical Details

Index Creation Strategy

Migration 002 creates 5 indexes total:

  1. idx_tokens_hash - For token lookup by hash
  2. idx_tokens_me - For finding all tokens for a user
  3. idx_tokens_expires - For finding expired tokens to clean up
  4. idx_auth_codes_hash - For authorization code lookup
  5. idx_auth_codes_expires - For finding expired codes

These indexes are now ONLY created by:

  1. Migration 002 (for legacy databases)
  2. Smart migration detection (for fresh databases with SCHEMA_SQL)

Migration Tracking

All scenarios now correctly record migrations in schema_migrations table:

  • Fresh database: Both migrations marked as applied
  • Legacy database: Migrations applied and recorded
  • Existing rc.1 database: Migrations detected and marked as applied

Deployment Notes

Upgrading from v1.0.0-rc.1

  1. Stop application
  2. Backup database: cp data/starpunk.db data/starpunk.db.backup
  3. Update code to v1.0.0-rc.2
  4. Start application
  5. On startup, the migration system detects the existing indexes and marks the migrations as applied
  6. No data is lost and no schema changes are made

Fresh Installation

  1. Install v1.0.0-rc.2
  2. Run application
  3. Database initializes with SCHEMA_SQL + smart migrations
  4. All indexes created correctly

Verification

Check Migration Status

sqlite3 data/starpunk.db "SELECT * FROM schema_migrations ORDER BY id"

Expected output:

1|001_add_code_verifier_to_auth_state.sql|2025-11-24 ...
2|002_secure_tokens_and_authorization_codes.sql|2025-11-24 ...

Check Indexes

sqlite3 data/starpunk.db "SELECT name FROM sqlite_master WHERE type='index' AND name LIKE 'idx_tokens%' ORDER BY name"

Expected output:

idx_tokens_expires
idx_tokens_hash
idx_tokens_me
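
Both checks can also be scripted. The sketch below is one possible way to do it with Python's sqlite3 module; verify is a hypothetical helper, and it reads the migration name from the second column of schema_migrations, matching the expected output shown above. The in-memory database and its column names exist only to make the example runnable.

```python
import sqlite3

EXPECTED_MIGRATIONS = [
    "001_add_code_verifier_to_auth_state.sql",
    "002_secure_tokens_and_authorization_codes.sql",
]
EXPECTED_TOKEN_INDEXES = ["idx_tokens_expires", "idx_tokens_hash", "idx_tokens_me"]


def verify(conn):
    """Return a list of problems found; an empty list means both checks pass."""
    problems = []

    # Same query as the first command above; column 1 holds the migration name.
    applied = [
        row[1] for row in conn.execute("SELECT * FROM schema_migrations ORDER BY id")
    ]
    for name in EXPECTED_MIGRATIONS:
        if name not in applied:
            problems.append(f"migration not recorded: {name}")

    # Same query as the second command above.
    indexes = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'index' AND name LIKE 'idx_tokens%' ORDER BY name"
        )
    ]
    for name in EXPECTED_TOKEN_INDEXES:
        if name not in indexes:
            problems.append(f"index missing: {name}")

    return problems


# Demo against a deliberately incomplete in-memory database
# (hypothetical column names; a real check would open data/starpunk.db).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE schema_migrations (
        id INTEGER PRIMARY KEY, migration_name TEXT, applied_at TEXT
    );
    INSERT INTO schema_migrations (migration_name, applied_at)
        VALUES ('001_add_code_verifier_to_auth_state.sql', '2025-11-24');
    CREATE TABLE tokens (token_hash TEXT, me TEXT, expires_at TEXT);
    CREATE INDEX idx_tokens_hash ON tokens(token_hash);
    CREATE INDEX idx_tokens_expires ON tokens(expires_at);
    """
)

problems = verify(conn)
for line in problems:
    print(line)
```

Against a correctly upgraded database the returned list should be empty; the demo database is missing one migration record and one index so that both kinds of problem are reported.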

Lessons Learned

  1. Single Source of Truth: Migrations should be the sole source for schema changes, not duplicated in SCHEMA_SQL
  2. Migration Idempotency: Migrations should be idempotent or the migration system should handle partial application
  3. Smart Detection: Fresh database detection needs to consider specific features, not just "all or nothing"
  4. Index Management: Indexes created by migrations should not be duplicated in base schema

References

  • ADR-020: Automatic Database Migration System
  • Git Branching Strategy: docs/standards/git-branching-strategy.md
  • Versioning Strategy: docs/standards/versioning-strategy.md

Next Steps

  1. Wait for approval
  2. Merge hotfix branch to main
  3. Tag v1.0.0-rc.2
  4. Test in production
  5. Monitor for any migration issues

Files Modified

  • starpunk/database.py (3 lines removed)
  • starpunk/migrations.py (enhanced smart migration detection)
  • starpunk/__init__.py (version bump)
  • CHANGELOG.md (release notes)
  • docs/reports/2025-11-24-migration-fix-v1.0.0-rc.2.md (this report)