Architecture documentation for automatic database migrations. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
# ADR-020: Automatic Database Migration System

## Status

Accepted

## Context

StarPunk currently requires manual database migration execution before starting the application. This creates operational friction and is particularly problematic in containerized deployments where the database schema must be initialized automatically on startup.

### Current State

- **Database**: SQLite at `data/starpunk.db`
- **Initial Schema**: Defined in `starpunk/database.py` as `SCHEMA_SQL` constant
- **Migrations**: SQL files in `migrations/` directory (e.g., `001_add_code_verifier_to_auth_state.sql`)
- **Initialization**: `init_db()` creates tables using `CREATE TABLE IF NOT EXISTS`
- **Problem**: Schema changes require manual SQL execution, no tracking of applied migrations

### Pain Points

1. **Manual Intervention Required**: Deploying schema changes requires SSH access and manual SQL execution
2. **No Migration History**: No way to know which migrations have been applied to a database
3. **Error-Prone**: Easy to forget migrations or apply them out of order
4. **Container Unfriendly**: Containers should be stateless and self-initializing
5. **Development Friction**: Each developer must manually track and apply migrations
6. **Testing Complexity**: Test databases require manual migration setup

### Requirements

1. **Automatic Execution**: Migrations run on application startup
2. **Idempotency**: Safe to run multiple times, only applies pending migrations
3. **Order Preservation**: Migrations applied in deterministic order
4. **Tracking**: Record which migrations have been applied
5. **Safety**: Clear errors, no partial application of migrations
6. **Simplicity**: Minimal complexity, easy to understand and debug
7. **Container Compatible**: Works in ephemeral container environments
8. **Developer Friendly**: Easy to add new migrations

## Decision

Implement an **automatic sequential migration system** that runs on application startup, using numbered SQL files and a migration tracking table.

### Core Components

1. **Migration Tracking Table**: `schema_migrations` table in SQLite
2. **Migration Files**: Sequentially numbered `.sql` files in `migrations/` directory
3. **Migration Runner**: `run_migrations()` function in `starpunk/migrations.py`
4. **Integration Point**: Called from `init_db()` in `starpunk/database.py`

### Migration File Format

**Naming Convention**: `{number:03d}_{description}.sql`

**Examples**:
- `001_add_code_verifier_to_auth_state.sql`
- `002_add_tags_table.sql`
- `003_add_note_syndication_urls.sql`

**Format Rules**:
- Three-digit zero-padded number (001, 002, 003, ...)
- Underscore separator
- Lowercase descriptive name with underscores
- `.sql` extension

**File Content**:
```sql
-- Migration: {Description}
-- Date: {YYYY-MM-DD}
-- ADR: {ADR reference if applicable}

{SQL statements}

-- Each statement should be idempotent where possible
-- Use IF NOT EXISTS for CREATE TABLE/INDEX
-- Use default values for ALTER TABLE ADD COLUMN
```

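The naming rules in this section can be checked mechanically. The following is a minimal sketch and not part of the specified runner; the regular expression and the `parse_migration_name` helper are assumptions made for illustration.

```python
import re

# Hypothetical pattern for the naming convention: three digits, an
# underscore, a lowercase description with underscores, and ".sql".
MIGRATION_NAME_RE = re.compile(r"^(\d{3})_([a-z0-9]+(?:_[a-z0-9]+)*)\.sql$")


def parse_migration_name(filename):
    """Return (number, description) for a valid name, or None otherwise."""
    match = MIGRATION_NAME_RE.match(filename)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# A valid name parses into its parts; an unpadded number is rejected.
print(parse_migration_name("002_add_tags_table.sql"))
print(parse_migration_name("2_add_tags.sql"))
```

A check like this could run during discovery so that a misnamed file fails loudly instead of silently sorting into the wrong position.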
### Migration Tracking Schema

```sql
CREATE TABLE IF NOT EXISTS schema_migrations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    migration_name TEXT UNIQUE NOT NULL,
    applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_schema_migrations_name
    ON schema_migrations(migration_name);
```

**Fields**:
- `id`: Auto-increment primary key
- `migration_name`: Filename of migration (e.g., `001_add_code_verifier_to_auth_state.sql`)
- `applied_at`: Timestamp when migration was applied

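One detail worth checking: SQLite creates an automatic index for every `UNIQUE` constraint, so `migration_name` is already indexed before `idx_schema_migrations_name` is added. The sketch below inspects an in-memory database to show this. It is an illustration only, and the `list_indexes` helper is hypothetical.

```python
import sqlite3

TRACKING_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS schema_migrations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    migration_name TEXT UNIQUE NOT NULL,
    applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def list_indexes(conn, table):
    """Return the names of all indexes on a table (hypothetical helper)."""
    rows = conn.execute(f"PRAGMA index_list({table})").fetchall()
    # The index name is the second column of each PRAGMA index_list row.
    return [row[1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(TRACKING_TABLE_SQL)
indexes = list_indexes(conn, "schema_migrations")
print(indexes)
conn.close()
```

The list already contains an automatically created index for the `UNIQUE` constraint, so the explicit index in the schema appears to be redundant, though it does no harm.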
### Migration Discovery and Execution

**Algorithm**:

1. **Initialize tracking table** (if not exists)
2. **Discover migration files** in `migrations/` directory
3. **Sort by filename** (zero-padded numeric prefix ensures order)
4. **Check each migration** against `schema_migrations` table
5. **Apply pending migrations** in order
6. **Record successful migrations** in tracking table
7. **Fail fast** on any error with clear message

**Execution Order**:
- Lexicographic sort of filenames gives the correct order because prefixes are zero-padded to three digits
- `001_*.sql` runs before `002_*.sql`
- New migrations added with next available number

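The ordering and pending-set logic can be sketched in a few lines. This is an illustrative sketch rather than the runner itself; `pending_migrations` and `find_duplicate_numbers` are hypothetical helper names, and the duplicate check is one possible way to detect two files sharing a number.

```python
from collections import defaultdict


def find_duplicate_numbers(filenames):
    """Map each numeric prefix used more than once to its filenames."""
    by_number = defaultdict(list)
    for name in filenames:
        by_number[name.split("_", 1)[0]].append(name)
    return {
        number: sorted(names)
        for number, names in by_number.items()
        if len(names) > 1
    }


def pending_migrations(filenames, applied):
    """Return unapplied filenames in execution (lexicographic) order."""
    return [name for name in sorted(filenames) if name not in applied]


files = [
    "003_add_note_syndication_urls.sql",
    "001_add_code_verifier_to_auth_state.sql",
    "002_add_tags_table.sql",
]
already_applied = {"001_add_code_verifier_to_auth_state.sql"}

print(pending_migrations(files, already_applied))
print(find_duplicate_numbers(files + ["002_add_comments_table.sql"]))
```

Sorting the filenames as plain strings is enough only while every prefix has the same width, which is why the naming convention requires zero padding.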
### SQLite Transaction Handling

**Approach**: Execute each migration in a transaction

**Implementation**:
```python
try:
    # executescript() commits any pending transaction before it runs,
    # so BEGIN is sent as part of the script, not as a separate call.
    conn.executescript("BEGIN;\n" + migration_sql)
    conn.execute(
        "INSERT INTO schema_migrations (migration_name) VALUES (?)",
        (migration_file,)
    )
    conn.commit()
except Exception as e:
    conn.rollback()
    raise MigrationError(f"Migration {migration_file} failed: {e}") from e
```

**Note on SQLite DDL**: SQLite DDL statements (`CREATE TABLE`, `ALTER TABLE`, `CREATE INDEX`) are transactional and roll back like any other statement. The risk lies in Python's `sqlite3` module: `Connection.executescript()` issues a `COMMIT` for any pending transaction before running the script, so a `BEGIN` sent in a separate call does not protect the script.
- Placing `BEGIN` inside the script keeps the migration and its tracking row in one transaction
- A migration file that contains its own `BEGIN` or `COMMIT` breaks that guarantee and can leave partial state
- **Mitigation**: Keep transaction control statements out of migration files and use `IF NOT EXISTS` where SQLite supports it
- **Recovery**: If a migration does leave partial state, fix it manually, then re-run

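Whether a failed migration really leaves nothing behind can be verified directly. The sketch below is an assumption-laden illustration, not the runner's code: `apply_atomically` and `table_exists` are hypothetical helpers, the connection is opened with `isolation_level=None` so that all transaction control is explicit, and the tracking table is reduced to a single column.

```python
import sqlite3


def apply_atomically(conn, name, migration_sql):
    """Apply one migration and its tracking row in a single transaction."""
    try:
        # executescript() commits any pending transaction first, so the
        # BEGIN has to travel inside the script itself.
        conn.executescript("BEGIN;\n" + migration_sql)
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)",
            (name,),
        )
        conn.execute("COMMIT")
    except sqlite3.Error:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


def table_exists(conn, table):
    """Report whether a table is present in the database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


# isolation_level=None stops the sqlite3 module from opening transactions
# on its own, leaving BEGIN/COMMIT/ROLLBACK fully explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE schema_migrations (migration_name TEXT UNIQUE NOT NULL)")

broken_sql = "CREATE TABLE tags (id INTEGER); CRATE TABLE oops (id INTEGER);"
try:
    apply_atomically(conn, "002_broken.sql", broken_sql)
except sqlite3.Error as exc:
    print(f"Migration failed: {exc}")

# The CREATE TABLE that ran before the syntax error was rolled back.
print(table_exists(conn, "tags"))
```

The first statement of the broken script succeeds, yet the table is gone after the rollback, which is the behavior the Safety requirement asks for.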
### Integration Points

**In `starpunk/database.py`**:
```python
def init_db(app=None):
    """
    Initialize database schema and run migrations

    Args:
        app: Flask application instance (optional, for config access)
    """
    if app:
        # Wrap in Path so .parent works even if the config value is a string
        db_path = Path(app.config["DATABASE_PATH"])
        logger = app.logger
    else:
        db_path = Path("./data/starpunk.db")
        logger = None

    # Ensure parent directory exists
    db_path.parent.mkdir(parents=True, exist_ok=True)

    # Create initial schema
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(SCHEMA_SQL)
        conn.commit()
        if logger:
            logger.info(f"Database initialized: {db_path}")
    finally:
        conn.close()

    # Run migrations
    from starpunk.migrations import run_migrations
    run_migrations(db_path, logger=logger)
```

**Call Order**:
1. `create_app()` → `init_db(app)`
2. `init_db()` → create base schema → `run_migrations()`
3. `run_migrations()` → apply pending migrations
4. Application starts serving requests

### Error Handling

**Error Types**:

1. **Migration File Error**: Invalid SQL syntax
   - **Action**: Log error with filename and the SQLite error message
   - **Result**: Application fails to start
   - **Recovery**: Fix migration SQL, restart

2. **Migration Conflict**: Two migrations with same number
   - **Action**: Log error listing conflicting files
   - **Result**: Application fails to start
   - **Recovery**: Renumber migrations, restart

3. **Database Lock**: SQLite database locked
   - **Action**: Retry with exponential backoff (3 attempts)
   - **Result**: Fail if still locked after retries
   - **Recovery**: Ensure no other processes accessing database

4. **Partial Migration**: Migration fails mid-execution
   - **Action**: Log error with migration name and error details
   - **Result**: Application fails to start
   - **Recovery**: Fix issue manually, restart (migration will retry)

**Error Message Format**:
```
[ERROR] Database migration failed: 002_add_tags_table.sql
Reason: near "CRATE": syntax error
Action required: Fix migration file and restart application
```

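The "Database Lock" case calls for a retry with exponential backoff. A minimal sketch of that idea follows; the `retry_on_lock` helper, its parameters, and the check for the word "locked" in the error message are assumptions for illustration, not part of the specified runner.

```python
import sqlite3
import time


def retry_on_lock(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on lock errors."""
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            is_lock_error = "locked" in str(exc).lower()
            if not is_lock_error or attempt == attempts - 1:
                # Not a lock problem, or retries exhausted: give up.
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))


# Demonstration with a stand-in operation that is "locked" twice.
state = {"calls": 0}


def flaky_operation():
    state["calls"] += 1
    if state["calls"] < 3:
        raise sqlite3.OperationalError("database is locked")
    return "ok"


print(retry_on_lock(flaky_operation, base_delay=0.01))
```

Note that `sqlite3.connect()` also accepts a `timeout` argument that makes SQLite itself wait for a lock to clear, which may cover many cases without an explicit retry loop.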
### Logging Strategy

**Log Levels**:

- **INFO**: Migration discovery and successful application
  ```
  [INFO] Discovered 3 migration files
  [INFO] Applied migration: 001_add_code_verifier_to_auth_state.sql
  [INFO] Migrations complete: 1 applied, 3 total
  ```

- **DEBUG**: Detailed migration execution
  ```
  [DEBUG] Migration file: 001_add_code_verifier_to_auth_state.sql
  [DEBUG] Migration status: pending
  [DEBUG] Executing migration SQL...
  [DEBUG] Migration recorded in schema_migrations
  ```

- **WARNING**: Unusual but non-fatal conditions
  ```
  [WARNING] No migrations directory found, skipping migrations
  [WARNING] Migrations directory empty
  ```

- **ERROR**: Migration failures
  ```
  [ERROR] Migration failed: 002_add_tags_table.sql
  [ERROR] Database error: near "CRATE": syntax error
  ```

**Logging Output**:
- Development: Console (handled by Flask logger)
- Production: Container logs (stdout/stderr)
- Format: Timestamp, level, message

### Developer Workflow

**Adding a New Migration**:

1. **Create migration file**:
   ```bash
   # Determine the next number from the highest existing file
   ls migrations/ | tail -1

   # Create the new file, e.g. 002_add_tags_table.sql
   touch migrations/002_add_tags_table.sql
   ```

2. **Write migration SQL**:
   ```sql
   -- Migration: Add tags table
   -- Date: 2025-11-19
   -- ADR: ADR-025-tags-feature

   CREATE TABLE IF NOT EXISTS tags (
       id INTEGER PRIMARY KEY AUTOINCREMENT,
       name TEXT UNIQUE NOT NULL,
       created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
   );

   CREATE INDEX IF NOT EXISTS idx_tags_name ON tags(name);
   ```

3. **Test migration**:
   ```bash
   # Start application (migration runs automatically)
   flask --app app.py run

   # Check logs for migration success
   # [INFO] Applied migration: 002_add_tags_table.sql
   ```

4. **Commit migration**:
   ```bash
   git add migrations/002_add_tags_table.sql
   git commit -m "Add tags table migration"
   ```

**Migration Best Practices**:

1. **Make migrations additive**: Add columns, don't remove (mark as deprecated instead)
2. **Use defaults for new columns**: `ALTER TABLE ... ADD COLUMN ... DEFAULT ...`
3. **Write idempotent SQL**: Use `IF NOT EXISTS` where possible
4. **Test on copy of production database**: Verify migration works with real data
5. **Keep migrations small**: One logical change per migration
6. **Document purpose**: Include header comment explaining change

## Rationale

### Why Sequential Numbers Instead of Timestamps?

**Decision**: Use sequential numbers (`001`, `002`, `003`)

**Alternatives Considered**:
- Timestamps (`20251119_143522_add_tags.sql`)
- UUIDs (`a7b3c9d1-add-tags.sql`)
- Git SHAs (`a7b3c9d-add-tags.sql`)

**Rationale**:
1. **Simplicity**: Easy to see order at a glance
2. **No Conflicts**: Single developer unlikely to have conflicts
3. **Readability**: Shorter filenames
4. **Team Compatible**: Even with multiple developers, merge conflicts explicit
5. **Sortability**: Lexicographic sort equals execution order

**Trade-off**: Two developers working on separate branches may create conflicting numbers. Resolution is simple (renumber before merge).

### Why Run on Startup Instead of Manual Command?

**Decision**: Automatic execution on `create_app()`

**Alternatives Considered**:
- CLI command: `flask db migrate`
- Separate initialization script
- Container entrypoint script

**Rationale**:
1. **Container Friendly**: Containers self-initialize on startup
2. **Developer Friendly**: `git pull` + `flask run` just works
3. **No Forgotten Migrations**: Impossible to skip migrations
4. **Idempotent**: Safe to run multiple times
5. **Fail Fast**: Application won't start with incomplete schema

**Trade-off**: Application startup slightly slower (negligible for SQLite). Migrations must be fast (<1s each).

### Why SQLite Transaction Per Migration?

**Decision**: Each migration executes in its own transaction

**Alternatives Considered**:
- Single transaction for all migrations
- No transaction (auto-commit)

**Rationale**:
1. **Isolation**: Failed migration doesn't affect previously successful ones
2. **Resume**: Can continue from last successful migration
3. **Atomicity**: SQLite DDL is transactional, so a failed migration can roll back cleanly when the whole script runs inside one transaction
4. **Tracking**: Each successful migration recorded immediately

**Trade-off**: Python's `executescript()` commits any pending transaction before it runs, so the runner must place `BEGIN` inside the script. A migration file that issues its own `COMMIT` can still leave inconsistent state requiring manual fix.

### Why No Down Migrations?

**Decision**: Only forward migrations, no rollback

**Alternatives Considered**:
- Paired up/down migrations (Django, Rails style)
- Snapshot-based rollback

**Rationale**:
1. **Simplicity**: Half the code, half the complexity
2. **IndieWeb Philosophy**: Own your data, fix forward
3. **SQLite Limitations**: Limited ALTER TABLE support makes rollbacks difficult
4. **Production Reality**: Rollbacks rarely used, risky
5. **Alternative**: Restore from backup if needed

**Trade-off**: Cannot automatically rollback. Must fix forward or restore from backup.

### Why In-Application Instead of External Tool?

**Decision**: Migration runner built into application

**Alternatives Considered**:
- Alembic (SQLAlchemy migrations)
- Flask-Migrate (Flask + Alembic)
- Custom CLI tool

**Rationale**:
1. **No Dependencies**: Alembic adds complexity and dependencies
2. **Perfect for SQLite**: Simple file-based migrations sufficient
3. **Single Codebase**: No separate migration tool to maintain
4. **Minimal Code**: ~100 lines vs. thousands in Alembic
5. **Alignment**: "Every line must justify its existence"

**Trade-off**: Less powerful than Alembic (no auto-generation, model diffing). For StarPunk's simple schema, this is acceptable.

## Consequences

### Positive

1. **Zero-Touch Deployment**: Containers start with correct schema automatically
2. **Developer Productivity**: No manual migration tracking or execution
3. **Safer Deployments**: Migrations always applied in correct order
4. **Better Testing**: Test databases automatically migrated
5. **Audit Trail**: Clear history of schema changes in `schema_migrations`
6. **Idempotent**: Safe to run migrations multiple times
7. **Simple**: Easy to understand, debug, and maintain
8. **No Dependencies**: Pure Python + SQLite, no external tools

### Negative

1. **Startup Time**: Migrations add ~50-200ms to startup (negligible)
2. **No Auto-Generation**: Migrations must be written manually (acceptable for simple schema)
3. **No Rollback**: Cannot automatically undo migrations (restore from backup instead)
4. **SQLite Limitations**: Limited ALTER TABLE support; DDL rolls back only if the runner keeps each script inside one explicit transaction
5. **Sequential Conflicts**: Multiple developers may create conflicting numbers (rare, easy to fix)

### Neutral

1. **Migration File Management**: Developers must number migrations correctly
2. **Testing Requirement**: Migrations should be tested on production-like data
3. **Documentation Need**: Migration best practices should be documented

## Implementation Specification

### File Structure

```
starpunk/
├── migrations.py          # NEW: Migration runner
├── database.py            # MODIFIED: Call run_migrations()
├── __init__.py            # No changes
└── config.py              # No changes

migrations/                # EXISTING DIRECTORY
├── 001_add_code_verifier_to_auth_state.sql  # EXISTING
└── 002_*.sql              # Future migrations
```

### New File: `starpunk/migrations.py`

```python
"""
Database migration runner for StarPunk

Automatically discovers and applies pending migrations on startup.
Migrations are numbered SQL files in the migrations/ directory.
"""

import sqlite3
from pathlib import Path
import logging


class MigrationError(Exception):
    """Raised when a migration fails to apply"""
    pass


def create_migrations_table(conn):
    """
    Create schema_migrations tracking table if it doesn't exist

    Args:
        conn: SQLite connection
    """
    conn.execute("""
        CREATE TABLE IF NOT EXISTS schema_migrations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            migration_name TEXT UNIQUE NOT NULL,
            applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
    """)

    conn.execute("""
        CREATE INDEX IF NOT EXISTS idx_schema_migrations_name
        ON schema_migrations(migration_name)
    """)

    conn.commit()


def get_applied_migrations(conn):
    """
    Get set of already-applied migration names

    Args:
        conn: SQLite connection

    Returns:
        set: Set of migration filenames that have been applied
    """
    cursor = conn.execute(
        "SELECT migration_name FROM schema_migrations ORDER BY id"
    )
    return set(row[0] for row in cursor.fetchall())


def discover_migration_files(migrations_dir):
    """
    Discover all migration files in migrations directory

    Args:
        migrations_dir: Path to migrations directory

    Returns:
        list: Sorted list of (filename, full_path) tuples
    """
    if not migrations_dir.exists():
        return []

    migration_files = []
    for file_path in migrations_dir.glob("*.sql"):
        migration_files.append((file_path.name, file_path))

    # Sort by filename (numeric prefix ensures correct order)
    migration_files.sort(key=lambda x: x[0])

    return migration_files


def apply_migration(conn, migration_name, migration_path, logger=None):
    """
    Apply a single migration file

    Args:
        conn: SQLite connection
        migration_name: Filename of migration
        migration_path: Full path to migration file
        logger: Optional logger for output

    Raises:
        MigrationError: If migration fails to apply
    """
    try:
        # Read migration SQL
        migration_sql = migration_path.read_text()

        if logger:
            logger.debug(f"Applying migration: {migration_name}")

        # Execute migration in a transaction. executescript() commits any
        # pending transaction before it runs, so BEGIN is sent as part of
        # the script rather than as a separate statement.
        conn.executescript("BEGIN;\n" + migration_sql)

        # Record migration as applied
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)",
            (migration_name,)
        )

        conn.commit()

        if logger:
            logger.info(f"Applied migration: {migration_name}")

    except Exception as e:
        conn.rollback()
        error_msg = f"Migration {migration_name} failed: {e}"
        if logger:
            logger.error(error_msg)
        raise MigrationError(error_msg) from e


def run_migrations(db_path, logger=None):
    """
    Run all pending database migrations

    Called automatically during database initialization.
    Discovers migration files, checks which have been applied,
    and applies any pending migrations in order.

    Args:
        db_path: Path to SQLite database file
        logger: Optional logger for output

    Raises:
        MigrationError: If any migration fails to apply
    """
    if logger is None:
        logger = logging.getLogger(__name__)

    # Determine migrations directory
    # Assumes migrations/ is in project root, sibling to starpunk/
    migrations_dir = Path(__file__).parent.parent / "migrations"

    if not migrations_dir.exists():
        logger.warning(f"Migrations directory not found: {migrations_dir}")
        return

    # Connect to database
    conn = sqlite3.connect(db_path)

    try:
        # Ensure migrations tracking table exists
        create_migrations_table(conn)

        # Get already-applied migrations
        applied = get_applied_migrations(conn)

        # Discover migration files
        migration_files = discover_migration_files(migrations_dir)

        if not migration_files:
            logger.info("No migration files found")
            return

        # Apply pending migrations
        pending_count = 0
        for migration_name, migration_path in migration_files:
            if migration_name not in applied:
                apply_migration(conn, migration_name, migration_path, logger)
                pending_count += 1

        # Summary
        total_count = len(migration_files)
        if pending_count > 0:
            logger.info(
                f"Migrations complete: {pending_count} applied, "
                f"{total_count} total"
            )
        else:
            logger.info(f"All migrations up to date ({total_count} total)")

    except MigrationError:
        # Re-raise migration errors (already logged)
        raise

    except Exception as e:
        error_msg = f"Migration system error: {e}"
        logger.error(error_msg)
        raise MigrationError(error_msg) from e

    finally:
        conn.close()
```

### Modified File: `starpunk/database.py`

**Changes**:

1. Import migration runner at top:
   ```python
   from starpunk.migrations import run_migrations
   ```

2. Modify `init_db()` to call migrations:
   ```python
   def init_db(app=None):
       """
       Initialize database schema and run migrations

       Args:
           app: Flask application instance (optional, for config access)
       """
       if app:
           # Wrap in Path so .parent works even if the config value is a string
           db_path = Path(app.config["DATABASE_PATH"])
           logger = app.logger
       else:
           # Fallback to default path
           db_path = Path("./data/starpunk.db")
           logger = None

       # Ensure parent directory exists
       db_path.parent.mkdir(parents=True, exist_ok=True)

       # Create database and initial schema
       conn = sqlite3.connect(db_path)
       try:
           conn.executescript(SCHEMA_SQL)
           conn.commit()
           if logger:
               logger.info(f"Database initialized: {db_path}")
           else:
               print(f"Database initialized: {db_path}")
       finally:
           conn.close()

       # Run migrations
       run_migrations(db_path, logger=logger)
   ```

### Migration Tracking Table SQL

**Location**: Created automatically by `create_migrations_table()`

```sql
CREATE TABLE IF NOT EXISTS schema_migrations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    migration_name TEXT UNIQUE NOT NULL,
    applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_schema_migrations_name
    ON schema_migrations(migration_name);
```

### Example Migration File

**File**: `migrations/002_add_tags_table.sql`

```sql
-- Migration: Add tags table for note categorization
-- Date: 2025-11-19
-- ADR: ADR-025-tags-feature

-- Tags table
CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_tags_name ON tags(name);

-- Note-Tag junction table
CREATE TABLE IF NOT EXISTS note_tags (
    note_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL,
    PRIMARY KEY (note_id, tag_id),
    FOREIGN KEY (note_id) REFERENCES notes(id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);

CREATE INDEX IF NOT EXISTS idx_note_tags_note ON note_tags(note_id);
CREATE INDEX IF NOT EXISTS idx_note_tags_tag ON note_tags(tag_id);
```

## Testing Strategy

### Unit Tests

**Test File**: `tests/test_migrations.py`

**Test Cases**:

1. **test_create_migrations_table()**: Verify table created with correct schema
2. **test_get_applied_migrations()**: Verify retrieval of applied migrations
3. **test_discover_migration_files()**: Verify discovery and sorting
4. **test_apply_migration_success()**: Verify successful migration application
5. **test_apply_migration_failure()**: Verify error handling and rollback
6. **test_run_migrations_empty()**: Verify behavior with no migrations
7. **test_run_migrations_all_applied()**: Verify idempotency
8. **test_run_migrations_partial()**: Verify applying only pending migrations
9. **test_run_migrations_order()**: Verify migrations applied in correct order

### Integration Tests

**Test File**: `tests/test_database_init.py`

**Test Cases**:

1. **test_init_db_creates_schema_and_migrations()**: Verify full initialization
2. **test_init_db_idempotent()**: Verify safe to call multiple times
3. **test_migration_applied_on_startup()**: Verify app startup applies migrations

### Manual Testing

**Procedure**:

1. **Fresh Database**:
   ```bash
   rm data/starpunk.db
   flask --app app.py run
   # Verify: [INFO] Applied migration: 001_add_code_verifier_to_auth_state.sql
   ```

2. **Existing Database**:
   ```bash
   flask --app app.py run
   # Verify: [INFO] All migrations up to date (1 total)
   ```

3. **Add New Migration**:
   ```bash
   echo "-- Test migration" > migrations/002_test.sql
   flask --app app.py run
   # Verify: [INFO] Applied migration: 002_test.sql
   ```

4. **Migration Failure**:
   ```bash
   echo "INVALID SQL;" > migrations/003_fail.sql
   flask --app app.py run
   # Verify: [ERROR] Migration 003_fail.sql failed: near "INVALID": syntax error
   ```

5. **Container Startup**:
   ```bash
   docker run -v $(pwd)/data:/app/data starpunk
   # Verify: Migrations applied automatically
   ```

## Migration Management Guide

### Adding a New Migration

**Step-by-Step**:

1. **Determine next number**:
   ```bash
   ls migrations/ | tail -1
   # Output: 001_add_code_verifier_to_auth_state.sql
   # Next: 002
   ```

2. **Create migration file**:
   ```bash
   touch migrations/002_add_tags_table.sql
   ```

3. **Write migration SQL**:
   ```sql
   -- Migration: Add tags table
   -- Date: 2025-11-19
   -- ADR: ADR-025-tags-feature

   CREATE TABLE IF NOT EXISTS tags (
       id INTEGER PRIMARY KEY AUTOINCREMENT,
       name TEXT UNIQUE NOT NULL,
       created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
   );

   CREATE INDEX IF NOT EXISTS idx_tags_name ON tags(name);
   ```

4. **Test migration locally**:
   ```bash
   # Backup database
   cp data/starpunk.db data/starpunk.db.backup

   # Run application (migration auto-applies)
   flask --app app.py run

   # Check logs for success
   # Verify database schema
   sqlite3 data/starpunk.db ".schema tags"
   ```

5. **Commit migration**:
   ```bash
   git add migrations/002_add_tags_table.sql
   git commit -m "Add tags table migration"
   ```

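Determining the next number by eye is easy to get wrong once the directory grows. As a hedged sketch, the step could be automated with a small helper; `next_migration_filename` is a hypothetical function and not part of the specified system.

```python
import re
import tempfile
from pathlib import Path


def next_migration_filename(migrations_dir, description):
    """Build the next sequential migration filename for a description."""
    numbers = [
        int(match.group(1))
        for path in Path(migrations_dir).glob("*.sql")
        if (match := re.match(r"^(\d{3})_", path.name))
    ]
    # An empty directory starts the sequence at 001.
    next_number = max(numbers, default=0) + 1
    return f"{next_number:03d}_{description}.sql"


# Demonstration against a throwaway directory with one existing migration.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "001_add_code_verifier_to_auth_state.sql").touch()
    print(next_migration_filename(tmp, "add_tags_table"))
```

Taking the maximum existing number, instead of counting files, keeps the result correct even if a number was skipped earlier.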
### Handling Migration Conflicts

**Scenario**: Two developers create migration 002 on different branches

**Resolution**:

1. **Developer A**: Created `002_add_tags_table.sql` on `feature/tags`
2. **Developer B**: Created `002_add_comments_table.sql` on `feature/comments`
3. **Developer A merges first**: `002_add_tags_table.sql` is in main
4. **Developer B rebases**:
   ```bash
   git checkout feature/comments
   git rebase main
   # Git reports no conflict because the filenames differ,
   # but migrations/ now contains two 002_*.sql files

   # Renumber Developer B's migration
   git mv migrations/002_add_comments_table.sql \
          migrations/003_add_comments_table.sql

   git commit -m "Renumber comments migration to 003"
   ```

### Rolling Back a Migration

**Not Supported Automatically**

**Manual Procedure**:

1. **Restore from backup**:
   ```bash
   cp data/starpunk.db.backup data/starpunk.db
   ```

2. **OR Fix forward**:
   ```sql
   -- Create new migration that undoes change
   -- migrations/004_remove_tags_table.sql
   DROP TABLE IF EXISTS tags;
   ```

3. **OR Manual SQL**:
   ```bash
   sqlite3 data/starpunk.db
   sqlite> DROP TABLE tags;
   sqlite> DELETE FROM schema_migrations
           WHERE migration_name = '002_add_tags_table.sql';
   sqlite> .quit
   ```

### Best Practices for Writing Migrations

1. **Always use IF NOT EXISTS** for CREATE statements
2. **Always use DEFAULT** for new NOT NULL columns
3. **Test on production data copy** before deploying
4. **Keep migrations small** - one logical change per file
5. **Document purpose** in header comment
6. **Make migrations additive** when possible
7. **Avoid data transformations** in structure migrations
8. **Use descriptive names** for migration files

**Good Migration**:
```sql
-- Migration: Add published_url column to notes
-- Date: 2025-11-19
-- ADR: ADR-026-syndication-tracking

ALTER TABLE notes
    ADD COLUMN published_url TEXT DEFAULT NULL;

CREATE INDEX IF NOT EXISTS idx_notes_published_url
    ON notes(published_url);
```

**Bad Migration**:
```sql
-- Migration: Updates
-- Date: 2025-11-19

ALTER TABLE notes ADD COLUMN url TEXT NOT NULL;  -- FAILS: no default!
DROP TABLE old_stuff;  -- Destructive!
UPDATE notes SET ...;  -- Data transformation, hard to debug
```

## Version Impact

**Change Type**: Infrastructure improvement (new feature)

**Semantic Versioning Analysis**:
- **Adds new functionality**: Automatic migrations
- **Backward compatible**: Existing databases work, migrations optional
- **No breaking changes**: API unchanged, behavior compatible
- **Infrastructure improvement**: Developer experience enhancement

**Recommended Version**: MINOR increment (e.g., 0.8.0 → 0.9.0)

**Rationale**: Adds significant new functionality (automatic migrations) but maintains full backward compatibility.

## Compliance

### Project Standards

- **Minimal Code**: ~150 lines for complete migration system
- **No Dependencies**: Pure Python + SQLite, no external tools
- **Standards First**: Follows standard migration patterns
- **Single Responsibility**: Migration system does one thing well
- **Documentation as Code**: Migrations self-document schema changes

### Security Considerations

- **SQL Injection**: Migration files are trusted code (not user input)
- **File Access**: Only reads from trusted `migrations/` directory
- **Database Access**: Uses existing database connection patterns
- **Error Exposure**: Logs sanitized error messages only

### IndieWeb Compatibility

- **Data Ownership**: Migration tracking stored in user's database
- **Portability**: Standard SQL migrations, easily portable
- **Self-Hosting**: No external services required
- **Transparency**: Clear audit trail of schema changes

## References

### Migration System Patterns

- **Django Migrations**: Inspiration for tracking table
- **Rails ActiveRecord Migrations**: Inspiration for sequential numbering
- **Flyway**: Inspiration for SQL-based migrations
- **Alembic**: Considered but rejected (too complex for needs)

### SQLite Documentation

- **SQLite Transaction Support**: https://www.sqlite.org/lang_transaction.html
- **SQLite ALTER TABLE**: https://www.sqlite.org/lang_altertable.html
- **SQLite Limitations**: https://www.sqlite.org/limits.html

### Internal Documentation

- **ADR-004**: File-based note storage (similar pattern)
- **ADR-008**: Versioning strategy (migration impact)
- **docs/standards/versioning-strategy.md**: Version management

## Developer Questions & Architectural Responses

This section addresses critical implementation questions identified during developer review.

### Q1: SCHEMA_SQL Chicken-and-Egg Problem

**Question**: Current `SCHEMA_SQL` (line 60 in `database.py`) already includes `code_verifier TEXT NOT NULL DEFAULT ''` in the `auth_state` table. Migration `001_add_code_verifier_to_auth_state.sql` tries to add the same column. On fresh databases, this fails because the column already exists.

**Decision**: **SCHEMA_SQL represents the complete target state** (current schema after all migrations applied)

**Rationale**:
- Fresh installs should get the latest schema immediately (no migration overhead)
- Existing installs need migrations to reach the target state
- This resembles the approach of frameworks such as Rails, which can load a current schema file on fresh databases instead of replaying every migration
- Migrations are time-based snapshots, SCHEMA_SQL is the destination

**Implementation**:

1. **Keep `code_verifier` in SCHEMA_SQL** - It's part of the current schema
2. **Migration 001 is for existing databases only** - Databases created before PKCE feature
3. **Auto-skip migrations on fresh installs** - Detect and skip migrations already in SCHEMA_SQL

**Solution Pattern**:
```python
def run_migrations(db_path, logger=None):
    # ... existing code ...

    # Check if this is a fresh database (no schema_migrations table existed before we created it)
    cursor = conn.execute(
        "SELECT COUNT(*) FROM schema_migrations"
    )
    migration_count = cursor.fetchone()[0]

    # If fresh database (0 migrations recorded), mark all migrations as applied
    # since SCHEMA_SQL already contains all changes
    if migration_count == 0:
        for migration_name, _ in migration_files:
            conn.execute(
                "INSERT INTO schema_migrations (migration_name) VALUES (?)",
                (migration_name,)
            )
        conn.commit()
        logger.info(f"Fresh database: marked {len(migration_files)} migrations as applied")
        return

    # Otherwise, apply pending migrations normally
    # ... existing migration application code ...
```

**Consequence**: Fresh installs never run migrations (already at target state), existing installs run only pending migrations.

### Q2: schema_migrations Table Location

**Question**: Should the `schema_migrations` table be in `SCHEMA_SQL` or only created by `migrations.py`?

**Decision**: **Only in migrations.py** - Do NOT add to SCHEMA_SQL

**Rationale**:
1. **Separation of Concerns**: Migration tracking is infrastructure, not application schema
2. **Detection Mechanism**: A tracking table that is absent, or empty right after creation, indicates a database that has never been tracked (see Q1 solution)
3. **Cleaner Schema**: Application schema stays focused on application tables
4. **Migration System Ownership**: Migration system creates its own tracking table

**Implementation**:
- `create_migrations_table()` in `migrations.py` creates the table
- `SCHEMA_SQL` remains unchanged (no `schema_migrations` table)
- Fresh database detection relies on the tracking table having no rows on first run

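As an illustration of the ownership rule above, here is a minimal sketch of what `create_migrations_table()` could look like. Only the `migration_name` column is confirmed by the INSERT statements in this ADR; the `id` and `applied_at` columns and the `UNIQUE` constraint are assumptions, and `demo_tracking_table()` is a hypothetical helper that exists only for this example.

```python
import sqlite3


def create_migrations_table(conn: sqlite3.Connection) -> None:
    """Create the tracking table if it is missing (safe to call repeatedly)."""
    # Assumed layout: only migration_name is confirmed by this ADR's INSERTs.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS schema_migrations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            migration_name TEXT UNIQUE NOT NULL,
            applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()


def demo_tracking_table() -> int:
    """Create the table twice in memory and return the recorded row count."""
    conn = sqlite3.connect(":memory:")
    try:
        create_migrations_table(conn)
        create_migrations_table(conn)  # second call changes nothing
        row = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()
        return row[0]
    finally:
        conn.close()
```

Because the statement uses `CREATE TABLE IF NOT EXISTS`, the function can run on every startup, and a newly created table starts with zero rows, which is the signal the detection logic reads.
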
### Q3: ALTER TABLE Idempotency

**Question**: SQLite doesn't support `IF NOT EXISTS` for `ALTER TABLE ADD COLUMN`. How do we make migrations idempotent?

**Decision**: **Accept non-idempotency, rely on migration tracking**

**Rationale**:
1. **SQL Limitation**: SQLite ALTER TABLE operations are not inherently idempotent
2. **Tracking Is Sufficient**: `schema_migrations` table prevents re-application
3. **Failure Handling**: Failed migrations leave clear error messages
4. **Production Reality**: Migrations rarely fail, and when they do, they need manual intervention anyway

**Implementation**:

**For Fresh Databases** (Q1 solution):
- All migrations automatically marked as applied
- Never actually executed (schema already complete)
- No idempotency issue

**For Existing Databases**:
- Migration tracking prevents re-running
- If a migration fails, manual intervention is required:

```bash
# Option 1: Fix the issue and re-run (migration will retry)
sqlite3 data/starpunk.db "ALTER TABLE ..."
# Then restart app - migration will succeed

# Option 2: Mark as applied manually (if change already exists)
sqlite3 data/starpunk.db \
  "INSERT INTO schema_migrations (migration_name) VALUES ('001_...');"
```

**Helper Function** (optional, if needed):
```python
def column_exists(conn, table_name, column_name):
    """Check if column exists in table (helper for conditional migrations)"""
    # table_name is interpolated into the SQL text, so pass only trusted,
    # hard-coded table names (never user input).
    cursor = conn.execute(f"PRAGMA table_info({table_name})")
    columns = [row[1] for row in cursor.fetchall()]  # row[1] is the column name
    return column_name in columns
```

**Use in Migration** (if absolutely necessary):
```python
# Conditional guard in Python (rare case where you need idempotency).
# A plain .sql migration file cannot branch on column existence.
if not column_exists(conn, 'auth_state', 'code_verifier'):
    conn.execute("ALTER TABLE auth_state ADD COLUMN code_verifier TEXT NOT NULL DEFAULT ''")
```

**Decision**: Do NOT use helper functions by default. Only add one if a specific migration requires it. Prefer the Q1 solution (fresh database detection).

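The non-idempotency discussed above can be reproduced in a few lines. This is a sketch against an in-memory database; the one-column `auth_state` table is a simplified stand-in for the real table, and `add_column_twice()` is a hypothetical name used only here.

```python
import sqlite3


def add_column_twice() -> str:
    """Run the same ADD COLUMN twice and return the second attempt's error text."""
    conn = sqlite3.connect(":memory:")
    try:
        # Simplified stand-in for the real auth_state table (assumed columns).
        conn.execute("CREATE TABLE auth_state (state TEXT PRIMARY KEY)")
        alter = (
            "ALTER TABLE auth_state "
            "ADD COLUMN code_verifier TEXT NOT NULL DEFAULT ''"
        )
        conn.execute(alter)  # first run succeeds
        try:
            conn.execute(alter)  # second run is rejected by SQLite
        except sqlite3.OperationalError as exc:
            return str(exc)
        return ""
    finally:
        conn.close()
```

The second statement raises `sqlite3.OperationalError` with a duplicate-column message, which is why the design leans on the `schema_migrations` table rather than on the SQL itself to prevent re-application.
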
### Q4: Migration Filename Validation

**Question**: Should we enforce strict `\d{3}_description.sql` pattern or be flexible?

**Decision**: **Flexible with strong convention**

**Pattern**:
- **Recommended**: `\d{3}_lowercase_with_underscores.sql` (e.g., `001_add_code_verifier.sql`)
- **Required**: Must be a `.sql` file whose name starts with digits and sorts in the intended order
- **Sorting**: A plain string sort of the filenames determines execution order, so numeric prefixes should be zero-padded to a consistent width

**Rationale**:
1. **Simplicity**: Glob pattern `*.sql` + string sort is simplest
2. **Error Tolerance**: Don't fail on filename format (warn instead)
3. **Developer Freedom**: Allow variations (001.sql, 0001_desc.sql, etc.), as long as one prefix width is used consistently
4. **Order Matters**: Only requirement is deterministic sort order

**Implementation**:
```python
def discover_migration_files(migrations_dir):
    """
    Discover all migration files in migrations directory

    Files must be .sql and sortable by filename
    """
    if not migrations_dir.exists():
        return []

    migration_files = []
    for file_path in migrations_dir.glob("*.sql"):
        migration_files.append((file_path.name, file_path))

    # Sort by filename (string comparison, so 001_... runs before 002_...)
    migration_files.sort(key=lambda x: x[0])

    return migration_files
```

**Validation** (optional warning):
```python
import re

RECOMMENDED_PATTERN = re.compile(r'^\d{3}_[a-z0-9_]+\.sql$')

# migration_files and logger come from the surrounding run_migrations() code
for migration_name, _ in migration_files:
    if not RECOMMENDED_PATTERN.match(migration_name):
        logger.warning(
            f"Migration {migration_name} doesn't follow recommended pattern: "
            f"NNN_lowercase_description.sql"
        )
```

**Decision**: Implement glob + sort (required); the filename warning is optional.

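To make the sort-order requirement concrete, the sketch below compares the plain string sort used by `discover_migration_files()` with a numeric-prefix sort. The filenames are invented for illustration, and `numeric_prefix_order()` is a hypothetical alternative that this ADR does not adopt.

```python
import re


def lexicographic_order(names):
    """Order produced by a plain string sort, as in discover_migration_files()."""
    return sorted(names)


def numeric_prefix_order(names):
    """Hypothetical alternative: sort by the integer value of the leading digits."""
    def key(name):
        match = re.match(r"\d+", name)
        # Files without a numeric prefix sort last; the name breaks ties.
        return (int(match.group()) if match else float("inf"), name)

    return sorted(names, key=key)


# Zero-padded prefixes sort as intended under a string sort.
padded = ["002_add_tags.sql", "010_add_index.sql", "001_init.sql"]

# Unpadded prefixes do not: "10_..." sorts before "1_..." and "2_...".
unpadded = ["2_add_tags.sql", "10_add_index.sql", "1_init.sql"]
```

With the recommended three-digit prefix the two orderings agree, which is why the simpler string sort is sufficient as long as the padding convention is followed.
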
### Q5: Existing Database Migration Path

**Question**: How do existing StarPunk users transition when they upgrade to the version with automatic migrations?

**Decision**: **Automatic and transparent**

**Scenario Analysis**:

**Scenario A: Database created BEFORE code_verifier feature**
- Database exists, has auth_state table WITHOUT code_verifier column
- User upgrades to version with automatic migrations
- On startup: `run_migrations()` executes
- Migration 001 runs: `ALTER TABLE auth_state ADD COLUMN code_verifier...`
- Result: Database updated, migration tracked

**Scenario B: Database created AFTER code_verifier feature but BEFORE automatic migrations**
- Database exists, has auth_state table WITH code_verifier column
- `schema_migrations` table does NOT exist
- User upgrades to version with automatic migrations
- On startup: `run_migrations()` executes
- `create_migrations_table()` creates tracking table
- **Problem**: Migration 001 will try to add existing column and FAIL

**Solution for Scenario B**: Fresh database detection (Q1 solution)
```python
# Inside run_migrations(), after create_migrations_table(conn):
# Detect if database has code_verifier but no migration tracking
# This indicates database created between PKCE feature and migration system

cursor = conn.execute("PRAGMA table_info(auth_state)")
columns = [row[1] for row in cursor.fetchall()]
has_code_verifier = 'code_verifier' in columns

cursor = conn.execute("SELECT COUNT(*) FROM schema_migrations")
migration_count = cursor.fetchone()[0]

if migration_count == 0 and has_code_verifier:
    # Database created after PKCE but before migrations
    # Mark all migrations as applied
    for migration_name, _ in migration_files:
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)",
            (migration_name,)
        )
    conn.commit()
    logger.info("Existing database: migrations marked as applied")
    return
```

**Refined Solution**: Check if SCHEMA_SQL is already applied
```python
def is_schema_current(conn):
    """
    Check if database schema matches current SCHEMA_SQL
    Heuristic: Check for latest schema feature (code_verifier column)
    """
    cursor = conn.execute("PRAGMA table_info(auth_state)")
    columns = [row[1] for row in cursor.fetchall()]
    return 'code_verifier' in columns

def run_migrations(db_path, logger=None):
    # ... setup ...

    cursor = conn.execute("SELECT COUNT(*) FROM schema_migrations")
    migration_count = cursor.fetchone()[0]

    # If no migrations recorded AND schema is current, mark all as applied
    if migration_count == 0 and is_schema_current(conn):
        for migration_name, _ in migration_files:
            conn.execute(
                "INSERT INTO schema_migrations (migration_name) VALUES (?)",
                (migration_name,)
            )
        conn.commit()
        logger.info(f"Database up-to-date: marked {len(migration_files)} migrations as applied")
        return

    # Otherwise apply pending migrations...
```

**Maintenance Note**: The heuristic inspects a single column, so `is_schema_current()` should be revisited whenever a later migration becomes the newest schema feature.

**User Impact**: None. Upgrade is automatic and transparent.

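The scenarios above can be exercised end to end with in-memory databases. This is a sketch under simplifying assumptions: the `auth_state` and `schema_migrations` tables are reduced to one or two columns, and `classify()`, `make_db()`, and `scenario()` are hypothetical helpers, not part of `migrations.py`.

```python
import sqlite3

TRACKING_DDL = (
    "CREATE TABLE IF NOT EXISTS schema_migrations "
    "(migration_name TEXT UNIQUE NOT NULL)"
)


def classify(conn: sqlite3.Connection) -> str:
    """Return the action the Q5 logic would take for this database."""
    conn.execute(TRACKING_DDL)
    recorded = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()[0]
    columns = [row[1] for row in conn.execute("PRAGMA table_info(auth_state)")]
    if recorded == 0 and "code_verifier" in columns:
        return "mark-all-applied"
    return "apply-pending"


def make_db(with_code_verifier: bool, tracked: bool) -> sqlite3.Connection:
    """Build an in-memory database in one of the upgrade scenarios."""
    conn = sqlite3.connect(":memory:")
    extra = ", code_verifier TEXT NOT NULL DEFAULT ''" if with_code_verifier else ""
    conn.execute(f"CREATE TABLE auth_state (state TEXT PRIMARY KEY{extra})")
    if tracked:
        conn.execute(TRACKING_DDL)
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)",
            ("001_add_code_verifier_to_auth_state.sql",),
        )
    return conn


def scenario(with_code_verifier: bool, tracked: bool) -> str:
    """Classify one scenario and close its connection afterwards."""
    conn = make_db(with_code_verifier, tracked)
    try:
        return classify(conn)
    finally:
        conn.close()


scenario_a = scenario(with_code_verifier=False, tracked=False)  # legacy database
scenario_b = scenario(with_code_verifier=True, tracked=False)   # mid-version database
already_tracked = scenario(with_code_verifier=True, tracked=True)
```

Scenario A falls through to normal migration, Scenario B is marked as applied without executing anything, and a database that already has tracking rows is handled by the ordinary pending-migration path.
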
### Q6: Column Existence Helpers

**Question**: Should we provide helper functions for checking column/table existence, or keep it pure SQL?

**Decision**: **Provide optional helper, but don't use by default**

**Rationale**:
- Primary solution is fresh database detection (Q1/Q5)
- Helpers useful for edge cases only
- Keep migration system simple by default
- Document helpers for future use

**Helpers to Provide** (in `migrations.py`):
```python
def table_exists(conn, table_name):
    """Check if table exists in database"""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
        (table_name,)
    )
    return cursor.fetchone() is not None

def column_exists(conn, table_name, column_name):
    """Check if column exists in table"""
    cursor = conn.execute(f"PRAGMA table_info({table_name})")
    columns = [row[1] for row in cursor.fetchall()]
    return column_name in columns

def index_exists(conn, index_name):
    """Check if index exists in database"""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='index' AND name=?",
        (index_name,)
    )
    return cursor.fetchone() is not None
```

**Documentation**:
```python
"""
Helper functions for conditional migrations (advanced usage only)

These are provided for edge cases where migrations need conditional logic.
In most cases, the migration system's fresh database detection handles
idempotency automatically.

Example usage in migration:
    from starpunk.migrations import column_exists

    if not column_exists(conn, 'notes', 'published_url'):
        conn.execute("ALTER TABLE notes ADD COLUMN published_url TEXT")
"""
```

**Decision**: Include helpers in `migrations.py`, document as "advanced usage", don't use in migration 001 (use fresh DB detection instead).

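The `column_exists()` helper interpolates the table name into the SQL text, which is acceptable for hard-coded names. As a possible variant, the sketch below binds both names as parameters through SQLite's `pragma_table_info` table-valued function. It assumes the bundled SQLite is version 3.16 or newer, and `column_exists_bound()` and `demo_column_checks()` are hypothetical names that do not appear in this ADR.

```python
import sqlite3


def column_exists_bound(conn, table_name, column_name):
    """Variant of column_exists() that binds both names as parameters."""
    cursor = conn.execute(
        "SELECT 1 FROM pragma_table_info(?) WHERE name = ?",
        (table_name, column_name),
    )
    return cursor.fetchone() is not None


def demo_column_checks():
    """Return results for an existing column, a missing column, and a missing table."""
    conn = sqlite3.connect(":memory:")
    try:
        # Toy notes table; the real one has more columns.
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, slug TEXT)")
        return (
            column_exists_bound(conn, "notes", "slug"),
            column_exists_bound(conn, "notes", "published_url"),
            column_exists_bound(conn, "missing_table", "slug"),
        )
    finally:
        conn.close()
```

A missing table yields no rows rather than an error, so the function returns `False` in that case without needing exception handling.
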
### Q7: SCHEMA_SQL Purpose Clarification

**Question**: What should `SCHEMA_SQL` represent - initial state, current state, or minimal state?

**Decision**: **SCHEMA_SQL is the complete current state** (target schema after all migrations)

**Definition**:
```
SCHEMA_SQL = {Complete database schema as of the current version}
           = {Initial schema} + {All migrations applied}
```

**Guidelines**:

1. **When adding a new feature with schema changes**:
   - Add new tables/columns to SCHEMA_SQL
   - Create migration file for existing databases
   - Example: PKCE feature added `code_verifier` to both SCHEMA_SQL and migration 001

2. **When creating a fresh database**:
   - Execute SCHEMA_SQL → complete schema immediately
   - Mark all migrations as applied (never execute them)

3. **When upgrading existing database**:
   - SCHEMA_SQL already executed (during original creation)
   - Run pending migrations to reach current state
   - Each migration is a "delta" to reach SCHEMA_SQL

**Maintenance Rules**:

**DO**:
- Update SCHEMA_SQL when schema changes
- Create migration for same change
- Keep SCHEMA_SQL as single source of truth for "current state"

**DON'T**:
- Remove changes from SCHEMA_SQL (only add)
- Create migration without updating SCHEMA_SQL
- Expect SCHEMA_SQL to be "minimal" or "initial"

**Example Workflow**:

**Adding Tags Feature**:
1. Update SCHEMA_SQL: Add `tags` table and `note_tags` junction table
2. Create migration: `002_add_tags_table.sql` with same SQL
3. Fresh installs: Get tags via SCHEMA_SQL, migration 002 marked as applied
4. Existing installs: Migration 002 executes, adds tags table

**Migration 002 SQL** (mirrors SCHEMA_SQL):
```sql
-- Migration: Add tags table
-- Date: 2025-11-19

CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_tags_name ON tags(name);

CREATE TABLE IF NOT EXISTS note_tags (
    note_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL,
    PRIMARY KEY (note_id, tag_id),
    FOREIGN KEY (note_id) REFERENCES notes(id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);

CREATE INDEX IF NOT EXISTS idx_note_tags_note ON note_tags(note_id);
CREATE INDEX IF NOT EXISTS idx_note_tags_tag ON note_tags(tag_id);
```

**SCHEMA_SQL Update** (same content as migration):
```python
SCHEMA_SQL = """
-- Notes metadata (content is in files)
CREATE TABLE IF NOT EXISTS notes (
    -- ... existing ...
);

-- ... existing tables ...

-- Tags table (added in migration 002)
CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_tags_name ON tags(name);

-- Note-Tag junction table (added in migration 002)
CREATE TABLE IF NOT EXISTS note_tags (
    note_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL,
    PRIMARY KEY (note_id, tag_id),
    FOREIGN KEY (note_id) REFERENCES notes(id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);

CREATE INDEX IF NOT EXISTS idx_note_tags_note ON note_tags(note_id);
CREATE INDEX IF NOT EXISTS idx_note_tags_tag ON note_tags(tag_id);
"""
```

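The rule that SCHEMA_SQL equals the initial schema plus all migrations lends itself to an automated drift check. The sketch below is one possible approach, not something this ADR specifies: it uses toy schemas in place of the project's real ones, and `schema_shape()` is a hypothetical helper that compares only table and column names.

```python
import sqlite3

# Toy stand-ins for illustration, not the project's real schema.
INITIAL_SQL = "CREATE TABLE notes (id INTEGER PRIMARY KEY, slug TEXT NOT NULL);"
MIGRATIONS = [
    "CREATE TABLE IF NOT EXISTS tags ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE NOT NULL);",
]
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, slug TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE NOT NULL
);
"""


def schema_shape(scripts):
    """Run SQL scripts in a fresh in-memory database; return {table: [columns]}."""
    conn = sqlite3.connect(":memory:")
    try:
        for script in scripts:
            conn.executescript(script)
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type='table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
            )
        ]
        return {
            # Table names come from sqlite_master, so interpolation is safe here.
            table: [col[1] for col in conn.execute(f"PRAGMA table_info({table})")]
            for table in tables
        }
    finally:
        conn.close()


fresh_install = schema_shape([SCHEMA_SQL])
upgraded_install = schema_shape([INITIAL_SQL, *MIGRATIONS])
```

If a migration is added without the matching SCHEMA_SQL update (or the reverse), the two dictionaries differ, so a test asserting their equality would catch the drift that the DO/DON'T rules warn about.
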
## Summary of Architectural Decisions

| Question | Decision | Implementation |
|----------|----------|----------------|
| **Q1: Chicken-and-egg problem** | SCHEMA_SQL is target state, auto-skip migrations on fresh DBs | Fresh database detection in `run_migrations()` |
| **Q2: schema_migrations location** | Only in migrations.py, NOT in SCHEMA_SQL | `create_migrations_table()` creates it |
| **Q3: ALTER TABLE idempotency** | Accept non-idempotency, rely on tracking | Migration tracking prevents re-runs |
| **Q4: Filename validation** | Flexible: `*.sql` + alphanumeric sort | No strict validation, warn if off-pattern |
| **Q5: Existing database transition** | Automatic via fresh DB detection | Check `code_verifier` existence heuristic |
| **Q6: Column helpers** | Provide but don't use by default | Include in `migrations.py` for advanced use |
| **Q7: SCHEMA_SQL purpose** | Complete current state (target schema) | Update SCHEMA_SQL with every schema change |

## Implementation Specification Updates

### Modified: `starpunk/migrations.py`

**Add fresh database detection**:

```python
import logging
import sqlite3
from pathlib import Path

# create_migrations_table, discover_migration_files, get_applied_migrations,
# apply_migration, and MigrationError are assumed to be defined elsewhere
# in migrations.py.


def is_schema_current(conn):
    """
    Check if database schema is current (matches SCHEMA_SQL)

    Uses heuristic: Check for presence of latest schema features
    Currently checks for code_verifier column in auth_state table

    Args:
        conn: SQLite connection

    Returns:
        bool: True if schema appears current, False if legacy
    """
    try:
        cursor = conn.execute("PRAGMA table_info(auth_state)")
        columns = [row[1] for row in cursor.fetchall()]
        return 'code_verifier' in columns
    except sqlite3.OperationalError:
        # Defensive guard. A missing table does not raise: PRAGMA table_info
        # returns no rows, so the check above already yields False.
        return False


def table_exists(conn, table_name):
    """Check if table exists in database"""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
        (table_name,)
    )
    return cursor.fetchone() is not None


def column_exists(conn, table_name, column_name):
    """Check if column exists in table"""
    try:
        cursor = conn.execute(f"PRAGMA table_info({table_name})")
        columns = [row[1] for row in cursor.fetchall()]
        return column_name in columns
    except sqlite3.OperationalError:
        return False


def index_exists(conn, index_name):
    """Check if index exists in database"""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='index' AND name=?",
        (index_name,)
    )
    return cursor.fetchone() is not None


def run_migrations(db_path, logger=None):
    """
    Run all pending database migrations

    Fresh Database Behavior:
    - If schema_migrations table is empty AND schema is current
    - Marks all migrations as applied (skip execution)
    - This handles databases created with current SCHEMA_SQL

    Existing Database Behavior:
    - Applies only pending migrations
    - Migrations already in schema_migrations are skipped

    Args:
        db_path: Path to SQLite database file
        logger: Optional logger for output

    Raises:
        MigrationError: If any migration fails to apply
    """
    if logger is None:
        logger = logging.getLogger(__name__)

    # Determine migrations directory
    migrations_dir = Path(__file__).parent.parent / "migrations"

    if not migrations_dir.exists():
        logger.warning(f"Migrations directory not found: {migrations_dir}")
        return

    # Connect to database
    conn = sqlite3.connect(db_path)

    try:
        # Ensure migrations tracking table exists
        create_migrations_table(conn)

        # Count recorded migrations (zero means the database is untracked)
        cursor = conn.execute("SELECT COUNT(*) FROM schema_migrations")
        migration_count = cursor.fetchone()[0]

        # Discover migration files
        migration_files = discover_migration_files(migrations_dir)

        if not migration_files:
            logger.info("No migration files found")
            return

        # Fresh database detection
        if migration_count == 0:
            if is_schema_current(conn):
                # Schema is current - mark all migrations as applied
                for migration_name, _ in migration_files:
                    conn.execute(
                        "INSERT INTO schema_migrations (migration_name) VALUES (?)",
                        (migration_name,)
                    )
                conn.commit()
                logger.info(
                    f"Fresh database detected: marked {len(migration_files)} "
                    f"migrations as applied (schema already current)"
                )
                return
            else:
                logger.info("Legacy database detected: applying all migrations")

        # Get already-applied migrations
        applied = get_applied_migrations(conn)

        # Apply pending migrations
        pending_count = 0
        for migration_name, migration_path in migration_files:
            if migration_name not in applied:
                apply_migration(conn, migration_name, migration_path, logger)
                pending_count += 1

        # Summary
        total_count = len(migration_files)
        if pending_count > 0:
            logger.info(
                f"Migrations complete: {pending_count} applied, "
                f"{total_count} total"
            )
        else:
            logger.info(f"All migrations up to date ({total_count} total)")

    except MigrationError:
        raise

    except Exception as e:
        error_msg = f"Migration system error: {e}"
        logger.error(error_msg)
        # Chain the original exception so the traceback keeps the root cause
        raise MigrationError(error_msg) from e

    finally:
        conn.close()
```

### Modified: `SCHEMA_SQL` Maintenance

**SCHEMA_SQL does NOT change** - it already includes `code_verifier` (correct).

**Rule for Future Changes**:
1. Add new schema elements to `SCHEMA_SQL`
2. Create corresponding migration file
3. Migration contains same SQL as SCHEMA_SQL addition
4. Fresh installs get it from SCHEMA_SQL
5. Existing installs get it from migration

### Migration 001 Status

**No changes needed** to `001_add_code_verifier_to_auth_state.sql`

**Behavior**:
- **Fresh databases**: Never executes (marked as applied via fresh DB detection)
- **Legacy databases** (before PKCE): Executes successfully (column doesn't exist)
- **Mid-version databases** (after PKCE, before migrations): Never executes (fresh DB detection)

**This is correct and requires no changes.**

## What We Learned

1. **Simplicity Wins**: 150 lines beats thousands in Alembic for our use case
2. **Container Requirements**: Modern deployment requires automatic initialization
3. **SQLite Is Sufficient**: No need for complex migration frameworks
4. **Sequential Works**: Numbering beats timestamps for small teams
5. **Forward-Only Is OK**: Rollback capability rarely needed in practice
6. **Fresh DB Detection Solves Bootstrap**: Heuristic check prevents chicken-and-egg problems
7. **SCHEMA_SQL as Target State**: Clearest mental model for developers
8. **Migration Tracking Is Primary Safety**: SQL idempotency is secondary

---

**Decided**: 2025-11-19
**Updated**: 2025-11-19 (Developer Q&A section added)
**Author**: StarPunk Architect
**Implements**: Automatic database migration system
**Version Impact**: MINOR increment recommended