feat(tags): Add database schema and tags module (v1.3.0 Phase 1)
Implements tag/category system backend following microformats2 p-category specification.

Database changes:
- Migration 008: Add tags and note_tags tables
- Normalized tag storage (case-insensitive lookup, display name preserved)
- Indexes for performance

New module:
- starpunk/tags.py: Tag management functions
  - normalize_tag: Normalize tag strings
  - get_or_create_tag: Get or create tag records
  - add_tags_to_note: Associate tags with notes (replaces existing)
  - get_note_tags: Retrieve note tags (alphabetically ordered)
  - get_tag_by_name: Lookup tag by normalized name
  - get_notes_by_tag: Get all notes with specific tag
  - parse_tag_input: Parse comma-separated tag input

Model updates:
- Note.tags property (lazy-loaded, prefer pre-loading in routes)
- Note.to_dict() add include_tags parameter

CRUD updates:
- create_note() accepts tags parameter
- update_note() accepts tags parameter (None = no change, [] = remove all)

Micropub integration:
- Pass tags to create_note() (tags already extracted by extract_tags())
- Return tags in q=source response

Per design doc: docs/design/v1.3.0/microformats-tags-design.md

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
145
docs/design/v1.0.0/migration-failure-diagnosis-v1.0.0-rc.1.md
Normal file
@@ -0,0 +1,145 @@
# Migration Failure Diagnosis - v1.0.0-rc.1

## Executive Summary

The v1.0.0-rc.1 container is experiencing a critical startup failure caused by an **ordering flaw in the database initialization and migration system**. The error `sqlite3.OperationalError: no such column: token_hash` occurs when `SCHEMA_SQL` tries to create an index on `tokens.token_hash` in an existing database whose `tokens` table still has the old structure. Migration 002, which drops and recreates the table with that column, has not run yet at that point.

## Root Cause Analysis

### The Execution Order Problem

1. **Database Initialization** (`init_db()` in `database.py:94-127`)
   - Line 115: `conn.executescript(SCHEMA_SQL)` - Creates initial schema
   - Line 126: `run_migrations()` - Applies pending migrations

2. **SCHEMA_SQL Definition** (`database.py:46-60`)
   - Creates `tokens` table WITH `token_hash` column (lines 46-56)
   - Creates indexes including `idx_tokens_hash` (line 58)

3. **Migration 002** (`002_secure_tokens_and_authorization_codes.sql`)
   - Line 17: `DROP TABLE IF EXISTS tokens;`
   - Lines 20-30: Creates NEW `tokens` table with the same structure as `SCHEMA_SQL`
   - Lines 49-51: Creates indexes again

### The Critical Issue

For an **existing production database** (v0.9.5):

1. Database already has an OLD `tokens` table (without `token_hash` column)
2. `init_db()` runs `SCHEMA_SQL` which includes:
   ```sql
   CREATE TABLE IF NOT EXISTS tokens (
       ...
       token_hash TEXT UNIQUE NOT NULL,
       ...
   );
   CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
   ```
3. The `CREATE TABLE IF NOT EXISTS` is a no-op (table exists), so the old structure is left untouched
4. The `CREATE INDEX` tries to create an index on `token_hash` column
5. **ERROR**: Column `token_hash` doesn't exist in the old table structure
6. Container crashes before migrations can run

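
The sequence above can be reproduced in isolation with Python's built-in `sqlite3` module. This is a minimal sketch, not the project's code: `OLD_TOKENS_SQL` is a hypothetical stand-in for the v0.9.5 table (the real column list is not shown in this document), and `SCHEMA_SQL` here is a trimmed stand-in for the `tokens` portion of the real constant.

```python
import sqlite3
from typing import Optional

# Hypothetical stand-in for the v0.9.5 table: plain-text token, no token_hash.
OLD_TOKENS_SQL = """
CREATE TABLE tokens (
    id INTEGER PRIMARY KEY,
    token TEXT NOT NULL,
    me TEXT NOT NULL
);
"""

# Trimmed stand-in for the tokens portion of SCHEMA_SQL in v1.0.0-rc.1.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS tokens (
    id INTEGER PRIMARY KEY,
    token_hash TEXT UNIQUE NOT NULL,
    me TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash);
"""


def apply_schema(conn: sqlite3.Connection) -> Optional[str]:
    """Run SCHEMA_SQL; return the error message if it fails, else None."""
    try:
        conn.executescript(SCHEMA_SQL)
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


# Fresh database: CREATE TABLE builds the new structure, so the index succeeds.
fresh = sqlite3.connect(":memory:")
fresh_error = apply_schema(fresh)

# Existing database: CREATE TABLE IF NOT EXISTS is a no-op, so the index
# statement references a column the old table does not have.
existing = sqlite3.connect(":memory:")
existing.executescript(OLD_TOKENS_SQL)
existing_error = apply_schema(existing)

print("fresh database error:   ", fresh_error)
print("existing database error:", existing_error)
```

The fresh database reports no error, while the existing one fails on the `CREATE INDEX` statement, which matches the behavior described in steps 3 to 5.
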
### Why This Wasn't Caught Earlier

- **Fresh databases** work fine - SCHEMA_SQL creates the correct structure
- **Test environments** likely started fresh or had the new schema
- **Production** has an existing v0.9.5 database with the old `tokens` table structure

## The Schema Evolution Mismatch

### Original tokens table (v0.9.5)
The old structure likely had columns like:
- `token` (plain text - security issue)
- `me`
- `client_id`
- `scope`
- etc.

### New tokens table (v1.0.0-rc.1)
- `token_hash` (SHA-256 hash - secure)
- Same other columns

### The Problem
SCHEMA_SQL was updated to match the POST-migration structure, but it runs BEFORE migrations. On an existing database it therefore references a column that only exists after migration 002 has run, so initialization cannot succeed.

## Migration System Design Flaw

The current system has a fundamental ordering issue:

1. **SCHEMA_SQL** should represent the INITIAL schema (v0.1.0)
2. **Migrations** should evolve from that base
3. **Current Reality**: SCHEMA_SQL represents the LATEST schema

This works for fresh databases but fails for existing ones that need migration.

## Recommended Fix

### Option 1: Conditional Index Creation (Quick Fix)
Either guard the index creation in SCHEMA_SQL with conditional logic, or remove the problematic indexes from SCHEMA_SQL altogether, since migration 002 creates them anyway.

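
One way the conditional variant of Option 1 could look is sketched below. SQLite has no `IF COLUMN EXISTS` clause for `CREATE INDEX`, so the check is done in Python with `PRAGMA table_info`. The helper names `column_exists` and `create_token_hash_index` are hypothetical and do not exist in the codebase.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` has a column named `column`."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    # `table` is interpolated into the statement, so pass trusted names only.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def create_token_hash_index(conn: sqlite3.Connection) -> bool:
    """Create idx_tokens_hash only when tokens.token_hash already exists."""
    if not column_exists(conn, "tokens", "token_hash"):
        # Old structure: skip, and leave index creation to migration 002.
        return False
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_tokens_hash ON tokens(token_hash)"
    )
    return True
```

With this guard the index statement would have to move out of the `SCHEMA_SQL` script and into Python, which is part of why simply dropping the indexes from `SCHEMA_SQL` may be the smaller change.
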
### Option 2: Fix Execution Order (Better)
1. Run migrations BEFORE attempting schema creation
2. Only use SCHEMA_SQL for truly fresh databases

### Option 3: Proper Schema Versioning (Best)
1. SCHEMA_SQL should be the v0.1.0 schema
2. All evolution happens through migrations
3. Fresh databases run all migrations from the beginning

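
The ordering change in Option 2 can be sketched as follows. This is an illustration under assumptions, not the actual `init_db()`: the signature, the `is_fresh_database` helper, and the rule "no user tables means fresh" are all hypothetical, and recording which migrations a fresh schema already covers is left out.

```python
import sqlite3
from typing import Callable


def is_fresh_database(conn: sqlite3.Connection) -> bool:
    """Treat a database with no user tables as fresh (an assumed heuristic)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchone()
    return row[0] == 0


def init_db(
    conn: sqlite3.Connection,
    schema_sql: str,
    run_migrations: Callable[[sqlite3.Connection], None],
) -> str:
    """Apply schema_sql only to fresh databases; migrate existing ones."""
    if is_fresh_database(conn):
        # Nothing to migrate: the latest schema can be created directly.
        conn.executescript(schema_sql)
        return "schema"
    # Existing data: never run the latest schema against an old structure.
    run_migrations(conn)
    return "migrations"
```

The return value only exists to make the chosen path visible in tests and logs; a real implementation would also need to mark migrations as applied after creating a fresh schema.
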
## Immediate Workaround

For the production deployment:

1. **Manual intervention before upgrade**:
   ```sql
   -- Connect to production database
   -- Manually add the column before v1.0.0-rc.1 starts
   ALTER TABLE tokens ADD COLUMN token_hash TEXT;
   ```

2. **Then deploy v1.0.0-rc.1**:
   - SCHEMA_SQL will succeed (column exists)
   - Migration 002 will drop and recreate the table properly
   - System will work correctly

## Verification Steps

1. Check production database structure:
   ```sql
   PRAGMA table_info(tokens);
   ```

2. Verify migration status:
   ```sql
   SELECT * FROM schema_migrations;
   ```

3. Test with a v0.9.5 database locally to reproduce

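
The first two checks can also be scripted, which may be convenient when inspecting a copy of the production file. The sketch below assumes only the table names used in the queries above; `describe_database` is a hypothetical helper, and it tolerates a missing `schema_migrations` table because its presence in a v0.9.5 database is not confirmed here.

```python
import sqlite3
from typing import Any, Dict


def describe_database(conn: sqlite3.Connection) -> Dict[str, Any]:
    """Summarize the tokens table structure and the recorded migrations."""
    # PRAGMA table_info returns no rows if the table does not exist.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(tokens)")]
    try:
        migrations = conn.execute("SELECT * FROM schema_migrations").fetchall()
    except sqlite3.OperationalError:
        migrations = None  # the tracking table itself is missing
    return {
        "token_columns": columns,
        "has_token_hash": "token_hash" in columns,
        "migrations": migrations,
    }
```

For example, calling `describe_database(sqlite3.connect("production-copy.db"))` (the file name is a placeholder) on an affected database should report `has_token_hash` as `False`.
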
## Long-term Architecture Recommendations

1. **Separate Initial Schema from Current Schema**
   - `INITIAL_SCHEMA_SQL` - The v0.1.0 starting point
   - Migrations handle ALL evolution

2. **Migration-First Initialization**
   - Check for existing database
   - Run migrations first if database exists
   - Only apply SCHEMA_SQL to truly empty databases

3. **Schema Version Tracking**
   - Add a `schema_version` table
   - Track the current schema version explicitly
   - Make decisions based on version, not heuristics

4. **Testing Strategy**
   - Always test upgrades from previous production version
   - Include migration testing in CI/CD pipeline
   - Maintain database snapshots for each released version

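
Recommendation 3 could take roughly the following shape. This is a sketch under assumptions: the single-column layout of `schema_version`, the integer version numbers, and the function names are illustrative choices, not an existing design.

```python
import sqlite3
from typing import Dict, List


def get_schema_version(conn: sqlite3.Connection) -> int:
    """Return the highest recorded schema version, or 0 if none is recorded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] if row[0] is not None else 0


def apply_pending(conn: sqlite3.Connection, migrations: Dict[int, str]) -> List[int]:
    """Apply migrations newer than the recorded version, in ascending order."""
    current = get_schema_version(conn)
    applied = []
    for version in sorted(migrations):
        if version <= current:
            continue  # already applied according to the explicit record
        conn.executescript(migrations[version])
        conn.execute(
            "INSERT INTO schema_version (version) VALUES (?)", (version,)
        )
        conn.commit()
        applied.append(version)
    return applied
```

Because the decision depends only on the recorded number, a fresh database and an upgraded one follow the same code path, which is the property Option 3 is after.
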
## Conclusion

This is a **critical architectural issue** in the migration system that affects all existing production deployments. The immediate fix is straightforward, but the system needs architectural changes to prevent similar issues in future releases.

The core principle violated: **SCHEMA_SQL should represent the beginning, not the end state**.