# StarPunk v1.1.0 Feature Architecture

## Overview

This document defines the architectural design for the three major features in v1.1.0: Migration System Redesign, Full-Text Search, and Custom Slugs. Each component has been designed following our core principle of minimal, elegant solutions.

## System Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                     StarPunk CMS v1.1.0                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────────┐    │
│  │  Micropub   │  │    Web UI    │  │   Search API     │    │
│  │  Endpoint   │  │              │  │   /api/search    │    │
│  └──────┬──────┘  └──────┬───────┘  └────────┬─────────┘    │
│         │                │                   │              │
│         ▼                ▼                   ▼              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                  Application Layer                   │   │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────────┐  │   │
│  │  │   Custom   │  │    Note    │  │     Search     │  │   │
│  │  │   Slugs    │  │    CRUD    │  │     Engine     │  │   │
│  │  └────────────┘  └────────────┘  └────────────────┘  │   │
│  └──────────────────────────────────────────────────────┘   │
│                              │                              │
│                              ▼                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                 Data Layer (SQLite)                  │   │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────────┐  │   │
│  │  │   notes    │  │ notes_fts  │  │   migrations   │  │   │
│  │  │   table    │◄─┤   (FTS5)   │  │     table      │  │   │
│  │  └────────────┘  └────────────┘  └────────────────┘  │   │
│  │        │              ▲                              │   │
│  │        └──────────────┴───────────────────┘          │   │
│  │              Triggers keep FTS in sync               │   │
│  └──────────────────────────────────────────────────────┘   │
│                              │                              │
│                              ▼                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                  File System Layer                   │   │
│  │             data/notes/YYYY/MM/[slug].md             │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

## Component Architecture

### 1. Migration System Redesign

#### Current Problem

```
[Fresh Install]          [Upgrade Path]
       │                        │
       ▼                        ▼
  SCHEMA_SQL             Migration Files
 (full schema)          (partial schema)
       │                        │
       └────────┬───────────────┘
                ▼
          DUPLICATION!
```

#### New Architecture

```
[Fresh Install]          [Upgrade Path]
       │                        │
       ▼                        ▼
INITIAL_SCHEMA_SQL ──────► Migrations
 (v1.0.0 only)           (changes only)
       │                        │
       └────────┬───────────────┘
                ▼
          Single Source
```

#### Key Components

- **INITIAL_SCHEMA_SQL**: Frozen v1.0.0 schema
- **Migration Files**: Only incremental changes
- **Migration Runner**: Handles both paths intelligently

### 2. Full-Text Search Architecture

#### Data Flow

```
1. User Query
      │
      ▼
2. Query Parser
      │
      ▼
3. FTS5 Engine ───► SQLite Query Planner
      │                      │
      ▼                      ▼
4. BM25 Ranking        Index Lookup
      │                      │
      └──────────┬───────────┘
                 ▼
5. Results + Snippets
```

#### Database Schema

```
notes (main table)          notes_fts (virtual table)
┌──────────────┐            ┌──────────────────┐
│ id (PK)      │◄───────────┤ rowid (FK)       │
│ slug         │            │ slug (UNINDEXED) │
│ content      │───trigger──► title            │
│ published    │            │ content          │
└──────────────┘            └──────────────────┘
```

#### Synchronization Strategy

- **INSERT Trigger**: Automatically indexes new notes
- **UPDATE Trigger**: Re-indexes modified notes
- **DELETE Trigger**: Removes deleted notes from index
- **Initial Build**: One-time indexing of existing notes

### 3. Custom Slugs Architecture

#### Request Flow

```
Micropub Request
      │
      ▼
Extract mp-slug ──► No mp-slug ──► Auto-generate
      │                                  │
      ▼                                  │
Validate Format                          │
      │                                  │
      ▼                                  │
Check Uniqueness                         │
      │                                  │
      ├─► Unique ────────────────────────┤
      │                                  │
      └─► Duplicate                      │
              │                          │
              ▼                          ▼
         Add suffix                 Create Note
         (my-slug-2)
```

#### Validation Pipeline

```
Input: "My/Cool/../Post!"
         │
         ▼
1. Lowercase: "my/cool/../post!"
         │
         ▼
2. Remove Invalid: "my/cool/post"
         │
         ▼
3. Security Check: Reject "../"
         │
         ▼
4. Pattern Match: ^[a-z0-9-/]+$
         │
         ▼
5. Reserved Check: Not in blocklist
         │
         ▼
Output: "my-cool-post"
```

## Data Models

### Migration Record

```python
from datetime import datetime

class Migration:
    version: str  # "001", "002", etc.
    description: str  # Human-readable
    applied_at: datetime
    checksum: str  # Verify integrity
```

### Search Result

```python
from datetime import datetime

class SearchResult:
    slug: str
    title: str
    snippet: str  # With highlights
    rank: float  # BM25 score
    published: bool
    created_at: datetime
```

### Slug Validation

```python
class SlugValidator:
    pattern: str = r'^[a-z0-9-/]+$'  # regex source; '-' and '/' are literals
    max_length: int = 200
    reserved: set[str] = {'api', 'admin', 'auth', 'feed'}

    # Interface only; method bodies are intentionally omitted here.
    def validate(self, slug: str) -> bool: ...
    def sanitize(self, slug: str) -> str: ...
    def ensure_unique(self, slug: str) -> str: ...
```

## Interface Specifications

### Search API Contract

```yaml
endpoint: GET /api/search
parameters:
  q: string (required) - Search query
  limit: int (optional, default: 20, max: 100)
  offset: int (optional, default: 0)
  published_only: bool (optional, default: true)
response:
  200 OK:
    content-type: application/json
    schema:
      query: string
      total: integer
      results: array[SearchResult]
  400 Bad Request:
    error: "invalid_query"
    description: string
```

### Micropub Slug Extension

```yaml
property: mp-slug
type: string
required: false
validation:
  - URL-safe characters only
  - Maximum 200 characters
  - Not in reserved list
  - Unique (or auto-incremented)
example:
  properties:
    content: ["My post"]
    mp-slug: ["my-custom-url"]
```

## Performance Characteristics

### Migration System

- Fresh install: ~100ms (schema + migrations)
- Upgrade: ~50ms per migration
- Rollback: Not supported (forward-only)

### Full-Text Search

- Index build: ~1ms per note
- Query latency: <10ms for 10K notes
- Index size: ~30% of the indexed text size
- Memory usage: Negligible (SQLite managed)

### Custom Slugs

- Validation: <1ms
- Uniqueness check: <5ms
- Conflict resolution: <10ms
- No performance impact on existing flows

## Security Architecture

### Search Security

1. **Input Sanitization**: User input must reach FTS5 `MATCH` only as a bound parameter (FTS5 itself does not block SQL injection); malformed query syntax returns `invalid_query`
2. **Output Escaping**: HTML escaped in snippets
3. **Rate Limiting**: 100 requests/minute per IP
4. **Access Control**: Unpublished notes require authentication

### Slug Security

1. **Path Traversal Prevention**: Reject `..` patterns
2. **Reserved Routes**: Block system endpoints
3. **Length Limits**: Prevent DoS via long slugs
4. **Character Whitelist**: Only allow safe characters

### Migration Security

1. **Checksum Verification**: Detect tampering
2. **Transaction Safety**: All-or-nothing execution
3. **No User Input**: Migrations are code-only
4. **Audit Trail**: Track all applied migrations

## Deployment Considerations

### Database Upgrade Path

```
# v1.0.x → v1.1.0
1. Backup database
2. Apply migration 002 (FTS5 tables)
3. Build initial search index
4. Verify functionality
5. Remove backup after confirmation
```

### Rollback Strategy

```
# Emergency rollback (data preserved)
1. Stop application
2. Restore v1.0.x code
3. Database remains compatible
4. FTS tables ignored by old code
5. Custom slugs work as regular slugs
```

### Container Deployment

```dockerfile
# No changes to container required
# SQLite FTS5 included by default
# No new dependencies added
```

## Testing Strategy

### Unit Test Coverage

- Migration path logic: 100%
- Slug validation: 100%
- Search query parsing: 100%
- Trigger behavior: 100%

### Integration Test Scenarios

1. Fresh installation flow
2. Upgrade from each version
3. Search with special characters
4. Micropub with various slugs
5. Concurrent note operations

### Performance Benchmarks

- 1,000 notes: <5ms search
- 10,000 notes: <10ms search
- 100,000 notes: <50ms search
- Index size: Confirm ~30% ratio

## Monitoring & Observability

### Key Metrics

1. Search query latency (p50, p95, p99)
2. Index size growth rate
3. Slug conflict frequency
4. Migration execution time

### Log Events

```python
# Search
INFO: "Search query: {query}, results: {count}, latency: {ms}"

# Slugs
WARN: "Slug conflict resolved: {original} → {final}"

# Migrations
INFO: "Migration {version} applied in {ms}ms"
ERROR: "Migration {version} failed: {error}"
```

## Future Considerations

### Potential Enhancements

1. **Search Filters**: by date, author, tags
2. **Hierarchical Slugs**: `/2024/11/25/post`
3. **Migration Rollback**: Bi-directional migrations
4. **Search Suggestions**: Auto-complete support

### Scaling Considerations

1. **Search Index Sharding**: If >1M notes
2. **External Search**: Meilisearch for multi-user
3. **Slug Namespaces**: Per-user slug spaces
4. **Migration Parallelization**: For large datasets

## Conclusion

The v1.1.0 architecture maintains StarPunk's commitment to minimalism while adding essential features. Each component:

- Solves a specific user need
- Uses standard, proven technologies
- Avoids external dependencies
- Maintains backward compatibility
- Follows the principle: "Every line of code must justify its existence"

The architecture is designed to be understood, maintained, and extended by a single developer, staying true to the IndieWeb philosophy of personal publishing platforms.
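
### Appendix: Slug Validation Sketch

The Custom Slugs validation pipeline and the `SlugValidator` model described earlier can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not StarPunk's actual implementation: the names `sanitize_slug`, `ensure_unique`, and `InvalidSlug` are hypothetical, uniqueness is checked against an in-memory set rather than the `notes` table, invalid characters are replaced with hyphens, and any input containing `..` is rejected outright rather than repaired.

```python
import re

# Same character set as the documented pattern ^[a-z0-9-/]+$,
# written with the hyphen last so it is unambiguously a literal.
SLUG_PATTERN = re.compile(r"^[a-z0-9/-]+$")
MAX_LENGTH = 200
RESERVED = {"api", "admin", "auth", "feed"}


class InvalidSlug(ValueError):
    """Raised when a requested slug cannot be accepted (hypothetical name)."""


def sanitize_slug(raw: str) -> str:
    """Lowercase, reject traversal, replace invalid characters, then validate."""
    slug = raw.strip().lower()

    # Assumption: path traversal is refused, not silently stripped.
    if ".." in slug:
        raise InvalidSlug("path traversal sequence in slug")

    # Replace each run of disallowed characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9/-]+", "-", slug)
    # Collapse repeated separators and trim them from both ends.
    slug = re.sub(r"-{2,}", "-", slug)
    slug = re.sub(r"/{2,}", "/", slug)
    slug = slug.strip("-/")

    if not slug or len(slug) > MAX_LENGTH or not SLUG_PATTERN.match(slug):
        raise InvalidSlug("slug is empty, too long, or has invalid characters")

    # Assumption: a slug is reserved when its first path segment is reserved.
    if slug.split("/")[0] in RESERVED:
        raise InvalidSlug("slug collides with a reserved route")

    return slug


def ensure_unique(slug: str, existing: set) -> str:
    """Append -2, -3, ... until the slug is not in `existing`."""
    if slug not in existing:
        return slug
    suffix = 2
    while f"{slug}-{suffix}" in existing:
        suffix += 1
    return f"{slug}-{suffix}"
```

In this sketch a Micropub handler would call `sanitize_slug(mp_slug)` first and then `ensure_unique(...)` with the set of slugs already in use, falling back to auto-generation when no `mp-slug` is supplied. Rejecting `..` before any character replacement keeps the security check independent of the cleanup rules.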