StarPunk v1.1.0 Feature Architecture
Overview
This document defines the architectural design for the three major features in v1.1.0: Migration System Redesign, Full-Text Search, and Custom Slugs. Each component has been designed following our core principle of minimal, elegant solutions.
System Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│ StarPunk CMS v1.1.0 │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ Micropub │ │ Web UI │ │ Search API │ │
│ │ Endpoint │ │ │ │ /api/search │ │
│ └──────┬──────┘ └──────┬───────┘ └────────┬─────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Application Layer │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────────┐ │ │
│ │ │ Custom │ │ Note │ │ Search │ │ │
│ │ │ Slugs │ │ CRUD │ │ Engine │ │ │
│ │ └────────────┘ └────────────┘ └────────────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Data Layer (SQLite) │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────────┐ │ │
│ │ │ notes │ │ notes_fts │ │ migrations │ │ │
│ │ │ table │◄─┤ (FTS5) │ │ table │ │ │
│ │ └────────────┘ └────────────┘ └────────────────┘ │ │
│ │ │ ▲ │ │ │
│ │ └──────────────┴───────────────────┘ │ │
│ │ Triggers keep FTS in sync │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ File System Layer │ │
│ │ data/notes/YYYY/MM/[slug].md │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Component Architecture
1. Migration System Redesign
Current Problem
[Fresh Install] [Upgrade Path]
│ │
▼ ▼
SCHEMA_SQL Migration Files
(full schema) (partial schema)
│ │
└────────┬───────────────┘
▼
DUPLICATION!
New Architecture
[Fresh Install] [Upgrade Path]
│ │
▼ ▼
INITIAL_SCHEMA_SQL ──────► Migrations
(v1.0.0 only) (changes only)
│ │
└────────┬───────────────┘
▼
Single Source
Key Components
- INITIAL_SCHEMA_SQL: Frozen v1.0.0 schema
- Migration Files: Only incremental changes
- Migration Runner: Handles both paths intelligently
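The two paths above can converge on a single code path. The following is a minimal sketch, not the project's actual runner: the schema text, the `MIGRATIONS` list, and the names `table_exists` and `run_migrations` are assumptions made for illustration, and a fresh install is detected here simply by the absence of a `notes` table.

```python
import sqlite3

# Hypothetical stand-in for the frozen v1.0.0 schema.
INITIAL_SCHEMA_SQL = """
CREATE TABLE notes (
    id INTEGER PRIMARY KEY,
    slug TEXT UNIQUE NOT NULL,
    content TEXT NOT NULL,
    published INTEGER NOT NULL DEFAULT 0
);
"""

# Hypothetical incremental changes: (version, description, single SQL statement).
MIGRATIONS = [
    ("001", "add updated_at column", "ALTER TABLE notes ADD COLUMN updated_at TEXT"),
]


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def run_migrations(conn: sqlite3.Connection) -> list:
    """Bring a fresh or existing database up to date; return the versions applied."""
    if not table_exists(conn, "notes"):
        # Fresh install: create the frozen baseline, then fall through to migrations.
        conn.executescript(INITIAL_SCHEMA_SQL)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS migrations ("
        "version TEXT PRIMARY KEY, description TEXT NOT NULL, "
        "applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM migrations")}
    applied = []
    for version, description, sql in MIGRATIONS:
        if version in done:
            continue
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute("BEGIN")  # explicit BEGIN so the DDL is part of the transaction
            conn.execute(sql)
            conn.execute(
                "INSERT INTO migrations (version, description) VALUES (?, ?)",
                (version, description),
            )
        applied.append(version)
    return applied
```

Because both paths end in the same loop, a fresh install and an upgraded install finish with identical schemas, which is the point of keeping `INITIAL_SCHEMA_SQL` frozen.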
2. Full-Text Search Architecture
Data Flow
1. User Query
│
▼
2. Query Parser
│
▼
3. FTS5 Engine ───► SQLite Query Planner
│ │
▼ ▼
4. BM25 Ranking Index Lookup
│ │
└──────────┬───────────┘
▼
5. Results + Snippets
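The flow above can be sketched with the standard `sqlite3` module. This is an illustrative sketch under assumptions: the function names `to_fts5_query` and `search` are hypothetical, the `notes_fts` column order (slug, title, content) follows the schema diagram in this document, and quoting every term is one possible parsing strategy rather than the project's confirmed one.

```python
import sqlite3


def to_fts5_query(user_input: str) -> str:
    """Quote each whitespace-separated term so FTS5 operators are read as plain text."""
    terms = user_input.split()
    if not terms:
        raise ValueError("empty query")
    # Inside an FTS5 string, a double quote is escaped by doubling it.
    return " ".join('"' + term.replace('"', '""') + '"' for term in terms)


def search(conn: sqlite3.Connection, user_input: str, limit: int = 20, offset: int = 0):
    """Return (slug, title, excerpt, score) rows, best match first."""
    sql = """
        SELECT slug,
               title,
               snippet(notes_fts, 2, '<mark>', '</mark>', '...', 16) AS excerpt,
               bm25(notes_fts) AS score
        FROM notes_fts
        WHERE notes_fts MATCH ?
        ORDER BY score          -- bm25() returns lower values for better matches
        LIMIT ? OFFSET ?
    """
    return conn.execute(sql, (to_fts5_query(user_input), limit, offset)).fetchall()
```

Binding the query as a parameter keeps user input out of the SQL text, and quoting the terms keeps stray `OR`, `NOT`, or unbalanced quotes from being interpreted as FTS5 syntax.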
Database Schema
notes (main table) notes_fts (virtual table)
┌──────────────┐ ┌──────────────────┐
│ id (PK) │◄───────────┤ rowid (FK) │
│ slug │ │ slug (UNINDEXED) │
│ content │───trigger──► title │
│ published │ │ content │
└──────────────┘ └──────────────────┘
Synchronization Strategy
- INSERT Trigger: Automatically indexes new notes
- UPDATE Trigger: Re-indexes modified notes
- DELETE Trigger: Removes deleted notes from index
- Initial Build: One-time indexing of existing notes
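The four synchronization steps above might look like the following when applied through Python's `sqlite3` module. This is a sketch under assumptions: the trigger names and the `install_fts` helper are hypothetical, the title is taken to be the first line of `content` (the document does not say where the title comes from), and the SQLite build must include FTS5.

```python
import sqlite3


def first_line(col: str) -> str:
    """SQL expression for the first line of a text column (assumed title source)."""
    return f"substr({col}, 1, instr({col} || char(10), char(10)) - 1)"


FTS_SQL = f"""
CREATE VIRTUAL TABLE notes_fts USING fts5(slug UNINDEXED, title, content);

-- Initial build: index notes that already exist.
INSERT INTO notes_fts (rowid, slug, title, content)
SELECT id, slug, {first_line('content')}, content FROM notes;

-- INSERT trigger: index new notes.
CREATE TRIGGER notes_fts_insert AFTER INSERT ON notes BEGIN
    INSERT INTO notes_fts (rowid, slug, title, content)
    VALUES (new.id, new.slug, {first_line('new.content')}, new.content);
END;

-- UPDATE trigger: drop the stale entry, then re-index.
CREATE TRIGGER notes_fts_update AFTER UPDATE ON notes BEGIN
    DELETE FROM notes_fts WHERE rowid = old.id;
    INSERT INTO notes_fts (rowid, slug, title, content)
    VALUES (new.id, new.slug, {first_line('new.content')}, new.content);
END;

-- DELETE trigger: remove the note from the index.
CREATE TRIGGER notes_fts_delete AFTER DELETE ON notes BEGIN
    DELETE FROM notes_fts WHERE rowid = old.id;
END;
"""


def install_fts(conn: sqlite3.Connection) -> None:
    """Create the FTS5 table, backfill it, and attach the sync triggers."""
    conn.executescript(FTS_SQL)
```

Reusing `notes.id` as the FTS `rowid` is what lets the update and delete triggers locate the matching index entry without a separate mapping table.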
3. Custom Slugs Architecture
Request Flow
Micropub Request
│
▼
Extract mp-slug ──► No mp-slug ──► Auto-generate
│ │
▼ │
Validate Format │
│ │
▼ │
Check Uniqueness │
│ │
├─► Unique ────────────────────┤
│ │
└─► Duplicate │
│ │
▼ ▼
Add suffix Create Note
(my-slug-2)
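The duplicate branch of this flow reduces to a small loop. The sketch below assumes a caller-supplied `exists` callback (hypothetical) that reports whether a slug is already taken; how that lookup is performed is left to the storage layer.

```python
from typing import Callable


def ensure_unique(slug: str, exists: Callable[[str], bool]) -> str:
    """Return slug unchanged if free, otherwise the first free 'slug-N' with N >= 2."""
    if not exists(slug):
        return slug
    n = 2
    while exists(f"{slug}-{n}"):
        n += 1
    return f"{slug}-{n}"
```

A check-then-insert sequence like this can race under concurrent requests, so a `UNIQUE` constraint on the slug column remains a sensible backstop.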
Validation Pipeline
Input: "My/Cool/../Post!"
│
▼
1. Lowercase: "my/cool/../post!"
│
▼
2. Remove Invalid: "my/cool/post"
│
▼
3. Security Check: Reject "../"
│
▼
4. Pattern Match: ^[a-z0-9-/]+$
│
▼
5. Reserved Check: Not in blocklist
│
▼
Output: "my-cool-post"
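One way to read the pipeline in code is sketched below. It is an interpretation, not the project's implementation: `normalize_slug` is a hypothetical name, "Reject" is taken to mean refusing the whole slug, and the traversal check runs before invalid characters are stripped, because stripping the dots first would hide the `..` sequence. Under this reading the diagram's sample input is rejected rather than rewritten, and `/` separators are kept rather than flattened to hyphens.

```python
import re

ALLOWED = re.compile(r"^[a-z0-9/-]+$")  # same character set as the documented pattern
MAX_LENGTH = 200
RESERVED = {"api", "admin", "auth", "feed"}


def normalize_slug(raw: str) -> str:
    """Return a cleaned slug, or raise ValueError if it cannot be made safe."""
    slug = raw.strip().lower()                           # 1. lowercase
    if ".." in slug:                                     # 3. security check (moved earlier)
        raise ValueError("path traversal sequence in slug")
    slug = re.sub(r"[^a-z0-9/-]", "", slug).strip("/")   # 2. remove invalid characters
    if not slug or len(slug) > MAX_LENGTH or not ALLOWED.fullmatch(slug):  # 4. pattern
        raise ValueError("slug is empty, too long, or malformed")
    if slug.split("/")[0] in RESERVED:                   # 5. reserved check
        raise ValueError("slug collides with a reserved route")
    return slug
```

For example, `normalize_slug("My-Cool/Post!")` yields `my-cool/post`, while `normalize_slug("My/Cool/../Post!")` raises `ValueError`.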
Data Models
Migration Record
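from dataclasses import dataclass
from datetime import datetime

@dataclass
class Migration:
    version: str          # "001", "002", etc.
    description: str      # Human-readable summary
    applied_at: datetime  # When the migration was applied
    checksum: str         # Verifies the migration file's integrity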
Search Result
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SearchResult:
    slug: str
    title: str
    snippet: str          # Excerpt with <mark> highlights
    rank: float           # BM25 score
    published: bool
    created_at: datetime
Slug Validation
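import re

class SlugValidator:
    pattern: re.Pattern[str] = re.compile(r'^[a-z0-9-/]+$')
    max_length: int = 200
    reserved: set[str] = {'api', 'admin', 'auth', 'feed'}

    def validate(self, slug: str) -> bool: ...      # True if the slug passes every check
    def sanitize(self, slug: str) -> str: ...       # Normalize raw input
    def ensure_unique(self, slug: str) -> str: ...  # Add a numeric suffix on conflict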
Interface Specifications
Search API Contract
endpoint: GET /api/search
parameters:
q: string (required) - Search query
limit: int (optional, default: 20, max: 100)
offset: int (optional, default: 0)
published_only: bool (optional, default: true)
response:
200 OK:
content-type: application/json
schema:
query: string
total: integer
results: array[SearchResult]
400 Bad Request:
error: "invalid_query"
description: string
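The parameter rules in this contract can be enforced independently of any web framework. The sketch below is illustrative: `SearchParams`, `InvalidQuery`, and `parse_search_params` are hypothetical names, and rejecting an out-of-range `limit` (rather than clamping it to 100) is an assumption, since the contract only states the maximum.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SearchParams:
    q: str
    limit: int = 20
    offset: int = 0
    published_only: bool = True


class InvalidQuery(ValueError):
    """Intended to map to a 400 response with error 'invalid_query'."""


def parse_search_params(args: Mapping[str, str]) -> SearchParams:
    """Validate raw query-string values against the contract above."""
    q = args.get("q", "").strip()
    if not q:
        raise InvalidQuery("q is required")
    try:
        limit = int(args.get("limit", 20))
        offset = int(args.get("offset", 0))
    except ValueError:
        raise InvalidQuery("limit and offset must be integers") from None
    if not 1 <= limit <= 100:
        raise InvalidQuery("limit must be between 1 and 100")
    if offset < 0:
        raise InvalidQuery("offset must not be negative")
    published_only = args.get("published_only", "true").lower() not in {"false", "0", "no"}
    return SearchParams(q, limit, offset, published_only)
```

Whether a caller may actually set `published_only` to false is an authorization question and would be decided separately from this parsing step.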
Micropub Slug Extension
property: mp-slug
type: string
required: false
validation:
- URL-safe characters only
- Maximum 200 characters
- Not in reserved list
- Unique (or auto-incremented)
example:
properties:
content: ["My post"]
mp-slug: ["my-custom-url"]
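Reading the property from a JSON request shaped like the example above could look as follows. This is a sketch: `extract_mp_slug` is a hypothetical helper, and it only covers the JSON form shown here, where each property value is a list.

```python
from typing import Any, Optional


def extract_mp_slug(payload: dict) -> Optional[str]:
    """Return the first non-empty mp-slug value from a Micropub JSON payload, or None."""
    values: Any = payload.get("properties", {}).get("mp-slug")
    if isinstance(values, str):
        values = [values]  # tolerate a bare string instead of a list
    if not values:
        return None
    slug = str(values[0]).strip()
    return slug or None
```

A `None` result corresponds to the "No mp-slug" branch of the request flow, where the slug is auto-generated.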
Performance Characteristics
Migration System
- Fresh install: ~100ms (schema + migrations)
- Upgrade: ~50ms per migration
- Rollback: Not supported (forward-only)
Full-Text Search
- Index build: 1ms per note
- Query latency: <10ms for 10K notes
- Index size: ~30% of text
- Memory usage: Negligible (SQLite managed)
Custom Slugs
- Validation: <1ms
- Uniqueness check: <5ms
- Conflict resolution: <10ms
- No performance impact on existing flows
Security Architecture
Search Security
- Input Sanitization: Search terms are bound as SQL parameters, never interpolated into the query text, which prevents SQL injection
- Output Escaping: HTML escaped in snippets
- Rate Limiting: 100 requests/minute per IP
- Access Control: Unpublished notes require auth
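Output escaping needs some care because FTS5's `snippet()` inserts its highlight markers into unescaped note text. One possible approach, sketched here under assumptions, is to ask `snippet()` for control-character sentinels instead of `<mark>` tags, escape the whole excerpt, and only then swap the sentinels for markup. The sentinel values and the name `render_snippet` are hypothetical.

```python
import html

# Sentinels assumed to be passed to snippet() as its start/end marker arguments.
MARK_OPEN = "\x02"
MARK_CLOSE = "\x03"


def render_snippet(raw: str) -> str:
    """Escape note text, then turn the sentinels into <mark> tags."""
    escaped = html.escape(raw)  # leaves the control-character sentinels untouched
    return escaped.replace(MARK_OPEN, "<mark>").replace(MARK_CLOSE, "</mark>")
```

With this ordering, `<mark>` is the only markup that reaches the page; any HTML in the note body arrives as inert text.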
Slug Security
- Path Traversal Prevention: Reject ".." patterns
- Reserved Routes: Block system endpoints
- Length Limits: Prevent DoS via long slugs
- Character Whitelist: Only allow safe chars
Migration Security
- Checksum Verification: Detect tampering
- Transaction Safety: All-or-nothing execution
- No User Input: Migrations are code-only
- Audit Trail: Track all applied migrations
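Checksum verification can be as simple as hashing each migration's SQL text and comparing it with the value recorded when the migration was applied. The sketch below assumes SHA-256 and in-memory dictionaries keyed by version; the hash choice and the names `checksum` and `find_tampered` are illustrative, not taken from the project.

```python
import hashlib


def checksum(sql: str) -> str:
    """Hex SHA-256 digest of a migration's SQL text."""
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()


def find_tampered(recorded: dict, current: dict) -> list:
    """Return applied versions whose SQL is missing or no longer matches its checksum.

    recorded maps version -> checksum stored at apply time;
    current maps version -> SQL text as it exists now.
    """
    return sorted(
        version
        for version, digest in recorded.items()
        if version not in current or checksum(current[version]) != digest
    )
```

An empty list means every applied migration still matches what was run; anything else is worth stopping on before new migrations are applied.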
Deployment Considerations
Database Upgrade Path
# v1.0.x → v1.1.0
1. Backup database
2. Apply migration 002 (FTS5 tables)
3. Build initial search index
4. Verify functionality
5. Remove backup after confirmation
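Step 1 can be done from Python with SQLite's online backup API, which copies a consistent snapshot even if the database is in use. This is a minimal sketch; the function name and the file paths passed to it are placeholders.

```python
import sqlite3


def backup_database(src_path: str, dest_path: str) -> None:
    """Copy the database at src_path into dest_path using SQLite's backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```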
Rollback Strategy
# Emergency rollback (data preserved)
1. Stop application
2. Restore v1.0.x code
3. Database remains compatible
4. FTS tables ignored by old code
5. Custom slugs work as regular slugs
Container Deployment
# No changes to container required
# SQLite FTS5 included by default
# No new dependencies added
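Whether FTS5 is present depends on how the SQLite library bundled with the Python runtime was compiled, so a startup probe can confirm the assumption on a given image. The helper name `fts5_available` is hypothetical.

```python
import sqlite3


def fts5_available() -> bool:
    """Return True if this SQLite build can create an FTS5 virtual table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()
```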
Testing Strategy
Unit Test Coverage
- Migration path logic: 100%
- Slug validation: 100%
- Search query parsing: 100%
- Trigger behavior: 100%
Integration Test Scenarios
- Fresh installation flow
- Upgrade from each version
- Search with special characters
- Micropub with various slugs
- Concurrent note operations
Performance Benchmarks
- 1,000 notes: <5ms search
- 10,000 notes: <10ms search
- 100,000 notes: <50ms search
- Index size: Confirm ~30% ratio
Monitoring & Observability
Key Metrics
- Search query latency (p50, p95, p99)
- Index size growth rate
- Slug conflict frequency
- Migration execution time
Log Events
# Search
INFO: "Search query: {query}, results: {count}, latency: {ms}"
# Slugs
WARN: "Slug conflict resolved: {original} → {final}"
# Migrations
INFO: "Migration {version} applied in {ms}ms"
ERROR: "Migration {version} failed: {error}"
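These events map directly onto Python's standard `logging` module. The sketch below is one possible wiring: the logger name `starpunk` and the helper function names are assumptions, and the number formatting is a choice made here rather than part of the event definitions.

```python
import logging
from typing import Optional

logger = logging.getLogger("starpunk")  # logger name is an assumption


def log_search(query: str, count: int, ms: float) -> None:
    logger.info("Search query: %s, results: %d, latency: %.1fms", query, count, ms)


def log_slug_conflict(original: str, final: str) -> None:
    logger.warning("Slug conflict resolved: %s → %s", original, final)


def log_migration(version: str, ms: float, error: Optional[Exception] = None) -> None:
    if error is None:
        logger.info("Migration %s applied in %.0fms", version, ms)
    else:
        logger.error("Migration %s failed: %s", version, error)
```

Passing values as arguments instead of pre-formatting the string lets the logging module skip formatting when the level is disabled.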
Future Considerations
Potential Enhancements
- Search Filters: by date, author, tags
- Hierarchical Slugs: /2024/11/25/post
- Migration Rollback: Bi-directional migrations
- Search Suggestions: Auto-complete support
Scaling Considerations
- Search Index Sharding: If >1M notes
- External Search: Meilisearch for multi-user
- Slug Namespaces: Per-user slug spaces
- Migration Parallelization: For large datasets
Conclusion
The v1.1.0 architecture maintains StarPunk's commitment to minimalism while adding essential features. Each component:
- Solves a specific user need
- Uses standard, proven technologies
- Avoids external dependencies
- Maintains backward compatibility
- Follows the principle: "Every line of code must justify its existence"
The architecture is designed to be understood, maintained, and extended by a single developer, staying true to the IndieWeb philosophy of personal publishing platforms.