ADR-034: Full-Text Search with SQLite FTS5
Status
Proposed
Context
Users need the ability to search through their notes efficiently. Currently, finding specific content requires manually browsing through notes or using external tools. A built-in search capability is essential for any content management system, especially as the number of notes grows.
Requirements:
- Fast search across all note content
- Support for phrase searching and boolean operators
- Ranking by relevance
- Minimal performance impact on write operations
- No external dependencies (Elasticsearch, Solr, etc.)
- Works with existing SQLite database
Decision
Implement full-text search using SQLite's FTS5 (Full-Text Search version 5) extension:
- FTS5 Virtual Table: Create a shadow FTS table that indexes note content
- Synchronized Updates: Keep FTS index in sync with note operations
- Search Endpoint: New /api/search endpoint for queries
- Search UI: Simple search interface in the web UI
- Advanced Operators: Support FTS5's query syntax for power users
Database schema:
-- FTS5 virtual table for note content
CREATE VIRTUAL TABLE IF NOT EXISTS notes_fts USING fts5(
    slug UNINDEXED,               -- For result retrieval, not searchable
    title,                        -- Note title (first line)
    content,                      -- Full markdown content
    tokenize='porter unicode61'   -- Stem words, handle unicode
);

-- Trigger to keep FTS in sync with notes table.
-- title_from_content() is not a SQLite built-in; the application must
-- register it as a user-defined function on each connection.
CREATE TRIGGER notes_fts_insert AFTER INSERT ON notes
BEGIN
    INSERT INTO notes_fts (rowid, slug, title, content)
    SELECT id, slug, title_from_content(content), content
    FROM notes WHERE id = NEW.id;
END;

-- Similar triggers for UPDATE and DELETE
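The schema above leaves the UPDATE and DELETE triggers and the title_from_content() function unspecified. The following is a minimal sketch of one way to fill them in from Python's sqlite3 module, not the project's actual migration. The notes table layout, the trigger names notes_fts_update and notes_fts_delete, and the title rule (first non-empty line, leading # stripped) are assumptions made for illustration.

```python
import sqlite3


def title_from_content(content: str) -> str:
    """Hypothetical helper: use the first non-empty line as the title."""
    for line in content.splitlines():
        stripped = line.strip().lstrip("#").strip()
        if stripped:
            return stripped
    return ""


# Assumed notes table layout; the real schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY,
    slug TEXT UNIQUE NOT NULL,
    content TEXT NOT NULL
);
CREATE VIRTUAL TABLE IF NOT EXISTS notes_fts USING fts5(
    slug UNINDEXED, title, content, tokenize='porter unicode61'
);
CREATE TRIGGER IF NOT EXISTS notes_fts_insert AFTER INSERT ON notes
BEGIN
    INSERT INTO notes_fts (rowid, slug, title, content)
    VALUES (NEW.id, NEW.slug, title_from_content(NEW.content), NEW.content);
END;
CREATE TRIGGER IF NOT EXISTS notes_fts_update AFTER UPDATE ON notes
BEGIN
    -- Replace the indexed row so stale terms stop matching.
    DELETE FROM notes_fts WHERE rowid = OLD.id;
    INSERT INTO notes_fts (rowid, slug, title, content)
    VALUES (NEW.id, NEW.slug, title_from_content(NEW.content), NEW.content);
END;
CREATE TRIGGER IF NOT EXISTS notes_fts_delete AFTER DELETE ON notes
BEGIN
    DELETE FROM notes_fts WHERE rowid = OLD.id;
END;
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # The function must be registered on every connection that writes to
    # notes, otherwise the triggers fail when they fire.
    conn.create_function("title_from_content", 1, title_from_content)
    conn.executescript(SCHEMA)
    return conn
```

Because the triggers call an application-defined function, any tool that writes to the notes table without registering it (for example the sqlite3 command-line shell) will get an error on insert. Computing the title in application code and passing it in explicitly would avoid that coupling.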
Rationale
SQLite FTS5 is the optimal choice because:
- Native Integration: Built into SQLite, no external dependencies
- Performance: Highly optimized C implementation
- Features: Rich query syntax (phrases, NEAR, boolean, wildcards)
- Ranking: Built-in BM25 ranking algorithm
- Simplicity: Just another table in our existing database
- Maintenance-free: No separate search service to manage
- Size: Minimal storage overhead (~30% of original text)
Query capabilities:
- Simple terms: indieweb
- Phrases: "static site"
- Wildcards: micro*
- Boolean: micropub OR websub
- Exclusions: indieweb NOT wordpress
- Field-specific: title:announcement
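To make the query forms listed above concrete, here is a small sketch that runs each of them against an in-memory FTS5 table. The three sample notes and the search() helper are invented for illustration; only the table definition mirrors the schema in this ADR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE notes_fts USING fts5("
    "slug UNINDEXED, title, content, tokenize='porter unicode61')"
)
# Invented sample data, used only to exercise the query syntax.
conn.executemany(
    "INSERT INTO notes_fts (slug, title, content) VALUES (?, ?, ?)",
    [
        ("a", "Announcement", "Joining the indieweb with a static site"),
        ("b", "Micropub notes", "Micropub and micropayments on wordpress and indieweb"),
        ("c", "WebSub", "Real-time updates via websub"),
    ],
)


def search(query: str) -> list[str]:
    """Return matching slugs; the query is bound as a parameter."""
    rows = conn.execute(
        "SELECT slug FROM notes_fts WHERE notes_fts MATCH ? ORDER BY slug",
        (query,),
    )
    return [row[0] for row in rows]


for q in [
    "indieweb",                # simple term
    '"static site"',           # phrase
    "micro*",                  # prefix
    "micropub OR websub",      # boolean
    "indieweb NOT wordpress",  # exclusion
    "title:announcement",      # column filter
]:
    print(f"{q!r:28} -> {search(q)}")
```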
Consequences
Positive
- Powerful search with zero external dependencies
- Fast queries even with thousands of notes
- Rich query syntax for power users
- Automatic stemming (search "running" finds "run", "runs")
- Unicode support for international content
- Integrates seamlessly with existing SQLite database
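The stemming behaviour noted above comes from the porter tokenizer wrapping unicode61. A short sketch with invented sample rows and a throwaway table named demo shows a search for "running" matching both "run" and "runs":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE demo USING fts5(body, tokenize='porter unicode61')")
conn.executemany(
    "INSERT INTO demo (body) VALUES (?)",
    [("I run daily",), ("She runs on Sundays",), ("Walking is fine too",)],
)

# Both the indexed text and the query are stemmed, so "running",
# "run" and "runs" all reduce to the same token.
hits = [
    row[0]
    for row in conn.execute(
        "SELECT body FROM demo WHERE demo MATCH ? ORDER BY rowid", ("running",)
    )
]
print(hits)
```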
Negative
- FTS index increases database size by ~30%
- Initial indexing of existing notes required
- Must maintain sync triggers for consistency
- FTS5 requires SQLite 3.9.0+ (released 2015, widely available) in a build compiled with FTS5 enabled
- Cannot search in encrypted/binary content
Performance Characteristics
- Index build: ~1ms per note
- Search query: <10ms for 10,000 notes
- Index size: ~30% of indexed text
- Write overhead: ~5% increase in note creation time
Alternatives Considered
Alternative 1: Simple LIKE Queries
SELECT * FROM notes WHERE content LIKE '%search term%'
- Pros: No setup, works today
- Cons: Extremely slow on large datasets, no ranking, no advanced features
- Rejected because: Performance degrades quickly with scale
Alternative 2: External Search Service (Elasticsearch/Meilisearch)
- Pros: More features, dedicated search infrastructure
- Cons: External dependency, complex setup, overkill for single-user CMS
- Rejected because: Violates minimal philosophy, adds operational complexity
Alternative 3: Client-Side Search (Lunr.js)
- Pros: No server changes needed
- Cons: Must download all content to browser, doesn't scale
- Rejected because: Impractical beyond a few hundred notes
Alternative 4: Regex/Grep-based Search
- Pros: Powerful pattern matching
- Cons: Slow, no ranking, must read all files from disk
- Rejected because: Poor performance, no relevance ranking
Implementation Plan
Phase 1: Database Schema (2 hours)
- Add FTS5 table creation to migrations
- Create sync triggers for INSERT/UPDATE/DELETE
- Build initial index from existing notes
- Test sync on note operations
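Building the initial index in Phase 1 could look roughly like the sketch below. It assumes the notes table has id, slug, and content columns and that the caller supplies a title-extraction callable. The function name rebuild_index is hypothetical.

```python
import sqlite3
from typing import Callable


def rebuild_index(conn: sqlite3.Connection,
                  title_from_content: Callable[[str], str]) -> int:
    """Repopulate notes_fts from notes; returns the number of notes indexed."""
    with conn:  # commit on success, roll back on error
        # Start from an empty index so the function is safe to re-run.
        conn.execute("DELETE FROM notes_fts")
        rows = conn.execute("SELECT id, slug, content FROM notes").fetchall()
        conn.executemany(
            "INSERT INTO notes_fts (rowid, slug, title, content) VALUES (?, ?, ?, ?)",
            [(note_id, slug, title_from_content(content), content)
             for note_id, slug, content in rows],
        )
    return len(rows)
```

Clearing the index first keeps the operation idempotent, which is convenient if a migration is interrupted and run again.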
Phase 2: Search API (2 hours)
- Create /api/search endpoint
- Implement query parser and validation
- Add result ranking and pagination
- Return structured results with snippets
Phase 3: Search UI (1 hour)
- Add search box to navigation
- Create search results page
- Highlight matching terms in results
- Add search query syntax help
Phase 4: Testing (1 hour)
- Test with various query types
- Benchmark with large datasets
- Verify sync triggers work correctly
- Test Unicode and special characters
API Design
Search Endpoint
GET /api/search?q={query}&limit=20&offset=0
Response:
{
  "query": "indieweb micropub",
  "total": 15,
  "results": [
    {
      "slug": "implementing-micropub",
      "title": "Implementing Micropub",
      "snippet": "...the <mark>IndieWeb</mark> <mark>Micropub</mark> specification...",
      "rank": 2.4,
      "published": true,
      "created_at": "2024-01-15T10:00:00Z"
    }
  ]
}
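A response of this shape could be produced by a query function along the following lines. This is a sketch under assumptions, not the endpoint's actual implementation: the name search_notes is hypothetical, and the published and created_at fields are omitted because they would require a join against the notes table. FTS5's bm25() returns lower values for better matches, so the sketch negates it to obtain a rank where larger means more relevant; whether the API should do the same is an open design choice.

```python
import sqlite3


def search_notes(conn: sqlite3.Connection, query: str,
                 limit: int = 20, offset: int = 0) -> dict:
    """Run an FTS5 query and return a dict shaped like the API response."""
    total = conn.execute(
        "SELECT count(*) FROM notes_fts WHERE notes_fts MATCH ?", (query,)
    ).fetchone()[0]
    rows = conn.execute(
        """
        SELECT slug,
               title,
               -- column 2 is `content`; at most 16 tokens per snippet
               snippet(notes_fts, 2, '<mark>', '</mark>', '...', 16),
               bm25(notes_fts)
        FROM notes_fts
        WHERE notes_fts MATCH ?
        ORDER BY bm25(notes_fts)   -- best match first
        LIMIT ? OFFSET ?
        """,
        (query, limit, offset),
    ).fetchall()
    return {
        "query": query,
        "total": total,
        "results": [
            {"slug": slug, "title": title, "snippet": snip, "rank": -score}
            for slug, title, snip, score in rows
        ],
    }
```

A malformed query string makes the MATCH raise sqlite3.OperationalError, so a real endpoint would likely catch that and return a 400 response.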
Query Syntax Examples
- indieweb - Find notes containing "indieweb"
- "static site" - Exact phrase
- micro* - Prefix search
- title:announcement - Search in title only
- micropub OR websub - Boolean operators
- indieweb NOT wordpress - Exclusion (FTS5 uses NOT; a leading minus sign is not an exclusion operator)
Security Considerations
- Bind the search query as a SQL parameter to prevent SQL injection, and validate FTS5 query syntax, since a malformed MATCH expression raises an error
- Rate limit search endpoint to prevent abuse
- Only search published notes for anonymous users
- Escape HTML in snippets to prevent XSS
Migration Strategy
- Check that the SQLite version (3.9.0+) and build support FTS5
- Create FTS table and triggers in migration
- Build initial index from existing notes
- Monitor index size and performance
- Document search syntax for users
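The first migration step can be checked at runtime. A version number alone is not sufficient, because FTS5 is a compile-time option, so this sketch also probes by creating a throwaway virtual table in a separate in-memory database. The function names are hypothetical.

```python
import sqlite3


def sqlite_version_ok() -> bool:
    """True if the linked SQLite library is at least 3.9.0."""
    return sqlite3.sqlite_version_info >= (3, 9, 0)


def fts5_available() -> bool:
    """True if the linked SQLite build can actually create an FTS5 table."""
    probe = sqlite3.connect(":memory:")
    try:
        probe.execute("CREATE VIRTUAL TABLE probe USING fts5(x)")
        return True
    except sqlite3.OperationalError:
        # Raised with "no such module: fts5" when the extension is missing.
        return False
    finally:
        probe.close()
```

A migration could call fts5_available() up front and skip or fail the search setup with a clear message instead of a raw SQL error.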
References
- SQLite FTS5 Documentation: https://www.sqlite.org/fts5.html
- BM25 Ranking: https://en.wikipedia.org/wiki/Okapi_BM25
- FTS5 Performance: https://www.sqlite.org/fts5.html#performance