# StarPunk Architecture Overview

**Version**: v0.9.5 (2025-11-24)
**Status**: Pre-V1 Release (Micropub endpoint pending)

## Executive Summary

StarPunk is a minimal, single-user IndieWeb CMS designed around the principle: "Every line of code must justify its existence." The architecture prioritizes simplicity, standards compliance, and user data ownership through careful technology selection and hybrid data storage.

**Core Architecture**: Flask web application with hybrid file+database storage, server-side rendering, delegated authentication (IndieLogin.com), and containerized deployment.

**Technology Stack**: Python 3.11, Flask, SQLite, Jinja2, Gunicorn, uv package manager
**Deployment**: Container-based (Podman/Docker) with automated CI/CD (Gitea Actions)
**Authentication**: IndieAuth via IndieLogin.com with PKCE security

## System Architecture

### High-Level Components

```
┌─────────────────────────────────────────────────────────────┐
│                        User Browser                         │
└───────────────┬─────────────────────────────────────────────┘
                │
                │ HTTP/HTTPS
                ↓
┌─────────────────────────────────────────────────────────────┐
│                      Flask Application                      │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  Web Interface (Jinja2 Templates)                       │ │
│ │    - Public: Homepage, Note Permalinks                  │ │
│ │    - Admin: Dashboard, Note Editor                      │ │
│ └──────────────────────────────┬──────────────────────────┘ │
│ ┌──────────────────────────────┴──────────────────────────┐ │
│ │  API Layer (RESTful + Micropub)                         │ │
│ │    - Notes CRUD API                                     │ │
│ │    - Micropub Endpoint                                  │ │
│ │    - RSS Feed Generator                                 │ │
│ │    - Authentication Handlers                            │ │
│ └──────────────────────────────┬──────────────────────────┘ │
│ ┌──────────────────────────────┴──────────────────────────┐ │
│ │  Business Logic                                         │ │
│ │    - Note Management (create, read, update, delete)     │ │
│ │    - File/Database Sync                                 │ │
│ │    - Markdown Rendering                                 │ │
│ │    - Slug Generation                                    │ │
│ │    - Session Management                                 │ │
│ └──────────────────────────────┬──────────────────────────┘ │
│ ┌──────────────────────────────┴──────────────────────────┐ │
│ │  Data Layer                                             │ │
│ │  ┌──────────────────┐  ┌─────────────────────────┐      │ │
│ │  │  File Storage    │  │  SQLite Database        │      │ │
│ │  │                  │  │                         │      │ │
│ │  │  Markdown Files  │  │  - Note Metadata        │      │ │
│ │  │  (Pure Content)  │  │  - Sessions             │      │ │
│ │  │                  │  │  - Tokens               │      │ │
│ │  │  data/notes/     │  │  - Auth State           │      │ │
│ │  │    YYYY/MM/      │  │                         │      │ │
│ │  │      slug.md     │  │  data/starpunk.db       │      │ │
│ │  └──────────────────┘  └─────────────────────────┘      │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
                │
                │ HTTPS
                ↓
┌─────────────────────────────────────────────────────────────┐
│                      External Services                      │
│  - IndieLogin.com (Authentication)                          │
│  - User's Website (Identity Verification)                   │
│  - Micropub Clients (Publishing)                            │
└─────────────────────────────────────────────────────────────┘
```

## Core Principles

### 1. Radical Simplicity
- Total dependencies: 6 direct packages
- No build tools, no npm, no bundlers
- Server-side rendering eliminates frontend complexity
- Single file SQLite database
- Zero configuration frameworks

### 2. Hybrid Data Architecture
**Files for Content**: Markdown notes stored as plain text files
- Maximum portability
- Human-readable
- Direct user access
- Easy backup (copy, rsync, git)

**Database for Metadata**: SQLite stores structured data
- Fast queries and indexes
- Referential integrity
- Efficient filtering and sorting
- Transaction support

**Sync Strategy**: Files are authoritative for content; database is authoritative for metadata. Both must stay in sync.

### 3. Standards-First Design
- IndieWeb: Microformats2, IndieAuth, Micropub
- Web: HTML5, RSS 2.0, HTTP standards
- Security: OAuth 2.0, HTTPS, secure cookies
- Data: CommonMark markdown

### 4. API-First Architecture
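All functionality is designed to be exposed via an API that the web interface consumes (as of v0.9.5 the admin interface still submits HTML forms, and the Notes API is deferred to V2). This enables: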
- Micropub client support
- Future client applications
- Scriptable automation
- Clean separation of concerns

### 5. Progressive Enhancement
- Core functionality works without JavaScript
- JavaScript adds optional enhancements (markdown preview)
- Server-side rendering for fast initial loads
- Mobile-responsive from the start

## Component Descriptions

### Web Layer

#### Public Interface
**Purpose**: Display published notes to the world
**Technology**: Server-side rendered HTML (Jinja2)
**Status**: ✅ IMPLEMENTED (v0.5.0)

**Routes** (Implemented):
- `GET /` - Homepage with recent published notes
- `GET /note/<slug>` - Individual note permalink
- `GET /feed.xml` - RSS 2.0 feed (v0.6.0)
- `GET /health` - Health check endpoint (v0.6.0)

**Features**:
- Microformats2 markup (h-entry, h-card, h-feed) - ⚠️ Not validated
- Reverse chronological note list
- Clean, minimal responsive CSS
- Mobile-responsive
- No JavaScript required

#### Admin Interface
**Purpose**: Manage notes (create, edit, publish)
**Technology**: Server-side rendered HTML (Jinja2)
**Status**: ✅ IMPLEMENTED (v0.5.2)

**Routes** (Implemented):
- `GET /auth/login` - Login form (v0.9.2: moved from /admin/login)
- `POST /auth/login` - Initiate IndieLogin OAuth flow
- `GET /auth/callback` - Handle IndieLogin callback
- `POST /auth/logout` - Logout and destroy session
- `GET /admin` - Dashboard (list of all notes, published + drafts)
- `GET /admin/new` - Create note form
- `POST /admin/new` - Create note handler
- `GET /admin/edit/<slug>` - Edit note form
- `POST /admin/edit/<slug>` - Update note handler
- `POST /admin/delete/<slug>` - Delete note handler

**Development Routes** (DEV_MODE only):
- `GET /dev/login` - Development authentication bypass (v0.5.0)

**Features**:
- Markdown editor (textarea)
- No real-time preview (deferred to V2)
- Publish/draft toggle
- Protected by session authentication
- Flash messages for feedback
- Note: Authentication routes moved from `/admin/*` to `/auth/*` in v0.9.2; note management routes remain under `/admin`

### API Layer

#### Notes API
**Purpose**: RESTful CRUD operations for notes
**Authentication**: Session-based (admin interface)
**Status**: ❌ NOT IMPLEMENTED (Optional for V1, deferred to V2)

**Planned Routes** (Not Implemented):
```
GET    /api/notes          List published notes (JSON)
POST   /api/notes          Create new note (JSON)
GET    /api/notes/<slug>   Get single note (JSON)
PUT    /api/notes/<slug>   Update note (JSON)
DELETE /api/notes/<slug>   Delete note (JSON)
```

**Current Workaround**: Admin interface uses HTML forms (POST), not JSON API
**Note**: Not required for V1, admin interface is fully functional without REST API

#### Micropub Endpoint
**Purpose**: Accept posts from external Micropub clients (Quill, Indigenous, etc.)
**Authentication**: IndieAuth bearer tokens
**Status**: ❌ NOT IMPLEMENTED (Critical blocker for V1)

**Planned Routes** (Not Implemented):
```
POST /api/micropub            Create note (h-entry)
GET  /api/micropub?q=config   Query configuration
GET  /api/micropub?q=source   Query note source by URL
```

**Planned Content Types**:
- application/json
- application/x-www-form-urlencoded

**Target Compliance**: Micropub specification
**Current Status**:
- Token model exists in database
- No endpoint implementation
- No token validation logic
- Will require IndieAuth token endpoint or external token service

#### RSS Feed
**Purpose**: Syndicate published notes
**Technology**: feedgen library
**Status**: ✅ IMPLEMENTED (v0.6.0)

**Route**: `GET /feed.xml`
**Format**: Valid RSS 2.0 XML
**Caching**: 5 minutes server-side (configurable via FEED_CACHE_SECONDS)
**Features**:
- Limit to 50 most recent published notes (configurable via FEED_MAX_ITEMS)
- RFC-822 date formatting (pubDate)
- CDATA-wrapped HTML content for feed readers
- Proper GUID for each item (note permalink)
- Auto-discovery link in HTML templates (`<link rel="alternate">`)
- Cache-Control headers for client caching
- ETag support for conditional requests
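
The RFC-822 `pubDate` format listed above can be illustrated with the standard library alone. This is a minimal sketch, not StarPunk's code: the feed itself is built with feedgen, and the helper name `rfc822_date` and the assumption that naive timestamps are UTC are both hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rfc822_date(dt: datetime) -> str:
    """Format a datetime the way an RSS 2.0 <pubDate> expects."""
    if dt.tzinfo is None:
        # Assumption: naive timestamps are stored as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return format_datetime(dt)


print(rfc822_date(datetime(2024, 11, 18, 12, 0, 0)))
# Mon, 18 Nov 2024 12:00:00 +0000
```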

### Business Logic Layer

#### Note Management
**Operations**:
1. **Create**: Generate slug → write file → insert database record
2. **Read**: Query database for path → read file → render markdown
3. **Update**: Write file atomically → update database timestamp
4. **Delete**: Mark deleted in database → optionally archive file

**Key Components**:
- Slug generation (URL-safe, unique)
- Markdown rendering (markdown library)
- Content hashing (integrity verification)
- Atomic file operations (prevent corruption)

#### File/Database Sync
**Strategy**: Write files first, then database
**Rollback**: If database operation fails, delete/restore file
**Verification**: Content hash detects external modifications
**Integrity Check**: Optional scan for orphaned files/records
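
The file-first strategy with rollback might look roughly like the sketch below. It is illustrative only: the function name `create_note`, the three-column `notes` table, and the temporary directory are assumptions made for the demo, not the project's actual schema or code.

```python
import hashlib
import os
import sqlite3
import tempfile


def create_note(conn, data_dir, rel_path, slug, content):
    """Write the file first, then insert metadata; remove the file if the insert fails."""
    path = os.path.join(data_dir, rel_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(content)
    os.replace(tmp_path, path)  # atomic rename into place
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO notes (slug, file_path, content_hash) VALUES (?, ?, ?)",
                (slug, rel_path, digest),
            )
    except sqlite3.Error:
        os.remove(path)  # roll back the file write
        raise
    return digest


# Demo with a simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (slug TEXT UNIQUE, file_path TEXT, content_hash TEXT)")
data_dir = tempfile.mkdtemp()
create_note(conn, data_dir, "2024/11/hello.md", "hello", "Hello")
```

Because the hash is computed from the same string that was written, a later mismatch between the stored `content_hash` and the file on disk indicates an external modification.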

#### Authentication
**Admin Auth**: IndieLogin.com OAuth 2.0 flow with PKCE
**Status**: ✅ IMPLEMENTED (v0.8.0, refined through v0.9.5)

**Flow**:
1. User enters website URL (their "me" identity)
2. Generate PKCE code_verifier and code_challenge (SHA-256)
3. Store state token + code_verifier in database (5 min expiry)
4. Redirect to indielogin.com/authorize with:
   - client_id (SITE_URL with trailing slash)
   - redirect_uri (SITE_URL/auth/callback)
   - state (CSRF protection)
   - code_challenge + code_challenge_method (S256)
5. IndieLogin.com verifies identity via RelMeAuth or email
6. Callback to /auth/callback with code + state
7. Verify state token (CSRF check)
8. POST code + code_verifier to indielogin.com/authorize (NOT /token)
9. Receive verified "me" URL
10. Verify "me" matches ADMIN_ME config
11. Create session with SHA-256 hashed token
12. Store in HttpOnly, Secure, SameSite=Lax cookie named "starpunk_session"

**Security Features** (v0.8.0-v0.9.5):
- PKCE prevents authorization code interception
- State tokens prevent CSRF attacks
- Session token hashing (SHA-256) before database storage
- Single-use state tokens with short expiry
- Automatic trailing slash normalization on SITE_URL (v0.9.1)
- Uses authorization endpoint (not token endpoint) per IndieAuth spec (v0.9.4)
- Session cookie renamed to avoid Flask session collision (v0.5.1)

**Development Mode** (v0.5.0):
- `/dev/login` bypasses IndieLogin for local development
- Requires DEV_MODE=true and DEV_ADMIN_ME configuration
- Shows warning in logs

**Micropub Auth**: IndieAuth token verification
**Status**: ❌ NOT IMPLEMENTED (Required for Micropub)

**Planned Implementation**:
- Client obtains token via external IndieAuth token endpoint
- Token sent as Bearer in Authorization header
- Verify token exists in database and not expired
- Check scope permissions (create, update, delete)
- OR: Delegate token verification to external IndieAuth server
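
Since this part is not implemented yet, the following is only a sketch of how the header parsing and scope check from the plan could be written. The helper names `parse_bearer` and `has_scope` are hypothetical, and the database lookup and expiry check are left out.

```python
def parse_bearer(header):
    """Return the token from an 'Authorization: Bearer <token>' value, or None."""
    if not header:
        return None
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def has_scope(granted: str, required: str) -> bool:
    """Scopes are space-separated, e.g. 'create update'."""
    return required in granted.split()


print(parse_bearer("Bearer abc123"))        # abc123
print(has_scope("create update", "delete")) # False
```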

### Data Layer

#### File Storage
**Location**: `data/notes/`
**Structure**: `YYYY/MM/slug.md`
**Format**: Pure markdown, no frontmatter
**Operations**:
- Atomic writes (temp file → rename)
- Directory creation (makedirs)
- Content reading (UTF-8 encoding)

**Example**:
```
data/notes/
├── 2024/
│   ├── 11/
│   │   ├── my-first-note.md
│   │   └── another-note.md
│   └── 12/
│       └── december-note.md
```

#### Database Storage
**Location**: `data/starpunk.db`
**Engine**: SQLite3
**Status**: ✅ IMPLEMENTED with automatic migration system (v0.9.0)

**Tables**:
- `notes` - Note metadata (slug, file_path, published, created_at, updated_at, deleted_at, content_hash)
- `sessions` - Admin auth sessions (session_token_hash, me, created_at, expires_at, last_used_at, user_agent, ip_address)
- `tokens` - Micropub bearer tokens (token, me, client_id, scope, created_at, expires_at) - **Table exists but unused**
- `auth_state` - CSRF state tokens (state, created_at, expires_at, redirect_uri, code_verifier)
- `schema_migrations` - Migration tracking (migration_name, applied_at) - **Added v0.9.0**

**Indexes**:
- `notes.created_at` (DESC) - Fast chronological queries
- `notes.published` - Fast published note filtering
- `notes.slug` (UNIQUE) - Fast lookup by slug, uniqueness enforcement
- `notes.deleted_at` - Fast soft-delete filtering
- `sessions.session_token_hash` (UNIQUE) - Fast auth checks
- `sessions.me` - Fast user lookups
- `auth_state.state` (UNIQUE) - Fast state token validation

**Migration System** (v0.9.0):
- Automatic schema updates on application startup
- Migration files in `migrations/` directory (SQL format)
- Executed in alphanumeric order (001, 002, 003...)
- Fresh database detection (marks migrations as applied without execution)
- Legacy database detection (applies pending migrations automatically)
- Migration tracking in schema_migrations table
- Fail-safe: Application refuses to start if migrations fail
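
A migration runner with these properties can be sketched in a few lines. This is an illustration under assumptions, not the project's runner: the function name `run_migrations`, the demo file names, and the column defaults are hypothetical, and fresh-database detection is omitted for brevity.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> list:
    """Apply pending .sql files in name order; return the names applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "migration_name TEXT PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT migration_name FROM schema_migrations")}
    newly_applied = []
    for path in sorted(migrations_dir.glob("*.sql")):
        if path.name in applied:
            continue
        # Any sqlite3.Error propagates, so the caller can refuse to start.
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_migrations (migration_name) VALUES (?)", (path.name,))
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied


# Demo with two hypothetical migration files.
tmp = Path(tempfile.mkdtemp())
(tmp / "001_init.sql").write_text("CREATE TABLE notes (slug TEXT);", encoding="utf-8")
(tmp / "002_add_hash.sql").write_text("ALTER TABLE notes ADD COLUMN content_hash TEXT;", encoding="utf-8")
conn = sqlite3.connect(":memory:")
first = run_migrations(conn, tmp)
second = run_migrations(conn, tmp)
```

Running it twice applies each file once: the second call finds both names in `schema_migrations` and does nothing.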

**Queries**: Direct SQL using Python sqlite3 module (no ORM)

## Data Flow Examples

### Creating a Note (via Admin Interface)

```
1. User fills out form at /admin/new
   ↓
2. POST to /admin/new with markdown content (HTML form; the /api/notes JSON API is not implemented)
   ↓
3. Verify user session (check session cookie)
   ↓
4. Generate unique slug from content or timestamp
   ↓
5. Determine file path: data/notes/2024/11/slug.md
   ↓
6. Create directories if needed (makedirs)
   ↓
7. Write markdown content to file (atomic write)
   ↓
8. Calculate SHA-256 hash of content
   ↓
9. Begin database transaction
   ↓
10. Insert record into notes table:
    - slug
    - file_path
    - published (from form)
    - created_at (now)
    - updated_at (now)
    - content_hash
    ↓
11. If database insert fails:
    - Delete file
    - Return error to user
    ↓
12. If database insert succeeds:
    - Commit transaction
    - Return success with note URL
    ↓
13. Redirect user to /admin (dashboard)
```

### Reading a Note (via Public Interface)

```
1. User visits /note/my-first-note
   ↓
2. Extract slug from URL
   ↓
3. Query database:
   SELECT file_path, created_at, published
   FROM notes
   WHERE slug = 'my-first-note' AND published = 1
   ↓
4. If not found → 404 error
   ↓
5. Read markdown content from file:
   - Open data/notes/2024/11/my-first-note.md
   - Read UTF-8 content
   ↓
6. Render markdown to HTML (markdown.markdown())
   ↓
7. Render Jinja2 template with:
   - content_html (rendered HTML)
   - created_at (timestamp)
   - slug (for permalink)
   ↓
8. Return HTML with microformats markup
```

### Publishing via Micropub (Planned, Not Yet Implemented)

```
1. Micropub client POSTs to /api/micropub
   Headers: Authorization: Bearer {token}
   Body: {"type": ["h-entry"], "properties": {"content": ["..."]}}
   ↓
2. Extract bearer token from Authorization header
   ↓
3. Query database:
   SELECT me, scope FROM tokens
   WHERE token = {token} AND expires_at > now()
   ↓
4. If token invalid → 401 Unauthorized
   ↓
5. Parse Micropub JSON payload
   ↓
6. Extract content from properties.content[0]
   ↓
7. Create note (same flow as admin interface):
   - Generate slug
   - Write file
   - Insert database record
   ↓
8. If successful:
   - Return 201 Created
   - Set Location header to note URL
   ↓
9. Client receives note URL, displays success
```

### IndieLogin Authentication Flow (v0.9.5 with PKCE)

```
1. User visits /auth/login
   ↓
2. User enters their website: https://alice.example.com
   ↓
3. POST to /auth/login with "me" parameter
   ↓
4. Validate URL format (must be https://)
   ↓
5. Generate PKCE code_verifier (43 random bytes, base64-url encoded)
   ↓
6. Generate code_challenge from code_verifier (SHA256 hash, base64-url encoded)
   ↓
7. Generate random state token (CSRF protection)
   ↓
8. Store state + code_verifier in auth_state table (5-minute expiry)
   ↓
9. Normalize client_id by adding trailing slash if missing (v0.9.1)
   ↓
10. Build IndieLogin authorization URL:
    https://indielogin.com/authorize?
      me=https://alice.example.com
      client_id=https://starpunk.example.com/  (note trailing slash)
      redirect_uri=https://starpunk.example.com/auth/callback
      state={random_state}
      code_challenge={code_challenge}
      code_challenge_method=S256
    ↓
11. Redirect user to IndieLogin
    ↓
12. IndieLogin verifies user's identity:
    - Checks rel="me" links on alice.example.com
    - Or sends email verification
    - User authenticates via chosen method
    ↓
13. IndieLogin redirects back:
    /auth/callback?code={auth_code}&state={state}
    ↓
14. Verify state matches stored value (CSRF check, single-use)
    ↓
15. Retrieve code_verifier from database using state
    ↓
16. Delete state token (single-use enforcement)
    ↓
17. Exchange code for verified identity (v0.9.4: uses /authorize, not /token):
    POST https://indielogin.com/authorize
      code={auth_code}
      client_id=https://starpunk.example.com/
      redirect_uri=https://starpunk.example.com/auth/callback
      code_verifier={code_verifier}
    ↓
18. IndieLogin returns: {"me": "https://alice.example.com"}
    ↓
19. Verify me == ADMIN_ME (config)
    ↓
20. If match:
    - Generate session token (secrets.token_urlsafe(32))
    - Hash token with SHA-256
    - Insert into sessions table with hash (not plaintext)
    - Set cookie "starpunk_session" (HttpOnly, Secure, SameSite=Lax)
    - Redirect to /admin
    ↓
21. If no match:
    - Return "Unauthorized" error
    - Log attempt with WARNING level
```

**Key Security Features**:
- PKCE prevents code interception attacks (v0.8.0)
- State tokens prevent CSRF (v0.4.0)
- Session token hashing prevents token exposure if database compromised (v0.4.0)
- Single-use state tokens (deleted after verification)
- Short-lived state tokens (5 minutes)
- Trailing slash normalization fixes client_id validation (v0.9.1)
- Correct endpoint usage (/authorize not /token) per IndieAuth spec (v0.9.4)

## Security Architecture

### Authentication Security

#### Session Management
- **Token Generation**: `secrets.token_urlsafe(32)` (256-bit entropy)
- **Storage**: SHA-256 hash stored in database (plaintext token NEVER stored)
- **Cookie Name**: `starpunk_session` (v0.5.1: renamed to avoid Flask session collision)
- **Cookies**: HttpOnly, Secure, SameSite=Lax
- **Expiry**: 30 days, extendable on use
- **Validation**: Every protected route checks session via `@require_auth` decorator
- **Metadata**: Tracks user_agent and ip_address for audit purposes
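
The hash-before-store scheme described above can be sketched with the standard library. The helper names `new_session` and `token_matches` are hypothetical, and the constant-time comparison via `hmac.compare_digest` is an addition of this sketch rather than something the document states.

```python
import hashlib
import hmac
import secrets


def new_session():
    """Return (cookie_token, stored_hash); only the hash goes into the database."""
    token = secrets.token_urlsafe(32)  # 256 bits of entropy
    token_hash = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, token_hash


def token_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented cookie value and compare it in constant time."""
    presented_hash = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(presented_hash, stored_hash)


token, stored = new_session()
```

A plain SHA-256 (rather than a slow password hash) is reasonable here because the token is already high-entropy random data, so brute-forcing the hash is not practical.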

#### CSRF Protection
- **State Tokens**: Random tokens for OAuth flows
- **Expiry**: 5 minutes (short-lived)
- **Single-Use**: Deleted after verification
- **SameSite**: Cookies set to Lax mode
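
Single-use, short-lived state tokens could be handled along these lines. This is a sketch with a simplified, assumed `auth_state` schema (three columns, epoch-second expiry) and hypothetical helper names; the real table also stores `created_at` and `redirect_uri`.

```python
import secrets
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auth_state (state TEXT PRIMARY KEY, code_verifier TEXT, expires_at REAL)"
)


def issue_state(code_verifier, ttl=300, now=None):
    """Store a random state token with its PKCE verifier; expires after ttl seconds."""
    now = time.time() if now is None else now
    state = secrets.token_urlsafe(32)
    conn.execute("INSERT INTO auth_state VALUES (?, ?, ?)", (state, code_verifier, now + ttl))
    conn.commit()
    return state


def consume_state(state, now=None):
    """Return the code_verifier once; None if unknown, expired, or already used."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT code_verifier, expires_at FROM auth_state WHERE state = ?", (state,)
    ).fetchone()
    # Delete unconditionally so a token can never be replayed.
    conn.execute("DELETE FROM auth_state WHERE state = ?", (state,))
    conn.commit()
    if row is None or row[1] < now:
        return None
    return row[0]
```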

#### Access Control
- **Admin Routes**: Require valid session
- **Micropub Routes**: Require valid bearer token
- **Public Routes**: No authentication needed
- **Identity Verification**: Only ADMIN_ME can authenticate

### Input Validation

#### User Input
- **Markdown**: Sanitize to prevent XSS in rendered HTML
- **URLs**: Validate format and scheme (https://)
- **Slugs**: Alphanumeric + hyphens only
- **JSON**: Parse and validate structure
- **File Paths**: Prevent directory traversal (validate against base path)
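
As an illustration of the slug rule above, here is a small sketch of slug generation and validation. The function names, the lowercase-only convention, and the 60-character limit are assumptions of this example, not documented StarPunk behavior.

```python
import re
import unicodedata


def slugify(text: str, max_length: int = 60) -> str:
    """Reduce arbitrary text to lowercase ASCII letters, digits, and hyphens."""
    normalized = unicodedata.normalize("NFKD", text)
    ascii_text = normalized.encode("ascii", "ignore").decode("ascii").lower()
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text).strip("-")
    return slug[:max_length].strip("-")


def is_valid_slug(slug: str) -> bool:
    """Accept only hyphen-separated alphanumeric runs (no slashes, dots, or spaces)."""
    return re.fullmatch(r"[a-z0-9]+(?:-[a-z0-9]+)*", slug) is not None


print(slugify("My First Note!"))        # my-first-note
print(is_valid_slug("../etc/passwd"))   # False
```

Validating slugs this strictly also helps with the path traversal concern, since a valid slug cannot contain `/` or `..`.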

#### Micropub Payloads
- **Content-Type**: Verify matches expected format
- **Required Fields**: Validate h-entry structure
- **Size Limits**: Prevent DoS via large payloads
- **Scope Verification**: Check token has required permissions

### Database Security

#### SQL Injection Prevention
- **Parameterized Queries**: Always use parameter substitution
- **No String Interpolation**: Never build SQL with f-strings
- **Input Sanitization**: Validate before database operations

Example:
```python
# GOOD
cursor.execute("SELECT * FROM notes WHERE slug = ?", (slug,))

# BAD (SQL injection vulnerable)
cursor.execute(f"SELECT * FROM notes WHERE slug = '{slug}'")
```

#### Data Integrity
- **Transactions**: Use for multi-step operations
- **Constraints**: UNIQUE on slugs, file_paths
- **Foreign Keys**: Enforce relationships (if applicable)
- **Content Hashing**: Detect unauthorized file modifications

### Network Security

#### HTTPS
- **Production Requirement**: TLS 1.2+ required
- **Reverse Proxy**: Nginx/Caddy handles SSL termination
- **Certificate Validation**: Verify SSL certs on outbound requests
- **HSTS**: Set Strict-Transport-Security header

#### Security Headers
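```python
# Set on all responses (assumes `app` is the Flask application object)
@app.after_request
def set_security_headers(response):
    response.headers["Content-Security-Policy"] = "default-src 'self'"
    response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-Content-Type-Options"] = "nosniff"
    response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
    return response
```
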
#### Rate Limiting
- **Implementation**: Reverse proxy (nginx/Caddy)
- **Admin Routes**: Stricter limits
- **API Routes**: Moderate limits
- **Public Routes**: Permissive limits

### File System Security

#### Atomic Operations
```python
import os

# Write to a temp file in the same directory, then atomically swap it into place
temp_path = f"{target_path}.tmp"
with open(temp_path, "w", encoding="utf-8") as f:
    f.write(content)
os.replace(temp_path, target_path)  # Atomic on POSIX; also overwrites on Windows
```

#### Path Validation
```python
import os

# Prevent directory traversal
base_path = os.path.abspath(DATA_PATH)
requested_path = os.path.abspath(os.path.join(base_path, user_input))
# commonpath avoids the prefix bug where "/data-evil" passes startswith("/data")
if os.path.commonpath([base_path, requested_path]) != base_path:
    raise SecurityError("Path traversal detected")  # application-defined exception
```

#### File Permissions
- **Data Directory**: 700 (owner only)
- **Database File**: 600 (owner read/write)
- **Note Files**: 600 (owner read/write)
- **Application User**: Dedicated non-root user

## Performance Considerations

### Response Time Targets
- **API Responses**: < 100ms (database + file read)
- **Page Renders**: < 200ms (template rendering)
- **RSS Feed**: < 300ms (query + file reads + XML generation)

### Optimization Strategies

#### Database
- **Indexes**: On frequently queried columns (created_at, slug, published)
- **Connections**: A single SQLite connection per worker process; no pooling layer is needed (single-user, little contention)
- **Query Optimization**: SELECT only needed columns
- **Prepared Statements**: Reuse compiled queries

#### File System
- **Caching**: Consider caching rendered HTML in memory (optional)
- **Directory Structure**: Year/Month prevents large directories
- **Atomic Reads**: Fast sequential reads, no locking needed

#### HTTP
- **Static Assets**: Cache headers on CSS/JS (1 year)
- **RSS Feed**: Cache for 5 minutes (Cache-Control)
- **Compression**: gzip/brotli via reverse proxy
- **ETags**: For conditional requests

#### Rendering
- **Template Compilation**: Jinja2 compiles templates automatically
- **Minimal Templating**: Simple templates render fast
- **Server-Side**: No client-side rendering overhead

### Resource Usage

#### Memory
- **Flask Process**: ~50MB base
- **SQLite**: ~10MB typical working set
- **Total**: < 100MB under normal load

#### Disk
- **Application**: ~5MB (code + dependencies)
- **Database**: ~1MB per 1000 notes
- **Notes**: ~5KB average per markdown file
- **Total**: Scales linearly with note count

#### CPU
- **Idle**: Near zero
- **Request Handling**: Minimal (no heavy processing)
- **Markdown Rendering**: Fast (pure Python)
- **Database Queries**: Indexed, sub-millisecond

## Deployment Architecture

**Current State**: ✅ IMPLEMENTED (v0.6.0 - v0.9.5)
**Technology**: Container-based with Gunicorn WSGI server
**CI/CD**: Gitea Actions automated builds (v0.9.5)

### Container Deployment (v0.6.0)

**Containerfile**: Multi-stage build using Python 3.11-slim base
- Stage 1: Build dependencies with uv package manager
- Stage 2: Production image with non-root user (starpunk:1000)
- Final size: ~174MB

**Features**:
- Health check endpoint: `/health` (validates database and filesystem)
- Gunicorn WSGI server with 4 workers (configurable)
- Log rotation (10MB max, 3 files)
- Resource limits (memory, CPU)
- SELinux compatibility (volume mount flags)
- Automatic database initialization on first run

**Container Orchestration**:
- Podman-compatible (rootless, userns=keep-id)
- Docker Compose compatible
- Volume mounts for data persistence (`./data:/app/data`)
- Port mapping (8080:8000)
- Environment variables for configuration

**CI/CD Pipeline** (v0.9.5):
- Gitea Actions workflow (.gitea/workflows/build-container.yml)
- Automated builds on push to main branch
- Manual trigger support
- Container registry push
- Docker and git dependencies installed
- Node.js support for GitHub Actions compatibility

### Single-Server Deployment

```
┌─────────────────────────────────────────────────┐
│                    Internet                     │
└────────────────┬────────────────────────────────┘
                 │
                 │ Port 443 (HTTPS)
                 ↓
┌─────────────────────────────────────────────────┐
│           Nginx/Caddy (Reverse Proxy)           │
│  - SSL/TLS termination                          │
│  - Static file serving                          │
│  - Rate limiting                                │
│  - Compression                                  │
└────────────────┬────────────────────────────────┘
                 │
                 │ Port 8000 (HTTP)
                 ↓
┌─────────────────────────────────────────────────┐
│             Gunicorn (WSGI Server)              │
│  - 4 worker processes                           │
│  - Process management                           │
│  - Load balancing (round-robin)                 │
└────────────────┬────────────────────────────────┘
                 │
                 │ WSGI
                 ↓
┌─────────────────────────────────────────────────┐
│                Flask Application                │
│  - Request handling                             │
│  - Business logic                               │
│  - Template rendering                           │
└────────────────┬────────────────────────────────┘
                 │
                 ↓
┌────────────────────────────┬────────────────────┐
│  File System               │  SQLite Database   │
│  data/notes/               │  data/starpunk.db  │
│    YYYY/MM/slug.md         │                    │
└────────────────────────────┴────────────────────┘
```

### Process Management (systemd)

```ini
[Unit]
Description=StarPunk CMS
After=network.target

[Service]
Type=notify
User=starpunk
WorkingDirectory=/opt/starpunk
Environment="PATH=/opt/starpunk/venv/bin"
ExecStart=/opt/starpunk/venv/bin/gunicorn -w 4 -b 127.0.0.1:8000 app:app
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

### Backup Strategy

#### Automated Daily Backup
```bash
#!/bin/bash
# backup.sh - Run daily via cron
set -euo pipefail

DATE=$(date +%Y%m%d)
BACKUP_DIR="/backup/starpunk"

# Backup data directory (notes + database)
mkdir -p "$BACKUP_DIR/$DATE"
rsync -av /opt/starpunk/data/ "$BACKUP_DIR/$DATE/"

# Keep last 30 days (-mindepth 1 so BACKUP_DIR itself is never matched)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
```

#### Manual Backup
```bash
# Simple copy
cp -r /opt/starpunk/data /backup/starpunk-$(date +%Y%m%d)

# Or with compression
tar -czf starpunk-backup-$(date +%Y%m%d).tar.gz /opt/starpunk/data
```

### Restore Process

1. Stop application: `sudo systemctl stop starpunk`
2. Restore data directory: `rsync -av /backup/starpunk/20241118/ /opt/starpunk/data/`
3. Fix permissions: `chown -R starpunk:starpunk /opt/starpunk/data`
4. Start application: `sudo systemctl start starpunk`
5. Verify: Visit site, check recent notes

## Testing Strategy

### Test Pyramid

```
              ┌─────────────┐
             /               \
            /  Manual Tests   \         Validation, Real Services
           /───────────────────\
          /                     \
         /   Integration Tests   \      API Flows, Database + Files
        /─────────────────────────\
       /                           \
      /         Unit Tests          \   Functions, Logic, Parsing
     /───────────────────────────────\
```

### Unit Tests (pytest)
|
|
**Coverage**: Business logic, utilities, models
|
|
**Examples**:
|
|
- Slug generation and uniqueness
|
|
- Markdown rendering with various inputs
|
|
- Content hash calculation
|
|
- File path validation
|
|
- Token generation and verification
|
|
- Date formatting for RSS
|
|
- Micropub payload parsing
|
|
|
|
### Integration Tests
|
|
**Coverage**: Component interactions, full flows
|
|
**Examples**:
|
|
- Create note: file write + database insert
|
|
- Read note: database query + file read
|
|
- IndieLogin flow with mocked API
|
|
- Micropub creation with token validation
|
|
- RSS feed generation with multiple notes
|
|
- Session authentication on protected routes
|
|
|
|
### End-to-End Tests
|
|
**Coverage**: Full user workflows
|
|
**Examples**:
|
|
- Admin login via IndieLogin (mocked)
|
|
- Create note via web interface
|
|
- Publish note via Micropub client (mocked)
|
|
- View note on public site
|
|
- Verify RSS feed includes note
|
|
|
|
### Validation Tests
|
|
**Coverage**: Standards compliance
|
|
**Tools**:
|
|
- W3C HTML Validator (validate templates)
|
|
- W3C Feed Validator (validate RSS output)
|
|
- IndieWebify.me (verify microformats)
|
|
- Micropub.rocks (test Micropub compliance)
|
|
|
|
### Manual Tests
|
|
**Coverage**: Real-world usage
|
|
**Examples**:
|
|
- Authenticate with real indielogin.com
|
|
- Publish from actual Micropub client (Quill, Indigenous)
|
|
- Subscribe to feed in actual RSS reader
|
|
- Browser compatibility (Chrome, Firefox, Safari, mobile)
|
|
- Accessibility with screen reader
|
|
|
|
## Monitoring and Observability
|
|
|
|
### Logging Strategy
|
|
|
|
#### Application Logs
|
|
```python
|
|
# Structured logging
|
|
import logging
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
# Info: Normal operations
|
|
logger.info("Note created", extra={
|
|
"slug": slug,
|
|
"published": published,
|
|
"user": session.me
|
|
})
|
|
|
|
# Warning: Recoverable issues
|
|
logger.warning("State token expired", extra={
|
|
"state": state,
|
|
"age": age_seconds
|
|
})
|
|
|
|
# Error: Failed operations
|
|
logger.error("File write failed", extra={
|
|
"path": file_path,
|
|
"error": str(e)
|
|
})
|
|
```
|
|
|
|
#### Log Levels
- **DEBUG**: Development only (verbose)
- **INFO**: Normal operations (note creation, auth success)
- **WARNING**: Unusual but handled (expired tokens, invalid input)
- **ERROR**: Failed operations (file I/O errors, database errors)
- **CRITICAL**: System failures (database unreachable)

#### Log Destinations
- **Development**: Console (stdout)
- **Production**: File rotation (logrotate) + optional syslog

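One way to wire up these destinations is sketched below. The function name `configure_logging`, the logger name `"starpunk"`, the default log path, and the format string are all assumptions. `WatchedFileHandler` is used for production because it reopens the file after logrotate moves it; it is intended for Unix-like systems.

```python
import logging
import logging.handlers
import sys


def configure_logging(env: str, log_path: str = "starpunk.log") -> logging.Logger:
    """Attach one handler to the (hypothetical) "starpunk" logger."""
    logger = logging.getLogger("starpunk")
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    if env == "production":
        # Reopens the file when logrotate renames or removes it.
        handler = logging.handlers.WatchedFileHandler(log_path)
        logger.setLevel(logging.INFO)
    else:
        # Development: verbose output on the console.
        handler = logging.StreamHandler(sys.stdout)
        logger.setLevel(logging.DEBUG)

    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Optional syslog output could be added the same way with `logging.handlers.SysLogHandler`.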
### Metrics (Optional for V2)

**Simple Metrics** (if desired):
- Note count (query database)
- Request count (nginx logs)
- Error rate (grep application logs)
- Response times (nginx logs)

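As a rough illustration of the error-rate metric, the sketch below counts level names in application log lines. It assumes each line contains the level as a whitespace-separated token (for example `2025-11-24 12:00:01 ERROR starpunk: ...`); the function name `error_rate` and that line format are assumptions, not StarPunk's actual log layout.

```python
from collections import Counter
from typing import Iterable

LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def error_rate(lines: Iterable[str]) -> float:
    """Return the fraction of recognised log lines at ERROR or CRITICAL."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            if token in LEVELS:
                counts[token] += 1
                break  # count only the first level name on each line
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["ERROR"] + counts["CRITICAL"]) / total
```

Lines without a recognised level (stack trace continuations, blank lines) are ignored rather than counted as successes.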
**Advanced Metrics** (V2):
- Prometheus exporter
- Grafana dashboard
- Alert on error rate spike

### Health Checks

```python
import os

@app.route('/health')
def health_check():
    """Simple health check for monitoring"""
    try:
        # Check database
        db.execute("SELECT 1").fetchone()

        # Check file system: os.path.exists() returns a bool rather than
        # raising, so the result must be tested explicitly
        if not os.path.exists(DATA_PATH):
            raise FileNotFoundError(f"Data path missing: {DATA_PATH}")

        return {"status": "ok"}, 200
    except Exception as e:
        return {"status": "error", "detail": str(e)}, 500
```

## Migration and Evolution

### V1 to V2 Migration

#### Database Schema Changes
```sql
-- Add new column with default
ALTER TABLE notes ADD COLUMN tags TEXT DEFAULT '';

-- Create new table
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);

-- Migration script updates existing notes
```

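A migration step along these lines could be made safe to re-run by checking the current schema first. The following is a sketch only: the function name `migrate_add_tags` is hypothetical, and it assumes a `notes` table already exists and that the standard-library `sqlite3` module is used directly.

```python
import sqlite3


def migrate_add_tags(conn: sqlite3.Connection) -> None:
    """Apply the tags schema changes; does nothing if already applied."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(notes)")}

    if "tags" not in columns:
        conn.execute("ALTER TABLE notes ADD COLUMN tags TEXT DEFAULT ''")

    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        "id INTEGER PRIMARY KEY, "
        "name TEXT UNIQUE NOT NULL)"
    )
    conn.commit()
```

Existing rows read back the column default (`''`) for `tags`, so no separate backfill is needed for this particular change.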
#### File Format Evolution
**V1**: Pure markdown
**V2** (if needed): Add optional frontmatter
```markdown
---
tags: indieweb, cms
---
Note content here
```

**Backward Compatibility**: Parser checks for frontmatter, falls back to pure markdown.

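The fallback behaviour can be sketched as follows. The function name `parse_note` and the simple `key: value` handling are assumptions; a real implementation might use a YAML parser instead.

```python
from typing import Dict, Tuple


def parse_note(text: str) -> Tuple[Dict[str, str], str]:
    """Split optional frontmatter from a note; return (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # V1 note: pure markdown

    # Find the closing delimiter after the opening one.
    end = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = i
            break
    if end is None:
        return {}, text  # no closing delimiter: treat as pure markdown

    metadata = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return metadata, body
```

One limitation of this sketch: a V1 note that opens with a `---` horizontal rule and contains a second `---` later would be misread as having frontmatter.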
#### API Versioning
```
# V1 (current)
GET /api/notes

# V2 (future)
GET /api/v2/notes    # New features
GET /api/notes       # Still works, returns V1 response
```

### Data Export/Import

#### Export Formats
1. **Markdown Bundle**: Zip of all notes (already portable)
2. **JSON Export**: Notes + metadata
   ```json
   {
     "version": "1.0",
     "exported_at": "2024-11-18T12:00:00Z",
     "notes": [
       {
         "slug": "my-note",
         "content": "Note content...",
         "created_at": "2024-11-01T12:00:00Z",
         "published": true
       }
     ]
   }
   ```
3. **RSS Archive**: Existing feed.xml

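A minimal sketch of producing the JSON export layout is given here. The function name `export_notes` is hypothetical, and it assumes each note is available as a mapping with `slug`, `content`, `created_at`, and `published` keys (for example, rows read from the database).

```python
import json
from datetime import datetime, timezone
from typing import Iterable, Mapping, Optional


def export_notes(notes: Iterable[Mapping], now: Optional[datetime] = None) -> str:
    """Serialise notes into the JSON export layout."""
    # Normalise the timestamp to UTC so the trailing "Z" is accurate.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    payload = {
        "version": "1.0",
        "exported_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "notes": [
            {
                "slug": n["slug"],
                "content": n["content"],
                "created_at": n["created_at"],
                # SQLite stores booleans as 0/1; emit a JSON boolean.
                "published": bool(n["published"]),
            }
            for n in notes
        ],
    }
    return json.dumps(payload, indent=2)
```

The Markdown bundle could be built along similar lines with the standard-library `zipfile` module.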
#### Import (V2)
- From JSON export
- From WordPress XML
- From markdown directory
- From other IndieWeb CMSs

## Implementation Status (v0.9.5)

### ✅ Fully Implemented Features

1. **Note Management** (v0.3.0)
   - Full CRUD operations (create, read, update, delete)
   - Hybrid file+database storage with sync
   - Soft and hard delete support
   - Markdown rendering
   - Slug generation with uniqueness

2. **Authentication** (v0.8.0)
   - IndieLogin.com OAuth 2.0 with PKCE
   - Session management with token hashing
   - CSRF protection with state tokens
   - Development mode authentication bypass

3. **Web Interface** (v0.5.2)
   - Public site: homepage and note permalinks
   - Admin dashboard with note management
   - Login/logout flows
   - Responsive design
   - Microformats2 markup (h-entry, h-card, h-feed)

4. **RSS Feed** (v0.6.0)
   - RSS 2.0 compliant feed generation
   - Auto-discovery links
   - Server-side caching
   - ETag support

5. **Container Deployment** (v0.6.0)
   - Multi-stage Containerfile
   - Gunicorn WSGI server
   - Health check endpoint
   - Volume persistence

6. **CI/CD Pipeline** (v0.9.5)
   - Gitea Actions workflow
   - Automated container builds
   - Registry push

7. **Database Migrations** (v0.9.0)
   - Automatic migration system
   - Fresh database detection
   - Legacy database migration
   - Migration tracking

8. **Development Tools**
   - uv package manager for Python
   - Comprehensive test suite (87% coverage)
   - Black code formatting
   - Flake8 linting

### ❌ Not Yet Implemented (Blocking V1)

1. **Micropub Endpoint**
   - POST /api/micropub for creating notes
   - GET /api/micropub?q=config
   - GET /api/micropub?q=source
   - Token validation
   - **Status**: Critical blocker for V1 release

2. **IndieAuth Token Endpoint**
   - Token issuance for Micropub clients
   - **Alternative**: May use external IndieAuth server

### ⚠️ Partially Implemented

1. **Standards Validation**
   - HTML5: Markup exists, not validated
   - Microformats: Markup exists, not validated
   - RSS: Validated and compliant
   - Micropub: N/A (not implemented)

2. **REST API** (Optional)
   - JSON API for notes CRUD
   - **Status**: Deferred to V2 (admin interface works without it)

## Success Metrics

The architecture is successful if it enables:

1. **Fast Development**: < 1 week to implement V1 - ✅ **ACHIEVED** (~35 hours, 70% complete)
2. **Easy Deployment**: < 5 minutes to get running - ✅ **ACHIEVED** (containerized)
3. **Low Maintenance**: Runs for months without intervention - ✅ **ACHIEVED** (automated migrations)
4. **High Performance**: All responses < 300ms - ✅ **ACHIEVED**
5. **Data Ownership**: User has direct access to all content - ✅ **ACHIEVED** (file-based storage)
6. **Standards Compliance**: Passes all validators - ⚠️ **PARTIAL** (RSS yes, others pending)
7. **Extensibility**: Can add V2 features without rewrite - ✅ **ACHIEVED** (migration system ready)

## References

### Internal Documentation
- [Technology Stack](/home/phil/Projects/starpunk/docs/architecture/technology-stack.md)
- [ADR-001: Python Web Framework](/home/phil/Projects/starpunk/docs/decisions/ADR-001-python-web-framework.md)
- [ADR-002: Flask Extensions](/home/phil/Projects/starpunk/docs/decisions/ADR-002-flask-extensions.md)
- [ADR-003: Frontend Technology](/home/phil/Projects/starpunk/docs/decisions/ADR-003-frontend-technology.md)
- [ADR-004: File-Based Storage](/home/phil/Projects/starpunk/docs/decisions/ADR-004-file-based-note-storage.md)
- [ADR-005: IndieLogin Authentication](/home/phil/Projects/starpunk/docs/decisions/ADR-005-indielogin-authentication.md)

### External Standards
- [IndieWeb](https://indieweb.org/)
- [IndieAuth Spec](https://www.w3.org/TR/indieauth/)
- [Micropub Spec](https://micropub.spec.indieweb.org/)
- [Microformats2](http://microformats.org/wiki/h-entry)
- [RSS 2.0](https://www.rssboard.org/rss-specification)
- [Flask Documentation](https://flask.palletsprojects.com/)