
StarPunk Architecture Overview

Version: v0.9.5 (2025-11-24)
Status: Pre-V1 Release (Micropub endpoint pending)

Executive Summary

StarPunk is a minimal, single-user IndieWeb CMS designed around the principle: "Every line of code must justify its existence." The architecture prioritizes simplicity, standards compliance, and user data ownership through careful technology selection and hybrid data storage.

Core Architecture: Flask web application with hybrid file+database storage, server-side rendering, delegated authentication (IndieLogin.com), and containerized deployment.

Technology Stack: Python 3.11, Flask, SQLite, Jinja2, Gunicorn, uv package manager
Deployment: Container-based (Podman/Docker) with automated CI/CD (Gitea Actions)
Authentication: IndieAuth via IndieLogin.com with PKCE security

System Architecture

High-Level Components

┌─────────────────────────────────────────────────────────────┐
│                         User Browser                         │
└───────────────┬─────────────────────────────────────────────┘
                │
                │ HTTP/HTTPS
                ↓
┌─────────────────────────────────────────────────────────────┐
│                      Flask Application                       │
│  ┌─────────────────────────────────────────────────────────┤
│  │ Web Interface (Jinja2 Templates)                         │
│  │  - Public: Homepage, Note Permalinks                     │
│  │  - Admin: Dashboard, Note Editor                         │
│  └──────────────────────────────┬──────────────────────────┘
│  ┌──────────────────────────────┴──────────────────────────┐
│  │ API Layer (RESTful + Micropub)                           │
│  │  - Notes CRUD API                                        │
│  │  - Micropub Endpoint                                     │
│  │  - RSS Feed Generator                                    │
│  │  - Authentication Handlers                               │
│  └──────────────────────────────┬──────────────────────────┘
│  ┌──────────────────────────────┴──────────────────────────┐
│  │ Business Logic                                           │
│  │  - Note Management (create, read, update, delete)        │
│  │  - File/Database Sync                                    │
│  │  - Markdown Rendering                                    │
│  │  - Slug Generation                                       │
│  │  - Session Management                                    │
│  └──────────────────────────────┬──────────────────────────┘
│  ┌──────────────────────────────┴──────────────────────────┐
│  │ Data Layer                                               │
│  │  ┌──────────────────┐    ┌─────────────────────────┐   │
│  │  │ File Storage     │    │ SQLite Database         │   │
│  │  │                  │    │                         │   │
│  │  │ Markdown Files   │    │ - Note Metadata         │   │
│  │  │ (Pure Content)   │    │ - Sessions              │   │
│  │  │                  │    │ - Tokens                │   │
│  │  │ data/notes/      │    │ - Auth State            │   │
│  │  │   YYYY/MM/       │    │                         │   │
│  │  │     slug.md      │    │ data/starpunk.db        │   │
│  │  └──────────────────┘    └─────────────────────────┘   │
│  └─────────────────────────────────────────────────────────┘
└─────────────────────────────────────────────────────────────┘
                │
                │ HTTPS
                ↓
┌─────────────────────────────────────────────────────────────┐
│               External Services                              │
│  - IndieLogin.com (Authentication)                           │
│  - User's Website (Identity Verification)                    │
│  - Micropub Clients (Publishing)                             │
└─────────────────────────────────────────────────────────────┘

Core Principles

1. Radical Simplicity

  • Total dependencies: 6 direct packages
  • No build tools, no npm, no bundlers
  • Server-side rendering eliminates frontend complexity
  • Single-file SQLite database
  • No configuration frameworks

2. Hybrid Data Architecture

Files for Content: Markdown notes stored as plain text files

  • Maximum portability
  • Human-readable
  • Direct user access
  • Easy backup (copy, rsync, git)

Database for Metadata: SQLite stores structured data

  • Fast queries and indexes
  • Referential integrity
  • Efficient filtering and sorting
  • Transaction support

Sync Strategy: Files are authoritative for content; database is authoritative for metadata. Both must stay in sync.

3. Standards-First Design

  • IndieWeb: Microformats2, IndieAuth, Micropub
  • Web: HTML5, RSS 2.0, HTTP standards
  • Security: OAuth 2.0, HTTPS, secure cookies
  • Data: CommonMark markdown

4. API-First Architecture

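The design goal is to expose all functionality via an API that the web interface consumes. As of v0.9.5 the JSON Notes API is not implemented and the admin interface posts HTML forms directly. Once in place, the API layer enables: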

  • Micropub client support
  • Future client applications
  • Scriptable automation
  • Clean separation of concerns

5. Progressive Enhancement

  • Core functionality works without JavaScript
  • JavaScript is reserved for optional enhancements (e.g., markdown preview, deferred to V2)
  • Server-side rendering for fast initial loads
  • Mobile-responsive from the start

Component Descriptions

Web Layer

Public Interface

Purpose: Display published notes to the world
Technology: Server-side rendered HTML (Jinja2)
Status: IMPLEMENTED (v0.5.0)

Routes (Implemented):

  • GET / - Homepage with recent published notes
  • GET /note/<slug> - Individual note permalink
  • GET /feed.xml - RSS 2.0 feed (v0.6.0)
  • GET /health - Health check endpoint (v0.6.0)

Features:

  • Microformats2 markup (h-entry, h-card, h-feed) - ⚠️ Not validated
  • Reverse chronological note list
  • Clean, minimal responsive CSS
  • Mobile-responsive
  • No JavaScript required

Admin Interface

Purpose: Manage notes (create, edit, publish)
Technology: Server-side rendered HTML (Jinja2)
Status: IMPLEMENTED (v0.5.2)

Routes (Implemented):

  • GET /auth/login - Login form (v0.9.2: moved from /admin/login)
  • POST /auth/login - Initiate IndieLogin OAuth flow
  • GET /auth/callback - Handle IndieLogin callback
  • POST /auth/logout - Logout and destroy session
  • GET /admin - Dashboard (list of all notes, published + drafts)
  • GET /admin/new - Create note form
  • POST /admin/new - Create note handler
  • GET /admin/edit/<slug> - Edit note form
  • POST /admin/edit/<slug> - Update note handler
  • POST /admin/delete/<slug> - Delete note handler

Development Routes (DEV_MODE only):

  • GET /dev/login - Development authentication bypass (v0.5.0)

Features:

  • Markdown editor (textarea)
  • No real-time preview (deferred to V2)
  • Publish/draft toggle
  • Protected by session authentication
  • Flash messages for feedback
  • Note: Authentication routes moved from /admin/* to /auth/* in v0.9.2

API Layer

Notes API

Purpose: RESTful CRUD operations for notes
Authentication: Session-based (admin interface)
Status: NOT IMPLEMENTED (Optional for V1, deferred to V2)

Planned Routes (Not Implemented):

GET    /api/notes           List published notes (JSON)
POST   /api/notes           Create new note (JSON)
GET    /api/notes/<slug>    Get single note (JSON)
PUT    /api/notes/<slug>    Update note (JSON)
DELETE /api/notes/<slug>    Delete note (JSON)

Current Workaround: The admin interface uses HTML forms (POST) rather than a JSON API.
Note: Not required for V1; the admin interface is fully functional without the REST API.

Micropub Endpoint

Purpose: Accept posts from external Micropub clients (Quill, Indigenous, etc.)
Authentication: IndieAuth bearer tokens
Status: NOT IMPLEMENTED (Critical blocker for V1)

Planned Routes (Not Implemented):

POST /api/micropub          Create note (h-entry)
GET  /api/micropub?q=config Query configuration
GET  /api/micropub?q=source Query note source by URL

Planned Content Types:

  • application/json
  • application/x-www-form-urlencoded

Target Compliance: Micropub specification
Current Status:

  • Token model exists in database
  • No endpoint implementation
  • No token validation logic
  • Will require IndieAuth token endpoint or external token service

RSS Feed

Purpose: Syndicate published notes
Technology: feedgen library
Status: IMPLEMENTED (v0.6.0)

Route: GET /feed.xml
Format: Valid RSS 2.0 XML
Caching: 5 minutes server-side (configurable via FEED_CACHE_SECONDS)
Features:

  • Limit to 50 most recent published notes (configurable via FEED_MAX_ITEMS)
  • RFC-822 date formatting (pubDate)
  • CDATA-wrapped HTML content for feed readers
  • Proper GUID for each item (note permalink)
  • Auto-discovery link in HTML templates
  • Cache-Control headers for client caching
  • ETag support for conditional requests
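
Two of the features above, RFC-822 dates and ETag-based conditional requests, can be sketched with the standard library alone. This is an illustration only: the document states that the feed is built with feedgen, and the function names below are hypothetical rather than taken from the StarPunk code base.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Optional


def rfc822_date(dt: datetime) -> str:
    """Format a datetime the way an RSS pubDate expects."""
    if dt.tzinfo is None:
        # Assumption: naive timestamps are stored in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return format_datetime(dt)


def feed_etag(feed_xml: str) -> str:
    """Derive a quoted ETag value from the feed body."""
    digest = hashlib.sha256(feed_xml.encode("utf-8")).hexdigest()
    return f'"{digest}"'


def is_not_modified(if_none_match: Optional[str], etag: str) -> bool:
    """Return True when the client's cached copy is current (respond 304)."""
    if not if_none_match:
        return False
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    return "*" in candidates or etag in candidates
```

A route handler would compare the request's If-None-Match header against `feed_etag(...)` and return 304 with an empty body when `is_not_modified` is true. Weak validators (the `W/` prefix) are not handled in this sketch.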

Business Logic Layer

Note Management

Operations:

  1. Create: Generate slug → write file → insert database record
  2. Read: Query database for path → read file → render markdown
  3. Update: Write file atomically → update database timestamp
  4. Delete: Mark deleted in database → optionally archive file

Key Components:

  • Slug generation (URL-safe, unique)
  • Markdown rendering (markdown library)
  • Content hashing (integrity verification)
  • Atomic file operations (prevent corruption)
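
Slug generation can be sketched as follows. The word limit, the "note" fallback, and the numeric-suffix strategy are assumptions made for illustration; the document only requires that slugs be URL-safe and unique.

```python
import re
import unicodedata


def slugify(text, max_words=5):
    """Build a URL-safe slug from the first few words of the note."""
    # Fold accented characters to ASCII, dropping anything that has no ASCII form.
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    # Hypothetical fallback when the content has no usable characters.
    return "-".join(words[:max_words]) or "note"


def unique_slug(base, existing):
    """Append a numeric suffix until the slug is not already taken."""
    if base not in existing:
        return base
    counter = 2
    while f"{base}-{counter}" in existing:
        counter += 1
    return f"{base}-{counter}"
```

Here `existing` is an in-memory set for simplicity; a real implementation would more likely check the UNIQUE index on notes.slug.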

File/Database Sync

Strategy: Write files first, then database
Rollback: If database operation fails, delete/restore file
Verification: Content hash detects external modifications
Integrity Check: Optional scan for orphaned files/records
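
The file-first strategy with rollback might look like the sketch below. The function name, its parameters, and the reduced column set are assumptions for illustration, not the project's actual code.

```python
import hashlib
import os
import sqlite3


def create_note(conn, notes_dir, rel_path, slug, content, published):
    """Write the markdown file first, then insert metadata; undo the file on failure."""
    full_path = os.path.join(notes_dir, rel_path)
    if os.path.exists(full_path):
        # Never overwrite (and later "roll back") a file that belongs to another note.
        raise FileExistsError(full_path)
    os.makedirs(os.path.dirname(full_path), exist_ok=True)

    # Step 1: atomic file write (temp file, then rename).
    temp_path = full_path + ".tmp"
    with open(temp_path, "w", encoding="utf-8", newline="") as f:
        f.write(content)
    os.replace(temp_path, full_path)

    # Step 2: database insert inside a transaction.
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO notes (slug, file_path, published, content_hash) "
                "VALUES (?, ?, ?, ?)",
                (slug, rel_path, int(published), content_hash),
            )
    except sqlite3.Error:
        os.remove(full_path)  # rollback: remove the now-orphaned file
        raise
    return content_hash
```

Writing the file before the database row means a crash between the two steps leaves an orphaned file rather than a record pointing at nothing, which the optional integrity scan can then detect.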

Authentication

Admin Auth: IndieLogin.com OAuth 2.0 flow with PKCE
Status: IMPLEMENTED (v0.8.0, refined through v0.9.5)

Flow:

  1. User enters website URL (their "me" identity)
  2. Generate PKCE code_verifier and code_challenge (SHA-256)
  3. Store state token + code_verifier in database (5 min expiry)
  4. Redirect to indielogin.com/authorize with:
    • client_id (SITE_URL with trailing slash)
    • redirect_uri (SITE_URL/auth/callback)
    • state (CSRF protection)
    • code_challenge + code_challenge_method (S256)
  5. IndieLogin.com verifies identity via RelMeAuth or email
  6. Callback to /auth/callback with code + state
  7. Verify state token (CSRF check)
  8. POST code + code_verifier to indielogin.com/authorize (NOT /token)
  9. Receive verified "me" URL
  10. Verify "me" matches ADMIN_ME config
  11. Create session with SHA-256 hashed token
  12. Store in HttpOnly, Secure, SameSite=Lax cookie named "starpunk_session"
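
Step 2 of the flow (PKCE with the S256 method) can be sketched as below. The function names are hypothetical, and the default of 43 random bytes is an assumption that follows the flow description later in this document; RFC 7636 only requires the resulting verifier to be 43 to 128 characters long.

```python
import base64
import hashlib
import secrets


def code_challenge_s256(code_verifier):
    """Compute BASE64URL(SHA256(code_verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def generate_pkce_pair(nbytes=43):
    """Return (code_verifier, code_challenge) for an authorization request."""
    code_verifier = secrets.token_urlsafe(nbytes)
    return code_verifier, code_challenge_s256(code_verifier)
```

The verifier stays server-side (stored alongside the state token) and only the challenge is sent in the redirect, so an intercepted authorization code cannot be redeemed without it.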

Security Features (v0.8.0-v0.9.5):

  • PKCE prevents authorization code interception
  • State tokens prevent CSRF attacks
  • Session token hashing (SHA-256) before database storage
  • Single-use state tokens with short expiry
  • Automatic trailing slash normalization on SITE_URL (v0.9.1)
  • Uses authorization endpoint (not token endpoint) per IndieAuth spec (v0.9.4)
  • Session cookie renamed to avoid Flask session collision (v0.5.1)

Development Mode (v0.5.0):

  • /dev/login bypasses IndieLogin for local development
  • Requires DEV_MODE=true and DEV_ADMIN_ME configuration
  • Shows warning in logs

Micropub Auth: IndieAuth token verification
Status: NOT IMPLEMENTED (Required for Micropub)

Planned Implementation:

  • Client obtains token via external IndieAuth token endpoint
  • Token sent as Bearer in Authorization header
  • Verify token exists in database and not expired
  • Check scope permissions (create, update, delete)
  • OR: Delegate token verification to external IndieAuth server

Data Layer

File Storage

Location: data/notes/
Structure: YYYY/MM/slug.md
Format: Pure markdown, no frontmatter
Operations:

  • Atomic writes (temp file → rename)
  • Directory creation (makedirs)
  • Content reading (UTF-8 encoding)

Example:

data/notes/
├── 2024/
│   ├── 11/
│   │   ├── my-first-note.md
│   │   └── another-note.md
│   └── 12/
│       └── december-note.md
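
The YYYY/MM/slug.md layout can be derived from a note's creation time. A minimal sketch, with a hypothetical helper name:

```python
from datetime import datetime
from pathlib import PurePosixPath


def note_relative_path(slug, created_at):
    """Return the note's path relative to data/notes/, e.g. 2024/11/slug.md."""
    return PurePosixPath(f"{created_at:%Y}", f"{created_at:%m}", f"{slug}.md")
```

For example, `note_relative_path("my-first-note", datetime(2024, 11, 5))` yields `2024/11/my-first-note.md`. The month is zero-padded so directories sort chronologically.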

Database Storage

Location: data/starpunk.db
Engine: SQLite3
Status: IMPLEMENTED with automatic migration system (v0.9.0)

Tables:

  • notes - Note metadata (slug, file_path, published, created_at, updated_at, deleted_at, content_hash)
  • sessions - Admin auth sessions (session_token_hash, me, created_at, expires_at, last_used_at, user_agent, ip_address)
  • tokens - Micropub bearer tokens (token, me, client_id, scope, created_at, expires_at) - Table exists but unused
  • auth_state - CSRF state tokens (state, created_at, expires_at, redirect_uri, code_verifier)
  • schema_migrations - Migration tracking (migration_name, applied_at) - Added v0.9.0

Indexes:

  • notes.created_at (DESC) - Fast chronological queries
  • notes.published - Fast published note filtering
  • notes.slug (UNIQUE) - Fast lookup by slug, uniqueness enforcement
  • notes.deleted_at - Fast soft-delete filtering
  • sessions.session_token_hash (UNIQUE) - Fast auth checks
  • sessions.me - Fast user lookups
  • auth_state.state (UNIQUE) - Fast state token validation

Migration System (v0.9.0):

  • Automatic schema updates on application startup
  • Migration files in migrations/ directory (SQL format)
  • Executed in alphanumeric order (001, 002, 003...)
  • Fresh database detection (marks migrations as applied without execution)
  • Legacy database detection (applies pending migrations automatically)
  • Migration tracking in schema_migrations table
  • Fail-safe: Application refuses to start if migrations fail
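
The core of such a migration runner can be sketched with the sqlite3 module. This is an illustrative simplification, not the project's implementation: the function name is hypothetical, and the fresh-database detection described above is omitted.

```python
import sqlite3
from pathlib import Path


def apply_migrations(conn, migrations_dir):
    """Apply pending .sql files in name order and record each one."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "migration_name TEXT PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {
        row[0] for row in conn.execute("SELECT migration_name FROM schema_migrations")
    }
    newly_applied = []
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        if path.name in applied:
            continue
        # Any sqlite3.Error propagates, so the caller can refuse to start.
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute(
            "INSERT INTO schema_migrations (migration_name) VALUES (?)", (path.name,)
        )
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied
```

Zero-padded prefixes (001, 002, ...) matter here because `sorted` compares file names as strings.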

Queries: Direct SQL using Python sqlite3 module (no ORM)

Data Flow Examples

Creating a Note (via Admin Interface)

1. User fills out form at /admin/new
   ↓
2. POST to /admin/new with markdown content (HTML form)
   ↓
3. Verify user session (check session cookie)
   ↓
4. Generate unique slug from content or timestamp
   ↓
5. Determine file path: data/notes/2024/11/slug.md
   ↓
6. Create directories if needed (makedirs)
   ↓
7. Write markdown content to file (atomic write)
   ↓
8. Calculate SHA-256 hash of content
   ↓
9. Begin database transaction
   ↓
10. Insert record into notes table:
    - slug
    - file_path
    - published (from form)
    - created_at (now)
    - updated_at (now)
    - content_hash
   ↓
11. If database insert fails:
    - Delete file
    - Return error to user
   ↓
12. If database insert succeeds:
    - Commit transaction
    - Return success with note URL
   ↓
13. Redirect user to /admin (dashboard)

Reading a Note (via Public Interface)

1. User visits /note/my-first-note
   ↓
2. Extract slug from URL
   ↓
3. Query database:
    SELECT file_path, created_at, published
    FROM notes
    WHERE slug = 'my-first-note' AND published = 1
   ↓
4. If not found → 404 error
   ↓
5. Read markdown content from file:
    - Open data/notes/2024/11/my-first-note.md
    - Read UTF-8 content
   ↓
6. Render markdown to HTML (markdown.markdown())
   ↓
7. Render Jinja2 template with:
    - content_html (rendered HTML)
    - created_at (timestamp)
    - slug (for permalink)
   ↓
8. Return HTML with microformats markup

Publishing via Micropub (Planned, Not Implemented)

1. Micropub client POSTs to /api/micropub
   Headers: Authorization: Bearer {token}
   Body: {"type": ["h-entry"], "properties": {"content": ["..."]}}
   ↓
2. Extract bearer token from Authorization header
   ↓
3. Query database:
    SELECT me, scope FROM tokens
    WHERE token = {token} AND expires_at > now()
   ↓
4. If token invalid → 401 Unauthorized
   ↓
5. Parse Micropub JSON payload
   ↓
6. Extract content from properties.content[0]
   ↓
7. Create note (same flow as admin interface):
    - Generate slug
    - Write file
    - Insert database record
   ↓
8. If successful:
    - Return 201 Created
    - Set Location header to note URL
   ↓
9. Client receives note URL, displays success
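
Steps 5 and 6 of this planned flow could be sketched as follows. The function name and error messages are hypothetical. Handling only plain-string content is a simplifying assumption, as Micropub also allows object values such as {"html": "..."}.

```python
def extract_micropub_content(payload):
    """Pull the note text out of a Micropub JSON h-entry payload."""
    if not isinstance(payload, dict) or payload.get("type") != ["h-entry"]:
        raise ValueError("only h-entry posts are supported")
    properties = payload.get("properties")
    contents = properties.get("content") if isinstance(properties, dict) else None
    if not isinstance(contents, list) or not contents:
        raise ValueError("properties.content is required")
    first = contents[0]
    if not isinstance(first, str) or not first.strip():
        raise ValueError("content must be a non-empty string")
    return first
```

A handler would translate `ValueError` into a 400 response and pass the returned text into the same note-creation path used by the admin interface.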

IndieLogin Authentication Flow (v0.9.5 with PKCE)

1. User visits /auth/login
   ↓
2. User enters their website: https://alice.example.com
   ↓
3. POST to /auth/login with "me" parameter
   ↓
4. Validate URL format (must be https://)
   ↓
5. Generate PKCE code_verifier (43 random bytes, base64-url encoded)
   ↓
6. Generate code_challenge from code_verifier (SHA-256 hash, base64-url encoded)
   ↓
7. Generate random state token (CSRF protection)
   ↓
8. Store state + code_verifier in auth_state table (5-minute expiry)
   ↓
9. Normalize client_id by adding trailing slash if missing (v0.9.1)
   ↓
10. Build IndieLogin authorization URL:
    https://indielogin.com/authorize?
      me=https://alice.example.com
      client_id=https://starpunk.example.com/  (note trailing slash)
      redirect_uri=https://starpunk.example.com/auth/callback
      state={random_state}
      code_challenge={code_challenge}
      code_challenge_method=S256
   ↓
11. Redirect user to IndieLogin
   ↓
12. IndieLogin verifies user's identity:
    - Checks rel="me" links on alice.example.com
    - Or sends email verification
    - User authenticates via chosen method
   ↓
13. IndieLogin redirects back:
    /auth/callback?code={auth_code}&state={state}
   ↓
14. Verify state matches stored value (CSRF check, single-use)
   ↓
15. Retrieve code_verifier from database using state
   ↓
16. Delete state token (single-use enforcement)
   ↓
17. Exchange code for verified identity (v0.9.4: uses /authorize, not /token):
    POST https://indielogin.com/authorize
      code={auth_code}
      client_id=https://starpunk.example.com/
      redirect_uri=https://starpunk.example.com/auth/callback
      code_verifier={code_verifier}
   ↓
18. IndieLogin returns: {"me": "https://alice.example.com"}
   ↓
19. Verify me == ADMIN_ME (config)
   ↓
20. If match:
    - Generate session token (secrets.token_urlsafe(32))
    - Hash token with SHA-256
    - Insert into sessions table with hash (not plaintext)
    - Set cookie "starpunk_session" (HttpOnly, Secure, SameSite=Lax)
    - Redirect to /admin
   ↓
21. If no match:
    - Return "Unauthorized" error
    - Log attempt with WARNING level
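
Steps 9 and 10 (trailing-slash normalization and building the authorization URL) can be sketched like this. The helper names are hypothetical, and the sketch includes only the parameters listed in step 10.

```python
from urllib.parse import urlencode


def normalize_client_id(site_url):
    """Ensure the client_id ends with a trailing slash (see v0.9.1)."""
    return site_url if site_url.endswith("/") else site_url + "/"


def build_authorization_url(me, site_url, state, code_challenge):
    """Assemble the IndieLogin redirect URL with properly encoded parameters."""
    client_id = normalize_client_id(site_url)
    params = {
        "me": me,
        "client_id": client_id,
        "redirect_uri": f"{client_id}auth/callback",
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return "https://indielogin.com/authorize?" + urlencode(params)
```

Using `urlencode` rather than string concatenation keeps the nested URLs in `me`, `client_id`, and `redirect_uri` correctly percent-encoded.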

Key Security Features:

  • PKCE prevents code interception attacks (v0.8.0)
  • State tokens prevent CSRF (v0.4.0)
  • Session token hashing prevents token exposure if database compromised (v0.4.0)
  • Single-use state tokens (deleted after verification)
  • Short-lived state tokens (5 minutes)
  • Trailing slash normalization fixes client_id validation (v0.9.1)
  • Correct endpoint usage (/authorize not /token) per IndieAuth spec (v0.9.4)

Security Architecture

Authentication Security

Session Management

  • Token Generation: secrets.token_urlsafe(32) (256-bit entropy)
  • Storage: SHA-256 hash stored in database (plaintext token NEVER stored)
  • Cookie Name: starpunk_session (v0.5.1: renamed to avoid Flask session collision)
  • Cookies: HttpOnly, Secure, SameSite=Lax
  • Expiry: 30 days, extendable on use
  • Validation: Every protected route checks session via @require_auth decorator
  • Metadata: Tracks user_agent and ip_address for audit purposes
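
The token generation and hashing described above can be sketched as follows. The function names are hypothetical; only `secrets.token_urlsafe(32)` and SHA-256 come from the document.

```python
import hashlib
import hmac
import secrets


def hash_token(token):
    """Return the SHA-256 hex digest that is stored in the sessions table."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def new_session_token():
    """Return (plaintext_token, token_hash); only the hash is persisted."""
    token = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits of entropy
    return token, hash_token(token)


def token_matches(presented_token, stored_hash):
    """Compare in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(hash_token(presented_token), stored_hash)
```

The plaintext token goes into the cookie and the hash into the database. A plain unsalted hash is reasonable here because the token is high-entropy random data, not a user-chosen password.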

CSRF Protection

  • State Tokens: Random tokens for OAuth flows
  • Expiry: 5 minutes (short-lived)
  • Single-Use: Deleted after verification
  • SameSite: Cookies set to Lax mode
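
Single-use, short-lived state tokens might be handled as in the sketch below. It assumes a simplified auth_state table holding only state, code_verifier, and an expires_at value in epoch seconds; the real table has more columns, and the function names are hypothetical.

```python
import secrets
import sqlite3
import time

STATE_TTL_SECONDS = 300  # 5 minutes


def create_state(conn, code_verifier, now=None):
    """Store a fresh state token together with its PKCE code_verifier."""
    now = time.time() if now is None else now
    state = secrets.token_urlsafe(32)
    with conn:
        conn.execute(
            "INSERT INTO auth_state (state, code_verifier, expires_at) VALUES (?, ?, ?)",
            (state, code_verifier, now + STATE_TTL_SECONDS),
        )
    return state


def consume_state(conn, state, now=None):
    """Return the code_verifier for a valid state and delete it; None otherwise."""
    now = time.time() if now is None else now
    with conn:
        row = conn.execute(
            "SELECT code_verifier, expires_at FROM auth_state WHERE state = ?",
            (state,),
        ).fetchone()
        # Delete unconditionally so a token can never be replayed.
        conn.execute("DELETE FROM auth_state WHERE state = ?", (state,))
    if row is None or row[1] <= now:
        return None
    return row[0]
```

The optional `now` parameter exists only to make expiry testable without sleeping.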

Access Control

  • Admin Routes: Require valid session
  • Micropub Routes: Require valid bearer token
  • Public Routes: No authentication needed
  • Identity Verification: Only ADMIN_ME can authenticate

Input Validation

User Input

  • Markdown: Sanitize to prevent XSS in rendered HTML
  • URLs: Validate format and scheme (https://)
  • Slugs: Alphanumeric + hyphens only
  • JSON: Parse and validate structure
  • File Paths: Prevent directory traversal (validate against base path)
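
The slug and URL rules above can be expressed as small validators. This is a sketch with hypothetical names. Restricting slugs to lowercase and disallowing leading, trailing, or doubled hyphens are assumptions that go slightly beyond "alphanumeric + hyphens only".

```python
import re
from urllib.parse import urlsplit

# Assumption: lowercase alphanumerics separated by single hyphens.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug):
    """Accept only slugs that cannot contain path separators or dots."""
    return bool(SLUG_PATTERN.fullmatch(slug))


def is_valid_me_url(url):
    """Require an https:// URL with a host component."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme == "https" and bool(parts.hostname)
```

Because the slug pattern excludes `/` and `.`, a validated slug cannot be used for directory traversal when it is later joined into a file path.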

Micropub Payloads

  • Content-Type: Verify matches expected format
  • Required Fields: Validate h-entry structure
  • Size Limits: Prevent DoS via large payloads
  • Scope Verification: Check token has required permissions

Database Security

SQL Injection Prevention

  • Parameterized Queries: Always use parameter substitution
  • No String Interpolation: Never build SQL with f-strings
  • Input Sanitization: Validate before database operations

Example:

# GOOD
cursor.execute("SELECT * FROM notes WHERE slug = ?", (slug,))

# BAD (SQL injection vulnerable)
cursor.execute(f"SELECT * FROM notes WHERE slug = '{slug}'")

Data Integrity

  • Transactions: Use for multi-step operations
  • Constraints: UNIQUE on slugs, file_paths
  • Foreign Keys: Enforce relationships (if applicable)
  • Content Hashing: Detect unauthorized file modifications
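
Detecting external modifications by content hash can be sketched as below. The function names are hypothetical, and hashing the raw bytes on disk is an assumption about how the stored content_hash was computed.

```python
import hashlib
from pathlib import Path


def file_content_hash(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_modified_externally(path, stored_hash):
    """True when the file no longer matches the hash recorded in the database."""
    return file_content_hash(path) != stored_hash
```

An integrity scan could run this over every row in the notes table and report mismatches, leaving the decision to re-sync or restore to the user.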

Network Security

HTTPS

  • Production Requirement: TLS 1.2+ required
  • Reverse Proxy: Nginx/Caddy handles SSL termination
  • Certificate Validation: Verify SSL certs on outbound requests
  • HSTS: Set Strict-Transport-Security header

Security Headers

# Set on all responses
Content-Security-Policy: default-src 'self'
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
Referrer-Policy: strict-origin-when-cross-origin
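
One framework-neutral way to set these headers on every response is a small WSGI middleware. The class below is a hypothetical sketch, not StarPunk's code; the headers could equally be set in the reverse proxy.

```python
SECURITY_HEADERS = [
    ("Content-Security-Policy", "default-src 'self'"),
    ("X-Frame-Options", "DENY"),
    ("X-Content-Type-Options", "nosniff"),
    ("Referrer-Policy", "strict-origin-when-cross-origin"),
]


class SecurityHeadersMiddleware:
    """Wrap a WSGI app and append the security headers to each response."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def start_with_headers(status, headers, exc_info=None):
            # Do not duplicate a header the application already set.
            present = {name.lower() for name, _ in headers}
            extra = [(n, v) for n, v in SECURITY_HEADERS if n.lower() not in present]
            return start_response(status, list(headers) + extra, exc_info)

        return self.app(environ, start_with_headers)
```

With Flask, the wrapper would typically be installed as `app.wsgi_app = SecurityHeadersMiddleware(app.wsgi_app)`.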

Rate Limiting

  • Implementation: Reverse proxy (nginx/Caddy)
  • Admin Routes: Stricter limits
  • API Routes: Moderate limits
  • Public Routes: Permissive limits

File System Security

Atomic Operations

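# Write to a temp file, then atomically replace the target
import os

temp_path = f"{target_path}.tmp"
with open(temp_path, "w", encoding="utf-8") as f:
    f.write(content)
    f.flush()
    os.fsync(f.fileno())  # Make sure the data is on disk before the rename
os.replace(temp_path, target_path)  # Atomic on POSIX; also overwrites on Windows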

Path Validation

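# Prevent directory traversal
import os

base_path = os.path.realpath(DATA_PATH)  # realpath also resolves symlinks
requested_path = os.path.realpath(os.path.join(base_path, user_input))
# Compare whole path components: a plain prefix check would accept
# "/data-evil" when the base is "/data"
if os.path.commonpath([base_path, requested_path]) != base_path:
    raise SecurityError("Path traversal detected")  # application-defined exception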

File Permissions

  • Data Directory: 700 (owner only)
  • Database File: 600 (owner read/write)
  • Note Files: 600 (owner read/write)
  • Application User: Dedicated non-root user

Performance Considerations

Response Time Targets

  • API Responses: < 100ms (database + file read)
  • Page Renders: < 200ms (template rendering)
  • RSS Feed: < 300ms (query + file reads + XML generation)

Optimization Strategies

Database

  • Indexes: On frequently queried columns (created_at, slug, published)
  • Connection Pooling: None needed; a single connection suffices (single-user, no contention)
  • Query Optimization: SELECT only needed columns
  • Prepared Statements: Reuse compiled queries

File System

  • Caching: Consider caching rendered HTML in memory (optional)
  • Directory Structure: Year/Month prevents large directories
  • Atomic Reads: Fast sequential reads, no locking needed

HTTP

  • Static Assets: Cache headers on CSS/JS (1 year)
  • RSS Feed: Cache for 5 minutes (Cache-Control)
  • Compression: gzip/brotli via reverse proxy
  • ETags: For conditional requests

Rendering

  • Template Compilation: Jinja2 compiles templates automatically
  • Minimal Templating: Simple templates render fast
  • Server-Side: No client-side rendering overhead

Resource Usage

Memory

  • Flask Process: ~50MB base
  • SQLite: ~10MB typical working set
  • Total: < 100MB under normal load

Disk

  • Application: ~5MB (code + dependencies)
  • Database: ~1MB per 1000 notes
  • Notes: ~5KB average per markdown file
  • Total: Scales linearly with note count

CPU

  • Idle: Near zero
  • Request Handling: Minimal (no heavy processing)
  • Markdown Rendering: Fast (pure Python)
  • Database Queries: Indexed, sub-millisecond

Deployment Architecture

Current State: IMPLEMENTED (v0.6.0 - v0.9.5)
Technology: Container-based with Gunicorn WSGI server
CI/CD: Gitea Actions automated builds (v0.9.5)

Container Deployment (v0.6.0)

Containerfile: Multi-stage build using Python 3.11-slim base

  • Stage 1: Build dependencies with uv package manager
  • Stage 2: Production image with non-root user (starpunk:1000)
  • Final size: ~174MB

Features:

  • Health check endpoint: /health (validates database and filesystem)
  • Gunicorn WSGI server with 4 workers (configurable)
  • Log rotation (10MB max, 3 files)
  • Resource limits (memory, CPU)
  • SELinux compatibility (volume mount flags)
  • Automatic database initialization on first run

Container Orchestration:

  • Podman-compatible (rootless, userns=keep-id)
  • Docker Compose compatible
  • Volume mounts for data persistence (./data:/app/data)
  • Port mapping (8080:8000)
  • Environment variables for configuration

CI/CD Pipeline (v0.9.5):

  • Gitea Actions workflow (.gitea/workflows/build-container.yml)
  • Automated builds on push to main branch
  • Manual trigger support
  • Container registry push
  • Docker and git dependencies installed
  • Node.js support for GitHub Actions compatibility

Single-Server Deployment

┌─────────────────────────────────────────────────┐
│ Internet                                        │
└────────────────┬────────────────────────────────┘
                 │
                 │ Port 443 (HTTPS)
                 ↓
┌─────────────────────────────────────────────────┐
│ Nginx/Caddy (Reverse Proxy)                     │
│  - SSL/TLS termination                          │
│  - Static file serving                          │
│  - Rate limiting                                │
│  - Compression                                  │
└────────────────┬────────────────────────────────┘
                 │
                 │ Port 8000 (HTTP)
                 ↓
┌─────────────────────────────────────────────────┐
│ Gunicorn (WSGI Server)                          │
│  - 4 worker processes                           │
│  - Process management                           │
│  - Load balancing (round-robin)                 │
└────────────────┬────────────────────────────────┘
                 │
                 │ WSGI
                 ↓
┌─────────────────────────────────────────────────┐
│ Flask Application                               │
│  - Request handling                             │
│  - Business logic                               │
│  - Template rendering                           │
└────────────────┬────────────────────────────────┘
                 │
                 ↓
┌────────────────────────────┬────────────────────┐
│ File System                │ SQLite Database    │
│  data/notes/               │  data/starpunk.db  │
│    YYYY/MM/slug.md         │                    │
└────────────────────────────┴────────────────────┘

Process Management (systemd)

[Unit]
Description=StarPunk CMS
After=network.target

[Service]
Type=notify
User=starpunk
WorkingDirectory=/opt/starpunk
Environment="PATH=/opt/starpunk/venv/bin"
ExecStart=/opt/starpunk/venv/bin/gunicorn -w 4 -b 127.0.0.1:8000 app:app
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Backup Strategy

Automated Daily Backup

#!/bin/bash
# backup.sh - Run daily via cron

DATE=$(date +%Y%m%d)
BACKUP_DIR="/backup/starpunk"

# Backup data directory (notes + database)
rsync -av /opt/starpunk/data/ "$BACKUP_DIR/$DATE/"

# Keep last 30 days (-mindepth 1 ensures $BACKUP_DIR itself is never matched)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} \;

Manual Backup

# Simple copy
cp -r /opt/starpunk/data /backup/starpunk-$(date +%Y%m%d)

# Or with compression
tar -czf starpunk-backup-$(date +%Y%m%d).tar.gz /opt/starpunk/data

Restore Process

  1. Stop application: sudo systemctl stop starpunk
  2. Restore data directory: rsync -av /backup/starpunk/20241118/ /opt/starpunk/data/
  3. Fix permissions: chown -R starpunk:starpunk /opt/starpunk/data
  4. Start application: sudo systemctl start starpunk
  5. Verify: Visit site, check recent notes

Testing Strategy

Test Pyramid

           ┌─────────────┐
          /               \
         /   Manual Tests  \      Validation, Real Services
        /─────────────────  \
       /                     \
      /  Integration Tests    \   API Flows, Database + Files
     /───────────────────────  \
    /                           \
   /        Unit Tests            \  Functions, Logic, Parsing
  /───────────────────────────────\

Unit Tests (pytest)

Coverage: Business logic, utilities, models
Examples:

  • Slug generation and uniqueness
  • Markdown rendering with various inputs
  • Content hash calculation
  • File path validation
  • Token generation and verification
  • Date formatting for RSS
  • Micropub payload parsing

Integration Tests

Coverage: Component interactions, full flows
Examples:

  • Create note: file write + database insert
  • Read note: database query + file read
  • IndieLogin flow with mocked API
  • Micropub creation with token validation
  • RSS feed generation with multiple notes
  • Session authentication on protected routes

End-to-End Tests

Coverage: Full user workflows
Examples:

  • Admin login via IndieLogin (mocked)
  • Create note via web interface
  • Publish note via Micropub client (mocked)
  • View note on public site
  • Verify RSS feed includes note

Validation Tests

Coverage: Standards compliance
Tools:

  • W3C HTML Validator (validate templates)
  • W3C Feed Validator (validate RSS output)
  • IndieWebify.me (verify microformats)
  • Micropub.rocks (test Micropub compliance)

Manual Tests

Coverage: Real-world usage
Examples:

  • Authenticate with real indielogin.com
  • Publish from actual Micropub client (Quill, Indigenous)
  • Subscribe to feed in actual RSS reader
  • Browser compatibility (Chrome, Firefox, Safari, mobile)
  • Accessibility with screen reader

Monitoring and Observability

Logging Strategy

Application Logs

# Structured logging
import logging

logger = logging.getLogger(__name__)

# Info: Normal operations
logger.info("Note created", extra={
    "slug": slug,
    "published": published,
    "user": session.me
})

# Warning: Recoverable issues
logger.warning("State token expired", extra={
    "state": state,
    "age": age_seconds
})

# Error: Failed operations
logger.error("File write failed", extra={
    "path": file_path,
    "error": str(e)
})

Log Levels

  • DEBUG: Development only (verbose)
  • INFO: Normal operations (note creation, auth success)
  • WARNING: Unusual but handled (expired tokens, invalid input)
  • ERROR: Failed operations (file I/O errors, database errors)
  • CRITICAL: System failures (database unreachable)

Log Destinations

  • Development: Console (stdout)
  • Production: File rotation (logrotate) + optional syslog

Metrics (Optional for V2)

Simple Metrics (if desired):

  • Note count (query database)
  • Request count (nginx logs)
  • Error rate (grep application logs)
  • Response times (nginx logs)

Advanced Metrics (V2):

  • Prometheus exporter
  • Grafana dashboard
  • Alert on error rate spike

Health Checks

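@app.route('/health')
def health_check():
    """Simple health check for monitoring"""
    try:
        # Check database
        db.execute("SELECT 1").fetchone()

        # Check file system: os.path.isdir returns a bool rather than raising,
        # so the result has to be tested explicitly
        if not os.path.isdir(DATA_PATH):
            raise RuntimeError("data directory is missing")

        return {"status": "ok"}, 200
    except Exception as e:
        return {"status": "error", "detail": str(e)}, 500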

Migration and Evolution

V1 to V2 Migration

Database Schema Changes

-- Add new column with default
ALTER TABLE notes ADD COLUMN tags TEXT DEFAULT '';

-- Create new table
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);

-- Migration script updates existing notes

File Format Evolution

V1: Pure markdown
V2 (if needed): Add optional frontmatter

---
tags: indieweb, cms
---
Note content here

Backward Compatibility: Parser checks for frontmatter, falls back to pure markdown.
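
A backward-compatible parser along these lines could look like the sketch below. The function name is hypothetical, and the simple "key: value" parsing is an assumption; a real V2 implementation might use a YAML library instead.

```python
def split_frontmatter(text):
    """Return (metadata, body); metadata is {} for pure-markdown (V1) notes."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return {}, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            metadata = {}
            for line in lines[1:index]:
                key, sep, value = line.partition(":")
                if sep:
                    metadata[key.strip()] = value.strip()
            return metadata, "".join(lines[index + 1:])
    # No closing delimiter: treat the whole file as plain markdown.
    return {}, text
```

One caveat: a V1 note that happens to begin with a `---` horizontal rule and contains another `---` later would be misread as having frontmatter, so a migration would need to account for that case.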

API Versioning

# V1 (planned; not implemented as of v0.9.5)
GET /api/notes

# V2 (future)
GET /api/v2/notes  # New features
GET /api/notes     # Still works, returns V1 response

Data Export/Import

Export Formats

  1. Markdown Bundle: Zip of all notes (already portable)
  2. JSON Export: Notes + metadata
    {
      "version": "1.0",
      "exported_at": "2024-11-18T12:00:00Z",
      "notes": [
        {
          "slug": "my-note",
          "content": "Note content...",
          "created_at": "2024-11-01T12:00:00Z",
          "published": true
        }
      ]
    }
    
  3. RSS Archive: Existing feed.xml
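
Producing the JSON export format shown above can be sketched as follows. The function name and the shape of the input (a list of dicts with these four keys) are assumptions for illustration.

```python
import json
from datetime import datetime, timezone


def export_notes(notes, exported_at=None):
    """Serialize notes and metadata into the JSON export format."""
    exported_at = exported_at or datetime.now(timezone.utc)
    document = {
        "version": "1.0",
        "exported_at": exported_at.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        ),
        "notes": [
            {
                "slug": note["slug"],
                "content": note["content"],
                "created_at": note["created_at"],
                "published": bool(note["published"]),
            }
            for note in notes
        ],
    }
    # ensure_ascii=False keeps non-ASCII note content human-readable.
    return json.dumps(document, indent=2, ensure_ascii=False)
```

Because SQLite stores booleans as 0/1, the explicit `bool(...)` keeps the exported `published` field a JSON true/false.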

Import (V2)

  • From JSON export
  • From WordPress XML
  • From markdown directory
  • From other IndieWeb CMSs

Implementation Status (v0.9.5)

Fully Implemented Features

  1. Note Management (v0.3.0)

    • Full CRUD operations (create, read, update, delete)
    • Hybrid file+database storage with sync
    • Soft and hard delete support
    • Markdown rendering
    • Slug generation with uniqueness
  2. Authentication (v0.8.0)

    • IndieLogin.com OAuth 2.0 with PKCE
    • Session management with token hashing
    • CSRF protection with state tokens
    • Development mode authentication bypass
  3. Web Interface (v0.5.2)

    • Public site: homepage and note permalinks
    • Admin dashboard with note management
    • Login/logout flows
    • Responsive design
    • Microformats2 markup (h-entry, h-card, h-feed)
  4. RSS Feed (v0.6.0)

    • RSS 2.0 compliant feed generation
    • Auto-discovery links
    • Server-side caching
    • ETag support
  5. Container Deployment (v0.6.0)

    • Multi-stage Containerfile
    • Gunicorn WSGI server
    • Health check endpoint
    • Volume persistence
  6. CI/CD Pipeline (v0.9.5)

    • Gitea Actions workflow
    • Automated container builds
    • Registry push
  7. Database Migrations (v0.9.0)

    • Automatic migration system
    • Fresh database detection
    • Legacy database migration
    • Migration tracking
  8. Development Tools

    • uv package manager for Python
    • Comprehensive test suite (87% coverage)
    • Black code formatting
    • Flake8 linting

Not Yet Implemented (Blocking V1)

  1. Micropub Endpoint

    • POST /api/micropub for creating notes
    • GET /api/micropub?q=config
    • GET /api/micropub?q=source
    • Token validation
    • Status: Critical blocker for V1 release
  2. IndieAuth Token Endpoint

    • Token issuance for Micropub clients
    • Alternative: May use external IndieAuth server

⚠️ Partially Implemented

  1. Standards Validation

    • HTML5: Markup exists, not validated
    • Microformats: Markup exists, not validated
    • RSS: Validated and compliant
    • Micropub: N/A (not implemented)
  2. REST API (Optional)

    • JSON API for notes CRUD
    • Status: Deferred to V2 (admin interface works without it)

Success Metrics

The architecture is successful if it enables:

  1. Fast Development: < 1 week to implement V1 - ON TRACK (~35 hours so far, 70% complete)
  2. Easy Deployment: < 5 minutes to get running - ACHIEVED (containerized)
  3. Low Maintenance: Runs for months without intervention - ACHIEVED (automated migrations)
  4. High Performance: All responses < 300ms - ACHIEVED
  5. Data Ownership: User has direct access to all content - ACHIEVED (file-based storage)
  6. Standards Compliance: Passes all validators - ⚠️ PARTIAL (RSS yes, others pending)
  7. Extensibility: Can add V2 features without rewrite - ACHIEVED (migration system ready)

References

Internal Documentation

External Standards