# System Architecture Overview

## Project Context

Gondulf is a self-hosted IndieAuth server implementing the W3C IndieAuth specification. It enables users to use their own domain as their identity when authenticating to third-party applications, providing a decentralized alternative to centralized authentication providers.

### Key Differentiators

- **Email-based authentication**: v1.0.0 uses email verification for domain ownership
- **No client pre-registration**: Clients validate themselves through domain ownership verification
- **Simplicity-first**: Minimal complexity, production-ready MVP
- **Single-admin model**: Designed for individual operators, not multi-tenancy

## Technology Stack

### Core Platform

- **Language**: Python 3.10+
- **Web Framework**: FastAPI 0.104+
  - Chosen for: Native async/await, type hints, OAuth 2.0 support, automatic OpenAPI docs
  - See: `/docs/decisions/ADR-001-python-framework-selection.md`
- **ASGI Server**: uvicorn with standard extras
- **Data Validation**: Pydantic 2.0+ (installed as a FastAPI dependency)

### Data Storage

- **Primary Database**: SQLite 3.35+
  - Sufficient for tens of users
  - Simple file-based backups
  - No separate database server required
- **Database Interface**: SQLAlchemy Core (NOT ORM)
  - Direct SQL-like interface without ORM complexity
  - Explicit queries, no hidden behavior
  - Simple schema management

### Session/State Storage (v1.0.0)

- **In-Memory Storage**: Python dictionaries with TTL management
- **Rationale**:
  - No Redis in v1.0.0 per user requirements
  - Authorization codes are short-lived (10 minutes max)
  - Single-process deployment acceptable for MVP
  - Upgrade path: Redis can be added later without code changes if persistence is needed

### Development Environment

- **Package Manager**: uv (Astral's Rust-based tool)
  - See: `/docs/decisions/ADR-002-uv-environment-management.md`
  - Direct execution model (no environment activation)
- **Linting**: Ruff + flake8
- **Type Checking**: mypy (strict mode)
- **Formatting**: Black (88 character line length)
- **Testing**: pytest with async, coverage, mocking

## System Architecture

### Component Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│                       Client Application                        │
│                 (Third-party IndieAuth client)                  │
└───────────────────────────┬─────────────────────────────────────┘
                            │ HTTPS
                            │ IndieAuth Protocol
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Gondulf IndieAuth Server                     │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                    FastAPI Application                     │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────────┐   │ │
│  │  │ Authorization│  │    Token     │  │    Metadata     │   │ │
│  │  │   Endpoint   │  │   Endpoint   │  │    Endpoint     │   │ │
│  │  │  /authorize  │  │    /token    │  │  /.well-known   │   │ │
│  │  └──────┬───────┘  └──────┬───────┘  └────────┬────────┘   │ │
│  │         │                 │                   │            │ │
│  │         └─────────────────┼───────────────────┘            │ │
│  │                           │                                │ │
│  │ ┌─────────────────────────▼──────────────────────────────┐ │ │
│  │ │                  Business Logic Layer                  │ │ │
│  │ │  ┌───────────────┐  ┌────────────┐  ┌──────────────┐   │ │ │
│  │ │  │  AuthService  │  │TokenService│  │DomainService │   │ │ │
│  │ │  │ - Auth flow   │  │ - Token    │  │ - Domain     │   │ │ │
│  │ │  │ - Email send  │  │   creation │  │   validation │   │ │ │
│  │ │  │ - Code gen    │  │ - Token    │  │ - TXT record │   │ │ │
│  │ │  │               │  │   verify   │  │   check      │   │ │ │
│  │ │  └───────────────┘  └────────────┘  └──────────────┘   │ │ │
│  │ └─────────────────────────┬──────────────────────────────┘ │ │
│  │                           │                                │ │
│  │ ┌─────────────────────────▼──────────────────────────────┐ │ │
│  │ │                     Storage Layer                      │ │ │
│  │ │   ┌──────────────────┐   ┌────────────────────────┐    │ │ │
│  │ │   │ SQLite Database  │   │ In-Memory Store        │    │ │ │
│  │ │   │ - Tokens         │   │ - Auth codes (10min)   │    │ │ │
│  │ │   │ - Domains        │   │ - Email codes (15min)  │    │ │ │
│  │ │   └──────────────────┘   └────────────────────────┘    │ │ │
│  │ └────────────────────────────────────────────────────────┘ │ │
│  └────────────────────────────────────────────────────────────┘ │
└──────────┬──────────────────────────────────────┬───────────────┘
           │ SMTP                                 │ DNS
           ▼                                      ▼
   ┌────────────────┐                    ┌──────────────────┐
   │  Email Server  │                    │   DNS Provider   │
   │   (external)   │                    │    (external)    │
   └────────────────┘                    └──────────────────┘
```

### Component Responsibilities

#### HTTP Endpoints Layer

Handles all HTTP concerns:

- Request validation (Pydantic models)
- Parameter parsing and type coercion
- HTTP response formatting
- Error responses (OAuth 2.0 compliant)
- CORS headers
- Rate limiting (future)

#### Business Logic Layer (Services)

Contains all domain logic, completely independent of HTTP:

**AuthService**:

- Authorization flow orchestration
- Email verification code generation and validation
- Authorization code generation (cryptographically secure)
- User consent management
- PKCE support (future)

**TokenService**:

- Access token generation (JWT or opaque)
- Token validation and introspection
- Token revocation (future)
- Token refresh (future)

**DomainService**:

- Domain ownership validation
- DNS TXT record checking
- Domain normalization
- Security validation (prevent open redirects)

#### Storage Layer

Provides data persistence:

**SQLite Database**:

- Access tokens (long-lived)
- Verified domains
- Audit logs
- Configuration

**In-Memory Store**:

- Authorization codes (TTL: 10 minutes)
- Email verification codes (TTL: 15 minutes)
- Rate limit counters (future)

### Data Flow: Authorization Flow

```
1. Client → /authorize
   ↓
2. Gondulf validates client_id, redirect_uri, state
   ↓
3. Gondulf checks domain ownership (TXT record or cached)
   ↓
4. User enters email address for their domain
   ↓
5. Gondulf sends verification code to email
   ↓
6. User enters code
   ↓
7. Gondulf generates authorization code
   ↓
8. Gondulf redirects to client with code + state
   ↓
9. Client → /token with code
   ↓
10. Gondulf validates code, generates access token
   ↓
11. Gondulf returns token + me (user's domain)
```

## Deployment Model

### Target Deployment

- **Platform**: Docker container
- **Scale**: Tens of users initially
- **Process Model**: Single uvicorn process (sufficient for MVP)
- **File System**:
  - `/data/gondulf.db` - SQLite database
  - `/data/backups/` - Database backups
  - `/app/` - Application code

### Configuration Management

- **Environment Variables**: All configuration via environment
- **Secrets**: Loaded from environment (SECRET_KEY, SMTP credentials)
- **Config Validation**: Pydantic Settings validates on startup

### Backup Strategy

Simple file-based SQLite backups:

- Daily automated backups of `gondulf.db`
- Backup rotation (keep last 7 days)
- Simple shell script + cron
- Future: S3/object storage support

## Security Architecture

### Authentication Method (v1.0.0)

**Email-based verification only**:

- User provides email address for their domain
- Server sends time-limited verification code
- User enters code to prove email access
- No password storage
- No external identity providers in v1.0.0

### Domain Ownership Validation

**Two-tier validation**:

1. **TXT Record (preferred)**:
   - Admin adds TXT record: `_gondulf.example.com` = `verified`
   - Server checks DNS before first use
   - Result cached in database
   - Periodic re-verification (configurable)
2. **Email-based (alternative)**:
   - If no TXT record, fall back to email verification
   - Email must be at verified domain (e.g., `admin@example.com`)
   - Less secure but more accessible for users

### Token Security

- **Generation**: Cryptographically secure random tokens (secrets.token_urlsafe)
- **Storage**: Hashed in database (SHA-256)
- **Transmission**: HTTPS only (enforced in production)
- **Expiration**: Configurable (default 1 hour)
- **Validation**: Constant-time comparison (prevents timing attacks)

### Privacy Principles

**Minimal Data Collection**:

- NEVER store email addresses beyond verification flow
- NEVER log user personal data
- Store only:
  - Domain name (user's identity)
  - Token hashes (security)
  - Timestamps (auditing)
  - Client IDs (protocol requirement)

## Operational Architecture

### Logging Strategy

**Structured logging** with appropriate levels:

- **INFO**: Normal operations (auth success, token issued)
- **WARNING**: Suspicious activity (failed validations, approaching rate limit)
- **ERROR**: Failures requiring investigation (email send failed, DNS timeout)
- **CRITICAL**: System failures (database unavailable, config invalid)

**Log fields**:

- Timestamp (ISO 8601)
- Level
- Event type
- Domain (never email)
- Client ID
- Request ID (correlation)

**Privacy**:

- NEVER log email addresses
- NEVER log full tokens (only first 8 chars for correlation)
- NEVER log user-agent or IP in production (GDPR)

### Monitoring (Future)

- Health check endpoint: `/health`
- Metrics endpoint: `/metrics` (Prometheus format)
- Key metrics:
  - Authorization requests/min
  - Token generation rate
  - Email delivery success rate
  - Domain validation cache hit rate
  - Error rate by type

## Upgrade Paths

### Future Enhancements (Post v1.0.0)

**Persistence Layer**:

- Add Redis for distributed sessions
- Support PostgreSQL for larger deployments
- No code changes required (SQLAlchemy abstraction)

**Authentication Methods**:

- GitHub/GitLab provider support
- IndieAuth delegation
- WebAuthn for passwordless authentication
- All additive, no breaking changes

**Protocol Features**:

- Token refresh
- Token revocation endpoint
- Scope management (authorization)
- Dynamic client registration

**Operational**:

- Multi-process deployment (gunicorn)
- Horizontal scaling (with Redis)
- Metrics and monitoring
- Admin dashboard

## Constraints and Trade-offs

### Conscious Simplifications (v1.0.0)

1. **No Redis**: In-memory storage acceptable for single-process deployment
   - Trade-off: Lose codes on restart (acceptable for 10-minute TTL)
   - Upgrade path: Add Redis when scaling needed
2. **No client pre-registration**: Domain-based validation sufficient
   - Trade-off: Must validate client_id on every request
   - Mitigation: Cache validation results
3. **Email-only authentication**: Simplest secure method
   - Trade-off: Requires SMTP configuration
   - Upgrade path: Add providers in future releases
4. **SQLite database**: Perfect for small deployments
   - Trade-off: No built-in replication
   - Upgrade path: Migrate to PostgreSQL when needed
5. **Single process**: No distributed coordination needed
   - Trade-off: Limited concurrent capacity
   - Upgrade path: Add Redis + gunicorn when scaling

### Non-Negotiable Requirements

1. **W3C IndieAuth compliance**: Full protocol compliance required
2. **Security best practices**: No shortcuts on security
3. **HTTPS in production**: Required for OAuth 2.0 security
4. **Minimal data collection**: Privacy by design
5. **Comprehensive testing**: 80%+ coverage minimum

## Documentation Structure

### For Developers

- `/docs/architecture/` - This directory
- `/docs/designs/` - Feature-specific designs
- `/docs/decisions/` - Architecture Decision Records

### For Operators

- `README.md` - Installation and usage
- `/docs/operations/` - Deployment guides (future)
- Environment variable reference (future)

### For Protocol Compliance

- `/docs/architecture/indieauth-protocol.md` - Protocol implementation
- `/docs/architecture/security.md` - Security model
- Test suite demonstrating compliance

## Next Steps

See `/docs/roadmap/v1.0.0.md` for the MVP feature set and implementation plan.

Key architectural documents to review:

- `/docs/architecture/indieauth-protocol.md` - Protocol design
- `/docs/architecture/security.md` - Security design
- `/docs/roadmap/backlog.md` - Feature prioritization
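
For readers who want something concrete, the in-memory store described under Session/State Storage (Python dictionaries with TTL management) could look roughly like the sketch below. The class name `TTLStore`, its method names, and the injectable `clock` parameter are illustrative assumptions and are not taken from the Gondulf codebase.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLStore:
    """Hypothetical dictionary-backed store whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._items: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        """Store a value together with its absolute expiry time."""
        self._items[key] = (self._clock() + self._ttl, value)

    def pop(self, key: str) -> Optional[Any]:
        """Remove and return the value (single use); None if missing or expired."""
        entry = self._items.pop(key, None)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            return None
        return value

    def purge_expired(self) -> int:
        """Drop all expired entries and return how many were removed."""
        now = self._clock()
        expired = [k for k, (exp, _) in self._items.items() if now >= exp]
        for k in expired:
            del self._items[k]
        return len(expired)


# Assumed usage: one store per code type, using the TTLs listed in this document.
auth_codes = TTLStore(ttl_seconds=10 * 60)
email_codes = TTLStore(ttl_seconds=15 * 60)
```

Making `pop` destructive means a code can be redeemed only once, which suits authorization codes. Because the data lives in process memory, everything is lost on restart, which matches the trade-off listed under Conscious Simplifications.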
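
The Token Security rules (generation with `secrets.token_urlsafe`, SHA-256 hashes at rest, constant-time comparison) and the logging rule of keeping only the first 8 characters can be illustrated with the standard library alone. This is a minimal sketch under those stated rules; the function names are hypothetical and the real TokenService may be structured differently.

```python
import hashlib
import hmac
import secrets


def generate_token(nbytes: int = 32) -> str:
    """Return a URL-safe random token; 32 bytes of entropy is an assumed default."""
    return secrets.token_urlsafe(nbytes)


def hash_token(token: str) -> str:
    """Return the hex SHA-256 digest, the only form that would be persisted."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def verify_token(presented: str, stored_hash: str) -> bool:
    """Compare digests in constant time to avoid leaking timing information."""
    return hmac.compare_digest(hash_token(presented), stored_hash)


def log_prefix(token: str) -> str:
    """Return the first 8 characters, the most the logging rules allow."""
    return token[:8]
```

Plain SHA-256 without a salt is reasonable here because the tokens are high-entropy random values rather than user-chosen passwords.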
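
The Backup Strategy section calls for a shell script plus cron. As an alternative illustration in Python, the sketch below takes a dated copy of the database and keeps the most recent seven files. The `gondulf-YYYY-MM-DD.db` naming scheme and the function name `backup_database` are assumptions made for this example; only the paths `/data/gondulf.db` and `/data/backups/` come from this document.

```python
import sqlite3
from datetime import date
from pathlib import Path
from typing import List, Optional


def backup_database(db_path: Path, backup_dir: Path, keep: int = 7,
                    today: Optional[date] = None) -> Path:
    """Copy the SQLite database into backup_dir, then prune to the newest `keep` files."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (today or date.today()).isoformat()
    target = backup_dir / f"gondulf-{stamp}.db"  # assumed naming scheme

    # sqlite3's online backup API copies a consistent snapshot,
    # even if another connection is writing to the source.
    source = sqlite3.connect(str(db_path))
    dest = sqlite3.connect(str(target))
    try:
        source.backup(dest)
    finally:
        dest.close()
        source.close()

    # ISO dates sort lexicographically, so the oldest backups come first.
    backups: List[Path] = sorted(backup_dir.glob("gondulf-*.db"))
    if keep > 0:
        for old in backups[:-keep]:
            old.unlink()
    return target


if __name__ == "__main__":
    backup_database(Path("/data/gondulf.db"), Path("/data/backups"))
```

Run once a day from cron or a container scheduler, this would approximate the "keep last 7 days" rotation; running it twice on the same day simply overwrites that day's file.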