Files
Gondulf/docs/architecture/overview.md
Phil Skentelbery bebd47955f feat(core): implement Phase 1 foundation infrastructure
Implements Phase 1 Foundation with all core services:

Core Components:
- Configuration management with GONDULF_ environment variables
- Database layer with SQLAlchemy and migration system
- In-memory code storage with TTL support
- Email service with SMTP and TLS support (STARTTLS + implicit TLS)
- DNS service with TXT record verification
- Structured logging with Python standard logging
- FastAPI application with health check endpoint

Database Schema:
- authorization_codes table for OAuth 2.0 authorization codes
- domains table for domain verification
- migrations table for tracking schema versions
- Simple sequential migration system (001_initial_schema.sql)

Configuration:
- Environment-based configuration with validation
- .env.example template with all GONDULF_ variables
- Fail-fast validation on startup
- Sensible defaults for optional settings

Testing:
- 96 comprehensive tests (77 unit, 5 integration)
- 94.16% code coverage (exceeds 80% requirement)
- All tests passing
- Test coverage includes:
  - Configuration loading and validation
  - Database migrations and health checks
  - In-memory storage with expiration
  - Email service (STARTTLS, implicit TLS, authentication)
  - DNS service (TXT records, domain verification)
  - Health check endpoint integration

Documentation:
- Implementation report with test results
- Phase 1 clarifications document
- ADRs for key decisions (config, database, email, logging)

Technical Details:
- Python 3.10+ with type hints
- SQLite with configurable database URL
- System DNS with public DNS fallback
- Port-based TLS detection (465=SSL, 587=STARTTLS)
- Lazy configuration loading for testability

Exit Criteria Met:
✓ All foundation services implemented
✓ Application starts without errors
✓ Health check endpoint operational
✓ Database migrations working
✓ Test coverage exceeds 80%
✓ All tests passing

Ready for Architect review and Phase 2 development.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 12:21:42 -07:00

15 KiB

System Architecture Overview

Project Context

Gondulf is a self-hosted IndieAuth server implementing the W3C IndieAuth specification. It enables users to use their own domain as their identity when authenticating to third-party applications, providing a decentralized alternative to centralized authentication providers.

Key Differentiators

  • Email-based authentication: v1.0.0 uses email verification for domain ownership
  • No client pre-registration: Clients validate themselves through domain ownership verification
  • Simplicity-first: Minimal complexity, production-ready MVP
  • Single-admin model: Designed for individual operators, not multi-tenancy

Technology Stack

Core Platform

  • Language: Python 3.10+
  • Web Framework: FastAPI 0.104+
    • Chosen for: Native async/await, type hints, OAuth 2.0 support, automatic OpenAPI docs
    • See: /docs/decisions/ADR-001-python-framework-selection.md
  • ASGI Server: uvicorn with standard extras
  • Data Validation: Pydantic 2.0+ (bundled with FastAPI)

Data Storage

  • Primary Database: SQLite 3.35+
    • Sufficient for 10s of users
    • Simple file-based backups
    • No separate database server required
  • Database Interface: SQLAlchemy Core (NOT ORM)
    • Direct SQL-like interface without ORM complexity
    • Explicit queries, no hidden behavior
    • Simple schema management

Session/State Storage (v1.0.0)

  • In-Memory Storage: Python dictionaries with TTL management
  • Rationale:
    • No Redis in v1.0.0 per user requirements
    • Authorization codes are short-lived (10 minutes max)
    • Single-process deployment acceptable for MVP
    • Upgrade path: Can add Redis later without code changes if persistence needed

Development Environment

  • Package Manager: uv (Astral Rust-based tool)
    • See: /docs/decisions/ADR-002-uv-environment-management.md
    • Direct execution model (no environment activation)
  • Linting: Ruff + flake8
  • Type Checking: mypy (strict mode)
  • Formatting: Black (88 character line length)
  • Testing: pytest with async, coverage, mocking

System Architecture

Component Diagram

┌─────────────────────────────────────────────────────────────────┐
│                        Client Application                        │
│                    (Third-party IndieAuth client)               │
└───────────────────────────┬─────────────────────────────────────┘
                            │ HTTPS
                            │ IndieAuth Protocol
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Gondulf IndieAuth Server                   │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                    FastAPI Application                      │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────────┐  │ │
│  │  │ Authorization │  │    Token     │  │    Metadata     │  │ │
│  │  │   Endpoint    │  │   Endpoint   │  │    Endpoint     │  │ │
│  │  │  /authorize   │  │    /token    │  │   /.well-known  │  │ │
│  │  └──────┬───────┘  └──────┬───────┘  └────────┬────────┘  │ │
│  │         │                  │                    │           │ │
│  │         └──────────────────┼────────────────────┘           │ │
│  │                            │                                │ │
│  │  ┌─────────────────────────▼──────────────────────────────┐ │ │
│  │  │              Business Logic Layer                      │ │ │
│  │  │  ┌───────────────┐  ┌────────────┐  ┌──────────────┐  │ │ │
│  │  │  │ AuthService   │  │TokenService│  │DomainService │  │ │ │
│  │  │  │ - Auth flow   │  │ - Token    │  │ - Domain     │  │ │
│  │  │  │ - Email send  │  │   creation │  │   validation │  │ │
│  │  │  │ - Code gen    │  │ - Token    │  │ - TXT record │  │ │
│  │  │  │               │  │   verify   │  │   check      │  │ │
│  │  │  └───────────────┘  └────────────┘  └──────────────┘  │ │ │
│  │  └────────────────────────┬───────────────────────────────┘ │ │
│  │                           │                                 │ │
│  │  ┌────────────────────────▼──────────────────────────────┐ │ │
│  │  │               Storage Layer                           │ │ │
│  │  │  ┌──────────────────┐      ┌────────────────────────┐ │ │ │
│  │  │  │  SQLite Database │      │  In-Memory Store       │ │ │ │
│  │  │  │  - Tokens        │      │  - Auth codes (10min) │ │ │ │
│  │  │  │  - Domains       │      │  - Email codes (15min)│ │ │ │
│  │  │  └──────────────────┘      └────────────────────────┘ │ │ │
│  │  └───────────────────────────────────────────────────────┘ │ │
│  └────────────────────────────────────────────────────────────┘ │
└──────────┬──────────────────────────────────────┬───────────────┘
           │ SMTP                                  │ DNS
           ▼                                       ▼
  ┌────────────────┐                    ┌──────────────────┐
  │  Email Server  │                    │   DNS Provider   │
  │  (external)    │                    │   (external)     │
  └────────────────┘                    └──────────────────┘

Component Responsibilities

HTTP Endpoints Layer

Handles all HTTP concerns:

  • Request validation (Pydantic models)
  • Parameter parsing and type coercion
  • HTTP response formatting
  • Error responses (OAuth 2.0 compliant)
  • CORS headers
  • Rate limiting (future)

Business Logic Layer (Services)

Contains all domain logic, completely independent of HTTP:

AuthService:

  • Authorization flow orchestration
  • Email verification code generation and validation
  • Authorization code generation (cryptographically secure)
  • User consent management
  • PKCE support (future)

TokenService:

  • Access token generation (JWT or opaque)
  • Token validation and introspection
  • Token revocation (future)
  • Token refresh (future)

DomainService:

  • Domain ownership validation
  • DNS TXT record checking
  • Domain normalization
  • Security validation (prevent open redirects)

Storage Layer

Provides data persistence:

SQLite Database:

  • Access tokens (long-lived)
  • Verified domains
  • Audit logs
  • Configuration

In-Memory Store:

  • Authorization codes (TTL: 10 minutes)
  • Email verification codes (TTL: 15 minutes)
  • Rate limit counters (future)

Data Flow: Authorization Flow

1. Client → /authorize
   ↓
2. Gondulf validates client_id, redirect_uri, state
   ↓
3. Gondulf checks domain ownership (TXT record or cached)
   ↓
4. User enters email address for their domain
   ↓
5. Gondulf sends verification code to email
   ↓
6. User enters code
   ↓
7. Gondulf generates authorization code
   ↓
8. Gondulf redirects to client with code + state
   ↓
9. Client → /token with code
   ↓
10. Gondulf validates code, generates access token
   ↓
11. Gondulf returns token + me (user's domain)

Deployment Model

Target Deployment

  • Platform: Docker container
  • Scale: 10s of users initially
  • Process Model: Single uvicorn process (sufficient for MVP)
  • File System:
    • /data/gondulf.db - SQLite database
    • /data/backups/ - Database backups
    • /app/ - Application code

Configuration Management

  • Environment Variables: All configuration via environment
  • Secrets: Loaded from environment (SECRET_KEY, SMTP credentials)
  • Config Validation: Pydantic Settings validates on startup

Backup Strategy

Simple file-based SQLite backups:

  • Daily automated backups of gondulf.db
  • Backup rotation (keep last 7 days)
  • Simple shell script + cron
  • Future: S3/object storage support

Security Architecture

Authentication Method (v1.0.0)

Email-based verification only:

  • User provides email address for their domain
  • Server sends time-limited verification code
  • User enters code to prove email access
  • No password storage
  • No external identity providers in v1.0.0

Domain Ownership Validation

Two-tier validation:

  1. TXT Record (preferred):

    • Admin adds TXT record: _gondulf.example.com = verified
    • Server checks DNS before first use
    • Result cached in database
    • Periodic re-verification (configurable)
  2. Email-based (alternative):

    • If no TXT record, fall back to email verification
    • Email must be at verified domain (e.g., admin@example.com)
    • Less secure but more accessible for users

Token Security

  • Generation: Cryptographically secure random tokens (secrets.token_urlsafe)
  • Storage: Hashed in database (SHA-256)
  • Transmission: HTTPS only (enforced in production)
  • Expiration: Configurable (default 1 hour)
  • Validation: Constant-time comparison (prevent timing attacks)

Privacy Principles

Minimal Data Collection:

  • NEVER store email addresses beyond verification flow
  • NEVER log user personal data
  • Store only:
    • Domain name (user's identity)
    • Token hashes (security)
    • Timestamps (auditing)
    • Client IDs (protocol requirement)

Operational Architecture

Logging Strategy

Structured logging with appropriate levels:

  • INFO: Normal operations (auth success, token issued)
  • WARNING: Suspicious activity (failed validations, rate limit near)
  • ERROR: Failures requiring investigation (email send failed, DNS timeout)
  • CRITICAL: System failures (database unavailable, config invalid)

Log fields:

  • Timestamp (ISO 8601)
  • Level
  • Event type
  • Domain (never email)
  • Client ID
  • Request ID (correlation)

Privacy:

  • NEVER log email addresses
  • NEVER log full tokens (only first 8 chars for correlation)
  • NEVER log user-agent or IP in production (GDPR)

Monitoring (Future)

  • Health check endpoint: /health
  • Metrics endpoint: /metrics (Prometheus format)
  • Key metrics:
    • Authorization requests/min
    • Token generation rate
    • Email delivery success rate
    • Domain validation cache hit rate
    • Error rate by type

Upgrade Paths

Future Enhancements (Post v1.0.0)

Persistence Layer:

  • Add Redis for distributed sessions
  • Support PostgreSQL for larger deployments
  • No code changes required (SQLAlchemy abstraction)

Authentication Methods:

  • GitHub/GitLab provider support
  • IndieAuth delegation
  • WebAuthn for passwordless
  • All additive, no breaking changes

Protocol Features:

  • Token refresh
  • Token revocation endpoint
  • Scope management (authorization)
  • Dynamic client registration

Operational:

  • Multi-process deployment (gunicorn)
  • Horizontal scaling (with Redis)
  • Metrics and monitoring
  • Admin dashboard

Constraints and Trade-offs

Conscious Simplifications (v1.0.0)

  1. No Redis: In-memory storage acceptable for single-process deployment

    • Trade-off: Lose codes on restart (acceptable for 10-minute TTL)
    • Upgrade path: Add Redis when scaling needed
  2. No client pre-registration: Domain-based validation sufficient

    • Trade-off: Must validate client_id on every request
    • Mitigation: Cache validation results
  3. Email-only authentication: Simplest secure method

    • Trade-off: Requires SMTP configuration
    • Upgrade path: Add providers in future releases
  4. SQLite database: Perfect for small deployments

    • Trade-off: No built-in replication
    • Upgrade path: Migrate to PostgreSQL when needed
  5. Single process: No distributed coordination needed

    • Trade-off: Limited concurrent capacity
    • Upgrade path: Add Redis + gunicorn when scaling

Non-Negotiable Requirements

  1. W3C IndieAuth compliance: Full protocol compliance required
  2. Security best practices: No shortcuts on security
  3. HTTPS in production: Required for OAuth 2.0 security
  4. Minimal data collection: Privacy by design
  5. Comprehensive testing: 80%+ coverage minimum

Documentation Structure

For Developers

  • /docs/architecture/ - This directory
  • /docs/designs/ - Feature-specific designs
  • /docs/decisions/ - Architecture Decision Records

For Operators

  • README.md - Installation and usage
  • /docs/operations/ - Deployment guides (future)
  • Environment variable reference (future)

For Protocol Compliance

  • /docs/architecture/indieauth-protocol.md - Protocol implementation
  • /docs/architecture/security.md - Security model
  • Test suite demonstrating compliance

Next Steps

See /docs/roadmap/v1.0.0.md for the MVP feature set and implementation plan.

Key architectural documents to review:

  • /docs/architecture/indieauth-protocol.md - Protocol design
  • /docs/architecture/security.md - Security design
  • /docs/roadmap/backlog.md - Feature prioritization