# Phase 4-5: Critical Components for v1.0.0 Release

**Architect**: Claude (Architect Agent)
**Date**: 2025-11-20
**Status**: Design Complete
**Design References**:
- `/docs/roadmap/v1.0.0.md` - Original release plan
- `/docs/reports/2025-11-20-gap-analysis-v1.0.0.md` - Gap analysis identifying missing components
- `/docs/architecture/security.md` - Security architecture
- `/docs/architecture/indieauth-protocol.md` - Protocol implementation details

## Overview

This design document addresses the 7 critical missing components identified in the v1.0.0 gap analysis that are blocking release. These components span Phase 3 completion (metadata endpoint, client metadata fetching), Phase 4 (security hardening), and Phase 5 (deployment, testing, documentation).

**Current Status**: Phases 1-2 are complete (100%). Phase 3 is 75% complete (missing metadata endpoint and h-app parsing). Phases 4-5 have not been started.

**Estimated Remaining Effort**: 10-15 days to reach v1.0.0 release readiness.

**Design Philosophy**: Maintain simplicity while meeting all P0 requirements. Reuse existing infrastructure where possible. Focus on production readiness and W3C IndieAuth compliance.

---

## Component 1: Metadata Endpoint

### Purpose

Provide OAuth 2.0 Authorization Server Metadata endpoint per RFC 8414 to enable IndieAuth client discovery of server capabilities and endpoints.

### Specification References

- **W3C IndieAuth**: Section on Discovery (metadata endpoint)
- **RFC 8414**: OAuth 2.0 Authorization Server Metadata
- **v1.0.0 Roadmap**: Line 62 (P0 feature), Phase 3 lines 162, 168

### Design Overview

Create a static metadata endpoint at `/.well-known/oauth-authorization-server` that returns server capabilities in JSON format. This endpoint requires no authentication and should be publicly cacheable.
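To make the discovery contract concrete from the client's side, here is a minimal sketch of how a consumer might sanity-check a metadata document it has already fetched. The helper name `check_metadata` and the `EXPECTED_FIELDS` tuple are hypothetical and not part of the Gondulf codebase; the field list simply mirrors the fields this design returns.

```python
from urllib.parse import urlparse

# Hypothetical list for illustration: fields this design's metadata document carries
# that a client would typically rely on.
EXPECTED_FIELDS = (
    "issuer",
    "authorization_endpoint",
    "token_endpoint",
    "response_types_supported",
)


def check_metadata(metadata: dict, expected_issuer: str) -> list[str]:
    """Return a list of problems found in a metadata document (empty if none)."""
    problems = []
    for field in EXPECTED_FIELDS:
        if field not in metadata:
            problems.append(f"missing field: {field}")
    # The issuer should identify the server the client asked for metadata.
    if metadata.get("issuer") != expected_issuer:
        problems.append("issuer does not match the URL used for discovery")
    # Endpoint values should be absolute URLs a client can call directly.
    for field in ("authorization_endpoint", "token_endpoint"):
        if field in metadata:
            parsed = urlparse(str(metadata[field]))
            if not (parsed.scheme and parsed.netloc):
                problems.append(f"{field} is not an absolute URL")
    return problems


sample = {
    "issuer": "https://auth.example.com",
    "authorization_endpoint": "https://auth.example.com/authorize",
    "token_endpoint": "https://auth.example.com/token",
    "response_types_supported": ["code"],
}
problems = check_metadata(sample, "https://auth.example.com")
```

A check like this could also back the compliance test for the endpoint, since it exercises the same fields a real client would read.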
### API Specification

**Endpoint**: `GET /.well-known/oauth-authorization-server`

**Request**: No parameters, no authentication required

**Response** (HTTP 200 OK):
```json
{
  "issuer": "https://auth.example.com",
  "authorization_endpoint": "https://auth.example.com/authorize",
  "token_endpoint": "https://auth.example.com/token",
  "response_types_supported": ["code"],
  "grant_types_supported": ["authorization_code"],
  "code_challenge_methods_supported": [],
  "token_endpoint_auth_methods_supported": ["none"],
  "revocation_endpoint_auth_methods_supported": ["none"],
  "scopes_supported": []
}
```

**Response Headers**:
```http
Content-Type: application/json
Cache-Control: public, max-age=86400
```

### Field Definitions

| Field | Value | Rationale |
|-------|-------|-----------|
| `issuer` | Server base URL (from config) | Identifies this authorization server |
| `authorization_endpoint` | `{base_url}/authorize` | Where clients initiate auth flow |
| `token_endpoint` | `{base_url}/token` | Where clients exchange codes for tokens |
| `response_types_supported` | `["code"]` | Only authorization code flow supported |
| `grant_types_supported` | `["authorization_code"]` | Only grant type in v1.0.0 |
| `code_challenge_methods_supported` | `[]` | PKCE not supported in v1.0.0 (ADR-003) |
| `token_endpoint_auth_methods_supported` | `["none"]` | Public clients, no client secrets |
| `revocation_endpoint_auth_methods_supported` | `["none"]` | No revocation endpoint in v1.0.0 |
| `scopes_supported` | `[]` | Authentication-only, no scopes in v1.0.0 |

### Implementation Approach

**File**: `/src/gondulf/routers/metadata.py`

**Implementation Strategy**: Static JSON response generated from configuration at startup.

```python
import json  # needed for json.dumps in the response body

from fastapi import APIRouter, Response

from gondulf.config import get_config

router = APIRouter()


@router.get("/.well-known/oauth-authorization-server")
async def get_metadata():
    """
    OAuth 2.0 Authorization Server Metadata (RFC 8414).
    Returns server capabilities for IndieAuth client discovery.
    """
    config = get_config()

    metadata = {
        "issuer": config.BASE_URL,
        "authorization_endpoint": f"{config.BASE_URL}/authorize",
        "token_endpoint": f"{config.BASE_URL}/token",
        "response_types_supported": ["code"],
        "grant_types_supported": ["authorization_code"],
        "code_challenge_methods_supported": [],
        "token_endpoint_auth_methods_supported": ["none"],
        "revocation_endpoint_auth_methods_supported": ["none"],
        "scopes_supported": []
    }

    return Response(
        content=json.dumps(metadata, indent=2),
        media_type="application/json",
        headers={
            "Cache-Control": "public, max-age=86400"
        }
    )
```

**Configuration Change**: Add `BASE_URL` to config (e.g., `GONDULF_BASE_URL=https://auth.example.com`).

**Registration**: Add router to main.py:
```python
from gondulf.routers import metadata

app.include_router(metadata.router)
```

### Error Handling

No error conditions - endpoint always returns metadata. If configuration is invalid, application fails at startup (fail-fast principle).

### Security Considerations

- **Public Endpoint**: No authentication required (per RFC 8414)
- **Cache-Control**: Set public cache for 24 hours to reduce load
- **No Secrets**: Metadata contains no sensitive information
- **HTTPS**: Served over HTTPS in production (enforced by middleware)

### Testing Requirements

**Unit Tests** (`tests/unit/test_metadata.py`):
1. Test metadata endpoint returns 200 OK
2. Test all required fields present
3. Test correct values for each field
4. Test Cache-Control header present
5. Test Content-Type is application/json
6. Test issuer matches BASE_URL configuration

**Integration Test** (`tests/integration/test_metadata_integration.py`):
1. Test endpoint accessible via FastAPI TestClient
2. Test response can be parsed as valid JSON
3. Test endpoints in metadata are valid URLs

**Compliance Test**:
1. Verify metadata response matches RFC 8414 format

### Acceptance Criteria

- [ ] `/.well-known/oauth-authorization-server` endpoint returns valid JSON
- [ ] All required fields present per RFC 8414
- [ ] Endpoint values match actual server configuration
- [ ] Cache-Control header set to public, max-age=86400
- [ ] All tests pass (unit, integration)
- [ ] Endpoint accessible without authentication

---

## Component 2: Client Metadata Fetching (h-app Microformat)

### Purpose

Fetch and parse client application metadata from `client_id` URL to display application name, icon, and URL on the consent screen. This improves user experience by showing what application they're authorizing.

### Specification References

- **W3C IndieAuth**: Client Information Discovery
- **Microformats h-app**: http://microformats.org/wiki/h-app
- **v1.0.0 Roadmap**: Success criteria line 27, Phase 3 deliverables line 169

### Design Overview

Create a service that fetches the `client_id` URL, parses HTML for h-app microformat data, extracts application metadata, and caches results. Integrate with authorization endpoint to display app information on consent screen.

### h-app Microformat Structure

Example HTML from client application:
```html
```

**Properties to Extract**:
- `p-name`: Application name (required)
- `u-logo`: Application icon URL (optional)
- `u-url`: Application URL (optional, usually same as client_id)

### Data Model

**ClientMetadata** (in-memory cache):
```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ClientMetadata:
    """Client application metadata from h-app microformat."""
    client_id: str
    name: str                                # Extracted from p-name, or domain fallback
    logo_url: Optional[str] = None           # Extracted from u-logo
    url: Optional[str] = None                # Extracted from u-url
    fetched_at: Optional[datetime] = None    # Timestamp for cache expiry
```

### Service Design

**File**: `/src/gondulf/services/h_app_parser.py`

**Dependencies**:
- `html_fetcher.py` (already exists from Phase 2)
- `mf2py` library for microformat parsing

**Service Interface**:
```python
from datetime import timedelta
from typing import Dict


class HAppParser:
    """Parse h-app microformat from client_id URL."""

    def __init__(self, html_fetcher: HTMLFetcher):
        self.html_fetcher = html_fetcher
        self.cache: Dict[str, ClientMetadata] = {}
        self.cache_ttl = timedelta(hours=24)

    def parse_client_metadata(self, client_id: str) -> ClientMetadata:
        """
        Fetch and parse client metadata from client_id URL.

        Returns ClientMetadata with name (always populated), and optional
        logo_url and url. Caches results for 24 hours to reduce HTTP requests.
        """
        pass

    def _parse_h_app(self, html: str, client_id: str) -> ClientMetadata:
        """
        Parse h-app microformat from HTML.

        Returns ClientMetadata with extracted values, or fallback to
        domain name if no h-app found.
        """
        pass

    def _extract_domain_name(self, client_id: str) -> str:
        """Extract domain name from client_id for fallback display."""
        pass
```

### Implementation Details

**Parsing Strategy**:
1. Check cache for `client_id` (if cached and not expired, return cached)
2. Fetch HTML from `client_id` using `HTMLFetcher` (reuse Phase 2 infrastructure)
3. Parse HTML with `mf2py` library to extract h-app microformat
4. Extract `p-name`, `u-logo`, `u-url` properties
5. If h-app not found, fallback to domain name extraction
6. Store in cache with 24-hour TTL
7. Return `ClientMetadata` object

**mf2py Usage**:
```python
import mf2py
from urllib.parse import urlparse, urljoin

def _parse_h_app(self, html: str, client_id: str) -> ClientMetadata:
    """Parse h-app microformat from HTML."""
    # Parse microformats
    parsed = mf2py.parse(doc=html, url=client_id)

    # Find h-app items
    h_apps = [item for item in parsed.get('items', []) if 'h-app' in item.get('type', [])]

    if not h_apps:
        # Fallback: no h-app found
        return ClientMetadata(
            client_id=client_id,
            name=self._extract_domain_name(client_id),
            fetched_at=datetime.utcnow()
        )

    # Use first h-app
    h_app = h_apps[0]
    properties = h_app.get('properties', {})

    # Extract properties
    name = properties.get('name', [None])[0] or self._extract_domain_name(client_id)
    logo_url = properties.get('logo', [None])[0]
    url = properties.get('url', [None])[0] or client_id

    # Assumption: some mf2py versions return {"value": ..., "alt": ...} for an
    # <img> with alt text, so unwrap the URL before using it as a string
    if isinstance(logo_url, dict):
        logo_url = logo_url.get('value')

    # Resolve relative URLs
    if logo_url and not logo_url.startswith('http'):
        logo_url = urljoin(client_id, logo_url)

    return ClientMetadata(
        client_id=client_id,
        name=name,
        logo_url=logo_url,
        url=url,
        fetched_at=datetime.utcnow()
    )
```

**Fallback Strategy**:
```python
def _extract_domain_name(self, client_id: str) -> str:
    """Extract domain name for fallback display."""
    parsed = urlparse(client_id)
    domain = parsed.hostname or client_id

    # Remove 'www.'
    # prefix if present
    if domain.startswith('www.'):
        domain = domain[4:]

    return domain
```

**Cache Management**:
```python
def parse_client_metadata(self, client_id: str) -> ClientMetadata:
    """Fetch and parse with caching."""
    # Check cache
    if client_id in self.cache:
        cached = self.cache[client_id]
        age = datetime.utcnow() - cached.fetched_at
        if age < self.cache_ttl:
            logger.debug(f"Cache hit for client_id: {client_id}")
            return cached

    # Fetch HTML
    try:
        html = self.html_fetcher.fetch(client_id)
        if not html:
            raise ValueError("Failed to fetch client_id URL")

        # Parse h-app
        metadata = self._parse_h_app(html, client_id)

        # Cache result
        self.cache[client_id] = metadata

        return metadata

    except Exception as e:
        logger.warning(f"Failed to parse client metadata for {client_id}: {e}")
        # Return fallback metadata
        return ClientMetadata(
            client_id=client_id,
            name=self._extract_domain_name(client_id),
            fetched_at=datetime.utcnow()
        )
```

### Integration with Authorization Endpoint

**Update**: `/src/gondulf/routers/authorization.py`

**Change**: Inject `HAppParser` and fetch client metadata before rendering consent screen.

```python
from gondulf.services.h_app_parser import HAppParser, ClientMetadata

@router.get("/authorize")
async def authorize_get(
    # ... existing parameters ...
    h_app_parser: HAppParser = Depends(get_h_app_parser)
):
    # ... existing validation ...

    # Fetch client metadata
    client_metadata = h_app_parser.parse_client_metadata(client_id)

    # Render consent screen with client metadata
    return templates.TemplateResponse(
        "authorize.html",
        {
            "request": request,
            "me": me,
            "client_name": client_metadata.name,
            "client_logo_url": client_metadata.logo_url,
            "client_url": client_metadata.url or client_id,
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "state": state
        }
    )
```

**Template Update**: `/src/gondulf/templates/authorize.html`

Add client metadata display:
```html
{{ client_url }}
This application wants to sign you in as:
{{ me }}
```

### Dependency Injection

**File**: `/src/gondulf/dependencies.py`

Add `get_h_app_parser()`:
```python
from functools import lru_cache

from gondulf.services.h_app_parser import HAppParser

@lru_cache()
def get_h_app_parser() -> HAppParser:
    """Get HAppParser singleton."""
    html_fetcher = get_html_fetcher()
    return HAppParser(html_fetcher)
```

### Error Handling

**Failure Modes**:
1. **HTTP fetch fails**: Return fallback metadata with domain name
2. **HTML parse fails**: Return fallback metadata with domain name
3. **h-app not found**: Return fallback metadata with domain name
4. **Invalid URLs in h-app**: Skip invalid fields, use available data

**Logging**:
- Log INFO when h-app successfully parsed
- Log WARNING when fallback used
- Log ERROR only on unexpected exceptions

### Security Considerations

- **HTTPS Only**: Reuse `HTMLFetcher` which enforces HTTPS
- **Timeout**: 5-second timeout from `HTMLFetcher`
- **Size Limit**: 5MB limit from `HTMLFetcher`
- **XSS Prevention**: HTML escape all client metadata in templates (Jinja2 auto-escaping)
- **Logo URL Validation**: Only display logo if HTTPS URL
- **Cache Poisoning**: Cache keyed by client_id, no user input

### Testing Requirements

**Unit Tests** (`tests/unit/test_h_app_parser.py`):
1. Test h-app parsing with complete metadata
2. Test h-app parsing with missing logo
3. Test h-app parsing with missing url
4. Test h-app not found (fallback to domain)
5. Test relative logo URL resolution
6. Test domain name extraction fallback
7. Test cache hit (no HTTP request)
8. Test cache expiry (new HTTP request)
9. Test HTML fetch failure (fallback)
10. Test multiple h-app items (use first)

**Integration Tests** (`tests/integration/test_authorization_with_client_metadata.py`):
1. Test authorization endpoint displays client name
2. Test authorization endpoint displays client logo
3. Test authorization endpoint with fallback metadata

### Acceptance Criteria

- [ ] `HAppParser` service created with caching
- [ ] h-app microformat parsing working with mf2py
- [ ] Fallback to domain name when h-app not found
- [ ] Cache working with 24-hour TTL
- [ ] Integration with authorization endpoint complete
- [ ] Consent screen displays client name, logo, URL
- [ ] All tests pass (unit, integration)
- [ ] HTML escaping prevents XSS

---

## Component 3: Security Hardening

### Purpose

Implement security best practices required for production deployment: security headers, HTTPS enforcement, input sanitization audit, and PII logging audit.

### Specification References

- **v1.0.0 Roadmap**: Line 65 (P0 feature), Phase 4 lines 198-203
- **OWASP Top 10**: Security header recommendations
- **OAuth 2.0 Security Best Practices**: HTTPS enforcement

### Design Overview

Create security middleware to add HTTP security headers to all responses. Implement HTTPS enforcement for production environments. Conduct comprehensive audit of input sanitization and PII logging practices.
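As a complement to the PII logging audit named in the overview, a defensive logging filter could catch email addresses that slip past manual review. The sketch below is an assumption for illustration only: `RedactEmailFilter`, `redact_emails`, and the regular expression are hypothetical names and not part of the existing Gondulf code, and the pattern is deliberately simple rather than a full address grammar.

```python
import logging
import re

# Hypothetical pattern: captures the domain so it can be kept for debugging.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def redact_emails(text: str) -> str:
    """Replace the local part of any email address, keeping only the domain."""
    return EMAIL_RE.sub(r"***@\1", text)


class RedactEmailFilter(logging.Filter):
    """Logging filter that masks email local parts before records are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so addresses passed as %-style args are covered.
        record.msg = redact_emails(record.getMessage())
        record.args = None  # the message is already fully formatted
        return True  # never drop the record, only rewrite it


# Usage: attach the filter to a handler so every record passing through it is masked.
handler = logging.StreamHandler()
handler.addFilter(RedactEmailFilter())
```

A filter like this would be a safety net, not a replacement for the audit: the action items in the PII section still call for removing addresses at the call site.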
### 3.1: Security Headers Middleware

**File**: `/src/gondulf/middleware/security_headers.py`

**Headers to Implement**:

| Header | Value | Purpose |
|--------|-------|---------|
| `X-Frame-Options` | `DENY` | Prevent clickjacking attacks |
| `X-Content-Type-Options` | `nosniff` | Prevent MIME sniffing |
| `X-XSS-Protection` | `1; mode=block` | Enable XSS filter (legacy browsers) |
| `Referrer-Policy` | `strict-origin-when-cross-origin` | Control referrer information |
| `Strict-Transport-Security` | `max-age=31536000; includeSubDomains` | Force HTTPS (production only) |
| `Content-Security-Policy` | `default-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' https:; frame-ancestors 'none'` | Restrict resource loading |
| `Cache-Control` | `no-store, no-cache, must-revalidate` | Prevent caching of sensitive pages (auth endpoints only) |
| `Pragma` | `no-cache` | HTTP/1.0 cache control |

**Implementation**:
```python
from fastapi import Request, Response
from starlette.middleware.base import BaseHTTPMiddleware
from gondulf.config import get_config

class SecurityHeadersMiddleware(BaseHTTPMiddleware):
    """Add security headers to all responses."""

    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)
        config = get_config()

        # Always add these headers
        response.headers["X-Frame-Options"] = "DENY"
        response.headers["X-Content-Type-Options"] = "nosniff"
        response.headers["X-XSS-Protection"] = "1; mode=block"
        response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"

        # CSP: Allow self and inline styles (for minimal CSS in templates)
        response.headers["Content-Security-Policy"] = (
            "default-src 'self'; "
            "style-src 'self' 'unsafe-inline'; "
            "img-src 'self' https:; "  # Allow HTTPS images (client logos)
            "frame-ancestors 'none'"   # Equivalent to X-Frame-Options: DENY
        )

        # HSTS: Only in production with HTTPS
        if not config.DEBUG and request.url.scheme == "https":
            response.headers["Strict-Transport-Security"] = (
                "max-age=31536000; includeSubDomains"
            )

        # Cache control for sensitive endpoints
        if request.url.path in ["/authorize", "/token", "/api/verify/code"]:
            response.headers["Cache-Control"] = "no-store, no-cache, must-revalidate"
            response.headers["Pragma"] = "no-cache"

        return response
```

**Registration**: Add to `main.py`:
```python
from gondulf.middleware.security_headers import SecurityHeadersMiddleware

app.add_middleware(SecurityHeadersMiddleware)
```

### 3.2: HTTPS Enforcement Middleware

**File**: `/src/gondulf/middleware/https_enforcement.py`

**Implementation**:
```python
from fastapi import Request, Response
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import RedirectResponse
from gondulf.config import get_config

class HTTPSEnforcementMiddleware(BaseHTTPMiddleware):
    """Enforce HTTPS in production (redirect HTTP to HTTPS)."""

    async def dispatch(self, request: Request, call_next):
        config = get_config()

        # Only enforce in production
        if not config.DEBUG:
            # Allow localhost HTTP for local testing
            if request.url.hostname not in ["localhost", "127.0.0.1"]:
                # Check if HTTP (not HTTPS)
                if request.url.scheme != "https":
                    # Redirect to HTTPS
                    https_url = request.url.replace(scheme="https")
                    return RedirectResponse(url=str(https_url), status_code=301)

        return await call_next(request)
```

**Registration**: Add to `main.py` BEFORE security headers middleware:
```python
from gondulf.middleware.https_enforcement import HTTPSEnforcementMiddleware

# Starlette runs the most recently added middleware first, so
# SecurityHeadersMiddleware (added last) wraps HTTPSEnforcementMiddleware
# and redirect responses also receive the security headers.
app.add_middleware(HTTPSEnforcementMiddleware)
app.add_middleware(SecurityHeadersMiddleware)
```

**Configuration**: Add `DEBUG` flag to config:
```python
# config.py
DEBUG = os.getenv("GONDULF_DEBUG", "false").lower() == "true"
```

### 3.3: Input Sanitization Audit

**Scope**: Review all user input handling for proper validation and sanitization.
**Audit Checklist**:

| Input Source | Current Validation | Additional Sanitization Needed |
|--------------|-------------------|-------------------------------|
| `me` parameter | Pydantic HttpUrl, custom validation | ✅ Adequate (URL validation comprehensive) |
| `client_id` parameter | Pydantic HttpUrl | ✅ Adequate |
| `redirect_uri` parameter | Pydantic HttpUrl, domain validation | ✅ Adequate |
| `state` parameter | Pydantic str, max_length=512 | ✅ Adequate (opaque, not interpreted) |
| `code` parameter | str, checked against storage | ✅ Adequate (constant-time hash comparison) |
| Email addresses | Regex validation in `validation.py` | ✅ Adequate (from rel=me, not user input) |
| HTML templates | Jinja2 auto-escaping | ✅ Adequate (all {{ }} escaped by default) |
| SQL queries | SQLAlchemy parameterized queries | ✅ Adequate (no string interpolation) |
| DNS queries | dnspython library | ✅ Adequate (library handles escaping) |

**Action Items**:
1. **Add HTML escaping test**: Verify Jinja2 auto-escaping works
2. **Add SQL injection test**: Verify parameterized queries prevent injection
3. **Document validation patterns**: Update security.md with validation approach

**No Code Changes Required**: Existing validation is adequate.

### 3.4: PII Logging Audit

**Scope**: Ensure no Personally Identifiable Information (PII) is logged.

**PII Definition**:
- Email addresses
- Full tokens (access tokens, authorization codes, verification codes)
- IP addresses (in production)

**Audit Checklist**:

| Service | Logs Email? | Logs Tokens? | Logs IP? | Action Required |
|---------|-------------|--------------|----------|-----------------|
| `email.py` | ❌ No (only domain) | N/A | ❌ No | ✅ Compliant |
| `token_service.py` | N/A | ⚠️ Token prefix only (8 chars) | ❌ No | ✅ Compliant (prefix OK) |
| `domain_verification.py` | ⚠️ **May log email** | ⚠️ **May log code** | ❌ No | 🔍 **Review Required** |
| `authorization.py` | ❌ No | ⚠️ Code prefix only | ⚠️ **May log IP in errors** | 🔍 **Review Required** |
| `verification.py` | ❌ No | ⚠️ Code in errors? | ❌ No | 🔍 **Review Required** |

**Action Items**:
1. **Review all logger.info/warning/error calls** in services and routers
2. **Remove email addresses** from log messages (use domain only)
3. **Remove full codes/tokens** from log messages (use prefix or hash)
4. **Remove IP addresses** from production logs (OK in DEBUG mode)
5. **Add logging best practices** to coding standards

**Example Fix**:
```python
# BAD: Logs email (PII)
logger.info(f"Verification sent to {email}")

# GOOD: Logs domain only
logger.info(f"Verification sent to user at domain {domain}")

# BAD: Logs full token
logger.debug(f"Generated token: {token}")

# GOOD: Logs token prefix for correlation
logger.debug(f"Generated token: {token[:8]}...")
```

### 3.5: Security Configuration

**Add to config.py**:
```python
import os

# Security settings
DEBUG = os.getenv("GONDULF_DEBUG", "false").lower() == "true"
HSTS_MAX_AGE = int(os.getenv("GONDULF_HSTS_MAX_AGE", "31536000"))  # 1 year
CSP_POLICY = os.getenv(
    "GONDULF_CSP_POLICY",
    "default-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' https:; frame-ancestors 'none'"
)
```

**Add to .env.example**:
```bash
# Security Settings (Production)
GONDULF_DEBUG=false
GONDULF_HSTS_MAX_AGE=31536000  # HSTS max-age in seconds (1 year default)
# GONDULF_CSP_POLICY=...
# Custom CSP policy (optional)
```

### Error Handling

**Middleware Errors**:
- Log errors but DO NOT block requests
- If middleware fails, continue without headers (fail-open for availability)
- Log ERROR level for middleware failures

### Security Considerations

- **HSTS Preloading**: Consider submitting domain to HSTS preload list (future)
- **CSP Reporting**: Consider adding CSP report-uri (future)
- **Security.txt**: Consider adding /.well-known/security.txt (future)

### Testing Requirements

**Unit Tests** (`tests/unit/test_security_middleware.py`):
1. Test security headers present on all responses
2. Test HSTS header only in production
3. Test HSTS header not present in DEBUG mode
4. Test HTTPS enforcement redirects HTTP to HTTPS
5. Test HTTPS enforcement allows localhost HTTP
6. Test Cache-Control headers on sensitive endpoints
7. Test CSP header allows self and inline styles

**Integration Tests** (`tests/integration/test_security_integration.py`):
1. Test authorization endpoint has security headers
2. Test token endpoint has cache control headers
3. Test metadata endpoint does NOT have cache control (should be cacheable)

**Security Tests** (`tests/security/test_input_validation.py`):
1. Test HTML escaping in templates (XSS prevention)
2. Test SQL injection prevention (parameterized queries)
3. Test URL validation rejects malicious URLs

**PII Tests** (`tests/security/test_pii_logging.py`):
1. Test no email addresses in logs (mock logger, verify calls)
2. Test no full tokens in logs
3. Test no full codes in logs

### Acceptance Criteria

- [ ] Security headers middleware implemented and registered
- [ ] HTTPS enforcement middleware implemented and registered (production only)
- [ ] All security headers present on responses
- [ ] HSTS header only in production
- [ ] Input sanitization audit complete (documented)
- [ ] PII logging audit complete (issues fixed)
- [ ] All tests pass (unit, integration, security)
- [ ] No email addresses in logs
- [ ] No full tokens/codes in logs

---

## Component 4: Deployment Configuration

### Purpose

Provide production-ready deployment artifacts: Dockerfile, docker-compose.yml, database backup script, and comprehensive environment variable documentation.

### Specification References

- **v1.0.0 Roadmap**: Line 66 (P0 feature), Phase 5 lines 233-236
- **Docker Best Practices**: Multi-stage builds, non-root user, minimal base images

### Design Overview

Create Docker deployment configuration using multi-stage build for minimal image size. Provide docker-compose.yml for easy local testing. Create SQLite backup script with GPG encryption support. Document all environment variables comprehensively.

### 4.1: Dockerfile

**File**: `/Dockerfile`

**Strategy**: Multi-stage build to minimize final image size.
**Implementation**:
```dockerfile
# Stage 1: Builder
FROM python:3.11-slim AS builder

# Install uv for fast dependency resolution
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# Set working directory
WORKDIR /app

# Copy dependency files
COPY pyproject.toml uv.lock ./

# Install dependencies to a virtual environment
RUN uv sync --frozen --no-dev

# Stage 2: Runtime
FROM python:3.11-slim

# Create non-root user
RUN useradd --create-home --shell /bin/bash gondulf

# Set working directory
WORKDIR /app

# Copy virtual environment from builder
COPY --from=builder /app/.venv /app/.venv

# Copy application code
COPY src/ /app/src/
COPY migrations/ /app/migrations/

# Create data directory for SQLite
RUN mkdir -p /app/data && chown gondulf:gondulf /app/data

# Switch to non-root user
USER gondulf

# Set environment variables
ENV PATH="/app/.venv/bin:$PATH"
ENV PYTHONPATH="/app/src"
ENV GONDULF_DATABASE_URL="sqlite:////app/data/gondulf.db"

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"

# Run application
CMD ["uvicorn", "gondulf.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

**Build Instructions**:
```bash
# Build image
docker build -t gondulf:1.0.0 .

# Tag as latest
docker tag gondulf:1.0.0 gondulf:latest
```

**Image Properties**:
- Base: `python:3.11-slim` (Debian-based, ~150MB)
- User: Non-root `gondulf` user
- Port: 8000 (Uvicorn default)
- Health check: `/health` endpoint every 30 seconds
- Data volume: `/app/data` (for SQLite database)

### 4.2: docker-compose.yml

**File**: `/docker-compose.yml`

**Purpose**: Local testing and development deployment.

**Implementation**:
```yaml
version: "3.8"

services:
  gondulf:
    build: .
    image: gondulf:latest
    container_name: gondulf
    restart: unless-stopped
    ports:
      - "8000:8000"
    volumes:
      - ./data:/app/data
      - ./logs:/app/logs
    environment:
      # Required
      - GONDULF_SECRET_KEY=${GONDULF_SECRET_KEY}
      - GONDULF_BASE_URL=${GONDULF_BASE_URL:-http://localhost:8000}

      # SMTP Configuration (required for email verification)
      - GONDULF_SMTP_HOST=${GONDULF_SMTP_HOST}
      - GONDULF_SMTP_PORT=${GONDULF_SMTP_PORT:-587}
      - GONDULF_SMTP_USERNAME=${GONDULF_SMTP_USERNAME}
      - GONDULF_SMTP_PASSWORD=${GONDULF_SMTP_PASSWORD}
      - GONDULF_SMTP_FROM_EMAIL=${GONDULF_SMTP_FROM_EMAIL}
      - GONDULF_SMTP_USE_TLS=${GONDULF_SMTP_USE_TLS:-true}

      # Optional Configuration
      - GONDULF_DEBUG=${GONDULF_DEBUG:-false}
      - GONDULF_LOG_LEVEL=${GONDULF_LOG_LEVEL:-INFO}
      - GONDULF_TOKEN_TTL=${GONDULF_TOKEN_TTL:-3600}
    healthcheck:
      # python:3.11-slim does not ship curl, so reuse the Python check from the Dockerfile
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
    networks:
      - gondulf-network

networks:
  gondulf-network:
    driver: bridge

volumes:
  data:
  logs:
```

**Usage**:
```bash
# Create .env file with required variables
cp .env.example .env
nano .env  # Edit with your values

# Start service
docker-compose up -d

# View logs
docker-compose logs -f gondulf

# Stop service
docker-compose down

# Restart service
docker-compose restart gondulf
```

### 4.3: Backup Script

**File**: `/scripts/backup_database.sh`

**Purpose**: Backup SQLite database with optional GPG encryption.
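For hosts where the `sqlite3` command-line tool is not installed, the same consistent-snapshot idea can be sketched with Python's standard library, whose `sqlite3.Connection.backup()` method copies a live database safely. This is only an illustrative alternative under that assumption; the function name `backup_sqlite` is hypothetical, and the shell script remains the design's backup mechanism.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(db_file: Path, backup_file: Path) -> None:
    """Copy a SQLite database to backup_file using the online backup API."""
    backup_file.parent.mkdir(parents=True, exist_ok=True)
    source = sqlite3.connect(db_file)
    target = sqlite3.connect(backup_file)
    try:
        # backup() produces a consistent copy even if other connections are writing.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

Encryption and retention are intentionally left out of this sketch; they would still be handled as described for the shell script.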
**Implementation**:
```bash
#!/bin/bash
set -euo pipefail

# Configuration
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
DATA_DIR="${DATA_DIR:-$PROJECT_ROOT/data}"
BACKUP_DIR="${BACKUP_DIR:-$PROJECT_ROOT/backups}"
DB_FILE="${DB_FILE:-$DATA_DIR/gondulf.db}"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="$BACKUP_DIR/gondulf_${TIMESTAMP}.db"
RETENTION_DAYS="${RETENTION_DAYS:-30}"

# GPG encryption (optional)
GPG_RECIPIENT="${GPG_RECIPIENT:-}"

# Colors for output
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m'  # No Color

echo "========================================="
echo "Gondulf Database Backup"
echo "========================================="
echo

# Check if database exists
if [ ! -f "$DB_FILE" ]; then
    echo -e "${RED}ERROR: Database file not found: $DB_FILE${NC}"
    exit 1
fi

# Create backup directory
mkdir -p "$BACKUP_DIR"

# SQLite backup (using .backup command for consistency)
# The command is tested directly: under `set -e`, a separate `$?` check
# would never be reached when the command fails.
echo -e "${YELLOW}Backing up database...${NC}"
if sqlite3 "$DB_FILE" ".backup '$BACKUP_FILE'"; then
    echo -e "${GREEN}✓ Database backed up to: $BACKUP_FILE${NC}"

    # Get file size
    SIZE=$(du -h "$BACKUP_FILE" | cut -f1)
    echo -e "  Size: $SIZE"
else
    echo -e "${RED}ERROR: Backup failed${NC}"
    exit 1
fi

# GPG encryption (optional)
if [ -n "$GPG_RECIPIENT" ]; then
    echo -e "${YELLOW}Encrypting backup with GPG...${NC}"
    if gpg --encrypt --recipient "$GPG_RECIPIENT" --output "${BACKUP_FILE}.gpg" "$BACKUP_FILE"; then
        echo -e "${GREEN}✓ Backup encrypted: ${BACKUP_FILE}.gpg${NC}"

        # Remove unencrypted backup
        rm "$BACKUP_FILE"
        BACKUP_FILE="${BACKUP_FILE}.gpg"
    else
        echo -e "${RED}WARNING: Encryption failed, keeping unencrypted backup${NC}"
    fi
fi

# Cleanup old backups (count them before deleting, otherwise the count is always 0)
echo -e "${YELLOW}Cleaning up old backups (older than $RETENTION_DAYS days)...${NC}"
CLEANED=$(find "$BACKUP_DIR" -name "gondulf_*.db*" -type f -mtime +"$RETENTION_DAYS" | wc -l)
find "$BACKUP_DIR" -name "gondulf_*.db*" -type f -mtime +"$RETENTION_DAYS" -delete
echo -e "${GREEN}✓ Removed $CLEANED old backup(s)${NC}"

# List recent backups
echo
echo "Recent backups:"
ls -lh "$BACKUP_DIR"/gondulf_*.db* 2>/dev/null | tail -5 || echo "  No backups found"

echo
echo -e "${GREEN}Backup complete!${NC}"
```

**Restore Script**: `/scripts/restore_database.sh`

```bash
#!/bin/bash
set -euo pipefail

# Configuration
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
DATA_DIR="${DATA_DIR:-$PROJECT_ROOT/data}"
DB_FILE="${DB_FILE:-$DATA_DIR/gondulf.db}"

# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m'

echo "========================================="
echo "Gondulf Database Restore"
echo "========================================="
echo

# Check arguments
if [ $# -ne 1 ]; then
    echo "Usage: $0