# Security Architecture

## Security Philosophy

Gondulf follows a defense-in-depth security model with these core principles:

1. **Secure by Default**: Security features enabled out of the box
2. **Fail Securely**: Errors default to denying access, not granting it
3. **Least Privilege**: Collect and store minimum necessary data
4. **Transparency**: Security decisions documented and auditable
5. **Standards Compliance**: Follow OAuth 2.0 and IndieAuth security best practices

## Threat Model

### Assets to Protect

**Primary Assets**:
- User domain identities (the `me` parameter)
- Access tokens (prove user identity to clients)
- Authorization codes (short-lived, exchanged for tokens)

**Secondary Assets**:
- Email verification codes (prove email ownership)
- Domain verification status (cached TXT record checks)
- Client metadata (cached application information)

**Explicitly NOT Protected** (by design):
- Passwords (none stored)
- Personal user data beyond domain (privacy principle)
- Client secrets (OAuth 2.0 public clients)

### Threat Actors

**External Attackers**:
- Phishing attempts (fake clients)
- Token theft (network interception)
- Open redirect exploitation
- CSRF attacks
- Brute force attacks (code guessing)

**Compromised Clients**:
- Malicious client applications
- Client impersonation
- Redirect URI manipulation

**System Compromise**:
- Database access (SQLite file theft)
- Server memory access (in-memory code theft)
- Log file access (token exposure)

### Out of Scope (v1.0.0)

- DDoS attacks (handled by infrastructure)
- Zero-day vulnerabilities in dependencies
- Physical access to server
- Social engineering attacks on users
- DNS hijacking (external to application)

## Authentication Security

### Two-Factor Domain Verification (v1.0.0)

**Mechanism**: Users prove domain ownership through TWO independent factors:

1. **DNS TXT Record**: Proves DNS control (`_gondulf.{domain}` = `verified`)
2. **Email via rel="me"**: Proves email control (discovered from site's rel="me" link)

**Security Model**: An attacker must compromise BOTH factors to authenticate fraudulently. This is significantly stronger than single-factor verification.

#### Threat: Email Interception

**Risk**: Attacker intercepts email containing verification code.

**Mitigations**:
1. **Two-Factor Requirement**: Email alone is insufficient (DNS also required)
2. **Short Code Lifetime**: 15-minute expiration
3. **Single Use**: Code invalidated after verification
4. **Rate Limiting**: Max 3 code requests per domain per hour
5. **TLS Email Delivery**: Require STARTTLS for SMTP
6. **Display Warning**: "Only request code if you initiated this login"

**Residual Risk**: Low. Even with email interception, attacker still needs DNS control.

#### Threat: Code Brute Force

**Risk**: Attacker guesses 6-digit verification code.

**Mitigations**:
1. **Two-Factor Requirement**: Code alone is insufficient (DNS also required)
2. **Sufficient Entropy**: 1,000,000 possible codes (6 digits)
3. **Attempt Limiting**: Max 3 attempts per email
4. **Short Lifetime**: 15-minute window
5. **Rate Limiting**: Max 3 codes per domain per hour
6. **Single-Use**: Code invalidated after use

**Math**:
- 3 attempts ÷ 1,000,000 codes = 0.0003% success probability per issued code
- 15-minute window limits attack time
- Even if guessed, attacker still needs DNS control

**Residual Risk**: Very low. Two-factor requirement makes brute force insufficient.

#### Threat: DNS TXT Record Spoofing

**Risk**: Attacker attempts to spoof DNS responses.

**Mitigations**:
1. **Multiple Resolvers**: Query 2+ independent DNS servers (Google, Cloudflare)
2. **Consensus Required**: Require agreement from at least 2 resolvers
3. **DNSSEC Support**: Validate DNSSEC signatures when available (future)
4. **Timeout Handling**: Fail securely if DNS unavailable
5. **Logging**: Log all DNS verification attempts

**Residual Risk**: Low.
Spoofing multiple independent resolvers is difficult.

#### Threat: rel="me" Link Spoofing

**Risk**: Attacker compromises user's website to add malicious rel="me" link.

**Mitigations**:
1. **Two-Factor Requirement**: Website compromise alone is insufficient (DNS also required)
2. **HTTPS Required**: Fetch site over TLS (prevents MITM)
3. **Certificate Validation**: Verify SSL certificate
4. **Email Domain Matching**: Email should match site domain (warning if not)
5. **User Education**: Inform users to secure their website

**Residual Risk**: Moderate. If attacker compromises both DNS and website, they can authenticate. This is acceptable as it represents full domain compromise.

#### Threat: Email Address Enumeration

**Risk**: Attacker discovers email addresses by triggering rel="me" discovery.

**Mitigations**:
1. **Public Information**: rel="me" links are intentionally public
2. **User Awareness**: Users know they're publishing email on their site
3. **Rate Limiting**: Prevent bulk scanning
4. **Robots.txt**: Users can restrict crawler access if desired

**Residual Risk**: Minimal. Email addresses are intentionally published by users on their own sites.

### Domain Ownership Verification (Two-Factor)

**Mechanism**: v1.0.0 requires BOTH verification methods:

#### 1. TXT Record Validation (Required)

**Mechanism**: Admin adds DNS TXT record `_gondulf.{domain}` = `verified`.

**Security Properties**:
- Proves DNS control (first factor)
- Verifiable without user interaction
- Cacheable for performance
- Re-verifiable periodically

**Implementation**:
```python
import logging

import dns.resolver

logger = logging.getLogger(__name__)

def verify_txt_record(domain: str) -> bool:
    """
    Verify _gondulf.{domain} TXT record exists with value 'verified'.
    Requires consensus from multiple independent resolvers.
    """
    try:
        # Use Google and Cloudflare DNS for redundancy
        resolvers = ['8.8.8.8', '1.1.1.1']
        verified_count = 0

        for resolver_ip in resolvers:
            # configure=False: ignore system resolver settings, use only the IP below
            resolver = dns.resolver.Resolver(configure=False)
            resolver.nameservers = [resolver_ip]
            resolver.timeout = 5   # per-query timeout (seconds)
            resolver.lifetime = 5  # total time budget for this lookup

            answers = resolver.resolve(f'_gondulf.{domain}', 'TXT')
            for rdata in answers:
                txt_value = rdata.to_text().strip('"')
                if txt_value == 'verified':
                    verified_count += 1
                    break

        # Require consensus from at least 2 resolvers
        return verified_count >= 2
    except Exception as e:
        # Any resolver failure (timeout, NXDOMAIN, ...) fails securely
        logger.warning(f"DNS verification failed for {domain}: {e}")
        return False
```

#### 2. Email Verification via rel="me" (Required)

**Mechanism**: Email discovered from the site's `rel="me"` link, then verified with a code.

**Security Properties**:
- Proves website control (can modify HTML)
- Proves email control (receives and enters code)
- Follows IndieWeb standards (rel="me")
- Self-documenting (user declares email publicly)

**Implementation**:
```python
import logging
from typing import Optional

import requests
from bs4 import BeautifulSoup

logger = logging.getLogger(__name__)

def discover_email_from_site(domain: str) -> Optional[str]:
    """
    Fetch site and discover email from rel="me" link.
    """
    try:
        response = requests.get(f"https://{domain}", timeout=10, allow_redirects=True)
        response.raise_for_status()

        soup = BeautifulSoup(response.content, 'html.parser')
        me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')

        for link in me_links:
            href = link.get('href', '')
            if href.startswith('mailto:'):
                # Drop the scheme prefix and any ?subject=... query part
                email = href[len('mailto:'):].split('?', 1)[0].strip()
                if validate_email_format(email):
                    return email

        return None
    except Exception as e:
        logger.error(f"Failed to discover email for {domain}: {e}")
        return None
```

**Combined Residual Risk**: Low. Attacker must compromise DNS, website, and email account to authenticate fraudulently.
## Authorization Security

### Authorization Code Security

**Properties**:
- **Length**: 32 bytes (256 bits of entropy)
- **Generation**: `secrets.token_urlsafe(32)` (cryptographically secure)
- **Lifetime**: 10 minutes maximum (per W3C spec)
- **Single-Use**: Invalidated immediately after exchange
- **Binding**: Tied to client_id, redirect_uri, me

#### Threat: Authorization Code Interception

**Risk**: Attacker intercepts code from redirect URL.

**Mitigations (v1.0.0)**:
1. **HTTPS Only**: Enforce TLS for all communications
2. **Short Lifetime**: 10-minute expiration
3. **Single Use**: Code invalidated after first use
4. **State Binding**: Client validates state parameter (CSRF protection)

**Mitigations (Future - PKCE)**:
1. **Code Challenge**: Client sends hash of secret with auth request
2. **Code Verifier**: Client proves knowledge of secret on token exchange
3. **No Interception Value**: Code useless without original secret

**ADR-003 Decision**: PKCE deferred to v1.1.0 to maintain MVP simplicity.

**Residual Risk**: Low with HTTPS + short lifetime, minimal with PKCE (future).

#### Threat: Code Replay Attack

**Risk**: Attacker reuses previously valid authorization code.

**Mitigations**:
1. **Single-Use Enforcement**: Mark code as used in storage
2. **Immediate Invalidation**: Delete code after exchange
3. **Concurrent Use Detection**: Log warning if used code presented again

**Implementation**:
```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# code_storage and generate_token are defined elsewhere in the application

def exchange_code(code: str) -> Optional[str]:
    """
    Exchange authorization code for token.
    Returns None if code invalid, expired, or already used.
    """
    # Retrieve code data
    code_data = code_storage.get(code)
    if not code_data:
        logger.warning("Code not found or expired")
        return None

    # Check if already used
    if code_data.get('used'):
        # SECURITY: Potential replay attack, alert admin.
        # Log the client, not the code itself (codes are never logged).
        logger.error(f"Code replay attack detected for client {code_data.get('client_id')}")
        return None

    # Mark as used IMMEDIATELY (before token generation)
    code_data['used'] = True
    code_storage.set(code, code_data)

    # Generate token for the identity and client bound to this code
    return generate_token(code_data['me'], code_data['client_id'])
```

**Residual Risk**: Negligible.

### Access Token Security

**Properties**:
- **Format**: Opaque tokens (v1.0.0), not JWT
- **Length**: 32 bytes (256 bits of entropy)
- **Generation**: `secrets.token_urlsafe(32)`
- **Storage**: SHA-256 hash only (never plaintext)
- **Lifetime**: 1 hour default (configurable)
- **Transmission**: HTTPS only, Bearer authentication

#### Threat: Token Theft

**Risk**: Attacker steals access token from storage or transmission.

**Mitigations**:
1. **TLS Enforcement**: HTTPS only in production
2. **Hashed Storage**: Store SHA-256 hash, not plaintext
3. **Short Lifetime**: 1-hour expiration (configurable)
4. **Revocation**: Admin can revoke tokens (future)
5. **Secure Headers**: Set Cache-Control: no-store, Pragma: no-cache

**Token Storage**:
```python
import hashlib
import secrets
from datetime import datetime, timedelta

TOKEN_LIFETIME = timedelta(hours=1)  # 1 hour default (configurable)

def generate_token(me: str, client_id: str) -> str:
    """
    Generate access token and store hash in database.
    """
    # Generate token (returned to client, never stored)
    token = secrets.token_urlsafe(32)

    # Store only hash (irreversible)
    token_hash = hashlib.sha256(token.encode()).hexdigest()

    issued_at = datetime.utcnow()
    expires_at = issued_at + TOKEN_LIFETIME

    db.execute('''
        INSERT INTO tokens (token_hash, me, client_id, scope, issued_at, expires_at)
        VALUES (?, ?, ?, ?, ?, ?)
    ''', (token_hash, me, client_id, "", issued_at, expires_at))

    return token
```

**Residual Risk**: Low. A stolen database yields only hashes, which are useless to an attacker as long as SHA-256 remains preimage-resistant.

#### Threat: Timing Attacks on Token Verification

**Risk**: Attacker uses timing differences to guess valid tokens character-by-character.

**Mitigations**:
1. **Constant-Time Comparison**: Use `secrets.compare_digest()`
2. **Hash Comparison**: Compare hashes, not tokens
3. **Logging Delays**: Random delay on failed validation

**Implementation**:
```python
import hashlib
import secrets
from datetime import datetime
from typing import Optional

def verify_token(provided_token: str) -> Optional[dict]:
    """
    Verify access token using constant-time comparison.
    """
    # Hash provided token
    provided_hash = hashlib.sha256(provided_token.encode()).hexdigest()

    # Lookup in database (by hash, so the plaintext token never reaches SQL)
    token_data = db.query_one('''
        SELECT token_hash, me, client_id, scope, expires_at, revoked
        FROM tokens
        WHERE token_hash = ?
    ''', (provided_hash,))

    if not token_data:
        return None

    # Extra layer: constant-time comparison of the stored and computed hashes
    if not secrets.compare_digest(provided_hash, token_data['token_hash']):
        return None

    # Check expiration
    if datetime.utcnow() > token_data['expires_at']:
        return None

    # Check revocation
    if token_data.get('revoked'):
        return None

    return token_data
```

**Residual Risk**: Negligible.

## Input Validation

### URL Validation Security

**Critical**: Improper URL validation enables phishing and open redirect attacks.

#### Threat: Open Redirect via redirect_uri

**Risk**: Attacker tricks user into authorizing malicious redirect_uri, steals authorization code.

**Mitigations**:
1. **Domain Matching**: Accept redirect_uri without warning only when its domain matches the client_id domain
2. **Subdomain Validation**: Allow subdomains of client_id domain
3. **Registered URIs**: Future feature to pre-register alternate domains
4. **User Warning**: Display warning if domains differ
5. **HTTPS Enforcement**: Require HTTPS for non-localhost

**Validation Logic**:
```python
from urllib.parse import urlparse

def validate_redirect_uri(redirect_uri: str, client_id: str, registered_uris: list) -> tuple[bool, str]:
    """
    Validate redirect_uri against client_id.
    Returns (is_valid, warning_message).
    """
    redirect_parsed = urlparse(redirect_uri)
    client_parsed = urlparse(client_id)

    # Reject URLs without a hostname (fail securely)
    if not redirect_parsed.hostname or not client_parsed.hostname:
        return False, "redirect_uri and client_id must be absolute URLs"

    # Must be HTTPS (except localhost)
    if redirect_parsed.hostname != 'localhost':
        if redirect_parsed.scheme != 'https':
            return False, "redirect_uri must use HTTPS"

    redirect_domain = redirect_parsed.hostname.lower()
    client_domain = client_parsed.hostname.lower()

    # Exact match: OK
    if redirect_domain == client_domain:
        return True, ""

    # Subdomain: OK
    if redirect_domain.endswith('.' + client_domain):
        return True, ""

    # Registered URI: OK (future)
    if redirect_uri in registered_uris:
        return True, ""

    # Different domain: WARNING
    warning = f"Warning: Redirect to different domain ({redirect_domain})"
    return True, warning  # Allow but warn user
```

**Residual Risk**: Low. A redirect to a different domain proceeds only after the user approves it with the warning displayed.

#### Threat: Phishing via Malicious client_id

**Risk**: Attacker uses client_id of legitimate-looking domain (typosquatting).

**Mitigations**:
1. **Display Full URL**: Show complete client_id to user, not just app name
2. **Fetch Verification**: Verify client_id is fetchable (real domain)
3. **Subdomain Check**: Warn if client_id is subdomain of well-known domain
4. **Certificate Validation**: Verify SSL certificate validity
5. **User Education**: Inform users to verify client_id carefully

**UI Display**:
```
Sign in to:
  Application Name (if available)
  https://client.example.com  ← Full URL always displayed

Redirect to:
  https://client.example.com/callback
```

**Residual Risk**: Moderate, requires user vigilance.

#### Threat: URL Parameter Injection

**Risk**: Attacker injects malicious parameters via crafted URLs.

**Mitigations**:
1. **Pydantic Validation**: Use Pydantic models for all parameters
2. **Type Enforcement**: Strict type checking (`str`, not `Any`)
3. **Allowlist Validation**: Only accept expected parameters
4. **SQL Parameterization**: Use parameterized queries (prevent SQL injection)
5. **HTML Encoding**: Encode all user input in HTML responses

**Pydantic Models**:
```python
from typing import Literal

from pydantic import BaseModel, HttpUrl, Field

class AuthorizeRequest(BaseModel):
    me: HttpUrl
    client_id: HttpUrl
    redirect_uri: HttpUrl
    state: str = Field(min_length=1, max_length=512)
    response_type: Literal["code"]
    scope: str = ""  # Optional, ignored in v1.0.0

    # Pydantic v1-style config; in v2 the equivalent is
    # model_config = ConfigDict(extra="forbid")
    class Config:
        extra = "forbid"  # Reject unknown parameters
```

**Residual Risk**: Minimal, Pydantic provides strong validation.

### HTML Parsing Security (rel="me" Discovery)

#### Threat: Malicious HTML Injection

**Risk**: Attacker's site contains malicious HTML to exploit parser.

**Mitigations**:
1. **Robust Parser**: Use BeautifulSoup (handles malformed HTML safely)
2. **Link Extraction Only**: Only extract href attributes, no script execution
3. **Timeout**: 10-second timeout for HTTP requests
4. **Size Limit**: Limit response size (prevent memory exhaustion)
5. **HTTPS Required**: Fetch over TLS only
6. **Certificate Validation**: Verify SSL certificates

**Implementation**:
```python
import logging
from typing import Optional

import requests
from bs4 import BeautifulSoup

logger = logging.getLogger(__name__)

def discover_email_from_site(domain: str) -> Optional[str]:
    """
    Safely discover email from rel="me" link.
    """
    try:
        # Fetch with safety limits. The redirect cap is a Session attribute;
        # requests.get() does not accept a max_redirects argument.
        with requests.Session() as session:
            session.max_redirects = 5
            response = session.get(
                f"https://{domain}",
                timeout=10,
                allow_redirects=True,
                stream=True  # Don't load entire response into memory
            )
            response.raise_for_status()

            # Limit response size (prevent memory exhaustion)
            MAX_SIZE = 5 * 1024 * 1024  # 5MB
            # decode_content=True undoes gzip/deflate transfer encoding
            content = response.raw.read(MAX_SIZE, decode_content=True)

        # Parse HTML (BeautifulSoup handles malformed HTML safely)
        soup = BeautifulSoup(content, 'html.parser')

        # Find rel="me" links (no script execution)
        me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')

        # Extract mailto: links only
        for link in me_links:
            href = link.get('href', '')
            if href.startswith('mailto:'):
                email = href[len('mailto:'):].split('?', 1)[0].strip()

                # Validate email format before returning
                if validate_email_format(email):
                    return email

        return None

    except requests.exceptions.SSLError as e:
        logger.error(f"SSL certificate validation failed for {domain}: {e}")
        return None
    except Exception as e:
        logger.error(f"Failed to discover email for {domain}: {e}")
        return None
```

**Residual Risk**: Very low. BeautifulSoup is designed for untrusted HTML.

### Email Validation

#### Threat: Email Injection Attacks

**Risk**: Attacker crafts malicious email address in rel="me" link.

**Mitigations**:
1. **Format Validation**: Strict regex check (simplified RFC 5322 pattern)
2. **No User Input**: Email discovered from site (not user-provided)
3. **SMTP Library**: Use well-tested library (smtplib)
4. **Content Encoding**: Encode email content properly
5. **Rate Limiting**: Prevent abuse

**Validation**:
```python
import re

def validate_email_format(email: str) -> bool:
    """
    Validate email address format.
    """
    # Sanity check: length
    if len(email) > 254:  # RFC 5321 maximum
        return False

    # Basic format check (RFC 5322 simplified).
    # fullmatch rejects trailing newlines, which match() with '$' would accept.
    email_regex = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
    if not re.fullmatch(email_regex, email):
        return False

    # Sanity check: exactly one '@'
    if email.count('@') != 1:
        return False

    return True
```

**Note**: Domain matching is NOT enforced in v1.0.0.
The user may have an email address at a different domain than their identity site (e.g., phil@gmail.com for phil.example.com). This is acceptable because the user explicitly publishes the email on their site.

**Residual Risk**: Low, standard validation patterns.

## Network Security

### TLS/HTTPS Enforcement

**Production Requirements**:
- All endpoints MUST use HTTPS
- Minimum TLS 1.2 (prefer TLS 1.3)
- Strong cipher suites only
- Valid SSL certificate (not self-signed)

**Configuration**:
```python
# In production configuration
if not DEBUG:
    # Enforce HTTPS
    app.add_middleware(HTTPSRedirectMiddleware)

    # Add security headers
    # (SecureHeadersMiddleware is assumed to be a project-defined middleware)
    app.add_middleware(
        SecureHeadersMiddleware,
        hsts="max-age=31536000; includeSubDomains",
        content_security_policy="default-src 'self'",
        x_frame_options="DENY",
        x_content_type_options="nosniff"
    )
```

**Development Exception**:
- HTTP allowed for `localhost` only
- Never in production

**Residual Risk**: Negligible if properly configured.

### Security Headers

**Required Headers**:
```http
# Prevent clickjacking
X-Frame-Options: DENY

# Prevent MIME sniffing
X-Content-Type-Options: nosniff

# XSS protection (legacy browsers)
X-XSS-Protection: 1; mode=block

# HSTS (HTTPS enforcement)
Strict-Transport-Security: max-age=31536000; includeSubDomains

# CSP (limit resource loading)
Content-Security-Policy: default-src 'self'; style-src 'self' 'unsafe-inline'

# Referrer policy (privacy)
Referrer-Policy: strict-origin-when-cross-origin
```

**Implementation**:
```python
from fastapi import Request

@app.middleware("http")
async def add_security_headers(request: Request, call_next):
    response = await call_next(request)

    response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-Content-Type-Options"] = "nosniff"
    response.headers["X-XSS-Protection"] = "1; mode=block"
    response.headers["Content-Security-Policy"] = "default-src 'self'; style-src 'self' 'unsafe-inline'"
    response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"

    # HSTS only in production, where HTTPS is enforced
    if not DEBUG:
        response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"

    return response
```

## Data Security

### Data Minimization (Privacy)
**Principle**: Collect and store ONLY essential data.

**Stored Data**:
- ✅ Domain name (user identity, required)
- ✅ Token hashes (security, required)
- ✅ Client IDs (protocol, required)
- ✅ Timestamps (auditing, required)

**Never Stored**:
- ❌ Email addresses (after verification)
- ❌ Plaintext tokens
- ❌ User-Agent strings
- ❌ IP addresses (except rate limiting, temporary)
- ❌ Browsing history
- ❌ Personal information

**Email Handling**:
```python
# Email discovered from rel="me" link (not user-provided)
# Stored ONLY during verification (in-memory, 15-min TTL)
verification_codes[code_id] = {
    "email": email,  # ← Discovered from site, exists ONLY here, NEVER in database
    "code": code,
    "domain": domain,
    "expires_at": datetime.utcnow() + timedelta(minutes=15)
}

# After verification: email is deleted, only domain + timestamp stored
db.execute('''
    INSERT INTO domains (domain, verification_method, verified_at, last_email_check)
    VALUES (?, 'two_factor', ?, ?)
''', (domain, datetime.utcnow(), datetime.utcnow()))
# Note: NO email address in database, only verification timestamp
```

**rel="me" Discovery**:
- Email addresses are public (user publishes on their site)
- Server fetches email from user's site (not user input)
- Reduces social engineering risk (can't claim arbitrary email)
- Follows IndieWeb standards for identity

### Database Security

**SQLite Security**:
1. **File Permissions**: 600 (owner read/write only)
2. **Encryption at Rest**: Use encrypted filesystem (LUKS, dm-crypt)
3. **Backup Encryption**: Encrypt backup files (GPG)
4. **SQL Injection Prevention**: Parameterized queries only

**Parameterized Queries**:
```python
# GOOD: Parameterized (safe)
db.execute(
    "SELECT * FROM tokens WHERE token_hash = ?",
    (token_hash,)
)

# BAD: String interpolation (vulnerable)
db.execute(
    f"SELECT * FROM tokens WHERE token_hash = '{token_hash}'"
)  # ← NEVER DO THIS
```

**File Permissions**:
```bash
# Set restrictive permissions
chmod 600 /data/gondulf.db
chown gondulf:gondulf /data/gondulf.db
```

### Logging Security

**Principle**: Log security events, NEVER log sensitive data.

**Log Security Events**:
- ✅ Failed authentication attempts
- ✅ Authorization grants (domain + client_id)
- ✅ Token generation (hash prefix only)
- ✅ Email verification attempts
- ✅ DNS verification results
- ✅ Error conditions

**Never Log**:
- ❌ Email addresses (PII)
- ❌ Full access tokens
- ❌ Verification codes
- ❌ Authorization codes
- ❌ IP addresses (production)

**Safe Logging Examples**:
```python
# GOOD: Domain only (public information)
logger.info(f"Authorization granted for {domain} to {client_id}")

# GOOD: Token hash prefix for correlation (never the token itself)
logger.debug(f"Token generated: {token_hash[:8]}...")

# GOOD: Error without sensitive data
logger.error(f"Email send failed for domain {domain}")

# BAD: Email address (PII)
logger.info(f"Verification sent to {email}")  # ← NEVER

# BAD: Full token (security)
logger.debug(f"Token: {token}")  # ← NEVER
```

## Dependency Security

### Dependency Management

**Principles**:
1. **Minimal Dependencies**: Prefer standard library
2. **Vetted Libraries**: Only well-maintained, popular libraries
3. **Version Pinning**: Pin exact versions in requirements.txt
4. **Security Scanning**: Regular vulnerability scanning
5. **Update Strategy**: Security patches applied promptly

**Security Scanning**:
```bash
# Scan for known vulnerabilities
uv run pip-audit

# Alternative: safety check
uv run safety check
```

**Update Policy**:
- **Security patches**: Apply within 24 hours (critical), 7 days (high)
- **Minor versions**: Review and test before updating
- **Major versions**: Evaluate breaking changes, test thoroughly

### Secrets Management

**Environment Variables** (v1.0.0):
```bash
# Required secrets
GONDULF_SECRET_KEY=<256-bit random value>
GONDULF_SMTP_PASSWORD=

# Optional secrets
GONDULF_DATABASE_ENCRYPTION_KEY=
```

**Secret Generation**:
```bash
# Generate SECRET_KEY (256 bits)
python -c "import secrets; print(secrets.token_urlsafe(32))"
```

**Storage**:
- Development: `.env` file (not committed)
- Production: Docker secrets or environment variables
- Never hardcode secrets in code

**Future**: Integrate with HashiCorp Vault or AWS Secrets Manager.

## Rate Limiting (Future)

**v1.0.0**: Not implemented (acceptable for small deployments).

**Future Implementation**:

| Endpoint | Limit | Window | Key |
|----------|-------|--------|-----|
| /authorize | 10 requests | 1 minute | IP |
| /token | 30 requests | 1 minute | client_id |
| Email verification | 3 codes | 1 hour | email |
| Code submission | 3 attempts | 15 minutes | session |

**Implementation Strategy**:
- Use Redis for distributed rate limiting
- Token bucket algorithm
- Exponential backoff on failures

## Security Testing

### Required Security Tests

1. **Input Validation**:
   - Malformed URLs (me, client_id, redirect_uri)
   - SQL injection attempts
   - XSS attempts
   - Email injection

2. **Authentication**:
   - Expired code rejection
   - Used code rejection
   - Invalid code rejection
   - Brute force resistance

3. **Authorization**:
   - State parameter validation
   - Redirect URI validation
   - Open redirect prevention

4. **Token Security**:
   - Timing attack resistance
   - Token theft scenarios
   - Expiration enforcement

5. **TLS/HTTPS**:
   - HTTP rejection in production
   - Security headers presence
   - Certificate validation

### Security Scanning Tools

**Required Tools**:
- `bandit`: Python security linter
- `pip-audit`: Dependency vulnerability scanner
- `pytest`: Security-focused test cases

**CI/CD Integration**:
```yaml
# GitHub Actions example
security:
  - name: Run Bandit
    run: uv run bandit -r src/gondulf

  - name: Scan Dependencies
    run: uv run pip-audit

  - name: Run Security Tests
    run: uv run pytest tests/security/
```

## Incident Response

### Security Event Monitoring

**Monitor For**:
1. Multiple failed authentication attempts
2. Authorization code reuse attempts
3. Invalid token presentation
4. Unusual DNS verification failures
5. Email send failures (potential abuse)

**Alerting** (future):
- Admin email on critical events
- Webhook integration (Slack, Discord)
- Metrics dashboard (Grafana)

### Breach Response Plan (Future)

**If Access Tokens Compromised**:
1. Revoke all active tokens
2. Force re-authentication
3. Notify affected users (via domain)
4. Rotate SECRET_KEY
5. Audit logs for suspicious activity

**If Database Compromised**:
1. Assess data exposure (only hashes + domains)
2. Rotate all tokens
3. Review access logs
4. Notify users if domains exposed

## Compliance Considerations

### GDPR Compliance

**Personal Data Stored**:
- Domain names (considered PII in some jurisdictions)
- Timestamps (associated with domains)

**GDPR Rights**:
- **Right to Access**: Admin can query database
- **Right to Erasure**: Admin can delete domain records
- **Right to Portability**: Data export feature (future)

**Privacy Policy** (required):
- Document what data is collected (domains, timestamps)
- Document how data is used (authentication)
- Document retention policy (indefinite unless deleted)
- Provide contact for data requests

### Security Disclosure

**Security Policy** (future):
- Responsible disclosure process
- Security contact (security@domain)
- GPG key for encrypted reports
- Acknowledgments for researchers

## Security Roadmap

### v1.0.0 (MVP)

- ✅ Two-factor domain verification (DNS TXT + Email via rel="me")
- ✅ rel="me" email discovery (IndieWeb standard)
- ✅ HTML parsing security (BeautifulSoup)
- ✅ TLS/HTTPS enforcement
- ✅ Secure token generation (opaque, hashed)
- ✅ URL validation (open redirect prevention)
- ✅ Input validation (Pydantic)
- ✅ Security headers
- ✅ Minimal data collection (no email storage)

### v1.1.0

- PKCE support (code challenge/verifier)
- Rate limiting (Redis-based)
- Token revocation endpoint
- Enhanced logging

### v1.2.0

- WebAuthn support (passwordless)
- Hardware security key support
- Admin dashboard (audit logs)
- Security metrics

### v2.0.0

- Multi-factor authentication
- Federated identity providers
- Advanced threat detection
- SOC 2 compliance preparation

## References

- OWASP Top 10: https://owasp.org/www-project-top-ten/
- OAuth 2.0 Security Best Practices: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics
- NIST Cybersecurity Framework: https://www.nist.gov/cyberframework
- CWE Top 25: https://cwe.mitre.org/top25/