Gondulf/docs/architecture/security.md
Phil Skentelbery 6f06aebf40 docs: add Phase 2 domain verification design and clarifications
Add comprehensive Phase 2 documentation:
- Complete design document for two-factor domain verification
- Implementation guide with code examples
- ADR for implementation decisions (ADR-0004)
- ADR for rel="me" email discovery (ADR-008)
- Phase 1 impact assessment
- All 23 clarification questions answered
- Updated architecture docs (indieauth-protocol, security)
- Updated ADR-005 with rel="me" approach
- Updated backlog with technical debt items

Design ready for Phase 2 implementation.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 13:05:09 -07:00


Security Architecture

Security Philosophy

Gondulf follows a defense-in-depth security model with these core principles:

  1. Secure by Default: Security features enabled out of the box
  2. Fail Securely: Errors default to denying access, not granting it
  3. Least Privilege: Collect and store minimum necessary data
  4. Transparency: Security decisions documented and auditable
  5. Standards Compliance: Follow OAuth 2.0 and IndieAuth security best practices

Threat Model

Assets to Protect

Primary Assets:

  • User domain identities (the me parameter)
  • Access tokens (prove user identity to clients)
  • Authorization codes (short-lived, exchange for tokens)

Secondary Assets:

  • Email verification codes (prove email ownership)
  • Domain verification status (cached TXT record checks)
  • Client metadata (cached application information)

Explicitly NOT Protected (by design):

  • Passwords (none stored)
  • Personal user data beyond domain (privacy principle)
  • Client secrets (OAuth 2.0 public clients)

Threat Actors

External Attackers:

  • Phishing attempts (fake clients)
  • Token theft (network interception)
  • Open redirect exploitation
  • CSRF attacks
  • Brute force attacks (code guessing)

Compromised Clients:

  • Malicious client applications
  • Client impersonation
  • Redirect URI manipulation

System Compromise:

  • Database access (SQLite file theft)
  • Server memory access (in-memory code theft)
  • Log file access (token exposure)

Out of Scope (v1.0.0)

  • DDoS attacks (handled by infrastructure)
  • Zero-day vulnerabilities in dependencies
  • Physical access to server
  • Social engineering attacks on users
  • DNS hijacking (external to application)

Authentication Security

Two-Factor Domain Verification (v1.0.0)

Mechanism: Users prove domain ownership through TWO independent factors:

  1. DNS TXT Record: Proves DNS control (_gondulf.{domain} = verified)
  2. Email via rel="me": Proves email control (discovered from site's rel="me" link)

Security Model: An attacker must compromise BOTH factors to authenticate fraudulently. This is significantly stronger than single-factor verification.

Threat: Email Interception

Risk: Attacker intercepts email containing verification code.

Mitigations:

  1. Two-Factor Requirement: Email alone is insufficient (DNS also required)
  2. Short Code Lifetime: 15-minute expiration
  3. Single Use: Code invalidated after verification
  4. Rate Limiting: Max 3 code requests per domain per hour
  5. TLS Email Delivery: Require STARTTLS for SMTP
  6. Display Warning: "Only request code if you initiated this login"

Residual Risk: Low. Even with email interception, attacker still needs DNS control.

Threat: Code Brute Force

Risk: Attacker guesses 6-digit verification code.

Mitigations:

  1. Two-Factor Requirement: Code alone is insufficient (DNS also required)
  2. Sufficient Entropy: 1,000,000 possible codes (6 digits)
  3. Attempt Limiting: Max 3 attempts per email
  4. Short Lifetime: 15-minute window
  5. Rate Limiting: Max 3 codes per domain per hour
  6. Single-Use: Code invalidated after use

Math:

  • 3 attempts ÷ 1,000,000 codes = 0.0003% success probability per issued code
  • 15-minute window limits attack time
  • Even if guessed, attacker still needs DNS control

Residual Risk: Very low. Two-factor requirement makes brute force insufficient.
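
The code-handling mitigations above (six-digit codes, three attempts, 15-minute lifetime, single use) can be sketched as a small in-memory store. This is an illustrative sketch rather than Gondulf's actual implementation: `VerificationCodeStore`, `issue`, and `check` are hypothetical names, and the optional `now` argument exists only to make the behavior easy to test.

```python
import secrets
import time
from typing import Optional

CODE_TTL_SECONDS = 15 * 60  # 15-minute lifetime
MAX_ATTEMPTS = 3            # attempts allowed per issued code


class VerificationCodeStore:
    """Hypothetical in-memory store for email verification codes."""

    def __init__(self) -> None:
        self._codes: dict[str, dict] = {}

    def issue(self, domain: str, now: Optional[float] = None) -> str:
        """Create a fresh 6-digit code for a domain, replacing any older one."""
        now = time.time() if now is None else now
        code = f"{secrets.randbelow(10**6):06d}"  # uniform over 000000-999999
        self._codes[domain] = {
            "code": code,
            "expires_at": now + CODE_TTL_SECONDS,
            "attempts": 0,
        }
        return code

    def check(self, domain: str, submitted: str, now: Optional[float] = None) -> bool:
        """Return True only for a correct, unexpired code within the attempt limit."""
        now = time.time() if now is None else now
        entry = self._codes.get(domain)
        if entry is None:
            return False
        if now > entry["expires_at"] or entry["attempts"] >= MAX_ATTEMPTS:
            del self._codes[domain]  # fail securely: expired or exhausted
            return False
        entry["attempts"] += 1
        # Compare as bytes in constant time
        if secrets.compare_digest(entry["code"].encode(), submitted.encode()):
            del self._codes[domain]  # single use
            return True
        return False
```

Once the attempt limit is reached, even the correct code is rejected, so an attacker gets at most three guesses per issued code.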

Threat: DNS TXT Record Spoofing

Risk: Attacker attempts to spoof DNS responses.

Mitigations:

  1. Multiple Resolvers: Query 2+ independent DNS servers (Google, Cloudflare)
  2. Consensus Required: Require agreement from at least 2 resolvers
  3. DNSSEC Support: Validate DNSSEC signatures when available (future)
  4. Timeout Handling: Fail securely if DNS unavailable
  5. Logging: Log all DNS verification attempts

Residual Risk: Low. Spoofing multiple independent resolvers is difficult.

Threat: Website Compromise (Malicious rel="me" Link)

Risk: Attacker compromises user's website to add malicious rel="me" link.

Mitigations:

  1. Two-Factor Requirement: Website compromise alone insufficient (DNS also required)
  2. HTTPS Required: Fetch site over TLS (prevents MITM)
  3. Certificate Validation: Verify SSL certificate
  4. Email Domain Matching: Email should match site domain (warning if not)
  5. User Education: Inform users to secure their website

Residual Risk: Moderate. If attacker compromises both DNS and website, they can authenticate. This is acceptable as it represents full domain compromise.
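
The "Email Domain Matching" mitigation only warns and never blocks. A minimal sketch of that check follows. The function name `email_domain_warning` is hypothetical, and treating a site that is a subdomain of the email's domain as a match is an assumption, not something the design specifies.

```python
def email_domain_warning(email: str, site_domain: str) -> str:
    """Return a warning string if the email's domain differs from the site domain.

    An empty string means no warning. Assumption: a site that is a subdomain
    of the email's domain (phil.example.com with me@example.com) counts as a match.
    """
    email_domain = email.rsplit("@", 1)[-1].lower().rstrip(".")
    site = site_domain.lower().rstrip(".")
    if site == email_domain or site.endswith("." + email_domain):
        return ""
    return f"Email domain ({email_domain}) differs from site domain ({site})"
```

The leading dot in the suffix check matters: without it, `notexample.com` would be treated as matching `example.com`.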

Threat: Email Address Enumeration

Risk: Attacker discovers email addresses by triggering rel="me" discovery.

Mitigations:

  1. Public Information: rel="me" links are intentionally public
  2. User Awareness: Users know they're publishing email on their site
  3. Rate Limiting: Prevent bulk scanning
  4. Robots.txt: Users can restrict crawler access if desired

Residual Risk: Minimal. Email addresses are intentionally published by users on their own sites.

Domain Ownership Verification (Two-Factor)

Mechanism: v1.0.0 requires BOTH verification methods:

1. TXT Record Validation (Required)

Mechanism: Admin adds DNS TXT record _gondulf.{domain} = verified.

Security Properties:

  • Proves DNS control (first factor)
  • Verifiable without user interaction
  • Cacheable for performance
  • Re-verifiable periodically

Implementation:

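import logging

import dns.resolver

logger = logging.getLogger(__name__)

def verify_txt_record(domain: str) -> bool:
    """
    Verify _gondulf.{domain} TXT record exists with value 'verified'.
    Requires consensus from multiple independent resolvers.
    """
    try:
        # Use Google and Cloudflare DNS for redundancy
        resolvers = ['8.8.8.8', '1.1.1.1']
        verified_count = 0

        for resolver_ip in resolvers:
            # configure=False: ignore the host's resolv.conf, query only this server
            resolver = dns.resolver.Resolver(configure=False)
            resolver.nameservers = [resolver_ip]
            resolver.timeout = 5   # per-server timeout (seconds)
            resolver.lifetime = 5  # total time allowed for the query

            answers = resolver.resolve(f'_gondulf.{domain}', 'TXT')
            for rdata in answers:
                # TXT data may arrive as several strings; join before comparing
                txt_value = b''.join(rdata.strings).decode('utf-8', errors='replace')
                if txt_value == 'verified':
                    verified_count += 1
                    break

        # Require consensus from at least 2 resolvers
        return verified_count >= 2

    except Exception as e:
        # Fail securely: any resolver error (timeout, NXDOMAIN) denies verification
        logger.warning(f"DNS verification failed for {domain}: {e}")
        return False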
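
The "cacheable" and "re-verifiable periodically" properties listed above can be sketched as a small time-limited cache around the DNS check. This is an illustrative sketch: `VerificationCache` is a hypothetical name, the one-hour TTL is an assumed value, and caching only positive results is a design assumption so that a newly added TXT record is picked up on the next attempt.

```python
import time
from typing import Callable


class VerificationCache:
    """Hypothetical TTL cache in front of a DNS verification function."""

    def __init__(
        self,
        verify: Callable[[str], bool],
        ttl_seconds: float = 3600.0,  # assumed re-verification interval
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._verify = verify
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires: dict[str, float] = {}  # domain -> expiry time

    def is_verified(self, domain: str) -> bool:
        now = self._clock()
        expiry = self._expires.get(domain)
        if expiry is not None and now < expiry:
            return True  # still within the cached verification window

        # Cache miss or expired entry: re-verify against DNS
        if self._verify(domain):
            self._expires[domain] = now + self._ttl
            return True

        # Failures are never cached, and a stale positive entry is dropped
        self._expires.pop(domain, None)
        return False
```

In use, the cache would wrap the DNS check, for example `VerificationCache(verify_txt_record)`, so a removed TXT record stops authenticating after at most one TTL.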

2. Email Verification via rel="me" (Required)

Mechanism: Email discovered from site's <link rel="me" href="mailto:...">, then verified with code.

Security Properties:

  • Proves website control (can modify HTML)
  • Proves email control (receives and enters code)
  • Follows IndieWeb standards (rel="me")
  • Self-documenting (user declares email publicly)

Implementation:

import logging
from typing import Optional

import requests
from bs4 import BeautifulSoup

logger = logging.getLogger(__name__)

def discover_email_from_site(domain: str) -> Optional[str]:
    """
    Fetch site and discover email from rel="me" link.
    """
    try:
        response = requests.get(f"https://{domain}", timeout=10, allow_redirects=True)
        response.raise_for_status()

        soup = BeautifulSoup(response.content, 'html.parser')
        me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')

        for link in me_links:
            href = link.get('href', '').strip()
            if href.lower().startswith('mailto:'):
                # Drop the scheme and any "?subject=..." query part
                email = href[len('mailto:'):].split('?', 1)[0].strip()
                # validate_email_format is defined under "Email Validation"
                if validate_email_format(email):
                    return email

        return None

    except Exception as e:
        logger.error(f"Failed to discover email for {domain}: {e}")
        return None

Combined Residual Risk: Low. An attacker must control the domain's DNS and, in addition, either the website (to publish their own rel="me" address) or the published email account.

Authorization Security

Authorization Code Security

Properties:

  • Length: 32 bytes (256 bits of entropy)
  • Generation: secrets.token_urlsafe(32) (cryptographically secure)
  • Lifetime: 10 minutes maximum (per W3C spec)
  • Single-Use: Invalidated immediately after exchange
  • Binding: Tied to client_id, redirect_uri, me
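
The properties above can be sketched as follows. This is a minimal in-memory illustration, not the project's storage layer: `issue_authorization_code`, `redeem_authorization_code`, and the module-level `_codes` dict are hypothetical, and the `now` parameter exists only for testing.

```python
import secrets
import time
from typing import Optional

CODE_LIFETIME_SECONDS = 10 * 60  # 10 minutes maximum

_codes: dict[str, dict] = {}  # hypothetical in-memory code storage


def issue_authorization_code(
    client_id: str, redirect_uri: str, me: str, now: Optional[float] = None
) -> str:
    """Create a code bound to the client, redirect URI, and user identity."""
    now = time.time() if now is None else now
    code = secrets.token_urlsafe(32)  # 32 random bytes, 43 URL-safe characters
    _codes[code] = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "me": me,
        "expires_at": now + CODE_LIFETIME_SECONDS,
    }
    return code


def redeem_authorization_code(
    code: str, client_id: str, redirect_uri: str, now: Optional[float] = None
) -> Optional[str]:
    """Return the bound `me` if the code is valid for this client, else None."""
    now = time.time() if now is None else now
    # pop() removes the code before any other check, so it can never be reused
    data = _codes.pop(code, None)
    if data is None:
        return None
    if now > data["expires_at"]:
        return None
    if data["client_id"] != client_id or data["redirect_uri"] != redirect_uri:
        return None  # binding mismatch: fail securely
    return data["me"]
```

Removing the code on the first redemption attempt, even a failed one, trades a little convenience for a simple single-use guarantee.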

Threat: Authorization Code Interception

Risk: Attacker intercepts code from redirect URL.

Mitigations (v1.0.0):

  1. HTTPS Only: Enforce TLS for all communications
  2. Short Lifetime: 10-minute expiration
  3. Single Use: Code invalidated after first use
  4. State Binding: Client validates state parameter (CSRF protection)

Mitigations (Future - PKCE):

  1. Code Challenge: Client sends hash of secret with auth request
  2. Code Verifier: Client proves knowledge of secret on token exchange
  3. No Interception Value: Code useless without original secret

ADR-003 Decision: PKCE deferred to v1.1.0 to maintain MVP simplicity.

Residual Risk: Low with HTTPS + short lifetime, minimal with PKCE (future).
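
For reference, the PKCE exchange described above (S256 method from RFC 7636) can be sketched in a few lines. This is an illustration of the planned v1.1.0 mechanism, not existing Gondulf code, and the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Client side: generate a random secret (43 URL-safe characters)."""
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    """Client side: BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Server side: check the verifier sent at token exchange against the
    challenge stored with the authorization code."""
    try:
        expected = code_challenge_s256(verifier)
    except UnicodeEncodeError:
        return False  # verifiers must be ASCII
    return secrets.compare_digest(expected.encode(), stored_challenge.encode())
```

An intercepted authorization code is useless on its own because the attacker never sees the verifier, only its hash.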

Threat: Code Replay Attack

Risk: Attacker reuses previously valid authorization code.

Mitigations:

  1. Single-Use Enforcement: Mark code as used in storage
  2. Immediate Invalidation: Delete code after exchange
  3. Concurrent Use Detection: Log warning if used code presented again

Implementation:

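def exchange_code(code: str) -> Optional[dict]:
    """
    Exchange authorization code for token.
    Returns None if code invalid, expired, or already used.
    """
    # Retrieve code data
    code_data = code_storage.get(code)
    if not code_data:
        logger.warning("Code not found or expired")
        return None

    # Check if already used
    if code_data.get('used'):
        # Log the client, never the code itself (see "Logging Security")
        logger.error(f"Code replay attempt detected for client {code_data.get('client_id')}")
        # SECURITY: Potential replay attack, alert admin
        return None

    # Mark as used IMMEDIATELY (before token generation).
    # Note: get-then-set is not atomic; concurrent requests need a lock
    # or an atomic storage operation to close the race.
    code_data['used'] = True
    code_storage.set(code, code_data)

    # Generate token for the identity and client the code was bound to
    token = generate_token(code_data['me'], code_data['client_id'])
    return {"access_token": token, "token_type": "Bearer", "me": code_data['me']}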

Residual Risk: Negligible.

Access Token Security

Properties:

  • Format: Opaque tokens (v1.0.0), not JWT
  • Length: 32 bytes (256 bits of entropy)
  • Generation: secrets.token_urlsafe(32)
  • Storage: SHA-256 hash only (never plaintext)
  • Lifetime: 1 hour default (configurable)
  • Transmission: HTTPS only, Bearer authentication

Threat: Token Theft

Risk: Attacker steals access token from storage or transmission.

Mitigations:

  1. TLS Enforcement: HTTPS only in production
  2. Hashed Storage: Store SHA-256 hash, not plaintext
  3. Short Lifetime: 1-hour expiration (configurable)
  4. Revocation: Admin can revoke tokens (future)
  5. Secure Headers: Set Cache-Control: no-store, Pragma: no-cache

Token Storage:

import hashlib
import secrets
from datetime import datetime, timedelta

TOKEN_LIFETIME = timedelta(hours=1)  # 1 hour default (configurable)

def generate_token(me: str, client_id: str) -> str:
    """
    Generate access token and store hash in database.
    """
    # Generate token (returned to client, never stored)
    token = secrets.token_urlsafe(32)

    # Store only hash (irreversible)
    token_hash = hashlib.sha256(token.encode()).hexdigest()

    issued_at = datetime.utcnow()
    expires_at = issued_at + TOKEN_LIFETIME

    db.execute('''
        INSERT INTO tokens (token_hash, me, client_id, scope, issued_at, expires_at)
        VALUES (?, ?, ?, ?, ?, ?)
    ''', (token_hash, me, client_id, "", issued_at, expires_at))

    return token

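Residual Risk: Low. A stolen database exposes only SHA-256 hashes, which cannot be presented as tokens; a token intercepted in transit remains usable until it expires.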

Threat: Timing Attacks on Token Verification

Risk: Attacker uses timing differences to guess valid tokens character-by-character.

Mitigations:

  1. Constant-Time Comparison: Use secrets.compare_digest()
  2. Hash Comparison: Compare hashes, not tokens
  3. Logging Delays: Random delay on failed validation

Implementation:

import hashlib
import secrets
from datetime import datetime
from typing import Optional

def verify_token(provided_token: str) -> Optional[dict]:
    """
    Verify access token using constant-time comparison.
    """
    # Hash provided token
    provided_hash = hashlib.sha256(provided_token.encode()).hexdigest()

    # Lookup in database
    token_data = db.query_one('''
        SELECT token_hash, me, client_id, scope, expires_at, revoked
        FROM tokens
        WHERE token_hash = ?
    ''', (provided_hash,))

    if not token_data:
        return None

    # Defense in depth: re-check the stored hash in constant time.
    # The SQL lookup compares SHA-256 digests, not raw tokens, so its timing
    # does not reveal the token character by character.
    if not secrets.compare_digest(token_data['token_hash'], provided_hash):
        return None

    # Check expiration
    if datetime.utcnow() > token_data['expires_at']:
        return None

    # Check revocation
    if token_data.get('revoked'):
        return None

    return token_data

Residual Risk: Negligible.

Input Validation

URL Validation Security

Critical: Improper URL validation enables phishing and open redirect attacks.

Threat: Open Redirect via redirect_uri

Risk: Attacker tricks user into authorizing malicious redirect_uri, steals authorization code.

Mitigations:

  1. Domain Matching: Require redirect_uri domain match client_id domain
  2. Subdomain Validation: Allow subdomains of client_id domain
  3. Registered URIs: Future feature to pre-register alternate domains
  4. User Warning: Display warning if domains differ
  5. HTTPS Enforcement: Require HTTPS for non-localhost

Validation Logic:

from urllib.parse import urlparse

def validate_redirect_uri(redirect_uri: str, client_id: str, registered_uris: list) -> tuple[bool, str]:
    """
    Validate redirect_uri against client_id.
    Returns (is_valid, warning_message).
    """
    redirect_parsed = urlparse(redirect_uri)
    client_parsed = urlparse(client_id)

    # Reject URLs without a hostname (e.g. "javascript:..." or relative paths)
    if not redirect_parsed.hostname or not client_parsed.hostname:
        return False, "redirect_uri and client_id must be absolute URLs"

    redirect_domain = redirect_parsed.hostname.lower()
    client_domain = client_parsed.hostname.lower()

    # Must be HTTPS (plain HTTP allowed for localhost only)
    if redirect_domain == 'localhost':
        if redirect_parsed.scheme not in ('http', 'https'):
            return False, "redirect_uri must use HTTP or HTTPS"
    elif redirect_parsed.scheme != 'https':
        return False, "redirect_uri must use HTTPS"

    # Exact match: OK
    if redirect_domain == client_domain:
        return True, ""

    # Subdomain: OK
    if redirect_domain.endswith('.' + client_domain):
        return True, ""

    # Registered URI: OK (future)
    if redirect_uri in registered_uris:
        return True, ""

    # Different domain: WARNING
    warning = f"Warning: Redirect to different domain ({redirect_domain})"
    return True, warning  # Allow but warn user

Residual Risk: Low, user must approve redirect with warning.

Threat: Phishing via Malicious client_id

Risk: Attacker uses client_id of legitimate-looking domain (typosquatting).

Mitigations:

  1. Display Full URL: Show complete client_id to user, not just app name
  2. Fetch Verification: Verify client_id is fetchable (real domain)
  3. Subdomain Check: Warn if client_id is subdomain of well-known domain
  4. Certificate Validation: Verify SSL certificate validity
  5. User Education: Inform users to verify client_id carefully

UI Display:

Sign in to:
  Application Name (if available)
  https://client.example.com  ← Full URL always displayed

Redirect to:
  https://client.example.com/callback

Residual Risk: Moderate, requires user vigilance.

Threat: URL Parameter Injection

Risk: Attacker injects malicious parameters via crafted URLs.

Mitigations:

  1. Pydantic Validation: Use Pydantic models for all parameters
  2. Type Enforcement: Strict type checking (str, not Any)
  3. Allowlist Validation: Only accept expected parameters
  4. SQL Parameterization: Use parameterized queries (prevent SQL injection)
  5. HTML Encoding: Encode all user input in HTML responses

Pydantic Models:

from typing import Literal

from pydantic import BaseModel, HttpUrl, Field

class AuthorizeRequest(BaseModel):
    me: HttpUrl
    client_id: HttpUrl
    redirect_uri: HttpUrl
    state: str = Field(min_length=1, max_length=512)
    response_type: Literal["code"]
    scope: str = ""  # Optional, ignored in v1.0.0

    class Config:
        extra = "forbid"  # Reject unknown parameters

Residual Risk: Minimal, Pydantic provides strong validation.

HTML Parsing Security (rel="me" Discovery)

Threat: Malicious HTML Injection

Risk: Attacker's site contains malicious HTML to exploit parser.

Mitigations:

  1. Robust Parser: Use BeautifulSoup (handles malformed HTML safely)
  2. Link Extraction Only: Only extract href attributes, no script execution
  3. Timeout: 10-second timeout for HTTP requests
  4. Size Limit: Limit response size (prevent memory exhaustion)
  5. HTTPS Required: Fetch over TLS only
  6. Certificate Validation: Verify SSL certificates

Implementation:

import logging
from typing import Optional

import requests
from bs4 import BeautifulSoup

logger = logging.getLogger(__name__)

MAX_SIZE = 5 * 1024 * 1024  # 5MB

def discover_email_from_site(domain: str) -> Optional[str]:
    """
    Safely discover email from rel="me" link.
    """
    try:
        # Fetch with safety limits. requests.get() has no max_redirects
        # argument; the redirect cap is set on a Session instead.
        with requests.Session() as session:
            session.max_redirects = 5
            response = session.get(
                f"https://{domain}",
                timeout=10,
                allow_redirects=True,
                stream=True  # Don't load entire response into memory
            )
            response.raise_for_status()

            # Limit response size (prevent memory exhaustion);
            # decode_content=True undoes gzip/deflate transfer encoding
            content = response.raw.read(MAX_SIZE, decode_content=True)

        # Parse HTML (BeautifulSoup handles malformed HTML safely)
        soup = BeautifulSoup(content, 'html.parser')

        # Find rel="me" links (no script execution)
        me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')

        # Extract mailto: links only
        for link in me_links:
            href = link.get('href', '').strip()
            if href.lower().startswith('mailto:'):
                # Drop the scheme and any "?subject=..." query part
                email = href[len('mailto:'):].split('?', 1)[0].strip()
                # Validate email format before returning
                if validate_email_format(email):
                    return email

        return None

    except requests.exceptions.SSLError as e:
        logger.error(f"SSL certificate validation failed for {domain}: {e}")
        return None
    except Exception as e:
        logger.error(f"Failed to discover email for {domain}: {e}")
        return None

Residual Risk: Very low. BeautifulSoup with html.parser tolerates malformed markup and never executes scripts, and the timeout and size limit bound resource use.

Email Validation

Threat: Email Injection Attacks

Risk: Attacker crafts malicious email address in rel="me" link.

Mitigations:

  1. Format Validation: Strict email regex (RFC 5322)
  2. No User Input: Email discovered from site (not user-provided)
  3. SMTP Library: Use well-tested library (smtplib)
  4. Content Encoding: Encode email content properly
  5. Rate Limiting: Prevent abuse

Validation:

import re

def validate_email_format(email: str) -> bool:
    """
    Validate email address format.
    """
    # Sanity checks
    if len(email) > 254:  # RFC 5321 maximum
        return False
    if email.count('@') != 1:
        return False

    # Basic format check (RFC 5322 simplified).
    # fullmatch is used because re.match with '$' would accept a trailing
    # newline, which could enable header injection.
    email_regex = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
    return re.fullmatch(email_regex, email) is not None

Note: Domain matching is NOT enforced in v1.0.0. User may have email at different domain than their identity site (e.g., phil@gmail.com for phil.example.com). This is acceptable as user explicitly publishes the email on their site.

Residual Risk: Low, standard validation patterns.
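
The "SMTP Library" and "Content Encoding" mitigations, together with the STARTTLS requirement stated earlier, can be sketched with the standard library alone. This is an illustrative sketch: the function names, message wording, and default port 587 are assumptions rather than Gondulf's actual mail code.

```python
import smtplib
import ssl
from email.message import EmailMessage
from typing import Optional


def build_verification_message(
    sender: str, recipient: str, code: str, domain: str
) -> EmailMessage:
    """Build the verification email.

    EmailMessage rejects header values containing line breaks (ValueError),
    which blocks header injection via a crafted address.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Verification code for {domain}"
    msg.set_content(
        f"Your verification code is {code}. It expires in 15 minutes.\n"
        "Only use this code if you initiated this login.\n"
    )
    return msg


def send_verification_message(
    msg: EmailMessage,
    host: str,
    port: int = 587,  # assumed submission port
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> None:
    """Send over SMTP, refusing to continue without STARTTLS."""
    context = ssl.create_default_context()  # verifies the server certificate
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.starttls(context=context)  # raises if the server lacks STARTTLS
        if username and password:
            smtp.login(username, password)
        smtp.send_message(msg)
```

Because `starttls()` raises when the server does not offer the extension, the code is never sent over an unencrypted connection.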

Network Security

TLS/HTTPS Enforcement

Production Requirements:

  • All endpoints MUST use HTTPS
  • Minimum TLS 1.2 (prefer TLS 1.3)
  • Strong cipher suites only
  • Valid SSL certificate (not self-signed)

Configuration:

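# In production configuration
from starlette.middleware.httpsredirect import HTTPSRedirectMiddleware

if not DEBUG:
    # Enforce HTTPS
    app.add_middleware(HTTPSRedirectMiddleware)

    # Add security headers.
    # SecureHeadersMiddleware is not part of Starlette; it is assumed
    # here to be a project-defined middleware class.
    app.add_middleware(
        SecureHeadersMiddleware,
        hsts="max-age=31536000; includeSubDomains",
        content_security_policy="default-src 'self'",
        x_frame_options="DENY",
        x_content_type_options="nosniff"
    )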

Development Exception:

  • HTTP allowed for localhost only
  • Never in production

Residual Risk: Negligible if properly configured.

Security Headers

Required Headers:

# Prevent clickjacking
X-Frame-Options: DENY

# Prevent MIME sniffing
X-Content-Type-Options: nosniff

# XSS protection (legacy browsers)
X-XSS-Protection: 1; mode=block

# HSTS (HTTPS enforcement)
Strict-Transport-Security: max-age=31536000; includeSubDomains

# CSP (limit resource loading)
Content-Security-Policy: default-src 'self'; style-src 'self' 'unsafe-inline'

# Referrer policy (privacy)
Referrer-Policy: strict-origin-when-cross-origin

Implementation:

@app.middleware("http")
async def add_security_headers(request: Request, call_next):
    response = await call_next(request)
    response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-Content-Type-Options"] = "nosniff"
    response.headers["X-XSS-Protection"] = "1; mode=block"
    if not DEBUG:
        response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    response.headers["Content-Security-Policy"] = "default-src 'self'; style-src 'self' 'unsafe-inline'"
    response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
    return response

Data Security

Data Minimization (Privacy)

Principle: Collect and store ONLY essential data.

Stored Data:

  • Domain name (user identity, required)
  • Token hashes (security, required)
  • Client IDs (protocol, required)
  • Timestamps (auditing, required)

Never Stored:

  • Email addresses (after verification)
  • Plaintext tokens
  • User-Agent strings
  • IP addresses (except rate limiting, temporary)
  • Browsing history
  • Personal information

Email Handling:

# Email discovered from rel="me" link (not user-provided)
# Stored ONLY during verification (in-memory, 15-min TTL)
verification_codes[code_id] = {
    "email": email,  # ← Discovered from site, exists ONLY here, NEVER in database
    "code": code,
    "domain": domain,
    "expires_at": datetime.utcnow() + timedelta(minutes=15)
}

# After verification: email is deleted, only domain + timestamp stored
db.execute('''
    INSERT INTO domains (domain, verification_method, verified_at, last_email_check)
    VALUES (?, 'two_factor', ?, ?)
''', (domain, datetime.utcnow(), datetime.utcnow()))
# Note: NO email address in database, only verification timestamp

rel="me" Discovery:

  • Email addresses are public (user publishes on their site)
  • Server fetches email from user's site (not user input)
  • Reduces social engineering risk (can't claim arbitrary email)
  • Follows IndieWeb standards for identity

Database Security

SQLite Security:

  1. File Permissions: 600 (owner read/write only)
  2. Encryption at Rest: Use encrypted filesystem (LUKS, dm-crypt)
  3. Backup Encryption: Encrypt backup files (GPG)
  4. SQL Injection Prevention: Parameterized queries only

Parameterized Queries:

# GOOD: Parameterized (safe)
db.execute(
    "SELECT * FROM tokens WHERE token_hash = ?",
    (token_hash,)
)

# BAD: String interpolation (vulnerable)
db.execute(
    f"SELECT * FROM tokens WHERE token_hash = '{token_hash}'"
)  # ← NEVER DO THIS

File Permissions:

# Set restrictive permissions
chmod 600 /data/gondulf.db
chown gondulf:gondulf /data/gondulf.db

Logging Security

Principle: Log security events, NEVER log sensitive data.

Log Security Events:

  • Failed authentication attempts
  • Authorization grants (domain + client_id)
  • Token generation (hash prefix only)
  • Email verification attempts
  • DNS verification results
  • Error conditions

Never Log:

  • Email addresses (PII)
  • Full access tokens
  • Verification codes
  • Authorization codes
  • IP addresses (production)

Safe Logging Examples:

# GOOD: Domain only (public information)
logger.info(f"Authorization granted for {domain} to {client_id}")

# GOOD: Token hash prefix for correlation (never any part of the token itself)
logger.debug(f"Token generated: {token_hash[:8]}...")

# GOOD: Error without sensitive data
logger.error(f"Email send failed for domain {domain}")

# BAD: Email address (PII)
logger.info(f"Verification sent to {email}")  # ← NEVER

# BAD: Full token (security)
logger.debug(f"Token: {token}")  # ← NEVER
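
As a safety net behind these conventions, a logging filter can redact anything that looks like an email address before a record is written. This is a sketch of one possible approach, not part of the documented design: `RedactEmailFilter` is a hypothetical name, and the pattern is a deliberately simple approximation.

```python
import logging
import re

# Simplified email pattern; intentionally broad so near-misses are also redacted
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email-like substrings in log messages with a placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() merges msg with args, so formatted values are covered too
        message = record.getMessage()
        record.msg = _EMAIL_RE.sub("[redacted-email]", message)
        record.args = ()  # already merged into msg
        return True  # never drop the record, only rewrite it


# Attach to a handler so every record passing through it is filtered
handler = logging.StreamHandler()
handler.addFilter(RedactEmailFilter())
```

A filter added to a logger applies only to records logged directly on that logger, so attaching it to the handler gives broader coverage.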

Dependency Security

Dependency Management

Principles:

  1. Minimal Dependencies: Prefer standard library
  2. Vetted Libraries: Only well-maintained, popular libraries
  3. Version Pinning: Pin exact versions in requirements.txt
  4. Security Scanning: Regular vulnerability scanning
  5. Update Strategy: Security patches applied promptly

Security Scanning:

# Scan for known vulnerabilities
uv run pip-audit

# Alternative: safety check
uv run safety check

Update Policy:

  • Security patches: Apply within 24 hours (critical), 7 days (high)
  • Minor versions: Review and test before updating
  • Major versions: Evaluate breaking changes, test thoroughly

Secrets Management

Environment Variables (v1.0.0):

# Required secrets
GONDULF_SECRET_KEY=<256-bit random value>
GONDULF_SMTP_PASSWORD=<SMTP password>

# Optional secrets
GONDULF_DATABASE_ENCRYPTION_KEY=<for encrypted backups>

Secret Generation:

# Generate SECRET_KEY (256 bits)
python -c "import secrets; print(secrets.token_urlsafe(32))"

Storage:

  • Development: .env file (not committed)
  • Production: Docker secrets or environment variables
  • Never hardcode secrets in code

Future: Integrate with HashiCorp Vault or AWS Secrets Manager.

Rate Limiting (Future)

v1.0.0: Not implemented (acceptable for small deployments).

Future Implementation:

Endpoint             Limit         Window       Key
/authorize           10 requests   1 minute     IP
/token               30 requests   1 minute     client_id
Email verification   3 codes       1 hour       email
Code submission      3 attempts    15 minutes   session

Implementation Strategy:

  • Use Redis for distributed rate limiting
  • Token bucket algorithm
  • Exponential backoff on failures
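
The token bucket algorithm named above can be sketched as follows. This version keeps state in process memory purely for illustration, where the planned design uses Redis for distributed limiting. `TokenBucket` and `allow_request` are hypothetical names, and the defaults mirror the /authorize row (10 requests per minute).

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when rate limited."""
        now = self._clock()
        elapsed = now - self._updated
        self._updated = now
        # Refill in proportion to elapsed time, never above capacity
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


_buckets: dict[str, TokenBucket] = {}  # one bucket per key (IP, client_id, ...)


def allow_request(key: str, capacity: int = 10, window_seconds: float = 60.0) -> bool:
    """Rate-limit by key, e.g. allow_request(client_ip) for /authorize."""
    if key not in _buckets:
        _buckets[key] = TokenBucket(capacity, capacity / window_seconds)
    return _buckets[key].allow()
```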

Security Testing

Required Security Tests

  1. Input Validation:

    • Malformed URLs (me, client_id, redirect_uri)
    • SQL injection attempts
    • XSS attempts
    • Email injection

  2. Authentication:

    • Expired code rejection
    • Used code rejection
    • Invalid code rejection
    • Brute force resistance

  3. Authorization:

    • State parameter validation
    • Redirect URI validation
    • Open redirect prevention

  4. Token Security:

    • Timing attack resistance
    • Token theft scenarios
    • Expiration enforcement

  5. TLS/HTTPS:

    • HTTP rejection in production
    • Security headers presence
    • Certificate validation

Security Scanning Tools

Required Tools:

  • bandit: Python security linter
  • pip-audit: Dependency vulnerability scanner
  • pytest: Security-focused test cases

CI/CD Integration:

# GitHub Actions example
security:
  - name: Run Bandit
    run: uv run bandit -r src/gondulf

  - name: Scan Dependencies
    run: uv run pip-audit

  - name: Run Security Tests
    run: uv run pytest tests/security/

Incident Response

Security Event Monitoring

Monitor For:

  1. Multiple failed authentication attempts
  2. Authorization code reuse attempts
  3. Invalid token presentation
  4. Unusual DNS verification failures
  5. Email send failures (potential abuse)

Alerting (future):

  • Admin email on critical events
  • Webhook integration (Slack, Discord)
  • Metrics dashboard (Grafana)

Breach Response Plan (Future)

If Access Tokens Compromised:

  1. Revoke all active tokens
  2. Force re-authentication
  3. Notify affected users (via domain)
  4. Rotate SECRET_KEY
  5. Audit logs for suspicious activity

If Database Compromised:

  1. Assess data exposure (only hashes + domains)
  2. Rotate all tokens
  3. Review access logs
  4. Notify users if domains exposed

Compliance Considerations

GDPR Compliance

Personal Data Stored:

  • Domain names (considered PII in some jurisdictions)
  • Timestamps (associated with domains)

GDPR Rights:

  • Right to Access: Admin can query database
  • Right to Erasure: Admin can delete domain records
  • Right to Portability: Data export feature (future)

Privacy Policy (required):

  • Document what data is collected (domains, timestamps)
  • Document how data is used (authentication)
  • Document retention policy (indefinite unless deleted)
  • Provide contact for data requests

Security Disclosure

Security Policy (future):

  • Responsible disclosure process
  • Security contact (security@domain)
  • GPG key for encrypted reports
  • Acknowledgments for researchers

Security Roadmap

v1.0.0 (MVP)

  • Two-factor domain verification (DNS TXT + Email via rel="me")
  • rel="me" email discovery (IndieWeb standard)
  • HTML parsing security (BeautifulSoup)
  • TLS/HTTPS enforcement
  • Secure token generation (opaque, hashed)
  • URL validation (open redirect prevention)
  • Input validation (Pydantic)
  • Security headers
  • Minimal data collection (no email storage)

v1.1.0

  • PKCE support (code challenge/verifier)
  • Rate limiting (Redis-based)
  • Token revocation endpoint
  • Enhanced logging

v1.2.0

  • WebAuthn support (passwordless)
  • Hardware security key support
  • Admin dashboard (audit logs)
  • Security metrics

v2.0.0

  • Multi-factor authentication
  • Federated identity providers
  • Advanced threat detection
  • SOC 2 compliance preparation

References