Phil Skentelbery 6f06aebf40 docs: add Phase 2 domain verification design and clarifications

Add comprehensive Phase 2 documentation:
- Complete design document for two-factor domain verification
- Implementation guide with code examples
- ADR for implementation decisions (ADR-0004)
- ADR for rel="me" email discovery (ADR-008)
- Phase 1 impact assessment
- All 23 clarification questions answered
- Updated architecture docs (indieauth-protocol, security)
- Updated ADR-005 with rel="me" approach
- Updated backlog with technical debt items

Design ready for Phase 2 implementation.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-20 13:05:09 -07:00

Security Architecture

Security Philosophy

Gondulf follows a defense-in-depth security model with these core principles:

Secure by Default: Security features enabled out of the box
Fail Securely: Errors default to denying access, not granting it
Least Privilege: Collect and store minimum necessary data
Transparency: Security decisions documented and auditable
Standards Compliance: Follow OAuth 2.0 and IndieAuth security best practices

Threat Model

Assets to Protect

Primary Assets:

User domain identities (the me parameter)
Access tokens (prove user identity to clients)
Authorization codes (short-lived, exchange for tokens)

Secondary Assets:

Email verification codes (prove email ownership)
Domain verification status (cached TXT record checks)
Client metadata (cached application information)

Explicitly NOT Protected (by design):

Passwords (none stored)
Personal user data beyond domain (privacy principle)
Client secrets (OAuth 2.0 public clients)

Threat Actors

External Attackers:

Phishing attempts (fake clients)
Token theft (network interception)
Open redirect exploitation
CSRF attacks
Brute force attacks (code guessing)

Compromised Clients:

Malicious client applications
Client impersonation
Redirect URI manipulation

System Compromise:

Database access (SQLite file theft)
Server memory access (in-memory code theft)
Log file access (token exposure)

Out of Scope (v1.0.0)

DDoS attacks (handled by infrastructure)
Zero-day vulnerabilities in dependencies
Physical access to server
Social engineering attacks on users
DNS hijacking (external to application)

Authentication Security

Two-Factor Domain Verification (v1.0.0)

Mechanism: Users prove domain ownership through TWO independent factors:

DNS TXT Record: Proves DNS control (_gondulf.{domain} = verified)
Email via rel="me": Proves email control (discovered from site's rel="me" link)

Security Model: An attacker must compromise BOTH factors to authenticate fraudulently. This is significantly stronger than single-factor verification.

Threat: Email Interception

Risk: Attacker intercepts email containing verification code.

Mitigations:

Two-Factor Requirement: Email alone is insufficient (DNS also required)
Short Code Lifetime: 15-minute expiration
Single Use: Code invalidated after verification
Rate Limiting: Max 3 code requests per domain per hour
TLS Email Delivery: Require STARTTLS for SMTP
Display Warning: "Only request code if you initiated this login"

Residual Risk: Low. Even with email interception, attacker still needs DNS control.

Threat: Code Brute Force

Risk: Attacker guesses 6-digit verification code.

Mitigations:

Two-Factor Requirement: Code alone is insufficient (DNS also required)
Sufficient Entropy: 1,000,000 possible codes (6 digits)
Attempt Limiting: Max 3 attempts per email
Short Lifetime: 15-minute window
Rate Limiting: Max 3 codes per domain per hour
Single-Use: Code invalidated after use

Math:

3 attempts ÷ 1,000,000 codes = 0.0003% success probability per issued code
15-minute window limits attack time
Even if guessed, attacker still needs DNS control

Residual Risk: Very low. Two-factor requirement makes brute force insufficient.
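The code-handling rules above (6 digits, 3 attempts, 15-minute lifetime, single use) can be sketched as follows. This is a minimal illustration, not Gondulf's actual implementation: the names PendingCode, generate_code, and check_code, and the plain dict used as the store, are assumptions made for the example.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional

CODE_TTL_SECONDS = 15 * 60  # 15-minute lifetime, as stated above
MAX_ATTEMPTS = 3            # attempt limit, as stated above


@dataclass
class PendingCode:
    """Hypothetical in-memory record for one outstanding verification code."""
    code: str
    expires_at: float
    attempts: int = 0


def generate_code() -> str:
    """Return a uniformly random, zero-padded 6-digit code (000000-999999)."""
    return f"{secrets.randbelow(10**6):06d}"


def check_code(pending: Dict[str, PendingCode], domain: str, submitted: str,
               now: Optional[float] = None) -> bool:
    """Check a submitted code; the entry is dropped on success, expiry, or lockout."""
    now = time.time() if now is None else now
    entry = pending.get(domain)
    if entry is None:
        return False
    if now >= entry.expires_at or entry.attempts >= MAX_ATTEMPTS:
        del pending[domain]  # fail securely: stale entries are removed
        return False
    entry.attempts += 1
    if secrets.compare_digest(entry.code.encode(), submitted.encode()):
        del pending[domain]  # single use
        return True
    if entry.attempts >= MAX_ATTEMPTS:
        del pending[domain]  # attempt limit reached: code is burned
    return False


# Chance that a guesser hits one issued code within the attempt limit
guess_probability = MAX_ATTEMPTS / 10**6  # 3 in 1,000,000, i.e. 0.0003%
```

Passing the clock in as `now` keeps the expiry logic testable without waiting 15 minutes.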

Threat: DNS TXT Record Spoofing

Risk: Attacker attempts to spoof DNS responses.

Mitigations:

Multiple Resolvers: Query 2+ independent DNS servers (Google, Cloudflare)
Consensus Required: Require agreement from at least 2 resolvers
DNSSEC Support: Validate DNSSEC signatures when available (future)
Timeout Handling: Fail securely if DNS unavailable
Logging: Log all DNS verification attempts

Residual Risk: Low. Spoofing multiple independent resolvers is difficult.
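The consensus rule can be separated from the network lookups, which makes the fail-secure behaviour easy to test. The sketch below assumes the lookups have already been collected into a mapping from resolver IP to the TXT values it returned, with None for a failed lookup. The function name txt_consensus and that mapping shape are assumptions for illustration.

```python
from typing import Iterable, Mapping, Optional


def txt_consensus(answers: Mapping[str, Optional[Iterable[str]]],
                  expected: str = "verified",
                  required: int = 2) -> bool:
    """
    Decide whether enough resolvers confirmed the expected TXT value.

    `answers` maps resolver IP -> TXT values seen, or None if the lookup
    timed out or errored. A failed lookup never counts as confirmation.
    """
    agreeing = 0
    for resolver_ip, values in answers.items():
        if values is None:
            continue  # fail securely: no answer means no vote
        if expected in set(values):
            agreeing += 1
    return agreeing >= required
```

For example, `txt_consensus({"8.8.8.8": ["verified"], "1.1.1.1": None})` is False: one resolver alone is not enough, so a spoofed or unavailable resolver cannot produce a positive result by itself.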

Threat: rel="me" Link Spoofing

Risk: Attacker compromises user's website to add malicious rel="me" link.

Mitigations:

Two-Factor Requirement: Website compromise alone insufficient (DNS also required)
HTTPS Required: Fetch site over TLS (prevents MITM)
Certificate Validation: Verify SSL certificate
Email Domain Matching: Email should match site domain (warning if not)
User Education: Inform users to secure their website

Residual Risk: Moderate. If attacker compromises both DNS and website, they can authenticate. This is acceptable as it represents full domain compromise.

Threat: Email Address Enumeration

Risk: Attacker discovers email addresses by triggering rel="me" discovery.

Mitigations:

Public Information: rel="me" links are intentionally public
User Awareness: Users know they're publishing email on their site
Rate Limiting: Prevent bulk scanning
Robots.txt: Users can restrict crawler access if desired

Residual Risk: Minimal. Email addresses are intentionally published by users on their own sites.

Domain Ownership Verification (Two-Factor)

Mechanism: v1.0.0 requires BOTH verification methods:

1. TXT Record Validation (Required)

Mechanism: Admin adds DNS TXT record _gondulf.{domain} = verified.

Security Properties:

Proves DNS control (first factor)
Verifiable without user interaction
Cacheable for performance
Re-verifiable periodically

Implementation:

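import logging

import dns.exception
import dns.resolver

logger = logging.getLogger(__name__)

def verify_txt_record(domain: str) -> bool:
    """
    Verify _gondulf.{domain} TXT record exists with value 'verified'.
    Requires consensus from multiple independent resolvers.
    """
    # Use Google and Cloudflare DNS for redundancy
    resolvers = ['8.8.8.8', '1.1.1.1']
    verified_count = 0

    for resolver_ip in resolvers:
        # configure=False: ignore the host's resolver config, use only this server
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [resolver_ip]
        resolver.timeout = 5   # per-query timeout (seconds)
        resolver.lifetime = 5  # total time budget for the lookup (seconds)

        try:
            answers = resolver.resolve(f'_gondulf.{domain}', 'TXT')
        except dns.exception.DNSException as e:
            # Fail securely: a failed lookup never counts toward consensus
            logger.warning(f"DNS verification failed for {domain} via {resolver_ip}: {e}")
            continue

        for rdata in answers:
            # A TXT record may be split into several strings; join them
            txt_value = b''.join(rdata.strings).decode('utf-8', errors='replace')
            if txt_value == 'verified':
                verified_count += 1
                break

    # Require consensus from at least 2 resolvers
    return verified_count >= 2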

2. Email Verification via rel="me" (Required)

Mechanism: Email discovered from site's <link rel="me" href="mailto:...">, then verified with code.

Security Properties:

Proves website control (can modify HTML)
Proves email control (receives and enters code)
Follows IndieWeb standards (rel="me")
Self-documenting (user declares email publicly)

Implementation:

from typing import Optional

from bs4 import BeautifulSoup
import requests

def discover_email_from_site(domain: str) -> Optional[str]:
    """
    Fetch site and discover email from rel="me" link.
    """
    try:
        response = requests.get(f"https://{domain}", timeout=10, allow_redirects=True)
        response.raise_for_status()

        soup = BeautifulSoup(response.content, 'html.parser')
        me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')

        for link in me_links:
            href = link.get('href', '').strip()
            if href.lower().startswith('mailto:'):
                # Strip only the leading scheme and drop any ?subject=... part
                email = href[len('mailto:'):].split('?', 1)[0].strip()
                if validate_email_format(email):
                    return email

        return None

    except Exception as e:
        logger.error(f"Failed to discover email for {domain}: {e}")
        return None

Combined Residual Risk: Low. Attacker must compromise DNS, website, and email account to authenticate fraudulently.

Authorization Security

Authorization Code Security

Properties:

Length: 32 bytes (256 bits of entropy)
Generation: secrets.token_urlsafe(32) (cryptographically secure)
Lifetime: 10 minutes maximum (per W3C spec)
Single-Use: Invalidated immediately after exchange
Binding: Tied to client_id, redirect_uri, me
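A minimal sketch of how these properties fit together, assuming a plain dict as the code store. The names issue_code and redeem_code are hypothetical, and a real deployment would need a store that makes the redeem step atomic across workers.

```python
import secrets
import time
from typing import Optional

CODE_LIFETIME_SECONDS = 600  # 10 minutes maximum


def issue_code(store: dict, client_id: str, redirect_uri: str, me: str,
               now: Optional[float] = None) -> str:
    """Create an authorization code bound to client_id, redirect_uri, and me."""
    now = time.time() if now is None else now
    code = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoding
    store[code] = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "me": me,
        "expires_at": now + CODE_LIFETIME_SECONDS,
    }
    return code


def redeem_code(store: dict, code: str, client_id: str, redirect_uri: str,
                now: Optional[float] = None) -> Optional[str]:
    """Return `me` if the code is valid for this client, else None."""
    now = time.time() if now is None else now
    data = store.pop(code, None)  # pop: a second redemption finds nothing
    if data is None or now >= data["expires_at"]:
        return None
    if data["client_id"] != client_id or data["redirect_uri"] != redirect_uri:
        return None  # binding mismatch; the code stays consumed
    return data["me"]
```

Removing the code before checking the binding means a mismatched attempt also burns the code, which errs on the side of denying access.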

Threat: Authorization Code Interception

Risk: Attacker intercepts code from redirect URL.

Mitigations (v1.0.0):

HTTPS Only: Enforce TLS for all communications
Short Lifetime: 10-minute expiration
Single Use: Code invalidated after first use
State Binding: Client validates state parameter (CSRF protection)

Mitigations (Future - PKCE):

Code Challenge: Client sends hash of secret with auth request
Code Verifier: Client proves knowledge of secret on token exchange
No Interception Value: Code useless without original secret

ADR-003 Decision: PKCE deferred to v1.1.0 to maintain MVP simplicity.

Residual Risk: Low with HTTPS + short lifetime, minimal with PKCE (future).
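For orientation, the challenge/verifier check that PKCE adds can be sketched with the S256 method from RFC 7636: the challenge is the unpadded base64url encoding of the SHA-256 digest of the verifier. The function names below are illustrative and are not part of Gondulf v1.0.0.

```python
import base64
import hashlib
import secrets


def make_code_challenge(code_verifier: str) -> str:
    """Compute the S256 challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(code_verifier: str, stored_challenge: str) -> bool:
    """Check the verifier sent at token exchange against the stored challenge."""
    try:
        computed = make_code_challenge(code_verifier)
    except UnicodeEncodeError:
        return False  # verifiers are ASCII by definition; reject anything else
    return secrets.compare_digest(computed.encode(), stored_challenge.encode())
```

An intercepted authorization code is useless to an attacker here because the token endpoint would also demand the verifier, which never travels through the browser redirect.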

Threat: Code Replay Attack

Risk: Attacker reuses previously valid authorization code.

Mitigations:

Single-Use Enforcement: Mark code as used in storage
Immediate Invalidation: Delete code after exchange
Concurrent Use Detection: Log warning if used code presented again

Implementation:

def exchange_code(code: str) -> Optional[dict]:
    """
    Exchange authorization code for token.
    Returns None if code invalid, expired, or already used.
    """
    # Retrieve code data
    code_data = code_storage.get(code)
    if not code_data:
        logger.warning("Code not found or expired")
        return None

    # Check if already used
    if code_data.get('used'):
        logger.error(f"Code replay attack detected: {code[:8]}...")
        # SECURITY: Potential replay attack, alert admin
        return None

    # Mark as used IMMEDIATELY (before token generation)
    # NOTE: this get-then-set is not atomic; the storage layer must
    # serialize it so two concurrent requests cannot both pass the check
    code_data['used'] = True
    code_storage.set(code, code_data)

    # Generate token for the identity and client the code was bound to
    token = generate_token(code_data['me'], code_data['client_id'])
    return {"access_token": token, "token_type": "Bearer", "me": code_data['me']}

Residual Risk: Negligible.

Access Token Security

Properties:

Format: Opaque tokens (v1.0.0), not JWT
Length: 32 bytes (256 bits of entropy)
Generation: secrets.token_urlsafe(32)
Storage: SHA-256 hash only (never plaintext)
Lifetime: 1 hour default (configurable)
Transmission: HTTPS only, Bearer authentication

Threat: Token Theft

Risk: Attacker steals access token from storage or transmission.

Mitigations:

TLS Enforcement: HTTPS only in production
Hashed Storage: Store SHA-256 hash, not plaintext
Short Lifetime: 1-hour expiration (configurable)
Revocation: Admin can revoke tokens (future)
Secure Headers: Set Cache-Control: no-store, Pragma: no-cache

Token Storage:

import hashlib
import secrets
from datetime import datetime, timedelta

TOKEN_LIFETIME = timedelta(hours=1)  # 1 hour default (configurable)

def generate_token(me: str, client_id: str) -> str:
    """
    Generate access token and store hash in database.
    """
    # Generate token (returned to client, never stored)
    token = secrets.token_urlsafe(32)

    # Store only hash (irreversible)
    token_hash = hashlib.sha256(token.encode()).hexdigest()

    issued_at = datetime.utcnow()
    expires_at = issued_at + TOKEN_LIFETIME

    db.execute('''
        INSERT INTO tokens (token_hash, me, client_id, scope, issued_at, expires_at)
        VALUES (?, ?, ?, ?, ?, ?)
    ''', (token_hash, me, client_id, "", issued_at, expires_at))

    return token

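Residual Risk: Low. A stolen database yields only SHA-256 hashes, which cannot be turned back into usable tokens. A token intercepted in transit remains valid until it expires.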

Threat: Timing Attacks on Token Verification

Risk: Attacker uses timing differences to guess valid tokens character-by-character.

Mitigations:

Constant-Time Comparison: Use secrets.compare_digest()
Hash Comparison: Compare hashes, not tokens
Logging Delays: Random delay on failed validation

Implementation:

import hashlib
import secrets
from datetime import datetime
from typing import Optional

def verify_token(provided_token: str) -> Optional[dict]:
    """
    Verify access token by hash lookup plus constant-time comparison.
    """
    # Hash provided token. The lookup is keyed on the hash, so timing
    # differences in the database reveal nothing about the token itself.
    provided_hash = hashlib.sha256(provided_token.encode()).hexdigest()

    # Lookup in database
    token_data = db.query_one('''
        SELECT token_hash, me, client_id, scope, expires_at, revoked
        FROM tokens
        WHERE token_hash = ?
    ''', (provided_hash,))

    if not token_data:
        return None

    # Defense in depth: re-check the stored hash in constant time
    if not secrets.compare_digest(provided_hash, token_data['token_hash']):
        return None

    # Check expiration
    if datetime.utcnow() > token_data['expires_at']:
        return None

    # Check revocation
    if token_data.get('revoked'):
        return None

    return token_data

Residual Risk: Negligible.

Input Validation

URL Validation Security

Critical: Improper URL validation enables phishing and open redirect attacks.

Threat: Open Redirect via redirect_uri

Risk: Attacker tricks user into authorizing malicious redirect_uri, steals authorization code.

Mitigations:

Domain Matching: Require the redirect_uri domain to match the client_id domain
Subdomain Validation: Allow subdomains of client_id domain
Registered URIs: Future feature to pre-register alternate domains
User Warning: Display warning if domains differ
HTTPS Enforcement: Require HTTPS for non-localhost

Validation Logic:

from urllib.parse import urlparse

def validate_redirect_uri(redirect_uri: str, client_id: str, registered_uris: list) -> tuple[bool, str]:
    """
    Validate redirect_uri against client_id.
    Returns (is_valid, warning_message).
    """
    redirect_parsed = urlparse(redirect_uri)
    client_parsed = urlparse(client_id)

    # Reject URLs without a hostname (hostname is None for e.g. "/callback")
    if not redirect_parsed.hostname or not client_parsed.hostname:
        return False, "redirect_uri and client_id must be absolute URLs"

    redirect_domain = redirect_parsed.hostname.lower()
    client_domain = client_parsed.hostname.lower()

    # Must be HTTPS (except localhost)
    if redirect_domain != 'localhost' and redirect_parsed.scheme != 'https':
        return False, "redirect_uri must use HTTPS"

    # Exact match: OK
    if redirect_domain == client_domain:
        return True, ""

    # Subdomain: OK (the leading dot stops evil-example.com matching example.com)
    if redirect_domain.endswith('.' + client_domain):
        return True, ""

    # Registered URI: OK (future)
    if redirect_uri in registered_uris:
        return True, ""

    # Different domain: WARNING
    warning = f"Warning: Redirect to different domain ({redirect_domain})"
    return True, warning  # Allow but warn user

Residual Risk: Low, user must approve redirect with warning.
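The subdomain rule is where naive string matching usually goes wrong, so it is worth isolating. The helper below is a small sketch (the name is_same_or_subdomain is an assumption) that compares parsed hostnames rather than raw URL strings.

```python
from urllib.parse import urlparse


def is_same_or_subdomain(redirect_uri: str, client_id: str) -> bool:
    """True if redirect_uri's host equals client_id's host or is a subdomain of it."""
    redirect_host = (urlparse(redirect_uri).hostname or "").lower()
    client_host = (urlparse(client_id).hostname or "").lower()
    if not redirect_host or not client_host:
        return False  # relative or malformed URL: fail securely
    # The leading dot matters: without it, "evil-example.com" would match
    return redirect_host == client_host or redirect_host.endswith("." + client_host)
```

Lookalike hosts such as `evil-example.com`, `example.com.evil.net`, and the userinfo trick `https://example.com@evil.net/` all fail this check when the client_id is `https://example.com`, because only the parsed hostname is compared.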

Threat: Phishing via Malicious client_id

Risk: Attacker uses client_id of legitimate-looking domain (typosquatting).

Mitigations:

Display Full URL: Show complete client_id to user, not just app name
Fetch Verification: Verify client_id is fetchable (real domain)
Subdomain Check: Warn if client_id is subdomain of well-known domain
Certificate Validation: Verify SSL certificate validity
User Education: Inform users to verify client_id carefully

UI Display:

Sign in to:
  Application Name (if available)
  https://client.example.com  ← Full URL always displayed

Redirect to:
  https://client.example.com/callback

Residual Risk: Moderate, requires user vigilance.

Threat: URL Parameter Injection

Risk: Attacker injects malicious parameters via crafted URLs.

Mitigations:

Pydantic Validation: Use Pydantic models for all parameters
Type Enforcement: Strict type checking (str, not any)
Allowlist Validation: Only accept expected parameters
SQL Parameterization: Use parameterized queries (prevent SQL injection)
HTML Encoding: Encode all user input in HTML responses

Pydantic Models:

from typing import Literal

from pydantic import BaseModel, HttpUrl, Field

class AuthorizeRequest(BaseModel):
    me: HttpUrl
    client_id: HttpUrl
    redirect_uri: HttpUrl
    state: str = Field(min_length=1, max_length=512)
    response_type: Literal["code"]
    scope: str = ""  # Optional, ignored in v1.0.0

    class Config:
        extra = "forbid"  # Reject unknown parameters

Residual Risk: Minimal, Pydantic provides strong validation.

HTML Parsing Security (rel="me" Discovery)

Threat: Malicious HTML Injection

Risk: Attacker's site contains malicious HTML to exploit parser.

Mitigations:

Robust Parser: Use BeautifulSoup (handles malformed HTML safely)
Link Extraction Only: Only extract href attributes, no script execution
Timeout: 10-second timeout for HTTP requests
Size Limit: Limit response size (prevent memory exhaustion)
HTTPS Required: Fetch over TLS only
Certificate Validation: Verify SSL certificates

Implementation:

from typing import Optional

from bs4 import BeautifulSoup
import requests

MAX_SIZE = 5 * 1024 * 1024  # 5MB

def discover_email_from_site(domain: str) -> Optional[str]:
    """
    Safely discover email from rel="me" link.
    """
    try:
        # Fetch with safety limits. requests.get() has no max_redirects
        # argument; the redirect cap is set on the Session instead.
        with requests.Session() as session:
            session.max_redirects = 5
            response = session.get(
                f"https://{domain}",
                timeout=10,
                allow_redirects=True,
                stream=True  # Don't load entire response into memory
            )
            response.raise_for_status()

            # Limit response size (prevent memory exhaustion);
            # decode_content=True undoes gzip/deflate content encoding
            content = response.raw.read(MAX_SIZE, decode_content=True)

        # Parse HTML (BeautifulSoup handles malformed HTML safely)
        soup = BeautifulSoup(content, 'html.parser')

        # Find rel="me" links (no script execution)
        me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')

        # Extract mailto: links only
        for link in me_links:
            href = link.get('href', '').strip()
            if href.lower().startswith('mailto:'):
                # Strip only the leading scheme and drop any ?subject=... part
                email = href[len('mailto:'):].split('?', 1)[0].strip()
                # Validate email format before returning
                if validate_email_format(email):
                    return email

        return None

    except requests.exceptions.SSLError as e:
        logger.error(f"SSL certificate validation failed for {domain}: {e}")
        return None
    except Exception as e:
        logger.error(f"Failed to discover email for {domain}: {e}")
        return None

Residual Risk: Very low. BeautifulSoup is designed for untrusted HTML.

Email Validation

Threat: Email Injection Attacks

Risk: Attacker crafts malicious email address in rel="me" link.

Mitigations:

Format Validation: Strict email regex (RFC 5322)
No User Input: Email discovered from site (not user-provided)
SMTP Library: Use well-tested library (smtplib)
Content Encoding: Encode email content properly
Rate Limiting: Prevent abuse

Validation:

import re

def validate_email_format(email: str) -> bool:
    """
    Validate email address format.
    """
    # Basic format check (RFC 5322 simplified).
    # fullmatch() is used instead of match() with '$', because '$' also
    # matches just before a trailing newline and would accept "a@b.co\n".
    email_regex = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
    if not re.fullmatch(email_regex, email):
        return False

    # Sanity checks
    if len(email) > 254:  # RFC 5321 maximum
        return False
    if email.count('@') != 1:
        return False

    return True

Note: Domain matching is NOT enforced in v1.0.0. User may have email at different domain than their identity site (e.g., phil@gmail.com for phil.example.com). This is acceptable as user explicitly publishes the email on their site.

Residual Risk: Low, standard validation patterns.
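On the sending side, header injection is the main concern: a recipient value containing CR or LF could smuggle extra headers into the message. The sketch below builds the verification message with the standard library's email.message.EmailMessage and rejects line breaks explicitly. The function name, subject line, and body wording are assumptions for illustration.

```python
from email.message import EmailMessage


def build_verification_email(sender: str, recipient: str, code: str) -> EmailMessage:
    """Build the verification message, refusing header values with line breaks."""
    for value in (sender, recipient):
        if "\r" in value or "\n" in value:
            # A line break in a header value is the classic injection vector
            raise ValueError("header value contains CR or LF")

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Your verification code"
    msg.set_content(
        f"Your verification code is {code}. It expires in 15 minutes.\n"
        "Only use this code if you initiated this login.\n"
    )
    return msg
```

The resulting object can be handed to `smtplib.SMTP.send_message()`, which serializes the headers and body itself and avoids hand-built message strings.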

Network Security

TLS/HTTPS Enforcement

Production Requirements:

All endpoints MUST use HTTPS
Minimum TLS 1.2 (prefer TLS 1.3)
Strong cipher suites only
Valid SSL certificate (not self-signed)

Configuration:

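# In production configuration
from starlette.middleware.httpsredirect import HTTPSRedirectMiddleware

if not DEBUG:
    # Enforce HTTPS
    app.add_middleware(HTTPSRedirectMiddleware)

    # Add security headers
    # (SecureHeadersMiddleware is not a Starlette built-in; it is assumed
    # here to be a project-defined middleware)
    app.add_middleware(
        SecureHeadersMiddleware,
        hsts="max-age=31536000; includeSubDomains",
        content_security_policy="default-src 'self'",
        x_frame_options="DENY",
        x_content_type_options="nosniff"
    )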

Development Exception:

HTTP allowed for localhost only
Never in production

Residual Risk: Negligible if properly configured.

Security Headers

Required Headers:

# Prevent clickjacking
X-Frame-Options: DENY

# Prevent MIME sniffing
X-Content-Type-Options: nosniff

# XSS protection (legacy browsers)
X-XSS-Protection: 1; mode=block

# HSTS (HTTPS enforcement)
Strict-Transport-Security: max-age=31536000; includeSubDomains

# CSP (limit resource loading)
Content-Security-Policy: default-src 'self'; style-src 'self' 'unsafe-inline'

# Referrer policy (privacy)
Referrer-Policy: strict-origin-when-cross-origin

Implementation:

@app.middleware("http")
async def add_security_headers(request: Request, call_next):
    response = await call_next(request)
    response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-Content-Type-Options"] = "nosniff"
    response.headers["X-XSS-Protection"] = "1; mode=block"
    if not DEBUG:
        response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    response.headers["Content-Security-Policy"] = "default-src 'self'; style-src 'self' 'unsafe-inline'"
    response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
    return response
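A header check like the following could back the "security headers presence" test listed under Security Testing. It is a framework-independent sketch: the function name and the choice to compare three headers by exact value, and HSTS by presence only, are assumptions.

```python
from typing import List, Mapping

# Headers whose exact value is fixed by the list above
EXPECTED_HEADERS = {
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


def missing_security_headers(headers: Mapping[str, str],
                             production: bool = True) -> List[str]:
    """Return names of required headers that are absent or have the wrong value."""
    # HTTP header names are case-insensitive
    present = {name.lower(): value for name, value in headers.items()}
    problems = [name for name, expected in EXPECTED_HEADERS.items()
                if present.get(name.lower()) != expected]
    # HSTS is only sent in production, so only require it there
    if production and "strict-transport-security" not in present:
        problems.append("Strict-Transport-Security")
    return problems
```

In a test suite, the mapping would come from a test client's response headers, and the assertion would be that the returned list is empty.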

Data Security

Data Minimization (Privacy)

Principle: Collect and store ONLY essential data.

Stored Data:

✅ Domain name (user identity, required)
✅ Token hashes (security, required)
✅ Client IDs (protocol, required)
✅ Timestamps (auditing, required)

Never Stored:

❌ Email addresses (after verification)
❌ Plaintext tokens
❌ User-Agent strings
❌ IP addresses (except rate limiting, temporary)
❌ Browsing history
❌ Personal information

Email Handling:

# Email discovered from rel="me" link (not user-provided)
# Stored ONLY during verification (in-memory, 15-min TTL)
verification_codes[code_id] = {
    "email": email,  # ← Discovered from site, exists ONLY here, NEVER in database
    "code": code,
    "domain": domain,
    "expires_at": datetime.utcnow() + timedelta(minutes=15)
}

# After verification: email is deleted, only domain + timestamp stored
db.execute('''
    INSERT INTO domains (domain, verification_method, verified_at, last_email_check)
    VALUES (?, 'two_factor', ?, ?)
''', (domain, datetime.utcnow(), datetime.utcnow()))
# Note: NO email address in database, only verification timestamp

rel="me" Discovery:

Email addresses are public (user publishes on their site)
Server fetches email from user's site (not user input)
Reduces social engineering risk (can't claim arbitrary email)
Follows IndieWeb standards for identity

Database Security

SQLite Security:

File Permissions: 600 (owner read/write only)
Encryption at Rest: Use encrypted filesystem (LUKS, dm-crypt)
Backup Encryption: Encrypt backup files (GPG)
SQL Injection Prevention: Parameterized queries only

Parameterized Queries:

# GOOD: Parameterized (safe)
db.execute(
    "SELECT * FROM tokens WHERE token_hash = ?",
    (token_hash,)
)

# BAD: String interpolation (vulnerable)
db.execute(
    f"SELECT * FROM tokens WHERE token_hash = '{token_hash}'"
)  # ← NEVER DO THIS

File Permissions:

# Set restrictive permissions
chmod 600 /data/gondulf.db
chown gondulf:gondulf /data/gondulf.db
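The same restriction can be applied and verified from application code at startup, so a misconfigured deployment is caught early. This is a POSIX-only sketch; the function names are hypothetical.

```python
import os
import stat


def secure_db_file(path: str) -> None:
    """Restrict the database file to owner read/write (mode 600)."""
    os.chmod(path, 0o600)


def is_owner_only(path: str) -> bool:
    """True if neither group nor others have any permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A startup check could refuse to run, or log a warning, when `is_owner_only()` returns False for the configured database path.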

Logging Security

Principle: Log security events, NEVER log sensitive data.

Log Security Events:

✅ Failed authentication attempts
✅ Authorization grants (domain + client_id)
✅ Token generation (hash prefix only)
✅ Email verification attempts
✅ DNS verification results
✅ Error conditions

Never Log:

❌ Email addresses (PII)
❌ Full access tokens
❌ Verification codes
❌ Authorization codes
❌ IP addresses (production)

Safe Logging Examples:

# GOOD: Domain only (public information)
logger.info(f"Authorization granted for {domain} to {client_id}")

# GOOD: Token hash prefix for correlation (never part of the token itself)
logger.debug(f"Token generated: hash {token_hash[:8]}...")

# GOOD: Error without sensitive data
logger.error(f"Email send failed for domain {domain}")

# BAD: Email address (PII)
logger.info(f"Verification sent to {email}")  # ← NEVER

# BAD: Full token (security)
logger.debug(f"Token: {token}")  # ← NEVER
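As a safety net for the "never log email addresses" rule, a logging filter can redact anything that looks like an address before a record is emitted. This is a sketch only: the class name, the placeholder text, and the deliberately loose regex are assumptions, and it supplements careful log statements rather than replacing them.

```python
import logging
import re

# Loose pattern: over-redacting is preferable to leaking an address
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email-like substrings in log messages with a placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # apply %-style args first
        record.msg = EMAIL_PATTERN.sub("[redacted-email]", message)
        record.args = None             # message is already fully formatted
        return True                    # never drop the record, only rewrite it
```

It would be attached with `handler.addFilter(RedactEmailFilter())` on each handler, so records from every logger that reaches the handler are covered.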

Dependency Security

Dependency Management

Principles:

Minimal Dependencies: Prefer standard library
Vetted Libraries: Only well-maintained, popular libraries
Version Pinning: Pin exact versions in requirements.txt
Security Scanning: Regular vulnerability scanning
Update Strategy: Security patches applied promptly

Security Scanning:

# Scan for known vulnerabilities
uv run pip-audit

# Alternative: safety check
uv run safety check

Update Policy:

Security patches: Apply within 24 hours (critical), 7 days (high)
Minor versions: Review and test before updating
Major versions: Evaluate breaking changes, test thoroughly

Secrets Management

Environment Variables (v1.0.0):

# Required secrets
GONDULF_SECRET_KEY=<256-bit random value>
GONDULF_SMTP_PASSWORD=<SMTP password>

# Optional secrets
GONDULF_DATABASE_ENCRYPTION_KEY=<for encrypted backups>

Secret Generation:

# Generate SECRET_KEY (256 bits)
python -c "import secrets; print(secrets.token_urlsafe(32))"

Storage:

Development: .env file (not committed)
Production: Docker secrets or environment variables
Never hardcode secrets in code

Future: Integrate with HashiCorp Vault or AWS Secrets Manager.

Rate Limiting (Future)

v1.0.0: Not implemented (acceptable for small deployments).

Future Implementation:

Endpoint	Limit	Window	Key
/authorize	10 requests	1 minute	IP
/token	30 requests	1 minute	client_id
Email verification	3 codes	1 hour	email
Code submission	3 attempts	15 minutes	session

Implementation Strategy:

Use Redis for distributed rate limiting
Token bucket algorithm
Exponential backoff on failures
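The token bucket idea can be illustrated with a single-process, in-memory version. The planned implementation uses Redis for distributed state, so this sketch only shows the algorithm. The class name and the injectable clock are assumptions made so the behaviour can be tested deterministically.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """In-memory token bucket: `capacity` requests per `refill_seconds` per key."""

    def __init__(self, capacity: int, refill_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.rate = capacity / refill_seconds  # tokens regained per second
        self.clock = clock
        # key -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str) -> bool:
        """Consume one token for `key` if available; return False when limited."""
        now = self.clock()
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill in proportion to elapsed time, capped at capacity
        tokens = min(float(self.capacity), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[key] = (tokens - 1.0, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

For the /authorize row in the table above, this would be `TokenBucketLimiter(capacity=10, refill_seconds=60)` keyed by client IP.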

Security Testing

Required Security Tests

Input Validation:
- Malformed URLs (me, client_id, redirect_uri)
- SQL injection attempts
- XSS attempts
- Email injection
Authentication:
- Expired code rejection
- Used code rejection
- Invalid code rejection
- Brute force resistance
Authorization:
- State parameter validation
- Redirect URI validation
- Open redirect prevention
Token Security:
- Timing attack resistance
- Token theft scenarios
- Expiration enforcement
TLS/HTTPS:
- HTTP rejection in production
- Security headers presence
- Certificate validation

Security Scanning Tools

Required Tools:

bandit: Python security linter
pip-audit: Dependency vulnerability scanner
pytest: Security-focused test cases

CI/CD Integration:

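# GitHub Actions example (job fragment; assumes uv is installed on the runner)
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Bandit
        run: uv run bandit -r src/gondulf

      - name: Scan Dependencies
        run: uv run pip-audit

      - name: Run Security Tests
        run: uv run pytest tests/security/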

Incident Response

Security Event Monitoring

Monitor For:

Multiple failed authentication attempts
Authorization code reuse attempts
Invalid token presentation
Unusual DNS verification failures
Email send failures (potential abuse)

Alerting (future):

Admin email on critical events
Webhook integration (Slack, Discord)
Metrics dashboard (Grafana)

Breach Response Plan (Future)

If Access Tokens Compromised:

Revoke all active tokens
Force re-authentication
Notify affected users (via domain)
Rotate SECRET_KEY
Audit logs for suspicious activity

If Database Compromised:

Assess data exposure (only hashes + domains)
Rotate all tokens
Review access logs
Notify users if domains exposed

Compliance Considerations

GDPR Compliance

Personal Data Stored:

Domain names (considered PII in some jurisdictions)
Timestamps (associated with domains)

GDPR Rights:

Right to Access: Admin can query database
Right to Erasure: Admin can delete domain records
Right to Portability: Data export feature (future)

Privacy Policy (required):

Document what data is collected (domains, timestamps)
Document how data is used (authentication)
Document retention policy (indefinite unless deleted)
Provide contact for data requests

Security Disclosure

Security Policy (future):

Responsible disclosure process
Security contact (security@domain)
GPG key for encrypted reports
Acknowledgments for researchers

Security Roadmap

v1.0.0 (MVP)

✅ Two-factor domain verification (DNS TXT + Email via rel="me")
✅ rel="me" email discovery (IndieWeb standard)
✅ HTML parsing security (BeautifulSoup)
✅ TLS/HTTPS enforcement
✅ Secure token generation (opaque, hashed)
✅ URL validation (open redirect prevention)
✅ Input validation (Pydantic)
✅ Security headers
✅ Minimal data collection (no email storage)

v1.1.0

PKCE support (code challenge/verifier)
Rate limiting (Redis-based)
Token revocation endpoint
Enhanced logging

v1.2.0

WebAuthn support (passwordless)
Hardware security key support
Admin dashboard (audit logs)
Security metrics

v2.0.0

Multi-factor authentication
Federated identity providers
Advanced threat detection
SOC 2 compliance preparation

References

OWASP Top 10: https://owasp.org/www-project-top-ten/
OAuth 2.0 Security Best Practices: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics
NIST Cybersecurity Framework: https://www.nist.gov/cyberframework
CWE Top 25: https://cwe.mitre.org/top25/
