Gondulf/docs/architecture/indieauth-protocol.md
Phil Skentelbery 6f06aebf40 docs: add Phase 2 domain verification design and clarifications
Add comprehensive Phase 2 documentation:
- Complete design document for two-factor domain verification
- Implementation guide with code examples
- ADR for implementation decisions (ADR-0004)
- ADR for rel="me" email discovery (ADR-008)
- Phase 1 impact assessment
- All 23 clarification questions answered
- Updated architecture docs (indieauth-protocol, security)
- Updated ADR-005 with rel="me" approach
- Updated backlog with technical debt items

Design ready for Phase 2 implementation.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 13:05:09 -07:00


IndieAuth Protocol Implementation

Specification Compliance

This document describes Gondulf's implementation of the W3C IndieAuth specification.

Primary Reference: https://www.w3.org/TR/indieauth/
Reference Implementation: https://github.com/aaronpk/indielogin.com

Compliance Target: Any compliant IndieAuth client MUST be able to authenticate successfully against Gondulf.

Protocol Overview

IndieAuth is built on OAuth 2.0, extending it to enable decentralized authentication where users are identified by URLs (typically their own domain) rather than accounts on centralized services.

Core Principle

Users prove ownership of a domain, and that domain becomes their identity. No usernames, no passwords stored by the server.

IndieAuth vs OAuth 2.0

Similarities:

  • Authorization code flow
  • Token endpoint for code exchange
  • State parameter for CSRF protection
  • Redirect-based flow

Differences:

  • User identity is a URL (me parameter), not an opaque user ID
  • No client secrets (all clients are "public clients")
  • Client IDs are URLs that must be fetchable
  • Domain ownership verification instead of password authentication

v1.0.0 Scope

Gondulf v1.0.0 implements authentication only (not authorization):

  • Users can prove they own a domain
  • Tokens are issued but carry no permissions (scope)
  • Client applications can verify user identity
  • NO resource server capabilities
  • NO scope-based authorization

Future versions will add:

  • Authorization with scopes
  • Token refresh
  • Token revocation
  • Resource server capabilities

Endpoints

Discovery Endpoint (Optional)

URL: /.well-known/oauth-authorization-server

Purpose: Advertise server capabilities and endpoints per RFC 8414.

Response (JSON):

{
  "issuer": "https://auth.example.com",
  "authorization_endpoint": "https://auth.example.com/authorize",
  "token_endpoint": "https://auth.example.com/token",
  "response_types_supported": ["code"],
  "grant_types_supported": ["authorization_code"],
  "token_endpoint_auth_methods_supported": ["none"]
}

Implementation Notes:

  • Optional for v1.0.0 but recommended
  • FastAPI endpoint: GET /.well-known/oauth-authorization-server
  • Static response (no database access)
  • Cache-Control: public, max-age=86400

Authorization Endpoint

URL: /authorize
Method: GET
Purpose: Initiate authentication flow

Required Parameters

| Parameter | Description | Validation |
| --- | --- | --- |
| me | User's domain/URL | Must be valid URL, no fragments/credentials/ports |
| client_id | Client application URL | Must be valid URL, must be fetchable |
| redirect_uri | Where to send user after auth | Must be valid URL, must match client_id domain OR be registered |
| state | CSRF protection token | Required, opaque string, returned unchanged |
| response_type | Must be code | Exactly code for auth code flow |

Optional Parameters (v1.0.0)

| Parameter | Description | v1.0.0 Behavior |
| --- | --- | --- |
| scope | Requested permissions | Ignored (authentication only) |
| code_challenge | PKCE challenge | NOT supported in v1.0.0 |
| code_challenge_method | PKCE method | NOT supported in v1.0.0 |

PKCE Decision: Deferred to post-v1.0.0 to maintain MVP simplicity. See ADR-003.

Request Validation Sequence

  1. Validate response_type

    • MUST be exactly code
    • Error: unsupported_response_type
  2. Validate me parameter

    • Must be a valid URL
    • Must NOT contain fragment (#)
    • Must NOT contain credentials (user:pass@)
    • Must NOT contain port (except :443 for HTTPS)
    • Must NOT be an IP address
    • Normalize: lowercase domain, remove trailing slash
    • Error: invalid_request with description
  3. Validate client_id

    • Must be a valid URL
    • Must contain a domain component (not localhost in production)
    • Fetch client_id URL to retrieve app info (see Client Validation)
    • Error: invalid_client with description
  4. Validate redirect_uri

    • Must be a valid URL
    • Must use HTTPS in production (HTTP allowed for localhost)
    • If domain differs from client_id domain:
      • Must match client_id subdomain pattern, OR
      • Must be registered in client metadata (future), OR
      • Display warning to user
    • Error: invalid_request with description
  5. Validate state

    • Must be present
    • Must be non-empty string
    • Store with the authorization request so it can be returned to the client unchanged (the server does not validate it)
    • Error: invalid_request with description
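
The ordering of these checks can be illustrated with a minimal sketch. It only covers presence checks and the error code each step maps to; the function name `first_authorize_error` is hypothetical, and the full URL validation of me, client_id and redirect_uri is deliberately left out.

```python
from typing import Mapping, Optional


def first_authorize_error(params: Mapping[str, str]) -> Optional[tuple[str, str]]:
    """Return (error_code, description) for the first failed check, or None."""
    # Step 1: response_type must be exactly "code".
    if params.get("response_type") != "code":
        return "unsupported_response_type", "response_type must be 'code'"

    # Steps 2-4: me, client_id and redirect_uri must at least be present.
    # Real URL validation would happen here; this sketch skips it.
    for name, error in (
        ("me", "invalid_request"),
        ("client_id", "invalid_client"),
        ("redirect_uri", "invalid_request"),
    ):
        if not params.get(name):
            return error, f"{name} is required"

    # Step 5: state must be a non-empty string.
    if not params.get("state"):
        return "invalid_request", "state is required"

    return None
```

Returning the first failure keeps the behavior deterministic when several parameters are wrong at once.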

Client Validation

When client_id is provided, fetch the URL to retrieve application information:

HTTP Request:

GET https://client.example.com/
Accept: text/html

Extract Application Info:

  • Look for h-app microformat in HTML
  • Extract: app name, icon, URL
  • Extract registered redirect URIs from <link rel="redirect_uri"> tags
  • Cache result for 24 hours

Fallback:

  • If no h-app found, use domain name as app name
  • If no icon, use generic icon
  • If no redirect URIs registered, rely on domain matching

Security:

  • Follow redirects (max 5)
  • Timeout after 5 seconds
  • Validate SSL certificates
  • Reject non-200 responses
  • Log client_id fetch failures
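
As a rough illustration of the extraction step, the sketch below collects registered redirect URIs from an already-fetched client page using only the standard library. It assumes the HTML has been downloaded elsewhere (with the redirect, timeout and TLS rules above), does not parse h-app, and uses hypothetical names (`ClientInfoParser`, `extract_client_info`).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ClientInfoParser(HTMLParser):
    """Collect <link rel="redirect_uri"> hrefs from a client_id page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.redirect_uris: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").split()
        href = attr.get("href")
        if "redirect_uri" in rels and href:
            # Resolve relative hrefs against the client_id URL.
            self.redirect_uris.append(urljoin(self.base_url, href))


def extract_client_info(client_id: str, html: str) -> dict:
    parser = ClientInfoParser(client_id)
    parser.feed(html)
    return {
        # Fallback described above: use the domain name as the app name.
        "name": urlparse(client_id).hostname,
        "redirect_uris": parser.redirect_uris,
    }
```

A production version would also read the h-app name and icon before falling back to the domain name.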

Authentication Flow (v1.0.0: Two-Factor Domain Verification)

  1. DNS TXT Record Verification (Required)

    • Check if me domain has TXT record: _gondulf.{domain} = verified
    • Query multiple DNS resolvers (Google 8.8.8.8, Cloudflare 1.1.1.1)
    • Require consensus from at least 2 resolvers
    • If not found: Display error with instructions to add TXT record
    • If found: Proceed to email discovery
    • Proves: User controls DNS for the domain
  2. Email Discovery via rel="me" (Required)

    • Fetch user's domain homepage (e.g., https://example.com)
    • Parse HTML for <link rel="me" href="mailto:user@example.com"> or <a rel="me" href="mailto:user@example.com">
    • Extract email address from first matching mailto: link
    • If not found: Display error with instructions to add rel="me" link
    • If found: Proceed to email verification
    • Proves: User has published email relationship on their site
    • Reference: https://indieweb.org/rel-me
  3. Email Verification Code (Required)

    • Generate 6-digit verification code (cryptographically random)
    • Store code in memory with 15-minute TTL
    • Send code to discovered email address via SMTP
    • Display code entry form showing discovered email (partially masked)
    • User enters 6-digit code
    • Validate code matches and hasn't expired (max 3 attempts)
    • Proves: User controls the email account
    • Mark domain as verified (store in database)
  4. User Consent

    • Display authorization prompt:
      • "Sign in to [App Name] as [me]"
      • Show client_id full URL
      • Show redirect_uri if different domain
      • Show scope (future)
    • User approves or denies
  5. Authorization Code Generation

    • Generate cryptographically secure code (32 bytes, base64url)
    • Store code in memory with 10-minute TTL
    • Store associated data:
      • me (user's domain)
      • client_id
      • redirect_uri
      • state
      • Timestamp
    • Code is single-use only
  6. Redirect to Client

    HTTP/1.1 302 Found
    Location: {redirect_uri}?code={code}&state={state}
    

Security Model: Two-factor verification requires BOTH DNS control AND email control. An attacker would need to compromise both to authenticate fraudulently.
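
The email discovery step (step 2) could look roughly like the following sketch. It assumes the homepage HTML is already fetched, takes the first rel="me" element whose href is a mailto: link, and shows one possible way to mask the address for display. The names `discover_email` and `mask_email`, and the masking format, are illustrative assumptions.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import unquote


class RelMeEmailParser(HTMLParser):
    """Find the first <a> or <link> with rel="me" and a mailto: href."""

    def __init__(self) -> None:
        super().__init__()
        self.email: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.email is not None or tag not in ("a", "link"):
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href") or ""
        if "me" in rels and href.lower().startswith("mailto:"):
            # Drop the scheme and any ?subject=... query part.
            address = unquote(href[len("mailto:"):].split("?", 1)[0])
            if "@" in address:
                self.email = address


def discover_email(html: str) -> Optional[str]:
    parser = RelMeEmailParser()
    parser.feed(html)
    return parser.email


def mask_email(email: str) -> str:
    """Keep the first character of the local part, hide the rest."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"
```

Non-mailto rel="me" links (for example, links to social profiles) are skipped rather than treated as errors.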

Error Responses

Return error via redirect when possible:

HTTP/1.1 302 Found
Location: {redirect_uri}?error={error}&error_description={description}&state={state}

Error Codes (OAuth 2.0 standard):

  • invalid_request - Malformed request
  • unauthorized_client - Client not authorized
  • access_denied - User denied authorization
  • unsupported_response_type - response_type not code
  • invalid_scope - Invalid scope requested (future)
  • server_error - Internal server error
  • temporarily_unavailable - Server temporarily unavailable

When redirect not possible (invalid redirect_uri), display error page.
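
A minimal sketch of building the error redirect, assuming redirect_uri has already been validated. It preserves any query string the client put on its redirect_uri and URL-encodes the added parameters; the function name `build_error_redirect` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def build_error_redirect(redirect_uri: str, error: str,
                         description: str, state: str) -> str:
    parts = urlparse(redirect_uri)
    # Keep whatever query parameters the client already had.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("error", error),
        ("error_description", description),
        ("state", state),
    ]
    return urlunparse(parts._replace(query=urlencode(query)))
```

For example, `build_error_redirect("https://client.example.com/callback", "access_denied", "User denied", "abc")` yields a URL ending in `?error=access_denied&error_description=User+denied&state=abc`.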

Token Endpoint

URL: /token
Method: POST
Content-Type: application/x-www-form-urlencoded
Purpose: Exchange authorization code for access token

Required Parameters

| Parameter | Description | Validation |
| --- | --- | --- |
| grant_type | Must be authorization_code | Exactly authorization_code |
| code | Authorization code from /authorize | Must be valid, unexpired, unused |
| client_id | Client application URL | Must match code's client_id |
| redirect_uri | Original redirect URI | Must match code's redirect_uri |
| me | User's domain | Must match code's me |

Request Validation Sequence

  1. Validate grant_type

    • MUST be authorization_code
    • Error: unsupported_grant_type
  2. Validate code

    • Must exist in storage
    • Must not have expired (10-minute TTL)
    • Must not have been used already
    • Mark as used immediately
    • Error: invalid_grant
  3. Validate client_id

    • Must match the client_id associated with code
    • Error: invalid_client
  4. Validate redirect_uri

    • Must exactly match the redirect_uri from authorization request
    • Error: invalid_grant
  5. Validate me

    • Must exactly match the me from authorization request
    • Error: invalid_request
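
The sequence above can be sketched as a single function over the in-memory code store. This is an illustration under assumptions: the store is a plain dict keyed by code, each record carries `used` and `expires_at` fields, and the name `redeem_code` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# TTL stated in the text; records are assumed to carry
# expires_at = created_at + CODE_TTL.
CODE_TTL = timedelta(minutes=10)


def redeem_code(codes: dict, grant_type: str, code: str, client_id: str,
                redirect_uri: str, me: str,
                now: Optional[datetime] = None) -> tuple[Optional[dict], Optional[str]]:
    """Return (record, None) on success or (None, error_code) on failure."""
    now = now or datetime.now(timezone.utc)

    # 1. grant_type
    if grant_type != "authorization_code":
        return None, "unsupported_grant_type"

    # 2. code must exist, be unexpired and unused
    record = codes.get(code)
    if record is None or record["used"] or now >= record["expires_at"]:
        return None, "invalid_grant"
    record["used"] = True  # mark as used immediately, even if later checks fail

    # 3-5. values must match those from the authorization request
    if client_id != record["client_id"]:
        return None, "invalid_client"
    if redirect_uri != record["redirect_uri"]:
        return None, "invalid_grant"
    if me != record["me"]:
        return None, "invalid_request"

    return record, None
```

Marking the code as used before the remaining checks means a failed exchange also burns the code, which matches the "mark as used immediately" rule.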

Token Generation

v1.0.0 Implementation: Opaque Tokens

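import hashlib
import secrets
from datetime import datetime, timedelta, timezone

def create_token(me: str, client_id: str) -> tuple[str, dict]:
    """Generate an opaque token and the record to store for it."""
    # Generate token
    token = secrets.token_urlsafe(32)  # 256 bits

    # Store only the hash in the database; the raw token goes to the client
    now = datetime.now(timezone.utc)  # datetime.utcnow() is deprecated since Python 3.12
    token_record = {
        "token_hash": hashlib.sha256(token.encode()).hexdigest(),
        "me": me,
        "client_id": client_id,
        "scope": "",  # Empty for authentication-only
        "issued_at": now,
        "expires_at": now + timedelta(hours=1),
    }
    return token, token_record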

Why Opaque Tokens in v1.0.0:

  • Simpler than JWT (no signing, no key rotation)
  • Easier to revoke (database lookup)
  • Sufficient for authentication-only use case
  • Can migrate to JWT in future versions

Token Properties:

  • Length: 43 characters (base64url of 32 bytes)
  • Entropy: 256 bits (cryptographically secure)
  • Storage: SHA-256 hash in database
  • Expiration: 1 hour (configurable)
  • Revocable: Delete from database

Success Response

HTTP 200 OK:

{
  "access_token": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "token_type": "Bearer",
  "me": "https://example.com",
  "scope": ""
}

Response Fields:

  • access_token: The opaque token (43 characters)
  • token_type: Always Bearer
  • me: User's canonical domain URL (normalized)
  • scope: Empty string for authentication-only (future: space-separated scopes)

Headers:

Content-Type: application/json
Cache-Control: no-store
Pragma: no-cache

Error Responses

HTTP 400 Bad Request:

{
  "error": "invalid_grant",
  "error_description": "Authorization code has expired"
}

Error Codes (OAuth 2.0 standard):

  • invalid_request - Missing or invalid parameters
  • invalid_client - Client authentication failed
  • invalid_grant - Invalid or expired authorization code
  • unauthorized_client - Client not authorized for grant type
  • unsupported_grant_type - Grant type not authorization_code

Token Verification Endpoint (Future)

URL: /token/verify
Method: GET
Purpose: Verify token validity (for resource servers)

NOT implemented in v1.0.0 (authentication only, no resource servers).

Future implementation:

GET /token/verify
Authorization: Bearer {token}

Response 200 OK:
{
  "me": "https://example.com",
  "client_id": "https://client.example.com",
  "scope": ""
}

Token Revocation Endpoint (Future)

URL: /token/revoke
Method: POST
Purpose: Revoke access token

NOT implemented in v1.0.0.

Future implementation per RFC 7009.

Data Models

Authorization Code (In-Memory)

{
    "code": "abc123...",  # 43-char base64url
    "me": "https://example.com",
    "client_id": "https://client.example.com",
    "redirect_uri": "https://client.example.com/callback",
    "state": "client-provided-state",
    "created_at": datetime,
    "expires_at": datetime,  # created_at + 10 minutes
    "used": False
}

Storage: Python dict with TTL management
Expiration: 10 minutes (per spec: "shortly after")
Single-use: Marked as used after redemption
Cleanup: Automatic expiration via TTL

Email Verification Code (In-Memory)

{
    "email": "admin@example.com",  # Discovered from rel="me", not user-provided
    "code": "123456",  # 6-digit string
    "domain": "example.com",
    "created_at": datetime,
    "expires_at": datetime,  # created_at + 15 minutes
    "attempts": 0  # Rate limiting (max 3 attempts)
}

Storage: Python dict with TTL management
Email Source: Discovered from site's rel="me" link (not user input)
Expiration: 15 minutes
Rate Limiting: Max 3 attempts per email, max 3 codes per domain per hour
Cleanup: Automatic expiration via TTL
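
A small sketch of how such a record might be created and checked, assuming the record shape shown above. The helper names (`new_verification`, `check_code`) are hypothetical, and the per-domain hourly limit is not covered.

```python
import secrets
from datetime import datetime, timedelta, timezone

CODE_TTL = timedelta(minutes=15)
MAX_ATTEMPTS = 3


def new_verification(email: str, domain: str, now: datetime) -> dict:
    return {
        "email": email,
        # secrets.randbelow is cryptographically secure; zero-pad to 6 digits.
        "code": f"{secrets.randbelow(10**6):06d}",
        "domain": domain,
        "created_at": now,
        "expires_at": now + CODE_TTL,
        "attempts": 0,
    }


def check_code(record: dict, submitted: str, now: datetime) -> bool:
    """Count the attempt and compare in constant time."""
    if now >= record["expires_at"] or record["attempts"] >= MAX_ATTEMPTS:
        return False
    record["attempts"] += 1
    # Compare as bytes so non-ASCII input cannot raise TypeError.
    return secrets.compare_digest(record["code"].encode(), submitted.encode())
```

Once three attempts have been used, even the correct code is rejected and a new code has to be requested.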

Access Token (SQLite)

CREATE TABLE tokens (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    token_hash TEXT NOT NULL UNIQUE,  -- SHA-256 hash
    me TEXT NOT NULL,
    client_id TEXT NOT NULL,
    scope TEXT NOT NULL,  -- Empty string for v1.0.0
    issued_at TIMESTAMP NOT NULL,
    expires_at TIMESTAMP NOT NULL,
    revoked BOOLEAN DEFAULT 0
);

-- SQLite does not accept inline INDEX clauses; create indexes separately.
-- token_hash is already indexed through its UNIQUE constraint.
CREATE INDEX idx_me ON tokens (me);
CREATE INDEX idx_expires_at ON tokens (expires_at);

Lookup: By token_hash (constant-time comparison)
Expiration: 1 hour default (configurable)
Revocation: Set revoked = 1 (future feature)
Cleanup: Periodic deletion of expired tokens
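
To make the storage and lookup path concrete, here is a sketch against an in-memory SQLite database. It assumes timestamps are stored as ISO 8601 UTC strings with second precision (so string comparison orders them correctly) and uses hypothetical helper names; the schema in the sketch is a trimmed copy of the table above.

```python
import hashlib
import secrets
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def create_schema(db: sqlite3.Connection) -> None:
    db.execute(
        "CREATE TABLE tokens ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " token_hash TEXT NOT NULL UNIQUE,"
        " me TEXT NOT NULL, client_id TEXT NOT NULL, scope TEXT NOT NULL,"
        " issued_at TIMESTAMP NOT NULL, expires_at TIMESTAMP NOT NULL,"
        " revoked BOOLEAN DEFAULT 0)"
    )


def hash_token(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


def issue_token(db: sqlite3.Connection, me: str, client_id: str,
                now: datetime, ttl: timedelta = timedelta(hours=1)) -> str:
    token = secrets.token_urlsafe(32)
    db.execute(
        "INSERT INTO tokens (token_hash, me, client_id, scope, issued_at, expires_at)"
        " VALUES (?, ?, ?, '', ?, ?)",
        (hash_token(token), me, client_id,
         now.isoformat(timespec="seconds"),
         (now + ttl).isoformat(timespec="seconds")),
    )
    return token  # only the hash is persisted


def lookup_token(db: sqlite3.Connection, token: str, now: datetime) -> Optional[str]:
    """Return the me URL for a valid token, or None."""
    row = db.execute(
        "SELECT me FROM tokens"
        " WHERE token_hash = ? AND revoked = 0 AND expires_at > ?",
        (hash_token(token), now.isoformat(timespec="seconds")),
    ).fetchone()
    return row[0] if row else None
```

Expired and revoked rows are filtered in the query itself, so periodic cleanup only affects table size, not correctness.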

Verified Domain (SQLite)

CREATE TABLE domains (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain TEXT NOT NULL UNIQUE,  -- UNIQUE already creates an index on domain
    verification_method TEXT NOT NULL,  -- 'two_factor' (DNS + Email)
    verified_at TIMESTAMP NOT NULL,
    last_dns_check TIMESTAMP,
    dns_txt_valid BOOLEAN DEFAULT 0,
    last_email_check TIMESTAMP
);

Purpose: Cache domain ownership verification
Verification Method: Always 'two_factor' in v1.0.0 (DNS TXT + Email via rel="me")
DNS TXT: Re-verified periodically (daily check)
Email: NOT stored (only verification timestamp recorded)
Re-verification: DNS checked periodically, email re-verified on each login
Cleanup: Optional (admin decision)
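
The value written to dns_txt_valid depends on the resolver consensus rule from the authentication flow (at least 2 resolvers must see the record). A sketch of that decision as a pure function follows; it assumes the TXT answers for `_gondulf.{domain}` have already been collected per resolver by some DNS library, and the name `dns_txt_verified` is hypothetical.

```python
from typing import Iterable, Mapping


def dns_txt_verified(answers_by_resolver: Mapping[str, Iterable[str]],
                     expected: str = "verified",
                     required: int = 2) -> bool:
    """True when at least `required` resolvers returned the expected TXT value."""
    agreeing = 0
    for records in answers_by_resolver.values():
        # Some DNS libraries return TXT strings wrapped in double quotes.
        if any(r.strip().strip('"') == expected for r in records):
            agreeing += 1
    return agreeing >= required
```

Keeping the network query separate from the consensus decision makes the rule easy to unit test without touching DNS.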

Security Considerations

URL Validation

Critical: Prevent open redirect and phishing attacks.

me Validation:

import ipaddress
from urllib.parse import urlparse

def validate_me(me: str) -> tuple[bool, str, str]:
    """
    Validate me parameter.

    Returns: (valid, normalized_me, error_message)
    """
    parsed = urlparse(me)

    # Must have scheme and netloc
    if not parsed.scheme or not parsed.netloc:
        return False, "", "me must be a complete URL"

    # Must be HTTP or HTTPS
    if parsed.scheme not in ['http', 'https']:
        return False, "", "me must use http or https"

    # No fragments
    if parsed.fragment:
        return False, "", "me must not contain fragment"

    # No credentials
    if parsed.username or parsed.password:
        return False, "", "me must not contain credentials"

    # No ports (except :443 for HTTPS); .port raises ValueError if malformed
    try:
        port = parsed.port
    except ValueError:
        return False, "", "me contains an invalid port"
    if port is not None and not (port == 443 and parsed.scheme == 'https'):
        return False, "", "me must not contain non-standard port"

    # Use hostname (lowercased, without port or IPv6 brackets), not netloc
    host = parsed.hostname or ""
    if not host:
        return False, "", "me must contain a domain"

    # No IP addresses
    try:
        ipaddress.ip_address(host)
        return False, "", "me must be a domain, not IP address"
    except ValueError:
        pass  # Good, not an IP

    # Normalize: lowercase domain, remove trailing slash
    path = parsed.path.rstrip('/')
    normalized = f"{parsed.scheme}://{host}{path}"

    return True, normalized, ""

redirect_uri Validation:

from urllib.parse import urlparse

DEBUG = False  # Assumed application setting; True only in development

def validate_redirect_uri(redirect_uri: str, client_id: str) -> tuple[bool, str]:
    """
    Validate redirect_uri against client_id.

    Returns: (valid, error_message)
    """
    parsed_redirect = urlparse(redirect_uri)
    parsed_client = urlparse(client_id)

    # Must be valid URL
    if not parsed_redirect.scheme or not parsed_redirect.netloc:
        return False, "redirect_uri must be a complete URL"

    # Compare hostnames (lowercased, without port), not raw netloc
    redirect_domain = parsed_redirect.hostname or ""
    client_domain = parsed_client.hostname or ""

    # Must be HTTPS in production (allow HTTP for localhost)
    if not DEBUG:
        if parsed_redirect.scheme != 'https':
            if redirect_domain != 'localhost':
                return False, "redirect_uri must use HTTPS"

    # Same domain: OK
    if redirect_domain == client_domain:
        return True, ""

    # Subdomain of client domain: OK
    if client_domain and redirect_domain.endswith('.' + client_domain):
        return True, ""

    # Different domain: Check if registered (future)
    # For v1.0.0: Display warning to user
    return True, "warning: redirect_uri domain differs from client_id"

Constant-Time Comparison

Prevent timing attacks on token verification:

import hashlib
import secrets

def verify_token(provided_token: str, stored_hash: str) -> bool:
    """
    Verify token using constant-time comparison.
    """
    provided_hash = hashlib.sha256(provided_token.encode()).hexdigest()
    return secrets.compare_digest(provided_hash, stored_hash)

CSRF Protection

State Parameter:

  • Client generates unguessable state
  • Server returns state unchanged
  • Client verifies state matches
  • Server does NOT validate state (client's responsibility)
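
Since state handling is the client's job, a client-side sketch may help clarify the division of responsibility. The function names and the 16-byte length are illustrative assumptions, not something Gondulf prescribes.

```python
import secrets


def new_state() -> str:
    """Generate an unguessable state value to send with the authorization request."""
    return secrets.token_urlsafe(16)


def state_matches(expected: str, returned: str) -> bool:
    """Compare the state returned on the redirect with the one the client stored."""
    return secrets.compare_digest(expected.encode(), returned.encode())
```

The client would store the result of `new_state()` in the user's session before redirecting, then call `state_matches` when handling the callback.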

HTTPS Enforcement

Production Requirements:

  • All endpoints MUST use HTTPS
  • HTTP allowed only for localhost in development
  • HSTS header recommended: Strict-Transport-Security: max-age=31536000

Rate Limiting (Future)

v1.0.0: Not implemented (acceptable for small deployments).

Future versions:

  • Authorization requests: 10/minute per IP
  • Token requests: 30/minute per client_id
  • Email codes: 3/hour per email
  • Failed verifications: 5/hour per IP
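
One way these limits might eventually be enforced is an in-memory sliding-window counter. The sketch below is an assumption about a possible future design, not a description of planned code; the class name `SlidingWindowLimiter` is hypothetical and the caller supplies the current time in seconds.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop events that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

For the "Email codes: 3/hour per email" rule this would be `SlidingWindowLimiter(3, 3600)` keyed by the email address.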

Protocol Deviations

Intentional Deviations from W3C Spec

ADR-003: PKCE deferred to post-v1.0.0

  • Reason: Simplicity for MVP, small user base, HTTPS mitigates risk
  • Impact: Slightly less secure against code interception
  • Mitigation: Enforce HTTPS, short code TTL (10 minutes)
  • Upgrade Path: Add PKCE in v1.1.0 without breaking changes

ADR-004: No client pre-registration required (TBD)

  • Reason: Aligns with user requirement for simplified client onboarding
  • Impact: Must validate client_id on every request
  • Mitigation: Cache client metadata, implement rate limiting
  • Spec Compliance: Spec allows this ("client IDs are resolvable URLs")

Scope Limitations (v1.0.0)

Authentication Only:

  • scope parameter accepted but ignored
  • All tokens issued with empty scope
  • Tokens prove identity, not authorization
  • Future versions will support scopes

Testing Strategy

Compliance Testing

Required Tests:

  1. Valid authorization request → code generation
  2. Valid token request → token generation
  3. Invalid client_id → error
  4. Invalid redirect_uri → error
  5. Missing state → error
  6. Expired authorization code → error
  7. Used authorization code → error
  8. Mismatched client_id on token request → error

Interoperability Testing

Test Against:

  • IndieAuth.com test suite (if available)
  • Real IndieAuth clients (IndieLogin, etc.)
  • Reference implementation comparison

Security Testing

Required Tests:

  1. Open redirect prevention (invalid redirect_uri)
  2. Timing attack resistance (token verification)
  3. CSRF protection (state parameter)
  4. Code reuse prevention (single-use codes)
  5. URL validation (me parameter malformation)

Implementation Checklist

  • /authorize endpoint with parameter validation
  • Client metadata fetching (h-app microformat)
  • Email verification flow (code generation, sending, validation)
  • Domain ownership caching (SQLite)
  • Authorization code generation and storage (in-memory)
  • /token endpoint with grant validation
  • Access token generation and storage (SQLite, hashed)
  • Error responses (OAuth 2.0 compliant)
  • HTTPS enforcement (production)
  • URL validation (me, client_id, redirect_uri)
  • Constant-time token comparison
  • Metadata endpoint /.well-known/oauth-authorization-server
  • Comprehensive test suite (80%+ coverage)

References