# IndieAuth Protocol Implementation

## Specification Compliance

This document describes Gondulf's implementation of the W3C IndieAuth specification.

**Primary Reference**: https://www.w3.org/TR/indieauth/
**Reference Implementation**: https://github.com/aaronpk/indielogin.com

**Compliance Target**: Any compliant IndieAuth client MUST be able to authenticate successfully against Gondulf.

## Protocol Overview

IndieAuth is built on OAuth 2.0, extending it to enable decentralized authentication where users are identified by URLs (typically their own domain) rather than accounts on centralized services.

### Core Principle

Users prove ownership of a domain, and that domain becomes their identity. No usernames, no passwords stored by the server.

### IndieAuth vs OAuth 2.0

**Similarities**:
- Authorization code flow
- Token endpoint for code exchange
- State parameter for CSRF protection
- Redirect-based flow

**Differences**:
- User identity is a URL (`me` parameter), not an opaque user ID
- No client secrets (all clients are "public clients")
- Client IDs are URLs that must be fetchable
- Domain ownership verification instead of password authentication

## v1.0.0 Scope

Gondulf v1.0.0 implements **authentication only** (not authorization):
- Users can prove they own a domain
- Tokens are issued but carry no permissions (scope)
- Client applications can verify user identity
- NO resource server capabilities
- NO scope-based authorization

**Future versions** will add:
- Authorization with scopes
- Token refresh
- Token revocation
- Resource server capabilities

## Endpoints

### Discovery Endpoint (Optional)

**URL**: `/.well-known/oauth-authorization-server`

**Purpose**: Advertise server capabilities and endpoints per RFC 8414.

**Response** (JSON):
```json
{
  "issuer": "https://auth.example.com",
  "authorization_endpoint": "https://auth.example.com/authorize",
  "token_endpoint": "https://auth.example.com/token",
  "response_types_supported": ["code"],
  "grant_types_supported": ["authorization_code"],
  "token_endpoint_auth_methods_supported": ["none"]
}
```

**Implementation Notes**:
- Optional for v1.0.0 but recommended
- FastAPI endpoint: `GET /.well-known/oauth-authorization-server`
- Static response (no database access)
- Cache-Control: public, max-age=86400
- `code_challenge_methods_supported` is omitted because PKCE is not supported in v1.0.0 (see ADR-003); per RFC 8414, omitting the field signals that PKCE is unsupported. Add `["S256"]` once PKCE is implemented.

### Authorization Endpoint

**URL**: `/authorize`
**Method**: GET
**Purpose**: Initiate authentication flow

#### Required Parameters

| Parameter | Description | Validation |
|-----------|-------------|------------|
| `me` | User's domain/URL | Must be valid URL with no fragment, credentials, or non-default port |
| `client_id` | Client application URL | Must be valid URL, must be fetchable |
| `redirect_uri` | Where to send user after auth | Must be valid URL, must match client_id domain OR be registered |
| `state` | CSRF protection token | Required, opaque string, returned unchanged |
| `response_type` | Must be `code` | Exactly `code` for auth code flow |

#### Optional Parameters (v1.0.0)

| Parameter | Description | v1.0.0 Behavior |
|-----------|-------------|-----------------|
| `scope` | Requested permissions | Ignored (authentication only) |
| `code_challenge` | PKCE challenge | NOT supported in v1.0.0 |
| `code_challenge_method` | PKCE method | NOT supported in v1.0.0 |

**PKCE Decision**: Deferred to post-v1.0.0 to maintain MVP simplicity. See ADR-003.

#### Request Validation Sequence

1. **Validate `response_type`**
   - MUST be exactly `code`
   - Error: `unsupported_response_type`

2. **Validate `me` parameter**
   - Must be a valid URL
   - Must NOT contain fragment (#)
   - Must NOT contain credentials (user:pass@)
   - Must NOT contain port (except :443 for HTTPS)
   - Must NOT be an IP address
   - Normalize: lowercase domain, remove trailing slash
   - Error: `invalid_request` with description

3. **Validate `client_id`**
   - Must be a valid URL
   - Must contain a domain component (not localhost in production)
   - Fetch client_id URL to retrieve app info (see Client Validation)
   - Error: `invalid_client` with description

4. **Validate `redirect_uri`**
   - Must be a valid URL
   - Must use HTTPS in production (HTTP allowed for localhost)
   - If domain differs from client_id domain:
     - Must match client_id subdomain pattern, OR
     - Must be registered in client metadata (future), OR
     - Display warning to user
   - Error: `invalid_request` with description

5. **Validate `state`**
   - Must be present
   - Must be non-empty string
   - Store alongside the authorization request so it can be returned to the client unchanged (the server does not interpret or verify it)
   - Error: `invalid_request` with description

#### Client Validation

When `client_id` is provided, fetch the URL to retrieve application information:

**HTTP Request**:
```
GET https://client.example.com/
Accept: text/html
```

**Extract Application Info**:
- Look for `h-app` microformat in HTML
- Extract: app name, icon, URL
- Extract registered redirect URIs from `<link rel="redirect_uri">` tags
- Cache result for 24 hours

**Fallback**:
- If no h-app found, use domain name as app name
- If no icon, use generic icon
- If no redirect URIs registered, rely on domain matching

**Security**:
- Follow redirects (max 5)
- Timeout after 5 seconds
- Validate SSL certificates
- Reject non-200 responses
- Log client_id fetch failures

#### Authentication Flow (v1.0.0: Two-Factor Domain Verification)

1. **DNS TXT Record Verification (Required)**
   - Check if `me` domain has TXT record: `_gondulf.{domain}` = `verified`
   - Query multiple DNS resolvers (Google 8.8.8.8, Cloudflare 1.1.1.1)
   - Require consensus from at least 2 resolvers
   - If not found: Display error with instructions to add TXT record
   - If found: Proceed to email discovery
   - Proves: User controls DNS for the domain

2. **Email Discovery via rel="me" (Required)**
   - Fetch user's domain homepage (e.g., https://example.com)
   - Parse HTML for `<link rel="me" href="mailto:...">` or `<a rel="me" href="mailto:...">`
   - Extract email address from first matching mailto: link
   - If not found: Display error with instructions to add rel="me" link
   - If found: Proceed to email verification
   - Proves: User has published email relationship on their site
   - Reference: https://indieweb.org/rel-me

3. **Email Verification Code (Required)**
   - Generate 6-digit verification code (cryptographically random)
   - Store code in memory with 15-minute TTL
   - Send code to discovered email address via SMTP
   - Display code entry form showing discovered email (partially masked)
   - User enters 6-digit code
   - Validate code matches and hasn't expired (max 3 attempts)
   - Proves: User controls the email account
   - Mark domain as verified (store in database)

4. **User Consent**
   - Display authorization prompt:
     - "Sign in to [App Name] as [me]"
     - Show client_id full URL
     - Show redirect_uri if different domain
     - Show scope (future)
   - User approves or denies

5. **Authorization Code Generation**
   - Generate cryptographically secure code (32 bytes, base64url)
   - Store code in memory with 10-minute TTL
   - Store associated data:
     - `me` (user's domain)
     - `client_id`
     - `redirect_uri`
     - `state`
     - Timestamp
   - Code is single-use only

6. **Redirect to Client**
   ```
   HTTP/1.1 302 Found
   Location: {redirect_uri}?code={code}&state={state}
   ```

**Security Model**: Two-factor verification requires BOTH DNS control AND email control. An attacker would need to compromise both to authenticate fraudulently.

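The code-handling steps of the flow above (the 6-digit email code and the single-use authorization code with a 10-minute TTL) can be sketched as follows. This is a minimal illustration, not Gondulf's actual implementation: `AuthCodeStore`, `AuthCodeRecord`, `generate_email_code`, and the injectable `clock` parameter are hypothetical names introduced here, and a real deployment would also need periodic cleanup of expired entries.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Optional

AUTH_CODE_TTL_SECONDS = 10 * 60  # 10-minute TTL, as stated in the flow


def generate_email_code() -> str:
    """Return a zero-padded 6-digit code drawn from a CSPRNG."""
    return f"{secrets.randbelow(10**6):06d}"


@dataclass
class AuthCodeRecord:
    """Data stored alongside each authorization code (hypothetical shape)."""
    me: str
    client_id: str
    redirect_uri: str
    state: str
    expires_at: float
    used: bool = False


class AuthCodeStore:
    """Hypothetical in-memory store for single-use authorization codes."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._codes: dict[str, AuthCodeRecord] = {}

    def issue(self, me: str, client_id: str, redirect_uri: str, state: str) -> str:
        code = secrets.token_urlsafe(32)  # 32 random bytes, base64url encoded
        self._codes[code] = AuthCodeRecord(
            me=me,
            client_id=client_id,
            redirect_uri=redirect_uri,
            state=state,
            expires_at=self._clock() + AUTH_CODE_TTL_SECONDS,
        )
        return code

    def redeem(self, code: str) -> Optional[AuthCodeRecord]:
        """Return the record once; None if unknown, already used, or expired."""
        record = self._codes.get(code)
        if record is None or record.used:
            return None
        if self._clock() >= record.expires_at:
            del self._codes[code]  # drop expired entries eagerly
            return None
        record.used = True  # mark before returning so a replay fails
        return record
```

Keeping the used record in the dict (rather than deleting it on redemption) is one way to distinguish a replayed code from an unknown one in logs; deleting it immediately would be an equally valid choice for this sketch.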
#### Error Responses

Return error via redirect when possible:

```
HTTP/1.1 302 Found
Location: {redirect_uri}?error={error}&error_description={description}&state={state}
```

**Error Codes** (OAuth 2.0 standard):
- `invalid_request` - Malformed request
- `unauthorized_client` - Client not authorized
- `access_denied` - User denied authorization
- `unsupported_response_type` - response_type not `code`
- `invalid_scope` - Invalid scope requested (future)
- `server_error` - Internal server error
- `temporarily_unavailable` - Server temporarily unavailable

When a redirect is not possible (invalid redirect_uri), display an error page.

### Token Endpoint

**URL**: `/token`
**Method**: POST
**Content-Type**: `application/x-www-form-urlencoded`
**Purpose**: Exchange authorization code for access token

#### Required Parameters

| Parameter | Description | Validation |
|-----------|-------------|------------|
| `grant_type` | Must be `authorization_code` | Exactly `authorization_code` |
| `code` | Authorization code from /authorize | Must be valid, unexpired, unused |
| `client_id` | Client application URL | Must match code's client_id |
| `redirect_uri` | Original redirect URI | Must match code's redirect_uri |
| `me` | User's domain | Must match code's me |

#### Request Validation Sequence

1. **Validate `grant_type`**
   - MUST be `authorization_code`
   - Error: `unsupported_grant_type`

2. **Validate `code`**
   - Must exist in storage
   - Must not have expired (10-minute TTL)
   - Must not have been used already
   - Mark as used immediately
   - Error: `invalid_grant`

3. **Validate `client_id`**
   - Must match the client_id associated with code
   - Error: `invalid_client`

4. **Validate `redirect_uri`**
   - Must exactly match the redirect_uri from authorization request
   - Error: `invalid_grant`

5. **Validate `me`**
   - Must exactly match the me from authorization request
   - Error: `invalid_request`

#### Token Generation

**v1.0.0 Implementation: Opaque Tokens**

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone


def build_token_record(me: str, client_id: str) -> tuple[str, dict]:
    """Generate an opaque token and the record to store for it."""
    # Generate token
    token = secrets.token_urlsafe(32)  # 256 bits
    now = datetime.now(timezone.utc)  # timezone-aware; utcnow() is deprecated

    # Store in database (only the hash is persisted, never the token itself)
    token_record = {
        "token_hash": hashlib.sha256(token.encode()).hexdigest(),
        "me": me,
        "client_id": client_id,
        "scope": "",  # Empty for authentication-only
        "issued_at": now,
        "expires_at": now + timedelta(hours=1),
    }
    return token, token_record
```

**Why Opaque Tokens in v1.0.0**:
- Simpler than JWT (no signing, no key rotation)
- Easier to revoke (database lookup)
- Sufficient for authentication-only use case
- Can migrate to JWT in future versions

**Token Properties**:
- Length: 43 characters (base64url of 32 bytes)
- Entropy: 256 bits (cryptographically secure)
- Storage: SHA-256 hash in database
- Expiration: 1 hour (configurable)
- Revocable: Delete from database

#### Success Response

**HTTP 200 OK**:
```json
{
  "access_token": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "token_type": "Bearer",
  "me": "https://example.com",
  "scope": ""
}
```

**Response Fields**:
- `access_token`: The opaque token (43 characters)
- `token_type`: Always `Bearer`
- `me`: User's canonical domain URL (normalized)
- `scope`: Empty string for authentication-only (future: space-separated scopes)

**Headers**:
```
Content-Type: application/json
Cache-Control: no-store
Pragma: no-cache
```

#### Error Responses

**HTTP 400 Bad Request**:
```json
{
  "error": "invalid_grant",
  "error_description": "Authorization code has expired"
}
```

**Error Codes** (OAuth 2.0 standard):
- `invalid_request` - Missing or invalid parameters
- `invalid_client` - Client authentication failed
- `invalid_grant` - Invalid or expired authorization code
- `unauthorized_client` - Client not authorized for grant type
- `unsupported_grant_type` - Grant type not `authorization_code`

### Token Verification Endpoint (Future)

**URL**: `/token/verify`
**Method**: GET
**Purpose**: Verify token validity (for resource servers)

**NOT implemented in v1.0.0** (authentication only, no resource servers).

Future implementation:
```
GET /token/verify
Authorization: Bearer {token}

Response 200 OK:
{
  "me": "https://example.com",
  "client_id": "https://client.example.com",
  "scope": ""
}
```

### Token Revocation Endpoint (Future)

**URL**: `/token/revoke`
**Method**: POST
**Purpose**: Revoke access token

**NOT implemented in v1.0.0**.

Future implementation per RFC 7009.

## Data Models

### Authorization Code (In-Memory)

```python
{
    "code": "abc123...",  # 43-char base64url
    "me": "https://example.com",
    "client_id": "https://client.example.com",
    "redirect_uri": "https://client.example.com/callback",
    "state": "client-provided-state",
    "created_at": datetime,
    "expires_at": datetime,  # created_at + 10 minutes
    "used": False
}
```

**Storage**: Python dict with TTL management
**Expiration**: 10 minutes (per spec: "shortly after")
**Single-use**: Marked as used after redemption
**Cleanup**: Automatic expiration via TTL

### Email Verification Code (In-Memory)

```python
{
    "email": "admin@example.com",  # Discovered from rel="me", not user-provided
    "code": "123456",  # 6-digit string
    "domain": "example.com",
    "created_at": datetime,
    "expires_at": datetime,  # created_at + 15 minutes
    "attempts": 0  # Rate limiting (max 3 attempts)
}
```

**Storage**: Python dict with TTL management
**Email Source**: Discovered from site's rel="me" link (not user input)
**Expiration**: 15 minutes
**Rate Limiting**: Max 3 attempts per email, max 3 codes per domain per hour
**Cleanup**: Automatic expiration via TTL

### Access Token (SQLite)

```sql
CREATE TABLE tokens (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    token_hash TEXT NOT NULL UNIQUE,  -- SHA-256 hash
    me TEXT NOT NULL,
    client_id TEXT NOT NULL,
    scope TEXT NOT NULL,  -- Empty string for v1.0.0
    issued_at TIMESTAMP NOT NULL,
    expires_at TIMESTAMP NOT NULL,
    revoked BOOLEAN DEFAULT 0
);

-- SQLite does not accept inline INDEX clauses, so indexes are created separately
CREATE INDEX idx_token_hash ON tokens (token_hash);
CREATE INDEX idx_me ON tokens (me);
CREATE INDEX idx_expires_at ON tokens (expires_at);
```

**Lookup**: By token_hash (constant-time comparison)
**Expiration**: 1 hour default (configurable)
**Revocation**: Set `revoked = 1` (future feature)
**Cleanup**: Periodic deletion of expired tokens

### Verified Domain (SQLite)

```sql
CREATE TABLE domains (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain TEXT NOT NULL UNIQUE,
    verification_method TEXT NOT NULL,  -- 'two_factor' (DNS + Email)
    verified_at TIMESTAMP NOT NULL,
    last_dns_check TIMESTAMP,
    dns_txt_valid BOOLEAN DEFAULT 0,
    last_email_check TIMESTAMP
);

-- SQLite does not accept inline INDEX clauses, so the index is created separately
CREATE INDEX idx_domain ON domains (domain);
```

**Purpose**: Cache domain ownership verification
**Verification Method**: Always 'two_factor' in v1.0.0 (DNS TXT + Email via rel="me")
**DNS TXT**: Re-verified periodically (daily check)
**Email**: NOT stored (only verification timestamp recorded)
**Re-verification**: DNS checked periodically, email re-verified on each login
**Cleanup**: Optional (admin decision)

## Security Considerations

### URL Validation

**Critical**: Prevent open redirect and phishing attacks.

**`me` Validation**:
```python
import ipaddress
from urllib.parse import urlparse


def validate_me(me: str) -> tuple[bool, str, str]:
    """
    Validate me parameter.

    Returns: (valid, normalized_me, error_message)
    """
    parsed = urlparse(me)

    # Must have scheme and host
    if not parsed.scheme or not parsed.hostname:
        return False, "", "me must be a complete URL"

    # Must be HTTP or HTTPS
    if parsed.scheme not in ('http', 'https'):
        return False, "", "me must use http or https"

    # No fragments
    if parsed.fragment:
        return False, "", "me must not contain fragment"

    # No credentials
    if parsed.username or parsed.password:
        return False, "", "me must not contain credentials"

    # No ports (except default); parsed.port raises ValueError if malformed
    try:
        port = parsed.port
    except ValueError:
        return False, "", "me contains an invalid port"
    if port and not (port == 443 and parsed.scheme == 'https'):
        return False, "", "me must not contain non-standard port"

    # No IP addresses (hostname excludes any port, unlike netloc)
    try:
        ipaddress.ip_address(parsed.hostname)
        return False, "", "me must be a domain, not IP address"
    except ValueError:
        pass  # Good, not an IP

    # Normalize: hostname is already lowercased and carries no port
    domain = parsed.hostname
    path = parsed.path.rstrip('/')
    normalized = f"{parsed.scheme}://{domain}{path}"

    return True, normalized, ""
```

**`redirect_uri` Validation**:
```python
from urllib.parse import urlparse

DEBUG = False  # Assumed module-level flag; in practice read from app config


def validate_redirect_uri(redirect_uri: str, client_id: str) -> tuple[bool, str]:
    """
    Validate redirect_uri against client_id.

    Returns: (valid, error_message)
    """
    parsed_redirect = urlparse(redirect_uri)
    parsed_client = urlparse(client_id)

    # Must be valid URL
    if not parsed_redirect.scheme or not parsed_redirect.netloc:
        return False, "redirect_uri must be a complete URL"

    # Must be HTTPS in production (allow HTTP for localhost, with or without port)
    if not DEBUG and parsed_redirect.scheme != 'https':
        if parsed_redirect.hostname != 'localhost':
            return False, "redirect_uri must use HTTPS"

    redirect_domain = parsed_redirect.netloc.lower()
    client_domain = parsed_client.netloc.lower()

    # Same domain: OK
    if redirect_domain == client_domain:
        return True, ""

    # Subdomain of client domain: OK
    if redirect_domain.endswith('.' + client_domain):
        return True, ""

    # Different domain: Check if registered (future)
    # For v1.0.0: Display warning to user
    return True, "warning: redirect_uri domain differs from client_id"
```

### Constant-Time Comparison

Prevent timing attacks on token verification:

```python
import hashlib
import secrets


def verify_token(provided_token: str, stored_hash: str) -> bool:
    """
    Verify token using constant-time comparison.
    """
    provided_hash = hashlib.sha256(provided_token.encode()).hexdigest()
    return secrets.compare_digest(provided_hash, stored_hash)
```

### CSRF Protection

**State Parameter**:
- Client generates unguessable state
- Server returns state unchanged
- Client verifies state matches
- Server does NOT validate state (client's responsibility)

### HTTPS Enforcement

**Production Requirements**:
- All endpoints MUST use HTTPS
- HTTP allowed only for localhost in development
- HSTS header recommended: `Strict-Transport-Security: max-age=31536000`

### Rate Limiting (Future)

**v1.0.0**: Request-level rate limiting is not implemented (acceptable for small deployments). The attempt limits on email verification codes still apply.


**Future versions**:
- Authorization requests: 10/minute per IP
- Token requests: 30/minute per client_id
- Email codes: 3/hour per email
- Failed verifications: 5/hour per IP

## Protocol Deviations

### Intentional Deviations from W3C Spec

**ADR-003**: PKCE deferred to post-v1.0.0
- **Reason**: Simplicity for MVP, small user base, HTTPS mitigates risk
- **Impact**: Slightly less secure against code interception
- **Mitigation**: Enforce HTTPS, short code TTL (10 minutes)
- **Upgrade Path**: Add PKCE in v1.1.0 without breaking changes

**ADR-004**: No client pre-registration required (TBD)
- **Reason**: Aligns with user requirement for simplified client onboarding
- **Impact**: Must validate client_id on every request
- **Mitigation**: Cache client metadata, implement rate limiting
- **Spec Compliance**: Spec allows this ("client IDs are resolvable URLs")

### Scope Limitations (v1.0.0)

**Authentication Only**:
- `scope` parameter accepted but ignored
- All tokens issued with empty scope
- Tokens prove identity, not authorization
- Future versions will support scopes

## Testing Strategy

### Compliance Testing

**Required Tests**:
1. Valid authorization request → code generation
2. Valid token request → token generation
3. Invalid client_id → error
4. Invalid redirect_uri → error
5. Missing state → error
6. Expired authorization code → error
7. Used authorization code → error
8. Mismatched client_id on token request → error

### Interoperability Testing

**Test Against**:
- IndieAuth.com test suite (if available)
- Real IndieAuth clients (IndieLogin, etc.)
- Reference implementation comparison

### Security Testing

**Required Tests**:
1. Open redirect prevention (invalid redirect_uri)
2. Timing attack resistance (token verification)
3. CSRF protection (state parameter)
4. Code reuse prevention (single-use codes)
5. URL validation (me parameter malformation)

## Implementation Checklist

- [ ] `/authorize` endpoint with parameter validation
- [ ] Client metadata fetching (h-app microformat)
- [ ] Email verification flow (code generation, sending, validation)
- [ ] Domain ownership caching (SQLite)
- [ ] Authorization code generation and storage (in-memory)
- [ ] `/token` endpoint with grant validation
- [ ] Access token generation and storage (SQLite, hashed)
- [ ] Error responses (OAuth 2.0 compliant)
- [ ] HTTPS enforcement (production)
- [ ] URL validation (me, client_id, redirect_uri)
- [ ] Constant-time token comparison
- [ ] Metadata endpoint `/.well-known/oauth-authorization-server`
- [ ] Comprehensive test suite (80%+ coverage)

## References

- W3C IndieAuth Specification: https://www.w3.org/TR/indieauth/
- OAuth 2.0 (RFC 6749): https://datatracker.ietf.org/doc/html/rfc6749
- OAuth 2.0 Security Best Practices: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics
- PKCE (RFC 7636): https://datatracker.ietf.org/doc/html/rfc7636 (future)
- Token Revocation (RFC 7009): https://datatracker.ietf.org/doc/html/rfc7009 (future)
- Authorization Server Metadata (RFC 8414): https://datatracker.ietf.org/doc/html/rfc8414