Phil Skentelbery 6f06aebf40 docs: add Phase 2 domain verification design and clarifications

Add comprehensive Phase 2 documentation:
- Complete design document for two-factor domain verification
- Implementation guide with code examples
- ADR for implementation decisions (ADR-0004)
- ADR for rel="me" email discovery (ADR-008)
- Phase 1 impact assessment
- All 23 clarification questions answered
- Updated architecture docs (indieauth-protocol, security)
- Updated ADR-005 with rel="me" approach
- Updated backlog with technical debt items

Design ready for Phase 2 implementation.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-20 13:05:09 -07:00

IndieAuth Protocol Implementation

Specification Compliance

This document describes Gondulf's implementation of the W3C IndieAuth specification.

Primary Reference: https://www.w3.org/TR/indieauth/
Reference Implementation: https://github.com/aaronpk/indielogin.com

Compliance Target: Any compliant IndieAuth client MUST be able to authenticate successfully against Gondulf.

Protocol Overview

IndieAuth is built on OAuth 2.0, extending it to enable decentralized authentication where users are identified by URLs (typically their own domain) rather than accounts on centralized services.

Core Principle

Users prove ownership of a domain, and that domain becomes their identity. No usernames, no passwords stored by the server.

IndieAuth vs OAuth 2.0

Similarities:

Authorization code flow
Token endpoint for code exchange
State parameter for CSRF protection
Redirect-based flow

Differences:

User identity is a URL (me parameter), not an opaque user ID
No client secrets (all clients are "public clients")
Client IDs are URLs that must be fetchable
Domain ownership verification instead of password authentication

v1.0.0 Scope

Gondulf v1.0.0 implements authentication only (not authorization):

Users can prove they own a domain
Tokens are issued but carry no permissions (scope)
Client applications can verify user identity
NO resource server capabilities
NO scope-based authorization

Future versions will add:

Authorization with scopes
Token refresh
Token revocation
Resource server capabilities

Endpoints

Discovery Endpoint (Optional)

URL: /.well-known/oauth-authorization-server

Purpose: Advertise server capabilities and endpoints per RFC 8414.

Response (JSON):

{
  "issuer": "https://auth.example.com",
  "authorization_endpoint": "https://auth.example.com/authorize",
  "token_endpoint": "https://auth.example.com/token",
  "response_types_supported": ["code"],
  "grant_types_supported": ["authorization_code"],
  "token_endpoint_auth_methods_supported": ["none"]
}

Implementation Notes:

Optional for v1.0.0 but recommended
FastAPI endpoint: GET /.well-known/oauth-authorization-server
Static response (no database access)
Cache-Control: public, max-age=86400
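
A minimal sketch of how the static metadata document and its caching header could be assembled. The function name `build_metadata` is hypothetical, and the framework wiring (the FastAPI route itself) is left out so the example stays standard-library only. Because PKCE is not supported in v1.0.0, this sketch leaves out `code_challenge_methods_supported`; per RFC 8414, omitting that field signals no PKCE support.

```python
import json


def build_metadata(issuer: str) -> tuple[dict, dict]:
    """Return (body, headers) for the RFC 8414 metadata response.

    `issuer` is the public base URL of the server (assumed to be configured).
    """
    issuer = issuer.rstrip("/")
    body = {
        "issuer": issuer,
        "authorization_endpoint": f"{issuer}/authorize",
        "token_endpoint": f"{issuer}/token",
        "response_types_supported": ["code"],
        "grant_types_supported": ["authorization_code"],
        "token_endpoint_auth_methods_supported": ["none"],
    }
    headers = {
        "Content-Type": "application/json",
        # Static document, so clients may cache it for a day
        "Cache-Control": "public, max-age=86400",
    }
    return body, headers


body, headers = build_metadata("https://auth.example.com/")
print(json.dumps(body, indent=2))
```

Since the body depends only on configuration, it can be built once at startup and returned unchanged on every request.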

Authorization Endpoint

URL: /authorize
Method: GET
Purpose: Initiate authentication flow

Required Parameters

Parameter	Description	Validation
`me`	User's domain/URL	Must be valid URL, no fragments/credentials/ports
`client_id`	Client application URL	Must be valid URL, must be fetchable
`redirect_uri`	Where to send user after auth	Must be valid URL, must match client_id domain OR be registered
`state`	CSRF protection token	Required, opaque string, returned unchanged
`response_type`	Must be `code`	Exactly `code` for auth code flow

Optional Parameters (v1.0.0)

Parameter	Description	v1.0.0 Behavior
`scope`	Requested permissions	Ignored (authentication only)
`code_challenge`	PKCE challenge	NOT supported in v1.0.0
`code_challenge_method`	PKCE method	NOT supported in v1.0.0

PKCE Decision: Deferred to post-v1.0.0 to maintain MVP simplicity. See ADR-003.

Request Validation Sequence

1. Validate response_type
   - MUST be exactly code
   - Error: unsupported_response_type
2. Validate me parameter
   - Must be a valid URL
   - Must NOT contain fragment (#)
   - Must NOT contain credentials (user:pass@)
   - Must NOT contain port (except :443 for HTTPS)
   - Must NOT be an IP address
   - Normalize: lowercase domain, remove trailing slash
   - Error: invalid_request with description
3. Validate client_id
   - Must be a valid URL
   - Must contain a domain component (not localhost in production)
   - Fetch client_id URL to retrieve app info (see Client Validation)
   - Error: invalid_client with description
4. Validate redirect_uri
   - Must be a valid URL
   - Must use HTTPS in production (HTTP allowed for localhost)
   - If domain differs from client_id domain:
     - Must match client_id subdomain pattern, OR
     - Must be registered in client metadata (future), OR
     - Display warning to user
   - Error: invalid_request with description
5. Validate state
   - Must be present
   - Must be non-empty string
   - Store so it can be returned to the client unchanged (the server does not validate it)
   - Error: invalid_request with description
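
The ordering of the sequence above can be sketched as a single function that returns the first failing check. This is an illustrative sketch only: `check_authorize_params` and `REQUIRED` are hypothetical names, and it covers just the presence checks, leaving URL-level validation to dedicated helpers.

```python
from typing import Optional

# Error code reported when each parameter is missing, in validation order.
REQUIRED = {
    "me": "invalid_request",
    "client_id": "invalid_client",
    "redirect_uri": "invalid_request",
    "state": "invalid_request",
}


def check_authorize_params(params: dict[str, str]) -> Optional[tuple[str, str]]:
    """Return (error, description) for the first failed check, or None."""
    # response_type is checked first and must be exactly "code"
    if params.get("response_type") != "code":
        return "unsupported_response_type", "response_type must be exactly 'code'"
    # Remaining parameters must be present and non-empty
    for name, error in REQUIRED.items():
        if not params.get(name):
            return error, f"{name} is required and must be non-empty"
    return None


print(check_authorize_params({"response_type": "code", "me": "https://example.com"}))
```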

Client Validation

When client_id is provided, fetch the URL to retrieve application information:

HTTP Request:

GET https://client.example.com/
Accept: text/html

Extract Application Info:

Look for h-app microformat in HTML
Extract: app name, icon, URL
Extract registered redirect URIs from <link rel="redirect_uri"> tags
Cache result for 24 hours

Fallback:

If no h-app found, use domain name as app name
If no icon, use generic icon
If no redirect URIs registered, rely on domain matching

Security:

Follow redirects (max 5)
Timeout after 5 seconds
Validate SSL certificates
Reject non-200 responses
Log client_id fetch failures
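
A rough sketch of the extraction and fallback rules, using only the standard library. It assumes the HTML has already been fetched under the limits listed above. `ClientInfoParser` and `extract_client_info` are hypothetical names, and the parser is deliberately simplified: it takes the first `p-name` and `u-logo` it finds rather than checking that they sit inside the `h-app` element, which a real microformats parser would do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ClientInfoParser(HTMLParser):
    """Collect app name, icon, and redirect URIs from a client_id page."""

    def __init__(self) -> None:
        super().__init__()
        self.name = None
        self.icon = None
        self.redirect_uris = []
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        classes = (a.get("class") or "").split()
        rels = (a.get("rel") or "").split()
        if tag == "link" and "redirect_uri" in rels and a.get("href"):
            self.redirect_uris.append(a["href"])
        if "u-logo" in classes and a.get("src") and self.icon is None:
            self.icon = a["src"]
        if "p-name" in classes and self.name is None:
            self._in_name = True  # next non-empty text node is the name

    def handle_data(self, data):
        if self._in_name and data.strip():
            self.name = data.strip()
            self._in_name = False


def extract_client_info(html: str, client_id: str) -> dict:
    parser = ClientInfoParser()
    parser.feed(html)
    return {
        # Fallback: use the domain name when no app name is published
        "name": parser.name or urlparse(client_id).hostname,
        # None means the caller should show a generic icon
        "icon": urljoin(client_id, parser.icon) if parser.icon else None,
        "redirect_uris": [urljoin(client_id, u) for u in parser.redirect_uris],
    }
```

Relative URLs are resolved against `client_id`, so a page that registers `/callback` yields an absolute redirect URI that can be compared directly with the request.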

Authentication Flow (v1.0.0: Two-Factor Domain Verification)

1. DNS TXT Record Verification (Required)
   - Check if me domain has TXT record: _gondulf.{domain} = verified
   - Query multiple DNS resolvers (Google 8.8.8.8, Cloudflare 1.1.1.1)
   - Require consensus from at least 2 resolvers
   - If not found: Display error with instructions to add TXT record
   - If found: Proceed to email discovery
   - Proves: User controls DNS for the domain
2. Email Discovery via rel="me" (Required)
   - Fetch user's domain homepage (e.g., https://example.com)
   - Parse HTML for <link rel="me" href="mailto:user@example.com"> or <a rel="me" href="mailto:user@example.com">
   - Extract email address from first matching mailto: link
   - If not found: Display error with instructions to add rel="me" link
   - If found: Proceed to email verification
   - Proves: User has published email relationship on their site
   - Reference: https://indieweb.org/rel-me
3. Email Verification Code (Required)
   - Generate 6-digit verification code (cryptographically random)
   - Store code in memory with 15-minute TTL
   - Send code to discovered email address via SMTP
   - Display code entry form showing discovered email (partially masked)
   - User enters 6-digit code
   - Validate code matches and hasn't expired (max 3 attempts)
   - Proves: User controls the email account
   - Mark domain as verified (store in database)
4. User Consent
   - Display authorization prompt:
     - "Sign in to [App Name] as [me]"
     - Show client_id full URL
     - Show redirect_uri if different domain
     - Show scope (future)
   - User approves or denies
5. Authorization Code Generation
   - Generate cryptographically secure code (32 bytes, base64url)
   - Store code in memory with 10-minute TTL
   - Store associated data:
     - me (user's domain)
     - client_id
     - redirect_uri
     - state
     - Timestamp
   - Code is single-use only
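
The email discovery step in the flow above could look roughly like this. It is a sketch under assumptions: the homepage HTML is already fetched, `RelMeEmailParser`, `discover_email`, and `mask_email` are hypothetical names, and the masking format is only one possible choice since the flow does not specify it.

```python
from html.parser import HTMLParser
from typing import Optional


class RelMeEmailParser(HTMLParser):
    """Find the first <a> or <link> with rel="me" and a mailto: href."""

    def __init__(self) -> None:
        super().__init__()
        self.email = None

    def handle_starttag(self, tag, attrs):
        if self.email is not None or tag not in ("a", "link"):
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        href = (a.get("href") or "").strip()
        if "me" in rels and href.lower().startswith("mailto:"):
            # Drop any ?subject=... query that may follow the address
            address = href[len("mailto:"):].split("?", 1)[0]
            if "@" in address:
                self.email = address


def discover_email(html: str) -> Optional[str]:
    parser = RelMeEmailParser()
    parser.feed(html)
    return parser.email


def mask_email(email: str) -> str:
    """Partially mask the address for display on the code entry form."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


page = '<a rel="me" href="https://github.com/u">GitHub</a><link rel="me" href="mailto:user@example.com">'
print(mask_email(discover_email(page)))
```

Non-mailto `rel="me"` links (profile URLs, for example) are skipped, so only a published email relationship satisfies this step.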

Redirect to Client

HTTP/1.1 302 Found
Location: {redirect_uri}?code={code}&state={state}

Security Model: Two-factor verification requires BOTH DNS control AND email control. An attacker would need to compromise both to authenticate fraudulently.
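
For the DNS half of this model, the consensus rule (at least 2 resolvers must see the `verified` TXT value) can be sketched as a pure function. The resolver queries themselves are not shown; `answers` is assumed to map each resolver address to the TXT strings it returned, however they were obtained. `txt_record_name` and `has_consensus` are hypothetical names.

```python
def txt_record_name(domain: str) -> str:
    """Name of the TXT record checked for a domain."""
    return f"_gondulf.{domain}"


def has_consensus(
    answers: dict[str, list[str]],
    expected: str = "verified",
    required: int = 2,
) -> bool:
    """True when at least `required` resolvers returned the expected value."""
    agreeing = sum(1 for values in answers.values() if expected in values)
    return agreeing >= required


# Hypothetical results from two resolvers for _gondulf.example.com
answers = {
    "8.8.8.8": ["verified"],
    "1.1.1.1": ["verified"],
}
print(txt_record_name("example.com"), has_consensus(answers))
```

Requiring agreement guards against a single poisoned or stale resolver; a resolver that errors out can simply be recorded with an empty list.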

Error Responses

Return error via redirect when possible:

HTTP/1.1 302 Found
Location: {redirect_uri}?error={error}&error_description={description}&state={state}

Error Codes (OAuth 2.0 standard):

invalid_request - Malformed request
unauthorized_client - Client not authorized
access_denied - User denied authorization
unsupported_response_type - response_type not code
invalid_scope - Invalid scope requested (future)
server_error - Internal server error
temporarily_unavailable - Server temporarily unavailable

When redirect not possible (invalid redirect_uri), display error page.
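
Both the success and error redirects append query parameters to the client's redirect_uri. A small sketch of doing that safely, with `build_redirect` as a hypothetical helper name; it preserves any query string the redirect_uri already carries and percent-encodes the values.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def build_redirect(redirect_uri: str, params: dict[str, str]) -> str:
    """Append params to redirect_uri, keeping its existing query intact."""
    parsed = urlparse(redirect_uri)
    query = parse_qsl(parsed.query, keep_blank_values=True) + list(params.items())
    return urlunparse(parsed._replace(query=urlencode(query)))


print(build_redirect(
    "https://client.example.com/callback",
    {
        "error": "access_denied",
        "error_description": "User denied the request",
        "state": "abc",
    },
))
```

Encoding matters most for `error_description` and `state`, which contain free text and client-supplied data.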

Token Endpoint

URL: /token
Method: POST
Content-Type: application/x-www-form-urlencoded
Purpose: Exchange authorization code for access token

Required Parameters

Parameter	Description	Validation
`grant_type`	Must be `authorization_code`	Exactly `authorization_code`
`code`	Authorization code from /authorize	Must be valid, unexpired, unused
`client_id`	Client application URL	Must match code's client_id
`redirect_uri`	Original redirect URI	Must match code's redirect_uri
`me`	User's domain	Must match code's me

Request Validation Sequence

1. Validate grant_type
   - MUST be authorization_code
   - Error: unsupported_grant_type
2. Validate code
   - Must exist in storage
   - Must not have expired (10-minute TTL)
   - Must not have been used already
   - Mark as used immediately
   - Error: invalid_grant
3. Validate client_id
   - Must match the client_id associated with code
   - Error: invalid_client
4. Validate redirect_uri
   - Must exactly match the redirect_uri from authorization request
   - Error: invalid_grant
5. Validate me
   - Must exactly match the me from authorization request
   - Error: invalid_request
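
A sketch of the sequence above as one function. It assumes the stored authorization code record has the fields shown under Data Models and has already been looked up by the submitted code (`None` when not found). `check_token_request` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def check_token_request(form: dict, record: Optional[dict], now: datetime) -> Optional[str]:
    """Return an OAuth error code for the first failed check, or None."""
    if form.get("grant_type") != "authorization_code":
        return "unsupported_grant_type"
    if record is None or record["used"] or now >= record["expires_at"]:
        return "invalid_grant"
    # Mark as used immediately, before the remaining checks, so a code
    # can never be retried after a partially failed exchange.
    record["used"] = True
    if form.get("client_id") != record["client_id"]:
        return "invalid_client"
    if form.get("redirect_uri") != record["redirect_uri"]:
        return "invalid_grant"
    if form.get("me") != record["me"]:
        return "invalid_request"
    return None


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
record = {
    "me": "https://example.com",
    "client_id": "https://client.example.com",
    "redirect_uri": "https://client.example.com/callback",
    "expires_at": now + timedelta(minutes=10),
    "used": False,
}
form = {
    "grant_type": "authorization_code",
    "code": "abc123",
    "client_id": "https://client.example.com",
    "redirect_uri": "https://client.example.com/callback",
    "me": "https://example.com",
}
print(check_token_request(form, record, now))
```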

Token Generation

v1.0.0 Implementation: Opaque Tokens

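import hashlib
import secrets
from datetime import datetime, timedelta, timezone

def issue_token(me: str, client_id: str) -> tuple[str, dict]:
    """Generate an opaque token and the record to store for it."""
    # Generate token
    token = secrets.token_urlsafe(32)  # 256 bits
    now = datetime.now(timezone.utc)   # One timestamp so both fields agree

    # Store in database (only the hash; the plaintext goes to the client)
    token_record = {
        "token_hash": hashlib.sha256(token.encode()).hexdigest(),
        "me": me,
        "client_id": client_id,
        "scope": "",  # Empty for authentication-only
        "issued_at": now,
        "expires_at": now + timedelta(hours=1),
    }
    return token, token_record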

Why Opaque Tokens in v1.0.0:

Simpler than JWT (no signing, no key rotation)
Easier to revoke (database lookup)
Sufficient for authentication-only use case
Can migrate to JWT in future versions

Token Properties:

Length: 43 characters (base64url of 32 bytes)
Entropy: 256 bits (cryptographically secure)
Storage: SHA-256 hash in database
Expiration: 1 hour (configurable)
Revocable: Delete from database

Success Response

HTTP 200 OK:

{
  "access_token": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "token_type": "Bearer",
  "me": "https://example.com",
  "scope": ""
}

Response Fields:

access_token: The opaque token (43 characters)
token_type: Always Bearer
me: User's canonical domain URL (normalized)
scope: Empty string for authentication-only (future: space-separated scopes)

Headers:

Content-Type: application/json
Cache-Control: no-store
Pragma: no-cache

Error Responses

HTTP 400 Bad Request:

{
  "error": "invalid_grant",
  "error_description": "Authorization code has expired"
}

Error Codes (OAuth 2.0 standard):

invalid_request - Missing or invalid parameters
invalid_client - Client authentication failed
invalid_grant - Invalid or expired authorization code
unauthorized_client - Client not authorized for grant type
unsupported_grant_type - Grant type not authorization_code

Token Verification Endpoint (Future)

URL: /token/verify
Method: GET
Purpose: Verify token validity (for resource servers)

NOT implemented in v1.0.0 (authentication only, no resource servers).

Future implementation:

GET /token/verify
Authorization: Bearer {token}

Response 200 OK:
{
  "me": "https://example.com",
  "client_id": "https://client.example.com",
  "scope": ""
}

Token Revocation Endpoint (Future)

URL: /token/revoke
Method: POST
Purpose: Revoke access token

NOT implemented in v1.0.0.

Future implementation per RFC 7009.

Data Models

Authorization Code (In-Memory)

{
    "code": "abc123...",  # 43-char base64url
    "me": "https://example.com",
    "client_id": "https://client.example.com",
    "redirect_uri": "https://client.example.com/callback",
    "state": "client-provided-state",
    "created_at": datetime,
    "expires_at": datetime,  # created_at + 10 minutes
    "used": False
}

Storage: Python dict with TTL management
Expiration: 10 minutes (per spec: "shortly after")
Single-use: Marked as used after redemption
Cleanup: Automatic expiration via TTL

Email Verification Code (In-Memory)

{
    "email": "admin@example.com",  # Discovered from rel="me", not user-provided
    "code": "123456",  # 6-digit string
    "domain": "example.com",
    "created_at": datetime,
    "expires_at": datetime,  # created_at + 15 minutes
    "attempts": 0  # Rate limiting (max 3 attempts)
}

Storage: Python dict with TTL management
Email Source: Discovered from site's rel="me" link (not user input)
Expiration: 15 minutes
Rate Limiting: Max 3 attempts per email, max 3 codes per domain per hour
Cleanup: Automatic expiration via TTL
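
A sketch of generating and checking the 6-digit code with the attempt limit. `new_verification` and `check_code` are hypothetical names, and the per-domain hourly limit is not covered here.

```python
import secrets
from datetime import datetime, timedelta, timezone

CODE_TTL = timedelta(minutes=15)
MAX_ATTEMPTS = 3


def new_verification(email: str, domain: str, now: datetime) -> dict:
    """Create a verification record with a random, zero-padded 6-digit code."""
    return {
        "email": email,
        "code": f"{secrets.randbelow(10**6):06d}",
        "domain": domain,
        "created_at": now,
        "expires_at": now + CODE_TTL,
        "attempts": 0,
    }


def check_code(record: dict, submitted: str, now: datetime) -> bool:
    """Count the attempt, then compare in constant time."""
    if now >= record["expires_at"] or record["attempts"] >= MAX_ATTEMPTS:
        return False
    record["attempts"] += 1
    # Compare as bytes so non-ASCII input cannot raise TypeError
    return secrets.compare_digest(record["code"].encode(), submitted.encode())


now = datetime.now(timezone.utc)
rec = new_verification("admin@example.com", "example.com", now)
print(len(rec["code"]), check_code(rec, rec["code"], now))
```

Using `secrets.randbelow` rather than `random` keeps the code unpredictable, and the zero-padding preserves codes such as `004217` as six characters.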

Access Token (SQLite)

CREATE TABLE tokens (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    token_hash TEXT NOT NULL UNIQUE,  -- SHA-256 hash; UNIQUE already creates an index
    me TEXT NOT NULL,
    client_id TEXT NOT NULL,
    scope TEXT NOT NULL,  -- Empty string for v1.0.0
    issued_at TIMESTAMP NOT NULL,
    expires_at TIMESTAMP NOT NULL,
    revoked BOOLEAN DEFAULT 0
);

-- SQLite does not accept inline INDEX clauses inside CREATE TABLE
CREATE INDEX idx_me ON tokens (me);
CREATE INDEX idx_expires_at ON tokens (expires_at);

Lookup: By token_hash (constant-time comparison)
Expiration: 1 hour default (configurable)
Revocation: Set revoked = 1 (future feature)
Cleanup: Periodic deletion of expired tokens
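
A sketch of storing and looking up a hashed token with the standard `sqlite3` module, using an in-memory database for illustration. The helper names (`open_db`, `store_token`, `lookup_token`) are hypothetical, and storing timestamps as ISO 8601 text is an assumption of this sketch, not something the schema above mandates.

```python
import hashlib
import secrets
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def _hash(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE tokens (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               token_hash TEXT NOT NULL UNIQUE,
               me TEXT NOT NULL,
               client_id TEXT NOT NULL,
               scope TEXT NOT NULL,
               issued_at TIMESTAMP NOT NULL,
               expires_at TIMESTAMP NOT NULL,
               revoked BOOLEAN DEFAULT 0
           )"""
    )
    return db


def store_token(db: sqlite3.Connection, me: str, client_id: str, now: datetime) -> str:
    """Insert a new token row and return the plaintext token."""
    token = secrets.token_urlsafe(32)
    db.execute(
        "INSERT INTO tokens (token_hash, me, client_id, scope, issued_at, expires_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (_hash(token), me, client_id, "", now.isoformat(),
         (now + timedelta(hours=1)).isoformat()),
    )
    return token


def lookup_token(db: sqlite3.Connection, token: str, now: datetime) -> Optional[dict]:
    """Return the token's identity data, or None if unknown, expired, or revoked."""
    row = db.execute(
        "SELECT me, client_id, expires_at, revoked FROM tokens WHERE token_hash = ?",
        (_hash(token),),
    ).fetchone()
    if row is None:
        return None
    me, client_id, expires_at, revoked = row
    if revoked or now >= datetime.fromisoformat(expires_at):
        return None
    return {"me": me, "client_id": client_id}
```

Because only the hash is stored, a leaked database does not reveal usable tokens.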

Verified Domain (SQLite)

CREATE TABLE domains (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain TEXT NOT NULL UNIQUE,  -- UNIQUE already creates the lookup index
    verification_method TEXT NOT NULL,  -- 'two_factor' (DNS + Email)
    verified_at TIMESTAMP NOT NULL,
    last_dns_check TIMESTAMP,
    dns_txt_valid BOOLEAN DEFAULT 0,
    last_email_check TIMESTAMP
);

Purpose: Cache domain ownership verification
Verification Method: Always 'two_factor' in v1.0.0 (DNS TXT + Email via rel="me")
DNS TXT: Re-verified periodically (daily check)
Email: NOT stored (only verification timestamp recorded)
Re-verification: DNS checked periodically, email re-verified on each login
Cleanup: Optional (admin decision)

Security Considerations

URL Validation

Critical: Prevent open redirect and phishing attacks.

me Validation:

import ipaddress
from urllib.parse import urlparse

def validate_me(me: str) -> tuple[bool, str, str]:
    """
    Validate me parameter.

    Returns: (valid, normalized_me, error_message)
    """
    try:
        parsed = urlparse(me)
        port = parsed.port  # Raises ValueError for a malformed port
    except ValueError:
        return False, "", "me must be a valid URL"

    # Must have scheme and host
    if not parsed.scheme or not parsed.hostname:
        return False, "", "me must be a complete URL"

    # Must be HTTP or HTTPS
    if parsed.scheme not in ('http', 'https'):
        return False, "", "me must use http or https"

    # No fragments
    if parsed.fragment:
        return False, "", "me must not contain fragment"

    # No credentials
    if parsed.username or parsed.password:
        return False, "", "me must not contain credentials"

    # No ports (except :443 for HTTPS)
    if port is not None and not (port == 443 and parsed.scheme == 'https'):
        return False, "", "me must not contain non-standard port"

    # No IP addresses (check hostname; netloc may still carry a port)
    try:
        ipaddress.ip_address(parsed.hostname)
        return False, "", "me must be a domain, not IP address"
    except ValueError:
        pass  # Good, not an IP

    # Normalize: hostname is already lowercased and excludes the port
    domain = parsed.hostname
    path = parsed.path.rstrip('/')
    normalized = f"{parsed.scheme}://{domain}{path}"

    return True, normalized, ""

redirect_uri Validation:

from urllib.parse import urlparse

DEBUG = False  # Assumed application setting; True only in development

def validate_redirect_uri(redirect_uri: str, client_id: str) -> tuple[bool, str]:
    """
    Validate redirect_uri against client_id.

    Returns: (valid, error_message)
    """
    parsed_redirect = urlparse(redirect_uri)
    parsed_client = urlparse(client_id)

    # Must be valid URL
    if not parsed_redirect.scheme or not parsed_redirect.netloc:
        return False, "redirect_uri must be a complete URL"

    # Must be HTTPS in production (allow HTTP for localhost)
    if not DEBUG and parsed_redirect.scheme != 'https':
        # Compare hostname, not netloc, so localhost:8000 is recognized
        if parsed_redirect.hostname != 'localhost':
            return False, "redirect_uri must use HTTPS"

    redirect_domain = parsed_redirect.netloc.lower()
    client_domain = parsed_client.netloc.lower()

    # Same domain: OK
    if redirect_domain == client_domain:
        return True, ""

    # Subdomain of client domain: OK
    if redirect_domain.endswith('.' + client_domain):
        return True, ""

    # Different domain: Check if registered (future)
    # For v1.0.0: Display warning to user
    return True, "warning: redirect_uri domain differs from client_id"

Constant-Time Comparison

Prevent timing attacks on token verification:

import hashlib
import secrets

def verify_token(provided_token: str, stored_hash: str) -> bool:
    """
    Verify token using constant-time comparison.
    """
    provided_hash = hashlib.sha256(provided_token.encode()).hexdigest()
    return secrets.compare_digest(provided_hash, stored_hash)

CSRF Protection

State Parameter:

Client generates unguessable state
Server returns state unchanged
Client verifies state matches
Server does NOT validate state (client's responsibility)
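
Since state handling is the client's responsibility, the client side might look like the following sketch. `new_state` and `state_matches` are hypothetical helper names, and where the client keeps the expected value (typically its session) is left open.

```python
import secrets


def new_state() -> str:
    """Generate an unguessable state value before redirecting to /authorize."""
    return secrets.token_urlsafe(16)


def state_matches(expected: str, returned: str) -> bool:
    """Check the state echoed back on the redirect, in constant time."""
    return secrets.compare_digest(expected.encode(), returned.encode())


expected = new_state()           # stored by the client, e.g. in its session
print(state_matches(expected, expected))
```

A mismatch means the callback did not originate from the flow this client started, and the client should discard the response.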

HTTPS Enforcement

Production Requirements:

All endpoints MUST use HTTPS
HTTP allowed only for localhost in development
HSTS header recommended: Strict-Transport-Security: max-age=31536000

Rate Limiting (Future)

v1.0.0: Not implemented (acceptable for small deployments).

Future versions:

Authorization requests: 10/minute per IP
Token requests: 30/minute per client_id
Email codes: 3/hour per email
Failed verifications: 5/hour per IP
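
These limits are not implemented in v1.0.0, but a sliding-window limiter is one simple way they could be enforced later. The class below is an illustrative sketch with hypothetical names; it keeps state in process memory, so it would only suit a single-process deployment.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Discard events that have fallen out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Authorization requests: 10/minute per IP
limiter = SlidingWindowLimiter(limit=10, window_seconds=60)
results = [limiter.allow("203.0.113.7", now=float(t)) for t in range(11)]
print(results.count(True), results[-1])
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the limiter easy to test.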

Protocol Deviations

Intentional Deviations from W3C Spec

ADR-003: PKCE deferred to post-v1.0.0

Reason: Simplicity for MVP, small user base, HTTPS mitigates risk
Impact: Slightly less secure against code interception
Mitigation: Enforce HTTPS, short code TTL (10 minutes)
Upgrade Path: Add PKCE in v1.1.0 without breaking changes

ADR-004: No client pre-registration required (TBD)

Reason: Aligns with user requirement for simplified client onboarding
Impact: Must validate client_id on every request
Mitigation: Cache client metadata, implement rate limiting
Spec Compliance: Spec allows this ("client IDs are resolvable URLs")

Scope Limitations (v1.0.0)

Authentication Only:

scope parameter accepted but ignored
All tokens issued with empty scope
Tokens prove identity, not authorization
Future versions will support scopes

Testing Strategy

Compliance Testing

Required Tests:

Valid authorization request → code generation
Valid token request → token generation
Invalid client_id → error
Invalid redirect_uri → error
Missing state → error
Expired authorization code → error
Used authorization code → error
Mismatched client_id on token request → error

Interoperability Testing

Test Against:

IndieAuth.com test suite (if available)
Real IndieAuth clients (IndieLogin, etc.)
Reference implementation comparison

Security Testing

Required Tests:

Open redirect prevention (invalid redirect_uri)
Timing attack resistance (token verification)
CSRF protection (state parameter)
Code reuse prevention (single-use codes)
URL validation (me parameter malformation)

Implementation Checklist

/authorize endpoint with parameter validation
Client metadata fetching (h-app microformat)
Email verification flow (code generation, sending, validation)
Domain ownership caching (SQLite)
Authorization code generation and storage (in-memory)
/token endpoint with grant validation
Access token generation and storage (SQLite, hashed)
Error responses (OAuth 2.0 compliant)
HTTPS enforcement (production)
URL validation (me, client_id, redirect_uri)
Constant-time token comparison
Metadata endpoint /.well-known/oauth-authorization-server
Comprehensive test suite (80%+ coverage)

References

W3C IndieAuth Specification: https://www.w3.org/TR/indieauth/
OAuth 2.0 (RFC 6749): https://datatracker.ietf.org/doc/html/rfc6749
OAuth 2.0 Security Best Practices: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics
PKCE (RFC 7636): https://datatracker.ietf.org/doc/html/rfc7636 (future)
Token Revocation (RFC 7009): https://datatracker.ietf.org/doc/html/rfc7009 (future)
Authorization Server Metadata (RFC 8414): https://datatracker.ietf.org/doc/html/rfc8414
