Complete architectural documentation for: - Migration race condition fix with database locking - IndieAuth endpoint discovery implementation - Security considerations and migration guides New documentation: - ADR-030-CORRECTED: IndieAuth endpoint discovery decision - ADR-031: Endpoint discovery implementation details - Architecture docs on endpoint discovery - Migration guide for removed TOKEN_ENDPOINT - Security analysis of endpoint discovery - Implementation and analysis reports
12 KiB
IndieAuth Endpoint Discovery Architecture
Overview
This document details the CORRECT implementation of IndieAuth endpoint discovery for StarPunk. This corrects a fundamental misunderstanding where endpoints were incorrectly hardcoded instead of being discovered dynamically.
Core Principle
Endpoints are NEVER hardcoded. They are ALWAYS discovered from the user's profile URL.
Discovery Process
Step 1: Profile URL Fetching
When discovering endpoints for a user (e.g., https://alice.example.com/):
GET https://alice.example.com/ HTTP/1.1
Accept: text/html
User-Agent: StarPunk/1.0
Step 2: Endpoint Extraction
Check in priority order:
2.1 HTTP Link Headers (Highest Priority)
Link: <https://auth.example.com/authorize>; rel="authorization_endpoint",
<https://auth.example.com/token>; rel="token_endpoint"
2.2 HTML Link Elements
<link rel="authorization_endpoint" href="https://auth.example.com/authorize">
<link rel="token_endpoint" href="https://auth.example.com/token">
2.3 IndieAuth Metadata (Optional)
<link rel="indieauth-metadata" href="https://auth.example.com/.well-known/indieauth-metadata">
Step 3: URL Resolution
All discovered URLs must be resolved relative to the profile URL:
- Absolute URL: Use as-is
- Relative URL: Resolve against profile URL
- Protocol-relative: Inherit profile URL protocol
Token Verification Architecture
The Problem
When Micropub receives a token, it needs to verify it. But with which endpoint?
The Solution
┌─────────────────┐
│ Micropub Request│
│ Bearer: xxxxx │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Extract Token │
└────────┬────────┘
│
▼
┌─────────────────────────┐
│ Determine User Identity │
│ (from token or cache) │
└────────┬────────────────┘
│
▼
┌──────────────────────┐
│ Discover Endpoints │
│ from User Profile │
└────────┬─────────────┘
│
▼
┌──────────────────────┐
│ Verify with │
│ Discovered Endpoint │
└────────┬─────────────┘
│
▼
┌──────────────────────┐
│ Validate Response │
│ - Check 'me' URL │
│ - Check scopes │
└──────────────────────┘
Implementation Components
1. Endpoint Discovery Module
class EndpointDiscovery:
"""
Discovers IndieAuth endpoints from profile URLs
"""
def discover(self, profile_url: str) -> Dict[str, str]:
"""
Discover endpoints from a profile URL
Returns:
{
'authorization_endpoint': 'https://...',
'token_endpoint': 'https://...',
'indieauth_metadata': 'https://...' # optional
}
"""
def parse_link_header(self, header: str) -> Dict[str, str]:
"""Parse HTTP Link header for endpoints"""
def extract_from_html(self, html: str, base_url: str) -> Dict[str, str]:
"""Extract endpoints from HTML link elements"""
def resolve_url(self, url: str, base: str) -> str:
"""Resolve potentially relative URL against base"""
2. Token Verification Module
class TokenVerifier:
"""
Verifies tokens using discovered endpoints
"""
def __init__(self, discovery: EndpointDiscovery, cache: EndpointCache):
self.discovery = discovery
self.cache = cache
def verify(self, token: str, expected_me: str = None) -> TokenInfo:
"""
Verify a token using endpoint discovery
Args:
token: The bearer token to verify
expected_me: Optional expected 'me' URL
Returns:
TokenInfo with 'me', 'scope', 'client_id', etc.
"""
def introspect_token(self, token: str, endpoint: str) -> dict:
"""Call token endpoint to verify token"""
3. Caching Layer
class EndpointCache:
"""
Caches discovered endpoints for performance
"""
def __init__(self, ttl: int = 3600):
self.endpoint_cache = {} # profile_url -> (endpoints, expiry)
self.token_cache = {} # token_hash -> (info, expiry)
self.ttl = ttl
def get_endpoints(self, profile_url: str) -> Optional[Dict[str, str]]:
"""Get cached endpoints if still valid"""
def store_endpoints(self, profile_url: str, endpoints: Dict[str, str]):
"""Cache discovered endpoints"""
def get_token_info(self, token_hash: str) -> Optional[TokenInfo]:
"""Get cached token verification if still valid"""
def store_token_info(self, token_hash: str, info: TokenInfo):
"""Cache token verification result"""
Error Handling
Discovery Failures
| Error | Cause | Response |
|---|---|---|
| ProfileUnreachableError | Can't fetch profile URL | 503 Service Unavailable |
| NoEndpointsFoundError | No endpoints in profile | 400 Bad Request |
| InvalidEndpointError | Malformed endpoint URL | 500 Internal Server Error |
| TimeoutError | Discovery timeout | 504 Gateway Timeout |
Verification Failures
| Error | Cause | Response |
|---|---|---|
| TokenInvalidError | Token rejected by endpoint | 403 Forbidden |
| EndpointUnreachableError | Can't reach token endpoint | 503 Service Unavailable |
| ScopeMismatchError | Token lacks required scope | 403 Forbidden |
| MeMismatchError | Token 'me' doesn't match expected | 403 Forbidden |
Security Considerations
1. HTTPS Enforcement
- Profile URLs SHOULD use HTTPS
- Discovered endpoints MUST use HTTPS
- Reject non-HTTPS endpoints in production
2. Redirect Limits
- Maximum 5 redirects when fetching profiles
- Prevent redirect loops
- Log suspicious redirect patterns
3. Cache Poisoning Prevention
- Validate discovered URLs are well-formed
- Don't cache error responses
- Clear cache on configuration changes
4. Token Security
- Never log tokens in plaintext
- Hash tokens before caching
- Use constant-time comparison for token hashes
Performance Optimization
Caching Strategy
┌─────────────────────────────────────┐
│ First Request │
│ Discovery: ~500ms │
│ Verification: ~200ms │
│ Total: ~700ms │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Subsequent Requests │
│ Cached Endpoints: ~1ms │
│ Cached Token: ~1ms │
│ Total: ~2ms │
└─────────────────────────────────────┘
Cache Configuration
# Endpoint cache (user rarely changes provider)
ENDPOINT_CACHE_TTL=3600 # 1 hour
# Token cache (balance security and performance)
TOKEN_CACHE_TTL=300 # 5 minutes
# Cache sizes
MAX_ENDPOINT_CACHE_SIZE=1000
MAX_TOKEN_CACHE_SIZE=10000
Migration Path
From Incorrect Hardcoded Implementation
- Remove hardcoded endpoint configuration
- Implement discovery module
- Update token verification to use discovery
- Add caching layer
- Update documentation
Configuration Changes
Before (WRONG):
TOKEN_ENDPOINT=https://tokens.indieauth.com/token
AUTHORIZATION_ENDPOINT=https://indieauth.com/auth
After (CORRECT):
ADMIN_ME=https://admin.example.com/
# Endpoints discovered automatically from ADMIN_ME
Testing Strategy
Unit Tests
-
Discovery Tests
- Parse various Link header formats
- Extract from different HTML structures
- Handle malformed responses
- URL resolution edge cases
-
Cache Tests
- TTL expiration
- Cache invalidation
- Size limits
- Concurrent access
-
Security Tests
- HTTPS enforcement
- Redirect limit enforcement
- Cache poisoning attempts
Integration Tests
-
Real Provider Tests
- Test against indieauth.com
- Test against indie-auth.com
- Test against self-hosted providers
-
Network Condition Tests
- Slow responses
- Timeouts
- Connection failures
- Partial responses
End-to-End Tests
- Full Flow Tests
- Discovery → Verification → Caching
- Multiple users with different providers
- Provider switching scenarios
Monitoring and Debugging
Metrics to Track
- Discovery success/failure rate
- Average discovery latency
- Cache hit ratio
- Token verification latency
- Endpoint availability
Debug Logging
# Discovery
DEBUG: Fetching profile URL: https://alice.example.com/
DEBUG: Found Link header: <https://auth.alice.net/token>; rel="token_endpoint"
DEBUG: Discovered token endpoint: https://auth.alice.net/token
# Verification
DEBUG: Verifying token for claimed identity: https://alice.example.com/
DEBUG: Using cached endpoint: https://auth.alice.net/token
DEBUG: Token verification successful, scopes: ['create', 'update']
# Caching
DEBUG: Caching endpoints for https://alice.example.com/ (TTL: 3600s)
DEBUG: Token verification cached (TTL: 300s)
Common Issues and Solutions
Issue 1: No Endpoints Found
Symptom: "No token endpoint found for user"
Causes:
- User hasn't set up IndieAuth on their profile
- Profile URL returns wrong Content-Type
- Link elements have typos
Solution:
- Provide clear error message
- Link to IndieAuth setup documentation
- Log details for debugging
Issue 2: Verification Timeouts
Symptom: "Authorization server is unreachable"
Causes:
- Auth server is down
- Network issues
- Firewall blocking requests
Solution:
- Implement retries with backoff
- Cache successful verifications
- Provide status page for auth server health
Issue 3: Cache Invalidation
Symptom: User changed provider but old one still used
Causes:
- Endpoints still cached
- TTL too long
Solution:
- Provide manual cache clear option
- Reduce TTL if needed
- Clear cache on errors
Appendix: Example Discoveries
Example 1: IndieAuth.com User
<!-- https://user.example.com/ -->
<link rel="authorization_endpoint" href="https://indieauth.com/auth">
<link rel="token_endpoint" href="https://tokens.indieauth.com/token">
Example 2: Self-Hosted
<!-- https://alice.example.com/ -->
<link rel="authorization_endpoint" href="https://alice.example.com/auth">
<link rel="token_endpoint" href="https://alice.example.com/token">
Example 3: Link Headers
HTTP/1.1 200 OK
Link: <https://auth.provider.com/authorize>; rel="authorization_endpoint",
<https://auth.provider.com/token>; rel="token_endpoint"
Content-Type: text/html
<!-- No link elements needed in HTML -->
Example 4: Relative URLs
<!-- https://bob.example.org/ -->
<link rel="authorization_endpoint" href="/auth/authorize">
<link rel="token_endpoint" href="/auth/token">
<!-- Resolves to https://bob.example.org/auth/authorize -->
<!-- Resolves to https://bob.example.org/auth/token -->
Document Version: 1.0 Created: 2024-11-24 Purpose: Correct implementation of IndieAuth endpoint discovery Status: Authoritative guide for implementation