Phil Skentelbery a7e0af9c2c docs: Add complete documentation for v1.0.0-rc.5 hotfix
Complete architectural documentation for:
- Migration race condition fix with database locking
- IndieAuth endpoint discovery implementation
- Security considerations and migration guides

New documentation:
- ADR-030-CORRECTED: IndieAuth endpoint discovery decision
- ADR-031: Endpoint discovery implementation details
- Architecture docs on endpoint discovery
- Migration guide for removed TOKEN_ENDPOINT
- Security analysis of endpoint discovery
- Implementation and analysis reports
2025-11-24 20:20:00 -07:00

IndieAuth Endpoint Discovery Architecture

Overview

This document describes the CORRECT implementation of IndieAuth endpoint discovery for StarPunk. It replaces an earlier design, based on a fundamental misunderstanding, in which the authorization and token endpoints were hardcoded instead of being discovered dynamically.

Core Principle

Endpoints are NEVER hardcoded. They are ALWAYS discovered from the user's profile URL.

Discovery Process

Step 1: Profile URL Fetching

When discovering endpoints for a user (e.g., https://alice.example.com/):

GET / HTTP/1.1
Host: alice.example.com
Accept: text/html
User-Agent: StarPunk/1.0
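
The request above could be built with Python's standard library roughly as follows. This is a minimal sketch: the helper name `build_profile_request` is hypothetical, and sending the request, timeouts, and redirect handling are deliberately left out.

```python
from urllib.request import Request


def build_profile_request(profile_url: str) -> Request:
    """Build (but do not send) the GET request for a profile URL."""
    return Request(
        profile_url,
        headers={"Accept": "text/html", "User-Agent": "StarPunk/1.0"},
        method="GET",
    )


req = build_profile_request("https://alice.example.com/")
```

Passing `req` to `urllib.request.urlopen` with an explicit `timeout` would perform the fetch.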

Step 2: Endpoint Extraction

Check in priority order:

2.1 HTTP Link Headers (Highest Priority)

Link: <https://auth.example.com/authorize>; rel="authorization_endpoint",
      <https://auth.example.com/token>; rel="token_endpoint"

2.2 HTML Link Elements

<link rel="authorization_endpoint" href="https://auth.example.com/authorize">
<link rel="token_endpoint" href="https://auth.example.com/token">

2.3 IndieAuth Metadata (Optional)

<link rel="indieauth-metadata" href="https://auth.example.com/.well-known/indieauth-metadata">
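
One way to apply this priority order is sketched below, using only the standard library. The function names and the `RELS` tuple are illustrative assumptions, the Link header parser is simplified (it does not handle commas inside quoted parameters), and the result is keyed by the raw rel value, so the metadata entry appears as `indieauth-metadata`.

```python
import re
from html.parser import HTMLParser
from typing import Dict

# Rel values of interest (illustrative constant, not from the source).
RELS = ("authorization_endpoint", "token_endpoint", "indieauth-metadata")


def parse_link_header(header: str) -> Dict[str, str]:
    """Extract IndieAuth rels from an HTTP Link header value."""
    found: Dict[str, str] = {}
    # Each link looks like: <url>; rel="name other-name"
    for url, params in re.findall(r"<([^>]*)>([^,<]*)", header):
        match = re.search(r'rel\s*=\s*"?([^";]+)"?', params, re.IGNORECASE)
        if not match:
            continue
        for rel in match.group(1).split():
            if rel in RELS and rel not in found:
                found[rel] = url  # first occurrence wins
    return found


class _LinkCollector(HTMLParser):
    """Collects href values of <link> elements with a relevant rel."""

    def __init__(self) -> None:
        super().__init__()
        self.found: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        for rel in (attr.get("rel") or "").split():
            if rel in RELS and rel not in self.found:
                self.found[rel] = href


def extract_from_html(html: str) -> Dict[str, str]:
    collector = _LinkCollector()
    collector.feed(html)
    return collector.found


def extract_endpoints(link_header: str, html: str) -> Dict[str, str]:
    """Merge both sources; Link header values override HTML values."""
    merged = extract_from_html(html)
    merged.update(parse_link_header(link_header))
    return merged
```

The returned URLs are still unresolved at this point and may be relative.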

Step 3: URL Resolution

All discovered URLs must be resolved relative to the profile URL:

  • Absolute URL: Use as-is
  • Relative URL: Resolve against profile URL
  • Protocol-relative: Inherit profile URL protocol
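
All three cases are covered by `urllib.parse.urljoin` from the Python standard library. The sketch below assumes `resolve_url` is a thin wrapper around it; the URLs are examples only.

```python
from urllib.parse import urljoin


def resolve_url(url: str, base: str) -> str:
    """Resolve a possibly relative endpoint URL against the profile URL."""
    return urljoin(base, url)


profile = "https://bob.example.org/"

absolute = resolve_url("https://auth.example.com/token", profile)
relative = resolve_url("/auth/token", profile)
protocol_relative = resolve_url("//auth.example.com/token", profile)
```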

Token Verification Architecture

The Problem

When the Micropub endpoint receives a token, it must verify it. But against which token endpoint?

The Solution

┌─────────────────┐
│ Micropub Request│
│ Bearer: xxxxx   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Extract Token   │
└────────┬────────┘
         │
         ▼
┌─────────────────────────┐
│ Determine User Identity │
│ (from token or cache)   │
└────────┬────────────────┘
         │
         ▼
┌──────────────────────┐
│ Discover Endpoints   │
│ from User Profile    │
└────────┬─────────────┘
         │
         ▼
┌──────────────────────┐
│ Verify with          │
│ Discovered Endpoint  │
└────────┬─────────────┘
         │
         ▼
┌──────────────────────┐
│ Validate Response    │
│ - Check 'me' URL     │
│ - Check scopes       │
└──────────────────────┘
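
The final "Validate Response" step might look like the sketch below. It assumes the token endpoint returns a JSON object with a `me` URL and a space-separated `scope` string; the function names and the loose URL normalization (lowercased scheme and host, empty path treated as `/`, query ignored) are illustrative choices rather than StarPunk's actual rules.

```python
from typing import Any, Dict
from urllib.parse import urlsplit


def normalize_me(url: str) -> str:
    """Loosely normalize a profile URL for comparison."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def validate_token_response(
    response: Dict[str, Any], expected_me: str, required_scope: str
) -> bool:
    """Return True only if 'me' matches and the required scope is granted."""
    me = response.get("me")
    if not me or normalize_me(me) != normalize_me(expected_me):
        return False
    scopes = str(response.get("scope", "")).split()
    return required_scope in scopes
```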

Implementation Components

1. Endpoint Discovery Module

from typing import Dict


class EndpointDiscovery:
    """
    Discovers IndieAuth endpoints from profile URLs
    """

    def discover(self, profile_url: str) -> Dict[str, str]:
        """
        Discover endpoints from a profile URL

        Returns:
            {
                'authorization_endpoint': 'https://...',
                'token_endpoint': 'https://...',
                'indieauth_metadata': 'https://...'  # optional
            }
        """

    def parse_link_header(self, header: str) -> Dict[str, str]:
        """Parse HTTP Link header for endpoints"""

    def extract_from_html(self, html: str, base_url: str) -> Dict[str, str]:
        """Extract endpoints from HTML link elements"""

    def resolve_url(self, url: str, base: str) -> str:
        """Resolve potentially relative URL against base"""

2. Token Verification Module

from typing import Optional


class TokenVerifier:
    """
    Verifies tokens using discovered endpoints
    """

    def __init__(self, discovery: EndpointDiscovery, cache: EndpointCache):
        self.discovery = discovery
        self.cache = cache

    def verify(self, token: str, expected_me: Optional[str] = None) -> TokenInfo:
        """
        Verify a token using endpoint discovery

        Args:
            token: The bearer token to verify
            expected_me: Optional expected 'me' URL

        Returns:
            TokenInfo with 'me', 'scope', 'client_id', etc.
        """

    def introspect_token(self, token: str, endpoint: str) -> dict:
        """Call token endpoint to verify token"""

3. Caching Layer

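from typing import Dict, Optional


class EndpointCache:
    """
    Caches discovered endpoints for performance
    """

    def __init__(self, endpoint_ttl: int = 3600, token_ttl: int = 300):
        self.endpoint_cache = {}  # profile_url -> (endpoints, expiry)
        self.token_cache = {}     # token_hash -> (info, expiry)
        self.endpoint_ttl = endpoint_ttl  # seconds, see ENDPOINT_CACHE_TTL
        self.token_ttl = token_ttl        # seconds, see TOKEN_CACHE_TTL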

    def get_endpoints(self, profile_url: str) -> Optional[Dict[str, str]]:
        """Get cached endpoints if still valid"""

    def store_endpoints(self, profile_url: str, endpoints: Dict[str, str]):
        """Cache discovered endpoints"""

    def get_token_info(self, token_hash: str) -> Optional[TokenInfo]:
        """Get cached token verification if still valid"""

    def store_token_info(self, token_hash: str, info: TokenInfo):
        """Cache token verification result"""
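
The expiry behaviour behind these methods can be illustrated with a small generic store. This is a sketch under assumptions: `TTLCache` and its injectable `clock` parameter are hypothetical names introduced here to keep the example testable, and size limits and thread safety are omitted.

```python
import time
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

V = TypeVar("V")


class TTLCache(Generic[V]):
    """In-memory key/value store whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self._clock = clock
        self._items: Dict[str, Tuple[V, float]] = {}  # key -> (value, expiry)

    def get(self, key: str) -> Optional[V]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expiry = entry
        if self._clock() >= expiry:
            # Expired: drop the entry so stale data is never returned.
            del self._items[key]
            return None
        return value

    def store(self, key: str, value: V) -> None:
        self._items[key] = (value, self._clock() + self.ttl)
```

An endpoint cache and a token cache could then be two instances with different TTLs.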

Error Handling

Discovery Failures

| Error | Cause | Response |
|-------|-------|----------|
| ProfileUnreachableError | Can't fetch profile URL | 503 Service Unavailable |
| NoEndpointsFoundError | No endpoints in profile | 400 Bad Request |
| InvalidEndpointError | Malformed endpoint URL | 500 Internal Server Error |
| TimeoutError | Discovery timeout | 504 Gateway Timeout |

Verification Failures

| Error | Cause | Response |
|-------|-------|----------|
| TokenInvalidError | Token rejected by endpoint | 403 Forbidden |
| EndpointUnreachableError | Can't reach token endpoint | 503 Service Unavailable |
| ScopeMismatchError | Token lacks required scope | 403 Forbidden |
| MeMismatchError | Token 'me' doesn't match expected | 403 Forbidden |
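
The error names and status codes listed for discovery and verification failures could be expressed as an exception hierarchy, sketched below. The base class `IndieAuthError`, the `status_code` attribute, and the helper `status_for` are assumptions made for illustration. The discovery timeout is named `DiscoveryTimeoutError` here only to avoid shadowing Python's built-in `TimeoutError`.

```python
class IndieAuthError(Exception):
    """Hypothetical base class; unknown failures map to 500."""
    status_code = 500


# Discovery failures
class ProfileUnreachableError(IndieAuthError):
    status_code = 503


class NoEndpointsFoundError(IndieAuthError):
    status_code = 400


class InvalidEndpointError(IndieAuthError):
    status_code = 500


class DiscoveryTimeoutError(IndieAuthError):
    status_code = 504


# Verification failures
class TokenInvalidError(IndieAuthError):
    status_code = 403


class EndpointUnreachableError(IndieAuthError):
    status_code = 503


class ScopeMismatchError(IndieAuthError):
    status_code = 403


class MeMismatchError(IndieAuthError):
    status_code = 403


def status_for(error: Exception) -> int:
    """Map any exception to the HTTP status the request handler should return."""
    return getattr(error, "status_code", 500)
```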

Security Considerations

1. HTTPS Enforcement

  • Profile URLs SHOULD use HTTPS
  • Discovered endpoints MUST use HTTPS
  • Reject non-HTTPS endpoints in production
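
A check along these lines could gate every discovered endpoint. This is a sketch: the function name and the `allow_http` flag (intended for local development only) are assumptions, not existing StarPunk settings.

```python
from urllib.parse import urlsplit


def is_acceptable_endpoint(url: str, allow_http: bool = False) -> bool:
    """Accept well-formed HTTPS URLs; plain HTTP only when explicitly allowed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises on some malformed inputs, e.g. a broken IPv6 host.
        return False
    if not parts.hostname:
        return False
    if parts.scheme == "https":
        return True
    return allow_http and parts.scheme == "http"
```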

2. Redirect Limits

  • Maximum 5 redirects when fetching profiles
  • Prevent redirect loops
  • Log suspicious redirect patterns
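
The redirect limit and loop check can be sketched independently of any HTTP library by injecting the fetch step. Here `fetch_once` is a hypothetical callable returning `(status_code, location_header_or_None)` for a single request without following redirects; `TooManyRedirectsError` is likewise an illustrative name.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

MAX_REDIRECTS = 5
REDIRECT_CODES = (301, 302, 303, 307, 308)


class TooManyRedirectsError(Exception):
    pass


def follow_redirects(
    url: str,
    fetch_once: Callable[[str], Tuple[int, Optional[str]]],
    max_redirects: int = MAX_REDIRECTS,
) -> str:
    """Return the final URL, following at most `max_redirects` redirects."""
    seen = {url}
    # One initial fetch plus one fetch per allowed redirect.
    for _ in range(max_redirects + 1):
        status, location = fetch_once(url)
        if status not in REDIRECT_CODES or not location:
            return url
        url = urljoin(url, location)  # Location may be relative
        if url in seen:
            raise TooManyRedirectsError(f"redirect loop at {url}")
        seen.add(url)
    raise TooManyRedirectsError(f"more than {max_redirects} redirects")
```

The returned URL is the one against which relative endpoint URLs should then be resolved.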

3. Cache Poisoning Prevention

  • Validate discovered URLs are well-formed
  • Don't cache error responses
  • Clear cache on configuration changes

4. Token Security

  • Never log tokens in plaintext
  • Hash tokens before caching
  • Use constant-time comparison for token hashes
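
These three rules map onto `hashlib` and `hmac.compare_digest` from the Python standard library. The helper names below are illustrative, and SHA-256 is an assumed choice of hash, since the document does not name one.

```python
import hashlib
import hmac


def hash_token(token: str) -> str:
    """Return a hex digest suitable for use as a cache key."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def hashes_match(a: str, b: str) -> bool:
    """Compare two hex digests in constant time."""
    return hmac.compare_digest(a, b)


def redact(token: str) -> str:
    """Short, non-reversible label that is safe to write to logs."""
    return f"token:sha256:{hash_token(token)[:8]}"
```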

Performance Optimization

Caching Strategy

┌─────────────────────────────────────┐
│         First Request               │
│  Discovery: ~500ms                  │
│  Verification: ~200ms               │
│  Total: ~700ms                      │
└─────────────────────────────────────┘
                │
                ▼
┌─────────────────────────────────────┐
│      Subsequent Requests            │
│  Cached Endpoints: ~1ms             │
│  Cached Token: ~1ms                 │
│  Total: ~2ms                        │
└─────────────────────────────────────┘

Cache Configuration

# Endpoint cache (user rarely changes provider)
ENDPOINT_CACHE_TTL=3600  # 1 hour

# Token cache (balance security and performance)
TOKEN_CACHE_TTL=300      # 5 minutes

# Cache sizes
MAX_ENDPOINT_CACHE_SIZE=1000
MAX_TOKEN_CACHE_SIZE=10000
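
Reading these settings from the environment, with the defaults shown above, might look like the following sketch. `CacheConfig` and `load_cache_config` are hypothetical names; only the four variable names and their defaults come from this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CacheConfig:
    endpoint_ttl: int = 3600      # seconds
    token_ttl: int = 300          # seconds
    max_endpoints: int = 1000
    max_tokens: int = 10000


def load_cache_config(env: Mapping[str, str] = os.environ) -> CacheConfig:
    """Build a CacheConfig from environment-style settings, applying defaults."""
    return CacheConfig(
        endpoint_ttl=int(env.get("ENDPOINT_CACHE_TTL", "3600")),
        token_ttl=int(env.get("TOKEN_CACHE_TTL", "300")),
        max_endpoints=int(env.get("MAX_ENDPOINT_CACHE_SIZE", "1000")),
        max_tokens=int(env.get("MAX_TOKEN_CACHE_SIZE", "10000")),
    )
```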

Migration Path

From Incorrect Hardcoded Implementation

  1. Remove hardcoded endpoint configuration
  2. Implement discovery module
  3. Update token verification to use discovery
  4. Add caching layer
  5. Update documentation

Configuration Changes

Before (WRONG):

TOKEN_ENDPOINT=https://tokens.indieauth.com/token
AUTHORIZATION_ENDPOINT=https://indieauth.com/auth

After (CORRECT):

ADMIN_ME=https://admin.example.com/
# Endpoints discovered automatically from ADMIN_ME
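
During migration, a startup check could flag leftover settings from the old configuration. The sketch below is an assumption about how such a check might be written; `check_config` and the warning text are illustrative.

```python
from typing import List, Mapping

# Settings from the old, hardcoded configuration.
DEPRECATED_KEYS = ("TOKEN_ENDPOINT", "AUTHORIZATION_ENDPOINT")


def check_config(env: Mapping[str, str]) -> List[str]:
    """Return warnings for deprecated keys; raise if ADMIN_ME is missing."""
    if not env.get("ADMIN_ME"):
        raise ValueError("ADMIN_ME must be set")
    return [
        f"{key} is no longer used; endpoints are discovered from ADMIN_ME"
        for key in DEPRECATED_KEYS
        if key in env
    ]
```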

Testing Strategy

Unit Tests

  1. Discovery Tests

    • Parse various Link header formats
    • Extract from different HTML structures
    • Handle malformed responses
    • URL resolution edge cases
  2. Cache Tests

    • TTL expiration
    • Cache invalidation
    • Size limits
    • Concurrent access
  3. Security Tests

    • HTTPS enforcement
    • Redirect limit enforcement
    • Cache poisoning attempts

Integration Tests

  1. Real Provider Tests

    • Test against indieauth.com
    • Test against indie-auth.com
    • Test against self-hosted providers
  2. Network Condition Tests

    • Slow responses
    • Timeouts
    • Connection failures
    • Partial responses

End-to-End Tests

  1. Full Flow Tests
    • Discovery → Verification → Caching
    • Multiple users with different providers
    • Provider switching scenarios

Monitoring and Debugging

Metrics to Track

  • Discovery success/failure rate
  • Average discovery latency
  • Cache hit ratio
  • Token verification latency
  • Endpoint availability

Debug Logging

# Discovery
DEBUG: Fetching profile URL: https://alice.example.com/
DEBUG: Found Link header: <https://auth.alice.net/token>; rel="token_endpoint"
DEBUG: Discovered token endpoint: https://auth.alice.net/token

# Verification
DEBUG: Verifying token for claimed identity: https://alice.example.com/
DEBUG: Using cached endpoint: https://auth.alice.net/token
DEBUG: Token verification successful, scopes: ['create', 'update']

# Caching
DEBUG: Caching endpoints for https://alice.example.com/ (TTL: 3600s)
DEBUG: Token verification cached (TTL: 300s)

Common Issues and Solutions

Issue 1: No Endpoints Found

Symptom: "No token endpoint found for user"

Causes:

  • User hasn't set up IndieAuth on their profile
  • Profile URL returns wrong Content-Type
  • Link elements have typos

Solution:

  • Provide clear error message
  • Link to IndieAuth setup documentation
  • Log details for debugging

Issue 2: Verification Timeouts

Symptom: "Authorization server is unreachable"

Causes:

  • Auth server is down
  • Network issues
  • Firewall blocking requests

Solution:

  • Implement retries with backoff
  • Cache successful verifications
  • Provide status page for auth server health
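
Retries with exponential backoff could be sketched as below. The attempt count, base delay, and the set of retried exception types are assumed defaults, and the `sleep` parameter is injectable only so the behaviour can be tested without waiting.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call `operation`, retrying on transient errors with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

With the defaults, a call that fails twice and then succeeds waits 0.5 s and then 1.0 s.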

Issue 3: Cache Invalidation

Symptom: User changed provider but old one still used

Causes:

  • Endpoints still cached
  • TTL too long

Solution:

  • Provide manual cache clear option
  • Reduce TTL if needed
  • Clear cache on errors

Appendix: Example Discoveries

Example 1: IndieAuth.com User

<!-- https://user.example.com/ -->
<link rel="authorization_endpoint" href="https://indieauth.com/auth">
<link rel="token_endpoint" href="https://tokens.indieauth.com/token">

Example 2: Self-Hosted

<!-- https://alice.example.com/ -->
<link rel="authorization_endpoint" href="https://alice.example.com/auth">
<link rel="token_endpoint" href="https://alice.example.com/token">

Example 3: HTTP Link Headers

HTTP/1.1 200 OK
Link: <https://auth.provider.com/authorize>; rel="authorization_endpoint",
      <https://auth.provider.com/token>; rel="token_endpoint"
Content-Type: text/html

<!-- No link elements needed in HTML -->

Example 4: Relative URLs

<!-- https://bob.example.org/ -->
<link rel="authorization_endpoint" href="/auth/authorize">
<link rel="token_endpoint" href="/auth/token">
<!-- Resolves to https://bob.example.org/auth/authorize -->
<!-- Resolves to https://bob.example.org/auth/token -->

Document Version: 1.0
Created: 2024-11-24
Purpose: Correct implementation of IndieAuth endpoint discovery
Status: Authoritative guide for implementation