StarPunk/docs/architecture/endpoint-discovery-answers.md
Phil Skentelbery a7e0af9c2c docs: Add complete documentation for v1.0.0-rc.5 hotfix
2025-11-24 20:20:00 -07:00

IndieAuth Endpoint Discovery: Definitive Implementation Answers

Date: 2025-11-24
Architect: StarPunk Software Architect
Status: APPROVED FOR IMPLEMENTATION
Target Version: 1.0.0-rc.5


Executive Summary

These are definitive answers to the developer's 10 questions about IndieAuth endpoint discovery implementation. The developer should implement exactly as specified here.


CRITICAL ANSWERS (Blocking Implementation)

Answer 1: The "Which Endpoint?" Problem

DEFINITIVE ANSWER: For StarPunk V1 (single-user CMS), ALWAYS use ADMIN_ME for endpoint discovery.

Your proposed solution is 100% CORRECT:

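from typing import Any, Dict, Optional

import httpx
from flask import current_app  # assumes the Flask app context used elsewhere in this document

def verify_external_token(token: str) -> Optional[Dict[str, Any]]:
    """Verify token for the admin user (single-user V1 assumption)"""
    admin_me = current_app.config.get("ADMIN_ME")

    # ALWAYS discover endpoints from ADMIN_ME profile
    endpoints = discover_endpoints(admin_me)
    token_endpoint = endpoints['token_endpoint']

    # Verify token with discovered endpoint (timeout constant from Answer 6c)
    response = httpx.get(
        token_endpoint,
        headers={'Authorization': f'Bearer {token}'},
        timeout=VERIFICATION_TIMEOUT,
    )
    # A non-200 response carries no usable token info, so fail closed
    if response.status_code != 200:
        raise TokenVerificationError("Token endpoint rejected the token")

    token_info = response.json()

    # Validate token belongs to admin
    if normalize_url(token_info['me']) != normalize_url(admin_me):
        raise TokenVerificationError("Token not for admin user")

    return token_info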

Rationale:

  • StarPunk V1 is explicitly single-user
  • Only the admin (ADMIN_ME) can post to the CMS
  • Any token not belonging to ADMIN_ME is invalid by definition
  • This eliminates the chicken-and-egg problem completely

Important: Document this single-user assumption clearly in the code comments. When V2 adds multi-user support, this will need revisiting.

Answer 2a: Cache Structure

DEFINITIVE ANSWER: Use a SIMPLE cache for V1 single-user.

class EndpointCache:
    def __init__(self):
        # Simple cache for single-user V1
        self.endpoints = None
        self.endpoints_expire = 0
        self.token_cache = {}  # token_hash -> (info, expiry)

Rationale:

  • We only have one user (ADMIN_ME) in V1
  • No need for profile_url -> endpoints mapping
  • Simplest solution that works
  • Easy to upgrade to dict-based for V2 multi-user

Answer 3a: BeautifulSoup4 Dependency

DEFINITIVE ANSWER: YES, add BeautifulSoup4 as a dependency.

# pyproject.toml
[project]
dependencies = [
    "beautifulsoup4>=4.12.0",
]

Rationale:

  • Industry standard for HTML parsing
  • More robust than regex or built-in parser
  • Pure Python (with html.parser backend)
  • Well-maintained and documented
  • Worth the dependency for correctness

IMPORTANT ANSWERS (Affects Quality)

Answer 2b: Token Hashing

DEFINITIVE ANSWER: YES, hash tokens with SHA-256.

token_hash = hashlib.sha256(token.encode()).hexdigest()

Rationale:

  • Prevents tokens appearing in logs
  • Fixed-length cache keys
  • Security best practice
  • NO need for HMAC (we're not signing, just hashing)
  • NO need for constant-time comparison (cache lookup, not authentication)
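
The hashing rule above combines naturally with the 5-minute token TTL from Answer 8b. The following is a minimal sketch of such a token cache, not the project's implementation: the `TokenCache` class, its method names, and the injectable `clock` parameter are all hypothetical and exist only to make the idea concrete and testable.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple

TOKEN_TTL = 300  # seconds; the 5-minute verification TTL from Answer 8b


class TokenCache:
    """Hypothetical helper: caches verification results keyed by SHA-256 hash."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[Dict[str, Any], float]] = {}

    @staticmethod
    def _key(token: str) -> str:
        # The raw token is never stored, only its fixed-length hex digest
        return hashlib.sha256(token.encode()).hexdigest()

    def get(self, token: str) -> Optional[Dict[str, Any]]:
        key = self._key(token)
        entry = self._entries.get(key)
        if entry is None:
            return None
        info, expiry = entry
        if self._clock() >= expiry:
            # Expired: drop the entry so stale results are not reused
            del self._entries[key]
            return None
        return info

    def set(self, token: str, info: Dict[str, Any], ttl: int = TOKEN_TTL) -> None:
        self._entries[self._key(token)] = (info, self._clock() + ttl)
```

Because the key is a digest, dumping or logging the cache contents would not reveal a usable bearer token.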

Answer 2c: Cache Invalidation

DEFINITIVE ANSWER: The cache is cleared in only two cases:

  1. Application startup (the cache is in-memory, so a restart empties it)
  2. TTL expiry (automatic)

It is NOT cleared on failures (they could be transient network issues), and NO manual invalidation endpoint is needed for V1.

Answer 2d: Cache Storage

DEFINITIVE ANSWER: Custom EndpointCache class with simple dict.

import time

class EndpointCache:
    """Simple in-memory cache with TTL support"""

    def __init__(self):
        self.endpoints = None
        self.endpoints_expire = 0
        self.token_cache = {}

    def get_endpoints(self, ignore_expiry=False):
        # ignore_expiry=True returns stale data for the grace period in Answer 6a
        if ignore_expiry or time.time() < self.endpoints_expire:
            return self.endpoints
        return None

    def set_endpoints(self, endpoints, ttl=3600):
        self.endpoints = endpoints
        self.endpoints_expire = time.time() + ttl

Rationale:

  • Simple and explicit
  • No external dependencies
  • Easy to test
  • Clear TTL handling

Answer 3b: HTML Validation

DEFINITIVE ANSWER: Handle malformed HTML gracefully.

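from bs4 import BeautifulSoup

def _parse_html_links(html: str) -> dict:  # function name is illustrative
    endpoints = {}
    try:
        soup = BeautifulSoup(html, 'html.parser')
        # Look for links in both head and body (be liberal)
        for link in soup.find_all('link', rel=True):
            href = link.get('href')
            for rel in link['rel']:  # BeautifulSoup returns rel as a list of values
                if href and rel in ('authorization_endpoint', 'token_endpoint'):
                    endpoints.setdefault(rel, href)  # first occurrence wins
    except Exception as e:
        logger.warning(f"HTML parsing failed: {e}")
        return {}  # Return empty, don't crash
    return endpoints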

Answer 3c: Case Sensitivity

DEFINITIVE ANSWER: BeautifulSoup handles this correctly by default. No special handling needed.

Answer 4a: Link Header Parsing

DEFINITIVE ANSWER: Use simple regex, document limitations.

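def _parse_link_header(self, header: str) -> Dict[str, str]:
    """Parse Link header (basic RFC 8288 support)

    Note: Only supports quoted rel values, single Link headers
    """
    pattern = r'<([^>]+)>;\s*rel="([^"]+)"'
    matches = re.findall(pattern, header)
    endpoints = {}
    for url, rels in matches:
        # A rel attribute may hold several space-separated values
        for rel in rels.split():
            endpoints.setdefault(rel, url)  # first occurrence wins
    return endpoints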

Rationale:

  • Simple implementation for V1
  • Document limitations clearly
  • Can upgrade if needed later
  • Avoids additional dependencies
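
To make the documented limitations concrete, here is a standalone sketch that applies the same regex outside the class. The function name `parse_link_header` and the `auth.example.com` URLs are illustrative assumptions, not part of the StarPunk codebase.

```python
import re
from typing import Dict

# Same pattern as above: matches <url>; rel="value". Unquoted rel values and
# links with other parameters placed before rel are deliberately not handled.
_LINK_PATTERN = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def parse_link_header(header: str) -> Dict[str, str]:
    endpoints: Dict[str, str] = {}
    for url, rels in _LINK_PATTERN.findall(header):
        # A single rel attribute may list several space-separated relations
        for rel in rels.split():
            endpoints.setdefault(rel, url)  # first occurrence wins
    return endpoints


# Two links delivered in one comma-separated header value
header = (
    '<https://auth.example.com/authorize>; rel="authorization_endpoint", '
    '<https://auth.example.com/token>; rel="token_endpoint"'
)
endpoints = parse_link_header(header)
```

An unquoted value such as `rel=token_endpoint` is silently skipped by this pattern, which is the limitation the docstring asks to be documented.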

Answer 4b: Multiple Headers

DEFINITIVE ANSWER: Your regex with re.findall() is correct. It handles both cases.

Answer 4c: Priority Order

DEFINITIVE ANSWER: Option B - Merge with Link header overwriting HTML.

endpoints = {}
# First get from HTML
endpoints.update(html_endpoints)
# Then overwrite with Link headers (higher priority)
endpoints.update(link_header_endpoints)
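
As a quick illustration of the merge order, the sketch below wraps the same two `update()` calls in a function and runs it on made-up data. The `merge_endpoints` name and the example URLs are hypothetical.

```python
from typing import Dict


def merge_endpoints(
    html_endpoints: Dict[str, str], link_header_endpoints: Dict[str, str]
) -> Dict[str, str]:
    """Hypothetical helper: Link header values overwrite HTML values."""
    endpoints: Dict[str, str] = {}
    endpoints.update(html_endpoints)         # lower priority
    endpoints.update(link_header_endpoints)  # higher priority, wins on conflict
    return endpoints


merged = merge_endpoints(
    {
        "authorization_endpoint": "https://html.example.com/auth",
        "token_endpoint": "https://html.example.com/token",
    },
    {"token_endpoint": "https://header.example.com/token"},
)
```

Here `token_endpoint` comes from the Link header, while `authorization_endpoint` survives from the HTML because the header did not provide one.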

Answer 5a: URL Validation

DEFINITIVE ANSWER: Validate with these checks:

def validate_endpoint_url(url: str) -> bool:
    parsed = urlparse(url)

    # Must be absolute
    if not parsed.scheme or not parsed.netloc:
        raise DiscoveryError("Invalid URL format")

    # HTTPS required in production
    if not current_app.debug and parsed.scheme != 'https':
        raise DiscoveryError("HTTPS required in production")

    # Allow localhost only in debug mode
    if not current_app.debug and parsed.hostname in ['localhost', '127.0.0.1', '::1']:
        raise DiscoveryError("Localhost not allowed in production")

    return True
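
For unit testing it can help to exercise these checks without a Flask application context. The variant below is a sketch under that assumption: it takes `debug` as an explicit parameter instead of reading `current_app.debug`, and it defines a stand-in `DiscoveryError` so the snippet runs on its own.

```python
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


class DiscoveryError(Exception):
    """Stand-in for the project's exception of the same name."""


def validate_endpoint_url(url: str, debug: bool = False) -> bool:
    parsed = urlparse(url)

    # Must be absolute: both a scheme and a host are required
    if not parsed.scheme or not parsed.netloc:
        raise DiscoveryError("Invalid URL format")

    # HTTPS required outside debug mode
    if not debug and parsed.scheme != "https":
        raise DiscoveryError("HTTPS required in production")

    # Loopback hosts are only acceptable in debug mode
    if not debug and parsed.hostname in LOCAL_HOSTS:
        raise DiscoveryError("Localhost not allowed in production")

    return True
```

Note that `parsed.hostname` strips the brackets from IPv6 literals, so `https://[::1]/token` is matched by the `::1` entry.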

Answer 5b: URL Normalization

DEFINITIVE ANSWER: Normalize only for comparison, not storage.

def normalize_url(url: str) -> str:
    """Normalize URL for comparison only"""
    return url.rstrip("/").lower()

Store endpoints as discovered, normalize only when comparing.
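
One caveat worth checking during implementation: calling `.lower()` on the whole URL also folds the case of the path, so `https://example.com/Alice` and `https://example.com/alice` compare equal. If that is not desired, a stricter comparison could lowercase only the scheme and host. The sketch below is one possible variant, not part of the approved design; the name `normalize_url_strict` is hypothetical, and it assumes URLs without userinfo.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url_strict(url: str) -> str:
    """Hypothetical variant: lowercase scheme and host, keep path case."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),     # assumes no case-sensitive userinfo in netloc
        parts.path.rstrip("/"),   # trailing slash is ignored, as in normalize_url
        parts.query,
        parts.fragment,
    ))
```

With this variant, `HTTPS://Example.COM/` and `https://example.com` still compare equal, while profile paths that differ only in case do not.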

Answer 5c: Relative URL Edge Cases

DEFINITIVE ANSWER: Let urljoin() handle it, document behavior.

Python's urljoin() handles the first two cases correctly. For the third (broken) case, let it fail naturally. Don't try to be clever.
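
The specific cases from the developer's question are not reproduced in this document, so the following only illustrates how `urljoin()` resolves common forms of discovered URLs against a profile URL. The example URLs are made up.

```python
from urllib.parse import urljoin

profile_url = "https://example.com/user/"

# Relative path: resolved against the profile's directory
relative = urljoin(profile_url, "token")

# Absolute path: resolved against the profile's origin
absolute_path = urljoin(profile_url, "/auth/token")

# Absolute URL: returned unchanged, the base is ignored
absolute_url = urljoin(profile_url, "https://auth.example.org/token")
```

The resolved values are `https://example.com/user/token`, `https://example.com/auth/token`, and `https://auth.example.org/token` respectively. Whatever `urljoin()` returns should still pass the URL validation from Answer 5a before use.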

Answer 6a: Discovery Failures

DEFINITIVE ANSWER: Fail closed with grace period.

def discover_endpoints(self, profile_url: str) -> Dict[str, str]:
    # Method of the discovery class: needs self for the cache and fetch helper
    try:
        # Try discovery
        endpoints = self._fetch_and_parse(profile_url)
        self.cache.set_endpoints(endpoints)
        return endpoints
    except Exception as e:
        # Check cache even if expired (grace period)
        cached = self.cache.get_endpoints(ignore_expiry=True)
        if cached:
            logger.warning(f"Using expired cache due to discovery failure: {e}")
            return cached
        # No cache, must fail
        raise DiscoveryError(f"Endpoint discovery failed: {e}") from e
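
The grace-period behavior can be exercised in isolation with a fake clock and a fake fetcher. The sketch below is an assumption-laden illustration, not the project's code: `discover_with_grace`, the injectable `clock` and `fetch` parameters, and the stand-in `DiscoveryError` are hypothetical, and unlike the method above it also checks for a fresh cache entry before fetching.

```python
import time
from typing import Callable, Dict, Optional


class DiscoveryError(Exception):
    """Stand-in for the project's exception of the same name."""


class EndpointCache:
    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock  # injectable so tests can control time
        self.endpoints: Optional[Dict[str, str]] = None
        self.endpoints_expire = 0.0

    def get_endpoints(self, ignore_expiry: bool = False) -> Optional[Dict[str, str]]:
        if ignore_expiry or self._clock() < self.endpoints_expire:
            return self.endpoints
        return None

    def set_endpoints(self, endpoints: Dict[str, str], ttl: int = 3600) -> None:
        self.endpoints = endpoints
        self.endpoints_expire = self._clock() + ttl


def discover_with_grace(
    cache: EndpointCache,
    fetch: Callable[[str], Dict[str, str]],
    profile_url: str,
) -> Dict[str, str]:
    fresh = cache.get_endpoints()
    if fresh is not None:
        return fresh  # still within TTL, no network call needed
    try:
        endpoints = fetch(profile_url)
    except Exception as exc:
        # Discovery failed: fall back to expired data if any exists
        stale = cache.get_endpoints(ignore_expiry=True)
        if stale:
            return stale
        # Nothing cached at all, so fail closed
        raise DiscoveryError(f"Endpoint discovery failed: {exc}") from exc
    cache.set_endpoints(endpoints)
    return endpoints
```

The key property is that a failure with an empty cache raises, while a failure after at least one successful discovery keeps serving the last known endpoints.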

Answer 6b: Token Verification Failures

DEFINITIVE ANSWER: Retry ONLY for network errors and 5xx server errors.

def verify_with_retries(endpoint: str, token: str, max_retries: int = 3):
    for attempt in range(max_retries):
        try:
            response = httpx.get(
                endpoint,
                headers={'Authorization': f'Bearer {token}'},
                timeout=VERIFICATION_TIMEOUT,  # constant from Answer 6c
            )
            if response.status_code in [500, 502, 503, 504]:
                # Server error, retry unless this was the last attempt
                if attempt < max_retries - 1:
                    time.sleep(2 ** attempt)  # Exponential backoff
                    continue
            # Success, a client error, or a 5xx on the final attempt
            return response
        except (httpx.TimeoutException, httpx.NetworkError):
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
                continue
            raise

    # For 400/401/403, fail immediately (no retry)
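
The retry policy can be checked without any HTTP library by abstracting the request into a callable. The following sketch makes that assumption explicit: `call_with_retries`, the zero-argument `request` callable returning a status code, and the injectable `sleep` are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = {500, 502, 503, 504}


def call_with_retries(
    request: Callable[[], int],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[type, ...] = (ConnectionError, TimeoutError),
) -> int:
    """Return the final status code from `request`, retrying transient failures."""
    for attempt in range(max_retries):
        last_attempt = attempt == max_retries - 1
        try:
            status = request()
        except retry_on:
            if last_attempt:
                raise  # out of attempts, surface the network error
            sleep(2 ** attempt)  # exponential backoff: 1s, 2s, ...
            continue
        if status in RETRYABLE_STATUS and not last_attempt:
            sleep(2 ** attempt)
            continue
        # 2xx, 4xx, or a 5xx on the final attempt: no further retries
        return status
    raise ValueError("max_retries must be at least 1")
```

With the default of three attempts, the backoff sleeps total at most 1 + 2 = 3 seconds, in addition to the per-request timeout of each attempt.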

Answer 6c: Timeout Configuration

DEFINITIVE ANSWER: Use these timeouts:

DISCOVERY_TIMEOUT = 5.0  # Profile fetch (cached, so can be slower)
VERIFICATION_TIMEOUT = 3.0  # Token verification (every request)

Not configurable in V1. Hardcode with constants.


OTHER ANSWERS

Answer 7a: Test Strategy

DEFINITIVE ANSWER: Unit tests use mocks; add ONE integration test against the real IndieAuth.com.

Answer 7b: Test Fixtures

DEFINITIVE ANSWER: YES, create reusable fixtures.

# tests/fixtures/indieauth_profiles.py
PROFILES = {
    'link_header': {...},
    'html_links': {...},
    'both': {...},
    # etc.
}

Answer 7c: Test Coverage

DEFINITIVE ANSWER:

  • 90%+ coverage for new code
  • All edge cases tested
  • One real integration test

Answer 8a: First Request Latency

DEFINITIVE ANSWER: Accept the delay. Do NOT pre-warm cache.

Rationale:

  • Only happens once per hour
  • Pre-warming adds complexity
  • User can wait 850ms for first post

Answer 8b: Cache TTLs

DEFINITIVE ANSWER: Keep as specified:

  • Endpoints: 3600s (1 hour)
  • Token verifications: 300s (5 minutes)

These are good defaults.

Answer 8c: Concurrent Requests

DEFINITIVE ANSWER: Accept duplicate discoveries for V1.

No locking needed for single-user low-traffic V1.

Answer 9a: Configuration Changes

DEFINITIVE ANSWER: Remove TOKEN_ENDPOINT immediately with deprecation warning.

# config.py
if 'TOKEN_ENDPOINT' in os.environ:
    logger.warning(
        "TOKEN_ENDPOINT is deprecated and ignored. "
        "Remove it from your configuration. "
        "Endpoints are now discovered from ADMIN_ME profile."
    )

Answer 9b: Backward Compatibility

DEFINITIVE ANSWER: Document breaking change in CHANGELOG. No migration script.

We're in the RC phase, so breaking changes are acceptable.

Answer 9c: Health Check

DEFINITIVE ANSWER: NO endpoint discovery in health check.

Too expensive. Health check should be fast.

Answer 10a: Local Development

DEFINITIVE ANSWER: Allow HTTP in debug mode.

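# Allow HTTP only in debug (development) mode; require HTTPS in production.
# Raises DiscoveryError for consistency with validate_endpoint_url in Answer 5a.
if not current_app.debug and parsed.scheme != 'https':
    raise DiscoveryError("HTTPS required in production")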

Answer 10b: Testing with Real Providers

DEFINITIVE ANSWER: Document test setup, skip in CI.

import os

import pytest

@pytest.mark.skipif(
    not os.environ.get('TEST_REAL_INDIEAUTH'),
    reason="Set TEST_REAL_INDIEAUTH=1 to run real provider tests"
)
def test_real_indieauth():
    # Test with real IndieAuth.com
    ...

Implementation Go/No-Go Decision

APPROVED FOR IMPLEMENTATION

You have all the information needed to implement endpoint discovery correctly. Proceed with your Phase 1-5 plan.

Implementation Priorities

  1. FIRST: Implement Question 1 solution (ADMIN_ME discovery)
  2. SECOND: Add BeautifulSoup4 dependency
  3. THIRD: Create EndpointCache class
  4. THEN: Follow your phased implementation plan

Key Implementation Notes

  1. Always use ADMIN_ME for endpoint discovery in V1
  2. Fail closed on security errors
  3. Be liberal in what you accept (HTML parsing)
  4. Be strict in what you validate (URLs, tokens)
  5. Document single-user assumptions clearly
  6. Test edge cases thoroughly

Summary for Quick Reference

| Question | Answer | Implementation |
|---|---|---|
| Q1: Which endpoint? | Always use ADMIN_ME | discover_endpoints(admin_me) |
| Q2a: Cache structure? | Simple for single-user | self.endpoints = None |
| Q3a: Add BeautifulSoup4? | YES | Add to dependencies |
| Q5a: URL validation? | HTTPS in prod, localhost in dev | Check with current_app.debug |
| Q6a: Error handling? | Fail closed with cache grace | Try cache on failure |
| Q6b: Retry logic? | Only for network errors and 5xx responses | 3 attempts with backoff |
| Q9a: Remove TOKEN_ENDPOINT? | Yes with warning | Deprecation message |

This document provides definitive answers. Implement as specified. No further architectural review needed before coding.

Document Version: 1.0
Status: FINAL
Next Step: Begin implementation immediately