IndieAuth Provider Removal - Implementation Guide

Executive Summary

This document provides complete architectural guidance for removing the internal IndieAuth provider functionality from StarPunk while maintaining external IndieAuth integration for token verification. All questions have been answered based on the IndieAuth specification and architectural principles.

Answers to Critical Questions

Q1: External Token Endpoint Response Format ✓

Answer: The user is correct. The IndieAuth specification (W3C) defines exact response formats.

Token Verification Response (per spec section 6.3.4):

{
  "me": "https://user.example.net/",
  "client_id": "https://app.example.com/",
  "scope": "create update delete"
}

Key Points:

  • Response is JSON with required fields: me, client_id, scope
  • Additional fields may be present but should be ignored
  • For invalid tokens, the token endpoint responds with HTTP 400, 401, or 403
  • The me field MUST match the configured admin identity
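
The key points above can be sketched as a small validation step. This is a minimal illustration, not StarPunk's actual code: the names normalize_me and validate_token_response are hypothetical, and the normalization rule (lower-case scheme and host, empty path treated as "/") is an assumption about how the configured identity should be compared.

```python
from typing import Optional
from urllib.parse import urlsplit


def normalize_me(url: str) -> str:
    """Lower-case scheme and host and treat an empty path as "/" (assumed rule)."""
    parts = urlsplit(url.strip())
    normalized = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path or '/'}"
    if parts.query:
        normalized += f"?{parts.query}"
    return normalized


def validate_token_response(payload: dict, expected_me: str) -> Optional[dict]:
    """Return the three required fields if the response is acceptable, else None."""
    required = ("me", "client_id", "scope")
    # All required fields must be present and be strings.
    if not all(isinstance(payload.get(key), str) for key in required):
        return None
    # The token must belong to the configured admin identity.
    if normalize_me(payload["me"]) != normalize_me(expected_me):
        return None
    # Additional fields in the payload are ignored by copying only what is needed.
    return {
        "me": payload["me"],
        "client_id": payload["client_id"],
        "scope": payload["scope"].split(),
    }
```

Returning only the three known fields keeps any extra provider-specific data out of the rest of the application.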

Q2: HTML Discovery Headers ✓

Answer: The user refers to how users configure their personal domains to point to IndieAuth providers.

What Users Add to Their HTML (per spec sections 4.1, 5.1, 6.1):

<!-- In the <head> of the user's personal website -->
<link rel="authorization_endpoint" href="https://indielogin.com/auth">
<link rel="token_endpoint" href="https://tokens.indieauth.com/token">
<link rel="micropub" href="https://your-starpunk.example.com/api/micropub">

Key Points:

  • These links go on the USER'S personal website, NOT in StarPunk
  • StarPunk doesn't generate these - it discovers them from user URLs
  • Users choose their own authorization/token providers
  • StarPunk only needs to know the user's identity URL (configured as ADMIN_ME)
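
Discovering those links from a fetched profile page could look roughly like the sketch below. It uses only the standard library; RelLinkParser and find_rel_links are hypothetical names introduced for illustration, and keeping the first href per rel value is an assumption.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect the first href seen for each rel value on <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated list; keep the first href for each value.
        for rel in (attr.get("rel") or "").split():
            self.links.setdefault(rel.lower(), href)


def find_rel_links(html: str) -> dict:
    """Return a mapping of rel value to href for all <link> elements."""
    parser = RelLinkParser()
    parser.feed(html)
    return parser.links
```

The returned href may be relative, so callers would still need to resolve it against the URL the page was fetched from.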

Q3: Migration Strategy - ARCHITECTURAL DECISION

Answer: Keep migration 002 but clarify its purpose.

Decision:

  1. Keep Migration 002 - The tables will be needed if V2 adds an internal provider
  2. Rename/Document - Clarify that these tables are for future internal provider support
  3. No Production Impact - Tables remain empty in V1, cause no harm

Rationale:

  • The tokens table with secure hash storage is good future-proofing
  • The authorization_codes table will be needed if V2 adds internal provider
  • Empty tables have zero performance impact
  • Removing and re-adding later creates unnecessary migration complexity
  • Document clearly that these are unused in V1

Implementation:

-- Add comment to migration 002
-- These tables are created for future V2 internal provider support
-- In V1, StarPunk only verifies external tokens via HTTP, not database

Q4: Error Handling ✓

Answer: The user provided clear guidance - display informative error messages.

Error Handling Strategy:

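import httpx


class TokenEndpointError(Exception):
    """Raised when the token endpoint is unreachable or responds unexpectedly."""


def verify_token(bearer_token, token_endpoint):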
    try:
        response = httpx.get(
            token_endpoint,
            headers={'Authorization': f'Bearer {bearer_token}'},
            timeout=5.0
        )

        if response.status_code == 200:
            return response.json()
        elif response.status_code in [400, 401, 403]:
            return None  # Invalid token
        else:
            raise TokenEndpointError(f"Unexpected status: {response.status_code}")

    except httpx.TimeoutException:
        # User's requirement: show auth server unreachable
        raise TokenEndpointError("Authorization server is unreachable")
    except httpx.RequestError as e:
        raise TokenEndpointError(f"Cannot connect to authorization server: {e}")

User-Facing Errors:

  • Auth Server Down: "Authorization server is unreachable. Please try again later."
  • Invalid Token: "Access token is invalid or expired. Please re-authorize."
  • Network Error: "Cannot connect to authorization server. Check your network connection."

Q5: Cache Revocation Delay - ARCHITECTURAL DECISION

Answer: The 5-minute cache is acceptable with proper configuration.

Decision: Use configurable short-lived cache with bypass option.

Architecture:

import hashlib
import time


class TokenCache:
    """
    Simple time-based token cache with security considerations

    Configuration:
    - MICROPUB_TOKEN_CACHE_TTL: 300 (5 minutes default)
    - MICROPUB_TOKEN_CACHE_ENABLED: true (can disable for high-security)
    """

    def __init__(self, ttl=300):
        self.ttl = ttl
        self.cache = {}  # token_hash -> (token_info, expiry_time)

    def get(self, token):
        """Get cached token if valid and not expired"""
        token_hash = hashlib.sha256(token.encode()).hexdigest()
        if token_hash in self.cache:
            info, expiry = self.cache[token_hash]
            if time.time() < expiry:
                return info
            del self.cache[token_hash]
        return None

    def set(self, token, info):
        """Cache token info with TTL"""
        token_hash = hashlib.sha256(token.encode()).hexdigest()
        expiry = time.time() + self.ttl
        self.cache[token_hash] = (info, expiry)

Security Analysis:

  • Risk: A revoked token that is already cached remains accepted until its cache entry expires (up to 5 minutes with the default TTL)
  • Mitigation: Short TTL limits exposure window
  • Trade-off: Performance vs immediate revocation
  • Best Practice: Document the delay in security considerations

Configuration Options:

# For high-security environments
MICROPUB_TOKEN_CACHE_ENABLED=false  # Disable cache entirely

# For normal use (recommended)
MICROPUB_TOKEN_CACHE_TTL=300  # 5 minutes

# For development/testing
MICROPUB_TOKEN_CACHE_TTL=60  # 1 minute
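
Reading these options might be sketched as follows. The function name load_cache_config, the accepted truthy strings, and the clamp to 300 seconds (taken from the "5 minutes maximum" guidance in the security considerations) are assumptions for illustration.

```python
import os

MAX_TTL_SECONDS = 300  # assumed hard ceiling, mirroring the 5-minute maximum


def load_cache_config(env=None):
    """Return (enabled, ttl_seconds) from a mapping of environment variables."""
    env = os.environ if env is None else env
    raw_enabled = env.get("MICROPUB_TOKEN_CACHE_ENABLED", "true")
    enabled = raw_enabled.strip().lower() in ("1", "true", "yes", "on")
    ttl = int(env.get("MICROPUB_TOKEN_CACHE_TTL", "300"))
    if ttl <= 0:
        raise ValueError("MICROPUB_TOKEN_CACHE_TTL must be a positive integer")
    # Never exceed the ceiling, even if a larger value is configured.
    return enabled, min(ttl, MAX_TTL_SECONDS)
```

Passing a plain dict instead of os.environ keeps the function easy to test.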

Complete Implementation Architecture

1. System Boundaries

┌─────────────────────────────────────────────────────────────┐
│                     StarPunk V1 Scope                        │
│                                                              │
│  IN SCOPE:                                                   │
│  ✓ Token verification (external)                            │
│  ✓ Micropub endpoint                                        │
│  ✓ Bearer token extraction                                  │
│  ✓ Endpoint discovery                                       │
│  ✓ Admin session auth (IndieLogin)                          │
│                                                              │
│  OUT OF SCOPE:                                              │
│  ✗ Authorization endpoint (user provides)                   │
│  ✗ Token endpoint (user provides)                          │
│  ✗ Token issuance (external only)                          │
│  ✗ User registration                                        │
│  ✗ Identity management                                      │
└─────────────────────────────────────────────────────────────┘
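
Bearer token extraction, listed as in scope above, is small enough to sketch directly. The helper name extract_bearer_token is hypothetical, and the sketch only covers the Authorization header, not tokens sent in a request body.

```python
from typing import Optional


def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Return the token from an "Authorization: Bearer <token>" value, else None."""
    if not authorization_header:
        return None
    # Split into scheme and credentials on the first run of whitespace.
    parts = authorization_header.strip().split(None, 1)
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    token = parts[1].strip()
    return token or None
```

A None result corresponds to the "No access token provided" case in the API contract.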

2. Component Design

2.1 Token Verifier Component

# starpunk/indieauth/verifier.py

from typing import Optional

import httpx


class ExternalTokenVerifier:
    """
    Verifies tokens with external IndieAuth providers
    Never stores tokens, only verifies them
    """

    def __init__(self, cache_ttl=300, cache_enabled=True):
        self.cache = TokenCache(ttl=cache_ttl) if cache_enabled else None
        self.http_client = httpx.Client(timeout=5.0)

    def verify(self, bearer_token: str, expected_me: str) -> Optional[TokenInfo]:
        """
        Verify bearer token with external token endpoint

        Returns:
            TokenInfo if valid, None if invalid

        Raises:
            TokenEndpointError if endpoint unreachable
        """
        # Check cache first
        if self.cache:
            cached = self.cache.get(bearer_token)
            if cached and cached.me == expected_me:
                return cached

        # Discover token endpoint from user's URL
        token_endpoint = self.discover_token_endpoint(expected_me)

        # Verify with external endpoint
        token_info = self.verify_with_endpoint(
            bearer_token,
            token_endpoint,
            expected_me
        )

        # Cache if valid
        if token_info and self.cache:
            self.cache.set(bearer_token, token_info)

        return token_info

2.2 Endpoint Discovery Component

# starpunk/indieauth/discovery.py

from urllib.parse import urljoin

import httpx


class EndpointDiscovery:
    """
    Discovers IndieAuth endpoints from user URLs
    Implements full spec compliance for discovery
    """

    def discover_token_endpoint(self, me_url: str) -> str:
        """
        Discover token endpoint from profile URL

        Priority order (per spec):
        1. HTTP Link header
        2. HTML <link> element
        3. IndieAuth metadata endpoint
        """
        response = httpx.get(me_url, follow_redirects=True)

        # Resolve relative endpoints against the final URL after redirects
        base_url = str(response.url)

        # 1. Check HTTP Link header (highest priority)
        link_header = response.headers.get('Link', '')
        if endpoint := self.parse_link_header(link_header, 'token_endpoint'):
            return urljoin(base_url, endpoint)

        # 2. Check HTML if content-type is HTML
        if 'text/html' in response.headers.get('content-type', ''):
            if endpoint := self.parse_html_links(response.text, 'token_endpoint'):
                return urljoin(base_url, endpoint)

        # 3. Check for indieauth-metadata endpoint
        if metadata_url := self.find_metadata_endpoint(response):
            metadata = httpx.get(metadata_url).json()
            if endpoint := metadata.get('token_endpoint'):
                return endpoint

        raise DiscoveryError(f"No token endpoint found at {me_url}")
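
The parse_link_header helper used above is not defined in this guide. One possible shape, written as a standalone function, is sketched below; the regular expressions handle common Link header forms and are an assumption, not a full RFC 8288 parser.

```python
import re
from typing import Optional
from urllib.parse import urljoin


def parse_link_header(header: str, rel: str) -> Optional[str]:
    """Return the first link target whose rel list contains `rel`, else None."""
    # Split on commas that are followed by the "<" of the next link value.
    for part in re.split(r",\s*(?=<)", header):
        match = re.match(r"\s*<([^>]*)>(.*)", part, re.S)
        if not match:
            continue
        target, params = match.groups()
        # rel may be quoted (possibly holding several values) or a bare token.
        rel_match = re.search(
            r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;,]+))', params, re.I
        )
        if not rel_match:
            continue
        rels = (rel_match.group(1) or rel_match.group(2) or "").lower().split()
        if rel.lower() in rels:
            return target
    return None


# Example: a relative target is resolved against the profile URL.
header = '<https://tokens.indieauth.com/token>; rel="token_endpoint", </auth>; rel=authorization_endpoint'
token_endpoint = parse_link_header(header, "token_endpoint")
auth_endpoint = urljoin(
    "https://user.example.net/", parse_link_header(header, "authorization_endpoint")
)
```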

3. Database Schema (V1 - Unused but Present)

-- These tables exist but are NOT USED in V1
-- They are created for future V2 internal provider support
-- Document this clearly in the migration

-- tokens table: For future internal token storage
-- authorization_codes table: For future OAuth flow support

-- V1 uses only external token verification via HTTP
-- No database queries for token validation in V1

4. API Contract

Micropub Endpoint

endpoint: /api/micropub
methods: [POST]
authentication: Bearer token

request:
  headers:
    Authorization: "Bearer {access_token}"
    Content-Type: "application/x-www-form-urlencoded"  # or "application/json"

  body: |
    Micropub create request per spec

response:
  success:
    status: 201
    headers:
      Location: "https://starpunk.example.com/notes/{id}"

  unauthorized:
    status: 401
    body:
      error: "unauthorized"
      error_description: "No access token provided"

  forbidden:
    status: 403
    body:
      error: "forbidden"
      error_description: "Invalid or expired access token"

  server_error:
    status: 503
    body:
      error: "temporarily_unavailable"
      error_description: "Authorization server is unreachable"
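
The three error cases in this contract could be produced from a single lookup table, as in the sketch below. AuthOutcome and error_response are hypothetical names; the status codes and messages are copied from the contract above.

```python
from enum import Enum


class AuthOutcome(Enum):
    """Ways in which authenticating a Micropub request can fail."""

    MISSING_TOKEN = "missing_token"
    INVALID_TOKEN = "invalid_token"
    SERVER_UNREACHABLE = "server_unreachable"


# outcome -> (HTTP status, error code, error description)
_ERRORS = {
    AuthOutcome.MISSING_TOKEN: (401, "unauthorized", "No access token provided"),
    AuthOutcome.INVALID_TOKEN: (403, "forbidden", "Invalid or expired access token"),
    AuthOutcome.SERVER_UNREACHABLE: (
        503,
        "temporarily_unavailable",
        "Authorization server is unreachable",
    ),
}


def error_response(outcome: AuthOutcome):
    """Return (status_code, json_body) for a failed authentication outcome."""
    status, error, description = _ERRORS[outcome]
    return status, {"error": error, "error_description": description}
```

Keeping the messages in one table makes it harder for a token or other sensitive detail to slip into an error body.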

5. Configuration

# config.ini or environment variables

# User's identity URL (required)
ADMIN_ME=https://user.example.com/

# Token cache settings (optional)
MICROPUB_TOKEN_CACHE_ENABLED=true
MICROPUB_TOKEN_CACHE_TTL=300

# HTTP client settings (optional)
MICROPUB_HTTP_TIMEOUT=5.0
MICROPUB_MAX_RETRIES=1

6. Security Considerations

Token Handling

  • Never store plain tokens - Cache only SHA-256 hashes of tokens
  • Always use HTTPS - Token verification must use TLS
  • Validate 'me' field - Must match configured admin identity
  • Check scope - Ensure 'create' scope for Micropub posts
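
The scope check deserves care because scope is a space-separated string, so a substring test would wrongly accept values such as "created". A minimal sketch, with has_scope as a hypothetical helper name:

```python
def has_scope(granted: str, required: str) -> bool:
    """Return True if `required` is one of the space-separated granted scopes."""
    # Compare whole scope tokens, never substrings.
    return required in granted.split()
```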

Cache Security

  • Short TTL - 5 minutes maximum to limit revocation delay
  • Hash tokens - Even in cache, never store plain tokens
  • Memory only - Don't persist cache to disk
  • Config option - Allow disabling cache in high-security environments

Error Messages

  • Don't leak tokens - Never include tokens in error messages
  • Generic client errors - Don't reveal why authentication failed
  • Specific server errors - Help users understand connectivity issues

7. Testing Strategy

Unit Tests

def test_token_verification():
    """Test external token verification"""
    # Mock HTTP client
    # Test valid token response
    # Test invalid token response
    # Test network errors
    # Test timeout handling

def test_endpoint_discovery():
    """Test endpoint discovery from URLs"""
    # Test HTTP Link header discovery
    # Test HTML link element discovery
    # Test metadata endpoint discovery
    # Test relative URL resolution

def test_cache_behavior():
    """Test token cache"""
    # Test cache hit
    # Test cache miss
    # Test TTL expiry
    # Test cache disabled
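
As one concrete instance of the cache tests outlined above, the TTL behavior can be exercised without sleeping by patching time.time. The sketch repeats the TokenCache class from Q5 so that it runs on its own; the test names and token values are illustrative.

```python
import hashlib
import time
from unittest.mock import patch


class TokenCache:
    """Copy of the Q5 cache, repeated here so the example is self-contained."""

    def __init__(self, ttl=300):
        self.ttl = ttl
        self.cache = {}  # token_hash -> (token_info, expiry_time)

    def get(self, token):
        token_hash = hashlib.sha256(token.encode()).hexdigest()
        if token_hash in self.cache:
            info, expiry = self.cache[token_hash]
            if time.time() < expiry:
                return info
            del self.cache[token_hash]
        return None

    def set(self, token, info):
        token_hash = hashlib.sha256(token.encode()).hexdigest()
        self.cache[token_hash] = (info, time.time() + self.ttl)


def test_cache_hit_then_expiry():
    cache = TokenCache(ttl=300)
    info = {"me": "https://user.example.net/"}
    with patch("time.time", return_value=1000.0):
        cache.set("secret-token", info)
        assert cache.get("secret-token") == info  # cache hit
        assert cache.get("other-token") is None  # cache miss
    with patch("time.time", return_value=1299.0):
        assert cache.get("secret-token") == info  # still inside the TTL
    with patch("time.time", return_value=1300.0):
        assert cache.get("secret-token") is None  # expired at exactly the TTL
    assert cache.cache == {}  # the expired entry was evicted


def test_cache_never_stores_plain_token():
    cache = TokenCache()
    cache.set("secret-token", {"me": "https://user.example.net/"})
    assert "secret-token" not in cache.cache
    assert all(len(key) == 64 for key in cache.cache)  # SHA-256 hex digests
```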

Integration Tests

def test_micropub_with_valid_token():
    """Test full Micropub flow with valid token"""
    # Mock token endpoint
    # Send Micropub request
    # Verify note created
    # Check Location header

def test_micropub_with_invalid_token():
    """Test Micropub rejection with invalid token"""
    # Mock token endpoint to return 401
    # Send Micropub request
    # Verify 403 response
    # Verify no note created

def test_micropub_with_unreachable_auth_server():
    """Test handling of unreachable auth server"""
    # Mock network timeout
    # Send Micropub request
    # Verify 503 response
    # Verify error message

8. Implementation Checklist

Phase 1: Remove Internal Provider

  • Remove /auth/authorize endpoint
  • Remove /auth/token endpoint
  • Remove internal token issuance logic
  • Remove authorization code generation
  • Update tests to not expect these endpoints

Phase 2: Implement External Verification

  • Create ExternalTokenVerifier class
  • Implement endpoint discovery
  • Add token cache with TTL
  • Handle network errors gracefully
  • Add configuration options

Phase 3: Update Documentation

  • Update API documentation
  • Create user setup guide
  • Document security considerations
  • Update architecture diagrams
  • Add troubleshooting guide

Phase 4: Testing & Validation

  • Test with IndieLogin.com
  • Test with tokens.indieauth.com
  • Test with real Micropub clients (Quill, Indigenous)
  • Verify error handling
  • Load test token verification

Migration Path

For Existing Installations

  1. Database: No action needed (tables remain but unused)
  2. Configuration: Add ADMIN_ME setting
  3. Users: Provide setup instructions for their domains
  4. Testing: Verify external token verification works

For New Installations

  1. Fresh start: Full V1 external-only implementation
  2. Simple setup: Just configure ADMIN_ME
  3. User guide: How to configure their domain for IndieAuth

Conclusion

This architecture provides a clean, secure, and standards-compliant implementation of external IndieAuth token verification. The design follows the principle of "every line of code must justify its existence" by removing unnecessary internal provider complexity while maintaining full Micropub support.

The key insight is that StarPunk is a Micropub server, not an authorization server. This separation of concerns aligns perfectly with IndieWeb principles and keeps the codebase minimal and focused.


Document Version: 1.0
Created: 2024-11-24
Author: StarPunk Architecture Team
Status: Final