# IndieAuth Removal - Questions for Architect

**Date**: 2025-11-24
**Developer**: Fullstack Developer Agent
**Document**: Pre-Implementation Questions

## Status: BLOCKED - Awaiting Architectural Clarification

I have thoroughly reviewed the removal plan and identified several architectural questions that need answers before implementation can begin safely.

---

## CRITICAL QUESTIONS (Must answer before implementing)

### Q1: External Token Endpoint Response Format

**What I see in the plan** (ADR-050 lines 156-191):
```python
response = httpx.get(
    token_endpoint,
    headers={'Authorization': f'Bearer {bearer_token}'}
)
data = response.json()
# Uses: data.get('me'), data.get('scope')
```

**What I see in current code** (starpunk/tokens.py:116-164):
```python
def verify_token(token: str) -> Optional[Dict[str, Any]]:
    return {
        'me': row['me'],
        'client_id': row['client_id'],
        'scope': row['scope']
    }
```

**Questions**:
1. What is the EXACT response format from tokens.indieauth.com/token?
2. Does it include `client_id`? (current code uses this)
3. What fields can we rely on?
4. What status codes indicate invalid token vs server error?

**Request**: Provide actual example response from tokens.indieauth.com or point to specification.

**Why this blocks**: Phase 4 implementation depends on knowing exact response format.

---

### Q2: HTML Discovery Headers Strategy

**What the plan shows** (simplified-auth-architecture.md lines 207-210):
```html
```

**My confusion**:
- These headers tell Micropub CLIENTS where to get tokens
- We're putting them on OUR pages (starpunk instance)
- But shouldn't they point to the USER's chosen provider?
- IndieAuth spec says these come from the user's DOMAIN, not from StarPunk

**Example**:
- User: alice.com (ADMIN_ME)
- StarPunk: starpunk.alice.com
- Client (Quill) looks at alice.com for discovery headers
- Quill should see alice's chosen provider, not ours

**Questions**:
1. Should these headers be on StarPunk pages at all?
2. Or should users add them to their own domain?
3. Are we confusing "where StarPunk verifies" with "where clients authenticate"?

**Request**: Clarify the relationship between:
- StarPunk's token verification (internal, uses tokens.indieauth.com)
- Client's token acquisition (should use user's domain discovery)

**Why this blocks**: We might be implementing discovery headers incorrectly, which would break IndieAuth flow.

---

### Q3: Migration 002 Handling Strategy

**The plan mentions** (indieauth-removal-phases.md line 209):
```bash
mv migrations/002_secure_tokens_and_authorization_codes.sql migrations/archive/
```

**Questions**:
1. Should we keep 002 in migrations/ and add 003 that drops tables?
2. Should we delete 002 entirely?
3. Should we archive to a different directory?
4. What about fresh installs - do they need 002 at all?

**Three approaches**:

**Option A: Keep 002, Add 003**
- Pro: Clear history, both migrations run in order
- Con: Creates then immediately drops tables (wasteful)
- Use case: Existing installations upgrade smoothly

**Option B: Delete 002, Renumber Everything**
- Pro: Clean, no dead migrations
- Con: Breaking change for existing installations
- Use case: Fresh installs don't have dead code

**Option C: Archive 002, Add 003**
- Pro: Git history preserved, clean migrations/
- Con: Migration numbers have gaps
- Use case: Documentation without execution

**Request**: Which approach should we use and why?

**Why this blocks**: Phase 3 depends on knowing how to handle migration files.

---

### Q4: Error Handling Strategy

**Current plan** (indieauth-removal-plan.md lines 169-173):
```python
if response.status_code != 200:
    return None
```

This treats ALL failures identically:
- Token invalid (401 from provider) → return None
- tokens.indieauth.com down (connection error) → return None
- Rate limited (429 from provider) → return None
- Timeout (no response) → return None

**Questions**:
1. Should we differentiate between "invalid token" and "service unavailable"?
2. Should we fail closed (deny) or fail open (allow) on timeout?
3. Should we return different error messages to users?

**Proposed enhancement**:
```python
try:
    response = httpx.get(endpoint, timeout=5.0)
    if response.status_code == 401:
        return None  # Invalid token
    elif response.status_code != 200:
        logger.error(f"Token endpoint returned {response.status_code}")
        return None  # Service error, deny access
except httpx.TimeoutException:
    logger.error("Token verification timeout")
    return None  # Network issue, deny access
```

**Request**: Define error handling policy - what happens for each error type?

**Why this blocks**: Affects user experience and security posture.

---

### Q5: Token Cache Revocation Delay

**Proposed caching** (indieauth-removal-phases.md lines 266-280):
```python
# Cache for 5 minutes
_token_cache[token_hash] = (data, time() + 300)
```

**The problem**:
1. User revokes token at tokens.indieauth.com
2. StarPunk cache still has it for up to 5 minutes
3. Token continues to work for 5 minutes after revocation

**Questions**:
1. Is this acceptable for security?
2. Should we document this limitation?
3. Should we implement cache invalidation somehow?
4. Should cache TTL be shorter (1 minute)?

**Trade-off**:
- Longer TTL = better performance, worse security
- Shorter TTL = worse performance, better security
- No cache = worst performance, best security

**Request**: Confirm 5-minute window is acceptable or specify different TTL.

**Why this blocks**: Security/performance trade-off needs architectural decision.

---

## IMPORTANT QUESTIONS (Should answer before implementing)

### Q6: Cache Cleanup Implementation

**Current plan** (indieauth-removal-phases.md lines 266-280):
```python
_token_cache = {}
```

**Problem**: No cleanup mechanism - expired entries accumulate forever.

**Questions**:
1. Should we implement LRU cache eviction?
2. Should we implement TTL-based cleanup?
3. Should we just document the limitation?
4. Should we recommend Redis for production?

**Recommendation**: Add simple cleanup:
```python
_request_count = 0

def verify_token(token):
    global _request_count
    _request_count += 1
    # Clean expired entries every 100 requests
    if _request_count % 100 == 0:
        now = time()
        for key in [k for k, v in _token_cache.items() if v[1] <= now]:
            del _token_cache[key]
```

**Request**: Approve cleanup approach or specify alternative.

---

### Q7: Integration Testing Strategy

**Plan shows only mocked tests** (indieauth-removal-phases.md lines 332-348):
```python
@patch('starpunk.micropub.httpx.get')
def test_external_token_verification(mock_get):
    mock_response.status_code = 200
```

**Questions**:
1. Should we have integration tests with real tokens.indieauth.com?
2. How do we get test tokens for CI?
3. Should CI test against real external service?

**Recommendation**: Two-tier testing:
- Unit tests: Mock external calls (fast, always pass)
- Integration tests: Real tokens.indieauth.com (slow, conditional)

**Request**: Define testing strategy for external dependencies.

---

### Q8: Rollback Procedure Detail

**Plan mentions** (ADR-050 lines 224-240):
```bash
git revert HEAD~5..HEAD
```

**Problems**:
1. Assumes exactly 5 commits
2. Plan mentions PostgreSQL but we use SQLite
3. No phase-specific rollback

**Request**: Create specific rollback for each phase:

**Phase 1 rollback**:
```bash
git revert
# No database changes, just code
```

**Phase 3 rollback**:
```bash
cp data/starpunk.db.backup data/starpunk.db
git revert
```

**Full rollback**:
```bash
git revert ...
cp data/starpunk.db.backup data/starpunk.db
```

---

### Q9: TOKEN_ENDPOINT Configuration

**Plan shows** (indieauth-removal-plan.md line 181):
```python
TOKEN_ENDPOINT = os.getenv('TOKEN_ENDPOINT', 'https://tokens.indieauth.com/token')
```

**Questions**:
1. Should this be configurable or hardcoded?
2. Is there a use case for different token endpoints?
3. Should we support per-user endpoints (discovery)?

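
If the endpoint does stay configurable, one concern is that bearer tokens get sent to whatever URL the environment variable holds. The sketch below shows one possible way to validate the value at startup. `load_token_endpoint` and `DEFAULT_TOKEN_ENDPOINT` are hypothetical names used for illustration and do not appear in the plan; only the `TOKEN_ENDPOINT` variable and its default come from it.

```python
import os
from urllib.parse import urlparse

# Default taken from the plan; the constant name itself is an assumption.
DEFAULT_TOKEN_ENDPOINT = "https://tokens.indieauth.com/token"


def load_token_endpoint(env=None):
    """Return the configured token endpoint, or the default if unset.

    Hypothetical helper: `env` defaults to os.environ and is injectable
    so the function can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    url = env.get("TOKEN_ENDPOINT", DEFAULT_TOKEN_ENDPOINT).strip()
    parsed = urlparse(url)
    # Bearer tokens are sent to this address, so require an absolute
    # HTTPS URL and fail at startup rather than on the first request.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"TOKEN_ENDPOINT must be an https URL, got {url!r}")
    return url
```

Calling `load_token_endpoint()` once during application startup would surface a misconfigured value immediately. Whether this extra check is worth having in V1 depends on the answer to question 1 above.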

**Recommendation**: Hardcode for V1, make configurable later if needed.

**Request**: Confirm configuration approach.

---

### Q10: Schema Version Table

**Plan shows** (indieauth-removal-plan.md lines 246-248):
```sql
UPDATE schema_version SET version = 3 WHERE id = 1;
```

**Question**: Does this table exist? I don't see it in current migrations.

**Request**: Clarify if this is needed or remove from migration 003.

---

## NICE TO HAVE ANSWERS

### Q11: Multi-Worker Cache Coherence

With multiple gunicorn workers, each has separate in-memory cache:
- Worker 1: Verifies token, caches it
- Worker 2: Gets request with same token, cache miss, verifies again

**Question**: Should we document this limitation or implement shared cache (Redis)?

### Q12: Request Coalescing

If multiple concurrent requests use same token:
- All hit cache miss
- All make external API call
- All cache separately

**Question**: Should we implement request coalescing (only one verification per token)?

### Q13: Configurable Cache TTL

**Question**: Should cache TTL be configurable via environment variable?
```python
CACHE_TTL = int(os.getenv('TOKEN_CACHE_TTL', '300'))
```

---

## Summary

**Status**: Ready to review, not ready to implement

**Blocking questions**: 5 critical architectural decisions
**Important questions**: 5 implementation details
**Nice-to-have questions**: 3 optimization considerations

**My assessment**: The plan is solid and well-thought-out. These questions are about clarifying implementation details and edge cases, not fundamental flaws. Once we have answers to the critical questions, I'm confident we can implement successfully.

**Next steps**:
1. Architect reviews and answers questions
2. I implement based on clarified architecture
3. We proceed through phases with clear acceptance criteria

**Estimated implementation time after clarification**: 2-3 days per plan
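
To make the cache questions (Q5, Q6, Q13) easier to discuss, here is a minimal sketch of an in-memory cache with a configurable TTL and explicit cleanup. The `TokenCache` class and its method names are hypothetical and not part of the plan; the 300-second default mirrors the plan's 5-minute window. It is per-process, so the multi-worker limitation from Q11 still applies.

```python
import hashlib
import time


class TokenCache:
    """Hypothetical per-process cache of verified token data with a TTL."""

    def __init__(self, ttl=300.0):
        self.ttl = ttl
        self._entries = {}  # sha256(token) -> (data, expires_at)

    @staticmethod
    def _key(token):
        # Store a hash so raw bearer tokens are not kept in memory as keys.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def put(self, token, data, now=None):
        now = time.time() if now is None else now
        self._entries[self._key(token)] = (data, now + self.ttl)

    def get(self, token, now=None):
        """Return cached data, or None if the entry is missing or expired."""
        now = time.time() if now is None else now
        key = self._key(token)
        entry = self._entries.get(key)
        if entry is None:
            return None
        data, expires_at = entry
        if expires_at <= now:
            del self._entries[key]  # drop the stale entry on access
            return None
        return data

    def purge_expired(self, now=None):
        """Remove all expired entries and return how many were removed."""
        now = time.time() if now is None else now
        expired = [k for k, (_, exp) in self._entries.items() if exp <= now]
        for key in expired:
            del self._entries[key]
        return len(expired)
```

With this shape, the TTL from Q13 becomes a constructor argument (for example `TokenCache(ttl=CACHE_TTL)`), and the revocation window from Q5 is bounded by that same value.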