Files
StarPunk/docs/reports/indieauth-removal-questions.md
Phil Skentelbery a3bac86647 feat: Complete IndieAuth server removal (Phases 2-4)
Completed all remaining phases of ADR-030 IndieAuth provider removal.
StarPunk no longer acts as an authorization server - all IndieAuth
operations delegated to external providers.

Phase 2 - Remove Token Issuance:
- Deleted /auth/token endpoint
- Removed token_endpoint() function from routes/auth.py
- Deleted tests/test_routes_token.py

Phase 3 - Remove Token Storage:
- Deleted starpunk/tokens.py module entirely
- Created migration 004 to drop tokens and authorization_codes tables
- Deleted tests/test_tokens.py
- Removed all internal token CRUD operations

Phase 4 - External Token Verification:
- Created starpunk/auth_external.py module
- Implemented verify_external_token() for external IndieAuth providers
- Updated Micropub endpoint to use external verification
- Added TOKEN_ENDPOINT configuration
- Updated all Micropub tests to mock external verification
- HTTP timeout protection (5s) for external requests

Additional Changes:
- Created migration 003 to remove code_verifier from auth_state
- Fixed 5 migration tests that referenced obsolete code_verifier column
- Updated 11 Micropub tests for external verification
- Fixed test fixture and app context issues
- All 501 tests passing

Breaking Changes:
- Micropub clients must use external IndieAuth providers
- TOKEN_ENDPOINT configuration now required
- Existing internal tokens invalid (tables dropped)

Migration Impact:
- Simpler codebase: -500 lines of code
- Fewer database tables: -2 tables (tokens, authorization_codes)
- More secure: External providers handle token security
- More maintainable: Less authentication code to maintain

Standards Compliance:
- W3C IndieAuth specification
- OAuth 2.0 Bearer token authentication
- IndieWeb principle: delegate to external services

Related:
- ADR-030: IndieAuth Provider Removal Strategy
- ADR-050: Remove Custom IndieAuth Server
- Migration 003: Remove code_verifier from auth_state
- Migration 004: Drop tokens and authorization_codes tables

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 17:23:46 -07:00

349 lines
9.9 KiB
Markdown

# IndieAuth Removal - Questions for Architect
**Date**: 2025-11-24
**Developer**: Fullstack Developer Agent
**Document**: Pre-Implementation Questions
## Status: BLOCKED - Awaiting Architectural Clarification
I have thoroughly reviewed the removal plan and identified several architectural questions that need answers before implementation can begin safely.
---
## CRITICAL QUESTIONS (Must answer before implementing)
### Q1: External Token Endpoint Response Format
**What I see in the plan** (ADR-050 lines 156-191):
```python
response = httpx.get(
token_endpoint,
headers={'Authorization': f'Bearer {bearer_token}'}
)
data = response.json()
# Uses: data.get('me'), data.get('scope')
```
**What I see in current code** (starpunk/tokens.py:116-164):
```python
def verify_token(token: str) -> Optional[Dict[str, Any]]:
return {
'me': row['me'],
'client_id': row['client_id'],
'scope': row['scope']
}
```
**Questions**:
1. What is the EXACT response format from tokens.indieauth.com/token?
2. Does it include `client_id`? (current code uses this)
3. What fields can we rely on?
4. What status codes indicate invalid token vs server error?
**Request**: Provide actual example response from tokens.indieauth.com or point to specification.
**Why this blocks**: Phase 4 implementation depends on knowing exact response format.
---
### Q2: HTML Discovery Headers Strategy
**What the plan shows** (simplified-auth-architecture.md lines 207-210):
```html
<link rel="authorization_endpoint" href="https://indieauth.com/auth">
<link rel="token_endpoint" href="https://tokens.indieauth.com/token">
```
**My confusion**:
- These headers tell Micropub CLIENTS where to get tokens
- We're putting them on OUR pages (starpunk instance)
- But shouldn't they point to the USER's chosen provider?
- IndieAuth spec says these come from the user's DOMAIN, not from StarPunk
**Example**:
- User: alice.com (ADMIN_ME)
- StarPunk: starpunk.alice.com
- Client (Quill) looks at alice.com for discovery headers
- Quill should see alice's chosen provider, not ours
**Questions**:
1. Should these headers be on StarPunk pages at all?
2. Or should users add them to their own domain?
3. Are we confusing "where StarPunk verifies" with "where clients authenticate"?
**Request**: Clarify the relationship between:
- StarPunk's token verification (internal, uses tokens.indieauth.com)
- Client's token acquisition (should use user's domain discovery)
**Why this blocks**: We might be implementing discovery headers incorrectly, which would break IndieAuth flow.
---
### Q3: Migration 002 Handling Strategy
**The plan mentions** (indieauth-removal-phases.md line 209):
```bash
mv migrations/002_secure_tokens_and_authorization_codes.sql migrations/archive/
```
**Questions**:
1. Should we keep 002 in migrations/ and add 003 that drops tables?
2. Should we delete 002 entirely?
3. Should we archive to a different directory?
4. What about fresh installs - do they need 002 at all?
**Three approaches**:
**Option A: Keep 002, Add 003**
- Pro: Clear history, both migrations run in order
- Con: Creates then immediately drops tables (wasteful)
- Use case: Existing installations upgrade smoothly
**Option B: Delete 002, Renumber Everything**
- Pro: Clean, no dead migrations
- Con: Breaking change for existing installations
- Use case: Fresh installs don't have dead code
**Option C: Archive 002, Add 003**
- Pro: Git history preserved, clean migrations/
- Con: Migration numbers have gaps
- Use case: Documentation without execution
**Request**: Which approach should we use and why?
**Why this blocks**: Phase 3 depends on knowing how to handle migration files.
---
### Q4: Error Handling Strategy
**Current plan** (indieauth-removal-plan.md lines 169-173):
```python
if response.status_code != 200:
return None
```
This treats ALL failures identically:
- Token invalid (401 from provider) → return None
- tokens.indieauth.com down (connection error) → return None
- Rate limited (429 from provider) → return None
- Timeout (no response) → return None
**Questions**:
1. Should we differentiate between "invalid token" and "service unavailable"?
2. Should we fail closed (deny) or fail open (allow) on timeout?
3. Should we return different error messages to users?
**Proposed enhancement**:
```python
try:
response = httpx.get(endpoint, timeout=5.0)
if response.status_code == 401:
return None # Invalid token
elif response.status_code != 200:
logger.error(f"Token endpoint returned {response.status_code}")
return None # Service error, deny access
except httpx.TimeoutException:
logger.error("Token verification timeout")
return None # Network issue, deny access
```
**Request**: Define error handling policy - what happens for each error type?
**Why this blocks**: Affects user experience and security posture.
---
### Q5: Token Cache Revocation Delay
**Proposed caching** (indieauth-removal-phases.md lines 266-280):
```python
# Cache for 5 minutes
_token_cache[token_hash] = (data, time() + 300)
```
**The problem**:
1. User revokes token at tokens.indieauth.com
2. StarPunk cache still has it for up to 5 minutes
3. Token continues to work for 5 minutes after revocation
**Questions**:
1. Is this acceptable for security?
2. Should we document this limitation?
3. Should we implement cache invalidation somehow?
4. Should cache TTL be shorter (1 minute)?
**Trade-off**:
- Longer TTL = better performance, worse security
- Shorter TTL = worse performance, better security
- No cache = worst performance, best security
**Request**: Confirm 5-minute window is acceptable or specify different TTL.
**Why this blocks**: Security/performance trade-off needs architectural decision.
---
## IMPORTANT QUESTIONS (Should answer before implementing)
### Q6: Cache Cleanup Implementation
**Current plan** (indieauth-removal-phases.md lines 266-280):
```python
_token_cache = {}
```
**Problem**: No cleanup mechanism - expired entries accumulate forever.
**Questions**:
1. Should we implement LRU cache eviction?
2. Should we implement TTL-based cleanup?
3. Should we just document the limitation?
4. Should we recommend Redis for production?
**Recommendation**: Add simple cleanup:
```python
def verify_token(token):
# Clean expired entries every 100 requests
if len(_token_cache) % 100 == 0:
now = time()
_token_cache = {k: v for k, v in _token_cache.items() if v[1] > now}
```
**Request**: Approve cleanup approach or specify alternative.
---
### Q7: Integration Testing Strategy
**Plan shows only mocked tests** (indieauth-removal-phases.md lines 332-348):
```python
@patch('starpunk.micropub.httpx.get')
def test_external_token_verification(mock_get):
mock_response.status_code = 200
```
**Questions**:
1. Should we have integration tests with real tokens.indieauth.com?
2. How do we get test tokens for CI?
3. Should CI test against real external service?
**Recommendation**: Two-tier testing:
- Unit tests: Mock external calls (fast, always pass)
- Integration tests: Real tokens.indieauth.com (slow, conditional)
**Request**: Define testing strategy for external dependencies.
---
### Q8: Rollback Procedure Detail
**Plan mentions** (ADR-050 lines 224-240):
```bash
git revert HEAD~5..HEAD
```
**Problems**:
1. Assumes exactly 5 commits
2. Plan mentions PostgreSQL but we use SQLite
3. No phase-specific rollback
**Request**: Create specific rollback for each phase:
**Phase 1 rollback**:
```bash
git revert <commit-hash>
# No database changes, just code
```
**Phase 3 rollback**:
```bash
cp data/starpunk.db.backup data/starpunk.db
git revert <commit-hash>
```
**Full rollback**:
```bash
git revert <phase-5-commit>...<phase-1-commit>
cp data/starpunk.db.backup data/starpunk.db
```
---
### Q9: TOKEN_ENDPOINT Configuration
**Plan shows** (indieauth-removal-plan.md line 181):
```python
TOKEN_ENDPOINT = os.getenv('TOKEN_ENDPOINT', 'https://tokens.indieauth.com/token')
```
**Questions**:
1. Should this be configurable or hardcoded?
2. Is there a use case for different token endpoints?
3. Should we support per-user endpoints (discovery)?
**Recommendation**: Hardcode for V1, make configurable later if needed.
**Request**: Confirm configuration approach.
---
### Q10: Schema Version Table
**Plan shows** (indieauth-removal-plan.md lines 246-248):
```sql
UPDATE schema_version SET version = 3 WHERE id = 1;
```
**Question**: Does this table exist? I don't see it in current migrations.
**Request**: Clarify if this is needed or remove from migration 003.
---
## NICE TO HAVE ANSWERS
### Q11: Multi-Worker Cache Coherence
With multiple gunicorn workers, each has separate in-memory cache:
- Worker 1: Verifies token, caches it
- Worker 2: Gets request with same token, cache miss, verifies again
**Question**: Should we document this limitation or implement shared cache (Redis)?
### Q12: Request Coalescing
If multiple concurrent requests use same token:
- All hit cache miss
- All make external API call
- All cache separately
**Question**: Should we implement request coalescing (only one verification per token)?
### Q13: Configurable Cache TTL
**Question**: Should cache TTL be configurable via environment variable?
```python
CACHE_TTL = int(os.getenv('TOKEN_CACHE_TTL', '300'))
```
---
## Summary
**Status**: Ready to review, not ready to implement
**Blocking questions**: 5 critical architectural decisions
**Important questions**: 5 implementation details
**Nice-to-have questions**: 3 optimization considerations
**My assessment**: The plan is solid and well-thought-out. These questions are about clarifying implementation details and edge cases, not fundamental flaws. Once we have answers to the critical questions, I'm confident we can implement successfully.
**Next steps**:
1. Architect reviews and answers questions
2. I implement based on clarified architecture
3. We proceed through phases with clear acceptance criteria
**Estimated implementation time after clarification**: 2-3 days per plan