

# Phase 3 Token Endpoint - Clarification Responses
**Date**: 2025-11-20
**Architect**: Claude (Architect Agent)
**Developer Questions**: 8 clarifications needed
**Status**: All questions answered
## Summary of Decisions
All 8 clarification questions have been addressed with clear, specific architectural decisions prioritizing simplicity. See ADR-0009 for formal documentation of these decisions.
## Question-by-Question Responses
### 1. Authorization Code Storage Format (CRITICAL) ✅
**Question**: Phase 1 CodeStore only accepts string values, but Phase 3 needs dict metadata. Should we modify CodeStore or handle serialization elsewhere?
**DECISION**: Modify CodeStore to accept dict values with internal JSON serialization.
**Implementation**:
```python
# Update CodeStore in Phase 1
import json
from typing import Optional, Union


def store(self, key: str, value: Union[str, dict], ttl: int = 600) -> None:
    """Store key-value pair. Value can be string or dict."""
    if isinstance(value, dict):
        value_to_store = json.dumps(value)
    else:
        value_to_store = value
    # ... rest of implementation


def get(self, key: str) -> Optional[Union[str, dict]]:
    """Get value. Returns dict if stored value is JSON."""
    # ... retrieve value
    try:
        return json.loads(value)
    except (json.JSONDecodeError, TypeError):
        return value
```
**Rationale**: Simplest approach that maintains backward compatibility while supporting Phase 2/3 needs.
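A minimal, self-contained sketch of the dual-type store (class shape, TTL handling, and `delete` are illustrative; the real Phase 1 CodeStore differs in detail):

```python
import json
import time
from typing import Optional, Union


class CodeStore:
    """In-memory key-value store accepting str or dict values (sketch)."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}

    def store(self, key: str, value: Union[str, dict], ttl: int = 600) -> None:
        # Dicts are serialized to JSON; plain strings pass through unchanged.
        serialized = json.dumps(value) if isinstance(value, dict) else value
        self._data[key] = (serialized, time.time() + ttl)

    def get(self, key: str) -> Optional[Union[str, dict]]:
        entry = self._data.get(key)
        if entry is None or time.time() > entry[1]:
            return None  # missing or expired
        raw = entry[0]
        try:
            return json.loads(raw)  # dicts round-trip through JSON
        except json.JSONDecodeError:
            return raw  # legacy string values come back unchanged

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
```

Note the backward-compatibility property: a caller that only ever stores strings sees no behavior change, because non-JSON strings fail `json.loads` and are returned as-is.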
---
### 2. Authorization Code Single-Use Marking ✅
**Question**: How to mark code as "used" before token generation? Calculate remaining TTL?
**DECISION**: Simplify - just check 'used' flag, then delete after successful generation. No marking.
**Implementation**:
```python
# Check if already used
if metadata.get('used'):
    raise HTTPException(400, {"error": "invalid_grant"})

# Generate token...

# Delete code after success (single-use enforcement)
code_storage.delete(code)
```
**Rationale**: Eliminates TTL calculation complexity and race condition concerns.
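The check-and-delete flow can be sketched end-to-end against a plain dict standing in for the code store (function and exception names are illustrative; the real endpoint raises `HTTPException` and generates tokens with `secrets.token_urlsafe`):

```python
class InvalidGrantError(Exception):
    """Stand-in for HTTPException(400, {"error": "invalid_grant"})."""


def exchange_code(code: str, code_storage: dict) -> str:
    metadata = code_storage.get(code)
    if metadata is None or metadata.get("used"):
        # Unknown, expired, or replayed code: same opaque error either way.
        raise InvalidGrantError()
    token = f"token-for-{metadata['me']}"  # placeholder for real generation
    # Delete only after successful generation: single-use enforcement.
    del code_storage[code]
    return token
```

A second exchange with the same code fails, which is exactly the replay behavior the decision calls for.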
---
### 3. Token Endpoint Error Response Format ✅
**Question**: Does FastAPI handle dict detail correctly? Need cache headers?
**DECISION**: FastAPI handles dict→JSON automatically. Add cache headers explicitly.
**Implementation**:
```python
@router.post("/token")
async def token_exchange(response: Response, ...):
    response.headers["Cache-Control"] = "no-store"
    response.headers["Pragma"] = "no-cache"
    # FastAPI HTTPException with dict detail works correctly
```
**Rationale**: Use framework capabilities, ensure OAuth compliance with explicit headers.
---
### 4. Phase 2/3 Authorization Code Structure ✅
**Question**: Will Phase 2 include PKCE fields? Should Phase 3 handle missing keys?
**DECISION**: Phase 2 MUST include all fields with defaults. Phase 3 assumes complete structure.
**Phase 2 Update Required**:
```python
code_data = {
    'client_id': client_id,
    'redirect_uri': redirect_uri,
    'state': state,
    'me': verified_email,
    'scope': scope,
    'code_challenge': code_challenge or "",  # Empty if not provided
    'code_challenge_method': code_challenge_method or "",
    'created_at': int(time.time()),
    'expires_at': int(time.time() + 600),
    'used': False,  # Always False initially
}
```
**Rationale**: Consistency within v1.0.0 is more important than backward compatibility.
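For reference, the S256 check the token endpoint would later run against these fields follows RFC 7636; the empty-string convention from the structure above means "client did not use PKCE" (function names here are illustrative):

```python
import base64
import hashlib


def s256_challenge(code_verifier: str) -> str:
    """BASE64URL(SHA256(verifier)) without padding, per RFC 7636."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def pkce_matches(code_verifier: str, code_challenge: str, method: str) -> bool:
    if not code_challenge:  # empty string: client did not use PKCE
        return True
    if method == "S256":
        return s256_challenge(code_verifier) == code_challenge
    return code_verifier == code_challenge  # "plain" fallback
```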
---
### 5. Database Connection Pattern ✅
**Question**: Does get_connection() auto-commit or need explicit commit?
**DECISION**: Explicit commit required (Phase 1 pattern).
**Implementation**:
```python
with self.database.get_connection() as conn:
    conn.execute(query, params)
    conn.commit()  # Required
```
**Rationale**: Matches SQLite default behavior and Phase 1 implementation.
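The consequence of skipping the commit is easy to demonstrate with the standard-library `sqlite3` module directly (assuming the project's `get_connection()` wrapper hands back an ordinary connection without autocommit): an uncommitted INSERT is rolled back when the connection closes.

```python
import os
import sqlite3
import tempfile

# A throwaway on-disk database so a second connection can observe what persisted.
db_path = os.path.join(tempfile.mkdtemp(), "tokens.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE tokens (token_hash TEXT PRIMARY KEY, me TEXT)")
conn.commit()

# INSERT without commit: rolled back when the connection closes.
conn.execute("INSERT INTO tokens VALUES ('h1', 'https://a.example')")
conn.close()

conn = sqlite3.connect(db_path)
rows_without_commit = conn.execute("SELECT token_hash FROM tokens").fetchall()

# INSERT with explicit commit: persists across connections.
conn.execute("INSERT INTO tokens VALUES ('h2', 'https://b.example')")
conn.commit()
conn.close()

conn = sqlite3.connect(db_path)
rows_with_commit = conn.execute("SELECT token_hash FROM tokens").fetchall()
conn.close()
```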
---
### 6. Token Hash Collision Handling ✅
**Question**: Should we handle UNIQUE constraint violations defensively?
**DECISION**: NO defensive handling. Let it fail catastrophically.
**Implementation**:
```python
# No try/except for UNIQUE constraint
# If 2^256 collision occurs, something is fundamentally broken
conn.execute("INSERT INTO tokens ...", params)
conn.commit()
# Let any IntegrityError propagate
```
**Rationale**: With 2^256 entropy, collision indicates fundamental system failure. Retrying won't help.
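The generation-and-hashing scheme that makes collisions this implausible can be sketched as follows (column names and helper names are illustrative; only the hash ever reaches the database):

```python
import hashlib
import secrets


def generate_token() -> tuple[str, str]:
    """Return (plaintext_token, sha256_hash). Only the hash is stored."""
    token = secrets.token_urlsafe(32)  # 32 bytes = 256 bits of entropy
    token_hash = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, token_hash


def hash_for_lookup(presented_token: str) -> str:
    # Validation hashes the presented token and looks it up by hash,
    # so plaintext never touches the database.
    return hashlib.sha256(presented_token.encode("ascii")).hexdigest()
```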
---
### 7. Logging Token Validation ✅
**Question**: What logging levels for token operations?
**DECISION**: Adopt Developer's suggestion:
- DEBUG: Successful validations (high volume)
- INFO: Token generation (important events)
- WARNING: Validation failures (potential issues)
**Implementation**:
```python
# Success (frequent, not interesting)
logger.debug(f"Token validated successfully (me: {token_data['me']})")
# Generation (important)
logger.info(f"Token generated for {me} (client: {client_id})")
# Failure (potential attack/misconfiguration)
logger.warning(f"Token validation failed: {reason}")
```
**Rationale**: Appropriate visibility without log flooding.
---
### 8. Token Cleanup Configuration ✅
**Question**: Should cleanup_expired_tokens() be called automatically?
**DECISION**: Manual/cron only for v1.0.0. No automatic calling.
**Implementation**:
```python
# Utility method only
def cleanup_expired_tokens(self) -> int:
    """Delete expired tokens. Call manually or via cron."""
    # Implementation as designed

# Config vars exist but unused in v1.0.0:
#   TOKEN_CLEANUP_ENABLED (ignored)
#   TOKEN_CLEANUP_INTERVAL (ignored)
```
**Rationale**: Simplicity for v1.0.0 MVP. Small scale doesn't need automatic cleanup.
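A working sketch of that utility against SQLite (table and column names assumed from the decisions above; the real method lives on the token service and uses `get_connection()`):

```python
import sqlite3
import time


def cleanup_expired_tokens(conn: sqlite3.Connection) -> int:
    """Delete expired tokens; return the number removed. Call manually or via cron."""
    cur = conn.execute(
        "DELETE FROM tokens WHERE expires_at < ?", (int(time.time()),)
    )
    conn.commit()  # explicit commit, per the Phase 1 pattern
    return cur.rowcount
```

Wired to cron, this would be a one-line entry invoking a small management script; nothing in the server process needs to schedule it.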
---
## Required Changes Before Phase 3 Implementation
### Phase 1 Changes
1. Update CodeStore to handle dict values with JSON serialization
2. Update CodeStore type hints to Union[str, dict]
### Phase 2 Changes
1. Add PKCE fields to authorization code metadata (even if empty)
2. Add 'used' field (always False initially)
3. Add created_at/expires_at as epoch integers
### Phase 3 Implementation Notes
1. Assume complete metadata structure from Phase 2
2. No defensive programming for token collisions
3. No automatic token cleanup
4. Explicit cache headers for OAuth compliance
---
## Design Updates
The original Phase 3 design document remains valid with these clarifications:
1. **Line 509**: Remove mark-as-used step, go directly to delete after generation
2. **Line 685**: Note that TOKEN_CLEANUP_* configs exist but aren't used in v1.0.0
3. **Line 1163**: Simplify single-use enforcement to check-and-delete
---
## Next Steps
1. Developer implements Phase 1 CodeStore changes
2. Developer updates Phase 2 authorization code structure
3. Developer proceeds with Phase 3 implementation using these clarifications
4. No further architectural review needed unless new issues arise
---
**ARCHITECTURAL CLARIFICATIONS COMPLETE**
All 8 questions have been answered with specific implementation guidance. The Developer can proceed with Phase 3 implementation immediately after making the minor updates to Phase 1 and Phase 2.
Remember: When in doubt, choose the simpler solution. We're building v1.0.0, not the perfect system.