

# Phase 3 Token Endpoint - Clarification Responses
**Date**: 2025-11-20
**Architect**: Claude (Architect Agent)
**Developer Questions**: 8 clarifications needed
**Status**: All questions answered
## Summary of Decisions
All 8 clarification questions have been addressed with clear, specific architectural decisions prioritizing simplicity. See ADR-0009 for formal documentation of these decisions.
## Question-by-Question Responses
### 1. Authorization Code Storage Format (CRITICAL) ✅
**Question**: Phase 1 CodeStore only accepts string values, but Phase 3 needs dict metadata. Should we modify CodeStore or handle serialization elsewhere?
**DECISION**: Modify CodeStore to accept dict values with internal JSON serialization.
**Implementation**:
```python
# Update CodeStore in Phase 1
import json
from typing import Optional, Union


def store(self, key: str, value: Union[str, dict], ttl: int = 600) -> None:
    """Store key-value pair. Value can be string or dict."""
    if isinstance(value, dict):
        value_to_store = json.dumps(value)
    else:
        value_to_store = value
    # ... rest of implementation


def get(self, key: str) -> Optional[Union[str, dict]]:
    """Get value. Returns dict if stored value is JSON."""
    # ... retrieve value
    try:
        return json.loads(value)
    except (json.JSONDecodeError, TypeError):
        return value
```
**Rationale**: Simplest approach that maintains backward compatibility while supporting Phase 2/3 needs.
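A minimal, self-contained sketch of the dual-type store (class shape, TTL handling, and `delete` are illustrative; the real Phase 1 CodeStore differs in detail):

```python
import json
import time
from typing import Optional, Union


class CodeStore:
    """In-memory key-value store accepting str or dict values (sketch)."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}

    def store(self, key: str, value: Union[str, dict], ttl: int = 600) -> None:
        # Dicts are serialized to JSON; plain strings pass through unchanged.
        serialized = json.dumps(value) if isinstance(value, dict) else value
        self._data[key] = (serialized, time.time() + ttl)

    def get(self, key: str) -> Optional[Union[str, dict]]:
        entry = self._data.get(key)
        if entry is None or time.time() > entry[1]:
            return None  # missing or expired
        raw = entry[0]
        try:
            return json.loads(raw)  # dicts round-trip through JSON
        except json.JSONDecodeError:
            return raw  # legacy string values come back unchanged

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
```

Note the backward-compatibility property: a caller that only ever stores strings sees no behavior change, because non-JSON strings fail `json.loads` and are returned as-is.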
---
### 2. Authorization Code Single-Use Marking ✅
**Question**: How to mark code as "used" before token generation? Calculate remaining TTL?
**DECISION**: Simplify - just check 'used' flag, then delete after successful generation. No marking.
**Implementation**:
```python
# Check if already used
if metadata.get('used'):
    raise HTTPException(400, {"error": "invalid_grant"})

# Generate token...

# Delete code after success (single-use enforcement)
code_storage.delete(code)
```
**Rationale**: Eliminates TTL calculation complexity and race condition concerns.
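The check-and-delete flow can be sketched end-to-end against a plain dict standing in for the code store (function and exception names are illustrative; the real endpoint raises `HTTPException` and generates tokens with `secrets.token_urlsafe`):

```python
class InvalidGrantError(Exception):
    """Stand-in for HTTPException(400, {"error": "invalid_grant"})."""


def exchange_code(code: str, code_storage: dict) -> str:
    metadata = code_storage.get(code)
    if metadata is None or metadata.get("used"):
        # Unknown, expired, or replayed code: same opaque error either way.
        raise InvalidGrantError()
    token = f"token-for-{metadata['me']}"  # placeholder for real generation
    # Delete only after successful generation: single-use enforcement.
    del code_storage[code]
    return token
```

A second exchange with the same code fails, which is exactly the replay behavior the decision calls for.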
---
### 3. Token Endpoint Error Response Format ✅
**Question**: Does FastAPI handle dict detail correctly? Need cache headers?
**DECISION**: FastAPI handles dict→JSON automatically. Add cache headers explicitly.
**Implementation**:
```python
@router.post("/token")
async def token_exchange(response: Response, ...):
    response.headers["Cache-Control"] = "no-store"
    response.headers["Pragma"] = "no-cache"
    # FastAPI HTTPException with dict detail works correctly
```
**Rationale**: Use framework capabilities, ensure OAuth compliance with explicit headers.
---
### 4. Phase 2/3 Authorization Code Structure ✅
**Question**: Will Phase 2 include PKCE fields? Should Phase 3 handle missing keys?
**DECISION**: Phase 2 MUST include all fields with defaults. Phase 3 assumes complete structure.
**Phase 2 Update Required**:
```python
code_data = {
    'client_id': client_id,
    'redirect_uri': redirect_uri,
    'state': state,
    'me': verified_email,
    'scope': scope,
    'code_challenge': code_challenge or "",  # Empty if not provided
    'code_challenge_method': code_challenge_method or "",
    'created_at': int(time.time()),
    'expires_at': int(time.time() + 600),
    'used': False,  # Always False initially
}
```
**Rationale**: Consistency within v1.0.0 is more important than backward compatibility.
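For reference, the S256 check the token endpoint would later run against these fields follows RFC 7636; the empty-string convention from the structure above means "client did not use PKCE" (function names here are illustrative):

```python
import base64
import hashlib


def s256_challenge(code_verifier: str) -> str:
    """BASE64URL(SHA256(verifier)) without padding, per RFC 7636."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def pkce_matches(code_verifier: str, code_challenge: str, method: str) -> bool:
    if not code_challenge:  # empty string: client did not use PKCE
        return True
    if method == "S256":
        return s256_challenge(code_verifier) == code_challenge
    return code_verifier == code_challenge  # "plain" fallback
```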
---
### 5. Database Connection Pattern ✅
**Question**: Does get_connection() auto-commit or need explicit commit?
**DECISION**: Explicit commit required (Phase 1 pattern).
**Implementation**:
```python
with self.database.get_connection() as conn:
    conn.execute(query, params)
    conn.commit()  # Required
```
**Rationale**: Matches SQLite default behavior and Phase 1 implementation.
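The consequence of skipping the commit is easy to demonstrate with the standard-library `sqlite3` module directly (assuming the project's `get_connection()` wrapper hands back an ordinary connection without autocommit): an uncommitted INSERT is rolled back when the connection closes.

```python
import os
import sqlite3
import tempfile

# A throwaway on-disk database so a second connection can observe what persisted.
db_path = os.path.join(tempfile.mkdtemp(), "tokens.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE tokens (token_hash TEXT PRIMARY KEY, me TEXT)")
conn.commit()

# INSERT without commit: rolled back when the connection closes.
conn.execute("INSERT INTO tokens VALUES ('h1', 'https://a.example')")
conn.close()

conn = sqlite3.connect(db_path)
rows_without_commit = conn.execute("SELECT token_hash FROM tokens").fetchall()

# INSERT with explicit commit: persists across connections.
conn.execute("INSERT INTO tokens VALUES ('h2', 'https://b.example')")
conn.commit()
conn.close()

conn = sqlite3.connect(db_path)
rows_with_commit = conn.execute("SELECT token_hash FROM tokens").fetchall()
conn.close()
```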
---
### 6. Token Hash Collision Handling ✅
**Question**: Should we handle UNIQUE constraint violations defensively?
**DECISION**: NO defensive handling. Let it fail catastrophically.
**Implementation**:
```python
# No try/except for UNIQUE constraint
# If 2^256 collision occurs, something is fundamentally broken
conn.execute("INSERT INTO tokens ...", params)
conn.commit()
# Let any IntegrityError propagate
```
**Rationale**: With 2^256 entropy, collision indicates fundamental system failure. Retrying won't help.
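The generation-and-hashing scheme that makes collisions this implausible can be sketched as follows (column names and helper names are illustrative; only the hash ever reaches the database):

```python
import hashlib
import secrets


def generate_token() -> tuple[str, str]:
    """Return (plaintext_token, sha256_hash). Only the hash is stored."""
    token = secrets.token_urlsafe(32)  # 32 bytes = 256 bits of entropy
    token_hash = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, token_hash


def hash_for_lookup(presented_token: str) -> str:
    # Validation hashes the presented token and looks it up by hash,
    # so plaintext never touches the database.
    return hashlib.sha256(presented_token.encode("ascii")).hexdigest()
```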
---
### 7. Logging Token Validation ✅
**Question**: What logging levels for token operations?
**DECISION**: Adopt Developer's suggestion:
- DEBUG: Successful validations (high volume)
- INFO: Token generation (important events)
- WARNING: Validation failures (potential issues)
**Implementation**:
```python
# Success (frequent, not interesting)
logger.debug(f"Token validated successfully (me: {token_data['me']})")
# Generation (important)
logger.info(f"Token generated for {me} (client: {client_id})")
# Failure (potential attack/misconfiguration)
logger.warning(f"Token validation failed: {reason}")
```
**Rationale**: Appropriate visibility without log flooding.
---
### 8. Token Cleanup Configuration ✅
**Question**: Should cleanup_expired_tokens() be called automatically?
**DECISION**: Manual/cron only for v1.0.0. No automatic calling.
**Implementation**:
```python
# Utility method only
def cleanup_expired_tokens(self) -> int:
    """Delete expired tokens. Call manually or via cron."""
    # Implementation as designed

# Config vars exist but unused in v1.0.0:
#   TOKEN_CLEANUP_ENABLED (ignored)
#   TOKEN_CLEANUP_INTERVAL (ignored)
```
**Rationale**: Simplicity for v1.0.0 MVP. Small scale doesn't need automatic cleanup.
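A working sketch of that utility against SQLite (table and column names assumed from the decisions above; the real method lives on the token service and uses `get_connection()`):

```python
import sqlite3
import time


def cleanup_expired_tokens(conn: sqlite3.Connection) -> int:
    """Delete expired tokens; return the number removed. Call manually or via cron."""
    cur = conn.execute(
        "DELETE FROM tokens WHERE expires_at < ?", (int(time.time()),)
    )
    conn.commit()  # explicit commit, per the Phase 1 pattern
    return cur.rowcount
```

Wired to cron, this would be a one-line entry invoking a small management script; nothing in the server process needs to schedule it.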
---
## Required Changes Before Phase 3 Implementation
### Phase 1 Changes
1. Update CodeStore to handle dict values with JSON serialization
2. Update CodeStore type hints to Union[str, dict]
### Phase 2 Changes
1. Add PKCE fields to authorization code metadata (even if empty)
2. Add 'used' field (always False initially)
3. Add created_at/expires_at as epoch integers
### Phase 3 Implementation Notes
1. Assume complete metadata structure from Phase 2
2. No defensive programming for token collisions
3. No automatic token cleanup
4. Explicit cache headers for OAuth compliance
---
## Design Updates
The original Phase 3 design document remains valid with these clarifications:
1. **Line 509**: Remove mark-as-used step, go directly to delete after generation
2. **Line 685**: Note that TOKEN_CLEANUP_* configs exist but aren't used in v1.0.0
3. **Line 1163**: Simplify single-use enforcement to check-and-delete
---
## Next Steps
1. Developer implements Phase 1 CodeStore changes
2. Developer updates Phase 2 authorization code structure
3. Developer proceeds with Phase 3 implementation using these clarifications
4. No further architectural review needed unless new issues arise
---
**ARCHITECTURAL CLARIFICATIONS COMPLETE**
All 8 questions have been answered with specific implementation guidance. The Developer can proceed with Phase 3 implementation immediately after making the minor updates to Phase 1 and Phase 2.
Remember: When in doubt, choose the simpler solution. We're building v1.0.0, not the perfect system.