Gondulf/docs/decisions/ADR-003-pkce-deferred-to-v1-1-0.md

# ADR-003: PKCE Support Deferred to v1.1.0

Date: 2025-11-20

## Status
Accepted

## Context

PKCE (Proof Key for Code Exchange, RFC 7636) is a security extension to OAuth 2.0 that protects against authorization code interception attacks. It works by having the client generate a random secret, send a hash of it during the authorization request, and prove knowledge of the original secret during token exchange.

### PKCE Security Benefits
1. **Code Interception Protection**: Even if an attacker intercepts the authorization code, they cannot exchange it for a token without the code_verifier
2. **Public Client Security**: Essential for native apps and SPAs where client secrets cannot be stored securely
3. **Best Practice**: OAuth 2.0 Security Best Practices (draft-ietf-oauth-security-topics) recommends PKCE for all clients

### W3C IndieAuth Specification
The current W3C IndieAuth specification (published January 2018) does not mention PKCE. However, PKCE has become a standard security measure in modern OAuth 2.0 implementations since then.

### v1.0.0 Constraints
For the MVP release, we face a simplicity vs. security trade-off:
- Target users: 10s of users initially
- Deployment: Single-process Docker container
- Timeline: 6-8 weeks to v1.0.0
- Focus: Prove core authentication functionality

### PKCE Implementation Effort
**Estimated effort**: 1-2 days

**Required changes**:
1. Accept `code_challenge` and `code_challenge_method` parameters in /authorize endpoint
2. Store code challenge with authorization code
3. Accept `code_verifier` parameter in /token endpoint
4. Validate code_verifier hashes to stored challenge using S256 method
5. Update metadata endpoint to advertise PKCE support
6. Add comprehensive tests

**Complexity**:
- Low technical complexity (straightforward hashing and comparison)
- Well-documented in RFC 7636
- Standard library support (hashlib for SHA-256)

## Decision

**PKCE support is deferred to v1.1.0 and will NOT be included in v1.0.0.**

### Rationale

**Simplicity Over Complexity**:
- v1.0.0 is an MVP focused on proving core authentication functionality
- Every additional feature increases risk and development time
- PKCE adds security but is not required for the W3C IndieAuth specification compliance
- Deferring PKCE reduces v1.0.0 scope without compromising compliance

**Acceptable Risk for MVP**:
- Target deployment: Small scale (10s of users), controlled environment
- Mitigation: HTTPS enforcement + short code lifetime (10 minutes) + single-use codes
- Risk window: 10-minute authorization code lifetime only
- Attack complexity: Requires MITM during specific 10-minute window despite HTTPS

**Clear Upgrade Path**:
- PKCE can be added in v1.1.0 without breaking changes
- Implementation is well-understood and straightforward
- Clients that don't use PKCE will continue working (backward compatible)

**Development Focus**:
- Free up 1-2 days of effort for core functionality
- Reduce testing surface area for MVP
- Simplify initial security review
- Gather real-world usage data before adding complexity

## Consequences

### Positive Consequences

1. **Faster Time to Market**: v1.0.0 ships 1-2 days earlier
2. **Reduced Complexity**: Fewer parameters to validate, fewer edge cases
3. **Simpler Testing**: Smaller test surface area for initial release
4. **Focus**: Development effort concentrated on core authentication flow
5. **Learning Opportunity**: Real-world usage informs PKCE implementation in v1.1.0

### Negative Consequences

1. **Slightly Reduced Security**: Authorization codes vulnerable to interception (mitigated by HTTPS)
2. **Not Best Practice**: Modern OAuth 2.0 guidance recommends PKCE for all flows
3. **Client Compatibility**: Clients requiring PKCE cannot use v1.0.0 (upgrade to v1.1.0)
4. **Perception**: Security-conscious users may view absence as weakness

### Mitigation Strategies

**For v1.0.0 (Without PKCE)**:

1. **Enforce HTTPS**: Strictly enforce HTTPS in production (mitigates interception)
   ```python
   if not DEBUG and request.url.scheme != 'https':
       raise HTTPException(status_code=400, detail="HTTPS required")
   ```

2. **Short Code Lifetime**: 10-minute maximum (per W3C spec)
   ```python
   CODE_EXPIRATION = timedelta(minutes=10)  # Minimize attack window
   ```

3. **Single-Use Codes**: Immediately invalidate after use (detect replay attacks)
   ```python
   if code_data.get('used'):
       logger.error(f"Code replay attack detected: {code[:8]}...")
       raise HTTPException(status_code=400, detail="invalid_grant")
   ```

4. **Documentation**: Clearly document PKCE absence and planned support
   - README security section
   - Release notes
   - Roadmap (v1.1.0 feature)

5. **Logging**: Monitor for potential code interception attempts
   ```python
   if code_expired:
       logger.warning(f"Expired code presented: {code[:8]}... (potential attack)")
   ```

**For v1.1.0 (Adding PKCE)**:

1. **Backward Compatibility**: PKCE optional, not required
   - Clients without PKCE continue working
   - Clients with PKCE get enhanced security
   - Gradual migration path

2. **Client Detection**: Detect PKCE capability and encourage usage
   ```python
   if code_challenge is None:
       logger.info(f"Client {client_id} not using PKCE (consider upgrading)")
   ```

3. **Future Enforcement**: Option to require PKCE in configuration
   ```python
   if config.REQUIRE_PKCE and not code_challenge:
       raise HTTPException(status_code=400, detail="PKCE required")
   ```

### Implementation Plan for v1.1.0

**Effort**: 1-2 days

**Changes Required**:

1. **Authorization Endpoint** (`/authorize`):
   ```python
   class AuthorizeRequest(BaseModel):
       # ... existing fields ...
       code_challenge: Optional[str] = None
       code_challenge_method: Optional[Literal["S256"]] = None

   # Validation
   if code_challenge and code_challenge_method != "S256":
       raise HTTPException(400, "Only S256 challenge method supported")

   # Store challenge with code
   code_data = {
       # ... existing data ...
       "code_challenge": code_challenge,
       "code_challenge_method": code_challenge_method
   }
   ```

2. **Token Endpoint** (`/token`):
   ```python
   class TokenRequest(BaseModel):
       # ... existing fields ...
       code_verifier: Optional[str] = None

   # Validate PKCE
   if code_data.get('code_challenge'):
       if not code_verifier:
           raise HTTPException(400, "code_verifier required")

       # Verify S256(code_verifier) == code_challenge
       import hashlib
       import base64
       computed = base64.urlsafe_b64encode(
           hashlib.sha256(code_verifier.encode()).digest()
       ).decode().rstrip('=')

       if not secrets.compare_digest(computed, code_data['code_challenge']):
           raise HTTPException(400, "Invalid code_verifier")
   ```

3. **Metadata Endpoint**:
   ```json
   {
       "code_challenge_methods_supported": ["S256"]
   }
   ```

4. **Tests**:
   - PKCE flow success cases
   - Invalid code_verifier rejection
   - Missing code_verifier when challenge present
   - Backward compatibility (no PKCE)

### Risk Assessment

**Attack Scenario Without PKCE**:
1. Attacker performs MITM attack on authorization redirect (despite HTTPS)
2. Attacker intercepts authorization code
3. Attacker exchanges code for token (within 10-minute window)
4. Attacker uses token to impersonate user

**Likelihood**: Very Low
- Requires MITM capability (difficult with proper TLS)
- Requires attack during specific 10-minute window
- Requires client_id and redirect_uri knowledge

**Impact**: High
- Complete account impersonation
- Access to user's identity

**Risk Level**: **Low** (Very Low likelihood × High impact = Low overall risk)

**Acceptable for MVP?**: Yes
- Controlled deployment (small user base)
- Proper TLS mitigates primary attack vector
- Short code lifetime limits exposure window
- Clear upgrade path to full PKCE in v1.1.0

### Monitoring and Review

**v1.0.0 Deployment**:
- Monitor logs for expired code presentations (potential interception attempts)
- Track time between code generation and redemption
- Document any security concerns from real-world usage

**v1.1.0 Planning**:
- Review security logs from v1.0.0 deployment
- Prioritize PKCE based on actual risk observed
- Implement with lessons learned from v1.0.0

## References

- RFC 7636 - PKCE: https://datatracker.ietf.org/doc/html/rfc7636
- OAuth 2.0 Security Best Practices: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics
- W3C IndieAuth Specification: https://www.w3.org/TR/indieauth/
- OWASP OAuth 2.0 Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/OAuth2_Cheat_Sheet.html

## Alternatives Considered

### Alternative 1: Implement PKCE in v1.0.0

**Pros**:
- Better security from day one
- Follows modern OAuth 2.0 best practices
- No perceived security weakness

**Cons**:
- Adds 1-2 days to development timeline
- Increases testing surface area
- More complexity for MVP
- Not required for W3C spec compliance

**Rejected**: Violates simplicity principle for MVP.

### Alternative 2: Make PKCE Required (not optional) in v1.1.0

**Pros**:
- Forces clients to adopt best security practices
- Simpler server logic (no conditional handling)

**Cons**:
- Breaking change for clients
- Not backward compatible
- W3C spec doesn't require PKCE

**Rejected**: Breaks compatibility, not required by spec.

### Alternative 3: Never Implement PKCE

**Pros**:
- Permanent simplicity
- No additional development

**Cons**:
- Permanent security weakness
- Not following industry best practices
- May limit adoption by security-conscious users

**Rejected**: Security should improve over time, not stagnate.

## Decision History

- 2025-11-20: Proposed (Architect)
- 2025-11-20: Accepted (Architect)
- TBD: Implementation in v1.1.0