Files
Gondulf/docs/decisions/ADR-003-pkce-deferred-to-v1-1-0.md
Phil Skentelbery bebd47955f feat(core): implement Phase 1 foundation infrastructure
Implements Phase 1 Foundation with all core services:

Core Components:
- Configuration management with GONDULF_ environment variables
- Database layer with SQLAlchemy and migration system
- In-memory code storage with TTL support
- Email service with SMTP and TLS support (STARTTLS + implicit TLS)
- DNS service with TXT record verification
- Structured logging with Python standard logging
- FastAPI application with health check endpoint

Database Schema:
- authorization_codes table for OAuth 2.0 authorization codes
- domains table for domain verification
- migrations table for tracking schema versions
- Simple sequential migration system (001_initial_schema.sql)

Configuration:
- Environment-based configuration with validation
- .env.example template with all GONDULF_ variables
- Fail-fast validation on startup
- Sensible defaults for optional settings

Testing:
- 96 comprehensive tests (77 unit, 5 integration)
- 94.16% code coverage (exceeds 80% requirement)
- All tests passing
- Test coverage includes:
  - Configuration loading and validation
  - Database migrations and health checks
  - In-memory storage with expiration
  - Email service (STARTTLS, implicit TLS, authentication)
  - DNS service (TXT records, domain verification)
  - Health check endpoint integration

Documentation:
- Implementation report with test results
- Phase 1 clarifications document
- ADRs for key decisions (config, database, email, logging)

Technical Details:
- Python 3.10+ with type hints
- SQLite with configurable database URL
- System DNS with public DNS fallback
- Port-based TLS detection (465=SSL, 587=STARTTLS)
- Lazy configuration loading for testability

Exit Criteria Met:
✓ All foundation services implemented
✓ Application starts without errors
✓ Health check endpoint operational
✓ Database migrations working
✓ Test coverage exceeds 80%
✓ All tests passing

Ready for Architect review and Phase 2 development.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 12:21:42 -07:00

293 lines
9.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# ADR-003: PKCE Support Deferred to v1.1.0
Date: 2025-11-20
## Status
Accepted
## Context
PKCE (Proof Key for Code Exchange, RFC 7636) is a security extension to OAuth 2.0 that protects against authorization code interception attacks. It works by having the client generate a random secret, send a hash of it during the authorization request, and prove knowledge of the original secret during token exchange.
### PKCE Security Benefits
1. **Code Interception Protection**: Even if an attacker intercepts the authorization code, they cannot exchange it for a token without the code_verifier
2. **Public Client Security**: Essential for native apps and SPAs where client secrets cannot be stored securely
3. **Best Practice**: OAuth 2.0 Security Best Practices (draft-ietf-oauth-security-topics) recommends PKCE for all clients
### W3C IndieAuth Specification
The current W3C IndieAuth specification (published January 2018) does not mention PKCE. However, PKCE has become a standard security measure in modern OAuth 2.0 implementations since then.
### v1.0.0 Constraints
For the MVP release, we face a simplicity vs. security trade-off:
- Target users: 10s of users initially
- Deployment: Single-process Docker container
- Timeline: 6-8 weeks to v1.0.0
- Focus: Prove core authentication functionality
### PKCE Implementation Effort
**Estimated effort**: 1-2 days
**Required changes**:
1. Accept `code_challenge` and `code_challenge_method` parameters in /authorize endpoint
2. Store code challenge with authorization code
3. Accept `code_verifier` parameter in /token endpoint
4. Validate code_verifier hashes to stored challenge using S256 method
5. Update metadata endpoint to advertise PKCE support
6. Add comprehensive tests
**Complexity**:
- Low technical complexity (straightforward hashing and comparison)
- Well-documented in RFC 7636
- Standard library support (hashlib for SHA-256)
## Decision
**PKCE support is deferred to v1.1.0 and will NOT be included in v1.0.0.**
### Rationale
**Simplicity Over Complexity**:
- v1.0.0 is an MVP focused on proving core authentication functionality
- Every additional feature increases risk and development time
- PKCE adds security but is not required for the W3C IndieAuth specification compliance
- Deferring PKCE reduces v1.0.0 scope without compromising compliance
**Acceptable Risk for MVP**:
- Target deployment: Small scale (10s of users), controlled environment
- Mitigation: HTTPS enforcement + short code lifetime (10 minutes) + single-use codes
- Risk window: 10-minute authorization code lifetime only
- Attack complexity: Requires MITM during specific 10-minute window despite HTTPS
**Clear Upgrade Path**:
- PKCE can be added in v1.1.0 without breaking changes
- Implementation is well-understood and straightforward
- Clients that don't use PKCE will continue working (backward compatible)
**Development Focus**:
- Free up 1-2 days of effort for core functionality
- Reduce testing surface area for MVP
- Simplify initial security review
- Gather real-world usage data before adding complexity
## Consequences
### Positive Consequences
1. **Faster Time to Market**: v1.0.0 ships 1-2 days earlier
2. **Reduced Complexity**: Fewer parameters to validate, fewer edge cases
3. **Simpler Testing**: Smaller test surface area for initial release
4. **Focus**: Development effort concentrated on core authentication flow
5. **Learning Opportunity**: Real-world usage informs PKCE implementation in v1.1.0
### Negative Consequences
1. **Slightly Reduced Security**: Authorization codes vulnerable to interception (mitigated by HTTPS)
2. **Not Best Practice**: Modern OAuth 2.0 guidance recommends PKCE for all flows
3. **Client Compatibility**: Clients requiring PKCE cannot use v1.0.0 (upgrade to v1.1.0)
4. **Perception**: Security-conscious users may view absence as weakness
### Mitigation Strategies
**For v1.0.0 (Without PKCE)**:
1. **Enforce HTTPS**: Strictly enforce HTTPS in production (mitigates interception)
```python
if not DEBUG and request.url.scheme != 'https':
raise HTTPException(status_code=400, detail="HTTPS required")
```
2. **Short Code Lifetime**: 10-minute maximum (per W3C spec)
```python
CODE_EXPIRATION = timedelta(minutes=10) # Minimize attack window
```
3. **Single-Use Codes**: Immediately invalidate after use (detect replay attacks)
```python
if code_data.get('used'):
logger.error(f"Code replay attack detected: {code[:8]}...")
raise HTTPException(status_code=400, detail="invalid_grant")
```
4. **Documentation**: Clearly document PKCE absence and planned support
- README security section
- Release notes
- Roadmap (v1.1.0 feature)
5. **Logging**: Monitor for potential code interception attempts
```python
if code_expired:
logger.warning(f"Expired code presented: {code[:8]}... (potential attack)")
```
**For v1.1.0 (Adding PKCE)**:
1. **Backward Compatibility**: PKCE optional, not required
- Clients without PKCE continue working
- Clients with PKCE get enhanced security
- Gradual migration path
2. **Client Detection**: Detect PKCE capability and encourage usage
```python
if code_challenge is None:
logger.info(f"Client {client_id} not using PKCE (consider upgrading)")
```
3. **Future Enforcement**: Option to require PKCE in configuration
```python
if config.REQUIRE_PKCE and not code_challenge:
raise HTTPException(status_code=400, detail="PKCE required")
```
### Implementation Plan for v1.1.0
**Effort**: 1-2 days
**Changes Required**:
1. **Authorization Endpoint** (`/authorize`):
```python
class AuthorizeRequest(BaseModel):
# ... existing fields ...
code_challenge: Optional[str] = None
code_challenge_method: Optional[Literal["S256"]] = None
# Validation
if code_challenge and code_challenge_method != "S256":
raise HTTPException(400, "Only S256 challenge method supported")
# Store challenge with code
code_data = {
# ... existing data ...
"code_challenge": code_challenge,
"code_challenge_method": code_challenge_method
}
```
2. **Token Endpoint** (`/token`):
```python
class TokenRequest(BaseModel):
# ... existing fields ...
code_verifier: Optional[str] = None
# Validate PKCE
if code_data.get('code_challenge'):
if not code_verifier:
raise HTTPException(400, "code_verifier required")
# Verify S256(code_verifier) == code_challenge
import hashlib
import base64
computed = base64.urlsafe_b64encode(
hashlib.sha256(code_verifier.encode()).digest()
).decode().rstrip('=')
if not secrets.compare_digest(computed, code_data['code_challenge']):
raise HTTPException(400, "Invalid code_verifier")
```
3. **Metadata Endpoint**:
```json
{
"code_challenge_methods_supported": ["S256"]
}
```
4. **Tests**:
- PKCE flow success cases
- Invalid code_verifier rejection
- Missing code_verifier when challenge present
- Backward compatibility (no PKCE)
### Risk Assessment
**Attack Scenario Without PKCE**:
1. Attacker performs MITM attack on authorization redirect (despite HTTPS)
2. Attacker intercepts authorization code
3. Attacker exchanges code for token (within 10-minute window)
4. Attacker uses token to impersonate user
**Likelihood**: Very Low
- Requires MITM capability (difficult with proper TLS)
- Requires attack during specific 10-minute window
- Requires client_id and redirect_uri knowledge
**Impact**: High
- Complete account impersonation
- Access to user's identity
**Risk Level**: **Low** (Very Low likelihood × High impact = Low overall risk)
**Acceptable for MVP?**: Yes
- Controlled deployment (small user base)
- Proper TLS mitigates primary attack vector
- Short code lifetime limits exposure window
- Clear upgrade path to full PKCE in v1.1.0
### Monitoring and Review
**v1.0.0 Deployment**:
- Monitor logs for expired code presentations (potential interception attempts)
- Track time between code generation and redemption
- Document any security concerns from real-world usage
**v1.1.0 Planning**:
- Review security logs from v1.0.0 deployment
- Prioritize PKCE based on actual risk observed
- Implement with lessons learned from v1.0.0
## References
- RFC 7636 - PKCE: https://datatracker.ietf.org/doc/html/rfc7636
- OAuth 2.0 Security Best Practices: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics
- W3C IndieAuth Specification: https://www.w3.org/TR/indieauth/
- OWASP OAuth 2.0 Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/OAuth2_Cheat_Sheet.html
## Alternatives Considered
### Alternative 1: Implement PKCE in v1.0.0
**Pros**:
- Better security from day one
- Follows modern OAuth 2.0 best practices
- No perceived security weakness
**Cons**:
- Adds 1-2 days to development timeline
- Increases testing surface area
- More complexity for MVP
- Not required for W3C spec compliance
**Rejected**: Violates simplicity principle for MVP.
### Alternative 2: Make PKCE Required (not optional) in v1.1.0
**Pros**:
- Forces clients to adopt best security practices
- Simpler server logic (no conditional handling)
**Cons**:
- Breaking change for clients
- Not backward compatible
- W3C spec doesn't require PKCE
**Rejected**: Breaks compatibility, not required by spec.
### Alternative 3: Never Implement PKCE
**Pros**:
- Permanent simplicity
- No additional development
**Cons**:
- Permanent security weakness
- Not following industry best practices
- May limit adoption by security-conscious users
**Rejected**: Security should improve over time, not stagnate.
## Decision History
- 2025-11-20: Proposed (Architect)
- 2025-11-20: Accepted (Architect)
- TBD: Implementation in v1.1.0