Files
Gondulf/docs/decisions/ADR-003-pkce-deferred-to-v1-1-0.md
Phil Skentelbery bebd47955f feat(core): implement Phase 1 foundation infrastructure
Implements Phase 1 Foundation with all core services:

Core Components:
- Configuration management with GONDULF_ environment variables
- Database layer with SQLAlchemy and migration system
- In-memory code storage with TTL support
- Email service with SMTP and TLS support (STARTTLS + implicit TLS)
- DNS service with TXT record verification
- Structured logging with Python standard logging
- FastAPI application with health check endpoint

Database Schema:
- authorization_codes table for OAuth 2.0 authorization codes
- domains table for domain verification
- migrations table for tracking schema versions
- Simple sequential migration system (001_initial_schema.sql)

Configuration:
- Environment-based configuration with validation
- .env.example template with all GONDULF_ variables
- Fail-fast validation on startup
- Sensible defaults for optional settings

Testing:
- 96 comprehensive tests (77 unit, 5 integration)
- 94.16% code coverage (exceeds 80% requirement)
- All tests passing
- Test coverage includes:
  - Configuration loading and validation
  - Database migrations and health checks
  - In-memory storage with expiration
  - Email service (STARTTLS, implicit TLS, authentication)
  - DNS service (TXT records, domain verification)
  - Health check endpoint integration

Documentation:
- Implementation report with test results
- Phase 1 clarifications document
- ADRs for key decisions (config, database, email, logging)

Technical Details:
- Python 3.10+ with type hints
- SQLite with configurable database URL
- System DNS with public DNS fallback
- Port-based TLS detection (465=SSL, 587=STARTTLS)
- Lazy configuration loading for testability

Exit Criteria Met:
✓ All foundation services implemented
✓ Application starts without errors
✓ Health check endpoint operational
✓ Database migrations working
✓ Test coverage exceeds 80%
✓ All tests passing

Ready for Architect review and Phase 2 development.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 12:21:42 -07:00

9.6 KiB
Raw Blame History

ADR-003: PKCE Support Deferred to v1.1.0

Date: 2025-11-20

Status

Accepted

Context

PKCE (Proof Key for Code Exchange, RFC 7636) is a security extension to OAuth 2.0 that protects against authorization code interception attacks. It works by having the client generate a random secret, send a hash of it during the authorization request, and prove knowledge of the original secret during token exchange.

PKCE Security Benefits

  1. Code Interception Protection: Even if an attacker intercepts the authorization code, they cannot exchange it for a token without the code_verifier
  2. Public Client Security: Essential for native apps and SPAs where client secrets cannot be stored securely
  3. Best Practice: OAuth 2.0 Security Best Practices (draft-ietf-oauth-security-topics) recommends PKCE for all clients

W3C IndieAuth Specification

The current W3C IndieAuth specification (published January 2018) does not mention PKCE. However, PKCE has become a standard security measure in modern OAuth 2.0 implementations since then.

v1.0.0 Constraints

For the MVP release, we face a simplicity vs. security trade-off:

  • Target users: 10s of users initially
  • Deployment: Single-process Docker container
  • Timeline: 6-8 weeks to v1.0.0
  • Focus: Prove core authentication functionality

PKCE Implementation Effort

Estimated effort: 1-2 days

Required changes:

  1. Accept code_challenge and code_challenge_method parameters in /authorize endpoint
  2. Store code challenge with authorization code
  3. Accept code_verifier parameter in /token endpoint
  4. Validate code_verifier hashes to stored challenge using S256 method
  5. Update metadata endpoint to advertise PKCE support
  6. Add comprehensive tests

Complexity:

  • Low technical complexity (straightforward hashing and comparison)
  • Well-documented in RFC 7636
  • Standard library support (hashlib for SHA-256)

Decision

PKCE support is deferred to v1.1.0 and will NOT be included in v1.0.0.

Rationale

Simplicity Over Complexity:

  • v1.0.0 is an MVP focused on proving core authentication functionality
  • Every additional feature increases risk and development time
  • PKCE adds security but is not required for the W3C IndieAuth specification compliance
  • Deferring PKCE reduces v1.0.0 scope without compromising compliance

Acceptable Risk for MVP:

  • Target deployment: Small scale (10s of users), controlled environment
  • Mitigation: HTTPS enforcement + short code lifetime (10 minutes) + single-use codes
  • Risk window: 10-minute authorization code lifetime only
  • Attack complexity: Requires MITM during specific 10-minute window despite HTTPS

Clear Upgrade Path:

  • PKCE can be added in v1.1.0 without breaking changes
  • Implementation is well-understood and straightforward
  • Clients that don't use PKCE will continue working (backward compatible)

Development Focus:

  • Free up 1-2 days of effort for core functionality
  • Reduce testing surface area for MVP
  • Simplify initial security review
  • Gather real-world usage data before adding complexity

Consequences

Positive Consequences

  1. Faster Time to Market: v1.0.0 ships 1-2 days earlier
  2. Reduced Complexity: Fewer parameters to validate, fewer edge cases
  3. Simpler Testing: Smaller test surface area for initial release
  4. Focus: Development effort concentrated on core authentication flow
  5. Learning Opportunity: Real-world usage informs PKCE implementation in v1.1.0

Negative Consequences

  1. Slightly Reduced Security: Authorization codes vulnerable to interception (mitigated by HTTPS)
  2. Not Best Practice: Modern OAuth 2.0 guidance recommends PKCE for all flows
  3. Client Compatibility: Clients requiring PKCE cannot use v1.0.0 (upgrade to v1.1.0)
  4. Perception: Security-conscious users may view absence as weakness

Mitigation Strategies

For v1.0.0 (Without PKCE):

  1. Enforce HTTPS: Strictly enforce HTTPS in production (mitigates interception)

    if not DEBUG and request.url.scheme != 'https':
        raise HTTPException(status_code=400, detail="HTTPS required")
    
  2. Short Code Lifetime: 10-minute maximum (per W3C spec)

    CODE_EXPIRATION = timedelta(minutes=10)  # Minimize attack window
    
  3. Single-Use Codes: Immediately invalidate after use (detect replay attacks)

    if code_data.get('used'):
        logger.error(f"Code replay attack detected: {code[:8]}...")
        raise HTTPException(status_code=400, detail="invalid_grant")
    
  4. Documentation: Clearly document PKCE absence and planned support

    • README security section
    • Release notes
    • Roadmap (v1.1.0 feature)
  5. Logging: Monitor for potential code interception attempts

    if code_expired:
        logger.warning(f"Expired code presented: {code[:8]}... (potential attack)")
    

For v1.1.0 (Adding PKCE):

  1. Backward Compatibility: PKCE optional, not required

    • Clients without PKCE continue working
    • Clients with PKCE get enhanced security
    • Gradual migration path
  2. Client Detection: Detect PKCE capability and encourage usage

    if code_challenge is None:
        logger.info(f"Client {client_id} not using PKCE (consider upgrading)")
    
  3. Future Enforcement: Option to require PKCE in configuration

    if config.REQUIRE_PKCE and not code_challenge:
        raise HTTPException(status_code=400, detail="PKCE required")
    

Implementation Plan for v1.1.0

Effort: 1-2 days

Changes Required:

  1. Authorization Endpoint (/authorize):

    class AuthorizeRequest(BaseModel):
        # ... existing fields ...
        code_challenge: Optional[str] = None
        code_challenge_method: Optional[Literal["S256"]] = None
    
    # Validation
    if code_challenge and code_challenge_method != "S256":
        raise HTTPException(400, "Only S256 challenge method supported")
    
    # Store challenge with code
    code_data = {
        # ... existing data ...
        "code_challenge": code_challenge,
        "code_challenge_method": code_challenge_method
    }
    
  2. Token Endpoint (/token):

    class TokenRequest(BaseModel):
        # ... existing fields ...
        code_verifier: Optional[str] = None
    
    # Validate PKCE
    if code_data.get('code_challenge'):
        if not code_verifier:
            raise HTTPException(400, "code_verifier required")
    
        # Verify S256(code_verifier) == code_challenge
        import hashlib
        import base64
        computed = base64.urlsafe_b64encode(
            hashlib.sha256(code_verifier.encode()).digest()
        ).decode().rstrip('=')
    
        if not secrets.compare_digest(computed, code_data['code_challenge']):
            raise HTTPException(400, "Invalid code_verifier")
    
  3. Metadata Endpoint:

    {
        "code_challenge_methods_supported": ["S256"]
    }
    
  4. Tests:

    • PKCE flow success cases
    • Invalid code_verifier rejection
    • Missing code_verifier when challenge present
    • Backward compatibility (no PKCE)

Risk Assessment

Attack Scenario Without PKCE:

  1. Attacker performs MITM attack on authorization redirect (despite HTTPS)
  2. Attacker intercepts authorization code
  3. Attacker exchanges code for token (within 10-minute window)
  4. Attacker uses token to impersonate user

Likelihood: Very Low

  • Requires MITM capability (difficult with proper TLS)
  • Requires attack during specific 10-minute window
  • Requires client_id and redirect_uri knowledge

Impact: High

  • Complete account impersonation
  • Access to user's identity

Risk Level: Low (Very Low likelihood × High impact = Low overall risk)

Acceptable for MVP?: Yes

  • Controlled deployment (small user base)
  • Proper TLS mitigates primary attack vector
  • Short code lifetime limits exposure window
  • Clear upgrade path to full PKCE in v1.1.0

Monitoring and Review

v1.0.0 Deployment:

  • Monitor logs for expired code presentations (potential interception attempts)
  • Track time between code generation and redemption
  • Document any security concerns from real-world usage

v1.1.0 Planning:

  • Review security logs from v1.0.0 deployment
  • Prioritize PKCE based on actual risk observed
  • Implement with lessons learned from v1.0.0

References

Alternatives Considered

Alternative 1: Implement PKCE in v1.0.0

Pros:

  • Better security from day one
  • Follows modern OAuth 2.0 best practices
  • No perceived security weakness

Cons:

  • Adds 1-2 days to development timeline
  • Increases testing surface area
  • More complexity for MVP
  • Not required for W3C spec compliance

Rejected: Violates simplicity principle for MVP.

Alternative 2: Make PKCE Required (not optional) in v1.1.0

Pros:

  • Forces clients to adopt best security practices
  • Simpler server logic (no conditional handling)

Cons:

  • Breaking change for clients
  • Not backward compatible
  • W3C spec doesn't require PKCE

Rejected: Breaks compatibility, not required by spec.

Alternative 3: Never Implement PKCE

Pros:

  • Permanent simplicity
  • No additional development

Cons:

  • Permanent security weakness
  • Not following industry best practices
  • May limit adoption by security-conscious users

Rejected: Security should improve over time, not stagnate.

Decision History

  • 2025-11-20: Proposed (Architect)
  • 2025-11-20: Accepted (Architect)
  • TBD: Implementation in v1.1.0