fix(auth): make PKCE optional per ADR-003

PKCE was incorrectly required in the /authorize endpoint, contradicting
ADR-003, which defers PKCE to v1.1.0.

Changes:
- PKCE parameters are now optional in /authorize
- If code_challenge provided, validates method is S256
- Defaults to S256 if method not specified
- Logs when clients don't use PKCE for monitoring
- Updated tests for optional PKCE behavior

This fixes authentication for clients that don't implement PKCE.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

docs/roadmap/v1.1.0.md
Normal file
@@ -0,0 +1,232 @@
# v1.1.0 Release Plan: Security & Production Hardening

**Status**: Planning
**Target Release**: Q1 2026
**Duration**: 3-4 weeks (12-18 days)
**Theme**: Mixed approach - 30% technical debt cleanup, 70% new features
**Compatibility**: Backward compatible with v1.0.0, maintains single-process simplicity

## Goals

1. Address critical technical debt that could compound
2. Implement security best practices (PKCE, token revocation, refresh tokens)
3. Add production observability (Prometheus metrics)
4. Maintain backward compatibility with v1.0.0
5. Keep deployment simple (no Redis requirement)

## Success Criteria

- All technical debt items TD-001, TD-002, TD-003 resolved
- PKCE support implemented per ADR-003
- Token revocation and refresh functional
- Prometheus metrics available
- All tests passing with >90% coverage
- Zero breaking changes for v1.0.0 clients
- Documentation complete with migration guide

## Features & Technical Debt

### Phase 1: Technical Debt Cleanup (30% - 4-5 days)

#### TD-001: FastAPI Lifespan Migration
- **Effort**: <1 day
- **Priority**: P2
- **Type**: Technical Debt
- **Description**: Replace deprecated `@app.on_event()` decorators with a `lifespan` handler
- **Rationale**: The current implementation relies on a deprecated API that may be removed in a future FastAPI version
- **Impact**: Removes deprecation warnings, future-proofs codebase
- **Files Affected**: `src/gondulf/main.py`

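As a rough illustration of the TD-001 migration, the sketch below shows the shape of a lifespan handler. The `connect_database` and `close_database` helpers and the `events` list are hypothetical stand-ins for whatever the existing `on_event` handlers do; only the `asynccontextmanager` pattern and the `FastAPI(lifespan=...)` argument are taken from FastAPI itself.

```python
import asyncio
from contextlib import asynccontextmanager

# Hypothetical record of what ran, used only to show the ordering.
events = []


async def connect_database():
    # Stand-in for the work done in the old startup handler.
    events.append("startup")


async def close_database():
    # Stand-in for the work done in the old shutdown handler.
    events.append("shutdown")


@asynccontextmanager
async def lifespan(app):
    # Code before `yield` replaces @app.on_event("startup").
    await connect_database()
    try:
        yield
    finally:
        # Code after `yield` replaces @app.on_event("shutdown").
        await close_database()


# With FastAPI installed, the handler is passed at construction time:
#   from fastapi import FastAPI
#   app = FastAPI(lifespan=lifespan)


async def demo():
    # Drive the context manager by hand to show startup/shutdown ordering.
    events.clear()
    async with lifespan(app=None):
        events.append("serving")
    return list(events)
```

Keeping startup and shutdown in one function means shared resources can live in local variables instead of module globals, and the `finally` block still runs if the application exits with an error.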
#### TD-002: Alembic Database Migration System
- **Effort**: 1-2 days
- **Priority**: P2
- **Type**: Technical Debt
- **Description**: Replace custom migration system with Alembic
- **Rationale**: Current migrations are one-way only, no rollback capability
- **Impact**: Production deployment safety, standard migration tooling
- **Deliverables**:
  - Alembic configuration
  - Convert existing migrations to Alembic format
  - Migration rollback capability
  - Updated deployment documentation

#### TD-003: Async Email Support
- **Effort**: 1-2 days
- **Priority**: P2
- **Type**: Technical Debt
- **Description**: Replace synchronous SMTP with aiosmtplib
- **Rationale**: Current SMTP blocks request thread (1-5 sec delays during email sending)
- **Impact**: Improved UX, non-blocking email operations
- **Files Affected**: `src/gondulf/services/email_service.py`

### Phase 2: Security Features (40% - 5-7 days)

#### PKCE Support (RFC 7636)
- **Effort**: 1-2 days
- **Priority**: P1
- **Type**: Feature
- **ADR**: ADR-003 explicitly defers PKCE to v1.1.0
- **Description**: Implement Proof Key for Code Exchange
- **Rationale**: OAuth 2.0 security best practice, protects against authorization code interception
- **Backward Compatible**: Yes (PKCE is optional, non-PKCE clients continue working)
- **Implementation**:
  - Accept `code_challenge` and `code_challenge_method` parameters in /authorize
  - Store code challenge with authorization code
  - Accept `code_verifier` parameter in /token endpoint
  - Validate that BASE64URL(SHA256(code_verifier)) matches the stored `code_challenge`
  - Update metadata endpoint to advertise PKCE support
- **Testing**: Comprehensive tests for S256 method, optional PKCE, validation failures

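The S256 check in the implementation list can be sketched with the standard library alone. This is a minimal sketch, not the project's code: the function names are hypothetical, and treating a missing stored challenge as "PKCE not used" is an assumption based on the optional-PKCE behavior described above.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    # Client side: 32 random bytes give a 43-character URL-safe string,
    # which sits at the lower end of RFC 7636's 43-128 character range.
    return secrets.token_urlsafe(32)


def s256_challenge(code_verifier: str) -> str:
    # RFC 7636 S256: BASE64URL(SHA256(ASCII(code_verifier))), no '=' padding.
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(code_verifier, stored_challenge) -> bool:
    # Assumption: a stored challenge of None means the client did not use
    # PKCE at /authorize, so there is nothing to verify at /token.
    if stored_challenge is None:
        return True
    if not code_verifier:
        return False
    try:
        computed = s256_challenge(code_verifier)
    except UnicodeEncodeError:
        # Verifiers are restricted to ASCII; anything else cannot match.
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(computed, stored_challenge)
```

The verifier and challenge pair from RFC 7636 Appendix B (`dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk` and `E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM`) makes a convenient fixed test vector for the validation tests.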
#### Token Revocation Endpoint (RFC 7009)
- **Effort**: 1-2 days
- **Priority**: P1
- **Type**: Feature
- **Description**: POST /token/revoke endpoint for revoking access and refresh tokens
- **Rationale**: Security improvement - allows clients to invalidate tokens
- **Backward Compatible**: Yes (new endpoint)
- **Implementation**:
  - POST /token/revoke endpoint
  - Accept `token` and `token_type_hint` parameters
  - Mark tokens as revoked in database
  - Update token verification to check revocation status
- **Testing**: Revoke access tokens, refresh tokens, invalid tokens, already-revoked tokens

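The revocation behavior listed above might look roughly like the following in-memory sketch. The `TokenStore` class, its method names, and the choice to key records by a SHA-256 digest of the token are all assumptions made for illustration; the real implementation would persist this state in the database. One detail does come from RFC 7009: the endpoint answers HTTP 200 even when the token is unknown or already revoked.

```python
import hashlib


def _digest(token: str) -> str:
    # Assumption: tokens are looked up by digest rather than stored in clear.
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


class TokenStore:
    """Hypothetical in-memory stand-in for the token table."""

    def __init__(self):
        self._revoked = {}  # token digest -> revoked flag

    def issue(self, token: str) -> None:
        self._revoked[_digest(token)] = False

    def revoke(self, token: str, token_type_hint=None) -> int:
        # token_type_hint is only a lookup optimization in RFC 7009;
        # this sketch has a single table, so it is accepted and ignored.
        key = _digest(token)
        if key in self._revoked:
            self._revoked[key] = True
        # RFC 7009: respond 200 whether or not the token was valid.
        return 200

    def is_active(self, token: str) -> bool:
        # Verification must reject both unknown and revoked tokens.
        return self._revoked.get(_digest(token)) is False
```

Returning 200 for unknown tokens keeps the endpoint from acting as an oracle for which token values exist.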
#### Token Refresh (RFC 6749 Section 6)
- **Effort**: 3-5 days
- **Priority**: P1
- **Type**: Feature
- **Description**: Implement refresh token grant type for long-lived sessions
- **Rationale**: Standard OAuth 2.0 feature, enables long-lived sessions without re-authentication
- **Backward Compatible**: Yes (optional feature, clients must opt-in)
- **Implementation**:
  - Generate refresh tokens alongside access tokens
  - Store refresh tokens in database with expiration (30-90 days)
  - Accept `grant_type=refresh_token` in /token endpoint
  - Implement refresh token rotation (security best practice)
  - Update metadata endpoint
- **Testing**: Token refresh flow, rotation, expiration, revocation

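Refresh token rotation, as listed above, means each refresh token is single-use: redeeming it yields a new access token and a new refresh token, and the old one stops working. The sketch below is one way that could look, under stated assumptions: the `RefreshTokenStore` class and its methods are hypothetical, storage is an in-memory dict rather than the database, and the 30-day lifetime is simply the lower end of the 30-90 day range in the plan.

```python
import secrets
from datetime import datetime, timedelta, timezone

# Assumption: lower bound of the 30-90 day range given in the plan.
REFRESH_TTL = timedelta(days=30)


class RefreshTokenStore:
    """Hypothetical in-memory stand-in for the refresh token table."""

    def __init__(self):
        self._tokens = {}  # refresh token -> record

    def issue(self, subject, now=None):
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {
            "subject": subject,
            "expires_at": now + REFRESH_TTL,
            "used": False,
        }
        return token

    def rotate(self, refresh_token, now=None):
        """Redeem a refresh token; return (access_token, new_refresh_token)."""
        now = now or datetime.now(timezone.utc)
        record = self._tokens.get(refresh_token)
        if record is None or record["used"] or now >= record["expires_at"]:
            # RFC 6749 reports this case as the `invalid_grant` error.
            raise ValueError("invalid_grant")
        # Rotation: the presented token is consumed and can never be reused.
        record["used"] = True
        access_token = secrets.token_urlsafe(32)
        return access_token, self.issue(record["subject"], now)
```

A stricter variant would also revoke every token descended from a refresh token that is presented twice, since reuse suggests the token was stolen; this sketch only rejects the reuse.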
### Phase 3: Operational Features (30% - 3-4 days)

#### Prometheus Metrics Endpoint
- **Effort**: 1-2 days
- **Priority**: P2
- **Type**: Feature
- **Description**: /metrics endpoint exposing Prometheus-compatible metrics
- **Rationale**: Production observability, monitoring, alerting
- **Backward Compatible**: Yes (new endpoint)
- **Metrics**:
  - HTTP request counters (by endpoint, method, status code)
  - Response time histograms
  - Active authorization sessions
  - Token issuance/verification counters
  - Error rates by type
  - Database connection pool stats
- **Implementation**: Use prometheus_client library
- **Testing**: Metrics accuracy, format compliance

#### Testing & Documentation
- **Effort**: 2-3 days
- **Priority**: P1
- **Type**: Quality Assurance
- **Deliverables**:
  - Unit tests for all new features (>90% coverage maintained)
  - Integration tests for PKCE, revocation, refresh flows
  - Update API documentation
  - Migration guide: v1.0.0 → v1.1.0
  - Update deployment documentation
  - Changelog for v1.1.0

## Deferred to Future Releases

### v1.2.0 Candidates

- **Rate Limiting** - Requires Redis, breaks single-process simplicity
  - Defer until scaling beyond single process is needed
  - Will require Redis dependency decision

- **Redis Session Storage** (TD-004) - Not critical yet
  - Current in-memory storage works for single process
  - Codes are short-lived (10-15 min), minimal impact from restarts

- **Admin Dashboard** - Lower priority operational feature

- **PostgreSQL Support** - SQLite sufficient for target scale

### v2.0.0 Considerations

Not committing to v2.0.0 scope yet. Will evaluate after v1.1.0 and v1.2.0 to determine if breaking changes are needed.

Potential v2.0.0 candidates (breaking changes):
- Scope-based authorization (full OAuth 2.0 authz server)
- JWT tokens (instead of opaque tokens)
- Required PKCE (breaking for non-PKCE clients)

## Technical Debt Status

After v1.1.0, remaining technical debt:
- **TD-004: Redis for Session Storage** (deferred to when scaling needed)

All other critical technical debt will be resolved.

## Dependencies

### External Dependencies Added
- `aiosmtplib` - Async SMTP client
- `alembic` - Database migration tool
- `prometheus_client` - Metrics library

### Breaking Changes
**None** - v1.1.0 is fully backward compatible with v1.0.0

## Release Checklist

- [ ] Phase 1: Technical debt cleanup complete
  - [ ] TD-001: FastAPI lifespan migration
  - [ ] TD-002: Alembic integration
  - [ ] TD-003: Async email support
- [ ] Phase 2: Security features complete
  - [ ] PKCE support implemented and tested
  - [ ] Token revocation endpoint functional
  - [ ] Token refresh flow working
- [ ] Phase 3: Operational features complete
  - [ ] Prometheus metrics endpoint
  - [ ] Documentation updated
  - [ ] Migration guide written
- [ ] All tests passing (>90% coverage)
- [ ] Security audit passed
- [ ] Real client testing (PKCE-enabled clients)
- [ ] Performance testing (async email, metrics overhead)
- [ ] Docker image built and tested
- [ ] Release notes written
- [ ] Tag v1.1.0 and push to registry

## Timeline

**Week 1**: Technical debt cleanup (TD-001, TD-002, TD-003)
**Week 2**: PKCE support + Token revocation
**Week 3**: Token refresh implementation
**Week 4**: Prometheus metrics + Testing + Documentation

## Risk Assessment

**Low Risk** - All changes are additive and backward compatible

Potential risks:
- Alembic migration conversion complexity (mitigation: thorough testing)
- PKCE validation edge cases (mitigation: comprehensive test suite)
- Refresh token security (mitigation: implement rotation best practices)

## Version Compatibility

- **v1.0.0 clients**: Fully compatible, no changes required
- **New features**: Opt-in (PKCE, refresh tokens)
- **Deployment**: Drop-in replacement, run migrations, no config changes required (unless using new features)

## References

- ADR-003: PKCE Deferred to v1.1.0
- RFC 7636: Proof Key for Code Exchange (PKCE)
- RFC 7009: OAuth 2.0 Token Revocation
- RFC 6749: OAuth 2.0 Framework (Refresh Tokens)
- Technical Debt Backlog: `/docs/roadmap/backlog.md`