# ADR-004: Opaque Tokens for v1.0.0 (Not JWT)

Date: 2025-11-20

## Status

Accepted

## Context

Access tokens in OAuth 2.0 can be implemented in two primary formats:

### 1. Opaque Tokens

Random strings with no inherent meaning. Token validation requires a database lookup.

**Characteristics**:
- Random, unpredictable string (e.g., `secrets.token_urlsafe(32)`)
- Server stores token metadata in database
- Validation requires database query
- Easily revocable (delete from database)
- No information leakage (token contains no data)

**Example**:
```
Token: Xy9kP2mN8fR5tQ1wE7aZ4bV6cG3hJ0sL

Database:
{
  token_hash: sha256(token),
  me: "https://example.com",
  client_id: "https://client.example.com",
  scope: "",
  issued_at: 2025-11-20T10:00:00Z,
  expires_at: 2025-11-20T11:00:00Z
}
```

### 2. JWT (JSON Web Tokens)

Self-contained tokens encoding claims, signed by the server.

**Characteristics**:
- Base64url-encoded JSON with signature
- Contains all metadata (me, client_id, scope, expiration)
- Validation via signature verification (no database lookup)
- Stateless (server doesn't store tokens)
- Revocation is complex (requires a blocklist or a short TTL)

**Example**:
```
Token: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJtZSI6Imh0dHBzOi8vZXhhbXBsZS5jb20iLCJjbGllbnRfaWQiOiJodHRwczovL2NsaWVudC5leGFtcGxlLmNvbSIsInNjb3BlIjoiIiwiaWF0IjoxNzAwNDgwNDAwLCJleHAiOjE3MDA0ODQwMDB9.signature

Decoded Payload:
{
  "me": "https://example.com",
  "client_id": "https://client.example.com",
  "scope": "",
  "iat": 1700480400,
  "exp": 1700484000
}
```

### v1.0.0 Requirements

**Use Case**: Authentication only (no authorization)
- Users prove domain ownership
- Tokens confirm user identity to clients
- No resource server in v1.0.0 (no /api endpoints to protect)
- No token introspection endpoint in v1.0.0

**Scale**: Tens of users
- Dozens of active tokens maximum
- Database lookups have negligible performance impact
- No distributed system requirements

**Security Priorities**:
1. Simple, auditable security model
2. Easy token revocation (future requirement)
3. No information leakage
4. No key management complexity

**Simplicity Principle**:
- Favor straightforward implementations
- Avoid unnecessary complexity
- Minimize dependencies

## Decision

**Gondulf v1.0.0 will use opaque tokens (NOT JWT).**

Tokens will be:
- Generated using `secrets.token_urlsafe(32)` (256 bits of entropy)
- Stored in the SQLite database as SHA-256 hashes
- Validated via database lookup on the hash, followed by a constant-time comparison (`secrets.compare_digest`)
- Valid for 1 hour by default (configurable)

### Rationale

**Simplicity**:
- No signing algorithm selection (HS256 vs RS256 vs ES256)
- No key generation or rotation
- No JWT library dependency
- No clock skew handling
- Simple database lookup, not cryptographic verification

**Security**:
- No information leakage (token reveals nothing)
- Easy revocation (delete from database)
- No risk of algorithm confusion attacks
- No risk of "none" algorithm vulnerability
- Hashed storage prevents token recovery from database

**v1.0.0 Scope Alignment**:
- No resource server = no benefit from stateless validation
- Authentication only = simple token validation sufficient
- Small scale = database lookup performance acceptable
- No token introspection endpoint = no external validation needed

**Future Flexibility**:
- Can migrate to JWT in v2.0.0 if needed
- Database abstraction allows storage changes
- Token format is an implementation detail (not exposed to clients)

## Consequences

### Positive Consequences

1. **Simpler Implementation**:
   - No JWT library dependency
   - No signing key management
   - No algorithm selection complexity
   - Straightforward database operations

2. **Better Security (for this use case)**:
   - No information in token (no payload = no leakage)
   - Trivial revocation (DELETE FROM tokens WHERE ...)
   - No signature-algorithm vulnerabilities
   - No signing key to compromise

3. **Easier Auditing**:
   - All tokens visible in database (as hashes)
   - Clear token lifecycle (creation, usage, expiration)
   - Simple query to list all active tokens
   - Easy to track token usage

4. **Operational Simplicity**:
   - No key rotation required
   - No clock synchronization concerns
   - No JWT debugging complexity
   - Standard database operations

5. **Privacy**:
   - Token reveals nothing about user
   - No accidental PII in token claims
   - Bearer token is just a random string

### Negative Consequences

1. **Database Dependency**:
   - Token validation requires database access
   - Database outage = token validation fails
   - Performance limited by database (acceptable for small scale)

2. **Not Stateless**:
   - Cannot validate tokens without database
   - Horizontal scaling requires shared database
   - Not suitable for distributed resource servers (not needed in v1.0.0)

3. **Larger Storage**:
   - Must store all active tokens in database
   - Database grows with token count (cleaned up on expiration)

4. **Token Introspection**:
   - Resource servers cannot validate tokens independently (not needed in v1.0.0)
   - Would require an introspection endpoint in the future

### Mitigation Strategies

**Database Dependency**:
- Acceptable for v1.0.0 (single-process deployment)
- SQLite is reliable (no network dependency)
- Future: Add Redis caching if performance becomes an issue
- Future: Migrate to JWT if distributed validation is needed

**Storage Growth**:
- Periodic cleanup of expired tokens
- Configurable expiration (default 1 hour)
- Database indexes on token_hash and expires_at
- Monitor database size; alert if it grows unexpectedly

**Future Scaling**:
- SQLAlchemy abstraction allows migration to PostgreSQL
- Can add Redis for caching if needed
- Can migrate to JWT in v2.0.0 if requirements change

## Implementation

### Token Generation

```python
import hashlib
import logging
import secrets
from datetime import datetime, timedelta

logger = logging.getLogger(__name__)
# `db` is the application's database wrapper (not shown here)

def generate_token(me: str, client_id: str, scope: str = "") -> str:
    """
    Generate opaque access token.

    Returns:
        43-character base64url string (256 bits of entropy)
    """
    # Generate token (returned to client, never stored)
    token = secrets.token_urlsafe(32)  # 32 bytes = 256 bits

    # Hash for storage (SHA-256)
    token_hash = hashlib.sha256(token.encode('utf-8')).hexdigest()

    # Store in database; read the clock once so issued_at and expires_at agree
    issued_at = datetime.utcnow()
    expires_at = issued_at + timedelta(hours=1)  # default lifetime (configurable)
    db.execute('''
        INSERT INTO tokens (token_hash, me, client_id, scope, issued_at, expires_at, revoked)
        VALUES (?, ?, ?, ?, ?, ?, 0)
    ''', (token_hash, me, client_id, scope, issued_at, expires_at))

    logger.info(f"Token generated for {me} (client: {client_id})")

    return token  # Return plaintext to client (only time it exists in plaintext)
```

### Token Validation

```python
import hashlib
import secrets
from datetime import datetime
from typing import Optional

def verify_token(provided_token: str) -> Optional[dict]:
    """
    Verify access token and return metadata.

    Returns:
        Token metadata dict or None if invalid
    """
    # Hash provided token
    token_hash = hashlib.sha256(provided_token.encode('utf-8')).hexdigest()

    # Look up by hash (an indexed equality match; not constant-time by itself)
    result = db.query_one('''
        SELECT token_hash, me, client_id, scope, expires_at, revoked
        FROM tokens
        WHERE token_hash = ?
    ''', (token_hash,))

    if not result:
        logger.warning("Token not found")
        return None

    # Constant-time comparison of the stored and computed hashes
    if not secrets.compare_digest(result['token_hash'], token_hash):
        logger.warning("Token hash mismatch")
        return None

    # Check expiration (assumes the db layer returns a naive UTC datetime)
    if datetime.utcnow() > result['expires_at']:
        logger.info(f"Token expired for {result['me']}")
        return None

    # Check revocation
    if result['revoked']:
        logger.warning(f"Revoked token presented for {result['me']}")
        return None

    # Valid token
    return {
        'me': result['me'],
        'client_id': result['client_id'],
        'scope': result['scope']
    }
```

### Token Revocation (Future)

```python
def revoke_token(provided_token: str) -> bool:
    """
    Revoke access token.

    Returns:
        True if revoked, False if not found
    """
    token_hash = hashlib.sha256(provided_token.encode('utf-8')).hexdigest()

    rows_updated = db.execute('''
        UPDATE tokens
        SET revoked = 1
        WHERE token_hash = ?
    ''', (token_hash,))

    if rows_updated > 0:
        # Log a prefix of the hash, never any part of the plaintext token
        logger.info(f"Token revoked: {token_hash[:8]}...")
        return True
    else:
        logger.warning("Revoke failed: token not found")
        return False
```

### Database Schema

```sql
CREATE TABLE tokens (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    token_hash TEXT NOT NULL UNIQUE,     -- SHA-256 hash
    me TEXT NOT NULL,                    -- User's domain
    client_id TEXT NOT NULL,             -- Client application URL
    scope TEXT NOT NULL DEFAULT '',      -- Empty for v1.0.0
    issued_at TIMESTAMP NOT NULL,        -- When token created
    expires_at TIMESTAMP NOT NULL,       -- When token expires
    revoked BOOLEAN NOT NULL DEFAULT 0   -- Revocation flag
);

-- Indexes for performance
-- (token_hash is already indexed by its UNIQUE constraint)
CREATE INDEX idx_tokens_expires ON tokens(expires_at);
CREATE INDEX idx_tokens_me ON tokens(me);
```

### Periodic Cleanup

```python
def cleanup_expired_tokens():
    """
    Delete expired tokens from database.
    Run periodically (e.g., hourly cron job).
    """
    deleted = db.execute('''
        DELETE FROM tokens
        WHERE expires_at < ?
    ''', (datetime.utcnow(),))

    logger.info(f"Cleaned up {deleted} expired tokens")
```

## Comparison: Opaque vs JWT

| Aspect | Opaque Tokens | JWT |
|--------|---------------|-----|
| **Complexity** | Low (simple random string) | Medium (encoding, signing, claims) |
| **Dependencies** | None (standard library) | JWT library (python-jose, PyJWT) |
| **Validation** | Database lookup | Signature verification |
| **Performance** | Requires DB query (roughly 1-5 ms) | No DB query (roughly 0.1 ms) |
| **Revocation** | Trivial (DELETE) | Complex (blocklist required) |
| **Stateless** | No (requires DB) | Yes (self-contained) |
| **Information Leakage** | None (opaque) | Possible (claims readable) |
| **Token Size** | 43 bytes | 150-300 bytes |
| **Key Management** | Not required | Required (signing key) |
| **Clock Skew** | Not relevant | Can cause issues (exp claim) |
| **Debugging** | Simple (query database) | Complex (decode, verify signature) |
| **Scale** | Bounded by DB throughput | Not bounded by a token store (stateless) |

**Verdict for v1.0.0**: Opaque tokens win on simplicity, security, and alignment with MVP scope.

## Migration Path to JWT (if needed)

If future requirements demand JWT (e.g., distributed resource servers that must validate tokens without a database or introspection call), migration is straightforward:

**Step 1**: Implement JWT generation alongside opaque tokens

```python
def issue_token(me: str, client_id: str, scope: str = "") -> str:
    if config.USE_JWT:
        return generate_jwt_token(me, client_id, scope)
    return generate_opaque_token(me, client_id, scope)  # generate_token() above, renamed
```

**Step 2**: Support both token types in validation

```python
def verify_any_token(token: str) -> Optional[dict]:
    # A compact JWT has three dot-separated segments;
    # secrets.token_urlsafe() output never contains '.'
    if token.count('.') == 2:
        return verify_jwt_token(token)
    return verify_opaque_token(token)  # verify_token() above, renamed
```

**Step 3**: Gradual migration (both types valid)

**Step 4**: Deprecate opaque tokens (future major version)

## Alternatives Considered

### Alternative 1: Use JWT from v1.0.0

**Pros**:
- Industry standard
- Stateless validation
- Self-contained (no DB for validation)
- Better for distributed systems

**Cons**:
- Adds complexity (signing, key management)
- Requires JWT library dependency
- Harder to revoke
- Not needed for v1.0.0 scope (no resource server)
- Risk of implementation mistakes (algorithm confusion, etc.)

**Rejected**: Violates simplicity principle, no benefit for v1.0.0 scope.

---

### Alternative 2: Use JWT but store in database anyway

**Pros**:
- JWT benefits (self-contained)
- Easy revocation (DB lookup)

**Cons**:
- Worst of both worlds (complexity + database dependency)
- No performance benefit (still requires DB)
- Redundant storage (token + database)

**Rejected**: Adds complexity without benefits.

---

### Alternative 3: Use Macaroons (fancy tokens)

**Pros**:
- Advanced capabilities (caveats, delegation)
- Cryptographically interesting

**Cons**:
- Extreme overkill for authentication
- No standard library support
- Complex implementation
- Not an OAuth 2.0 standard

**Rejected**: Massive complexity for no benefit.

## References

- OAuth 2.0 Bearer Token Usage (RFC 6750): https://datatracker.ietf.org/doc/html/rfc6750
- JWT (RFC 7519): https://datatracker.ietf.org/doc/html/rfc7519
- Token Introspection (RFC 7662): https://datatracker.ietf.org/doc/html/rfc7662
- OAuth 2.0 Security Best Practices: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics

## Decision History

- 2025-11-20: Proposed (Architect)
- 2025-11-20: Accepted (Architect)
- TBD: Review for v2.0.0 (if JWT needed)
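
## Appendix: End-to-End Sketch

The Implementation snippets in this ADR rely on a `db` wrapper and a `logger` that are not shown. To exercise the issue, verify, and revoke lifecycle end to end, the following is a minimal sketch using only the standard library and an in-memory SQLite database. The names `SCHEMA`, `_hash`, `issue`, `verify`, and `revoke` are hypothetical and chosen for illustration; they are not Gondulf's actual code. It also assumes timestamps are stored as ISO 8601 text in UTC and uses timezone-aware datetimes, which may differ from the real storage layer.

```python
import hashlib
import secrets
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional

# Reduced version of the tokens table (assumption: ISO 8601 text timestamps)
SCHEMA = """
CREATE TABLE tokens (
    token_hash TEXT NOT NULL UNIQUE,
    me         TEXT NOT NULL,
    client_id  TEXT NOT NULL,
    scope      TEXT NOT NULL DEFAULT '',
    expires_at TEXT NOT NULL,
    revoked    INTEGER NOT NULL DEFAULT 0
);
"""


def _hash(token: str) -> str:
    """SHA-256 hex digest of the token; only this value is stored."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue(conn: sqlite3.Connection, me: str, client_id: str,
          ttl: timedelta = timedelta(hours=1)) -> str:
    """Create a token, store its hash, and return the plaintext once."""
    token = secrets.token_urlsafe(32)  # 32 random bytes -> 43 characters
    expires_at = (datetime.now(timezone.utc) + ttl).isoformat()
    conn.execute(
        "INSERT INTO tokens (token_hash, me, client_id, expires_at) "
        "VALUES (?, ?, ?, ?)",
        (_hash(token), me, client_id, expires_at),
    )
    return token


def verify(conn: sqlite3.Connection, token: str) -> Optional[dict]:
    """Return token metadata, or None if unknown, revoked, or expired."""
    row = conn.execute(
        "SELECT me, client_id, scope, expires_at, revoked "
        "FROM tokens WHERE token_hash = ?",
        (_hash(token),),
    ).fetchone()
    if row is None:
        return None
    me, client_id, scope, expires_at, revoked = row
    if revoked:
        return None
    if datetime.now(timezone.utc) > datetime.fromisoformat(expires_at):
        return None
    return {"me": me, "client_id": client_id, "scope": scope}


def revoke(conn: sqlite3.Connection, token: str) -> bool:
    """Flag the token as revoked; True if a matching row was updated."""
    cur = conn.execute(
        "UPDATE tokens SET revoked = 1 WHERE token_hash = ?",
        (_hash(token),),
    )
    return cur.rowcount > 0


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

token = issue(conn, "https://example.com", "https://client.example.com")
print(len(token))           # 43
print(verify(conn, token))  # metadata dict for https://example.com
revoke(conn, token)
print(verify(conn, token))  # None
```

Because only the hash is persisted, a leaked copy of the database does not by itself reveal usable bearer tokens, and revocation is a single-row update, which matches the rationale given in the Decision section.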