# Phase 4b Security Hardening - Implementation Clarifications Date: 2025-11-20 ## Overview This document provides clarifications for implementation questions raised during the Phase 4b Security Hardening design review. Each clarification includes the rationale and specific implementation guidance. ## Clarifications ### 1. Content Security Policy (CSP) img-src Directive **Question**: Should `img-src 'self' https:` allow loading images from any HTTPS source, or should it be more restrictive? **Answer**: Use `img-src 'self' https:` to allow any HTTPS source. **Rationale**: - IndieAuth clients may display various client logos and user profile images from external HTTPS sources - Client applications registered via self-service could have logos hosted anywhere - User profile images from IndieWeb sites could be hosted on various services - Requiring explicit whitelisting would break the self-service registration model **Implementation**: ```python CSP_DIRECTIVES = { "default-src": "'self'", "script-src": "'self'", "style-src": "'self' 'unsafe-inline'", # unsafe-inline for minimal CSS "img-src": "'self' https:", # Allow any HTTPS image source "font-src": "'self'", "connect-src": "'self'", "frame-ancestors": "'none'" } ``` ### 2. HTTPS Enforcement with Reverse Proxy Support **Question**: Should the HTTPS enforcement middleware check the `X-Forwarded-Proto` header for reverse proxy deployments? **Answer**: Yes, check `X-Forwarded-Proto` header when configured for reverse proxy deployments. **Rationale**: - Many production deployments run behind reverse proxies (nginx, Apache, Cloudflare) - The application sees HTTP from the proxy even when the client connection is HTTPS - This is a standard pattern for Python web applications **Implementation**: ```python def is_https_request(request: Request) -> bool: """Check if request is HTTPS, considering reverse proxy headers.""" # Direct HTTPS if request.url.scheme == "https": return True # Behind proxy - check forwarded header # Only trust this header in production with TRUST_PROXY=true if config.TRUST_PROXY: forwarded_proto = request.headers.get("X-Forwarded-Proto", "").lower() return forwarded_proto == "https" return False ``` **Configuration Addition**: Add to config.py: ```python # Security settings HTTPS_REDIRECT: bool = True # Redirect HTTP to HTTPS in production TRUST_PROXY: bool = False # Trust X-Forwarded-* headers from reverse proxy ``` ### 3. Token Prefix Format for Logging **Question**: Should partial token logging consistently use exactly 8 characters with ellipsis suffix? **Answer**: Yes, use exactly 8 characters plus ellipsis for all token logging. **Rationale**: - Consistency aids in log parsing and monitoring - 8 characters provides enough uniqueness for debugging (16^8 = 4.3 billion combinations) - Ellipsis clearly indicates truncation to log readers - Matches common security logging practices **Implementation**: ```python def mask_sensitive_value(value: str, prefix_len: int = 8) -> str: """Mask sensitive values for logging, showing only prefix.""" if not value or len(value) <= prefix_len: return "***" return f"{value[:prefix_len]}..." # Usage in logging logger.info(f"Token validated", extra={ "token_prefix": mask_sensitive_value(token, 8), "client_id": client_id }) ``` ### 4. Timing Attack Test Reliability **Question**: How should we handle potential flakiness in statistical timing attack tests, especially in CI environments? **Answer**: Use a combination of increased sample size, relaxed thresholds for CI, and optional skip markers. **Rationale**: - CI environments have variable performance characteristics - Statistical tests inherently have some variance - We need to balance test reliability with meaningful security validation - Some timing variation is acceptable as long as there's no clear correlation **Implementation**: ```python @pytest.mark.security @pytest.mark.slow # Mark as slow test @pytest.mark.skipif( os.getenv("CI") == "true" and os.getenv("SKIP_TIMING_TESTS") == "true", reason="Timing tests disabled in CI" ) def test_authorization_code_timing_attack_resistance(): """Test that authorization code validation has consistent timing.""" # Increase samples in CI for better statistics samples = 200 if os.getenv("CI") == "true" else 100 # Use relaxed threshold in CI (30% vs 20% coefficient of variation) max_cv = 0.30 if os.getenv("CI") == "true" else 0.20 # ... rest of test implementation # Check coefficient of variation (stddev/mean) cv = np.std(timings) / np.mean(timings) assert cv < max_cv, f"Timing variation too high: {cv:.2%} (max: {max_cv:.2%})" ``` **CI Configuration**: Document in testing standards that `SKIP_TIMING_TESTS=true` can be set in CI if timing tests prove unreliable in a particular environment. ### 5. SQL Injection Test Implementation **Question**: Should SQL injection tests actually read and inspect source files for patterns? Are there concerns about false positives? **Answer**: No, do not inspect source files. Use actual injection attempts and verify behavior. **Rationale**: - Source code inspection is fragile and prone to false positives - Testing actual behavior is more reliable than pattern matching - SQLAlchemy's parameterized queries should handle this at runtime - Behavioral testing confirms the security measure works end-to-end **Implementation**: ```python @pytest.mark.security def test_sql_injection_prevention(): """Test that SQL injection attempts are properly prevented.""" # Test actual injection attempts, not source code patterns injection_attempts = [ "'; DROP TABLE users; --", "' OR '1'='1", "admin'--", "' UNION SELECT * FROM tokens--", "'; INSERT INTO clients VALUES ('evil', 'client'); --" ] for attempt in injection_attempts: # Attempt injection via client_id parameter response = client.get( "/authorize", params={"client_id": attempt, "response_type": "code"} ) # Should get client not found, not SQL error assert response.status_code == 400 assert "invalid_client" in response.json()["error"] # Verify no SQL error in logs (would indicate query wasn't escaped) # This would be checked via log capture in test fixtures ``` ### 6. HTTPS Redirect Configuration **Question**: Should `HTTPS_REDIRECT` configuration option be added to the Config class in Phase 4b? **Answer**: Yes, add both `HTTPS_REDIRECT` and `TRUST_PROXY` to the Config class. **Rationale**: - Security features need runtime configuration - Different deployment environments have different requirements - Development needs HTTP for local testing - Production typically needs HTTPS enforcement **Implementation**: Add to `src/config.py`: ```python class Config: """Application configuration.""" # Existing configuration... # Security configuration HTTPS_REDIRECT: bool = Field( default=True, description="Redirect HTTP requests to HTTPS in production" ) TRUST_PROXY: bool = Field( default=False, description="Trust X-Forwarded-* headers from reverse proxy" ) SECURE_COOKIES: bool = Field( default=True, description="Set secure flag on cookies (requires HTTPS)" ) @validator("HTTPS_REDIRECT") def validate_https_redirect(cls, v, values): """Disable HTTPS redirect in development.""" if values.get("ENV") == "development": return False return v ``` ### 7. Pytest Security Marker Registration **Question**: Should `@pytest.mark.security` be registered in pytest configuration? **Answer**: Yes, register the marker in `pytest.ini` or `pyproject.toml`. **Rationale**: - Prevents pytest warnings about unregistered markers - Enables running security tests separately: `pytest -m security` - Documents available test categories - Follows pytest best practices **Implementation**: Create or update `pytest.ini`: ```ini [tool:pytest] markers = security: Security-related tests (timing attacks, injection, headers) slow: Tests that take longer to run (timing attack statistics) integration: Integration tests requiring full application context ``` Or in `pyproject.toml`: ```toml [tool.pytest.ini_options] markers = [ "security: Security-related tests (timing attacks, injection, headers)", "slow: Tests that take longer to run (timing attack statistics)", "integration: Integration tests requiring full application context", ] ``` **Usage**: ```bash # Run only security tests pytest -m security # Run all except slow tests pytest -m "not slow" # Run security tests but not slow ones pytest -m "security and not slow" ``` ### 8. Secure Logging Guidelines Documentation **Question**: How should secure logging guidelines be structured in the coding standards? **Answer**: Add a dedicated "Security Practices" section to `/docs/standards/coding.md` with specific logging subsection. **Rationale**: - Security practices deserve prominent placement in coding standards - Developers need clear, findable guidelines - Examples make guidelines actionable - Should cover both what to log and what not to log **Implementation**: Add to `/docs/standards/coding.md`: ```markdown ## Security Practices ### Secure Logging Guidelines #### Never Log Sensitive Data The following must NEVER appear in logs: - Full tokens (authorization codes, access tokens, refresh tokens) - Passwords or secrets - Full authorization codes - Private keys or certificates - Personally identifiable information (PII) beyond user identifiers #### Safe Logging Practices When logging security-relevant events, follow these practices: 1. **Token Prefixes**: When token identification is necessary, log only the first 8 characters: ```python logger.info("Token validated", extra={ "token_prefix": token[:8] + "..." if len(token) > 8 else "***", "client_id": client_id }) ``` 2. **Request Context**: Log security events with context: ```python logger.warning("Authorization failed", extra={ "client_id": client_id, "ip_address": request.client.host, "user_agent": request.headers.get("User-Agent", "unknown"), "error": error_code # Use error codes, not full messages }) ``` 3. **Security Events to Log**: - Failed authentication attempts - Token validation failures - Rate limit violations - Input validation failures - HTTPS redirect actions - Client registration events 4. **Use Structured Logging**: Include metadata as structured fields: ```python logger.info("Client registered", extra={ "event": "client.registered", "client_id": client_id, "registration_method": "self_service", "timestamp": datetime.utcnow().isoformat() }) ``` 5. **Sanitize User Input**: Always sanitize user-provided data before logging: ```python def sanitize_for_logging(value: str, max_length: int = 100) -> str: """Sanitize user input for safe logging.""" # Remove control characters value = "".join(ch for ch in value if ch.isprintable()) # Truncate if too long if len(value) > max_length: value = value[:max_length] + "..." return value ``` #### Security Audit Logging For security-critical operations, use a dedicated audit logger: ```python audit_logger = logging.getLogger("security.audit") # Log security-critical events audit_logger.info("Token issued", extra={ "event": "token.issued", "client_id": client_id, "scope": scope, "expires_in": expires_in, "ip_address": request.client.host }) ``` #### Testing Logging Security Include tests that verify sensitive data doesn't leak into logs: ```python def test_no_token_in_logs(caplog): """Verify tokens are not logged in full.""" token = "sensitive_token_abc123xyz789" # Perform operation that logs token validate_token(token) # Check logs don't contain full token for record in caplog.records: assert token not in record.getMessage() # But prefix might be present assert token[:8] in record.getMessage() or "***" in record.getMessage() ``` ``` ## Summary All clarifications maintain the principle of simplicity while ensuring security. Key decisions: 1. **CSP allows any HTTPS image source** - supports self-service model 2. **HTTPS middleware checks proxy headers when configured** - supports real deployments 3. **Token prefixes use consistent 8-char + ellipsis format** - aids monitoring 4. **Timing tests use relaxed thresholds in CI** - balances reliability with security validation 5. **SQL injection tests use behavioral testing** - more reliable than source inspection 6. **Security config added to Config class** - runtime configuration for different environments 7. **Pytest markers registered properly** - enables targeted test runs 8. **Comprehensive security logging guidelines** - clear, actionable developer guidance These clarifications ensure the Developer can proceed with implementation without ambiguity while maintaining security best practices.