feat(security): merge Phase 4b security hardening

Complete security hardening implementation including HTTPS enforcement, security headers, rate limiting, and comprehensive security test suite. Key features: - HTTPS enforcement with HSTS support - Security headers (CSP, X-Frame-Options, X-Content-Type-Options) - Rate limiting for all critical endpoints - Enhanced email template security - 87% test coverage with security-specific tests Architect approval: 9.5/10 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 18:28:50 -07:00
parent 115e733604
commit d3c3e8dc6b
23 changed files with 3762 additions and 7 deletions
--- a/docs/designs/phase-4b-clarifications.md
+++ b/docs/designs/phase-4b-clarifications.md
@@ -0,0 +1,397 @@
+# Phase 4b Security Hardening - Implementation Clarifications
+
+Date: 2025-11-20
+
+## Overview
+
+This document provides clarifications for implementation questions raised during the Phase 4b Security Hardening design review. Each clarification includes the rationale and specific implementation guidance.
+
+## Clarifications
+
+### 1. Content Security Policy (CSP) img-src Directive
+
+**Question**: Should `img-src 'self' https:` allow loading images from any HTTPS source, or should it be more restrictive?
+
+**Answer**: Use `img-src 'self' https:` to allow any HTTPS source.
+
+**Rationale**:
+- IndieAuth clients may display various client logos and user profile images from external HTTPS sources
+- Client applications registered via self-service could have logos hosted anywhere
+- User profile images from IndieWeb sites could be hosted on various services
+- Requiring explicit whitelisting would break the self-service registration model
+
+**Implementation**:
+```python
+CSP_DIRECTIVES = {
+    "default-src": "'self'",
+    "script-src": "'self'",
+    "style-src": "'self' 'unsafe-inline'",  # unsafe-inline for minimal CSS
+    "img-src": "'self' https:",  # Allow any HTTPS image source
+    "font-src": "'self'",
+    "connect-src": "'self'",
+    "frame-ancestors": "'none'"
+}
+```
+
+### 2. HTTPS Enforcement with Reverse Proxy Support
+
+**Question**: Should the HTTPS enforcement middleware check the `X-Forwarded-Proto` header for reverse proxy deployments?
+
+**Answer**: Yes, check `X-Forwarded-Proto` header when configured for reverse proxy deployments.
+
+**Rationale**:
+- Many production deployments run behind reverse proxies (nginx, Apache, Cloudflare)
+- The application sees HTTP from the proxy even when the client connection is HTTPS
+- This is a standard pattern for Python web applications
+
+**Implementation**:
+```python
+def is_https_request(request: Request) -> bool:
+    """Check if request is HTTPS, considering reverse proxy headers."""
+    # Direct HTTPS
+    if request.url.scheme == "https":
+        return True
+
+    # Behind proxy - check forwarded header
+    # Only trust this header in production with TRUST_PROXY=true
+    if config.TRUST_PROXY:
+        forwarded_proto = request.headers.get("X-Forwarded-Proto", "").lower()
+        return forwarded_proto == "https"
+
+    return False
+```
+
+**Configuration Addition**:
+Add to config.py:
+```python
+# Security settings
+HTTPS_REDIRECT: bool = True  # Redirect HTTP to HTTPS in production
+TRUST_PROXY: bool = False    # Trust X-Forwarded-* headers from reverse proxy
+```
+
+### 3. Token Prefix Format for Logging
+
+**Question**: Should partial token logging consistently use exactly 8 characters with ellipsis suffix?
+
+**Answer**: Yes, use exactly 8 characters plus ellipsis for all token logging.
+
+**Rationale**:
+- Consistency aids in log parsing and monitoring
+- 8 characters provides enough uniqueness for debugging (16^8 = 4.3 billion combinations)
+- Ellipsis clearly indicates truncation to log readers
+- Matches common security logging practices
+
+**Implementation**:
+```python
+def mask_sensitive_value(value: str, prefix_len: int = 8) -> str:
+    """Mask sensitive values for logging, showing only prefix."""
+    if not value or len(value) <= prefix_len:
+        return "***"
+    return f"{value[:prefix_len]}..."
+
+# Usage in logging
+logger.info(f"Token validated", extra={
+    "token_prefix": mask_sensitive_value(token, 8),
+    "client_id": client_id
+})
+```
+
+### 4. Timing Attack Test Reliability
+
+**Question**: How should we handle potential flakiness in statistical timing attack tests, especially in CI environments?
+
+**Answer**: Use a combination of increased sample size, relaxed thresholds for CI, and optional skip markers.
+
+**Rationale**:
+- CI environments have variable performance characteristics
+- Statistical tests inherently have some variance
+- We need to balance test reliability with meaningful security validation
+- Some timing variation is acceptable as long as there's no clear correlation
+
+**Implementation**:
+```python
+@pytest.mark.security
+@pytest.mark.slow  # Mark as slow test
+@pytest.mark.skipif(
+    os.getenv("CI") == "true" and os.getenv("SKIP_TIMING_TESTS") == "true",
+    reason="Timing tests disabled in CI"
+)
+def test_authorization_code_timing_attack_resistance():
+    """Test that authorization code validation has consistent timing."""
+    # Increase samples in CI for better statistics
+    samples = 200 if os.getenv("CI") == "true" else 100
+
+    # Use relaxed threshold in CI (30% vs 20% coefficient of variation)
+    max_cv = 0.30 if os.getenv("CI") == "true" else 0.20
+
+    # ... rest of test implementation
+
+    # Check coefficient of variation (stddev/mean)
+    cv = np.std(timings) / np.mean(timings)
+    assert cv < max_cv, f"Timing variation too high: {cv:.2%} (max: {max_cv:.2%})"
+```
+
+**CI Configuration**:
+Document in testing standards that `SKIP_TIMING_TESTS=true` can be set in CI if timing tests prove unreliable in a particular environment.
+
+### 5. SQL Injection Test Implementation
+
+**Question**: Should SQL injection tests actually read and inspect source files for patterns? Are there concerns about false positives?
+
+**Answer**: No, do not inspect source files. Use actual injection attempts and verify behavior.
+
+**Rationale**:
+- Source code inspection is fragile and prone to false positives
+- Testing actual behavior is more reliable than pattern matching
+- SQLAlchemy's parameterized queries should handle this at runtime
+- Behavioral testing confirms the security measure works end-to-end
+
+**Implementation**:
+```python
+@pytest.mark.security
+def test_sql_injection_prevention():
+    """Test that SQL injection attempts are properly prevented."""
+    # Test actual injection attempts, not source code patterns
+    injection_attempts = [
+        "'; DROP TABLE users; --",
+        "' OR '1'='1",
+        "admin'--",
+        "' UNION SELECT * FROM tokens--",
+        "'; INSERT INTO clients VALUES ('evil', 'client'); --"
+    ]
+
+    for attempt in injection_attempts:
+        # Attempt injection via client_id parameter
+        response = client.get(
+            "/authorize",
+            params={"client_id": attempt, "response_type": "code"}
+        )
+
+        # Should get client not found, not SQL error
+        assert response.status_code == 400
+        assert "invalid_client" in response.json()["error"]
+
+        # Verify no SQL error in logs (would indicate query wasn't escaped)
+        # This would be checked via log capture in test fixtures
+```
+
+### 6. HTTPS Redirect Configuration
+
+**Question**: Should `HTTPS_REDIRECT` configuration option be added to the Config class in Phase 4b?
+
+**Answer**: Yes, add both `HTTPS_REDIRECT` and `TRUST_PROXY` to the Config class.
+
+**Rationale**:
+- Security features need runtime configuration
+- Different deployment environments have different requirements
+- Development needs HTTP for local testing
+- Production typically needs HTTPS enforcement
+
+**Implementation**:
+Add to `src/config.py`:
+```python
+class Config:
+    """Application configuration."""
+
+    # Existing configuration...
+
+    # Security configuration
+    HTTPS_REDIRECT: bool = Field(
+        default=True,
+        description="Redirect HTTP requests to HTTPS in production"
+    )
+
+    TRUST_PROXY: bool = Field(
+        default=False,
+        description="Trust X-Forwarded-* headers from reverse proxy"
+    )
+
+    SECURE_COOKIES: bool = Field(
+        default=True,
+        description="Set secure flag on cookies (requires HTTPS)"
+    )
+
+    @validator("HTTPS_REDIRECT")
+    def validate_https_redirect(cls, v, values):
+        """Disable HTTPS redirect in development."""
+        if values.get("ENV") == "development":
+            return False
+        return v
+```
+
+### 7. Pytest Security Marker Registration
+
+**Question**: Should `@pytest.mark.security` be registered in pytest configuration?
+
+**Answer**: Yes, register the marker in `pytest.ini` or `pyproject.toml`.
+
+**Rationale**:
+- Prevents pytest warnings about unregistered markers
+- Enables running security tests separately: `pytest -m security`
+- Documents available test categories
+- Follows pytest best practices
+
+**Implementation**:
+Create or update `pytest.ini`:
+```ini
+[tool:pytest]
+markers =
+    security: Security-related tests (timing attacks, injection, headers)
+    slow: Tests that take longer to run (timing attack statistics)
+    integration: Integration tests requiring full application context
+```
+
+Or in `pyproject.toml`:
+```toml
+[tool.pytest.ini_options]
+markers = [
+    "security: Security-related tests (timing attacks, injection, headers)",
+    "slow: Tests that take longer to run (timing attack statistics)",
+    "integration: Integration tests requiring full application context",
+]
+```
+
+**Usage**:
+```bash
+# Run only security tests
+pytest -m security
+
+# Run all except slow tests
+pytest -m "not slow"
+
+# Run security tests but not slow ones
+pytest -m "security and not slow"
+```
+
+### 8. Secure Logging Guidelines Documentation
+
+**Question**: How should secure logging guidelines be structured in the coding standards?
+
+**Answer**: Add a dedicated "Security Practices" section to `/docs/standards/coding.md` with specific logging subsection.
+
+**Rationale**:
+- Security practices deserve prominent placement in coding standards
+- Developers need clear, findable guidelines
+- Examples make guidelines actionable
+- Should cover both what to log and what not to log
+
+**Implementation**:
+Add to `/docs/standards/coding.md`:
+
+```markdown
+## Security Practices
+
+### Secure Logging Guidelines
+
+#### Never Log Sensitive Data
+
+The following must NEVER appear in logs:
+- Full tokens (authorization codes, access tokens, refresh tokens)
+- Passwords or secrets
+- Full authorization codes
+- Private keys or certificates
+- Personally identifiable information (PII) beyond user identifiers
+
+#### Safe Logging Practices
+
+When logging security-relevant events, follow these practices:
+
+1. **Token Prefixes**: When token identification is necessary, log only the first 8 characters:
+   ```python
+   logger.info("Token validated", extra={
+       "token_prefix": token[:8] + "..." if len(token) > 8 else "***",
+       "client_id": client_id
+   })
+   ```
+
+2. **Request Context**: Log security events with context:
+   ```python
+   logger.warning("Authorization failed", extra={
+       "client_id": client_id,
+       "ip_address": request.client.host,
+       "user_agent": request.headers.get("User-Agent", "unknown"),
+       "error": error_code  # Use error codes, not full messages
+   })
+   ```
+
+3. **Security Events to Log**:
+   - Failed authentication attempts
+   - Token validation failures
+   - Rate limit violations
+   - Input validation failures
+   - HTTPS redirect actions
+   - Client registration events
+
+4. **Use Structured Logging**: Include metadata as structured fields:
+   ```python
+   logger.info("Client registered", extra={
+       "event": "client.registered",
+       "client_id": client_id,
+       "registration_method": "self_service",
+       "timestamp": datetime.utcnow().isoformat()
+   })
+   ```
+
+5. **Sanitize User Input**: Always sanitize user-provided data before logging:
+   ```python
+   def sanitize_for_logging(value: str, max_length: int = 100) -> str:
+       """Sanitize user input for safe logging."""
+       # Remove control characters
+       value = "".join(ch for ch in value if ch.isprintable())
+       # Truncate if too long
+       if len(value) > max_length:
+           value = value[:max_length] + "..."
+       return value
+   ```
+
+#### Security Audit Logging
+
+For security-critical operations, use a dedicated audit logger:
+
+```python
+audit_logger = logging.getLogger("security.audit")
+
+# Log security-critical events
+audit_logger.info("Token issued", extra={
+    "event": "token.issued",
+    "client_id": client_id,
+    "scope": scope,
+    "expires_in": expires_in,
+    "ip_address": request.client.host
+})
+```
+
+#### Testing Logging Security
+
+Include tests that verify sensitive data doesn't leak into logs:
+
+```python
+def test_no_token_in_logs(caplog):
+    """Verify tokens are not logged in full."""
+    token = "sensitive_token_abc123xyz789"
+
+    # Perform operation that logs token
+    validate_token(token)
+
+    # Check logs don't contain full token
+    for record in caplog.records:
+        assert token not in record.getMessage()
+        # But prefix might be present
+        assert token[:8] in record.getMessage() or "***" in record.getMessage()
+```
+```
+
+## Summary
+
+All clarifications maintain the principle of simplicity while ensuring security. Key decisions:
+
+1. **CSP allows any HTTPS image source** - supports self-service model
+2. **HTTPS middleware checks proxy headers when configured** - supports real deployments
+3. **Token prefixes use consistent 8-char + ellipsis format** - aids monitoring
+4. **Timing tests use relaxed thresholds in CI** - balances reliability with security validation
+5. **SQL injection tests use behavioral testing** - more reliable than source inspection
+6. **Security config added to Config class** - runtime configuration for different environments
+7. **Pytest markers registered properly** - enables targeted test runs
+8. **Comprehensive security logging guidelines** - clear, actionable developer guidance
+
+These clarifications ensure the Developer can proceed with implementation without ambiguity while maintaining security best practices.
--- a/docs/designs/phase-4b-security-hardening.md
+++ b/docs/designs/phase-4b-security-hardening.md