feat(security): merge Phase 4b security hardening

Complete security hardening implementation including HTTPS enforcement,
security headers, rate limiting, and comprehensive security test suite.

Key features:
- HTTPS enforcement with HSTS support
- Security headers (CSP, X-Frame-Options, X-Content-Type-Options)
- Rate limiting for all critical endpoints
- Enhanced email template security
- 87% test coverage with security-specific tests

Architect approval: 9.5/10

Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
2025-11-20 18:28:50 -07:00
parent 115e733604
commit d3c3e8dc6b
23 changed files with 3762 additions and 7 deletions

View File

@@ -0,0 +1,397 @@
# Phase 4b Security Hardening - Implementation Clarifications
Date: 2025-11-20
## Overview
This document provides clarifications for implementation questions raised during the Phase 4b Security Hardening design review. Each clarification includes the rationale and specific implementation guidance.
## Clarifications
### 1. Content Security Policy (CSP) img-src Directive
**Question**: Should `img-src 'self' https:` allow loading images from any HTTPS source, or should it be more restrictive?
**Answer**: Use `img-src 'self' https:` to allow any HTTPS source.
**Rationale**:
- IndieAuth clients may display various client logos and user profile images from external HTTPS sources
- Client applications registered via self-service could have logos hosted anywhere
- User profile images from IndieWeb sites could be hosted on various services
- Requiring explicit whitelisting would break the self-service registration model
**Implementation**:
```python
CSP_DIRECTIVES = {
"default-src": "'self'",
"script-src": "'self'",
"style-src": "'self' 'unsafe-inline'", # unsafe-inline for minimal CSS
"img-src": "'self' https:", # Allow any HTTPS image source
"font-src": "'self'",
"connect-src": "'self'",
"frame-ancestors": "'none'"
}
```
### 2. HTTPS Enforcement with Reverse Proxy Support
**Question**: Should the HTTPS enforcement middleware check the `X-Forwarded-Proto` header for reverse proxy deployments?
**Answer**: Yes, check `X-Forwarded-Proto` header when configured for reverse proxy deployments.
**Rationale**:
- Many production deployments run behind reverse proxies (nginx, Apache, Cloudflare)
- The application sees HTTP from the proxy even when the client connection is HTTPS
- This is a standard pattern for Python web applications
**Implementation**:
```python
def is_https_request(request: Request) -> bool:
"""Check if request is HTTPS, considering reverse proxy headers."""
# Direct HTTPS
if request.url.scheme == "https":
return True
# Behind proxy - check forwarded header
# Only trust this header in production with TRUST_PROXY=true
if config.TRUST_PROXY:
forwarded_proto = request.headers.get("X-Forwarded-Proto", "").lower()
return forwarded_proto == "https"
return False
```
**Configuration Addition**:
Add to config.py:
```python
# Security settings
HTTPS_REDIRECT: bool = True # Redirect HTTP to HTTPS in production
TRUST_PROXY: bool = False # Trust X-Forwarded-* headers from reverse proxy
```
### 3. Token Prefix Format for Logging
**Question**: Should partial token logging consistently use exactly 8 characters with ellipsis suffix?
**Answer**: Yes, use exactly 8 characters plus ellipsis for all token logging.
**Rationale**:
- Consistency aids in log parsing and monitoring
- 8 characters provides enough uniqueness for debugging (16^8 = 4.3 billion combinations)
- Ellipsis clearly indicates truncation to log readers
- Matches common security logging practices
**Implementation**:
```python
def mask_sensitive_value(value: str, prefix_len: int = 8) -> str:
"""Mask sensitive values for logging, showing only prefix."""
if not value or len(value) <= prefix_len:
return "***"
return f"{value[:prefix_len]}..."
# Usage in logging
logger.info(f"Token validated", extra={
"token_prefix": mask_sensitive_value(token, 8),
"client_id": client_id
})
```
### 4. Timing Attack Test Reliability
**Question**: How should we handle potential flakiness in statistical timing attack tests, especially in CI environments?
**Answer**: Use a combination of increased sample size, relaxed thresholds for CI, and optional skip markers.
**Rationale**:
- CI environments have variable performance characteristics
- Statistical tests inherently have some variance
- We need to balance test reliability with meaningful security validation
- Some timing variation is acceptable as long as there's no clear correlation
**Implementation**:
```python
@pytest.mark.security
@pytest.mark.slow # Mark as slow test
@pytest.mark.skipif(
os.getenv("CI") == "true" and os.getenv("SKIP_TIMING_TESTS") == "true",
reason="Timing tests disabled in CI"
)
def test_authorization_code_timing_attack_resistance():
"""Test that authorization code validation has consistent timing."""
# Increase samples in CI for better statistics
samples = 200 if os.getenv("CI") == "true" else 100
# Use relaxed threshold in CI (30% vs 20% coefficient of variation)
max_cv = 0.30 if os.getenv("CI") == "true" else 0.20
# ... rest of test implementation
# Check coefficient of variation (stddev/mean)
cv = np.std(timings) / np.mean(timings)
assert cv < max_cv, f"Timing variation too high: {cv:.2%} (max: {max_cv:.2%})"
```
**CI Configuration**:
Document in testing standards that `SKIP_TIMING_TESTS=true` can be set in CI if timing tests prove unreliable in a particular environment.
### 5. SQL Injection Test Implementation
**Question**: Should SQL injection tests actually read and inspect source files for patterns? Are there concerns about false positives?
**Answer**: No, do not inspect source files. Use actual injection attempts and verify behavior.
**Rationale**:
- Source code inspection is fragile and prone to false positives
- Testing actual behavior is more reliable than pattern matching
- SQLAlchemy's parameterized queries should handle this at runtime
- Behavioral testing confirms the security measure works end-to-end
**Implementation**:
```python
@pytest.mark.security
def test_sql_injection_prevention():
"""Test that SQL injection attempts are properly prevented."""
# Test actual injection attempts, not source code patterns
injection_attempts = [
"'; DROP TABLE users; --",
"' OR '1'='1",
"admin'--",
"' UNION SELECT * FROM tokens--",
"'; INSERT INTO clients VALUES ('evil', 'client'); --"
]
for attempt in injection_attempts:
# Attempt injection via client_id parameter
response = client.get(
"/authorize",
params={"client_id": attempt, "response_type": "code"}
)
# Should get client not found, not SQL error
assert response.status_code == 400
assert "invalid_client" in response.json()["error"]
# Verify no SQL error in logs (would indicate query wasn't escaped)
# This would be checked via log capture in test fixtures
```
### 6. HTTPS Redirect Configuration
**Question**: Should `HTTPS_REDIRECT` configuration option be added to the Config class in Phase 4b?
**Answer**: Yes, add both `HTTPS_REDIRECT` and `TRUST_PROXY` to the Config class.
**Rationale**:
- Security features need runtime configuration
- Different deployment environments have different requirements
- Development needs HTTP for local testing
- Production typically needs HTTPS enforcement
**Implementation**:
Add to `src/config.py`:
```python
class Config:
"""Application configuration."""
# Existing configuration...
# Security configuration
HTTPS_REDIRECT: bool = Field(
default=True,
description="Redirect HTTP requests to HTTPS in production"
)
TRUST_PROXY: bool = Field(
default=False,
description="Trust X-Forwarded-* headers from reverse proxy"
)
SECURE_COOKIES: bool = Field(
default=True,
description="Set secure flag on cookies (requires HTTPS)"
)
@validator("HTTPS_REDIRECT")
def validate_https_redirect(cls, v, values):
"""Disable HTTPS redirect in development."""
if values.get("ENV") == "development":
return False
return v
```
### 7. Pytest Security Marker Registration
**Question**: Should `@pytest.mark.security` be registered in pytest configuration?
**Answer**: Yes, register the marker in `pytest.ini` or `pyproject.toml`.
**Rationale**:
- Prevents pytest warnings about unregistered markers
- Enables running security tests separately: `pytest -m security`
- Documents available test categories
- Follows pytest best practices
**Implementation**:
Create or update `pytest.ini`:
```ini
[tool:pytest]
markers =
security: Security-related tests (timing attacks, injection, headers)
slow: Tests that take longer to run (timing attack statistics)
integration: Integration tests requiring full application context
```
Or in `pyproject.toml`:
```toml
[tool.pytest.ini_options]
markers = [
"security: Security-related tests (timing attacks, injection, headers)",
"slow: Tests that take longer to run (timing attack statistics)",
"integration: Integration tests requiring full application context",
]
```
**Usage**:
```bash
# Run only security tests
pytest -m security
# Run all except slow tests
pytest -m "not slow"
# Run security tests but not slow ones
pytest -m "security and not slow"
```
### 8. Secure Logging Guidelines Documentation
**Question**: How should secure logging guidelines be structured in the coding standards?
**Answer**: Add a dedicated "Security Practices" section to `/docs/standards/coding.md` with specific logging subsection.
**Rationale**:
- Security practices deserve prominent placement in coding standards
- Developers need clear, findable guidelines
- Examples make guidelines actionable
- Should cover both what to log and what not to log
**Implementation**:
Add to `/docs/standards/coding.md`:
```markdown
## Security Practices
### Secure Logging Guidelines
#### Never Log Sensitive Data
The following must NEVER appear in logs:
- Full tokens (authorization codes, access tokens, refresh tokens)
- Passwords or secrets
- Full authorization codes
- Private keys or certificates
- Personally identifiable information (PII) beyond user identifiers
#### Safe Logging Practices
When logging security-relevant events, follow these practices:
1. **Token Prefixes**: When token identification is necessary, log only the first 8 characters:
```python
logger.info("Token validated", extra={
"token_prefix": token[:8] + "..." if len(token) > 8 else "***",
"client_id": client_id
})
```
2. **Request Context**: Log security events with context:
```python
logger.warning("Authorization failed", extra={
"client_id": client_id,
"ip_address": request.client.host,
"user_agent": request.headers.get("User-Agent", "unknown"),
"error": error_code # Use error codes, not full messages
})
```
3. **Security Events to Log**:
- Failed authentication attempts
- Token validation failures
- Rate limit violations
- Input validation failures
- HTTPS redirect actions
- Client registration events
4. **Use Structured Logging**: Include metadata as structured fields:
```python
logger.info("Client registered", extra={
"event": "client.registered",
"client_id": client_id,
"registration_method": "self_service",
"timestamp": datetime.utcnow().isoformat()
})
```
5. **Sanitize User Input**: Always sanitize user-provided data before logging:
```python
def sanitize_for_logging(value: str, max_length: int = 100) -> str:
"""Sanitize user input for safe logging."""
# Remove control characters
value = "".join(ch for ch in value if ch.isprintable())
# Truncate if too long
if len(value) > max_length:
value = value[:max_length] + "..."
return value
```
#### Security Audit Logging
For security-critical operations, use a dedicated audit logger:
```python
audit_logger = logging.getLogger("security.audit")
# Log security-critical events
audit_logger.info("Token issued", extra={
"event": "token.issued",
"client_id": client_id,
"scope": scope,
"expires_in": expires_in,
"ip_address": request.client.host
})
```
#### Testing Logging Security
Include tests that verify sensitive data doesn't leak into logs:
```python
def test_no_token_in_logs(caplog):
"""Verify tokens are not logged in full."""
token = "sensitive_token_abc123xyz789"
# Perform operation that logs token
validate_token(token)
# Check logs don't contain full token
for record in caplog.records:
assert token not in record.getMessage()
# But prefix might be present
assert token[:8] in record.getMessage() or "***" in record.getMessage()
```
```
## Summary
All clarifications maintain the principle of simplicity while ensuring security. Key decisions:
1. **CSP allows any HTTPS image source** - supports self-service model
2. **HTTPS middleware checks proxy headers when configured** - supports real deployments
3. **Token prefixes use consistent 8-char + ellipsis format** - aids monitoring
4. **Timing tests use relaxed thresholds in CI** - balances reliability with security validation
5. **SQL injection tests use behavioral testing** - more reliable than source inspection
6. **Security config added to Config class** - runtime configuration for different environments
7. **Pytest markers registered properly** - enables targeted test runs
8. **Comprehensive security logging guidelines** - clear, actionable developer guidance
These clarifications ensure the Developer can proceed with implementation without ambiguity while maintaining security best practices.

File diff suppressed because it is too large Load Diff