# IndieAuth Endpoint Discovery Security Analysis

## Executive Summary

This document analyzes the security implications of implementing IndieAuth endpoint discovery correctly and contrasts it with the fundamentally flawed approach of hardcoding endpoints.

## The Critical Error: Hardcoded Endpoints

### What Was Wrong

```ini
# FATALLY FLAWED - Breaks IndieAuth completely
TOKEN_ENDPOINT=https://tokens.indieauth.com/token
```

### Why It's a Security Disaster

1. **Single Point of Failure**: If the hardcoded endpoint is compromised, ALL users are affected
2. **No User Control**: Users cannot change providers if security issues arise
3. **Trust Concentration**: Forces all users to trust a single provider
4. **Not IndieAuth**: This isn't IndieAuth at all - it's just OAuth with extra steps
5. **Violates User Sovereignty**: Users don't control their own authentication

## The Correct Approach: Dynamic Discovery

### Security Model

```
User Identity URL → Endpoint Discovery → Provider Verification
 (User Controls)        (Dynamic)           (User's Choice)
```

### Security Benefits

1. **Distributed Trust**: No single provider compromise affects all users
2. **User Control**: Users can switch providers instantly if needed
3. **Provider Independence**: Each user's security is independent
4. **Immediate Revocation**: Users can revoke by changing profile links
5. **True Decentralization**: No central authority

## Threat Analysis

### Threat 1: Profile URL Hijacking

**Attack Vector**: Attacker gains control of user's profile URL

**Impact**: Can redirect authentication to attacker's endpoints

**Mitigations**:
- Profile URL must use HTTPS
- Verify SSL certificates
- Monitor for unexpected endpoint changes
- Cache endpoints with reasonable TTL

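The last two mitigations above (monitoring for endpoint changes and caching with a TTL) can be combined in one small structure. The following is a minimal sketch, not the project's implementation: the class name `EndpointChangeMonitor`, the one-hour default TTL, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class EndpointChangeMonitor:
    """Hypothetical TTL cache that also reports when a profile's endpoints change."""

    def __init__(self, ttl_seconds=3600, clock=time.time):
        self.ttl_seconds = ttl_seconds  # assumed default; tune per deployment
        self.clock = clock              # injectable so tests can control time
        self._entries = {}              # profile_url -> (endpoints, stored_at)

    def get(self, profile_url):
        """Return cached endpoints, or None if missing or older than the TTL."""
        entry = self._entries.get(profile_url)
        if entry is None:
            return None
        endpoints, stored_at = entry
        if self.clock() - stored_at > self.ttl_seconds:
            return None  # expired: the caller should run discovery again
        return endpoints

    def record(self, profile_url, endpoints):
        """Store freshly discovered endpoints.

        Returns True when a previous value existed and differs, which is the
        signal a caller might log or alert on.
        """
        previous = self._entries.get(profile_url)
        changed = previous is not None and previous[0] != endpoints
        self._entries[profile_url] = (dict(endpoints), self.clock())
        return changed
```

Expired entries are deliberately kept (only hidden from `get`) so that a change can still be detected when discovery runs again after the TTL has elapsed.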
### Threat 2: Endpoint Discovery Manipulation

**Attack Vector**: MITM attack during endpoint discovery

**Impact**: Could redirect to malicious endpoints

**Mitigations**:
```python
import requests


class SecurityError(Exception):
    """Raised when a security check fails."""


def discover_endpoints(profile_url: str) -> dict:
    # CRITICAL: Enforce HTTPS
    if not profile_url.startswith('https://'):
        raise SecurityError("Profile URL must use HTTPS")

    # Verify SSL certificates
    response = requests.get(
        profile_url,
        verify=True,  # Enforce certificate validation
        timeout=5
    )

    # Validate discovered endpoints
    # (extract_endpoints is assumed to be defined elsewhere)
    endpoints = extract_endpoints(response)
    for endpoint_url in endpoints.values():
        if not endpoint_url.startswith('https://'):
            raise SecurityError(f"Endpoint must use HTTPS: {endpoint_url}")

    return endpoints
```

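The `extract_endpoints` helper called in the discovery function is not defined in this document. As a rough illustration of what the HTML half of it might do, the sketch below collects `<link rel="authorization_endpoint">` and `<link rel="token_endpoint">` elements and resolves relative `href` values against the profile URL. The function name `extract_endpoints_from_html`, its signature, and the decision to ignore HTTP `Link` headers and `indieauth-metadata` are assumptions made to keep the example short.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# rel values this sketch looks for (assumption: legacy link relations only)
ENDPOINT_RELS = ("authorization_endpoint", "token_endpoint")


class _LinkRelParser(HTMLParser):
    """Collects the first href seen for each endpoint rel value."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel may hold several space-separated values
        for rel in (attr.get("rel") or "").split():
            if rel in ENDPOINT_RELS and rel not in self.found:
                self.found[rel] = href


def extract_endpoints_from_html(html: str, base_url: str) -> dict:
    """Hypothetical helper: map rel name -> absolute endpoint URL."""
    parser = _LinkRelParser()
    parser.feed(html)
    return {rel: urljoin(base_url, href) for rel, href in parser.found.items()}
```

The returned URLs would still need the HTTPS check performed in `discover_endpoints`, since a relative link inherits the scheme of the profile URL but an absolute one does not.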
### Threat 3: Cache Poisoning

**Attack Vector**: Attacker poisons endpoint cache with malicious URLs

**Impact**: Subsequent requests use attacker's endpoints

**Mitigations**:
```python
import time
from typing import Optional


class SecureEndpointCache:
    def __init__(self):
        self.cache = {}

    def store_endpoints(self, profile_url: str, endpoints: dict):
        # Validate before caching
        self._validate_profile_url(profile_url)
        self._validate_endpoints(endpoints)

        # Store with integrity check
        cache_entry = {
            'endpoints': endpoints,
            'stored_at': time.time(),
            'checksum': self._calculate_checksum(endpoints)
        }
        self.cache[profile_url] = cache_entry

    def get_endpoints(self, profile_url: str) -> Optional[dict]:
        entry = self.cache.get(profile_url)
        if entry:
            # Verify integrity
            if self._calculate_checksum(entry['endpoints']) != entry['checksum']:
                # Cache corruption detected
                del self.cache[profile_url]
                raise SecurityError("Cache integrity check failed")
            return entry['endpoints']
        return None  # Cache miss
```

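The `_calculate_checksum` method used by `SecureEndpointCache` is not shown. One possible sketch follows. Note that it swaps in a keyed HMAC rather than a plain hash: a plain checksum stored next to the data catches accidental corruption, but anyone able to rewrite the entry could rewrite the checksum too. The function names, the `key` parameter, and the canonical-JSON encoding are assumptions, not part of the original design.

```python
import hashlib
import hmac
import json


def calculate_checksum(endpoints: dict, key: bytes) -> str:
    """Hypothetical helper: HMAC-SHA256 over a canonical JSON encoding.

    sort_keys makes the result independent of dict insertion order.
    """
    canonical = json.dumps(endpoints, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_checksum(endpoints: dict, expected: str, key: bytes) -> bool:
    """Compare in constant time to avoid leaking how many characters matched."""
    return hmac.compare_digest(calculate_checksum(endpoints, key), expected)
```

The key would have to live outside the cache itself (for example in application configuration) for the check to add anything over a plain hash.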
### Threat 4: Redirect Attacks

**Attack Vector**: Malicious redirects during discovery

**Impact**: Could redirect to attacker-controlled endpoints

**Mitigations**:
```python
from urllib.parse import urljoin

import requests


def fetch_with_redirect_limit(url: str, max_redirects: int = 5):
    redirect_count = 0
    visited = set()

    # "<=" so the request after the last permitted redirect is still made
    while redirect_count <= max_redirects:
        if url in visited:
            raise SecurityError("Redirect loop detected")
        visited.add(url)

        response = requests.get(url, allow_redirects=False, timeout=5)

        if response.status_code in (301, 302, 303, 307, 308):
            location = response.headers.get('Location')
            if not location:
                raise SecurityError("Redirect response without Location header")

            # Resolve relative redirects against the current URL
            redirect_url = urljoin(url, location)

            # Validate redirect target
            if not redirect_url.startswith('https://'):
                raise SecurityError("Redirect to non-HTTPS URL blocked")

            url = redirect_url
            redirect_count += 1
        else:
            return response

    raise SecurityError("Too many redirects")
```

### Threat 5: Token Replay Attacks

**Attack Vector**: Intercepted token reused

**Impact**: Unauthorized access

**Mitigations**:
- Always use HTTPS for token transmission
- Implement token expiration
- Cache token verification results briefly
- Use nonce/timestamp validation

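As an illustration of caching verification results briefly without keeping raw tokens in memory, the sketch below keys the cache on a SHA-256 digest of the token and expires entries after a short TTL. The class name `TokenVerificationCache`, the five-minute default, and the injectable `clock` are assumptions for the example.

```python
import hashlib
import time


class TokenVerificationCache:
    """Hypothetical short-lived cache of token verification results."""

    def __init__(self, ttl_seconds=300, clock=time.time):
        self.ttl_seconds = ttl_seconds  # assumed default; keep this short
        self.clock = clock
        self._entries = {}              # sha256(token) -> (info, expires_at)

    @staticmethod
    def _key(token: str) -> str:
        # Store only a digest so a memory or cache dump does not reveal tokens
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def store(self, token: str, token_info: dict) -> None:
        expires_at = self.clock() + self.ttl_seconds
        self._entries[self._key(token)] = (token_info, expires_at)

    def lookup(self, token: str):
        """Return cached info, or None if unknown or expired."""
        key = self._key(token)
        entry = self._entries.get(key)
        if entry is None:
            return None
        token_info, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # drop stale result; caller re-verifies
            return None
        return token_info
```

A short TTL bounds how long a token revoked at the provider keeps working locally, which is the trade-off behind "briefly" in the list above.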
## Security Requirements

### 1. HTTPS Enforcement

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


class HTTPSEnforcer:
    def __init__(self, development_mode: bool = False):
        self.development_mode = development_mode

    def validate_url(self, url: str, context: str):
        """Enforce HTTPS for all security-critical URLs"""

        parsed = urlparse(url)

        # Development exception (with warning)
        if self.development_mode and parsed.hostname in ['localhost', '127.0.0.1']:
            logger.warning(f"Allowing HTTP in development for {context}: {url}")
            return

        # Production: HTTPS required
        if parsed.scheme != 'https':
            raise SecurityError(f"HTTPS required for {context}: {url}")
```

### 2. Certificate Validation

```python
import httpx


def create_secure_http_client():
    """Create HTTP client with proper security settings"""

    return httpx.Client(
        verify=True,  # Always verify SSL certificates
        follow_redirects=False,  # Handle redirects manually
        timeout=httpx.Timeout(
            connect=5.0,
            read=10.0,
            write=10.0,
            pool=10.0
        ),
        limits=httpx.Limits(
            max_connections=100,
            max_keepalive_connections=20
        ),
        headers={
            'User-Agent': 'StarPunk/1.0 (+https://starpunk.example.com/)'
        }
    )
```

### 3. Input Validation

```python
def validate_endpoint_response(response: dict, expected_me: str):
    """Validate token verification response"""

    # Required fields
    if 'me' not in response:
        raise ValidationError("Missing 'me' field in response")

    # URL normalization and comparison
    normalized_me = normalize_url(response['me'])
    normalized_expected = normalize_url(expected_me)

    if normalized_me != normalized_expected:
        raise ValidationError(
            f"Token 'me' mismatch: expected {normalized_expected}, "
            f"got {normalized_me}"
        )

    # Scope validation
    scopes = response.get('scope', '').split()
    if 'create' not in scopes:
        raise ValidationError("Token missing required 'create' scope")

    return True
```

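`normalize_url` is used by `validate_endpoint_response` but is not defined in this document. A minimal sketch of what such a helper might do is shown here, under these assumptions: the scheme and host are compared case-insensitively, an empty path is treated as `/`, default ports are dropped, and fragments are discarded. It does not handle IPv6 literals or userinfo, and the project's real helper may apply different rules.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that carry no information once the scheme is known
_DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Hypothetical normalizer for comparing profile URLs."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()  # hostname is case-insensitive

    # Keep the port only when it differs from the scheme's default
    netloc = host
    if parts.port is not None and parts.port != _DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"

    path = parts.path or "/"  # "https://example.com" and ".../" are the same page

    # The path and query stay case-sensitive; the fragment is dropped
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

Comparing normalized forms avoids rejecting a valid token merely because the provider returned `https://Example.com` while the configured identity is `https://example.com/`.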
### 4. Rate Limiting

```python
import time
from collections import defaultdict


class DiscoveryRateLimiter:
    """Prevent discovery abuse"""

    def __init__(self, max_per_minute: int = 60):
        self.requests = defaultdict(list)
        self.max_per_minute = max_per_minute

    def check_rate_limit(self, profile_url: str):
        now = time.time()
        minute_ago = now - 60

        # Clean old entries
        self.requests[profile_url] = [
            t for t in self.requests[profile_url]
            if t > minute_ago
        ]

        # Check limit
        if len(self.requests[profile_url]) >= self.max_per_minute:
            raise RateLimitError(f"Too many discovery requests for {profile_url}")

        # Record request
        self.requests[profile_url].append(now)
```

## Implementation Checklist

### Discovery Security

- [ ] Enforce HTTPS for profile URLs
- [ ] Validate SSL certificates
- [ ] Limit redirect chains to 5
- [ ] Detect redirect loops
- [ ] Validate discovered endpoint URLs
- [ ] Implement discovery rate limiting
- [ ] Log all discovery attempts
- [ ] Handle timeouts gracefully

### Token Verification Security

- [ ] Use HTTPS for all token endpoints
- [ ] Validate token endpoint responses
- [ ] Check 'me' field matches expected
- [ ] Verify required scopes present
- [ ] Hash tokens before caching
- [ ] Implement cache expiration
- [ ] Use constant-time comparisons
- [ ] Log verification failures

### Cache Security

- [ ] Validate data before caching
- [ ] Implement cache size limits
- [ ] Use TTL for all cache entries
- [ ] Clear cache on configuration changes
- [ ] Protect against cache poisoning
- [ ] Monitor cache hit/miss rates
- [ ] Implement cache integrity checks

### Error Handling

- [ ] Never expose internal errors
- [ ] Log security events
- [ ] Rate limit error responses
- [ ] Implement proper timeouts
- [ ] Handle network failures gracefully
- [ ] Provide clear user messages

## Security Testing

### Test Scenarios

1. **HTTPS Downgrade Attack**
   - Try to use HTTP endpoints
   - Verify rejection

2. **Invalid Certificates**
   - Test with self-signed certs
   - Test with expired certs
   - Verify rejection

3. **Redirect Attacks**
   - Test redirect loops
   - Test excessive redirects
   - Test HTTP redirects
   - Verify proper handling

4. **Cache Poisoning**
   - Attempt to inject invalid data
   - Verify cache validation

5. **Token Manipulation**
   - Modify token before verification
   - Test expired tokens
   - Test tokens with wrong 'me'
   - Verify proper rejection

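To show how the first scenario above might translate into an automated test, here is a small `unittest` sketch. It is self-contained for illustration only: `require_https` and the local `SecurityError` class are hypothetical stand-ins for whatever validation function and exception type the project actually exposes.

```python
import unittest
from urllib.parse import urlparse


class SecurityError(Exception):
    """Stand-in for the project's security exception (assumption)."""


def require_https(url: str) -> str:
    """Hypothetical validator: return the URL unchanged or raise SecurityError."""
    if urlparse(url).scheme != "https":
        raise SecurityError(f"HTTPS required: {url}")
    return url


class HTTPSDowngradeTests(unittest.TestCase):
    def test_http_endpoint_is_rejected(self):
        with self.assertRaises(SecurityError):
            require_https("http://tokens.example.com/token")

    def test_scheme_less_url_is_rejected(self):
        with self.assertRaises(SecurityError):
            require_https("tokens.example.com/token")

    def test_https_endpoint_is_accepted(self):
        url = "https://tokens.example.com/token"
        self.assertEqual(require_https(url), url)


if __name__ == "__main__":
    unittest.main()
```

The certificate and redirect scenarios need a controllable HTTP server or a mocked transport, so they are not covered by this sketch.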
## Monitoring and Alerting

### Security Metrics

```python
# Track these metrics
security_metrics = {
    'discovery_failures': Counter(),
    'https_violations': Counter(),
    'certificate_errors': Counter(),
    'redirect_limit_exceeded': Counter(),
    'cache_poisoning_attempts': Counter(),
    'token_verification_failures': Counter(),
    'rate_limit_violations': Counter()
}
```

### Alert Conditions

- Multiple discovery failures for same profile
- Sudden increase in HTTPS violations
- Certificate validation failures
- Cache poisoning attempts detected
- Unusual token verification patterns

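The first alert condition (multiple discovery failures for the same profile) could be checked with a per-profile counter and a threshold. This is only a sketch: the class name `DiscoveryFailureAlert`, the threshold of 3, and the choice to reset the count on success are assumptions, and it uses `collections.Counter` from the standard library, which may not be the metrics type the project uses.

```python
from collections import Counter


class DiscoveryFailureAlert:
    """Hypothetical tracker that flags repeated discovery failures per profile."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold  # assumed value; tune to observed noise
        self.failures = Counter()   # profile_url -> consecutive failure count

    def record_failure(self, profile_url: str) -> bool:
        """Count a failure; return True once the threshold is reached."""
        self.failures[profile_url] += 1
        return self.failures[profile_url] >= self.threshold

    def record_success(self, profile_url: str) -> None:
        """A successful discovery clears the profile's failure streak."""
        self.failures.pop(profile_url, None)
```

A caller would invoke `record_failure` from the discovery error path and raise an alert (log entry, notification) when it returns `True`.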
## Incident Response

### If Endpoint Compromise Suspected

1. Clear endpoint cache immediately
2. Force re-discovery of all endpoints
3. Alert affected users
4. Review logs for suspicious patterns
5. Document incident

### If Cache Poisoning Detected

1. Clear entire cache
2. Review cache validation logic
3. Identify attack vector
4. Implement additional validation
5. Monitor for recurrence

## Conclusion

Dynamic endpoint discovery is not just correct according to the IndieAuth specification - it's also more secure than hardcoded endpoints. By allowing users to control their authentication infrastructure, we:

1. Eliminate single points of failure
2. Enable immediate provider switching
3. Distribute security responsibility
4. Maintain true decentralization
5. Respect user sovereignty

The complexity of proper implementation is justified by the security and flexibility benefits. This is what IndieAuth is designed to provide, and we must implement it correctly.

---

**Document Version**: 1.0
**Created**: 2024-11-24
**Classification**: Security Architecture
**Review Schedule**: Quarterly