docs: add Phase 5 design and architectural review documentation

- Add ADR-014: RSS Feed Implementation
- Add ADR-015: Phase 5 Implementation Approach
- Add Phase 5 design documents (RSS and container)
- Add pre-implementation review
- Add RSS and container validation reports
- Add architectural approval for v0.6.0 release

Architecture reviews confirm 98/100 (RSS) and 96/100 (container) scores.
Phase 5 approved for production deployment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
377
docs/decisions/ADR-014-rss-feed-implementation.md
Normal file
@@ -0,0 +1,377 @@
# ADR-014: RSS Feed Implementation Strategy

## Status

Accepted

## Context

Phase 5 requires implementing RSS feed generation for syndicating published notes. We need to decide on the implementation approach, feed format, caching strategy, and technical details for generating a standards-compliant RSS feed.

### Requirements

1. **Standard Compliance**: Feed must be valid RSS 2.0
2. **Content Inclusion**: Include all published notes (up to configured limit)
3. **Performance**: Feed generation should be fast and cacheable
4. **Simplicity**: Minimal dependencies, straightforward implementation
5. **IndieWeb Friendly**: Support feed discovery and proper metadata

### Key Questions

1. Which feed format(s) should we support?
2. How should we generate the RSS XML?
3. What caching strategy should we use?
4. How should we handle note titles (notes may not have explicit titles)?
5. How should we format dates for RSS?
6. What should the feed item limit be?

## Decision

### 1. Feed Format: RSS 2.0 Only (V1)

**Choice**: Implement RSS 2.0 exclusively for V1

**Rationale**:
- RSS 2.0 is supported by virtually all feed readers
- Simpler than Atom (fewer required elements)
- Sufficient for V1 needs (notes syndication)
- The feedgen library handles RSS 2.0 well
- Defer Atom and JSON Feed to V2+

**Alternatives Considered**:
- **Atom 1.0**: More modern, better extensibility
  - Rejected: More complex, not needed for basic notes
  - May add in V2
- **JSON Feed**: Developer-friendly format
  - Rejected: Less universal support, not essential
  - May add in V2
- **Multiple formats**: Support RSS + Atom + JSON
  - Rejected: Adds complexity, not justified for V1
  - Single format keeps implementation simple

### 2. XML Generation: feedgen Library

**Choice**: Use the feedgen library (already in dependencies)

**Rationale**:
- Already a dependency (listed in the architecture overview)
- Handles RSS/Atom generation correctly
- Produces valid, compliant XML
- Saves time vs. manual XML generation
- Well-maintained, stable library

**Alternatives Considered**:
- **Manual XML generation** (ElementTree or string templates)
  - Rejected: Error-prone, easy to produce invalid XML
  - Would need extensive validation
- **PyRSS2Gen library**
  - Rejected: Last updated 2007, unmaintained
- **Django Syndication Framework**
  - Rejected: Requires Django, too heavyweight

### 3. Feed Caching Strategy: Simple In-Memory Cache

**Choice**: 5-minute in-memory cache with ETag support

**Implementation**:
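```python
import hashlib
import time

FEED_CACHE_SECONDS = 300  # Cache for 5 minutes

_feed_cache = {
    'xml': None,
    'timestamp': None,
    'etag': None
}

def get_cached_feed(generate_feed):
    """Hypothetical helper: return (xml, etag), regenerating when stale."""
    now = time.monotonic()
    cached_at = _feed_cache['timestamp']
    if cached_at is not None and now - cached_at < FEED_CACHE_SECONDS:
        # Cache is fresh: return cached XML with its ETag
        return _feed_cache['xml'], _feed_cache['etag']

    # Cache is stale or empty: generate fresh feed and update cache
    xml = generate_feed()
    # Assumption: the ETag is a hash of the XML; the ADR does not specify this
    etag = hashlib.sha256(xml.encode('utf-8')).hexdigest()
    _feed_cache.update(xml=xml, timestamp=now, etag=etag)
    return xml, etag
```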

**Rationale**:
- 5 minutes is an acceptable delay for note updates
- RSS readers typically poll every 15-60 minutes
- In-memory cache is simple (no external dependencies)
- ETag enables conditional requests
- Cache-Control header enables client-side caching
- Low complexity, easy to implement

**Alternatives Considered**:
- **No caching**: Generate on every request
  - Rejected: Wasteful, feed generation involves DB + file reads
- **Flask-Caching with Redis**
  - Rejected: Adds external dependency (Redis)
  - Overkill for single-user system
- **File-based cache**
  - Rejected: Complicates invalidation, I/O overhead
- **Longer cache duration** (30+ minutes)
  - Rejected: Notes should appear reasonably quickly
  - 5 minutes balances performance and freshness

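
The ETag and Cache-Control behaviour described in this section can be sketched as a small, framework-independent helper. This is a minimal sketch, not the project's handler: the name `build_feed_response`, its parameters, and the `public, max-age=...` directive are assumptions made for illustration, since the ADR only states that ETag and Cache-Control are used.

```python
def build_feed_response(xml, etag, if_none_match=None, max_age=300):
    """Hypothetical helper: return (status, headers, body) for a feed request.

    `if_none_match` is the raw If-None-Match request header, if any.
    """
    quoted_etag = f'"{etag}"'
    headers = {
        'ETag': quoted_etag,
        # Assumed directive; lets clients reuse the feed for max_age seconds
        'Cache-Control': f'public, max-age={max_age}',
    }

    if if_none_match:
        # The header may list several tags, optionally prefixed with W/ (weak)
        candidates = set()
        for tag in if_none_match.split(','):
            tag = tag.strip()
            candidates.add(tag[2:] if tag.startswith('W/') else tag)
        if '*' in candidates or quoted_etag in candidates:
            # Client copy is current: send headers only, no body
            return 304, headers, b''

    headers['Content-Type'] = 'application/rss+xml; charset=utf-8'
    return 200, headers, xml.encode('utf-8')
```

In a Flask route, the three values could be passed on as `Response(body, status=status, headers=headers)`; a reader that polls with a matching `If-None-Match` then receives a 304 with no body.
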
### 4. Note Titles: First Line or Timestamp

**Choice**: Extract first line (max 100 chars) or use timestamp

**Algorithm**:
```python
def get_note_title(note):
    """Return the note's first line as its title, or a timestamp fallback."""
    # Try first line
    lines = note.content.strip().split('\n')
    if lines:
        # Drop leading Markdown heading markers only (keeps e.g. "C#")
        title = lines[0].lstrip('#').strip()
        if title:
            return title[:100]  # Truncate to 100 chars

    # Fall back to timestamp
    return note.created_at.strftime('%B %d, %Y at %I:%M %p')
```

**Rationale**:
- Per IndieWeb conventions, notes do not require titles
- First line often serves as an implicit title
- Timestamp fallback ensures every item has a title
- 100 char limit prevents overly long titles
- Simple, deterministic algorithm

**Alternatives Considered**:
- **Always use timestamp**: Too generic, not descriptive
- **Use content hash**: Not human-friendly
- **Require explicit title**: Breaks note simplicity
- **Use first sentence**: Complex parsing, can be long
- **Content preview (first 50 chars)**: May not be meaningful

### 5. Date Formatting: RFC-822

**Choice**: RFC-822 format as required by RSS 2.0 spec

**Format**: `Mon, 18 Nov 2024 12:00:00 +0000`

**Implementation**:
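```python
from datetime import timezone
from email.utils import format_datetime

def format_rfc822_date(dt):
    """Format datetime to RFC-822 (UTC)"""
    # Ensure UTC: naive values are assumed to be UTC, aware ones are converted
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    dt_utc = dt.astimezone(timezone.utc)
    # RFC-822 format with locale-independent English day and month names
    return format_datetime(dt_utc)
```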

**Rationale**:
- Required by RSS 2.0 specification
- Standard format recognized by all feed readers
- Python's standard library can produce this format
- Always use UTC to avoid timezone confusion

**Alternatives Considered**:
- **ISO 8601 format**: Used by Atom, not valid for RSS 2.0
- **Unix timestamp**: Not human-readable, not standard
- **Local timezone**: Ambiguous, causes parsing issues

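
One way to confirm that emitted dates match the format in this section is to parse them back with the standard library. The sketch below uses a hypothetical helper name, `is_rfc822_utc`, and assumes that only a zero UTC offset is acceptable, in line with the "always use UTC" rationale.

```python
from email.utils import parsedate_to_datetime


def is_rfc822_utc(value):
    """Hypothetical check: True if `value` parses as an RFC-822 date in UTC."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError
        return False
    offset = parsed.utcoffset()
    # A missing offset (naive result, e.g. "-0000") is not treated as UTC here
    return offset is not None and offset.total_seconds() == 0
```

A check like this fits naturally into a unit test over every `<pubDate>` in a generated feed; it rejects ISO 8601 strings and non-UTC offsets.
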
### 6. Feed Item Limit: 50 (Configurable)

**Choice**: Default limit of 50 items, configurable via `FEED_MAX_ITEMS`

**Rationale**:
- 50 items is sufficient for typical use (notes, not articles)
- RSS readers handle 50 items well
- Keeps feed size reasonable (typically under 100 KB)
- Configurable for users with different needs
- Balances completeness and performance

**Alternatives Considered**:
- **No limit**: Feed could become very large
  - Rejected: Performance issues, large XML
- **Limit of 10-20**: Too few, users might want more history
- **Pagination**: Complex, not well-supported by readers
  - Deferred to V2 if needed
- **Dynamic limit based on date**: Complicated logic

### 7. Content Inclusion: Full HTML in CDATA

**Choice**: Include full rendered HTML content in CDATA wrapper

**Format**:
```xml
<description><![CDATA[
<p>Rendered HTML content here</p>
]]></description>
```

**Rationale**:
- RSS readers expect HTML in description
- CDATA prevents XML parsing issues
- Rendered HTML from Markdown is already available
- Provides full context to readers
- Standard practice for content-rich feeds

**Alternatives Considered**:
- **Plain text only**: Loses formatting
- **Markdown in description**: Not rendered by readers
- **Summary/excerpt**: Notes are short, full content appropriate
- **External link only**: Forces reader to leave feed

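
CDATA avoids escaping HTML markup, with one edge case: the content itself must not contain the terminator `]]>`. The following is a minimal sketch of the usual workaround, using a hypothetical helper named `wrap_cdata`. Whether feedgen already handles this internally is not confirmed here, so treat it as an illustration of the technique rather than a required step.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(html):
    """Hypothetical helper: wrap HTML in CDATA, splitting any ']]>' inside it."""
    # ']]>' would end the section early, so close and reopen CDATA around it
    return '<![CDATA[' + html.replace(']]>', ']]]]><![CDATA[>') + ']]>'


def roundtrip(html):
    """Parse a <description> built with wrap_cdata and return its text."""
    element = ET.fromstring('<description>' + wrap_cdata(html) + '</description>')
    return element.text
```

Parsing the wrapped value back, as `roundtrip` does, is a quick way to confirm that the reader sees exactly the HTML that was rendered.
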
### 8. Feed Discovery: Standard Link Element

**Choice**: Add `<link rel="alternate">` to all HTML pages

**Implementation**:
```html
<link rel="alternate" type="application/rss+xml"
      title="Site Name RSS Feed"
      href="https://example.com/feed.xml">
```

**Rationale**:
- Standard HTML feed discovery mechanism
- RSS readers auto-detect feeds
- IndieWeb recommended practice
- No JavaScript required
- Works in all browsers

**Alternatives Considered**:
- **No discovery**: Users must know feed URL
  - Rejected: Poor user experience
- **JavaScript-based discovery**: Unnecessary complexity
- **HTTP Link header**: Less common, harder to discover

## Implementation Details

### Module Structure

**File**: `starpunk/feed.py`

**Functions**:
1. `generate_feed()` - Main feed generation
2. `format_rfc822_date()` - Date formatting
3. `get_note_title()` - Title extraction
4. `clean_html_for_rss()` - HTML sanitization

**Dependencies**: feedgen library (already included)

### Route

**Path**: `/feed.xml`

**Handler**: `public.feed()` in `starpunk/routes/public.py`

**Caching**: In-memory cache + ETag + Cache-Control

### Configuration

**Environment Variables**:
- `FEED_MAX_ITEMS` - Maximum feed items (default: 50)
- `FEED_CACHE_SECONDS` - Cache duration in seconds (default: 300)

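
A minimal sketch of how these two variables might be read follows. The helper `read_positive_int` is hypothetical, and falling back to the default on invalid or non-positive values is an assumption; the ADR specifies only the variable names and defaults.

```python
import os


def read_positive_int(name, default, environ=None):
    """Hypothetical helper: read a positive integer setting, else the default."""
    env = os.environ if environ is None else environ
    raw = env.get(name, '').strip()
    try:
        value = int(raw)
    except ValueError:
        # Unset, empty, or non-numeric values fall back to the default
        return default
    # Assumption: zero and negative values are treated as misconfiguration
    return value if value > 0 else default


FEED_MAX_ITEMS = read_positive_int('FEED_MAX_ITEMS', 50)
FEED_CACHE_SECONDS = read_positive_int('FEED_CACHE_SECONDS', 300)
```

Passing an explicit `environ` mapping keeps the helper easy to test without touching the real process environment.
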
### Required Channel Elements

RSS 2.0 requires `<title>`, `<link>`, and `<description>` on the channel; the remaining elements are optional in the spec but always emitted by this feed:
- `<title>` - Site name
- `<link>` - Site URL
- `<description>` - Site description
- `<language>` - en-us
- `<lastBuildDate>` - Feed generation time
- `<atom:link rel="self">` - Feed URL (self-reference from the Atom namespace, recommended by the W3C Feed Validator)

### Required Item Elements

RSS 2.0 makes every item element optional as long as `<title>` or `<description>` is present; every item in this feed includes:
- `<title>` - Note title
- `<link>` - Note permalink
- `<guid isPermaLink="true">` - Note permalink
- `<pubDate>` - Note publication date
- `<description>` - Full HTML content in CDATA

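
The channel and item lists in this section lend themselves to an automated check. The sketch below parses a feed with the standard library and reports which of the listed elements are missing. The function name `find_missing_elements` and the sample feed (including its URLs) are invented for illustration, and `<atom:link>` is left out to keep the example free of namespace handling.

```python
import xml.etree.ElementTree as ET

# Element names taken from the channel and item lists (atom:link omitted)
CHANNEL_ELEMENTS = ('title', 'link', 'description', 'language', 'lastBuildDate')
ITEM_ELEMENTS = ('title', 'link', 'guid', 'pubDate', 'description')


def find_missing_elements(feed_xml):
    """Hypothetical check: list expected elements absent from an RSS string."""
    channel = ET.fromstring(feed_xml).find('channel')
    if channel is None:
        return ['channel']
    missing = [name for name in CHANNEL_ELEMENTS if channel.find(name) is None]
    for index, item in enumerate(channel.findall('item')):
        missing.extend(
            f'item[{index}]/{name}'
            for name in ITEM_ELEMENTS
            if item.find(name) is None
        )
    return missing


# Invented sample data, not output from the real feed generator
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Site Name</title>
<link>https://example.com/</link>
<description>Example feed</description>
<language>en-us</language>
<lastBuildDate>Mon, 18 Nov 2024 12:00:00 +0000</lastBuildDate>
<item>
<title>First note</title>
<link>https://example.com/note/1</link>
<guid isPermaLink="true">https://example.com/note/1</guid>
<pubDate>Mon, 18 Nov 2024 12:00:00 +0000</pubDate>
<description><![CDATA[<p>Hello</p>]]></description>
</item>
</channel></rss>"""
```

An empty list from `find_missing_elements(SAMPLE_FEED)` means every listed element is present. This complements, rather than replaces, a run through the W3C Feed Validator.
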
## Consequences

### Positive

1. **Standard Compliance**: Valid RSS 2.0 feeds work everywhere
2. **Performance**: Caching reduces load, fast responses
3. **Simplicity**: Single feed format, straightforward implementation
4. **Reliability**: feedgen library ensures valid XML
5. **Flexibility**: Configurable limits accommodate different needs
6. **Discovery**: Auto-detection in feed readers
7. **Complete Content**: Full HTML in feed, no truncation

### Negative

1. **Single Format**: No Atom or JSON Feed in V1
   - Mitigation: Can add in V2 if requested
2. **Fixed Cache Duration**: Not dynamically adjusted
   - Mitigation: 5 minutes is a reasonable compromise
3. **Memory-Based Cache**: Lost on restart
   - Mitigation: Acceptable, regenerates quickly
4. **No Pagination**: Large archives not fully accessible
   - Mitigation: 50 items is sufficient for notes

### Neutral

1. **Title Algorithm**: May not always produce ideal titles
   - Acceptable: Notes don't require titles, algorithm is reasonable
2. **UTC Timestamps**: Users might prefer local time
   - Standard: UTC is common practice in RSS feeds

## Validation

The decision will be validated by:

1. **W3C Feed Validator**: Feed must pass without errors
2. **Feed Reader Testing**: Test in multiple readers (Feedly, NewsBlur, etc.)
3. **Performance Testing**: Uncached feed generation completes in under 100 ms
4. **Caching Testing**: Cache reduces load and serves the cached feed until it expires, then regenerates
5. **Standards Review**: RSS 2.0 spec compliance verification

## Alternatives Rejected

### Use Django Syndication Framework

**Reason**: Requires Django, which we're not using (Flask project)

### Generate RSS Manually with Templates

**Reason**: Error-prone, hard to maintain, easy to produce invalid XML

### Support Multiple Feed Formats in V1

**Reason**: Adds complexity without clear benefit; RSS 2.0 is sufficient

### No Feed Caching

**Reason**: Wasteful; feed generation involves DB + file I/O

### Per-Tag Feeds

**Reason**: V1 doesn't have tags; defer to V2

### WebSub (PubSubHubbub) Support

**Reason**: Adds complexity and an external dependency; not essential for V1

## References

### Standards
- [RSS 2.0 Specification](https://www.rssboard.org/rss-specification)
- [RFC-822 Date Format](https://www.rfc-editor.org/rfc/rfc822)
- [W3C Feed Validator](https://validator.w3.org/feed/)

### Libraries
- [feedgen Documentation](https://feedgen.kiesow.be/)
- [Python datetime Documentation](https://docs.python.org/3/library/datetime.html)

### IndieWeb
- [IndieWeb RSS](https://indieweb.org/RSS)
- [Feed Discovery](https://indieweb.org/feed_discovery)

### Internal Documentation
- [Architecture Overview](/home/phil/Projects/starpunk/docs/architecture/overview.md)
- [Phase 5 Design](/home/phil/Projects/starpunk/docs/designs/phase-5-rss-and-container.md)

---

**ADR**: 014
**Status**: Accepted
**Date**: 2025-11-18
**Author**: StarPunk Architect
**Related**: ADR-002 (Flask Extensions), Phase 5 Design