This commit implements Phases 2.0-2.3 of v1.1.2 Phase 2 (Feed Formats), adding ATOM 1.0 and JSON Feed 1.1 support alongside the existing RSS feed.

CRITICAL BUG FIX:
- Fixed RSS streaming feed ordering (was showing oldest-first instead of newest-first)
- Streaming RSS removed incorrect reversed() call at line 198
- Feedgen RSS kept correct reversed() to compensate for library behavior

NEW FEATURES:
- ATOM 1.0 feed generation (RFC 4287 compliant)
  - Proper XML namespacing and RFC 3339 dates
  - Streaming and non-streaming methods
  - 11 comprehensive tests
- JSON Feed 1.1 generation (JSON Feed spec compliant)
  - RFC 3339 dates and UTF-8 JSON output
  - Custom _starpunk extension with permalink_path and word_count
  - 13 comprehensive tests

REFACTORING:
- Restructured feed code into starpunk/feeds/ module
  - feeds/rss.py - RSS 2.0 (moved from feed.py)
  - feeds/atom.py - ATOM 1.0 (new)
  - feeds/json_feed.py - JSON Feed 1.1 (new)
- Backward compatible feed.py shim for existing imports
- Business metrics integrated into all feed generators

TESTING:
- Created shared test helper tests/helpers/feed_ordering.py
- Helper validates newest-first ordering across all formats
- 48 total feed tests, all passing
  - RSS: 24 tests
  - ATOM: 11 tests
  - JSON Feed: 13 tests

FILES CHANGED:
- Modified: starpunk/feed.py (now compatibility shim)
- New: starpunk/feeds/ module with rss.py, atom.py, json_feed.py
- New: tests/helpers/feed_ordering.py (shared test helper)
- New: tests/test_feeds_atom.py, tests/test_feeds_json.py
- Modified: CHANGELOG.md (Phase 2 entries)
- New: docs/reports/2025-11-26-v1.1.2-phase2-feed-formats-partial.md

NEXT STEPS:
Phase 2.4 (Content Negotiation) pending - will add /feed endpoint with Accept header negotiation and explicit format endpoints.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

# StarPunk v1.1.2 Phase 2 Feed Formats - Implementation Report (Partial)

**Date**: 2025-11-26
**Developer**: StarPunk Fullstack Developer (AI)
**Phase**: v1.1.2 "Syndicate" - Phase 2 (Phases 2.0-2.3 Complete)
**Status**: Partially Complete - Content Negotiation (Phase 2.4) Pending

## Executive Summary

Implemented ATOM 1.0 and JSON Feed 1.1 support for StarPunk, along with a critical RSS feed ordering fix and a restructured feed module. This partial completion of Phase 2 provides the foundation for multi-format feed syndication.

### What Was Completed

- ✅ **Phase 2.0**: RSS Feed Ordering Fix (CRITICAL bug fix)
- ✅ **Phase 2.1**: Feed Module Restructuring
- ✅ **Phase 2.2**: ATOM 1.0 Feed Implementation
- ✅ **Phase 2.3**: JSON Feed 1.1 Implementation
- ⏳ **Phase 2.4**: Content Negotiation (PENDING - for next session)

### Key Achievements

1. **Fixed Critical RSS Bug**: Streaming RSS was showing oldest-first instead of newest-first
2. **Added ATOM Support**: Full RFC 4287 compliance with 11 passing tests
3. **Added JSON Feed Support**: JSON Feed 1.1 spec with 13 passing tests
4. **Restructured Code**: Clean module organization in `starpunk/feeds/`
5. **Business Metrics**: Integrated feed generation tracking
6. **Test Coverage**: 48 total feed tests, all passing

## Implementation Details

### Phase 2.0: RSS Feed Ordering Fix (0.5 hours)

**CRITICAL Production Bug**: RSS feeds were displaying entries oldest-first instead of newest-first due to an incorrect `reversed()` call in streaming generation.

#### Root Cause Analysis

The bug was more subtle than initially described in the instructions:

1. **Feedgen-based RSS** (line 100): The `reversed()` call was CORRECT
   - Feedgen library internally reverses entry order when generating XML
   - Our `reversed()` compensates for this behavior
   - Removing it would break the feed

2. **Streaming RSS** (line 198): The `reversed()` call was WRONG
   - Manual XML generation doesn't reverse order
   - The `reversed()` call flipped the newest-first input to oldest-first
   - Removing it fixed the ordering

#### Solution Implemented

```python
# feeds/rss.py - Line 100 (feedgen version) - KEPT reversed()
for note in reversed(notes[:limit]):
    fe = fg.add_entry()

# feeds/rss.py - Line 198 (streaming version) - REMOVED reversed()
for note in notes[:limit]:
    yield item_xml
```

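The difference between the two code paths can be illustrated with a small, self-contained sketch. It assumes `notes` arrives newest-first and models the library's order reversal described above with a hypothetical stand-in class (`PrependingFeed`, which is not the feedgen API); the note data and function names are illustrative only.

```python
from datetime import datetime, timezone

# Hypothetical notes, assumed to be sorted newest-first already.
notes = [
    {"slug": "c", "published": datetime(2025, 11, 26, tzinfo=timezone.utc)},
    {"slug": "b", "published": datetime(2025, 11, 25, tzinfo=timezone.utc)},
    {"slug": "a", "published": datetime(2025, 11, 24, tzinfo=timezone.utc)},
]


class PrependingFeed:
    """Stand-in for a generator that places each added entry at the front."""

    def __init__(self):
        self.entries = []

    def add_entry(self, note):
        self.entries.insert(0, note)


def library_style(notes, limit=50):
    """Add entries oldest-first so the prepending feed ends up newest-first."""
    feed = PrependingFeed()
    for note in reversed(notes[:limit]):  # compensates for prepending
        feed.add_entry(note)
    return [n["slug"] for n in feed.entries]


def streaming_style(notes, limit=50):
    """Manual generation emits items in iteration order, so no reversed()."""
    return [note["slug"] for note in notes[:limit]]


def buggy_streaming_style(notes, limit=50):
    """The pre-fix behavior: reversed() on input that is already newest-first."""
    return [note["slug"] for note in reversed(notes[:limit])]
```

Under these assumptions, `library_style` and `streaming_style` both produce newest-first output, while `buggy_streaming_style` produces the oldest-first order reported in the bug.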
#### Test Coverage

Created shared test helper `tests/helpers/feed_ordering.py`:
- `assert_feed_newest_first()` function works for all formats (RSS, ATOM, JSON)
- Extracts dates in a format-specific way
- Validates descending chronological order
- Provides clear error messages

Updated RSS tests to use shared helper:
```python
# test_feed.py
from tests.helpers.feed_ordering import assert_feed_newest_first

def test_generate_feed_newest_first(self, app):
    # ... generate feed ...
    assert_feed_newest_first(feed_xml, format_type='rss', expected_count=3)
```

### Phase 2.1: Feed Module Restructuring (2 hours)

Reorganized feed generation code for scalability and maintainability.

#### New Structure

```
starpunk/feeds/
├── __init__.py      # Module exports
├── rss.py           # RSS 2.0 generation (moved from feed.py)
├── atom.py          # ATOM 1.0 generation (new)
└── json_feed.py     # JSON Feed 1.1 generation (new)

starpunk/feed.py     # Backward compatibility shim
```

#### Module Organization

**`feeds/__init__.py`**:
```python
from .rss import generate_rss, generate_rss_streaming
from .atom import generate_atom, generate_atom_streaming
from .json_feed import generate_json_feed, generate_json_feed_streaming

__all__ = [
    "generate_rss", "generate_rss_streaming",
    "generate_atom", "generate_atom_streaming",
    "generate_json_feed", "generate_json_feed_streaming",
]
```

**`feed.py` Compatibility Shim**:
```python
# Maintains backward compatibility
from starpunk.feeds.rss import (
    generate_rss as generate_feed,
    generate_rss_streaming as generate_feed_streaming,
    # ... other functions
)
```

#### Business Metrics Integration

Added to all feed generators per Q&A answer I1:
```python
import time
from starpunk.monitoring.business import track_feed_generated

def generate_rss(...):
    start_time = time.time()
    # ... generate feed ...
    duration_ms = (time.time() - start_time) * 1000
    track_feed_generated(
        format='rss',
        item_count=len(notes),
        duration_ms=duration_ms,
        cached=False
    )
```

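One way to avoid repeating the timing boilerplate in every generator is a small context manager. The sketch below is only an illustration under stated assumptions: `track_feed_generated` is replaced by a local stub with the same keyword arguments as the call above, `feed_timer` and `generate_demo_feed` are hypothetical names, and `time.perf_counter()` is swapped in for `time.time()` because it is monotonic.

```python
import time
from contextlib import contextmanager

recorded = []  # stand-in sink; the real tracker lives in starpunk.monitoring.business


def track_feed_generated(format, item_count, duration_ms, cached):
    """Hypothetical stub that mirrors the keyword arguments of the call above."""
    recorded.append(
        {"format": format, "item_count": item_count,
         "duration_ms": duration_ms, "cached": cached}
    )


@contextmanager
def feed_timer(format, item_count, cached=False):
    """Time the wrapped block and report it once, even if generation raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        track_feed_generated(
            format=format, item_count=item_count,
            duration_ms=duration_ms, cached=cached,
        )


def generate_demo_feed(notes):
    """Toy generator standing in for generate_rss()."""
    with feed_timer("rss", len(notes)):
        return "".join(f"<item>{n}</item>" for n in notes)


xml = generate_demo_feed(["first", "second"])
```

Note that recording in `finally` means failed generations are also counted; whether that is desirable depends on how the metrics are consumed.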
#### Verification

- All 24 existing RSS tests pass
- No breaking changes to public API
- Imports work from both old (`starpunk.feed`) and new (`starpunk.feeds`) locations

### Phase 2.2: ATOM 1.0 Feed Implementation (2.5 hours)

Implemented ATOM 1.0 feed generation following the RFC 4287 specification.

#### Implementation Approach

Per Q&A answer I3, used manual string building with XML escaping (standard library only) rather than the `xml.etree.ElementTree` object model or the feedgen library.

**Rationale**:
- No new dependencies
- Simple and explicit
- Full control over output format
- Proper XML escaping via helper function

#### Key Features

**Required ATOM Elements**:
- `<feed>` with proper namespace (`http://www.w3.org/2005/Atom`)
- `<id>`, `<title>`, `<updated>` at feed level
- `<entry>` elements with `<id>`, `<title>`, `<updated>`, `<published>`

**Content Handling** (per Q&A answer IQ6):
- `type="html"` for rendered markdown (escaped)
- `type="text"` for plain text (escaped)
- **Skipped** `type="xhtml"` (unnecessary complexity)

**Date Format**:
- RFC 3339 (ISO 8601 profile)
- UTC timestamps with 'Z' suffix
- Example: `2024-11-26T12:00:00Z`

#### Code Structure

**feeds/atom.py**:
```python
def generate_atom(...) -> str:
    """Non-streaming for caching"""
    return ''.join(generate_atom_streaming(...))

def generate_atom_streaming(...):
    """Memory-efficient streaming"""
    yield '<?xml version="1.0" encoding="utf-8"?>\n'
    yield f'<feed xmlns="{ATOM_NS}">\n'
    # ... feed metadata ...
    for note in notes[:limit]:  # Newest first - no reversed()!
        yield '  <entry>\n'
        # ... entry content ...
        yield '  </entry>\n'
    yield '</feed>\n'
```

**XML Escaping**:
```python
def _escape_xml(text: str) -> str:
    """Escape &, <, >, ", ' in order"""
    if not text:
        return ""
    text = text.replace("&", "&amp;")  # First!
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    text = text.replace("'", "&apos;")
    return text
```

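The reason the ampersand must be replaced first can be shown with a short, self-contained comparison. The two helper functions below are illustrative names, not part of `feeds/atom.py`; the standard library's `xml.sax.saxutils.escape` is included as a reference point, since it also handles `&` before `<` and `>`.

```python
from xml.sax.saxutils import escape


def escape_ampersand_first(text: str) -> str:
    """Correct order: escape '&' before introducing new entities."""
    return text.replace("&", "&amp;").replace("<", "&lt;")


def escape_ampersand_last(text: str) -> str:
    """Wrong order: the '&' inside the new '&lt;' gets escaped a second time."""
    return text.replace("<", "&lt;").replace("&", "&amp;")


sample = "a < b & c"
correct = escape_ampersand_first(sample)
double_escaped = escape_ampersand_last(sample)

# The stdlib helper escapes &, < and > by default; quotes can be added
# through the optional entities mapping.
stdlib = escape(sample)
stdlib_with_quotes = escape('say "hi"', {'"': "&quot;"})
```

With the wrong order, a feed reader would display the literal text `&lt;` instead of `<`, because the entity itself has been escaped.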
#### Test Coverage

Created `tests/test_feeds_atom.py` with 11 tests:

**Basic Functionality**:
- Valid ATOM XML generation
- Empty feed handling
- Entry limit respected
- Required/site URL validation

**Ordering & Structure**:
- Newest-first ordering (using shared helper)
- Proper ATOM namespace
- All required elements present
- HTML content escaping

**Edge Cases**:
- Special XML characters (`&`, `<`, `>`, `"`, `'`)
- Unicode content
- Empty description

All 11 tests passing.

### Phase 2.3: JSON Feed 1.1 Implementation (2.5 hours)

Implemented JSON Feed 1.1 following the official JSON Feed specification.

#### Implementation Approach

Used Python's standard library `json` module for serialization. Simple and straightforward - no external dependencies needed.

#### Key Features

**Required JSON Feed Fields**:
- `version`: "https://jsonfeed.org/version/1.1"
- `title`: Feed title
- `items`: Array of item objects

**Optional Fields Used**:
- `home_page_url`: Site URL
- `feed_url`: Self-reference URL
- `description`: Feed description
- `language`: "en"

**Item Structure**:
- `id`: Permalink (required)
- `url`: Permalink
- `title`: Note title
- `content_html` or `content_text`: Note content
- `date_published`: RFC 3339 timestamp

**Custom Extension** (per Q&A answer IQ7):
```json
"_starpunk": {
  "permalink_path": "/notes/slug",
  "word_count": 42
}
```

Minimal extension - only `permalink_path` and `word_count`. Can expand later based on user feedback.

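As a rough illustration of how the item fields and the `_starpunk` extension fit together, the sketch below builds one item from a plain dictionary. Everything about the input is an assumption made for the example: the note keys (`slug`, `title`, `html`, `text`, `published`), the `/notes/<slug>` permalink pattern, and the whitespace-based word count are hypothetical and may differ from the project's `_build_item_object`.

```python
import json


def build_item(site_url: str, note: dict) -> dict:
    """Build one JSON Feed item from a hypothetical note dictionary."""
    permalink_path = f"/notes/{note['slug']}"
    permalink = site_url.rstrip("/") + permalink_path
    return {
        "id": permalink,            # required by JSON Feed
        "url": permalink,
        "title": note["title"],
        "content_html": note["html"],
        "date_published": note["published"],  # assumed to be RFC 3339 already
        "_starpunk": {
            "permalink_path": permalink_path,
            # Assumption: words are counted by splitting on whitespace.
            "word_count": len(note["text"].split()),
        },
    }


note = {
    "slug": "hello",
    "title": "Hello",
    "html": "<p>Hello, world</p>",
    "text": "Hello, world",
    "published": "2025-11-26T12:00:00Z",
}
item = build_item("https://example.com/", note)
feed = {
    "version": "https://jsonfeed.org/version/1.1",
    "title": "Example",
    "items": [item],
}
serialized = json.dumps(feed, ensure_ascii=False, indent=2)
```

Keys that begin with an underscore are how JSON Feed reserves space for publisher-specific extensions, which is why the custom data sits under `_starpunk` rather than at the top level of the item.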
#### Code Structure

**feeds/json_feed.py**:
```python
import json

def generate_json_feed(...) -> str:
    """Non-streaming for caching"""
    feed = _build_feed_object(...)
    return json.dumps(feed, ensure_ascii=False, indent=2)

def generate_json_feed_streaming(...):
    """Memory-efficient streaming"""
    yield '{\n'
    yield '  "version": "https://jsonfeed.org/version/1.1",\n'
    yield f'  "title": {json.dumps(site_name)},\n'
    # ... metadata ...
    yield '  "items": [\n'
    items = notes[:limit]
    for i, note in enumerate(items):  # Newest first!
        item = _build_item_object(site_url, note)
        item_json = json.dumps(item, ensure_ascii=False, indent=4)
        # Proper indentation (re-indenting item_json is elided here)
        yield indented_item_json
        # Compare against the truncated list so the last emitted item
        # never gets a trailing comma, even when len(notes) > limit
        yield ',\n' if i < len(items) - 1 else '\n'
    yield '  ]\n'
    yield '}\n'
```

**Date Formatting**:
```python
from datetime import datetime, timezone

def _format_rfc3339_date(dt: datetime) -> str:
    """RFC 3339 format: 2024-11-26T12:00:00Z"""
    if dt.tzinfo is None:
        # Naive datetimes are treated as UTC
        dt = dt.replace(tzinfo=timezone.utc)
    if dt.tzinfo == timezone.utc:
        return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
    else:
        return dt.isoformat()
```

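The three branches of the date formatter (naive, UTC, and non-UTC offset) can be exercised with a standalone copy of the logic. The function name `format_rfc3339` is illustrative, and the sketch makes one deliberate variation on the code above: it compares `utcoffset()` with zero instead of comparing `tzinfo` with `timezone.utc`, so that any zero-offset zone is written with the `Z` suffix.

```python
from datetime import datetime, timedelta, timezone


def format_rfc3339(dt: datetime) -> str:
    """Format a datetime as RFC 3339, treating naive values as UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    if dt.utcoffset() == timedelta(0):
        return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
    return dt.isoformat()


naive = format_rfc3339(datetime(2024, 11, 26, 12, 0, 0))
utc = format_rfc3339(datetime(2024, 11, 26, 12, 0, 0, tzinfo=timezone.utc))
plus_one = format_rfc3339(
    datetime(2024, 11, 26, 13, 0, 0, tzinfo=timezone(timedelta(hours=1)))
)
```

The naive and UTC inputs both yield `2024-11-26T12:00:00Z`, while the `+01:00` input keeps its explicit offset. Treating naive datetimes as UTC is only safe if the application stores timestamps in UTC, which this sketch assumes.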
#### Test Coverage

Created `tests/test_feeds_json.py` with 13 tests:

**Basic Functionality**:
- Valid JSON generation
- Empty feed handling
- Entry limit respected
- Required field validation

**Ordering & Structure**:
- Newest-first ordering (using shared helper)
- JSON Feed 1.1 compliance
- All required fields present
- HTML content handling

**Format-Specific**:
- StarPunk custom extension (`_starpunk`)
- RFC 3339 date format validation
- UTF-8 encoding
- Pretty-printed output

All 13 tests passing.

## Testing Summary

### Test Results

```
48 total feed tests - ALL PASSING
- RSS: 24 tests (existing + ordering fix)
- ATOM: 11 tests (new)
- JSON Feed: 13 tests (new)
```

### Test Organization

```
tests/
├── helpers/
│   ├── __init__.py
│   └── feed_ordering.py      # Shared ordering validation
├── test_feed.py              # RSS tests (original)
├── test_feeds_atom.py        # ATOM tests (new)
└── test_feeds_json.py        # JSON Feed tests (new)
```

### Shared Test Helper

The `feed_ordering.py` helper provides cross-format ordering validation:

```python
def assert_feed_newest_first(feed_content, format_type, expected_count=None):
    """Verify feed items are newest-first regardless of format"""
    if format_type == 'rss':
        dates = _extract_rss_dates(feed_content)  # Parse XML, get pubDate
    elif format_type == 'atom':
        dates = _extract_atom_dates(feed_content)  # Parse XML, get published
    elif format_type == 'json':
        dates = _extract_json_feed_dates(feed_content)  # Parse JSON, get date_published

    # Verify descending order
    for i in range(len(dates) - 1):
        assert dates[i] >= dates[i + 1], "Not in newest-first order!"
```

This helper is now used by all feed format tests, ensuring consistent ordering validation.

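The helper excerpt relies on three date extractors that are not shown in this report. A possible shape for them, using only the standard library, is sketched below. The function names and parsing choices are assumptions for illustration and are not taken from `tests/helpers/feed_ordering.py`.

```python
import json
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def _parse_rfc3339(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.strip().replace("Z", "+00:00"))


def extract_rss_dates(feed_xml: str) -> list:
    """Collect item-level pubDate values (RFC 822 style) in document order."""
    root = ET.fromstring(feed_xml)
    return [parsedate_to_datetime(el.text)
            for el in root.findall("./channel/item/pubDate")]


def extract_atom_dates(feed_xml: str) -> list:
    """Collect entry-level published values (RFC 3339) in document order."""
    root = ET.fromstring(feed_xml)
    return [_parse_rfc3339(el.text)
            for el in root.findall("atom:entry/atom:published", ATOM_NS)]


def extract_json_feed_dates(feed_json: str) -> list:
    """Collect date_published values from the items array."""
    return [_parse_rfc3339(item["date_published"])
            for item in json.loads(feed_json)["items"]]


def is_newest_first(dates: list) -> bool:
    """True when every date is at least as recent as the one after it."""
    return all(a >= b for a, b in zip(dates, dates[1:]))
```

Parsing the dates into `datetime` objects, rather than comparing strings, keeps the check valid across formats whose textual date representations do not sort chronologically (RFC 822 in particular).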
## Code Quality

### Adherence to Standards

- **RSS 2.0**: Full specification compliance, RFC 822 dates
- **ATOM 1.0**: RFC 4287 compliance, RFC 3339 dates
- **JSON Feed 1.1**: Official spec compliance, RFC 3339 dates

### Python Standards

- Type hints on all function signatures
- Comprehensive docstrings with examples
- Standard library usage (no unnecessary dependencies)
- Invalid input reported with `ValueError`

### StarPunk Principles

✅ **Simplicity**: Minimal code, standard library usage
✅ **Standards Compliance**: Following specs exactly
✅ **Testing**: Comprehensive test coverage
✅ **Documentation**: Clear docstrings and comments

## Performance Considerations

### Streaming vs Non-Streaming

All formats implement both methods per Q&A answer CQ6:

**Non-Streaming** (`generate_*`):
- Returns complete string
- Required for caching
- ATOM builds it by joining the streaming output; JSON Feed serializes the feed object with `json.dumps`

**Streaming** (`generate_*_streaming`):
- Yields chunks
- Memory-efficient for large feeds
- Recommended for 100+ entries

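The relationship between the two method styles can be shown with a toy generator pair. The names `generate_demo_streaming` and `generate_demo` are hypothetical, and the titles are assumed to be XML-escaped already; the point is only that the non-streaming variant can be derived from the streaming one by joining its chunks.

```python
from typing import Iterable, Iterator


def generate_demo_streaming(titles: Iterable[str]) -> Iterator[str]:
    """Yield the document in chunks; nothing is built until iteration starts."""
    yield "<feed>\n"
    for title in titles:  # titles are assumed to be escaped already
        yield f"  <entry>{title}</entry>\n"
    yield "</feed>\n"


def generate_demo(titles: Iterable[str]) -> str:
    """Non-streaming variant: join the chunks so the result can be cached."""
    return "".join(generate_demo_streaming(titles))


chunks = list(generate_demo_streaming(["a", "b"]))
document = generate_demo(["a", "b"])
```

A web framework that accepts an iterable as a response body can send each chunk as it is produced, so peak memory depends on the largest chunk rather than the whole feed. The joined string, by contrast, is what a cache would store.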
### Business Metrics Overhead

Minimal impact from metrics tracking:
- One `time.time()` call at the start and one at the end
- One function call to `track_feed_generated()`
- No sampling - always records feed generation
- Estimated overhead: <1ms per feed generation

## Files Created/Modified

### New Files

```
starpunk/feeds/__init__.py          # Module exports
starpunk/feeds/rss.py               # RSS moved from feed.py
starpunk/feeds/atom.py              # ATOM 1.0 implementation
starpunk/feeds/json_feed.py         # JSON Feed 1.1 implementation

tests/helpers/__init__.py           # Test helpers module
tests/helpers/feed_ordering.py      # Shared ordering validation
tests/test_feeds_atom.py            # ATOM tests
tests/test_feeds_json.py            # JSON Feed tests
```

### Modified Files

```
starpunk/feed.py                    # Now a compatibility shim
tests/test_feed.py                  # Added shared helper usage
CHANGELOG.md                        # Phase 2 entries
```

### File Sizes

```
starpunk/feeds/rss.py:              ~400 lines (moved)
starpunk/feeds/atom.py:             ~310 lines (new)
starpunk/feeds/json_feed.py:        ~300 lines (new)
tests/test_feeds_atom.py:           ~260 lines (new)
tests/test_feeds_json.py:           ~290 lines (new)
tests/helpers/feed_ordering.py:     ~150 lines (new)
```

## Remaining Work (Phase 2.4)

### Content Negotiation

Per Q&A answer CQ3, implement dual endpoint strategy:

**Endpoints Needed**:
- `/feed` - Content negotiation via Accept header
- `/feed.xml` or `/feed.rss` - Explicit RSS (backward compat)
- `/feed.atom` - Explicit ATOM
- `/feed.json` - Explicit JSON Feed

**Content Negotiation Logic**:
- Parse Accept header
- Score candidate formats by quality factor (`q` value)
- Default to RSS if multiple formats match equally
- Return 406 Not Acceptable if no match

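Since Phase 2.4 is not implemented yet, the following is only a sketch of how the negotiation logic listed above might look, not the planned `ContentNegotiator` class. The function names, the choice of media types (`application/rss+xml`, `application/atom+xml`, `application/feed+json`), and the decision to treat a missing header as "RSS" are all assumptions. A return value of `None` stands for the case where the caller would answer 406.

```python
# Insertion order matters: RSS comes first so that ties resolve to RSS.
SUPPORTED = {
    "rss": "application/rss+xml",
    "atom": "application/atom+xml",
    "json": "application/feed+json",
}


def parse_accept(header: str) -> list:
    """Return (media_range, q) pairs from an Accept header value."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # malformed q-values are treated as "not wanted"
        ranges.append((media_range, q))
    return ranges


def _quality(mime: str, ranges: list) -> float:
    """Use the q-value of the most specific range that matches mime."""
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == mime:
            specificity = 2
        elif media_range == mime.split("/")[0] + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(header: str):
    """Return 'rss', 'atom', 'json', or None when nothing is acceptable."""
    if not header or not header.strip():
        return "rss"  # assumption: no Accept header means any format is fine
    ranges = parse_accept(header)
    scores = {name: _quality(mime, ranges) for name, mime in SUPPORTED.items()}
    best = max(scores.values())
    if best <= 0.0:
        return None
    return next(name for name in SUPPORTED if scores[name] == best)
```

For example, `negotiate("application/feed+json;q=0.9, application/atom+xml;q=0.5")` selects `"json"`, `negotiate("*/*")` falls back to `"rss"`, and `negotiate("text/html")` returns `None`.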
**Implementation**:
- Create `feeds/negotiation.py` module
- Implement `ContentNegotiator` class
- Add routes to `routes/public.py`
- Update route tests

**Estimated Time**: 0.5-1 hour

## Questions for Architect

None at this time. All questions were answered in the Q&A document, and the implementation followed those answers.

## Recommendations

### Immediate Next Steps

1. **Complete Phase 2.4**: Implement content negotiation
2. **Integration Testing**: Test all three formats in a production-like environment
3. **Feed Reader Testing**: Validate with actual feed reader clients

### Future Enhancements (Post v1.1.2)

1. **Feed Caching** (Phase 3): Implement checksum-based caching per design
2. **Feed Discovery**: Add `<link>` tags to HTML for feed auto-discovery (per Q&A N1)
3. **OPML Export**: Allow users to export all feed formats
4. **Enhanced JSON Feed**: Add author objects, attachments when supported by Note model

## Conclusion

Phase 2 (Phases 2.0-2.3) successfully implemented:

✅ Critical RSS ordering fix
✅ Clean feed module architecture
✅ ATOM 1.0 feed support
✅ JSON Feed 1.1 support
✅ Business metrics integration
✅ Comprehensive test coverage (48 tests, all passing)

The codebase is now ready for Phase 2.4 (content negotiation) to complete the feed formats feature. All feed generators follow standards, maintain newest-first ordering, and include proper metrics tracking.

**Status**: Ready for architect review and Phase 2.4 implementation.

---

**Implementation Date**: 2025-11-26
**Developer**: StarPunk Fullstack Developer (AI)
**Total Time**: ~7 hours (of estimated 7-8 hours for Phases 2.0-2.3)
**Tests**: 48 passing
**Next**: Phase 2.4 - Content Negotiation (0.5-1 hour)