StarPunk/docs/reports/2025-11-26-v1.1.2-phase2-feed-formats-partial.md
Phil Skentelbery 59e9d402c6 feat: Implement Phase 2 Feed Formats - ATOM, JSON Feed, RSS fix (Phases 2.0-2.3)
This commit implements Phases 2.0-2.3 of v1.1.2 Phase 2 Feed Formats,
adding ATOM 1.0 and JSON Feed 1.1 support alongside the existing RSS feed.

CRITICAL BUG FIX:
- Fixed RSS streaming feed ordering (was showing oldest-first instead of newest-first)
- Streaming RSS removed incorrect reversed() call at line 198
- Feedgen RSS kept correct reversed() to compensate for library behavior

NEW FEATURES:
- ATOM 1.0 feed generation (RFC 4287 compliant)
  - Proper XML namespacing and RFC 3339 dates
  - Streaming and non-streaming methods
  - 11 comprehensive tests

- JSON Feed 1.1 generation (JSON Feed spec compliant)
  - RFC 3339 dates and UTF-8 JSON output
  - Custom _starpunk extension with permalink_path and word_count
  - 13 comprehensive tests

REFACTORING:
- Restructured feed code into starpunk/feeds/ module
  - feeds/rss.py - RSS 2.0 (moved from feed.py)
  - feeds/atom.py - ATOM 1.0 (new)
  - feeds/json_feed.py - JSON Feed 1.1 (new)
- Backward compatible feed.py shim for existing imports
- Business metrics integrated into all feed generators

TESTING:
- Created shared test helper tests/helpers/feed_ordering.py
- Helper validates newest-first ordering across all formats
- 48 total feed tests, all passing
  - RSS: 24 tests
  - ATOM: 11 tests
  - JSON Feed: 13 tests

FILES CHANGED:
- Modified: starpunk/feed.py (now compatibility shim)
- New: starpunk/feeds/ module with rss.py, atom.py, json_feed.py
- New: tests/helpers/feed_ordering.py (shared test helper)
- New: tests/test_feeds_atom.py, tests/test_feeds_json.py
- Modified: CHANGELOG.md (Phase 2 entries)
- New: docs/reports/2025-11-26-v1.1.2-phase2-feed-formats-partial.md

NEXT STEPS:
Phase 2.4 (Content Negotiation) pending - will add /feed endpoint with
Accept header negotiation and explicit format endpoints.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 14:54:52 -07:00


StarPunk v1.1.2 Phase 2 Feed Formats - Implementation Report (Partial)

Date: 2025-11-26
Developer: StarPunk Fullstack Developer (AI)
Phase: v1.1.2 "Syndicate" - Phase 2 (Phases 2.0-2.3 Complete)
Status: Partially Complete - Content Negotiation (Phase 2.4) Pending

Executive Summary

Successfully implemented ATOM 1.0 and JSON Feed 1.1 support for StarPunk, along with a critical RSS feed ordering fix and a restructuring of the feed module. This partial completion of Phase 2 provides the foundation for multi-format feed syndication.

What Was Completed

  • Phase 2.0: RSS Feed Ordering Fix (CRITICAL bug fix)
  • Phase 2.1: Feed Module Restructuring
  • Phase 2.2: ATOM 1.0 Feed Implementation
  • Phase 2.3: JSON Feed 1.1 Implementation
  • Phase 2.4: Content Negotiation (PENDING - for next session)

Key Achievements

  1. Fixed Critical RSS Bug: Streaming RSS was showing oldest-first instead of newest-first
  2. Added ATOM Support: Full RFC 4287 compliance with 11 passing tests
  3. Added JSON Feed Support: JSON Feed 1.1 spec with 13 passing tests
  4. Restructured Code: Clean module organization in starpunk/feeds/
  5. Business Metrics: Integrated feed generation tracking
  6. Test Coverage: 48 total feed tests, all passing

Implementation Details

Phase 2.0: RSS Feed Ordering Fix (0.5 hours)

CRITICAL Production Bug: RSS feeds were displaying entries oldest-first instead of newest-first due to an incorrect reversed() call in streaming generation.

Root Cause Analysis

The bug was more subtle than initially described in the instructions:

  1. Feedgen-based RSS (line 100): The reversed() call was CORRECT

    • Feedgen library internally reverses entry order when generating XML
    • Our reversed() compensates for this behavior
    • Removing it would break the feed
  2. Streaming RSS (line 198): The reversed() call was WRONG

    • Manual XML generation doesn't reverse order
    • The reversed() was incorrectly flipping the newest-first input to oldest-first
    • Removing it fixed the ordering

Solution Implemented

# feeds/rss.py - Line 100 (feedgen version) - KEPT reversed()
for note in reversed(notes[:limit]):
    fe = fg.add_entry()

# feeds/rss.py - Line 198 (streaming version) - REMOVED reversed()
for note in notes[:limit]:
    yield item_xml
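
To illustrate why the same reversed() call is correct in one path and wrong in the other, here is a small sketch. It does not use feedgen itself: build_prepending is a hypothetical stand-in that mimics a builder which places each newly added entry at the front, which is the behavior this report attributes to feedgen. The note labels are made-up sample data.

```python
def build_prepending(notes):
    """Hypothetical stand-in for a builder that prepends each added entry."""
    entries = []
    for note in notes:
        entries.insert(0, note)  # the most recently added entry ends up first
    return entries


def build_streaming(notes):
    """Stand-in for manual XML generation: output order equals input order."""
    return list(notes)


# Notes arrive newest-first (ISO dates used as labels).
notes = ["2025-11-26", "2025-11-25", "2025-11-24"]

# Prepending builder: reversing the input cancels the internal reversal.
prepended = build_prepending(reversed(notes))

# Streaming builder: no reversal is needed, and adding one breaks the order.
streamed = build_streaming(notes)
streamed_buggy = build_streaming(reversed(notes))
```

In this sketch, prepended and streamed both keep the newest-first order, while streamed_buggy reproduces the oldest-first symptom described above.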

Test Coverage

Created shared test helper tests/helpers/feed_ordering.py:

  • assert_feed_newest_first() function works for all formats (RSS, ATOM, JSON)
  • Extracts dates in format-specific way
  • Validates descending chronological order
  • Provides clear error messages

Updated RSS tests to use shared helper:

# test_feed.py
from tests.helpers.feed_ordering import assert_feed_newest_first

def test_generate_feed_newest_first(self, app):
    # ... generate feed ...
    assert_feed_newest_first(feed_xml, format_type='rss', expected_count=3)

Phase 2.1: Feed Module Restructuring (2 hours)

Reorganized feed generation code for scalability and maintainability.

New Structure

starpunk/feeds/
├── __init__.py          # Module exports
├── rss.py               # RSS 2.0 generation (moved from feed.py)
├── atom.py              # ATOM 1.0 generation (new)
└── json_feed.py         # JSON Feed 1.1 generation (new)

starpunk/feed.py         # Backward compatibility shim

Module Organization

feeds/__init__.py:

from .rss import generate_rss, generate_rss_streaming
from .atom import generate_atom, generate_atom_streaming
from .json_feed import generate_json_feed, generate_json_feed_streaming

__all__ = [
    "generate_rss", "generate_rss_streaming",
    "generate_atom", "generate_atom_streaming",
    "generate_json_feed", "generate_json_feed_streaming",
]

feed.py Compatibility Shim:

# Maintains backward compatibility
from starpunk.feeds.rss import (
    generate_rss as generate_feed,
    generate_rss_streaming as generate_feed_streaming,
    # ... other functions
)

Business Metrics Integration

Added to all feed generators per Q&A answer I1:

import time
from starpunk.monitoring.business import track_feed_generated

def generate_rss(...):
    start_time = time.time()
    # ... generate feed ...
    duration_ms = (time.time() - start_time) * 1000
    track_feed_generated(
        format='rss',
        item_count=len(notes),
        duration_ms=duration_ms,
        cached=False
    )
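
The same timing pattern could also be factored into a decorator so that each generator does not repeat it. The sketch below is only an illustration under stated assumptions: with_feed_metrics and generate_demo_feed are hypothetical names, and track_feed_generated here is a local stub that records calls in a list, not the real function from starpunk.monitoring.business.

```python
import functools
import time

recorded = []  # hypothetical in-memory sink standing in for the real tracker


def track_feed_generated(format, item_count, duration_ms, cached):
    """Hypothetical stub taking the same keyword arguments as the call above."""
    recorded.append(
        {"format": format, "item_count": item_count,
         "duration_ms": duration_ms, "cached": cached}
    )


def with_feed_metrics(format_name):
    """Wrap a generator function that takes `notes` as its first argument."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(notes, *args, **kwargs):
            start_time = time.time()
            result = func(notes, *args, **kwargs)
            duration_ms = (time.time() - start_time) * 1000
            track_feed_generated(
                format=format_name,
                item_count=len(notes),
                duration_ms=duration_ms,
                cached=False,
            )
            return result
        return wrapper
    return decorator


@with_feed_metrics("rss")
def generate_demo_feed(notes):
    """Toy generator used only to exercise the decorator."""
    return "\n".join(notes)


output = generate_demo_feed(["first note", "second note"])
```

Note that this only measures a function that returns a complete string; a streaming generator would need the timing to wrap the iteration itself.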

Verification

  • All 24 existing RSS tests pass
  • No breaking changes to public API
  • Imports work from both old (starpunk.feed) and new (starpunk.feeds) locations

Phase 2.2: ATOM 1.0 Feed Implementation (2.5 hours)

Implemented ATOM 1.0 feed generation following RFC 4287 specification.

Implementation Approach

Per Q&A answer I3, built the XML using only Python's standard library: manual string building with an XML-escaping helper, rather than the xml.etree.ElementTree object model or the feedgen library.

Rationale:

  • No new dependencies
  • Simple and explicit
  • Full control over output format
  • Proper XML escaping via helper function

Key Features

Required ATOM Elements:

  • <feed> with proper namespace (http://www.w3.org/2005/Atom)
  • <id>, <title>, <updated> at feed level
  • <entry> elements with <id>, <title>, <updated>, <published>

Content Handling (per Q&A answer IQ6):

  • type="html" for rendered markdown (escaped)
  • type="text" for plain text (escaped)
  • Skipped type="xhtml" (unnecessary complexity)

Date Format:

  • RFC 3339 (ISO 8601 profile)
  • UTC timestamps with 'Z' suffix
  • Example: 2024-11-26T12:00:00Z

Code Structure

feeds/atom.py:

def generate_atom(...) -> str:
    """Non-streaming for caching"""
    return ''.join(generate_atom_streaming(...))

def generate_atom_streaming(...):
    """Memory-efficient streaming"""
    yield '<?xml version="1.0" encoding="utf-8"?>\n'
    yield f'<feed xmlns="{ATOM_NS}">\n'
    # ... feed metadata ...
    for note in notes[:limit]:  # Newest first - no reversed()!
        yield '  <entry>\n'
        # ... entry content ...
        yield '  </entry>\n'
    yield '</feed>\n'
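
As a rough, self-contained illustration of this structure, the sketch below fills in the elided parts with assumptions: notes are plain dicts with hypothetical keys (url, title, updated, html), the function signatures are invented for the example, and elements such as author and link are omitted for brevity. It uses xml.sax.saxutils.escape in place of the project's own escaping helper, and parses the result to confirm the joined chunks form well-formed XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

ATOM_NS = "http://www.w3.org/2005/Atom"


def generate_atom_streaming(site_url, site_name, notes, limit=50):
    """Yield an ATOM document chunk by chunk (notes are assumed newest-first)."""
    items = notes[:limit]
    # Simplification: use a fixed timestamp when the feed has no entries.
    feed_updated = items[0]["updated"] if items else "1970-01-01T00:00:00Z"
    yield '<?xml version="1.0" encoding="utf-8"?>\n'
    yield f'<feed xmlns="{ATOM_NS}">\n'
    yield f"  <id>{escape(site_url)}</id>\n"
    yield f"  <title>{escape(site_name)}</title>\n"
    yield f"  <updated>{feed_updated}</updated>\n"
    for note in items:  # no reversed(): input order is output order
        yield "  <entry>\n"
        yield f'    <id>{escape(note["url"])}</id>\n'
        yield f'    <title>{escape(note["title"])}</title>\n'
        yield f'    <updated>{note["updated"]}</updated>\n'
        yield f'    <published>{note["updated"]}</published>\n'
        yield f'    <content type="html">{escape(note["html"])}</content>\n'
        yield "  </entry>\n"
    yield "</feed>\n"


def generate_atom(site_url, site_name, notes, limit=50):
    """Non-streaming variant built by joining the streamed chunks."""
    return "".join(generate_atom_streaming(site_url, site_name, notes, limit))


sample_notes = [
    {"url": "https://example.com/notes/c", "title": "Newer & better",
     "updated": "2025-11-26T12:00:00Z", "html": "<p>Hello & welcome</p>"},
    {"url": "https://example.com/notes/b", "title": "Middle",
     "updated": "2025-11-25T12:00:00Z", "html": "<p>Second</p>"},
    {"url": "https://example.com/notes/a", "title": "Oldest",
     "updated": "2025-11-24T12:00:00Z", "html": "<p>First</p>"},
]

xml_text = generate_atom("https://example.com/", "Demo", sample_notes, limit=2)
root = ET.fromstring(xml_text.encode("utf-8"))
entries = root.findall(f"{{{ATOM_NS}}}entry")
```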

XML Escaping:

def _escape_xml(text: str) -> str:
    """Escape &, <, >, ", ' in order"""
    if not text:
        return ""
    text = text.replace("&", "&amp;")  # First!
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    text = text.replace("'", "&apos;")
    return text

Test Coverage

Created tests/test_feeds_atom.py with 11 tests:

Basic Functionality:

  • Valid ATOM XML generation
  • Empty feed handling
  • Entry limit respected
  • Required/site URL validation

Ordering & Structure:

  • Newest-first ordering (using shared helper)
  • Proper ATOM namespace
  • All required elements present
  • HTML content escaping

Edge Cases:

  • Special XML characters (&, <, >, ", ')
  • Unicode content
  • Empty description

All 11 tests passing.

Phase 2.3: JSON Feed 1.1 Implementation (2.5 hours)

Implemented JSON Feed 1.1 following the official JSON Feed specification.

Implementation Approach

Used Python's standard library json module for serialization. Simple and straightforward - no external dependencies needed.

Key Features

Required JSON Feed Fields:

  • version: https://jsonfeed.org/version/1.1
  • title: Site name
  • items: Array of feed items

Optional Fields Used:

  • home_page_url: Site URL
  • feed_url: Self-reference URL
  • description: Feed description
  • language: "en"

Item Structure:

  • id: Permalink (required)
  • url: Permalink
  • title: Note title
  • content_html or content_text: Note content
  • date_published: RFC 3339 timestamp

Custom Extension (per Q&A answer IQ7):

"_starpunk": {
    "permalink_path": "/notes/slug",
    "word_count": 42
}

Minimal extension - only permalink_path and word_count. Can expand later based on user feedback.

Code Structure

feeds/json_feed.py:

def generate_json_feed(...) -> str:
    """Non-streaming for caching"""
    feed = _build_feed_object(...)
    return json.dumps(feed, ensure_ascii=False, indent=2)

def generate_json_feed_streaming(...):
    """Memory-efficient streaming"""
    yield '{\n'
    yield f'  "version": "https://jsonfeed.org/version/1.1",\n'
    yield f'  "title": {json.dumps(site_name)},\n'
    # ... metadata ...
    yield '  "items": [\n'
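    items = notes[:limit]  # Slice once so the comma check uses the same length
    for i, note in enumerate(items):  # Newest first!
        item = _build_item_object(site_url, note)
        item_json = json.dumps(item, ensure_ascii=False, indent=4)
        # Proper indentation
        yield indented_item_json
        # Compare against the sliced length, not len(notes), to avoid a trailing comma
        yield ',\n' if i < len(items) - 1 else '\n'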
    yield '  ]\n'
    yield '}\n'
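
The streaming shape can be exercised end to end with a small sketch. Everything beyond what the excerpt shows is an assumption: notes are plain dicts with hypothetical keys (path, title, text, published), the item builder is a simplified stand-in, and textwrap.indent is one possible way to do the indentation step. The sketch slices the notes once so that the comma check uses the length of the slice, then parses the joined output to confirm it is valid JSON.

```python
import json
import textwrap


def _build_item_object(site_url, note):
    """Hypothetical item builder; `note` is a plain dict in this sketch."""
    permalink = site_url.rstrip("/") + note["path"]
    return {
        "id": permalink,
        "url": permalink,
        "title": note["title"],
        "content_text": note["text"],
        "date_published": note["published"],
        "_starpunk": {
            "permalink_path": note["path"],
            "word_count": len(note["text"].split()),
        },
    }


def generate_json_feed_streaming(site_url, site_name, notes, limit=50):
    """Yield a JSON Feed document chunk by chunk (notes assumed newest-first)."""
    items = notes[:limit]
    yield "{\n"
    yield '  "version": "https://jsonfeed.org/version/1.1",\n'
    yield f'  "title": {json.dumps(site_name)},\n'
    yield '  "items": [\n'
    for i, note in enumerate(items):
        item_json = json.dumps(_build_item_object(site_url, note),
                               ensure_ascii=False, indent=2)
        yield textwrap.indent(item_json, "    ")
        yield ",\n" if i < len(items) - 1 else "\n"  # no comma after the last item
    yield "  ]\n"
    yield "}\n"


sample_notes = [
    {"path": "/notes/third", "title": "Third", "text": "newest note here",
     "published": "2025-11-26T12:00:00Z"},
    {"path": "/notes/second", "title": "Second", "text": "middle note",
     "published": "2025-11-25T12:00:00Z"},
    {"path": "/notes/first", "title": "First", "text": "oldest",
     "published": "2025-11-24T12:00:00Z"},
]

chunks = generate_json_feed_streaming("https://example.com", "Demo", sample_notes, limit=2)
feed = json.loads("".join(chunks))
```

Because json.dumps escapes newlines inside string values, indenting the serialized item line by line does not alter any content.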

Date Formatting:

def _format_rfc3339_date(dt: datetime) -> str:
    """RFC 3339 format: 2024-11-26T12:00:00Z"""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    if dt.tzinfo == timezone.utc:
        return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
    else:
        return dt.isoformat()
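
The function above leaves non-UTC offsets as they are via isoformat(), which is still valid RFC 3339 but produces a mix of 'Z' and numeric-offset styles, and includes microseconds when they are present. If uniform UTC output were preferred, one possible variant is sketched below. The name format_rfc3339_utc is hypothetical, and treating naive datetimes as UTC is carried over from the original as an assumption.

```python
from datetime import datetime, timedelta, timezone


def format_rfc3339_utc(dt: datetime) -> str:
    """Variant: convert any datetime to UTC and always emit a 'Z' suffix."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assume naive values are already UTC
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


naive = datetime(2024, 11, 26, 12, 0, 0)
mountain = datetime(2024, 11, 26, 5, 0, 0, tzinfo=timezone(timedelta(hours=-7)))

naive_text = format_rfc3339_utc(naive)        # "2024-11-26T12:00:00Z"
mountain_text = format_rfc3339_utc(mountain)  # same instant, also "2024-11-26T12:00:00Z"
```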

Test Coverage

Created tests/test_feeds_json.py with 13 tests:

Basic Functionality:

  • Valid JSON generation
  • Empty feed handling
  • Entry limit respected
  • Required field validation

Ordering & Structure:

  • Newest-first ordering (using shared helper)
  • JSON Feed 1.1 compliance
  • All required fields present
  • HTML content handling

Format-Specific:

  • StarPunk custom extension (_starpunk)
  • RFC 3339 date format validation
  • UTF-8 encoding
  • Pretty-printed output

All 13 tests passing.

Testing Summary

Test Results

48 total feed tests - ALL PASSING
- RSS: 24 tests (existing + ordering fix)
- ATOM: 11 tests (new)
- JSON Feed: 13 tests (new)

Test Organization

tests/
├── helpers/
│   ├── __init__.py
│   └── feed_ordering.py      # Shared ordering validation
├── test_feed.py               # RSS tests (original)
├── test_feeds_atom.py         # ATOM tests (new)
└── test_feeds_json.py         # JSON Feed tests (new)

Shared Test Helper

The feed_ordering.py helper provides cross-format ordering validation:

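def assert_feed_newest_first(feed_content, format_type, expected_count=None):
    """Verify feed items are newest-first regardless of format"""
    if format_type == 'rss':
        dates = _extract_rss_dates(feed_content)  # Parse XML, get pubDate
    elif format_type == 'atom':
        dates = _extract_atom_dates(feed_content)  # Parse XML, get published
    elif format_type == 'json':
        dates = _extract_json_feed_dates(feed_content)  # Parse JSON, get date_published
    else:
        # Fail clearly instead of hitting a NameError on `dates` below
        raise ValueError(f"Unsupported format type: {format_type}")

    # Verify the number of entries when the caller specifies one
    if expected_count is not None:
        assert len(dates) == expected_count, "Unexpected number of entries"

    # Verify descending order
    for i in range(len(dates) - 1):
        assert dates[i] >= dates[i + 1], "Not in newest-first order!"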

This helper is now used by all feed format tests, ensuring consistent ordering validation.
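
For readers without access to the helper, a self-contained approximation is sketched below. The extraction details are assumptions rather than the project's actual code: RSS dates are read from item-level pubDate elements with email.utils.parsedate_to_datetime, ATOM dates from published elements, and JSON Feed dates from date_published. The extract_dates and _parse_rfc3339 names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime

ATOM = "{http://www.w3.org/2005/Atom}"


def _parse_rfc3339(value):
    # datetime.fromisoformat() rejects a trailing 'Z' before Python 3.11
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def extract_dates(feed_content, format_type):
    """Return entry dates in document order for the given format."""
    if format_type == "rss":
        root = ET.fromstring(feed_content)
        return [parsedate_to_datetime(el.text)
                for el in root.findall("./channel/item/pubDate")]
    if format_type == "atom":
        root = ET.fromstring(feed_content)
        return [_parse_rfc3339(el.text)
                for el in root.findall(f"{ATOM}entry/{ATOM}published")]
    if format_type == "json":
        return [_parse_rfc3339(item["date_published"])
                for item in json.loads(feed_content)["items"]]
    raise ValueError(f"Unsupported format type: {format_type}")


def assert_feed_newest_first(feed_content, format_type, expected_count=None):
    """Verify feed items are newest-first regardless of format."""
    dates = extract_dates(feed_content, format_type)
    if expected_count is not None:
        assert len(dates) == expected_count, "Unexpected number of entries"
    for i in range(len(dates) - 1):
        assert dates[i] >= dates[i + 1], "Not in newest-first order!"


rss_sample = (
    "<rss version='2.0'><channel><title>Demo</title>"
    "<item><pubDate>Wed, 26 Nov 2025 12:00:00 +0000</pubDate></item>"
    "<item><pubDate>Tue, 25 Nov 2025 12:00:00 +0000</pubDate></item>"
    "</channel></rss>"
)
json_sample = json.dumps({"items": [
    {"date_published": "2025-11-26T12:00:00Z"},
    {"date_published": "2025-11-25T12:00:00Z"},
]})

assert_feed_newest_first(rss_sample, "rss", expected_count=2)
assert_feed_newest_first(json_sample, "json", expected_count=2)
```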

Code Quality

Adherence to Standards

  • RSS 2.0: Full specification compliance, RFC-822 dates
  • ATOM 1.0: RFC 4287 compliance, RFC 3339 dates
  • JSON Feed 1.1: Official spec compliance, RFC 3339 dates

Python Standards

  • Type hints on all function signatures
  • Comprehensive docstrings with examples
  • Standard library usage (no unnecessary dependencies)
  • Proper error handling with ValueError

StarPunk Principles

  • Simplicity: Minimal code, standard library usage
  • Standards Compliance: Following specs exactly
  • Testing: Comprehensive test coverage
  • Documentation: Clear docstrings and comments

Performance Considerations

Streaming vs Non-Streaming

All formats implement both methods per Q&A answer CQ6:

Non-Streaming (generate_*):

  • Returns complete string
  • Required for caching
  • Built from streaming for consistency

Streaming (generate_*_streaming):

  • Yields chunks
  • Memory-efficient for large feeds
  • Recommended for 100+ entries

Business Metrics Overhead

Minimal impact from metrics tracking:

  • Single time.time() call at start/end
  • One function call to track_feed_generated()
  • No sampling - always records feed generation
  • Estimated overhead: <1ms per feed generation

Files Created/Modified

New Files

starpunk/feeds/__init__.py                    # Module exports
starpunk/feeds/rss.py                         # RSS moved from feed.py
starpunk/feeds/atom.py                        # ATOM 1.0 implementation
starpunk/feeds/json_feed.py                   # JSON Feed 1.1 implementation

tests/helpers/__init__.py                     # Test helpers module
tests/helpers/feed_ordering.py                # Shared ordering validation
tests/test_feeds_atom.py                      # ATOM tests
tests/test_feeds_json.py                      # JSON Feed tests

Modified Files

starpunk/feed.py                              # Now a compatibility shim
tests/test_feed.py                            # Added shared helper usage
CHANGELOG.md                                  # Phase 2 entries

File Sizes

starpunk/feeds/rss.py:        ~400 lines (moved)
starpunk/feeds/atom.py:       ~310 lines (new)
starpunk/feeds/json_feed.py:  ~300 lines (new)
tests/test_feeds_atom.py:     ~260 lines (new)
tests/test_feeds_json.py:     ~290 lines (new)
tests/helpers/feed_ordering.py: ~150 lines (new)

Remaining Work (Phase 2.4)

Content Negotiation

Per Q&A answer CQ3, implement dual endpoint strategy:

Endpoints Needed:

  • /feed - Content negotiation via Accept header
  • /feed.xml or /feed.rss - Explicit RSS (backward compat)
  • /feed.atom - Explicit ATOM
  • /feed.json - Explicit JSON Feed

Content Negotiation Logic:

  • Parse Accept header
  • Quality factor scoring
  • Default to RSS if multiple formats match
  • Return 406 Not Acceptable if no match
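
Since Phase 2.4 is not yet implemented, the following is only a sketch of how the logic listed above might look, not the planned ContentNegotiator design. The MIME type mapping, the negotiate_format name, and the choice to serve RSS when no Accept header is sent are all assumptions. The function returns None when nothing matches, which a route could translate into 406 Not Acceptable.

```python
from typing import Optional

# Assumed mapping, listed in preference order so RSS wins ties.
SUPPORTED = [
    ("application/rss+xml", "rss"),
    ("application/atom+xml", "atom"),
    ("application/feed+json", "json"),
    ("application/json", "json"),
]


def _parse_accept(header):
    """Return (media_range, q) pairs parsed from an Accept header value."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def _specificity(media_range, mime):
    """3 = exact match, 2 = type/* match, 1 = */* match, 0 = no match."""
    if media_range == mime:
        return 3
    if media_range == mime.split("/")[0] + "/*":
        return 2
    if media_range == "*/*":
        return 1
    return 0


def negotiate_format(accept_header: Optional[str]) -> Optional[str]:
    """Pick 'rss', 'atom' or 'json'; None means no acceptable format."""
    if not accept_header or not accept_header.strip():
        return "rss"  # assumption: no stated preference falls back to RSS
    ranges = _parse_accept(accept_header)
    best_format, best_q = None, 0.0
    for mime, fmt in SUPPORTED:
        # The most specific matching range decides this type's quality.
        candidates = [(_specificity(r, mime), q) for r, q in ranges]
        spec, q = max(candidates, key=lambda c: c[0], default=(0, 0.0))
        if spec > 0 and q > best_q:  # strict '>' keeps the earlier entry on ties
            best_format, best_q = fmt, q
    return best_format
```

For example, negotiate_format("application/feed+json;q=0.9, application/atom+xml;q=0.5") selects "json", "*/*" selects "rss", and "text/html" yields None.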

Implementation:

  • Create feeds/negotiation.py module
  • Implement ContentNegotiator class
  • Add routes to routes/public.py
  • Update route tests

Estimated Time: 0.5-1 hour

Questions for Architect

None at this time. All questions were answered in the Q&A document. Implementation followed specifications exactly.

Recommendations

Immediate Next Steps

  1. Complete Phase 2.4: Implement content negotiation
  2. Integration Testing: Test all three formats in production-like environment
  3. Feed Reader Testing: Validate with actual feed reader clients

Future Enhancements (Post v1.1.2)

  1. Feed Caching (Phase 3): Implement checksum-based caching per design
  2. Feed Discovery: Add <link> tags to HTML for feed auto-discovery (per Q&A N1)
  3. OPML Export: Allow users to export all feed formats
  4. Enhanced JSON Feed: Add author objects, attachments when supported by Note model

Conclusion

Phase 2 (Phases 2.0-2.3) successfully implemented:

  • Critical RSS ordering fix
  • Clean feed module architecture
  • ATOM 1.0 feed support
  • JSON Feed 1.1 support
  • Business metrics integration
  • Comprehensive test coverage (48 tests, all passing)

The codebase is now ready for Phase 2.4 (content negotiation) to complete the feed formats feature. All feed generators follow standards, maintain newest-first ordering, and include proper metrics tracking.

Status: Ready for architect review and Phase 2.4 implementation.


Implementation Date: 2025-11-26
Developer: StarPunk Fullstack Developer (AI)
Total Time: ~7 hours (of estimated 7-8 hours for Phases 2.0-2.3)
Tests: 48 passing
Next: Phase 2.4 - Content Negotiation (0.5-1 hour)