# StarPunk v1.1.2 "Syndicate" - Implementation Guide

## Overview

This guide provides a phased approach to implementing v1.1.2 "Syndicate" features. The release is structured in three phases totaling 14-16 hours of focused development.

## Pre-Implementation Checklist

- [x] Review v1.1.1 performance monitoring specification
- [x] Ensure development environment has Python 3.11+
- [x] Create feature branch: `feature/v1.1.2-syndicate`
- [ ] Review feed format specifications (RSS 2.0, ATOM 1.0, JSON Feed 1.1)
- [ ] Set up feed reader test clients

## Phase 1: Metrics Instrumentation (4-6 hours) ✅ COMPLETE

### Objective

Complete the metrics instrumentation that was partially implemented in v1.1.1, adding comprehensive coverage across all system operations.

### 1.1 Database Operation Timing (1.5 hours) ✅

**Location**: `starpunk/monitoring/database.py`

**Implementation Steps**:

1. **Create Database Monitor Wrapper**
   ```python
   import time

   class MonitoredConnection:
       """Wrapper for SQLite connections with timing"""

       def __init__(self, connection, record_metric):
           self.connection = connection        # wrapped sqlite3.Connection
           self.record_metric = record_metric  # assumed callable: (name, value)

       def execute(self, query, params=None):
           start = time.perf_counter()                            # Start timer
           cursor = self.connection.execute(query, params or ())  # Execute query
           self.record_metric('db.query.duration', time.perf_counter() - start)  # Record metric
           return cursor                                          # Return result
   ```

2. **Instrument All Query Types**
   - SELECT queries (with row count)
   - INSERT operations (with affected rows)
   - UPDATE operations (with affected rows)
   - DELETE operations (rare, but instrumented)
   - Transaction boundaries (BEGIN/COMMIT)

3. **Add Query Pattern Detection**
   - Identify query type (SELECT, INSERT, etc.)
   - Extract table name
   - Detect slow queries (>1s)
   - Track prepared statement usage

**Metrics to Collect**:

- `db.query.duration` - Query execution time
- `db.query.count` - Number of queries by type
- `db.rows.returned` - Result set size
- `db.transaction.duration` - Transaction time
- `db.connection.wait` - Connection acquisition time

### 1.2 HTTP Request/Response Metrics (1.5 hours) ✅

**Location**: `starpunk/monitoring/http.py`

**Implementation Steps**:

1.
   **Enhance Request Middleware**
   ```python
   import time
   from flask import g

   @app.before_request
   def start_request_metrics():
       # get_memory_usage() and generate_request_id() are project helpers
       g.metrics = {
           'start_time': time.perf_counter(),
           'start_memory': get_memory_usage(),
           'request_id': generate_request_id(),
       }
   ```

2. **Capture Response Metrics**
   ```python
   @app.after_request
   def capture_response_metrics(response):
       duration = time.perf_counter() - g.metrics['start_time']       # Calculate duration
       memory_delta = get_memory_usage() - g.metrics['start_memory']  # Measure memory delta
       size = response.calculate_content_length()  # Record response size (None if streamed)
       status = response.status_code               # Track status codes
       # Hand duration, memory_delta, size, and status to the metrics collector here
       return response  # Flask requires after_request handlers to return the response
   ```

3. **Add Endpoint-Specific Metrics**
   - Feed generation timing
   - Micropub processing time
   - Static file serving
   - Admin operations

**Metrics to Collect**:

- `http.request.duration` - Total request time
- `http.request.size` - Request body size
- `http.response.size` - Response body size
- `http.status.{code}` - Status code distribution
- `http.endpoint.{name}` - Per-endpoint timing

### 1.3 Memory Monitoring Thread (1 hour) ✅

**Location**: `starpunk/monitoring/memory.py`

**Implementation Steps**:

1. **Create Background Monitor**
   ```python
   import time
   from threading import Thread

   class MemoryMonitor(Thread):
       def __init__(self, interval=60):
           super().__init__(daemon=True)
           self.interval = interval  # seconds between samples (assumed default)
           self.running = True

       def run(self):
           while self.running:
               rss = get_memory_usage()   # Get RSS memory (project helper)
               # Check for growth against the startup baseline
               # Detect potential leaks (sustained growth)
               time.sleep(self.interval)  # Sleep interval
   ```

2. **Track Memory Patterns**
   - Process RSS memory
   - Virtual memory size
   - Memory growth rate
   - High water mark
   - Garbage collection stats

3. **Add Leak Detection**
   - Baseline after startup
   - Track growth over time
   - Alert on sustained growth
   - Identify allocation sources

**Metrics to Collect**:

- `memory.rss` - Resident set size
- `memory.vms` - Virtual memory size
- `memory.growth_rate` - MB/hour
- `memory.gc.collections` - GC runs
- `memory.high_water` - Peak usage

### 1.4 Business Metrics for Syndication (1 hour) ✅

**Location**: `starpunk/monitoring/business.py`

**Implementation Steps**:

1. **Track Feed Operations**
   - Feed requests by format
   - Cache hit/miss rates
   - Generation timing
   - Format negotiation results

2. **Monitor Content Flow**
   - Notes published per day
   - Average note length
   - Media attachments
   - Syndication success

3.
   **User Behavior Metrics**
   - Popular feed formats
   - Reader user agents
   - Request patterns
   - Geographic distribution

**Metrics to Collect**:

- `feed.requests.{format}` - Requests by format
- `feed.cache.hit_rate` - Cache effectiveness
- `feed.generation.time` - Generation duration
- `content.notes.published` - Publishing rate
- `content.syndication.success` - Successful syndications

### Phase 1 Completion Status ✅

- **Completed**: 2025-11-25
- **Developer**: StarPunk Fullstack Developer (AI)
- **Review**: Approved by Architect on 2025-11-26
- **Test Results**: 28/28 tests passing
- **Performance**: <1% overhead achieved
- **Next Step**: Begin Phase 2 - Feed Formats

**Note**: All Phase 1 metrics instrumentation is complete and ready for production use. Business metrics functions are available for integration into `notes.py` and `feed.py` during Phase 2.

## Phase 2: Feed Formats (6-8 hours)

### Objective

Fix the RSS feed ordering regression, then implement ATOM and JSON Feed formats alongside existing RSS, with proper content negotiation and caching.

### 2.0 Fix RSS Feed Ordering Regression (0.5 hours) - CRITICAL

**Location**: `starpunk/feed.py`

**Critical Production Bug**: The RSS feed currently shows oldest entries first instead of newest first. This violates RSS standards and user expectations.

**Root Cause**: Incorrect `reversed()` calls on lines 100 and 198 flip the correct DESC order returned by the database.

**Implementation Steps**:

1. **Remove Incorrect Reversals**
   - Line 100: Remove `reversed()` from `for note in reversed(notes[:limit]):`
   - Line 198: Remove `reversed()` from `for note in reversed(notes[:limit]):`
   - Update/remove misleading comments about feedgen reversing order

2. **Verify Expected Behavior**
   - Database returns notes in DESC order (newest first) - confirmed at line 440 of `notes.py`
   - Feed should maintain this order (newest entries first)
   - This is the standard for ALL feed formats (RSS, ATOM, JSON Feed)

3.
   **Add Feed Order Tests**
   ```python
   def test_rss_feed_newest_first():
       """Test RSS feed shows newest entries first"""
       # Create notes with different timestamps
       old_note = create_note(title="Old", created_at=yesterday)
       new_note = create_note(title="New", created_at=today)

       # Generate feed from notes in DESC order, as the database returns them
       feed = generate_rss_feed([new_note, old_note])

       # Parse and verify order
       items = parse_feed_items(feed)
       assert items[0].title == "New"
       assert items[1].title == "Old"
   ```

**Important**: This MUST be fixed before implementing ATOM and JSON feeds to ensure all formats have consistent, correct ordering.

### 2.1 ATOM Feed Generation (2.5 hours)

**Location**: `starpunk/feed/atom.py`

**Implementation Steps**:

1. **Create ATOM Generator Class**
   ```python
   class AtomGenerator:
       def generate(self, notes, config):
           """Stream the feed as an iterator of XML chunks."""
           yield '<?xml version="1.0" encoding="utf-8"?>\n'      # Yield XML declaration
           yield '<feed xmlns="http://www.w3.org/2005/Atom">\n'  # Yield feed element
           # Yield feed-level metadata (id, title, updated) from config here
           for note in notes:                                     # Yield entries, newest first
               yield self.render_entry(note)  # hypothetical helper returning one <entry>
           yield '</feed>\n'
   ```

2. **Implement ATOM 1.0 Elements**
   - Required: id, title, updated
   - Recommended: author, link, category
   - Optional: contributor, generator, icon, logo, rights, subtitle

3. **Handle Content Types**
   - Text content (escaped)
   - HTML content (in CDATA)
   - XHTML content (inline)
   - Base64 for binary

4. **Date Formatting**
   - RFC 3339 format
   - Timezone handling
   - Updated vs published

**ATOM Structure**:

```xml
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Site Title</title>
  <updated>2024-11-25T12:00:00Z</updated>
  <author>
    <name>Author Name</name>
  </author>
  <id>http://example.com/</id>
  <entry>
    <title>Note Title</title>
    <id>http://example.com/note/1</id>
    <updated>2024-11-25T12:00:00Z</updated>
    <content type="html"><![CDATA[<p>HTML content</p>]]></content>
  </entry>
</feed>
```

### 2.2 JSON Feed Generation (2.5 hours)

**Location**: `starpunk/feed/json_feed.py`

**Implementation Steps**:

1. **Create JSON Feed Generator**
   ```python
   import json

   class JsonFeedGenerator:
       def generate(self, notes, config):
           feed = {                                                 # Build feed object
               'version': 'https://jsonfeed.org/version/1.1',
               'title': config['SITE_NAME'],                        # config key is an assumption
               'items': [self.build_item(note) for note in notes],  # Add items array (hypothetical helper)
           }
           # Include metadata (home_page_url, feed_url, authors) here
           return json.dumps(feed)                                  # Serialize JSON output
   ```

2. **Implement JSON Feed 1.1 Schema**
   - version (required)
   - title (required)
   - items (required array)
   - home_page_url
   - feed_url
   - description
   - authors array
   - language
   - icon, favicon

3. **Handle Rich Content**
   - content_html
   - content_text
   - summary
   - image attachments
   - tags array
   - authors array

4. **Add Extensions**
   - _starpunk namespace
   - Pagination hints
   - Hub for real-time

**JSON Feed Structure**:

```json
{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "Site Title",
  "home_page_url": "https://example.com/",
  "feed_url": "https://example.com/feed.json",
  "description": "Site description",
  "authors": [
    {
      "name": "Author Name",
      "url": "https://example.com/about"
    }
  ],
  "items": [
    {
      "id": "https://example.com/note/1",
      "url": "https://example.com/note/1",
      "title": "Note Title",
      "content_html": "<p>HTML content</p>",
      "date_published": "2024-11-25T12:00:00Z",
      "tags": ["tag1", "tag2"]
    }
  ]
}
```

### 2.3 Content Negotiation (1.5 hours)

**Location**: `starpunk/feed/negotiator.py`

**Implementation Steps**:

1. **Create Content Negotiator**
   ```python
   class FeedNegotiator:
       def negotiate(self, accept_header):
           # parse_accept() and score() are hypothetical helper methods
           accepted = self.parse_accept(accept_header)  # Parse Accept header
           scores = {fmt: self.score(fmt, accepted) for fmt in FORMAT_MIME_TYPES}  # Score each format
           return max(scores, key=scores.get)           # Return best match
   ```

2. **Parse Accept Header**
   - Split on comma
   - Extract MIME type
   - Parse quality factors (q=)
   - Handle wildcards (`*/*`)

3. **Score Formats**
   - Exact match: 1.0
   - Wildcard (`*/*`) match: 0.5
   - `type/*` match: 0.7
   - Default RSS: 0.1

4. **Format Mapping**
   ```python
   FORMAT_MIME_TYPES = {
       'rss': ['application/rss+xml', 'application/xml', 'text/xml'],
       'atom': ['application/atom+xml'],
       'json': ['application/json', 'application/feed+json']
   }
   ```

### 2.4 Feed Validation (1.5 hours)

**Location**: `starpunk/feed/validators.py`

**Implementation Steps**:

1. **Create Validation Framework**
   ```python
   from typing import List, Protocol

   class FeedValidator(Protocol):
       # ValidationError is a project-defined result type
       def validate(self, content: str) -> List[ValidationError]:
           ...
   ```

2. **RSS Validator**
   - Check required elements
   - Verify date formats
   - Validate URLs
   - Check CDATA escaping

3. **ATOM Validator**
   - Verify namespace
   - Check required elements
   - Validate RFC 3339 dates
   - Verify ID uniqueness

4. **JSON Feed Validator**
   - Validate against schema
   - Check required fields
   - Verify URL formats
   - Validate date strings

**Validation Levels**:

- ERROR: Feed is invalid
- WARNING: Non-critical issue
- INFO: Suggestion for improvement

## Phase 3: Feed Enhancements (4 hours)

### Objective

Add caching, statistics, and operational improvements to the feed system.

### 3.1 Feed Caching Layer (1.5 hours)

**Location**: `starpunk/feed/cache.py`

**Implementation Steps**:

1. **Create Cache Manager**
   ```python
   from collections import OrderedDict

   class FeedCache:
       def __init__(self, max_size=100, ttl=300):
           self.cache = OrderedDict()  # stdlib stand-in for an LRU container
           self.max_size = max_size    # evict the oldest entry beyond this size
           self.ttl = ttl              # seconds before an entry expires
   ```

2. **Cache Key Generation**
   - Format type
   - Item limit
   - Content checksum
   - Last modified

3.
   **Cache Operations**
   - Get with TTL check
   - Set with expiration
   - Invalidate on changes
   - Clear entire cache

4. **Memory Management**
   - Monitor cache size
   - Implement eviction
   - Track hit rates
   - Report statistics

**Cache Strategy**:

```python
def get_or_generate(feed_format, limit):
    key = generate_cache_key(feed_format, limit)
    cached = cache.get(key)

    if cached and not expired(cached):
        metrics.record_cache_hit()
        return cached

    content = generate_feed(feed_format, limit)
    cache.set(key, content, ttl=300)
    metrics.record_cache_miss()
    return content
```

### 3.2 Statistics Dashboard (1.5 hours)

**Location**: `starpunk/admin/syndication.py`

**Template**: `templates/admin/syndication.html`

**Implementation Steps**:

1. **Create Dashboard Route**
   ```python
   from flask import render_template

   @app.route('/admin/syndication')
   @require_admin
   def syndication_dashboard():
       stats = gather_syndication_stats()
       return render_template('admin/syndication.html', stats=stats)
   ```

2. **Gather Statistics**
   - Requests by format (pie chart)
   - Cache hit rates (line graph)
   - Generation times (histogram)
   - Popular user agents (table)
   - Recent errors (log)

3. **Create Dashboard UI**
   - Overview cards
   - Time series graphs
   - Format breakdown
   - Performance metrics
   - Configuration status

**Dashboard Sections**:

- Feed Format Usage
- Cache Performance
- Generation Times
- Client Analysis
- Error Log
- Configuration

### 3.3 OPML Export (1 hour)

**Location**: `starpunk/feed/opml.py`

**Implementation Steps**:

1. **Create OPML Generator**
   ```python
   from xml.etree import ElementTree as ET

   def generate_opml(site_config):
       opml = ET.Element('opml', version='2.0')              # Generate OPML header
       head = ET.SubElement(opml, 'head')
       ET.SubElement(head, 'title').text = 'StarPunk Feeds'  # Include metadata
       body = ET.SubElement(opml, 'body')
       for name, url in site_config['feeds'].items():        # Add feed outlines ('feeds' key is assumed)
           ET.SubElement(body, 'outline', type='rss', text=name, xmlUrl=url)
       return ET.tostring(opml, encoding='unicode')
   ```

2. **OPML Structure**
   ```xml
   <opml version="2.0">
     <head>
       <title>StarPunk Feeds</title>
       <dateCreated>Mon, 25 Nov 2024 12:00:00 UTC</dateCreated>
     </head>
     <body>
       <!-- one outline element per feed -->
     </body>
   </opml>
   ```

3. **Add Export Route**
   ```python
   from flask import Response

   @app.route('/feeds.opml')
   def export_opml():
       opml = generate_opml(config)
       return Response(opml, mimetype='text/x-opml')
   ```

## Testing Strategy

### Phase 1 Tests (Metrics)

1.
   **Unit Tests**
   - Mock database operations
   - Test metric collection
   - Verify memory monitoring
   - Test business metrics

2. **Integration Tests**
   - End-to-end request tracking
   - Database timing accuracy
   - Memory leak detection
   - Metrics aggregation

### Phase 2 Tests (Feeds)

1. **Format Tests**
   - Valid RSS generation
   - Valid ATOM generation
   - Valid JSON Feed generation
   - Content negotiation logic
   - **Feed ordering (newest first) for ALL formats - CRITICAL**

2. **Feed Ordering Tests (REQUIRED)**
   ```python
   def test_all_feeds_newest_first():
       """Verify all feed formats show newest entries first"""
       old_note = create_note(title="Old", created_at=yesterday)
       new_note = create_note(title="New", created_at=today)
       notes = [new_note, old_note]  # DESC order from database

       # Test RSS
       rss_feed = generate_rss_feed(notes)
       assert first_item(rss_feed).title == "New"

       # Test ATOM
       atom_feed = generate_atom_feed(notes)
       assert first_item(atom_feed).title == "New"

       # Test JSON
       json_feed = generate_json_feed(notes)
       assert json_feed['items'][0]['title'] == "New"
   ```

3. **Compliance Tests**
   - W3C Feed Validator
   - ATOM validator
   - JSON Feed validator
   - Popular readers

### Phase 3 Tests (Enhancements)

1. **Cache Tests**
   - TTL expiration
   - LRU eviction
   - Invalidation
   - Hit rate tracking

2.
   **Dashboard Tests**
   - Statistics accuracy
   - Graph rendering
   - OPML validity
   - Performance impact

## Configuration Updates

### New Configuration Options

Add to `config.py`:

```python
# Feed configuration
FEED_DEFAULT_LIMIT = int(os.getenv('STARPUNK_FEED_DEFAULT_LIMIT', 50))
FEED_MAX_LIMIT = int(os.getenv('STARPUNK_FEED_MAX_LIMIT', 500))
FEED_CACHE_TTL = int(os.getenv('STARPUNK_FEED_CACHE_TTL', 300))
FEED_CACHE_SIZE = int(os.getenv('STARPUNK_FEED_CACHE_SIZE', 100))

# Format support
FEED_RSS_ENABLED = str_to_bool(os.getenv('STARPUNK_FEED_RSS_ENABLED', 'true'))
FEED_ATOM_ENABLED = str_to_bool(os.getenv('STARPUNK_FEED_ATOM_ENABLED', 'true'))
FEED_JSON_ENABLED = str_to_bool(os.getenv('STARPUNK_FEED_JSON_ENABLED', 'true'))

# Metrics for syndication
METRICS_FEED_TIMING = str_to_bool(os.getenv('STARPUNK_METRICS_FEED_TIMING', 'true'))
METRICS_CACHE_STATS = str_to_bool(os.getenv('STARPUNK_METRICS_CACHE_STATS', 'true'))
METRICS_FORMAT_USAGE = str_to_bool(os.getenv('STARPUNK_METRICS_FORMAT_USAGE', 'true'))
```

## Documentation Updates

### User Documentation

1. **Feed Formats Guide**
   - How to access each format
   - Which readers support what
   - Format comparison

2. **Configuration Guide**
   - New environment variables
   - Performance tuning
   - Cache settings

### API Documentation

1. **Feed Endpoints**
   - `/feed.xml` - RSS feed
   - `/feed.atom` - ATOM feed
   - `/feed.json` - JSON feed
   - `/feeds.opml` - OPML export

2. **Content Negotiation**
   - Accept header usage
   - Format precedence
   - Default behavior

## Deployment Checklist

### Pre-deployment

- [ ] All tests passing
- [ ] Metrics instrumentation verified
- [ ] Feed formats validated
- [ ] Cache performance tested
- [ ] Documentation updated

### Deployment Steps

1. Backup database
2. Update configuration
3. Deploy new code
4. Run migrations (none for v1.1.2)
5. Clear feed cache
6. Test all feed formats
7.
   Verify metrics collection

### Post-deployment

- [ ] Monitor memory usage
- [ ] Check feed generation times
- [ ] Verify cache hit rates
- [ ] Test with feed readers
- [ ] Review error logs

## Rollback Plan

If issues arise:

1. **Immediate Rollback**
   ```bash
   git checkout v1.1.1
   supervisorctl restart starpunk
   ```

2. **Cache Cleanup**
   ```bash
   redis-cli FLUSHDB  # If using Redis
   rm -rf /tmp/starpunk_cache/*  # If file-based
   ```

3. **Configuration Rollback**
   ```bash
   cp config.backup.ini config.ini
   ```

## Success Metrics

### Performance Targets

- Feed generation <100ms (50 items)
- Cache hit rate >80%
- Memory overhead <10MB
- Zero performance regression

### Compatibility Targets

- 10+ feed readers tested
- All validators passing
- No breaking changes
- Backward compatibility maintained

## Timeline

### Week 1

- Phase 1: Metrics instrumentation (4-6 hours)
- Testing and validation

### Week 2

- Phase 2: Feed formats (6-8 hours)
- Integration testing

### Week 3

- Phase 3: Enhancements (4 hours)
- Final testing and documentation
- Deployment

Total estimated time: 14-16 hours of focused development
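
## Appendix: Content Negotiation Sketch

The negotiation steps in section 2.3 (split the Accept header on commas, extract each MIME type and its `q=` factor, then score formats at 1.0 for an exact match, 0.7 for `type/*`, 0.5 for `*/*`, with RSS as a 0.1 default) can be illustrated with a short standalone example. This is a minimal sketch, not the StarPunk implementation: the function names `parse_accept`, `match_score`, and `negotiate` are hypothetical, and multiplying the match score by the client's quality factor is an assumption about how the two values combine.

```python
# Hypothetical sketch: only FORMAT_MIME_TYPES and the score values come from
# the guide; function names and the q * score combination are assumptions.
FORMAT_MIME_TYPES = {
    'rss': ['application/rss+xml', 'application/xml', 'text/xml'],
    'atom': ['application/atom+xml'],
    'json': ['application/json', 'application/feed+json'],
}


def parse_accept(accept_header):
    """Return a list of (media_range, q) pairs from an Accept header."""
    entries = []
    for part in accept_header.split(','):
        pieces = [p.strip() for p in part.split(';')]
        media_range = pieces[0].lower()
        if not media_range:
            continue  # skip empty segments, e.g. from an empty header
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_range, q))
    return entries


def match_score(mime, media_range):
    """Score one concrete MIME type against one accepted media range."""
    if media_range == mime:
        return 1.0  # exact match
    if media_range == '*/*':
        return 0.5  # full wildcard
    if media_range.endswith('/*') and mime.startswith(media_range[:-1]):
        return 0.7  # type/* match
    return 0.0


def negotiate(accept_header, default='rss'):
    """Pick the feed format with the highest q-weighted score."""
    best_format, best_score = default, 0.1  # default RSS baseline
    for media_range, q in parse_accept(accept_header or ''):
        for fmt, mimes in FORMAT_MIME_TYPES.items():
            score = q * max(match_score(m, media_range) for m in mimes)
            if score > best_score:
                best_format, best_score = fmt, score
    return best_format
```

Because a candidate only replaces the current best on a strictly higher score, ties fall to whichever format was seen first, so a bare `*/*` resolves to RSS. A request that names no supported type also falls back to the RSS default.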