StarPunk/docs/design/v1.1.2/implementation-guide.md
Phil Skentelbery b0230b1233 feat: Complete v1.1.2 Phase 1 - Metrics Instrumentation
Implements the metrics instrumentation framework that was missing from v1.1.1.
The monitoring framework existed but was never actually used to collect metrics.

Phase 1 Deliverables:
- Database operation monitoring with query timing and slow query detection
- HTTP request/response metrics with request IDs for all requests
- Memory monitoring via daemon thread with configurable intervals
- Business metrics framework for notes, feeds, and cache operations
- Configuration management with environment variable support

Implementation Details:
- MonitoredConnection wrapper at pool level for transparent DB monitoring
- Flask middleware hooks for HTTP metrics collection
- Background daemon thread for memory statistics (skipped in test mode)
- Simple business metric helpers for integration in Phase 2
- Comprehensive test suite with 28/28 tests passing

Quality Metrics:
- 100% test pass rate (28/28 tests)
- Zero architectural deviations from specifications
- <1% performance overhead achieved
- Production-ready with minimal memory impact (~2MB)

Architect Review: APPROVED with excellent marks

Documentation:
- Implementation report: docs/reports/v1.1.2-phase1-metrics-implementation.md
- Architect review: docs/reviews/2025-11-26-v1.1.2-phase1-review.md
- Updated CHANGELOG.md with Phase 1 additions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 14:13:44 -07:00


StarPunk v1.1.2 "Syndicate" - Implementation Guide

Overview

This guide provides a phased approach to implementing v1.1.2 "Syndicate" features. The release is structured in three phases (4-6, 6-8, and 4 hours respectively), totaling 14-18 hours of focused development.

Pre-Implementation Checklist

  • Review v1.1.1 performance monitoring specification
  • Ensure development environment has Python 3.11+
  • Create feature branch: feature/v1.1.2-syndicate
  • Review feed format specifications (RSS 2.0, ATOM 1.0, JSON Feed 1.1)
  • Set up feed reader test clients

Phase 1: Metrics Instrumentation (4-6 hours) COMPLETE

Objective

Complete the metrics instrumentation that was partially implemented in v1.1.1, adding comprehensive coverage across all system operations.

1.1 Database Operation Timing (1.5 hours)

Location: starpunk/monitoring/database.py

Implementation Steps:

  1. Create Database Monitor Wrapper

    class MonitoredConnection:
        """Wrapper for SQLite connections with timing"""
    
        def execute(self, query, params=None):
            # Start timer
            # Execute query
            # Record metric
            # Return result
    
  2. Instrument All Query Types

    • SELECT queries (with row count)
    • INSERT operations (with affected rows)
    • UPDATE operations (with affected rows)
    • DELETE operations (rare, but instrumented)
    • Transaction boundaries (BEGIN/COMMIT)
  3. Add Query Pattern Detection

    • Identify query type (SELECT, INSERT, etc.)
    • Extract table name
    • Detect slow queries (>1s)
    • Track prepared statement usage

Metrics to Collect:

  • db.query.duration - Query execution time
  • db.query.count - Number of queries by type
  • db.rows.returned - Result set size
  • db.transaction.duration - Transaction time
  • db.connection.wait - Connection acquisition time
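
The wrapper and pattern-detection steps above can be sketched as follows. This is a minimal illustration, not the shipped `MonitoredConnection`: the `recorded` list stands in for whatever metrics sink the monitoring framework exposes, and the regex-based table extraction is an assumption that only handles simple statements.

```python
import re
import sqlite3
import time

SLOW_QUERY_SECONDS = 1.0  # slow-query threshold from step 3

# Finds the table name after FROM / INTO / UPDATE (simple statements only).
_TABLE_RE = re.compile(
    r"\b(?:FROM|INTO|UPDATE)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE
)


class MonitoredConnection:
    """Wraps a sqlite3 connection and records one metric dict per query."""

    def __init__(self, connection, recorded):
        self._connection = connection
        self._recorded = recorded  # hypothetical sink: a plain list

    def execute(self, query, params=None):
        start = time.perf_counter()
        try:
            return self._connection.execute(query, params or ())
        finally:
            # Runs whether the query succeeded or raised.
            duration = time.perf_counter() - start
            words = query.split()
            table = _TABLE_RE.search(query)
            self._recorded.append({
                "name": "db.query.duration",
                "query_type": words[0].upper() if words else "UNKNOWN",
                "table": table.group(1) if table else None,
                "duration": duration,
                "slow": duration > SLOW_QUERY_SECONDS,
            })


recorded = []
conn = MonitoredConnection(sqlite3.connect(":memory:"), recorded)
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO notes (title) VALUES (?)", ("Hello",))
rows = conn.execute("SELECT title FROM notes").fetchall()
```

Recording in a `finally` block means failed queries are timed too, which is usually what you want when hunting slow or broken statements.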

1.2 HTTP Request/Response Metrics (1.5 hours)

Location: starpunk/monitoring/http.py

Implementation Steps:

  1. Enhance Request Middleware

    @app.before_request
    def start_request_metrics():
        g.metrics = {
            'start_time': time.perf_counter(),
            'start_memory': get_memory_usage(),
            'request_id': generate_request_id()
        }
    
  2. Capture Response Metrics

    @app.after_request
    def capture_response_metrics(response):
        # Calculate duration
        # Measure memory delta
        # Record response size
        # Track status codes
        return response  # Flask requires after_request hooks to return the response
    
  3. Add Endpoint-Specific Metrics

    • Feed generation timing
    • Micropub processing time
    • Static file serving
    • Admin operations

Metrics to Collect:

  • http.request.duration - Total request time
  • http.request.size - Request body size
  • http.response.size - Response body size
  • http.status.{code} - Status code distribution
  • http.endpoint.{name} - Per-endpoint timing
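
As a framework-agnostic sketch of what the two hooks compute, the functions below take and return plain dictionaries. In the real application the state would live on `flask.g` and the values would come from the Flask request and response objects. The request ID format (a UUID4 hex string) and the shape of the returned dictionary are assumptions made for illustration.

```python
import time
import uuid


def start_request_metrics():
    """State the before_request hook would store for the current request."""
    return {
        "start_time": time.perf_counter(),
        "request_id": uuid.uuid4().hex,  # assumed request ID format
    }


def capture_response_metrics(request_metrics, endpoint, status_code, body):
    """Values the after_request hook would derive from the stored state."""
    duration = time.perf_counter() - request_metrics["start_time"]
    return {
        "request_id": request_metrics["request_id"],
        "http.request.duration": duration,
        "http.response.size": len(body),
        f"http.status.{status_code}": 1,
        f"http.endpoint.{endpoint}": duration,
    }


state = start_request_metrics()
metrics = capture_response_metrics(state, "feed", 200, b"<rss/>")
```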

1.3 Memory Monitoring Thread (1 hour)

Location: starpunk/monitoring/memory.py

Implementation Steps:

  1. Create Background Monitor

    class MemoryMonitor(Thread):
        def run(self):
            while self.running:
                # Get RSS memory
                # Check for growth
                # Detect potential leaks
                # Sleep interval
    
  2. Track Memory Patterns

    • Process RSS memory
    • Virtual memory size
    • Memory growth rate
    • High water mark
    • Garbage collection stats
  3. Add Leak Detection

    • Baseline after startup
    • Track growth over time
    • Alert on sustained growth
    • Identify allocation sources

Metrics to Collect:

  • memory.rss - Resident set size
  • memory.vms - Virtual memory size
  • memory.growth_rate - MB/hour
  • memory.gc.collections - GC runs
  • memory.high_water - Peak usage
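
One possible shape for the monitor thread is sketched below. The memory reader is injected as a callable (`read_rss_mb`, a hypothetical name) because the standard library has no portable call for current RSS; a reader built on `psutil.Process().memory_info().rss` would be a common choice. Leak detection and GC statistics are left out of this sketch.

```python
import threading


class MemoryMonitor(threading.Thread):
    """Samples memory on an interval and tracks growth against a baseline."""

    def __init__(self, read_rss_mb, interval=30.0):
        super().__init__(daemon=True)
        self._read_rss_mb = read_rss_mb  # hypothetical callable returning MB
        self._interval = interval
        self._stop_event = threading.Event()
        self.baseline = None
        self.high_water = 0.0
        self.samples = []

    def sample_once(self):
        rss = self._read_rss_mb()
        if self.baseline is None:
            self.baseline = rss  # first reading becomes the baseline
        self.high_water = max(self.high_water, rss)
        self.samples.append(rss)
        return rss - self.baseline  # growth since baseline, in MB

    def run(self):
        # Event.wait doubles as an interruptible sleep.
        while not self._stop_event.is_set():
            self.sample_once()
            self._stop_event.wait(self._interval)

    def stop(self):
        self._stop_event.set()


readings = iter([50.0, 52.5, 51.0])
monitor = MemoryMonitor(lambda: next(readings))
growth = [monitor.sample_once() for _ in range(3)]
```

Keeping the sampling logic in `sample_once()` lets tests exercise it without starting the thread, which fits the commit note that the thread is skipped in test mode.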

1.4 Business Metrics for Syndication (1 hour)

Location: starpunk/monitoring/business.py

Implementation Steps:

  1. Track Feed Operations

    • Feed requests by format
    • Cache hit/miss rates
    • Generation timing
    • Format negotiation results
  2. Monitor Content Flow

    • Notes published per day
    • Average note length
    • Media attachments
    • Syndication success
  3. User Behavior Metrics

    • Popular feed formats
    • Reader user agents
    • Request patterns
    • Geographic distribution

Metrics to Collect:

  • feed.requests.{format} - Requests by format
  • feed.cache.hit_rate - Cache effectiveness
  • feed.generation.time - Generation duration
  • content.notes.published - Publishing rate
  • content.syndication.success - Successful syndications
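
A minimal sketch of the feed-side business metrics, assuming a simple in-process counter. `FeedMetrics` and `track_feed_request` are hypothetical names, not the helpers shipped in `starpunk/monitoring/business.py`.

```python
from collections import Counter


class FeedMetrics:
    """Counts feed requests per format and derives the cache hit rate."""

    def __init__(self):
        self.counts = Counter()

    def track_feed_request(self, feed_format, cache_hit):
        self.counts[f"feed.requests.{feed_format}"] += 1
        self.counts["feed.cache.hit" if cache_hit else "feed.cache.miss"] += 1

    def cache_hit_rate(self):
        hits = self.counts["feed.cache.hit"]
        total = hits + self.counts["feed.cache.miss"]
        return hits / total if total else 0.0


feed_metrics = FeedMetrics()
feed_metrics.track_feed_request("rss", cache_hit=False)
feed_metrics.track_feed_request("rss", cache_hit=True)
feed_metrics.track_feed_request("atom", cache_hit=True)
feed_metrics.track_feed_request("json", cache_hit=True)
```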

Phase 1 Completion Status

  • Completed: 2025-11-25
  • Developer: StarPunk Fullstack Developer (AI)
  • Review: Approved by Architect on 2025-11-26
  • Test Results: 28/28 tests passing
  • Performance: <1% overhead achieved
  • Next Step: Begin Phase 2 - Feed Formats

Note: All Phase 1 metrics instrumentation is complete and ready for production use. Business metrics functions are available for integration into notes.py and feed.py during Phase 2.

Phase 2: Feed Formats (6-8 hours)

Objective

Fix RSS feed ordering regression, then implement ATOM and JSON Feed formats alongside existing RSS, with proper content negotiation and caching.

2.0 Fix RSS Feed Ordering Regression (0.5 hours) - CRITICAL

Location: starpunk/feed.py

Critical Production Bug: RSS feed currently shows oldest entries first instead of newest first. This breaks the newest-first ordering that feed readers and users expect.

Root Cause: Incorrect reversed() calls on lines 100 and 198 that flip the correct DESC order from database.

Implementation Steps:

  1. Remove Incorrect Reversals

    • Line 100: Remove reversed() from for note in reversed(notes[:limit]):
    • Line 198: Remove reversed() from for note in reversed(notes[:limit]):
    • Update/remove misleading comments about feedgen reversing order
  2. Verify Expected Behavior

    • Database returns notes in DESC order (newest first) - confirmed line 440 of notes.py
    • Feed should maintain this order (newest entries first)
    • This is the standard for ALL feed formats (RSS, ATOM, JSON Feed)
  3. Add Feed Order Tests

    def test_rss_feed_newest_first():
        """Test RSS feed shows newest entries first"""
        # Create notes with different timestamps
        old_note = create_note(title="Old", created_at=yesterday)
        new_note = create_note(title="New", created_at=today)
    
        # Generate feed (notes passed newest first, as the database returns them)
        feed = generate_rss_feed([new_note, old_note])
    
        # Parse and verify order
        items = parse_feed_items(feed)
        assert items[0].title == "New"
        assert items[1].title == "Old"
    

Important: This MUST be fixed before implementing ATOM and JSON feeds to ensure all formats have consistent, correct ordering.

2.1 ATOM Feed Generation (2.5 hours)

Location: starpunk/feed/atom.py

Implementation Steps:

  1. Create ATOM Generator Class

    class AtomGenerator:
        def generate(self, notes, config):
            # Yield XML declaration
            # Yield feed element
            # Yield entries
            # Stream output
    
  2. Implement ATOM 1.0 Elements

    • Required: id, title, updated
    • Recommended: author, link, category
    • Optional: contributor, generator, icon, logo, rights, subtitle
  3. Handle Content Types

    • Text content (escaped)
    • HTML content (in CDATA)
    • XHTML content (inline)
    • Base64 for binary
  4. Date Formatting

    • RFC 3339 format
    • Timezone handling
    • Updated vs published

ATOM Structure:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Site Title</title>
  <link href="http://example.com/"/>
  <link href="http://example.com/feed.atom" rel="self"/>
  <updated>2024-11-25T12:00:00Z</updated>
  <author>
    <name>Author Name</name>
  </author>
  <id>http://example.com/</id>

  <entry>
    <title>Note Title</title>
    <link href="http://example.com/note/1"/>
    <id>http://example.com/note/1</id>
    <updated>2024-11-25T12:00:00Z</updated>
    <content type="html">
      <![CDATA[<p>HTML content</p>]]>
    </content>
  </entry>
</feed>
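
A streaming generator producing the structure above might look like the sketch below. It is written as a plain function rather than the `AtomGenerator` class, and it assumes each note is a dictionary with `title`, `url`, `updated` (a timezone-aware datetime), and `html` keys. Those field names are illustrative, not the project's note model.

```python
from datetime import datetime, timezone
from xml.sax.saxutils import escape, quoteattr


def rfc3339(moment):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def generate_atom(notes, site_url, site_title, author):
    """Yield the feed in chunks; `notes` is assumed to be newest first."""
    updated = notes[0]["updated"] if notes else datetime.now(timezone.utc)
    yield '<?xml version="1.0" encoding="utf-8"?>\n'
    yield '<feed xmlns="http://www.w3.org/2005/Atom">\n'
    yield "  <title>" + escape(site_title) + "</title>\n"
    yield "  <link href=" + quoteattr(site_url) + "/>\n"
    yield "  <updated>" + rfc3339(updated) + "</updated>\n"
    yield "  <author><name>" + escape(author) + "</name></author>\n"
    yield "  <id>" + escape(site_url) + "</id>\n"
    for note in notes:
        yield "  <entry>\n"
        yield "    <title>" + escape(note["title"]) + "</title>\n"
        yield "    <link href=" + quoteattr(note["url"]) + "/>\n"
        yield "    <id>" + escape(note["url"]) + "</id>\n"
        yield "    <updated>" + rfc3339(note["updated"]) + "</updated>\n"
        # Escaped markup parses to the same text as a CDATA section would.
        yield '    <content type="html">' + escape(note["html"]) + "</content>\n"
        yield "  </entry>\n"
    yield "</feed>\n"


notes = [{
    "title": "Note & Title",
    "url": "http://example.com/note/1",
    "updated": datetime(2024, 11, 25, 12, 0, tzinfo=timezone.utc),
    "html": "<p>HTML content</p>",
}]
atom_xml = "".join(
    generate_atom(notes, "http://example.com/", "Site Title", "Author Name")
)
```

Escaping is used here instead of CDATA because it cannot be broken by content that itself contains `]]>`.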

2.2 JSON Feed Generation (2.5 hours)

Location: starpunk/feed/json_feed.py

Implementation Steps:

  1. Create JSON Feed Generator

    class JsonFeedGenerator:
        def generate(self, notes, config):
            # Build feed object
            # Add items array
            # Include metadata
            # Stream JSON output
    
  2. Implement JSON Feed 1.1 Schema

    • version (required)
    • title (required)
    • items (required array)
    • home_page_url
    • feed_url
    • description
    • authors array
    • language
    • icon, favicon
  3. Handle Rich Content

    • content_html
    • content_text
    • summary
    • image attachments
    • tags array
    • authors array
  4. Add Extensions

    • _starpunk namespace
    • Pagination hints
    • Hub for real-time

JSON Feed Structure:

{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "Site Title",
  "home_page_url": "https://example.com/",
  "feed_url": "https://example.com/feed.json",
  "description": "Site description",
  "authors": [
    {
      "name": "Author Name",
      "url": "https://example.com/about"
    }
  ],
  "items": [
    {
      "id": "https://example.com/note/1",
      "url": "https://example.com/note/1",
      "title": "Note Title",
      "content_html": "<p>HTML content</p>",
      "date_published": "2024-11-25T12:00:00Z",
      "tags": ["tag1", "tag2"]
    }
  ]
}
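
The structure above can be built with the standard `json` module, as in the sketch below. It assumes notes are dictionaries with `url`, `title`, `html`, `published` (already an RFC 3339 string), and an optional `tags` list. Those names are illustrative only, and the sketch serializes the whole document at once rather than streaming it.

```python
import json


def generate_json_feed(notes, site_url, site_title, description=""):
    """Serialize notes (assumed newest first) as a JSON Feed 1.1 document."""
    feed = {
        "version": "https://jsonfeed.org/version/1.1",
        "title": site_title,
        "home_page_url": site_url,
        "feed_url": site_url.rstrip("/") + "/feed.json",
        "description": description,
        "items": [
            {
                "id": note["url"],
                "url": note["url"],
                "title": note["title"],
                "content_html": note["html"],
                "date_published": note["published"],
                "tags": note.get("tags", []),
            }
            for note in notes
        ],
    }
    return json.dumps(feed, ensure_ascii=False, indent=2)


json_feed_text = generate_json_feed(
    [{
        "url": "https://example.com/note/1",
        "title": "Note Title",
        "html": "<p>HTML content</p>",
        "published": "2024-11-25T12:00:00Z",
        "tags": ["tag1", "tag2"],
    }],
    "https://example.com/",
    "Site Title",
    "Site description",
)
```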

2.3 Content Negotiation (1.5 hours)

Location: starpunk/feed/negotiator.py

Implementation Steps:

  1. Create Content Negotiator

    class FeedNegotiator:
        def negotiate(self, accept_header):
            # Parse Accept header
            # Score each format
            # Return best match
    
  2. Parse Accept Header

    • Split on comma
    • Extract MIME type
    • Parse quality factors (q=)
    • Handle wildcards (*/*)
  3. Score Formats

    • Exact match: 1.0
    • Wildcard match: 0.5
    • Type/* match: 0.7
    • Default RSS: 0.1
  4. Format Mapping

    FORMAT_MIME_TYPES = {
        'rss': ['application/rss+xml', 'application/xml', 'text/xml'],
        'atom': ['application/atom+xml'],
        'json': ['application/json', 'application/feed+json']
    }
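
The parsing and scoring steps can be combined as in the sketch below. Multiplying the client's `q` value by the match score is an assumption about how the two should interact, since the steps above list the scores but not how they combine. The function names are illustrative.

```python
FORMAT_MIME_TYPES = {
    "rss": ["application/rss+xml", "application/xml", "text/xml"],
    "atom": ["application/atom+xml"],
    "json": ["application/json", "application/feed+json"],
}


def parse_accept(accept_header):
    """Return (mime_type, quality) pairs from an Accept header."""
    entries = []
    for part in accept_header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        mime = pieces[0].lower()
        if not mime:
            continue
        quality = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        entries.append((mime, quality))
    return entries


def match_score(accepted, offered):
    """Score one accepted media range against one offered MIME type."""
    if accepted == offered:
        return 1.0
    if accepted == "*/*":
        return 0.5
    if accepted.endswith("/*") and offered.startswith(accepted[:-1]):
        return 0.7
    return 0.0


def negotiate(accept_header, default="rss"):
    """Pick the best feed format, falling back to the default (weight 0.1)."""
    best_format, best_score = default, 0.1
    for accepted, quality in parse_accept(accept_header or ""):
        for feed_format, mime_types in FORMAT_MIME_TYPES.items():
            score = quality * max(match_score(accepted, m) for m in mime_types)
            if score > best_score:
                best_format, best_score = feed_format, score
    return best_format
```

Because a later format must score strictly higher to win, ties such as `*/*` resolve to the first format in `FORMAT_MIME_TYPES`, which keeps RSS as the default.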
    

2.4 Feed Validation (1.5 hours)

Location: starpunk/feed/validators.py

Implementation Steps:

  1. Create Validation Framework

    from typing import List, Protocol
    
    class FeedValidator(Protocol):
        def validate(self, content: str) -> List["ValidationError"]:
            ...
    
  2. RSS Validator

    • Check required elements
    • Verify date formats
    • Validate URLs
    • Check CDATA escaping
  3. ATOM Validator

    • Verify namespace
    • Check required elements
    • Validate RFC 3339 dates
    • Verify ID uniqueness
  4. JSON Feed Validator

    • Validate against schema
    • Check required fields
    • Verify URL formats
    • Validate date strings

Validation Levels:

  • ERROR: Feed is invalid
  • WARNING: Non-critical issue
  • INFO: Suggestion for improvement
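
A minimal sketch of how one validator could report issues at these three levels, using the JSON Feed case. The `ValidationError` dataclass shape is an assumption. The required and recommended field rules reflect a reading of the JSON Feed 1.1 specification and should be checked against it before use.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ValidationError:
    level: str  # "ERROR", "WARNING" or "INFO"
    message: str


class JsonFeedValidator:
    REQUIRED_FIELDS = ("version", "title", "items")
    RECOMMENDED_FIELDS = ("home_page_url", "feed_url")

    def validate(self, content: str) -> List[ValidationError]:
        try:
            feed = json.loads(content)
        except json.JSONDecodeError as exc:
            return [ValidationError("ERROR", f"Invalid JSON: {exc.msg}")]
        if not isinstance(feed, dict):
            return [ValidationError("ERROR", "Top level must be a JSON object")]

        issues = []
        for name in self.REQUIRED_FIELDS:
            if name not in feed:
                issues.append(ValidationError("ERROR", f"Missing required field: {name}"))
        items = feed.get("items", [])
        if not isinstance(items, list):
            issues.append(ValidationError("ERROR", "items must be an array"))
            items = []
        for index, item in enumerate(items):
            if not isinstance(item, dict) or "id" not in item:
                issues.append(ValidationError("ERROR", f"Item {index} has no id"))
            elif "content_html" not in item and "content_text" not in item:
                issues.append(ValidationError("ERROR", f"Item {index} has no content"))
        for name in self.RECOMMENDED_FIELDS:
            if name not in feed:
                issues.append(ValidationError("WARNING", f"Missing recommended field: {name}"))
        if "description" not in feed:
            issues.append(ValidationError("INFO", "Consider adding a description"))
        return issues


validator = JsonFeedValidator()
issues = validator.validate('{"title": "Site Title", "items": [{"id": "1"}]}')
```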

Phase 3: Feed Enhancements (4 hours)

Objective

Add caching, statistics, and operational improvements to the feed system.

3.1 Feed Caching Layer (1.5 hours)

Location: starpunk/feed/cache.py

Implementation Steps:

  1. Create Cache Manager

    class FeedCache:
        def __init__(self, max_size=100, ttl=300):
            self.cache = LRU(max_size)
            self.ttl = ttl
    
  2. Cache Key Generation

    • Format type
    • Item limit
    • Content checksum
    • Last modified
  3. Cache Operations

    • Get with TTL check
    • Set with expiration
    • Invalidate on changes
    • Clear entire cache
  4. Memory Management

    • Monitor cache size
    • Implement eviction
    • Track hit rates
    • Report statistics

Cache Strategy:

def get_or_generate(format, limit):
    key = generate_cache_key(format, limit)
    cached = cache.get(key)

    if cached and not expired(cached):
        metrics.record_cache_hit()
        return cached

    content = generate_feed(format, limit)
    cache.set(key, content, ttl=300)
    metrics.record_cache_miss()
    return content
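
The `LRU` type used by `FeedCache` is not defined in this guide. One way to get LRU eviction plus TTL expiry from the standard library alone is an `OrderedDict`, sketched below. The injectable `clock` parameter, the tuple cache key, and passing `generate_feed` as an argument are assumptions added to make the example self-contained and testable.

```python
import time
from collections import OrderedDict


class FeedCache:
    """LRU cache with per-entry expiry, backed by an OrderedDict."""

    def __init__(self, max_size=100, ttl=300, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (expires_at, content)
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(key, None)  # drop the expired entry, if any
            self.misses += 1
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return entry[1]

    def set(self, key, content):
        self._entries[key] = (self._clock() + self._ttl, content)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used

    def clear(self):
        self._entries.clear()


def get_or_generate(cache, feed_format, limit, generate_feed):
    key = (feed_format, limit)
    cached = cache.get(key)
    if cached is not None:
        return cached
    content = generate_feed(feed_format, limit)
    cache.set(key, content)
    return content


now = [0.0]  # fake clock value, advanced manually
calls = []


def fake_generate(feed_format, limit):
    calls.append((feed_format, limit))
    return f"<{feed_format} limit={limit}>"


cache = FeedCache(max_size=2, ttl=300, clock=lambda: now[0])
first = get_or_generate(cache, "rss", 50, fake_generate)   # miss, generates
second = get_or_generate(cache, "rss", 50, fake_generate)  # hit
now[0] = 301.0
third = get_or_generate(cache, "rss", 50, fake_generate)   # expired, regenerates
```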

3.2 Statistics Dashboard (1.5 hours)

Location: starpunk/admin/syndication.py

Template: templates/admin/syndication.html

Implementation Steps:

  1. Create Dashboard Route

    @app.route('/admin/syndication')
    @require_admin
    def syndication_dashboard():
        stats = gather_syndication_stats()
        return render_template('admin/syndication.html', stats=stats)
    
  2. Gather Statistics

    • Requests by format (pie chart)
    • Cache hit rates (line graph)
    • Generation times (histogram)
    • Popular user agents (table)
    • Recent errors (log)
  3. Create Dashboard UI

    • Overview cards
    • Time series graphs
    • Format breakdown
    • Performance metrics
    • Configuration status

Dashboard Sections:

  • Feed Format Usage
  • Cache Performance
  • Generation Times
  • Client Analysis
  • Error Log
  • Configuration

3.3 OPML Export (1 hour)

Location: starpunk/feed/opml.py

Implementation Steps:

  1. Create OPML Generator

    def generate_opml(site_config):
        # Generate OPML header
        # Add feed outlines
        # Include metadata
        return opml_content
    
  2. OPML Structure

    <?xml version="1.0" encoding="UTF-8"?>
    <opml version="2.0">
      <head>
        <title>StarPunk Feeds</title>
        <dateCreated>Mon, 25 Nov 2024 12:00:00 GMT</dateCreated>
      </head>
      <body>
        <outline type="rss" text="RSS Feed" xmlUrl="https://example.com/feed.xml"/>
        <outline type="atom" text="ATOM Feed" xmlUrl="https://example.com/feed.atom"/>
        <outline type="json" text="JSON Feed" xmlUrl="https://example.com/feed.json"/>
      </body>
    </opml>
    
  3. Add Export Route

    @app.route('/feeds.opml')
    def export_opml():
        opml = generate_opml(config)
        return Response(opml, mimetype='text/x-opml')
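
The generator body could be filled in with `xml.etree.ElementTree`, as in the sketch below. It takes the site URL directly rather than a `site_config` object, and the `FEEDS` table and `created` parameter are assumptions for illustration. The date is formatted with `email.utils.format_datetime`, which yields the RFC 822 style that OPML 2.0 expects.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET

# (outline type, label, path) for each feed listed in the structure above.
FEEDS = [
    ("rss", "RSS Feed", "/feed.xml"),
    ("atom", "ATOM Feed", "/feed.atom"),
    ("json", "JSON Feed", "/feed.json"),
]


def generate_opml(site_url, title="StarPunk Feeds", created=None):
    """Build an OPML 2.0 document listing the site's feeds."""
    created = created or datetime.now(timezone.utc)
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    # usegmt=True requires a datetime whose tzinfo is timezone.utc.
    ET.SubElement(head, "dateCreated").text = format_datetime(created, usegmt=True)
    body = ET.SubElement(opml, "body")
    base = site_url.rstrip("/")
    for outline_type, text, path in FEEDS:
        ET.SubElement(body, "outline", {
            "type": outline_type,
            "text": text,
            "xmlUrl": base + path,
        })
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(opml, encoding="unicode")


opml_text = generate_opml(
    "https://example.com/",
    created=datetime(2024, 11, 25, 12, 0, tzinfo=timezone.utc),
)
```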
    

Testing Strategy

Phase 1 Tests (Metrics)

  1. Unit Tests

    • Mock database operations
    • Test metric collection
    • Verify memory monitoring
    • Test business metrics
  2. Integration Tests

    • End-to-end request tracking
    • Database timing accuracy
    • Memory leak detection
    • Metrics aggregation

Phase 2 Tests (Feeds)

  1. Format Tests

    • Valid RSS generation
    • Valid ATOM generation
    • Valid JSON Feed generation
    • Content negotiation logic
    • Feed ordering (newest first) for ALL formats - CRITICAL
  2. Feed Ordering Tests (REQUIRED)

    def test_all_feeds_newest_first():
        """Verify all feed formats show newest entries first"""
        old_note = create_note(title="Old", created_at=yesterday)
        new_note = create_note(title="New", created_at=today)
        notes = [new_note, old_note]  # DESC order from database
    
        # Test RSS
        rss_feed = generate_rss_feed(notes)
        assert first_item(rss_feed).title == "New"
    
        # Test ATOM
        atom_feed = generate_atom_feed(notes)
        assert first_item(atom_feed).title == "New"
    
        # Test JSON
        json_feed = generate_json_feed(notes)
        assert json_feed['items'][0]['title'] == "New"
    
  3. Compliance Tests

    • W3C Feed Validator
    • ATOM validator
    • JSON Feed validator
    • Popular readers

Phase 3 Tests (Enhancements)

  1. Cache Tests

    • TTL expiration
    • LRU eviction
    • Invalidation
    • Hit rate tracking
  2. Dashboard Tests

    • Statistics accuracy
    • Graph rendering
    • OPML validity
    • Performance impact

Configuration Updates

New Configuration Options

Add to config.py:

# Feed configuration
FEED_DEFAULT_LIMIT = int(os.getenv('STARPUNK_FEED_DEFAULT_LIMIT', 50))
FEED_MAX_LIMIT = int(os.getenv('STARPUNK_FEED_MAX_LIMIT', 500))
FEED_CACHE_TTL = int(os.getenv('STARPUNK_FEED_CACHE_TTL', 300))
FEED_CACHE_SIZE = int(os.getenv('STARPUNK_FEED_CACHE_SIZE', 100))

# Format support
FEED_RSS_ENABLED = str_to_bool(os.getenv('STARPUNK_FEED_RSS_ENABLED', 'true'))
FEED_ATOM_ENABLED = str_to_bool(os.getenv('STARPUNK_FEED_ATOM_ENABLED', 'true'))
FEED_JSON_ENABLED = str_to_bool(os.getenv('STARPUNK_FEED_JSON_ENABLED', 'true'))

# Metrics for syndication
METRICS_FEED_TIMING = str_to_bool(os.getenv('STARPUNK_METRICS_FEED_TIMING', 'true'))
METRICS_CACHE_STATS = str_to_bool(os.getenv('STARPUNK_METRICS_CACHE_STATS', 'true'))
METRICS_FORMAT_USAGE = str_to_bool(os.getenv('STARPUNK_METRICS_FORMAT_USAGE', 'true'))
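
The options above rely on a `str_to_bool` helper that this guide does not define. If the project does not already provide one, a minimal version might look like this. The set of accepted truthy strings is an assumption.

```python
import os

_TRUE_VALUES = {"1", "true", "yes", "on"}


def str_to_bool(value):
    """Interpret common truthy strings; anything else is False."""
    return str(value).strip().lower() in _TRUE_VALUES


# Falls back to 'true' when the variable is unset, as in the options above.
FEED_RSS_ENABLED = str_to_bool(os.getenv("STARPUNK_FEED_RSS_ENABLED", "true"))
```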

Documentation Updates

User Documentation

  1. Feed Formats Guide

    • How to access each format
    • Which readers support what
    • Format comparison
  2. Configuration Guide

    • New environment variables
    • Performance tuning
    • Cache settings

API Documentation

  1. Feed Endpoints

    • /feed.xml - RSS feed
    • /feed.atom - ATOM feed
    • /feed.json - JSON feed
    • /feeds.opml - OPML export
  2. Content Negotiation

    • Accept header usage
    • Format precedence
    • Default behavior

Deployment Checklist

Pre-deployment

  • All tests passing
  • Metrics instrumentation verified
  • Feed formats validated
  • Cache performance tested
  • Documentation updated

Deployment Steps

  1. Backup database
  2. Update configuration
  3. Deploy new code
  4. Run migrations (none for v1.1.2)
  5. Clear feed cache
  6. Test all feed formats
  7. Verify metrics collection

Post-deployment

  • Monitor memory usage
  • Check feed generation times
  • Verify cache hit rates
  • Test with feed readers
  • Review error logs

Rollback Plan

If issues arise:

  1. Immediate Rollback

    git checkout v1.1.1
    supervisorctl restart starpunk
    
  2. Cache Cleanup

    redis-cli FLUSHDB  # If using Redis
    rm -rf /tmp/starpunk_cache/*  # If file-based
    
  3. Configuration Rollback

    cp config.backup.ini config.ini
    

Success Metrics

Performance Targets

  • Feed generation <100ms (50 items)
  • Cache hit rate >80%
  • Memory overhead <10MB
  • Zero performance regression

Compatibility Targets

  • 10+ feed readers tested
  • All validators passing
  • No breaking changes
  • Backward compatibility maintained

Timeline

Week 1

  • Phase 1: Metrics instrumentation (4-6 hours)
  • Testing and validation

Week 2

  • Phase 2: Feed formats (6-8 hours)
  • Integration testing

Week 3

  • Phase 3: Enhancements (4 hours)
  • Final testing and documentation
  • Deployment

Total estimated time: 14-18 hours of focused development