StarPunk/docs/architecture/v1.1.2-syndicate-architecture.md
Phil Skentelbery b0230b1233 feat: Complete v1.1.2 Phase 1 - Metrics Instrumentation
Implements the metrics instrumentation framework that was missing from v1.1.1.
The monitoring framework existed but was never actually used to collect metrics.

Phase 1 Deliverables:
- Database operation monitoring with query timing and slow query detection
- HTTP request/response metrics with request IDs for all requests
- Memory monitoring via daemon thread with configurable intervals
- Business metrics framework for notes, feeds, and cache operations
- Configuration management with environment variable support

Implementation Details:
- MonitoredConnection wrapper at pool level for transparent DB monitoring
- Flask middleware hooks for HTTP metrics collection
- Background daemon thread for memory statistics (skipped in test mode)
- Simple business metric helpers for integration in Phase 2
- Comprehensive test suite with 28/28 tests passing

Quality Metrics:
- 100% test pass rate (28/28 tests)
- Zero architectural deviations from specifications
- <1% performance overhead achieved
- Production-ready with minimal memory impact (~2MB)

Architect Review: APPROVED with excellent marks

Documentation:
- Implementation report: docs/reports/v1.1.2-phase1-metrics-implementation.md
- Architect review: docs/reviews/2025-11-26-v1.1.2-phase1-review.md
- Updated CHANGELOG.md with Phase 1 additions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 14:13:44 -07:00


StarPunk v1.1.2 "Syndicate" - Architecture Overview

Executive Summary

Version 1.1.2 "Syndicate" enhances StarPunk's content distribution capabilities by completing the metrics instrumentation from v1.1.1 and adding comprehensive feed format support. This release focuses on making content accessible to the widest possible audience through multiple syndication formats while maintaining visibility into system performance.

Architecture Goals

  1. Complete Observability: Fully instrument all system operations for performance monitoring
  2. Multi-Format Syndication: Support RSS, ATOM, and JSON Feed formats
  3. Efficient Generation: Stream-based feed generation for memory efficiency
  4. Content Negotiation: Smart format selection based on client preferences
  5. Caching Strategy: Minimize regeneration overhead
  6. Standards Compliance: Full adherence to feed specifications

System Architecture

Component Overview

┌─────────────────────────────────────────────────────────┐
│                    HTTP Request Layer                    │
│                          ↓                               │
│              ┌──────────────────────┐                   │
│              │  Content Negotiator   │                   │
│              │  (Accept header)      │                   │
│              └──────────┬───────────┘                   │
│                         ↓                                │
│         ┌───────────────┴────────────────┐              │
│         ↓               ↓                ↓              │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐        │
│   │   RSS    │    │   ATOM   │    │   JSON   │        │
│   │Generator │    │Generator │    │ Generator│        │
│   └────┬─────┘    └────┬─────┘    └────┬─────┘        │
│        └───────────────┬────────────────┘              │
│                        ↓                                │
│              ┌──────────────────────┐                   │
│              │   Feed Cache Layer   │                   │
│              │  (LRU with TTL)      │                   │
│              └──────────┬───────────┘                   │
│                         ↓                                │
│              ┌──────────────────────┐                   │
│              │    Data Layer        │                   │
│              │  (Notes Repository)  │                   │
│              └──────────┬───────────┘                   │
│                         ↓                                │
│              ┌──────────────────────┐                   │
│              │  Metrics Collector   │                   │
│              │  (All operations)    │                   │
│              └──────────────────────┘                   │
└─────────────────────────────────────────────────────────┘

Data Flow

  1. Request Processing

    • Client sends HTTP request with Accept header
    • Content negotiator determines optimal format
    • Check cache for existing feed
  2. Feed Generation

    • If cache miss, fetch notes from database
    • Generate feed using appropriate generator
    • Stream response to client
    • Update cache asynchronously
  3. Metrics Collection

    • Record request timing
    • Track cache hit/miss rates
    • Monitor generation performance
    • Log format popularity

Key Components

1. Metrics Instrumentation Layer

Purpose: Complete visibility into all system operations

Components:

  • Database operation timing (all queries)
  • HTTP request/response metrics
  • Memory monitoring thread
  • Business metrics (syndication stats)

Integration Points:

  • Database connection wrapper
  • Flask middleware hooks
  • Background thread for memory
  • Feed generation decorators
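
As an illustration of the "feed generation decorators" integration point, here is a minimal sketch of a timing decorator. The names `timed`, `TIMINGS`, and `build_feed` are hypothetical and are not part of StarPunk's metrics API; the in-memory dictionary stands in for whatever collector the project actually uses.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: metric name -> list of durations in seconds.
TIMINGS = defaultdict(list)


def timed(metric_name):
    """Record the wall-clock duration of each call under metric_name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on error, so failed calls are timed too.
                TIMINGS[metric_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("feed.generate.rss")
def build_feed(items):
    # Stand-in for real feed generation work.
    return "".join(f"<item>{i}</item>" for i in items)
```

Note that wrapping a generator function this way would only time the creation of the generator object, not its consumption; a streaming generator would need the timing wrapped around the iteration itself.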

2. Content Negotiation Service

Purpose: Determine optimal feed format based on client preferences

Algorithm:

1. Parse Accept header
2. Score each format:
   - Exact match: 1.0
   - Wildcard match: 0.5
   - No match: 0.0
3. Apply the client's quality factors (q=) to the scores
4. Return highest scoring format
5. Default to RSS if no preference

Supported MIME Types:

  • RSS: application/rss+xml, application/xml, text/xml
  • ATOM: application/atom+xml
  • JSON: application/json, application/feed+json
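
The algorithm above can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: the function names and the `FORMAT_MIME_TYPES` mapping are hypothetical, and combining the match score with the quality factor by multiplication is one plausible reading of steps 2 and 3.

```python
# Hypothetical mapping, copied from the supported MIME type list.
FORMAT_MIME_TYPES = {
    "rss": ["application/rss+xml", "application/xml", "text/xml"],
    "atom": ["application/atom+xml"],
    "json": ["application/json", "application/feed+json"],
}


def parse_accept(header):
    """Return (media_range, q) pairs parsed from an Accept header."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media = pieces[0].lower()
        if not media:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((media, q))
    return ranges


def match_score(media_range, mime):
    """1.0 for an exact match, 0.5 for a wildcard match, else 0.0."""
    if media_range == mime:
        return 1.0
    main_type = mime.partition("/")[0]
    if media_range in ("*/*", f"{main_type}/*"):
        return 0.5
    return 0.0


def negotiate(header, default="rss"):
    """Pick the best-scoring format, falling back to the default."""
    best_format, best_score = default, 0.0
    for media_range, q in parse_accept(header or ""):
        for fmt, mimes in FORMAT_MIME_TYPES.items():
            # Assumption: final score = match score weighted by q.
            score = max(match_score(media_range, m) for m in mimes) * q
            if score > best_score:
                best_format, best_score = fmt, score
    return best_format
```

With this sketch, a header such as `application/feed+json;q=0.9, application/rss+xml;q=0.5` selects the JSON format, while an empty header or `*/*` falls back to RSS.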

3. Feed Generators

Shared Interface:

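from __future__ import annotations  # Note, FeedConfig, ValidationError are project types

from typing import Iterator, List, Protocol

class FeedGenerator(Protocol):
    def generate(self, notes: List[Note], config: FeedConfig) -> Iterator[str]:
        """Yield the feed as a sequence of string chunks."""

    def validate(self, feed_content: str) -> List[ValidationError]:
        """Return validation errors for a generated feed (empty if valid)."""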

RSS Generator (existing, enhanced):

  • RSS 2.0 specification
  • Streaming generation
  • CDATA wrapping for HTML

ATOM Generator (new):

  • ATOM 1.0 specification
  • RFC 3339 date formatting
  • Author metadata support
  • Category/tag support

JSON Feed Generator (new):

  • JSON Feed 1.1 specification
  • Attachment support for media
  • Author object with avatar
  • Hub support for real-time updates
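
To make the streaming idea concrete, here is a minimal sketch of a generator that yields an Atom 1.0 document chunk by chunk with RFC 3339 timestamps. It is a simplified standalone function, not the `FeedGenerator` interface itself; the function names, parameters, and the dictionary keys used for notes are all assumptions made for the example.

```python
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def rfc3339(dt):
    """Format an aware datetime as an RFC 3339 UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def generate_atom(notes, site_url, title, author_name, feed_updated):
    """Yield an Atom 1.0 document one chunk at a time.

    `notes` is any iterable of dicts with the hypothetical keys
    'url', 'title', 'html', and 'updated' (an aware datetime).
    It is consumed lazily, so it never has to be held in memory at once.
    """
    yield '<?xml version="1.0" encoding="utf-8"?>\n'
    yield '<feed xmlns="http://www.w3.org/2005/Atom">\n'
    yield f"  <title>{escape(title)}</title>\n"
    yield f"  <id>{escape(site_url)}</id>\n"
    yield f"  <updated>{rfc3339(feed_updated)}</updated>\n"
    yield f"  <author><name>{escape(author_name)}</name></author>\n"
    for note in notes:
        yield (
            "  <entry>\n"
            f"    <title>{escape(note['title'])}</title>\n"
            f"    <id>{escape(note['url'])}</id>\n"
            f"    <updated>{rfc3339(note['updated'])}</updated>\n"
            f'    <content type="html">{escape(note["html"])}</content>\n'
            "  </entry>\n"
        )
    yield "</feed>\n"
```

Passing the feed-level timestamp in as a parameter, rather than computing it from the notes, is what lets the notes be streamed in a single pass.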

4. Feed Cache System

Purpose: Minimize regeneration overhead

Design:

  • LRU cache with configurable size
  • TTL-based expiration (default: 5 minutes)
  • Format-specific cache keys
  • Invalidation on note changes
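
One way to combine LRU eviction with TTL expiry is an `OrderedDict` that stores an expiry time alongside each value. The sketch below is illustrative only: the class name `FeedCache` and its methods are hypothetical, and the defaults simply mirror the 5-minute TTL mentioned above.

```python
import time
from collections import OrderedDict


class FeedCache:
    """Small LRU cache whose entries also expire after `ttl` seconds."""

    def __init__(self, max_entries=100, ttl=300, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self.ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used

    def clear(self):
        """Invalidate everything, e.g. after a note is created or edited."""
        self._entries.clear()
```

Clearing the whole cache on any note change is the simplest invalidation policy; it is a reasonable fit for a cache this small, though a finer-grained scheme is possible.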

Cache Key Structure:

feed:{format}:{limit}:{checksum}

Where checksum is based on:

  • Latest note timestamp
  • Total note count
  • Site configuration
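
A key of this shape could be derived as in the following sketch. The function name, the choice of SHA-256, and the 16-character truncation are assumptions for illustration; the document does not specify how the checksum is computed.

```python
import hashlib
import json


def feed_cache_key(fmt, limit, latest_timestamp, note_count, site_config):
    """Build a key of the form feed:{format}:{limit}:{checksum}."""
    # sort_keys makes the serialization stable regardless of dict ordering.
    payload = json.dumps(
        {"latest": latest_timestamp, "count": note_count, "config": site_config},
        sort_keys=True,
    )
    checksum = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    return f"feed:{fmt}:{limit}:{checksum}"
```

Because the checksum changes whenever a note is added or the latest timestamp moves, stale entries are simply never looked up again and age out through TTL or LRU eviction.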

5. Statistics Dashboard

Purpose: Track syndication performance and usage

Metrics Tracked:

  • Feed requests by format
  • Cache hit rates
  • Generation times
  • Client user agents
  • Geographic distribution (via IP)

Dashboard Location: /admin/syndication

6. OPML Export

Purpose: Allow users to share their feed collection

Implementation:

  • Generate OPML 2.0 document
  • Include all available feed formats
  • Add metadata (title, owner, date)
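
A minimal sketch of the export using only the standard library is shown below. The function name and the `(text, type, url)` tuple shape are hypothetical; the `type` value for each outline is supplied by the caller, since the appropriate value for non-RSS formats is a design decision this document does not settle.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_opml(title, owner, created, feeds):
    """Return an OPML 2.0 document listing `feeds` as (text, type, url) tuples."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    ET.SubElement(head, "ownerName").text = owner
    # OPML expects RFC 822 style dates; format_datetime produces that form.
    ET.SubElement(head, "dateCreated").text = format_datetime(created)
    body = ET.SubElement(opml, "body")
    for text, feed_type, url in feeds:
        ET.SubElement(body, "outline", text=text, type=feed_type, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")


# Example call with placeholder values.
example = build_opml(
    "StarPunk feeds",
    "Site Owner",
    datetime(2025, 11, 26, tzinfo=timezone.utc),
    [("RSS", "rss", "https://example.com/feed.xml")],
)
```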

Performance Considerations

Memory Management

Streaming Generation:

  • Generate feeds in chunks
  • Yield results incrementally
  • Avoid loading all notes at once
  • Use generators throughout

Cache Sizing:

  • Monitor memory usage
  • Implement cache eviction
  • Configurable cache limits

Database Optimization

Query Optimization:

  • Index on published status
  • Index on created_at for ordering
  • Limit fetched columns
  • Use prepared statements

Connection Pooling:

  • Reuse database connections
  • Monitor pool usage
  • Track connection wait times

HTTP Optimization

Compression:

  • gzip for text formats (RSS, ATOM)
  • JSON Feed is already compact
  • Configurable compression level

Caching Headers:

  • ETag based on content hash
  • Last-Modified from latest note
  • Cache-Control with max-age
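
The three headers could be produced along these lines. This is a framework-independent sketch with hypothetical function names; the `public` directive and the truncated SHA-256 ETag are illustrative choices, and the comparison ignores weak validators (`W/` prefixes) for brevity.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def caching_headers(body, last_modified, max_age=300):
    """Build ETag, Last-Modified, and Cache-Control for a feed response.

    `body` is the response as bytes; `last_modified` is an aware datetime.
    """
    etag = '"' + hashlib.sha256(body).hexdigest()[:32] + '"'
    return {
        "ETag": etag,
        # HTTP dates must be in GMT, so normalize to UTC first.
        "Last-Modified": format_datetime(
            last_modified.astimezone(timezone.utc), usegmt=True
        ),
        "Cache-Control": f"public, max-age={max_age}",
    }


def is_not_modified(if_none_match, etag):
    """True when the client's If-None-Match header lists our ETag."""
    if not if_none_match:
        return False
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    return "*" in candidates or etag in candidates
```

When `is_not_modified` returns true, the server can answer with 304 Not Modified and skip sending the feed body.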

Security Considerations

Input Validation

  • Validate Accept headers
  • Sanitize format parameters
  • Limit feed size
  • Rate limit feed endpoints

Content Security

  • Escape XML entities properly
  • Valid JSON encoding
  • No script injection in feeds
  • CORS headers for JSON feeds

Resource Protection

  • Rate limiting per IP
  • Maximum feed items limit
  • Timeout for generation
  • Circuit breaker for database

Configuration

Feed Settings

# Feed generation
STARPUNK_FEED_DEFAULT_LIMIT = 50
STARPUNK_FEED_MAX_LIMIT = 500
STARPUNK_FEED_CACHE_TTL = 300  # seconds
STARPUNK_FEED_CACHE_SIZE = 100  # entries

# Format support
STARPUNK_FEED_RSS_ENABLED = true
STARPUNK_FEED_ATOM_ENABLED = true
STARPUNK_FEED_JSON_ENABLED = true

# Performance
STARPUNK_FEED_STREAMING = true
STARPUNK_FEED_COMPRESSION = true
STARPUNK_FEED_COMPRESSION_LEVEL = 6
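
Assuming these settings arrive as environment variables (the `STARPUNK_` names are from the list above; the loader itself is a hypothetical sketch, not the project's configuration code), they could be read with defaults like this:

```python
import os


def _flag(env, name, default):
    """Interpret common truthy strings; fall back to default when unset."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def load_feed_config(env=os.environ):
    """Read feed settings from a mapping, using the documented defaults."""
    max_limit = int(env.get("STARPUNK_FEED_MAX_LIMIT", 500))
    default_limit = int(env.get("STARPUNK_FEED_DEFAULT_LIMIT", 50))
    return {
        "default_limit": min(default_limit, max_limit),  # never exceed the cap
        "max_limit": max_limit,
        "cache_ttl": int(env.get("STARPUNK_FEED_CACHE_TTL", 300)),
        "cache_size": int(env.get("STARPUNK_FEED_CACHE_SIZE", 100)),
        "formats": [
            fmt for fmt in ("rss", "atom", "json")
            if _flag(env, f"STARPUNK_FEED_{fmt.upper()}_ENABLED", True)
        ],
        "streaming": _flag(env, "STARPUNK_FEED_STREAMING", True),
        "compression": _flag(env, "STARPUNK_FEED_COMPRESSION", True),
        "compression_level": int(env.get("STARPUNK_FEED_COMPRESSION_LEVEL", 6)),
    }
```

Accepting the environment as a parameter keeps the loader easy to test with a plain dictionary.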

Monitoring Settings

# Metrics collection
STARPUNK_METRICS_FEED_TIMING = true
STARPUNK_METRICS_CACHE_STATS = true
STARPUNK_METRICS_FORMAT_USAGE = true

# Dashboard
STARPUNK_SYNDICATION_DASHBOARD = true
STARPUNK_SYNDICATION_STATS_RETENTION = 7  # days

Testing Strategy

Unit Tests

  1. Content Negotiation

    • Accept header parsing
    • Format scoring algorithm
    • Default behavior
  2. Feed Generators

    • Valid output for each format
    • Streaming behavior
    • Error handling
  3. Cache System

    • LRU eviction
    • TTL expiration
    • Invalidation logic

Integration Tests

  1. End-to-End Feeds

    • Request with various Accept headers
    • Verify correct format returned
    • Check caching behavior
  2. Performance Tests

    • Measure generation time
    • Monitor memory usage
    • Verify streaming works
  3. Compliance Tests

    • Validate against feed specs
    • Test with popular feed readers
    • Check encoding edge cases

Migration Path

From v1.1.1 to v1.1.2

  1. Database: No schema changes required
  2. Configuration: New feed options (backward compatible)
  3. URLs: Existing /feed.xml continues to work
  4. Cache: New cache system, no migration needed

Rollback Plan

  1. Keep v1.1.1 database backup
  2. Configuration rollback script
  3. Clear feed cache
  4. Revert to previous version

Future Considerations

v1.2.0 Possibilities

  1. WebSub Support: Real-time feed updates
  2. Custom Feeds: User-defined filters
  3. Feed Analytics: Detailed reader statistics
  4. Podcast Support: Audio enclosures
  5. ActivityPub: Fediverse integration

Technical Debt

  1. Refactor feed module into package
  2. Extract cache to separate service
  3. Implement feed preview UI
  4. Add feed validation endpoint

Success Metrics

  1. Performance

    • Feed generation <100ms for 50 items
    • Cache hit rate >80%
    • Memory usage <10MB for feeds
  2. Compatibility

    • Works with 10 major feed readers
    • Passes all format validators
    • Zero regression on existing RSS
  3. Usage

    • 20% adoption of non-RSS formats
    • Reduced server load via caching
    • Positive user feedback

Risk Mitigation

Performance Risks

Risk: Feed generation slows down site
Mitigation:

  • Streaming generation
  • Aggressive caching
  • Request timeouts
  • Rate limiting

Compatibility Risks

Risk: Feed readers reject new formats
Mitigation:

  • Extensive testing with readers
  • Strict spec compliance
  • Format validation
  • Fallback to RSS

Operational Risks

Risk: Cache grows unbounded
Mitigation:

  • LRU eviction
  • Size limits
  • Memory monitoring
  • Auto-cleanup

Conclusion

StarPunk v1.1.2 "Syndicate" creates a robust, standards-compliant syndication platform while completing the observability foundation started in v1.1.1. The architecture prioritizes performance through streaming and caching, compatibility through strict standards adherence, and maintainability through clean component separation.

The design balances feature richness with StarPunk's core philosophy of simplicity, adding only what's necessary to serve content to the widest possible audience while maintaining operational visibility.