# StarPunk v1.1.2 "Syndicate" - Architecture Overview

## Executive Summary

Version 1.1.2 "Syndicate" enhances StarPunk's content distribution capabilities by completing the metrics instrumentation from v1.1.1 and adding support for three feed formats: RSS, ATOM, and JSON Feed. This release focuses on making content accessible to the widest possible audience through multiple syndication formats while maintaining visibility into system performance.

## Architecture Goals

1. **Complete Observability**: Fully instrument all system operations for performance monitoring
2. **Multi-Format Syndication**: Support RSS, ATOM, and JSON Feed formats
3. **Efficient Generation**: Stream-based feed generation for memory efficiency
4. **Content Negotiation**: Select the feed format from the client's Accept header
5. **Caching Strategy**: Minimize regeneration overhead
6. **Standards Compliance**: Full adherence to feed specifications

## System Architecture

### Component Overview

```
┌─────────────────────────────────────────────────────────┐
│                    HTTP Request Layer                    │
│                          ↓                               │
│              ┌──────────────────────┐                   │
│              │  Content Negotiator   │                   │
│              │  (Accept header)      │                   │
│              └──────────┬───────────┘                   │
│                         ↓                                │
│         ┌───────────────┴────────────────┐              │
│         ↓               ↓                ↓              │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐        │
│   │   RSS    │    │   ATOM   │    │   JSON   │        │
│   │Generator │    │Generator │    │ Generator│        │
│   └────┬─────┘    └────┬─────┘    └────┬─────┘        │
│        └───────────────┬────────────────┘              │
│                        ↓                                │
│              ┌──────────────────────┐                   │
│              │   Feed Cache Layer   │                   │
│              │  (LRU with TTL)      │                   │
│              └──────────┬───────────┘                   │
│                         ↓                                │
│              ┌──────────────────────┐                   │
│              │    Data Layer        │                   │
│              │  (Notes Repository)  │                   │
│              └──────────┬───────────┘                   │
│                         ↓                                │
│              ┌──────────────────────┐                   │
│              │  Metrics Collector   │                   │
│              │  (All operations)    │                   │
│              └──────────────────────┘                   │
└─────────────────────────────────────────────────────────┘
```

### Data Flow

1. **Request Processing**
   - Client sends HTTP request with Accept header
   - Content negotiator determines optimal format
   - Check cache for existing feed

2. **Feed Generation**
   - If cache miss, fetch notes from database
   - Generate feed using appropriate generator
   - Stream response to client
   - Update cache asynchronously

3. **Metrics Collection**
   - Record request timing
   - Track cache hit/miss rates
   - Monitor generation performance
   - Log format popularity
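
The flow above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not StarPunk's implementation: `serve_feed`, the in-memory `cache` dict, and the `stats` counter are hypothetical names, and the cache is filled inline once streaming finishes rather than asynchronously.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical in-memory stand-ins for the cache layer and metrics collector.
cache: Dict[str, str] = {}
stats: Counter = Counter()


def serve_feed(
    fmt: str,
    fetch_notes: Callable[[], List[str]],
    generate: Callable[[List[str]], Iterable[str]],
) -> Iterator[str]:
    """Yield feed chunks for `fmt`, consulting the cache before generating."""
    stats[f"requests.{fmt}"] += 1
    if fmt in cache:
        stats["cache.hit"] += 1
        yield cache[fmt]
        return

    stats["cache.miss"] += 1
    chunks = []
    for chunk in generate(fetch_notes()):
        chunks.append(chunk)  # keep a copy so the cache can be filled afterwards
        yield chunk           # hand each chunk to the client as it is produced
    cache[fmt] = "".join(chunks)
```

Note that the cache is only filled when the consumer reads the stream to the end, so a client that disconnects early leaves the entry empty.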

## Key Components

### 1. Metrics Instrumentation Layer

**Purpose**: Complete visibility into all system operations

**Components**:
- Database operation timing (all queries)
- HTTP request/response metrics
- Memory monitoring thread
- Business metrics (syndication stats)

**Integration Points**:
- Database connection wrapper
- Flask middleware hooks
- Background thread for memory monitoring
- Feed generation decorators
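
As one illustration of the decorator integration point, a timing decorator might look like the sketch below. The `timed` decorator, the `timings` store, and the metric name `feed.generate.rss` are hypothetical; the actual metrics API from v1.1.1 is not shown in this document.

```python
import functools
import time
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical in-process store: metric name -> list of durations in seconds.
timings: Dict[str, List[float]] = defaultdict(list)


def timed(metric: str) -> Callable:
    """Record the wall-clock duration of each call under `metric`."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the call raises, so failures are visible too.
                timings[metric].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("feed.generate.rss")
def build_rss(items: List[str]) -> str:
    """Toy stand-in for a feed builder, used only to exercise the decorator."""
    return "".join(f"<item>{i}</item>" for i in items)
```

A decorator of this shape measures only the call itself. If the wrapped function is a generator, the time spent iterating over it is not captured, so a streaming generator would need timing around the iteration instead.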

### 2. Content Negotiation Service

**Purpose**: Determine optimal feed format based on client preferences

**Algorithm**:
```
1. Parse Accept header
2. Score each format:
   - Exact match: 1.0
   - Wildcard match: 0.5
   - No match: 0.0
3. Multiply each score by the media range's quality factor (q=, default 1.0)
4. Return the highest-scoring format
5. Default to RSS if the header is missing or no format matches
```

**Supported MIME Types**:
- RSS: `application/rss+xml`, `application/xml`, `text/xml`
- ATOM: `application/atom+xml`
- JSON: `application/json`, `application/feed+json`
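
The scoring algorithm and MIME type table can be sketched as follows. This is a minimal sketch, assuming that a format's score is its best match score multiplied by the q value and that ties go to the earlier format in the table; the function names `parse_accept`, `match_score`, and `negotiate` are hypothetical.

```python
from typing import Dict, List, Tuple

# MIME types per format, taken from the list above; dict order sets tie-break priority.
FORMAT_TYPES: Dict[str, List[str]] = {
    "rss": ["application/rss+xml", "application/xml", "text/xml"],
    "atom": ["application/atom+xml"],
    "json": ["application/feed+json", "application/json"],
}


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Split an Accept header into (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = max(0.0, min(1.0, float(value)))
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((media, q))
    return ranges


def match_score(media_range: str, mime: str) -> float:
    """Return 1.0 for an exact match, 0.5 for a wildcard match, else 0.0."""
    if media_range == mime:
        return 1.0
    main, _, _sub = mime.partition("/")
    if media_range in ("*/*", f"{main}/*"):
        return 0.5
    return 0.0


def negotiate(accept: str, default: str = "rss") -> str:
    """Pick the format with the highest (match score x q); fall back to default."""
    ranges = parse_accept(accept)
    best_format, best_score = default, 0.0
    for fmt, mimes in FORMAT_TYPES.items():
        score = max(
            (match_score(rng, mime) * q for rng, q in ranges for mime in mimes),
            default=0.0,
        )
        if score > best_score:  # strict '>' keeps the earlier format on ties
            best_format, best_score = fmt, score
    return best_format
```

For example, `negotiate("application/json;q=0.9, application/atom+xml;q=0.5")` selects `"json"`, while `negotiate("*/*")` falls through to `"rss"`. The sketch follows the algorithm as written here rather than the full precedence rules of HTTP content negotiation, under which a more specific media range overrides a wildcard.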

### 3. Feed Generators

**Shared Interface**:
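```python
from typing import Iterator, List, Protocol

# Note, FeedConfig, and ValidationError are assumed to be defined elsewhere.
class FeedGenerator(Protocol):
    def generate(self, notes: List["Note"], config: "FeedConfig") -> Iterator[str]:
        """Yield the feed as a sequence of text chunks."""
        ...

    def validate(self, feed_content: str) -> List["ValidationError"]:
        """Return validation errors for the feed (empty list if valid)."""
        ...
```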

**RSS Generator** (existing, enhanced):
- RSS 2.0 specification
- Streaming generation
- CDATA wrapping for HTML

**ATOM Generator** (new):
- ATOM 1.0 specification
- RFC 3339 date formatting
- Author metadata support
- Category/tag support
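
A streaming ATOM generator along these lines might look like the sketch below. It is illustrative only: the `NoteRow` tuple shape, `rfc3339`, and `generate_atom` are hypothetical names, and the feed-level `updated` value is passed in so that entries do not need to be scanned before streaming starts.

```python
from datetime import datetime, timezone
from typing import Iterable, Iterator, Sequence, Tuple
from xml.sax.saxutils import escape, quoteattr

# Hypothetical note shape for illustration: (permalink, title, updated, tags).
NoteRow = Tuple[str, str, datetime, Sequence[str]]


def rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def generate_atom(
    site_url: str,
    title: str,
    author: str,
    updated: datetime,
    notes: Iterable[NoteRow],
) -> Iterator[str]:
    """Yield an Atom 1.0 feed one chunk at a time (one chunk per entry)."""
    yield '<?xml version="1.0" encoding="utf-8"?>\n'
    yield '<feed xmlns="http://www.w3.org/2005/Atom">\n'
    yield f"  <id>{escape(site_url)}</id>\n"
    yield f"  <title>{escape(title)}</title>\n"
    yield f"  <updated>{rfc3339(updated)}</updated>\n"
    yield f"  <author><name>{escape(author)}</name></author>\n"
    for link, note_title, note_updated, tags in notes:
        entry = [
            "  <entry>\n",
            f"    <id>{escape(link)}</id>\n",
            f"    <title>{escape(note_title)}</title>\n",
            f"    <updated>{rfc3339(note_updated)}</updated>\n",
            f'    <link rel="alternate" href={quoteattr(link)}/>\n',
        ]
        # One <category> element per tag; quoteattr adds the surrounding quotes.
        entry += [f"    <category term={quoteattr(tag)}/>\n" for tag in tags]
        entry.append("  </entry>\n")
        yield "".join(entry)
    yield "</feed>\n"
```

Because `notes` can be any iterable, a database cursor could feed the loop directly, which keeps memory use roughly constant regardless of feed length.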

**JSON Feed Generator** (new):
- JSON Feed 1.1 specification
- Attachment support for media (`attachments`)
- Author objects with avatar (`authors`)
- Hub support for real-time update notifications (`hubs`)
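
A JSON Feed document can also be produced incrementally. The sketch below is one possible approach, assuming items arrive as ready-made dictionaries; `generate_json_feed` and its parameters are hypothetical names, while the field names (`version`, `authors`, `items`) come from the JSON Feed 1.1 specification.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def generate_json_feed(
    title: str,
    home_page_url: str,
    feed_url: str,
    author: Dict[str, str],
    items: Iterable[Dict[str, Any]],
) -> Iterator[str]:
    """Yield a JSON Feed 1.1 document chunk by chunk."""
    header = {
        "version": "https://jsonfeed.org/version/1.1",
        "title": title,
        "home_page_url": home_page_url,
        "feed_url": feed_url,
        "authors": [author],
    }
    # Emit the header object without its closing brace, then open "items".
    yield json.dumps(header, ensure_ascii=False)[:-1] + ', "items": ['
    for index, item in enumerate(items):
        prefix = ", " if index else ""  # comma before every item except the first
        yield prefix + json.dumps(item, ensure_ascii=False)
    yield "]}"
```

Joining the chunks gives a single valid JSON object, and each item is serialized independently, so only one item is held in memory at a time.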

### 4. Feed Cache System

**Purpose**: Minimize regeneration overhead

**Design**:
- LRU cache with configurable size
- TTL-based expiration (default: 5 minutes)
- Format-specific cache keys
- Invalidation on note changes
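
The combination of LRU eviction and TTL expiry can be sketched with an `OrderedDict`. This is a minimal, single-threaded sketch: the `FeedCache` class and its method names are hypothetical, and the defaults mirror the 100-entry and 300-second settings listed under Configuration.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple


class FeedCache:
    """Small LRU cache whose entries also expire after `ttl` seconds."""

    def __init__(
        self,
        max_entries: int = 100,
        ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._entries: "OrderedDict[str, Tuple[float, str]]" = OrderedDict()
        self._max_entries = max_entries
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]      # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used

    def invalidate(self, prefix: str = "feed:") -> None:
        """Drop every entry whose key starts with `prefix`, e.g. on note changes."""
        for key in [k for k in self._entries if k.startswith(prefix)]:
            del self._entries[key]
```

A multi-threaded server would need a lock around these methods; that is omitted here to keep the eviction logic visible.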

**Cache Key Structure**:
```
feed:{format}:{limit}:{checksum}
```

Where checksum is based on:
- Latest note timestamp
- Total note count
- Site configuration
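
One way to derive such a key is shown below. The choice of SHA-256, the 16-character truncation, and the function name `feed_cache_key` are assumptions for illustration; the document specifies only the key layout and the checksum inputs.

```python
import hashlib
from datetime import datetime
from typing import Mapping


def feed_cache_key(
    fmt: str,
    limit: int,
    latest_note_at: datetime,
    note_count: int,
    site_config: Mapping[str, str],
) -> str:
    """Build a `feed:{format}:{limit}:{checksum}` cache key."""
    # Sort config keys so the checksum does not depend on dict ordering.
    config_part = "&".join(f"{k}={site_config[k]}" for k in sorted(site_config))
    material = f"{latest_note_at.isoformat()}|{note_count}|{config_part}"
    checksum = hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]
    return f"feed:{fmt}:{limit}:{checksum}"
```

Because the checksum changes whenever a note is added or the site configuration changes, stale entries simply stop being requested and age out through TTL or LRU eviction.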

### 5. Statistics Dashboard

**Purpose**: Track syndication performance and usage

**Metrics Tracked**:
- Feed requests by format
- Cache hit rates
- Generation times
- Client user agents
- Geographic distribution (via IP)

**Dashboard Location**: `/admin/syndication`

### 6. OPML Export

**Purpose**: Let users export the site's feeds as a single subscription list

**Implementation**:
- Generate OPML 2.0 document
- Include all available feed formats
- Add metadata (title, owner, date)

## Performance Considerations

### Memory Management

**Streaming Generation**:
- Generate feeds in chunks
- Yield results incrementally
- Avoid loading all notes at once
- Use generators throughout

**Cache Sizing**:
- Monitor memory usage
- Implement cache eviction
- Configurable cache limits

### Database Optimization

**Query Optimization**:
- Index on published status
- Index on created_at for ordering
- Limit fetched columns
- Use prepared statements

**Connection Pooling**:
- Reuse database connections
- Monitor pool usage
- Track connection wait times

### HTTP Optimization

**Compression**:
- gzip for text formats (RSS, ATOM)
- JSON Feed is already compact, so compression matters less
- Configurable compression level

**Caching Headers**:
- ETag based on content hash
- Last-Modified from latest note
- Cache-Control with max-age
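
The three caching headers can be computed as in the sketch below. The helper names, the `public` directive, the SHA-256 based ETag, and the 300-second `max_age` default are assumptions; only the header names and their sources come from the list above.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Optional


def caching_headers(body: bytes, latest_note_at: datetime, max_age: int = 300) -> Dict[str, str]:
    """Build ETag, Last-Modified, and Cache-Control values for a feed response."""
    return {
        # Strong ETag: a quoted hash of the exact response bytes.
        "ETag": '"' + hashlib.sha256(body).hexdigest()[:32] + '"',
        # HTTP dates are always expressed in GMT.
        "Last-Modified": format_datetime(latest_note_at.astimezone(timezone.utc), usegmt=True),
        "Cache-Control": f"public, max-age={max_age}",
    }


def is_not_modified(if_none_match: Optional[str], etag: str) -> bool:
    """True when the client's If-None-Match header already lists `etag`."""
    if not if_none_match:
        return False
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    stripped = [c[2:] if c.startswith("W/") else c for c in candidates]
    return "*" in stripped or etag in stripped
```

When `is_not_modified` returns true, the server can answer `304 Not Modified` with no body, which avoids both regeneration and transfer.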

## Security Considerations

### Input Validation

- Validate Accept headers
- Sanitize format parameters
- Limit feed size
- Rate limit feed endpoints

### Content Security

- Escape XML entities properly
- Ensure valid JSON encoding
- Prevent script injection in feeds
- Set CORS headers for JSON feeds
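
For the XML formats, escaping and CDATA wrapping can be handled with small helpers like the ones sketched here. The names `xml_text` and `cdata` are hypothetical. The sketch also strips control characters that XML 1.0 forbids, which goes slightly beyond the list above.

```python
import re
from xml.sax.saxutils import escape

# Control characters that XML 1.0 does not allow (tab, LF, and CR are allowed).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def xml_text(value: str) -> str:
    """Escape &, <, and > for element text and drop forbidden control characters."""
    return escape(_INVALID_XML_CHARS.sub("", value))


def cdata(html: str) -> str:
    """Wrap HTML in CDATA, splitting any ']]>' so it cannot end the section early."""
    return "<![CDATA[" + html.replace("]]>", "]]]]><![CDATA[>") + "]]>"
```

The `]]>` split matters because note content is user-controlled: without it, a note containing that sequence could close the CDATA section and inject markup into the feed.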

### Resource Protection

- Rate limiting per IP
- Maximum feed items limit
- Timeout for generation
- Circuit breaker for database
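
Per-IP rate limiting can be sketched with a sliding window. The `RateLimiter` class and the defaults of 60 requests per 60 seconds are assumptions chosen for illustration; this document does not specify limits.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class RateLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(
        self,
        limit: int = 60,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        while hits and now - hits[0] >= self._window:
            hits.popleft()  # forget requests that have left the window
        if len(hits) >= self._limit:
            return False    # over the limit: caller should respond with 429
        hits.append(now)
        return True
```

This version keeps one deque per IP forever, so a production deployment would also need to prune idle clients or bound the number of tracked addresses.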

## Configuration

### Feed Settings

```ini
# Feed generation
STARPUNK_FEED_DEFAULT_LIMIT = 50
STARPUNK_FEED_MAX_LIMIT = 500
STARPUNK_FEED_CACHE_TTL = 300  # seconds
STARPUNK_FEED_CACHE_SIZE = 100  # entries

# Format support
STARPUNK_FEED_RSS_ENABLED = true
STARPUNK_FEED_ATOM_ENABLED = true
STARPUNK_FEED_JSON_ENABLED = true

# Performance
STARPUNK_FEED_STREAMING = true
STARPUNK_FEED_COMPRESSION = true
STARPUNK_FEED_COMPRESSION_LEVEL = 6
```
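
The default and maximum limits imply a clamping step when a client supplies its own item count. A minimal sketch follows, assuming settings are available as a string mapping (for example, environment variables); `resolve_limit` and `as_bool` are hypothetical helper names.

```python
from typing import Mapping, Optional


def as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as 'true' or '1'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def resolve_limit(requested: Optional[str], settings: Mapping[str, str]) -> int:
    """Clamp a client-supplied item limit to the configured range."""
    default = int(settings.get("STARPUNK_FEED_DEFAULT_LIMIT", "50"))
    maximum = int(settings.get("STARPUNK_FEED_MAX_LIMIT", "500"))
    try:
        limit = int(requested) if requested is not None else default
    except ValueError:
        limit = default  # non-numeric input falls back to the default
    return max(1, min(limit, maximum))
```

Clamping rather than rejecting keeps feed URLs working for clients that request too many items, while still bounding generation cost.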

### Monitoring Settings

```ini
# Metrics collection
STARPUNK_METRICS_FEED_TIMING = true
STARPUNK_METRICS_CACHE_STATS = true
STARPUNK_METRICS_FORMAT_USAGE = true

# Dashboard
STARPUNK_SYNDICATION_DASHBOARD = true
STARPUNK_SYNDICATION_STATS_RETENTION = 7  # days
```

## Testing Strategy

### Unit Tests

1. **Content Negotiation**
   - Accept header parsing
   - Format scoring algorithm
   - Default behavior

2. **Feed Generators**
   - Valid output for each format
   - Streaming behavior
   - Error handling

3. **Cache System**
   - LRU eviction
   - TTL expiration
   - Invalidation logic

### Integration Tests

1. **End-to-End Feeds**
   - Request with various Accept headers
   - Verify correct format returned
   - Check caching behavior

2. **Performance Tests**
   - Measure generation time
   - Monitor memory usage
   - Verify streaming works

3. **Compliance Tests**
   - Validate against feed specs
   - Test with popular feed readers
   - Check encoding edge cases

## Migration Path

### From v1.1.1 to v1.1.2

1. **Database**: No schema changes required
2. **Configuration**: New feed options (backward compatible)
3. **URLs**: Existing `/feed.xml` continues to work
4. **Cache**: New cache system, no migration needed

### Rollback Plan

1. Keep a v1.1.1 database backup
2. Run the configuration rollback script
3. Clear feed cache
4. Revert to previous version

## Future Considerations

### v1.2.0 Possibilities

1. **WebSub Support**: Real-time feed updates
2. **Custom Feeds**: User-defined filters
3. **Feed Analytics**: Detailed reader statistics
4. **Podcast Support**: Audio enclosures
5. **ActivityPub**: Fediverse integration

### Technical Debt

1. Refactor feed module into package
2. Extract cache to separate service
3. Implement feed preview UI
4. Add feed validation endpoint

## Success Metrics

1. **Performance**
   - Feed generation < 100 ms for 50 items
   - Cache hit rate > 80%
   - Memory usage < 10 MB for feeds

2. **Compatibility**
   - Works with 10 major feed readers
   - Passes all format validators
   - Zero regression on existing RSS

3. **Usage**
   - 20% adoption of non-RSS formats
   - Reduced server load via caching
   - Positive user feedback

## Risk Mitigation

### Performance Risks

**Risk**: Feed generation slows down the site

**Mitigation**:
- Streaming generation
- Aggressive caching
- Request timeouts
- Rate limiting

### Compatibility Risks

**Risk**: Feed readers reject new formats

**Mitigation**:
- Extensive testing with readers
- Strict spec compliance
- Format validation
- Fallback to RSS

### Operational Risks

**Risk**: Cache grows unbounded

**Mitigation**:
- LRU eviction
- Size limits
- Memory monitoring
- Auto-cleanup

## Conclusion

StarPunk v1.1.2 "Syndicate" creates a robust, standards-compliant syndication platform while completing the observability foundation started in v1.1.1. The architecture prioritizes performance through streaming and caching, compatibility through strict standards adherence, and maintainability through clean component separation.

The design balances feature richness with StarPunk's core philosophy of simplicity, adding only what's necessary to serve content to the widest possible audience while maintaining operational visibility.