Phil Skentelbery b0230b1233 feat: Complete v1.1.2 Phase 1 - Metrics Instrumentation

2025-11-26 14:13:44 -07:00

JSON Feed Specification - v1.1.2

Overview

This specification defines the implementation of JSON Feed 1.1 format for StarPunk, providing a modern, developer-friendly syndication format that's easier to parse than XML-based feeds.

Requirements

Functional Requirements

JSON Feed 1.1 Compliance
- Full conformance to JSON Feed 1.1 spec
- Valid JSON structure
- Required fields present
- Proper date formatting
Rich Content Support
- HTML content
- Plain text content
- Summary field
- Image attachments
- External URLs
Enhanced Metadata
- Author objects with avatars
- Tags array
- Language specification
- Custom extensions
Efficient Generation
- Streaming JSON output
- Minimal memory usage
- Fast serialization

Non-Functional Requirements

Performance
- Generation <50ms for 50 items
- Compact JSON output
- Efficient serialization
Compatibility
- Valid JSON syntax
- Works with JSON Feed readers
- Proper MIME type handling

JSON Feed Structure

Top-Level Object

{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "Required: Feed title",
  "items": [],

  "home_page_url": "https://example.com/",
  "feed_url": "https://example.com/feed.json",
  "description": "Feed description",
  "user_comment": "Free-form comment",
  "next_url": "https://example.com/feed.json?page=2",
  "icon": "https://example.com/icon.png",
  "favicon": "https://example.com/favicon.ico",
  "authors": [],
  "language": "en-US",
  "expired": false,
  "hubs": []
}

Required Fields

Field	Type	Description
`version`	String	Must be "https://jsonfeed.org/version/1.1"
`title`	String	Feed title
`items`	Array	Array of item objects

Optional Feed Fields

Field	Type	Description
`home_page_url`	String	Website URL
`feed_url`	String	URL of this feed
`description`	String	Feed description
`user_comment`	String	Note for people reading the raw JSON
`next_url`	String	Pagination next page
`icon`	String	Square feed image, 512x512 or larger
`favicon`	String	Website favicon
`authors`	Array	Feed authors
`language`	String	RFC 5646 language tag
`expired`	Boolean	Feed no longer updated
`hubs`	Array	WebSub hubs

Item Object Structure

{
  "id": "Required: unique ID",
  "url": "https://example.com/note/123",
  "external_url": "https://external.com/article",
  "title": "Item title",
  "content_html": "<p>HTML content</p>",
  "content_text": "Plain text content",
  "summary": "Brief summary",
  "image": "https://example.com/image.jpg",
  "banner_image": "https://example.com/banner.jpg",
  "date_published": "2024-11-25T12:00:00Z",
  "date_modified": "2024-11-25T13:00:00Z",
  "authors": [],
  "tags": ["tag1", "tag2"],
  "language": "en",
  "attachments": [],
  "_custom": {}
}

Required Item Fields

Field	Type	Description
`id`	String	Unique, stable ID

Optional Item Fields

Field	Type	Description
`url`	String	Item permalink
`external_url`	String	Link to external content
`title`	String	Item title
`content_html`	String	HTML content
`content_text`	String	Plain text content
`summary`	String	Brief summary
`image`	String	Main image URL
`banner_image`	String	Wide banner image
`date_published`	String	RFC 3339 date
`date_modified`	String	RFC 3339 date
`authors`	Array	Item authors
`tags`	Array	String tags
`language`	String	Language code
`attachments`	Array	File attachments

Author Object

{
  "name": "Author Name",
  "url": "https://example.com/about",
  "avatar": "https://example.com/avatar.jpg"
}

Attachment Object

{
  "url": "https://example.com/file.pdf",
  "mime_type": "application/pdf",
  "title": "Attachment Title",
  "size_in_bytes": 1024000,
  "duration_in_seconds": 300
}

Implementation Design

JSON Feed Generator Class

import json
from typing import List, Dict, Any, Iterator, Optional
from datetime import datetime, timezone

# `Note` is the application's note model; its import is not shown here

class JsonFeedGenerator:
    """JSON Feed 1.1 generator with streaming support"""

    def __init__(self, site_url: str, site_name: str, site_description: str,
                 author_name: Optional[str] = None, author_url: Optional[str] = None,
                 author_avatar: Optional[str] = None):
        self.site_url = site_url.rstrip('/')
        self.site_name = site_name
        self.site_description = site_description
        self.author = {
            'name': author_name,
            'url': author_url,
            'avatar': author_avatar
        } if author_name else None

    def generate(self, notes: List[Note], limit: int = 50) -> str:
        """Generate complete JSON feed

        IMPORTANT: Notes are expected to be in DESC order (newest first)
        from the database. This order MUST be preserved in the feed.
        """
        feed = self._build_feed_object(notes[:limit])
        return json.dumps(feed, ensure_ascii=False, indent=2)

    def generate_streaming(self, notes: List[Note], limit: int = 50) -> Iterator[str]:
        """Generate JSON feed as stream of chunks

        IMPORTANT: Notes are expected to be in DESC order (newest first)
        from the database. This order MUST be preserved in the feed.
        """
        # Start feed object
        yield '{\n'
        yield '  "version": "https://jsonfeed.org/version/1.1",\n'
        yield f'  "title": {json.dumps(self.site_name)},\n'

        # Add optional feed metadata
        yield from self._stream_feed_metadata()

        # Start items array
        yield '  "items": [\n'

        # Stream items - maintain DESC order (newest first)
        # DO NOT reverse! Database order is correct
        items = notes[:limit]
        for i, note in enumerate(items):
            # indent=2 plus the 4-space prefix below matches generate()'s layout
            item_json = json.dumps(self._build_item_object(note), ensure_ascii=False, indent=2)
            # Indent items properly
            indented = '\n'.join('    ' + line for line in item_json.split('\n'))
            yield indented

            if i < len(items) - 1:
                yield ',\n'
            else:
                yield '\n'

        # Close items array and feed
        yield '  ]\n'
        yield '}\n'

    def _build_feed_object(self, notes: List[Note]) -> Dict[str, Any]:
        """Build complete feed object"""
        feed = {
            'version': 'https://jsonfeed.org/version/1.1',
            'title': self.site_name,
            'home_page_url': self.site_url,
            'feed_url': f'{self.site_url}/feed.json',
            'description': self.site_description,
            'items': [self._build_item_object(note) for note in notes]
        }

        # Add optional fields
        if self.author:
            feed['authors'] = [self._clean_author(self.author)]

        feed['language'] = 'en'  # Make configurable

        # Add icon/favicon if configured
        icon_url = self._get_icon_url()
        if icon_url:
            feed['icon'] = icon_url

        favicon_url = self._get_favicon_url()
        if favicon_url:
            feed['favicon'] = favicon_url

        return feed

    def _build_item_object(self, note: Note) -> Dict[str, Any]:
        """Build item object from note"""
        permalink = f'{self.site_url}{note.permalink}'

        item = {
            'id': permalink,
            'url': permalink,
            'title': note.title or self._format_date_title(note.created_at),
            'date_published': self._format_json_date(note.created_at)
        }

        # Add content (prefer HTML)
        if note.html:
            item['content_html'] = note.html
        elif note.content:
            item['content_text'] = note.content

        # Add modified date if set and different
        updated_at = getattr(note, 'updated_at', None)
        if updated_at and updated_at != note.created_at:
            item['date_modified'] = self._format_json_date(updated_at)

        # Add summary if available
        if hasattr(note, 'summary') and note.summary:
            item['summary'] = note.summary

        # Add tags if available
        if hasattr(note, 'tags') and note.tags:
            item['tags'] = note.tags

        # Add author if set and different from feed author
        note_author = getattr(note, 'author', None)
        if note_author and note_author != self.author:
            item['authors'] = [self._clean_author(note_author)]

        # Add image if available
        image_url = self._extract_image_url(note)
        if image_url:
            item['image'] = image_url

        # Add custom extensions
        item['_starpunk'] = {
            'permalink_path': note.permalink,
            'word_count': len(note.content.split()) if note.content else 0
        }

        return item

    def _clean_author(self, author: Any) -> Dict[str, str]:
        """Clean author object for JSON"""
        clean = {}

        if isinstance(author, dict):
            if author.get('name'):
                clean['name'] = author['name']
            if author.get('url'):
                clean['url'] = author['url']
            if author.get('avatar'):
                clean['avatar'] = author['avatar']
        elif hasattr(author, 'name'):
            clean['name'] = author.name
            if hasattr(author, 'url'):
                clean['url'] = author.url
            if hasattr(author, 'avatar'):
                clean['avatar'] = author.avatar
        else:
            clean['name'] = str(author)

        return clean

    def _format_json_date(self, dt: datetime) -> str:
        """Format datetime to RFC 3339 for JSON Feed

        Format: 2024-11-25T12:00:00Z or 2024-11-25T12:00:00-05:00
        """
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)

        # Use Z for any zero-offset timezone, not only the timezone.utc singleton
        if not dt.utcoffset():
            return dt.strftime('%Y-%m-%dT%H:%M:%SZ')
        else:
            return dt.isoformat()

    def _extract_image_url(self, note: Note) -> Optional[str]:
        """Extract first image URL from note content"""
        if not note.html:
            return None

        # Simple regex to find the first img tag (single- or double-quoted src)
        import re
        from urllib.parse import urljoin
        match = re.search(r'<img[^>]+src=["\']([^"\']+)["\']', note.html)
        if match:
            # urljoin leaves absolute URLs untouched and resolves relative
            # and protocol-relative ones against the site URL
            return urljoin(f'{self.site_url}/', match.group(1))

        return None
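
The generate_streaming method above calls self._stream_feed_metadata(), which this specification does not define. The following is a minimal sketch of what such a helper might look like, written as a standalone function so it runs on its own. The name stream_feed_metadata, its parameters, and the choice of fields are assumptions for illustration, not part of the StarPunk codebase.

```python
import json
from typing import Dict, Iterator, Optional


def stream_feed_metadata(site_url: str, description: str,
                         author: Optional[Dict[str, Optional[str]]] = None,
                         language: str = "en") -> Iterator[str]:
    """Yield the optional top-level fields, one '"key": value,' line each.

    Hypothetical helper: every line ends with a comma because the
    streaming generator emits the "items" array afterwards.
    """
    fields = {
        "home_page_url": site_url,
        "feed_url": f"{site_url}/feed.json",
        "description": description,
        "language": language,
    }
    if author:
        # Drop empty author attributes instead of emitting nulls.
        fields["authors"] = [{k: v for k, v in author.items() if v}]

    for key, value in fields.items():
        yield f"  {json.dumps(key)}: {json.dumps(value, ensure_ascii=False)},\n"


# Assemble a full document the way generate_streaming() would.
chunks = ['{\n',
          '  "version": "https://jsonfeed.org/version/1.1",\n',
          '  "title": "StarPunk Notes",\n']
chunks += stream_feed_metadata("https://example.com", 'Notes & "thoughts"',
                               author={"name": "John Doe", "url": None})
chunks += ['  "items": [\n', '  ]\n', '}\n']
feed = json.loads("".join(chunks))
```

Serializing both keys and values with json.dumps keeps the output valid even when the description contains quotes or other characters that need escaping.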

Streaming JSON Generation

For memory efficiency with large feeds:

class StreamingJsonEncoder:
    """Helper for streaming JSON generation"""

    @staticmethod
    def stream_object(obj: Dict[str, Any], indent: int = 0) -> Iterator[str]:
        """Stream a JSON object"""
        indent_str = ' ' * indent
        yield indent_str + '{\n'

        items = list(obj.items())
        for i, (key, value) in enumerate(items):
            # json.dumps escapes quotes and control characters in the key
            yield f'{indent_str}  {json.dumps(key)}: '

            if isinstance(value, dict):
                yield from StreamingJsonEncoder.stream_object(value, indent + 2)
            elif isinstance(value, list):
                yield from StreamingJsonEncoder.stream_array(value, indent + 2)
            else:
                yield json.dumps(value)

            if i < len(items) - 1:
                yield ','
            yield '\n'

        yield indent_str + '}'

    @staticmethod
    def stream_array(arr: List[Any], indent: int = 0) -> Iterator[str]:
        """Stream a JSON array"""
        indent_str = ' ' * indent
        yield '[\n'

        for i, item in enumerate(arr):
            if isinstance(item, dict):
                yield from StreamingJsonEncoder.stream_object(item, indent + 2)
            else:
                yield indent_str + '  ' + json.dumps(item)

            if i < len(arr) - 1:
                yield ','
            yield '\n'

        yield indent_str + ']'
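
As a point of comparison, the Python standard library already provides a chunked encoder: json.JSONEncoder.iterencode() yields the serialized document piece by piece. The sketch below shows that alternative; it is not the approach this specification prescribes, and the function name iter_feed_chunks is an assumption.

```python
import json
from typing import Any, Dict, Iterator


def iter_feed_chunks(feed: Dict[str, Any]) -> Iterator[str]:
    """Yield a JSON document in chunks using the standard library encoder."""
    encoder = json.JSONEncoder(ensure_ascii=False, indent=2)
    yield from encoder.iterencode(feed)


feed = {
    "version": "https://jsonfeed.org/version/1.1",
    "title": 'Quotes "and" unicode: é',
    "items": [{"id": "https://example.com/note/1", "content_text": "Hello"}],
}
chunks = list(iter_feed_chunks(feed))
round_tripped = json.loads("".join(chunks))
```

Note that this only avoids building one large output string. The feed dictionary itself is still held in memory, so the memory benefit depends on how the items are produced.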

Complete JSON Feed Example

{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "StarPunk Notes",
  "home_page_url": "https://example.com/",
  "feed_url": "https://example.com/feed.json",
  "description": "Personal notes and thoughts",
  "authors": [
    {
      "name": "John Doe",
      "url": "https://example.com/about",
      "avatar": "https://example.com/avatar.jpg"
    }
  ],
  "language": "en",
  "icon": "https://example.com/icon.png",
  "favicon": "https://example.com/favicon.ico",
  "items": [
    {
      "id": "https://example.com/notes/2024/11/25/first-note",
      "url": "https://example.com/notes/2024/11/25/first-note",
      "title": "My First Note",
      "content_html": "<p>This is my first note with <strong>bold</strong> text.</p>",
      "summary": "Introduction to my notes",
      "image": "https://example.com/images/first.jpg",
      "date_published": "2024-11-25T10:00:00Z",
      "date_modified": "2024-11-25T10:30:00Z",
      "tags": ["personal", "introduction"],
      "_starpunk": {
        "permalink_path": "/notes/2024/11/25/first-note",
        "word_count": 8
      }
    },
    {
      "id": "https://example.com/notes/2024/11/24/another-note",
      "url": "https://example.com/notes/2024/11/24/another-note",
      "title": "Another Note",
      "content_text": "Plain text content for this note.",
      "date_published": "2024-11-24T15:45:00Z",
      "tags": ["thoughts"],
      "_starpunk": {
        "permalink_path": "/notes/2024/11/24/another-note",
        "word_count": 6
      }
    }
  ]
}

Validation

JSON Feed Validator

Validate against the official validator:

https://validator.jsonfeed.org/

Common Validation Issues

Invalid JSON Syntax
- Proper escaping of quotes
- Valid UTF-8 encoding
- No trailing commas
Missing Required Fields
- version, title, items required
- Each item needs id, plus content_html or content_text (JSON Feed 1.1 requires at least one)
Invalid Date Format
- Must be RFC 3339
- Include timezone
Invalid URLs
- Must be absolute URLs
- Properly encoded
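
Several of the issues listed above can be caught locally before submitting a feed to the online validator. The following is a rough sketch of such a pre-check, not a replacement for the official validator. The function find_feed_problems and its helpers are hypothetical names, and the date check only approximates RFC 3339 by relying on datetime.fromisoformat().

```python
from datetime import datetime
from typing import Any, Dict, List
from urllib.parse import urlparse

REQUIRED_FEED_FIELDS = ("version", "title", "items")


def _is_absolute_url(value: Any) -> bool:
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def _is_rfc3339(value: Any) -> bool:
    """Accept timestamps such as 2024-11-25T12:00:00Z or ...-05:00."""
    if not isinstance(value, str):
        return False
    try:
        # Older Python versions reject a trailing "Z" in fromisoformat().
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return False
    return parsed.tzinfo is not None  # a timezone offset is mandatory


def find_feed_problems(feed: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for field in REQUIRED_FEED_FIELDS:
        if field not in feed:
            problems.append(f"missing feed field: {field}")
    for url_field in ("home_page_url", "feed_url"):
        if url_field in feed and not _is_absolute_url(feed[url_field]):
            problems.append(f"{url_field} is not an absolute URL")
    for index, item in enumerate(feed.get("items", [])):
        if "id" not in item:
            problems.append(f"item {index}: missing id")
        if "content_html" not in item and "content_text" not in item:
            problems.append(f"item {index}: needs content_html or content_text")
        for date_field in ("date_published", "date_modified"):
            if date_field in item and not _is_rfc3339(item[date_field]):
                problems.append(f"item {index}: {date_field} is not RFC 3339")
    return problems


good = {
    "version": "https://jsonfeed.org/version/1.1",
    "title": "StarPunk Notes",
    "feed_url": "https://example.com/feed.json",
    "items": [{"id": "https://example.com/note/1",
               "content_text": "Hello",
               "date_published": "2024-11-25T12:00:00Z"}],
}
bad = {
    "title": "StarPunk Notes",
    "feed_url": "/feed.json",
    "items": [{"date_published": "2024-11-25T12:00:00"}],
}
good_problems = find_feed_problems(good)
bad_problems = find_feed_problems(bad)
```

Invalid JSON syntax is not covered here because a feed that reaches this function has already been parsed or built as a Python dictionary.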

Testing Strategy

Unit Tests

class TestJsonFeedGenerator:
    def test_required_fields(self):
        """Test all required fields are present"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate(notes)
        feed = json.loads(feed_json)

        assert feed['version'] == 'https://jsonfeed.org/version/1.1'
        assert 'title' in feed
        assert 'items' in feed

    def test_feed_order_newest_first(self):
        """Test JSON feed shows newest entries first (spec convention)"""
        # Create notes with different timestamps
        old_note = Note(
            title="Old Note",
            created_at=datetime(2024, 11, 20, 10, 0, 0, tzinfo=timezone.utc)
        )
        new_note = Note(
            title="New Note",
            created_at=datetime(2024, 11, 25, 10, 0, 0, tzinfo=timezone.utc)
        )

        # Generate feed with notes in DESC order (as from database)
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate([new_note, old_note])
        feed = json.loads(feed_json)

        # First item should be newest
        assert feed['items'][0]['title'] == "New Note"
        assert '2024-11-25' in feed['items'][0]['date_published']

        # Second item should be oldest
        assert feed['items'][1]['title'] == "Old Note"
        assert '2024-11-20' in feed['items'][1]['date_published']

    def test_json_validity(self):
        """Test output is valid JSON"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate(notes)

        # Should parse without error
        feed = json.loads(feed_json)
        assert isinstance(feed, dict)

    def test_date_formatting(self):
        """Test RFC 3339 date formatting"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        dt = datetime(2024, 11, 25, 12, 0, 0, tzinfo=timezone.utc)
        formatted = generator._format_json_date(dt)

        assert formatted == '2024-11-25T12:00:00Z'

    def test_streaming_generation(self):
        """Test streaming produces valid JSON"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        chunks = list(generator.generate_streaming(notes))
        feed_json = ''.join(chunks)

        # Should be valid JSON
        feed = json.loads(feed_json)
        assert feed['version'] == 'https://jsonfeed.org/version/1.1'

    def test_custom_extensions(self):
        """Test custom _starpunk extension"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate([sample_note])
        feed = json.loads(feed_json)

        item = feed['items'][0]
        assert '_starpunk' in item
        assert 'permalink_path' in item['_starpunk']
        assert 'word_count' in item['_starpunk']

Integration Tests

def test_json_feed_endpoint():
    """Test JSON feed endpoint"""
    response = client.get('/feed.json')

    assert response.status_code == 200
    # mimetype ignores parameters such as "; charset=utf-8"
    assert response.mimetype == 'application/feed+json'

    feed = json.loads(response.data)
    assert feed['version'] == 'https://jsonfeed.org/version/1.1'

def test_content_negotiation_json():
    """Test content negotiation prefers JSON"""
    response = client.get('/feed', headers={'Accept': 'application/json'})

    assert response.status_code == 200
    assert 'json' in response.content_type.lower()

def test_feed_reader_compatibility():
    """Test with JSON Feed readers"""
    readers = [
        'Feedbin',
        'Inoreader',
        'NewsBlur',
        'NetNewsWire'
    ]

    for reader in readers:
        assert validate_with_reader(feed_url, reader, format='json')
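
The content negotiation test above implies that the /feed endpoint picks a format from the Accept header. This specification does not show that logic, so the following is only a sketch of one way it could work. The function names, the MIME-to-format mapping, and the "rss" default are all assumptions.

```python
from typing import List, Optional, Tuple

# Assumed mapping of MIME types to feed formats; adjust to the formats served.
FORMAT_BY_MIME = {
    "application/feed+json": "json",
    "application/json": "json",
    "application/rss+xml": "rss",
    "application/atom+xml": "atom",
}


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (mime_type, q) pairs sorted by descending q value."""
    entries = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        mime = pieces[0].lower()
        if not mime:
            continue
        quality = 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((mime, quality))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def pick_feed_format(accept_header: str, default: str = "rss") -> Optional[str]:
    """Choose a feed format for the Accept header, or None if nothing matches."""
    if not accept_header.strip():
        return default
    for mime, quality in parse_accept(accept_header):
        if quality <= 0:
            continue
        if mime in FORMAT_BY_MIME:
            return FORMAT_BY_MIME[mime]
        if mime == "*/*":
            return default
    return None
```

A caller could map a None result to an HTTP 406 response, or fall back to the default format, depending on how strict the endpoint should be.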

Validation Tests

def test_jsonfeed_validation():
    """Validate against official validator"""
    generator = JsonFeedGenerator(site_url, site_name, site_description)
    feed_json = generator.generate(sample_notes)

    # Submit to validator
    result = validate_json_feed(feed_json)
    assert result['valid']
    assert len(result['errors']) == 0

Performance Benchmarks

Generation Speed

def benchmark_json_generation():
    """Benchmark JSON feed generation"""
    notes = generate_sample_notes(100)
    generator = JsonFeedGenerator(site_url, site_name, site_description)

    start = time.perf_counter()
    feed_json = generator.generate(notes, limit=50)
    duration = time.perf_counter() - start

    assert duration < 0.05  # Less than 50ms
    assert len(feed_json) > 0
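
A single timing sample like the one above can be skewed by a cold cache or a busy machine. A possible refinement, sketched here with a stand-in workload rather than the real generator, is to repeat the measurement and compare the median against the 50ms target. The helper name measure_ms is an assumption.

```python
import json
import statistics
import time
from typing import Callable, List


def measure_ms(func: Callable[[], object], repeats: int = 20) -> List[float]:
    """Run func repeatedly and return each run's duration in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append((time.perf_counter() - start) * 1000)
    return durations


# Stand-in workload: serialize 50 small items (not the real generator).
items = [{"id": f"https://example.com/note/{n}", "content_text": "x" * 500}
         for n in range(50)]
feed = {"version": "https://jsonfeed.org/version/1.1",
        "title": "Bench", "items": items}

durations = measure_ms(lambda: json.dumps(feed, ensure_ascii=False))
median_ms = statistics.median(durations)
```

In the actual benchmark the lambda would wrap generator.generate(notes, limit=50), and the assertion would check median_ms against 50.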

Size Comparison

def test_json_vs_xml_size():
    """Compare JSON feed size to RSS/ATOM"""
    notes = generate_sample_notes(50)

    # Generate all formats
    json_feed = json_generator.generate(notes)
    rss_feed = rss_generator.generate(notes)
    atom_feed = atom_generator.generate(notes)

    # JSON should be more compact
    print(f"JSON: {len(json_feed)} bytes")
    print(f"RSS:  {len(rss_feed)} bytes")
    print(f"ATOM: {len(atom_feed)} bytes")

    # Expectation to confirm with real notes: JSON is roughly 20-30% smaller

Configuration

JSON Feed Settings

# JSON Feed configuration
STARPUNK_FEED_JSON_ENABLED=true
STARPUNK_FEED_JSON_AUTHOR_NAME=John Doe
STARPUNK_FEED_JSON_AUTHOR_URL=https://example.com/about
STARPUNK_FEED_JSON_AUTHOR_AVATAR=https://example.com/avatar.jpg
STARPUNK_FEED_JSON_ICON=https://example.com/icon.png
STARPUNK_FEED_JSON_FAVICON=https://example.com/favicon.ico
STARPUNK_FEED_JSON_LANGUAGE=en
# WebSub hub URL (optional); leave empty to disable
STARPUNK_FEED_JSON_HUB_URL=
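
One way these settings might be read at startup is sketched below. The JsonFeedConfig dataclass and load_json_feed_config function are hypothetical, and the defaults (feed enabled, language "en") are assumptions rather than documented StarPunk behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class JsonFeedConfig:
    enabled: bool
    author_name: Optional[str]
    author_url: Optional[str]
    author_avatar: Optional[str]
    icon: Optional[str]
    favicon: Optional[str]
    language: str
    hub_url: Optional[str]


def load_json_feed_config(env: Mapping[str, str] = os.environ) -> JsonFeedConfig:
    """Build the config from STARPUNK_FEED_JSON_* variables."""
    prefix = "STARPUNK_FEED_JSON_"

    def optional(name: str) -> Optional[str]:
        # Treat unset and empty variables the same way.
        value = env.get(prefix + name, "").strip()
        return value or None

    enabled_raw = env.get(prefix + "ENABLED", "true").strip().lower()
    return JsonFeedConfig(
        enabled=enabled_raw in ("1", "true", "yes", "on"),
        author_name=optional("AUTHOR_NAME"),
        author_url=optional("AUTHOR_URL"),
        author_avatar=optional("AUTHOR_AVATAR"),
        icon=optional("ICON"),
        favicon=optional("FAVICON"),
        language=optional("LANGUAGE") or "en",
        hub_url=optional("HUB_URL"),
    )


config = load_json_feed_config({
    "STARPUNK_FEED_JSON_ENABLED": "true",
    "STARPUNK_FEED_JSON_AUTHOR_NAME": "John Doe",
    "STARPUNK_FEED_JSON_HUB_URL": "",
})
```

Passing the environment as a mapping keeps the loader easy to test without touching os.environ.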

Security Considerations

JSON Injection Prevention
- Proper JSON escaping
- No raw user input
- Validate all URLs
Content Security
- HTML content sanitized
- No script injection
- Safe JSON encoding
Size Limits
- Maximum feed size
- Item count limits
- Timeout protection
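
The "proper JSON escaping" point matters most in the streaming path, where parts of the document are assembled by hand. The short sketch below, using a made-up title value, shows why every interpolated value should pass through json.dumps() rather than being pasted between quotes.

```python
import json

title = 'He said "hi"\nand left </script>'

# Unsafe: interpolating raw text into a JSON template breaks on quotes/newlines.
unsafe = '{"title": "' + title + '"}'
try:
    json.loads(unsafe)
    unsafe_is_valid = True
except json.JSONDecodeError:
    unsafe_is_valid = False

# Safe: json.dumps() escapes quotes, backslashes and control characters.
safe = '{"title": ' + json.dumps(title) + '}'
decoded_title = json.loads(safe)["title"]
```

JSON escaping does not sanitize HTML. Values destined for content_html still need the separate sanitization step listed under Content Security.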

Migration Notes

Adding JSON Feed

- Runs parallel to RSS/ATOM
- No changes to existing feeds
- Shared caching infrastructure
- Same data source

Advanced Features

WebSub Support (Future)

{
  "hubs": [
    {
      "type": "WebSub",
      "url": "https://example.com/hub"
    }
  ]
}

Pagination

{
  "next_url": "https://example.com/feed.json?page=2"
}
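
A minimal sketch of how next_url might be populated follows, assuming a page query parameter as in the example above. The function build_feed_page and its parameters are hypothetical. The field is added only while more items remain, so readers know when to stop following it.

```python
from typing import Any, Dict, List


def build_feed_page(items: List[Dict[str, Any]], page: int, per_page: int,
                    feed_url: str) -> Dict[str, Any]:
    """Return one page of a feed; add next_url only when more items remain."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    end = start + per_page
    feed: Dict[str, Any] = {
        "version": "https://jsonfeed.org/version/1.1",
        "title": "StarPunk Notes",
        "feed_url": feed_url if page == 1 else f"{feed_url}?page={page}",
        "items": items[start:end],
    }
    if end < len(items):
        feed["next_url"] = f"{feed_url}?page={page + 1}"
    return feed


notes = [{"id": f"https://example.com/note/{n}", "content_text": f"Note {n}"}
         for n in range(5)]
first = build_feed_page(notes, page=1, per_page=2,
                        feed_url="https://example.com/feed.json")
last = build_feed_page(notes, page=3, per_page=2,
                       feed_url="https://example.com/feed.json")
```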

Attachments

{
  "attachments": [
    {
      "url": "https://example.com/podcast.mp3",
      "mime_type": "audio/mpeg",
      "title": "Podcast Episode",
      "size_in_bytes": 25000000,
      "duration_in_seconds": 1800
    }
  ]
}
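
The generator class in this specification does not yet emit attachments. If it did, a small builder along these lines could keep the objects well-formed. This is a sketch with a hypothetical function name: url and mime_type are required for attachments in JSON Feed 1.1, and unset optional fields are omitted rather than written as null.

```python
from typing import Any, Dict, Optional


def build_attachment(url: str, mime_type: str, title: Optional[str] = None,
                     size_in_bytes: Optional[int] = None,
                     duration_in_seconds: Optional[float] = None) -> Dict[str, Any]:
    """Build an attachment object, omitting optional fields that are unset."""
    if not url or not mime_type:
        raise ValueError("url and mime_type are required for attachments")
    attachment: Dict[str, Any] = {"url": url, "mime_type": mime_type}
    optional_fields = {
        "title": title,
        "size_in_bytes": size_in_bytes,
        "duration_in_seconds": duration_in_seconds,
    }
    for key, value in optional_fields.items():
        if value is not None:
            attachment[key] = value
    return attachment


episode = build_attachment("https://example.com/podcast.mp3", "audio/mpeg",
                           title="Podcast Episode", size_in_bytes=25000000,
                           duration_in_seconds=1800)
document = build_attachment("https://example.com/file.pdf", "application/pdf")
```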

Acceptance Criteria

✅ Valid JSON Feed 1.1 generation
✅ All required fields present
✅ RFC 3339 dates correct
✅ Valid JSON syntax
✅ Streaming generation working
✅ Official validator passing
✅ Works with 5+ JSON Feed readers
✅ Performance target met (<50ms)
✅ Custom extensions working
✅ Security review passed
