# JSON Feed Specification - v1.1.2

## Overview

This specification defines the implementation of JSON Feed 1.1 format for StarPunk, providing a modern, developer-friendly syndication format that's easier to parse than XML-based feeds.

## Requirements

### Functional Requirements

1. **JSON Feed 1.1 Compliance**
   - Full conformance to JSON Feed 1.1 spec
   - Valid JSON structure
   - Required fields present
   - Proper date formatting

2. **Rich Content Support**
   - HTML content
   - Plain text content
   - Summary field
   - Image attachments
   - External URLs

3. **Enhanced Metadata**
   - Author objects with avatars
   - Tags array
   - Language specification
   - Custom extensions

4. **Efficient Generation**
   - Streaming JSON output
   - Minimal memory usage
   - Fast serialization

### Non-Functional Requirements

1. **Performance**
   - Generation <50ms for 50 items
   - Compact JSON output
   - Efficient serialization

2. **Compatibility**
   - Valid JSON syntax
   - Works with JSON Feed readers
   - Proper MIME type handling

## JSON Feed Structure

### Top-Level Object

```json
{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "Required: Feed title",
  "items": [],

  "home_page_url": "https://example.com/",
  "feed_url": "https://example.com/feed.json",
  "description": "Feed description",
  "user_comment": "Free-form comment",
  "next_url": "https://example.com/feed.json?page=2",
  "icon": "https://example.com/icon.png",
  "favicon": "https://example.com/favicon.ico",
  "authors": [],
  "language": "en-US",
  "expired": false,
  "hubs": []
}
```

### Required Fields

| Field | Type | Description |
|-------|------|-------------|
| `version` | String | Must be "https://jsonfeed.org/version/1.1" |
| `title` | String | Feed title |
| `items` | Array | Array of item objects |

### Optional Feed Fields

| Field | Type | Description |
|-------|------|-------------|
| `home_page_url` | String | Website URL |
| `feed_url` | String | URL of this feed |
| `description` | String | Feed description |
| `user_comment` | String | Implementation notes |
| `next_url` | String | Pagination next page |
| `icon` | String | 512x512+ image |
| `favicon` | String | Website favicon |
| `authors` | Array | Feed authors |
| `language` | String | RFC 5646 language tag |
| `expired` | Boolean | Feed no longer updated |
| `hubs` | Array | WebSub hubs |

### Item Object Structure

```json
{
  "id": "Required: unique ID",
  "url": "https://example.com/note/123",
  "external_url": "https://external.com/article",
  "title": "Item title",
  "content_html": "<p>HTML content</p>",
  "content_text": "Plain text content",
  "summary": "Brief summary",
  "image": "https://example.com/image.jpg",
  "banner_image": "https://example.com/banner.jpg",
  "date_published": "2024-11-25T12:00:00Z",
  "date_modified": "2024-11-25T13:00:00Z",
  "authors": [],
  "tags": ["tag1", "tag2"],
  "language": "en",
  "attachments": [],
  "_custom": {}
}
```

### Required Item Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | String | Unique, stable ID |

### Optional Item Fields

| Field | Type | Description |
|-------|------|-------------|
| `url` | String | Item permalink |
| `external_url` | String | Link to external content |
| `title` | String | Item title |
| `content_html` | String | HTML content |
| `content_text` | String | Plain text content |
| `summary` | String | Brief summary |
| `image` | String | Main image URL |
| `banner_image` | String | Wide banner image |
| `date_published` | String | RFC 3339 date |
| `date_modified` | String | RFC 3339 date |
| `authors` | Array | Item authors |
| `tags` | Array | String tags |
| `language` | String | Language code |
| `attachments` | Array | File attachments |

### Author Object

```json
{
  "name": "Author Name",
  "url": "https://example.com/about",
  "avatar": "https://example.com/avatar.jpg"
}
```

### Attachment Object

```json
{
  "url": "https://example.com/file.pdf",
  "mime_type": "application/pdf",
  "title": "Attachment Title",
  "size_in_bytes": 1024000,
  "duration_in_seconds": 300
}
```

## Implementation Design

### JSON Feed Generator Class

```python
import json
from typing import List, Dict, Any, Iterator, Optional
from datetime import datetime, timezone

class JsonFeedGenerator:
    """JSON Feed 1.1 generator with streaming support"""

    def __init__(self, site_url: str, site_name: str, site_description: str,
                 author_name: Optional[str] = None, author_url: Optional[str] = None,
                 author_avatar: Optional[str] = None):
        self.site_url = site_url.rstrip('/')
        self.site_name = site_name
        self.site_description = site_description
        self.author = {
            'name': author_name,
            'url': author_url,
            'avatar': author_avatar
        } if author_name else None

    def generate(self, notes: List[Note], limit: int = 50) -> str:
        """Generate complete JSON feed

        IMPORTANT: Notes are expected to be in DESC order (newest first)
        from the database. This order MUST be preserved in the feed.
        """
        feed = self._build_feed_object(notes[:limit])
        return json.dumps(feed, ensure_ascii=False, indent=2)

    def generate_streaming(self, notes: List[Note], limit: int = 50) -> Iterator[str]:
        """Generate JSON feed as stream of chunks

        IMPORTANT: Notes are expected to be in DESC order (newest first)
        from the database. This order MUST be preserved in the feed.
        """
        # Start feed object
        yield '{\n'
        yield '  "version": "https://jsonfeed.org/version/1.1",\n'
        yield f'  "title": {json.dumps(self.site_name)},\n'

        # Add optional feed metadata
        yield from self._stream_feed_metadata()

        # Start items array
        yield '  "items": [\n'

        # Stream items - maintain DESC order (newest first)
        # DO NOT reverse! Database order is correct
        items = notes[:limit]
        for i, note in enumerate(items):
            item_json = json.dumps(self._build_item_object(note), indent=4)
            # Indent items properly
            indented = '\n'.join('    ' + line for line in item_json.split('\n'))
            yield indented

            if i < len(items) - 1:
                yield ',\n'
            else:
                yield '\n'

        # Close items array and feed
        yield '  ]\n'
        yield '}\n'

    def _build_feed_object(self, notes: List[Note]) -> Dict[str, Any]:
        """Build complete feed object"""
        feed = {
            'version': 'https://jsonfeed.org/version/1.1',
            'title': self.site_name,
            'home_page_url': self.site_url,
            'feed_url': f'{self.site_url}/feed.json',
            'description': self.site_description,
            'items': [self._build_item_object(note) for note in notes]
        }

        # Add optional fields
        if self.author:
            feed['authors'] = [self._clean_author(self.author)]

        feed['language'] = 'en'  # Make configurable

        # Add icon/favicon if configured
        icon_url = self._get_icon_url()
        if icon_url:
            feed['icon'] = icon_url

        favicon_url = self._get_favicon_url()
        if favicon_url:
            feed['favicon'] = favicon_url

        return feed

    def _build_item_object(self, note: Note) -> Dict[str, Any]:
        """Build item object from note"""
        permalink = f'{self.site_url}{note.permalink}'

        item = {
            'id': permalink,
            'url': permalink,
            'title': note.title or self._format_date_title(note.created_at),
            'date_published': self._format_json_date(note.created_at)
        }

        # Add content (prefer HTML)
        if note.html:
            item['content_html'] = note.html
        elif note.content:
            item['content_text'] = note.content

        # Add modified date if set and different from the creation date
        if getattr(note, 'updated_at', None) and note.updated_at != note.created_at:
            item['date_modified'] = self._format_json_date(note.updated_at)

        # Add summary if available
        if hasattr(note, 'summary') and note.summary:
            item['summary'] = note.summary

        # Add tags if available
        if hasattr(note, 'tags') and note.tags:
            item['tags'] = note.tags

        # Add author if set and different from feed author
        if getattr(note, 'author', None) and note.author != self.author:
            item['authors'] = [self._clean_author(note.author)]

        # Add image if available
        image_url = self._extract_image_url(note)
        if image_url:
            item['image'] = image_url

        # Add custom extensions
        item['_starpunk'] = {
            'permalink_path': note.permalink,
            'word_count': len(note.content.split()) if note.content else 0
        }

        return item

    def _clean_author(self, author: Any) -> Dict[str, str]:
        """Clean author object for JSON, dropping empty fields"""
        clean = {}

        if isinstance(author, dict):
            if author.get('name'):
                clean['name'] = author['name']
            if author.get('url'):
                clean['url'] = author['url']
            if author.get('avatar'):
                clean['avatar'] = author['avatar']
        elif hasattr(author, 'name'):
            clean['name'] = author.name
            if getattr(author, 'url', None):
                clean['url'] = author.url
            if getattr(author, 'avatar', None):
                clean['avatar'] = author.avatar
        else:
            clean['name'] = str(author)

        return clean

    def _format_json_date(self, dt: datetime) -> str:
        """Format datetime to RFC 3339 for JSON Feed

        Format: 2024-11-25T12:00:00Z or 2024-11-25T12:00:00-05:00
        """
        # Treat naive datetimes as UTC
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)

        # Use Z for UTC
        if dt.tzinfo == timezone.utc:
            return dt.strftime('%Y-%m-%dT%H:%M:%SZ')
        else:
            return dt.isoformat()

    def _extract_image_url(self, note: Note) -> Optional[str]:
        """Extract first image URL from note content"""
        if not note.html:
            return None

        # Simple regex to find first img tag
        import re
        match = re.search(r'
```

```json
      "content_html": "<p>This is my first note with bold text.</p>",
      "summary": "Introduction to my notes",
      "image": "https://example.com/images/first.jpg",
      "date_published": "2024-11-25T10:00:00Z",
      "date_modified": "2024-11-25T10:30:00Z",
      "tags": ["personal", "introduction"],
      "_starpunk": {
        "permalink_path": "/notes/2024/11/25/first-note",
        "word_count": 8
      }
    },
    {
      "id": "https://example.com/notes/2024/11/24/another-note",
      "url": "https://example.com/notes/2024/11/24/another-note",
      "title": "Another Note",
      "content_text": "Plain text content for this note.",
      "date_published": "2024-11-24T15:45:00Z",
      "tags": ["thoughts"],
      "_starpunk": {
        "permalink_path": "/notes/2024/11/24/another-note",
        "word_count": 6
      }
    }
  ]
}
```

## Validation

### JSON Feed Validator

Validate against the official validator:
- https://validator.jsonfeed.org/

### Common Validation Issues

1. **Invalid JSON Syntax**
   - Proper escaping of quotes
   - Valid UTF-8 encoding
   - No trailing commas

2. **Missing Required Fields**
   - version, title, items required
   - Each item needs id

3. **Invalid Date Format**
   - Must be RFC 3339
   - Include timezone

4. **Invalid URLs**
   - Must be absolute URLs
   - Properly encoded

## Testing Strategy

### Unit Tests

```python
class TestJsonFeedGenerator:
    def test_required_fields(self):
        """Test all required fields are present"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate(notes)
        feed = json.loads(feed_json)

        assert feed['version'] == 'https://jsonfeed.org/version/1.1'
        assert 'title' in feed
        assert 'items' in feed

    def test_feed_order_newest_first(self):
        """Test JSON feed shows newest entries first (spec convention)"""
        # Create notes with different timestamps
        old_note = Note(
            title="Old Note",
            created_at=datetime(2024, 11, 20, 10, 0, 0, tzinfo=timezone.utc)
        )
        new_note = Note(
            title="New Note",
            created_at=datetime(2024, 11, 25, 10, 0, 0, tzinfo=timezone.utc)
        )

        # Generate feed with notes in DESC order (as from database)
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate([new_note, old_note])
        feed = json.loads(feed_json)

        # First item should be newest
        assert feed['items'][0]['title'] == "New Note"
        assert '2024-11-25' in feed['items'][0]['date_published']

        # Second item should be oldest
        assert feed['items'][1]['title'] == "Old Note"
        assert '2024-11-20' in feed['items'][1]['date_published']

    def test_json_validity(self):
        """Test output is valid JSON"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate(notes)

        # Should parse without error
        feed = json.loads(feed_json)
        assert isinstance(feed, dict)

    def test_date_formatting(self):
        """Test RFC 3339 date formatting"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        dt = datetime(2024, 11, 25, 12, 0, 0, tzinfo=timezone.utc)
        formatted = generator._format_json_date(dt)

        assert formatted == '2024-11-25T12:00:00Z'

    def test_streaming_generation(self):
        """Test streaming produces valid JSON"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        chunks = list(generator.generate_streaming(notes))
        feed_json = ''.join(chunks)

        # Should be valid JSON
        feed = json.loads(feed_json)
        assert feed['version'] == 'https://jsonfeed.org/version/1.1'

    def test_custom_extensions(self):
        """Test custom _starpunk extension"""
        generator = JsonFeedGenerator(site_url, site_name, site_description)
        feed_json = generator.generate([sample_note])
        feed = json.loads(feed_json)

        item = feed['items'][0]
        assert '_starpunk' in item
        assert 'permalink_path' in item['_starpunk']
        assert 'word_count' in item['_starpunk']
```

### Integration Tests

```python
def test_json_feed_endpoint():
    """Test JSON feed endpoint"""
    response = client.get('/feed.json')

    assert response.status_code == 200
    assert response.content_type == 'application/feed+json'

    feed = json.loads(response.data)
    assert feed['version'] == 'https://jsonfeed.org/version/1.1'

def test_content_negotiation_json():
    """Test content negotiation prefers JSON"""
    response = client.get('/feed', headers={'Accept': 'application/json'})

    assert response.status_code == 200
    assert 'json' in response.content_type.lower()

def test_feed_reader_compatibility():
    """Test with JSON Feed readers"""
    readers = [
        'Feedbin',
        'Inoreader',
        'NewsBlur',
        'NetNewsWire'
    ]

    for reader in readers:
        assert validate_with_reader(feed_url, reader, format='json')
```

### Validation Tests

```python
def test_jsonfeed_validation():
    """Validate against official validator"""
    generator = JsonFeedGenerator(site_url, site_name, site_description)
    feed_json = generator.generate(sample_notes)

    # Submit to validator
    result = validate_json_feed(feed_json)
    assert result['valid']
    assert len(result['errors']) == 0
```

## Performance Benchmarks

### Generation Speed

```python
import time

def benchmark_json_generation():
    """Benchmark JSON feed generation"""
    notes = generate_sample_notes(100)
    generator = JsonFeedGenerator(site_url, site_name, site_description)

    start = time.perf_counter()
    feed_json = generator.generate(notes, limit=50)
    duration = time.perf_counter() - start

    assert duration < 0.05  # Less than 50ms
    assert len(feed_json) > 0
```

### Size Comparison

```python
def test_json_vs_xml_size():
    """Compare JSON feed size to RSS/Atom"""
    notes = generate_sample_notes(50)

    # Generate all formats
    json_feed = json_generator.generate(notes)
    rss_feed = rss_generator.generate(notes)
    atom_feed = atom_generator.generate(notes)

    # JSON should be more compact
    print(f"JSON: {len(json_feed)} bytes")
    print(f"RSS: {len(rss_feed)} bytes")
    print(f"Atom: {len(atom_feed)} bytes")

    # Typically JSON is 20-30% smaller
```

## Configuration

### JSON Feed Settings

```ini
# JSON Feed configuration
STARPUNK_FEED_JSON_ENABLED=true
STARPUNK_FEED_JSON_AUTHOR_NAME=John Doe
STARPUNK_FEED_JSON_AUTHOR_URL=https://example.com/about
STARPUNK_FEED_JSON_AUTHOR_AVATAR=https://example.com/avatar.jpg
STARPUNK_FEED_JSON_ICON=https://example.com/icon.png
STARPUNK_FEED_JSON_FAVICON=https://example.com/favicon.ico
STARPUNK_FEED_JSON_LANGUAGE=en
# WebSub hub URL (optional)
STARPUNK_FEED_JSON_HUB_URL=
```

## Security Considerations

1. **JSON Injection Prevention**
   - Proper JSON escaping
   - No raw user input
   - Validate all URLs

2. **Content Security**
   - HTML content sanitized
   - No script injection
   - Safe JSON encoding

3. **Size Limits**
   - Maximum feed size
   - Item count limits
   - Timeout protection

## Migration Notes

### Adding JSON Feed

- Runs parallel to RSS/Atom
- No changes to existing feeds
- Shared caching infrastructure
- Same data source

## Advanced Features

### WebSub Support (Future)

```json
{
  "hubs": [
    {
      "type": "WebSub",
      "url": "https://example.com/hub"
    }
  ]
}
```

### Pagination

```json
{
  "next_url": "https://example.com/feed.json?page=2"
}
```

### Attachments

```json
{
  "attachments": [
    {
      "url": "https://example.com/podcast.mp3",
      "mime_type": "audio/mpeg",
      "title": "Podcast Episode",
      "size_in_bytes": 25000000,
      "duration_in_seconds": 1800
    }
  ]
}
```

## Acceptance Criteria

1. ✅ Valid JSON Feed 1.1 generation
2. ✅ All required fields present
3. ✅ RFC 3339 dates correct
4. ✅ Valid JSON syntax
5. ✅ Streaming generation working
6. ✅ Official validator passing
7. ✅ Works with 5+ JSON Feed readers
8. ✅ Performance target met (<50ms)
9. ✅ Custom extensions working
10. ✅ Security review passed
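
Criteria 1 through 4 can be spot-checked locally before a feed is submitted to the official validator. The following is a minimal sketch, not part of StarPunk: `check_feed`, `_has_timezone`, and `JSON_FEED_VERSION` are hypothetical names introduced here for illustration. It only covers the required fields and date rules listed under Common Validation Issues, and it approximates RFC 3339 with `datetime.fromisoformat`, which is more lenient than the RFC.

```python
import json
from datetime import datetime
from typing import Any, List

# Hypothetical constant for this sketch; the value comes from the Required Fields table.
JSON_FEED_VERSION = "https://jsonfeed.org/version/1.1"


def _has_timezone(value: Any) -> bool:
    """Return True if value parses as an ISO 8601 timestamp with a UTC offset."""
    if not isinstance(value, str):
        return False
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit offset first.
    normalized = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        return datetime.fromisoformat(normalized).tzinfo is not None
    except ValueError:
        return False


def check_feed(feed_json: str) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        feed = json.loads(feed_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(feed, dict):
        return ["top level must be a JSON object"]

    errors: List[str] = []
    if feed.get("version") != JSON_FEED_VERSION:
        errors.append("version must be " + JSON_FEED_VERSION)
    if not isinstance(feed.get("title"), str):
        errors.append("title must be a string")
    items = feed.get("items")
    if not isinstance(items, list):
        errors.append("items must be an array")
        return errors

    for index, item in enumerate(items):
        if not isinstance(item, dict):
            errors.append(f"items[{index}]: must be an object")
            continue
        # Each item needs a non-empty string id
        if not isinstance(item.get("id"), str) or not item["id"]:
            errors.append(f"items[{index}]: id must be a non-empty string")
        # Dates are optional, but must carry a timezone when present
        for field in ("date_published", "date_modified"):
            if field in item and not _has_timezone(item[field]):
                errors.append(f"items[{index}]: {field} is not RFC 3339 with a timezone")
    return errors
```

Under these assumptions, `check_feed(generator.generate(notes))` returning an empty list would indicate that the structural criteria hold; it does not replace the official validator.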