feat: v1.4.0 Phase 3 - Micropub Media Endpoint

Implement W3C Micropub media endpoint for external client uploads.

Changes:
- Add POST /micropub/media endpoint in routes/micropub.py
  - Accept multipart/form-data with 'file' field
  - Require bearer token with 'create' scope
  - Return 201 Created with Location header
  - Validate, optimize, and generate variants via save_media()

- Update q=config response to advertise media-endpoint
  - Include media-endpoint URL in config response
  - Add 'photo' post-type to supported types

- Add photo property support to Micropub create
  - extract_photos() function to parse photo property
  - Handles both simple URL strings and structured objects with alt text
  - _attach_photos_to_note() function to attach photos by URL
  - Only attach photos from our server (by URL match)
  - External URLs logged but ignored (no download)
  - Maximum 4 photos per note (per ADR-057)

- SITE_URL normalization pattern
  - Use .rstrip('/') for consistent URL comparison
  - Applied in media endpoint and photo attachment

Per design document: docs/design/v1.4.0/media-implementation-design.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-10 18:32:21 -07:00
parent 501a711050
commit c64feaea23
5 changed files with 2171 additions and 5 deletions


@@ -0,0 +1,302 @@
# Feed Tags Implementation Design
**Version**: 1.3.1 "Syndicate Tags"
**Status**: Ready for Implementation
**Estimated Effort**: 1-2 hours
## Overview
This document specifies the implementation for adding tags/categories to all three syndication feed formats. Tags were added to the backend in v1.3.0 but are not currently included in feed output.
## Current State Analysis
### Tag Data Structure
Tags are stored as dictionaries with two fields:
- `name`: Normalized, URL-safe identifier (e.g., `machine-learning`)
- `display_name`: Human-readable label (e.g., `Machine Learning`)
The `get_note_tags(note_id)` function returns a list of these dictionaries, ordered alphabetically by display_name.
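The shape described above can be sketched as follows. This is an illustration only: the `Tag` type and `sort_tags` helper are hypothetical names (not part of the codebase), and the case-insensitive sort key is an assumption, since the document only says tags are ordered alphabetically by `display_name`.

```python
from typing import TypedDict


class Tag(TypedDict):
    """Hypothetical type mirroring the two fields of a stored tag."""
    name: str          # normalized, URL-safe identifier
    display_name: str  # human-readable label


def sort_tags(tags: list[Tag]) -> list[Tag]:
    """Order tags alphabetically by display_name (case-insensitive: an assumption)."""
    return sorted(tags, key=lambda tag: tag["display_name"].casefold())


example = sort_tags([
    {"name": "python", "display_name": "Python"},
    {"name": "machine-learning", "display_name": "Machine Learning"},
])
```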
### Feed Generation Routes
The `_get_cached_notes()` function in `starpunk/routes/public.py` already attaches media to notes but **does not attach tags**. This is the key change needed to make tags available to feed generators.
### Feed Generator Functions
Each feed module uses a consistent pattern:
- Non-streaming function builds complete feed
- Streaming function yields chunks
- Both accept `notes: list[Note]` where notes may have attached attributes
## Design Decisions
Per user confirmation:
1. **Omit `scheme`/`domain` attributes** - Keep implementation minimal
2. **Omit `tags` field when empty** - Do not output empty array in JSON Feed
## Implementation Specification
### Phase 1: Load Tags in Feed Routes
**File**: `starpunk/routes/public.py`
**Change**: Modify `_get_cached_notes()` to attach tags to each note.
**Current code** (lines 66-69):
```python
# Attach media to each note (v1.2.0 Phase 3)
for note in notes:
    media = get_note_media(note.id)
    object.__setattr__(note, 'media', media)
```
**Required change**: Add tag loading after media loading:
```python
# Attach media to each note (v1.2.0 Phase 3)
for note in notes:
    media = get_note_media(note.id)
    object.__setattr__(note, 'media', media)

    # Attach tags to each note (v1.3.1)
    tags = get_note_tags(note.id)
    object.__setattr__(note, 'tags', tags)
```
**Import needed**: Add `get_note_tags` to imports from `starpunk.tags`.
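The use of `object.__setattr__` suggests that `Note` rejects ordinary attribute assignment, for example because it is a frozen dataclass. That is an assumption; the real `Note` class is not shown here. Under that assumption, a minimal sketch of the attachment pattern (with a stand-in `Note` and a hypothetical `attach` helper) might look like this:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Note:
    """Stand-in for the real Note class (assumed to be frozen)."""
    id: int
    content: str


def attach(note: Note, name: str, value: object) -> None:
    """Bypass the frozen dataclass guard to attach a derived attribute."""
    object.__setattr__(note, name, value)


note = Note(id=1, content="hello")

# Plain assignment is rejected by the frozen dataclass.
try:
    note.tags = []
    blocked = False
except FrozenInstanceError:
    blocked = True

# object.__setattr__ writes straight into the instance dictionary.
attach(note, "tags", [{"name": "python", "display_name": "Python"}])
```

Because the attribute is attached at load time rather than declared on the class, the feed generators guard with `hasattr(note, 'tags')` before reading it.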
### Phase 2: RSS 2.0 Categories
**File**: `starpunk/feeds/rss.py`
**Standard**: RSS 2.0 Specification - `<category>` sub-element of `<item>`
**Format**:
```xml
<category>Display Name</category>
```
#### Non-Streaming Function (`generate_rss`)
The feedgen library's `FeedEntry` supports categories via `fe.category()`.
**Location**: After description is set (around line 143), add:
```python
# Add category elements for tags (v1.3.1)
if hasattr(note, 'tags') and note.tags:
    for tag in note.tags:
        fe.category({'term': tag['display_name']})
```
Note: feedgen's category accepts a dict with 'term' key for RSS output.
#### Streaming Function (`generate_rss_streaming`)
**Location**: In the item XML building (around line 293), insert the category elements after the `<description>` CDATA section and before the media elements:
```python
# Add category elements for tags (v1.3.1)
if hasattr(note, 'tags') and note.tags:
    for tag in note.tags:
        item_xml += f"""
      <category>{_escape_xml(tag['display_name'])}</category>"""
```
**Expected output**:
```xml
<item>
  <title>My Post</title>
  <link>https://example.com/note/my-post</link>
  <guid isPermaLink="true">https://example.com/note/my-post</guid>
  <pubDate>Mon, 18 Nov 2024 12:00:00 +0000</pubDate>
  <description><![CDATA[...]]></description>
  <category>Machine Learning</category>
  <category>Python</category>
  ...
</item>
```
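To make the escaping requirement concrete, here is a small self-contained sketch of the RSS category rendering. It uses `xml.sax.saxutils.escape` from the standard library as a stand-in for the project's `_escape_xml` helper, whose exact behavior is not shown in this document; `rss_categories` is a hypothetical name.

```python
from xml.sax.saxutils import escape


def rss_categories(tags: list[dict]) -> str:
    """Render one <category> element per tag, escaping &, < and > in the text."""
    return "".join(
        f"\n      <category>{escape(tag['display_name'])}</category>"
        for tag in tags
    )


with_tags = rss_categories([
    {"name": "machine-learning", "display_name": "Machine Learning"},
    {"name": "r-and-d", "display_name": "R&D"},
])
without_tags = rss_categories([])
```

An empty tag list yields an empty string, so untagged notes gain no `<category>` elements.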
### Phase 3: Atom 1.0 Categories
**File**: `starpunk/feeds/atom.py`
**Standard**: RFC 4287 Section 4.2.2 - The `atom:category` Element
**Format**:
```xml
<category term="machine-learning" label="Machine Learning"/>
```
- `term` (REQUIRED): Normalized tag name for machine processing
- `label` (OPTIONAL): Human-readable display name
#### Streaming Function (`generate_atom_streaming`)
Note: `generate_atom()` delegates to the streaming function, so only one change is needed.
**Location**: After the entry link element (around line 179), before content:
```python
# Add category elements for tags (v1.3.1)
if hasattr(note, 'tags') and note.tags:
    for tag in note.tags:
        yield f'    <category term="{_escape_xml(tag["name"])}" label="{_escape_xml(tag["display_name"])}"/>\n'
```
**Expected output**:
```xml
<entry>
  <id>https://example.com/note/my-post</id>
  <title>My Post</title>
  <published>2024-11-25T12:00:00Z</published>
  <updated>2024-11-25T12:00:00Z</updated>
  <link rel="alternate" type="text/html" href="https://example.com/note/my-post"/>
  <category term="machine-learning" label="Machine Learning"/>
  <category term="python" label="Python"/>
  <content type="html">...</content>
</entry>
```
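Because Atom puts tag text into attribute values, quote characters matter as well as `&` and `<`. Whether the project's `_escape_xml` escapes quotes is not stated here, so the sketch below swaps in `xml.sax.saxutils.quoteattr` from the standard library, which escapes the value and supplies the surrounding quotes itself. `atom_category` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def atom_category(tag: dict) -> str:
    """Render one <category/> line; quoteattr adds the surrounding quotes."""
    term = quoteattr(tag["name"])
    label = quoteattr(tag["display_name"])
    return f"    <category term={term} label={label}/>\n"


# Round-trip a label containing both an ampersand and double quotes.
line = atom_category({"name": "q-and-a", "display_name": 'Q&A "live"'})
parsed = ET.fromstring(line.strip())
```

Parsing the rendered line back is a cheap way to confirm the output stays well-formed for awkward display names.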
### Phase 4: JSON Feed 1.1 Tags
**File**: `starpunk/feeds/json_feed.py`
**Standard**: JSON Feed 1.1 Specification - `tags` field
**Format**:
```json
{
  "tags": ["Machine Learning", "Python"]
}
```
Per user decision: **Omit `tags` field entirely when no tags** (do not output empty array).
#### Item Builder Function (`_build_item_object`)
**Location**: After attachments section (around line 308), before `_starpunk` extension:
```python
# Add tags array (v1.3.1)
# Per spec: array of plain strings (tags, not categories)
# Omit field when no tags (user decision: no empty array)
if hasattr(note, 'tags') and note.tags:
    item["tags"] = [tag['display_name'] for tag in note.tags]
```
**Expected output** (note with tags):
```json
{
  "id": "https://example.com/note/my-post",
  "url": "https://example.com/note/my-post",
  "title": "My Post",
  "content_html": "...",
  "date_published": "2024-11-25T12:00:00Z",
  "tags": ["Machine Learning", "Python"],
  "_starpunk": {...}
}
```
**Expected output** (note without tags):
```json
{
  "id": "https://example.com/note/my-post",
  "url": "https://example.com/note/my-post",
  "title": "My Post",
  "content_html": "...",
  "date_published": "2024-11-25T12:00:00Z",
  "_starpunk": {...}
}
```
Note: No `"tags"` field at all when empty.
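The omit-when-empty rule can be sketched in isolation as below. The `add_tags` helper is hypothetical; in the real code the logic sits inline in `_build_item_object`, whose full signature is not shown in this document.

```python
import json


def add_tags(item: dict, note_tags: list[dict]) -> dict:
    """Add a "tags" array of display names, or leave the item untouched when empty."""
    if note_tags:
        item["tags"] = [tag["display_name"] for tag in note_tags]
    return item


tagged = add_tags(
    {"id": "https://example.com/note/my-post"},
    [
        {"name": "machine-learning", "display_name": "Machine Learning"},
        {"name": "python", "display_name": "Python"},
    ],
)
untagged = add_tags({"id": "https://example.com/note/my-post"}, [])

# Serialized form of the untagged item contains no "tags" key at all.
untagged_json = json.dumps(untagged)
```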
## Testing Requirements
### Unit Tests
Create test file: `tests/unit/feeds/test_feed_tags.py`
#### RSS Tests
1. `test_rss_note_with_tags_has_category_elements`
2. `test_rss_note_without_tags_has_no_category_elements`
3. `test_rss_multiple_tags_multiple_categories`
4. `test_rss_streaming_tags`
#### Atom Tests
1. `test_atom_note_with_tags_has_category_elements`
2. `test_atom_category_has_term_and_label_attributes`
3. `test_atom_note_without_tags_has_no_category_elements`
4. `test_atom_streaming_tags`
#### JSON Feed Tests
1. `test_json_note_with_tags_has_tags_array`
2. `test_json_note_without_tags_omits_tags_field`
3. `test_json_tags_array_contains_display_names`
4. `test_json_streaming_tags`
### Integration Tests
Add to existing feed integration tests:
1. `test_feed_generation_with_mixed_tagged_notes` - Mix of notes with and without tags
2. `test_feed_tags_ordering` - Tags appear in alphabetical order by display_name
### Test Data Setup
```python
# Test note with tags attached
note = Note(...)
object.__setattr__(note, 'tags', [
    {'name': 'machine-learning', 'display_name': 'Machine Learning'},
    {'name': 'python', 'display_name': 'Python'},
])

# Test note without tags
note_no_tags = Note(...)
object.__setattr__(note_no_tags, 'tags', [])
```
## Implementation Order
1. **Routes change** (`public.py`) - Load tags in `_get_cached_notes()`
2. **JSON Feed** (`json_feed.py`) - Simplest change, good for validation
3. **Atom Feed** (`atom.py`) - Single streaming function
4. **RSS Feed** (`rss.py`) - Both streaming and non-streaming functions
5. **Tests** - Unit and integration tests
## Validation Checklist
- [ ] RSS feed validates against RSS 2.0 spec
- [ ] Atom feed validates against RFC 4287
- [ ] JSON Feed validates against JSON Feed 1.1 spec
- [ ] Notes without tags produce valid feeds (no empty elements/arrays)
- [ ] Special characters in tag names are properly escaped
- [ ] Existing tests continue to pass
- [ ] Feed caching works correctly with tags
## Standards References
- [RSS 2.0 - category element](https://www.rssboard.org/rss-specification#ltcategorygtSubelementOfLtitemgt)
- [RFC 4287 Section 4.2.2 - atom:category](https://datatracker.ietf.org/doc/html/rfc4287#section-4.2.2)
- [JSON Feed 1.1 - tags](https://www.jsonfeed.org/version/1.1/)
## Files to Modify
| File | Change |
|------|--------|
| `starpunk/routes/public.py` | Add tag loading to `_get_cached_notes()` |
| `starpunk/feeds/rss.py` | Add `<category>` elements in both functions |
| `starpunk/feeds/atom.py` | Add `<category term="..." label="..."/>` elements |
| `starpunk/feeds/json_feed.py` | Add `tags` array to `_build_item_object()` |
| `tests/unit/feeds/test_feed_tags.py` | New test file |
## Summary
This is a straightforward feature addition:
- One route change to load tags
- Three feed module changes to render tags
- Follows established patterns in existing code
- No new dependencies required
- Backward compatible (tags are optional in all specs)
