StarPunk/docs/design/v1.4.0/media-implementation-design.md
Phil Skentelbery c64feaea23 feat: v1.4.0 Phase 3 - Micropub Media Endpoint
Implement W3C Micropub media endpoint for external client uploads.

Changes:
- Add POST /micropub/media endpoint in routes/micropub.py
  - Accept multipart/form-data with 'file' field
  - Require bearer token with 'create' scope
  - Return 201 Created with Location header
  - Validate, optimize, and generate variants via save_media()

- Update q=config response to advertise media-endpoint
  - Include media-endpoint URL in config response
  - Add 'photo' post-type to supported types

- Add photo property support to Micropub create
  - extract_photos() function to parse photo property
  - Handles both simple URL strings and structured objects with alt text
  - _attach_photos_to_note() function to attach photos by URL
  - Only attach photos from our server (by URL match)
  - External URLs logged but ignored (no download)
  - Maximum 4 photos per note (per ADR-057)

- SITE_URL normalization pattern
  - Use .rstrip('/') for consistent URL comparison
  - Applied in media endpoint and photo attachment

Per design document: docs/design/v1.4.0/media-implementation-design.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-10 18:32:21 -07:00


v1.4.0 "Media" Release - Implementation Design

Version: 1.4.0
Status: Design Complete
Author: StarPunk Architecture
Date: 2025-12-10


Revision History

| Date | Version | Changes |
|------|---------|---------|
| 2025-12-10 | 1.0 | Initial design document |
| 2025-12-10 | 1.1 | Post-Q&A corrections: validate_image() signature unchanged; file_size computed in save_media(); animated GIF handling; simplified variant path calculation; optimized_bytes passed to variants; file cleanup on variant failure; backwards-compatible variants key; make_response import; photo truncation warning; isDefault fallback logic; configurable about URL; test image noise generation; SITE_URL normalization pattern |

Executive Summary

This document provides the complete implementation design for v1.4.0 "Media", which adds three major features:

  1. Micropub Media Endpoint - W3C-compliant media upload via /micropub/media
  2. Large Image Support - Accept and resize images up to 50MB
  3. Enhanced Feed Media - Multiple image sizes with complete Media RSS implementation

Total Estimated Effort: 28-40 hours


Table of Contents

  1. Confirmed Decisions
  2. Phase 1: Large Image Support
  3. Phase 2: Image Variants
  4. Phase 3: Micropub Media Endpoint
  5. Phase 4: Enhanced Feed Media
  6. Phase 5: Testing & Documentation
  7. Database Schema
  8. API Specifications
  9. File Modifications Summary
  10. Test Requirements
  11. Developer Q&A

Confirmed Decisions

The following decisions have been confirmed during the design phase:

| Decision | Outcome |
|----------|---------|
| Scope flexibility | NOT locked - defer to large image support only if necessary |
| Token scope | No new scope - existing create tokens work for media uploads |
| Unused upload retention | Delete unused media after 24 hours |
| Photo property URLs | Accept URL values directly without downloading |
| Quality edge case | Reject if still >10MB after optimization |
| EXIF dimensions | Nice to have (not required) |
| Variant timing | Synchronous/eager generation on upload |
| Existing media | Only new uploads get variants |
| Thumbnail cropping | Center crop using Pillow's ImageOps.fit() |
| media:group usage | Use for size variants of same image only |
| JSON Feed extension | _starpunk namespace with about URL |
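
The 24-hour retention rule for unused uploads is not detailed further in this design. The following is a minimal sketch of how stale uploads might be selected. It assumes a hypothetical `note_media` join table and a `created_at` text column on `media`; neither name is confirmed by this document, and `find_unused_media()` is an illustrative helper, not part of the planned API.

```python
import sqlite3


def find_unused_media(db: sqlite3.Connection, max_age_hours: int = 24) -> list:
    """Return ids of media older than max_age_hours that no note references.

    Assumes (hypothetically) media(id, created_at) and note_media(media_id).
    """
    rows = db.execute(
        """
        SELECT m.id
        FROM media AS m
        LEFT JOIN note_media AS nm ON nm.media_id = m.id
        WHERE nm.media_id IS NULL
          AND m.created_at < datetime('now', ?)
        ORDER BY m.id
        """,
        (f"-{max_age_hours} hours",),  # SQLite date modifier, e.g. '-24 hours'
    ).fetchall()
    return [row[0] for row in rows]
```

A cleanup job would then delete both the files and the rows for the returned ids; with the `ON DELETE CASCADE` constraint defined in Phase 2, variant rows would follow the parent row.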

Phase 1: Large Image Support

Estimated Effort: 4-6 hours

Overview

Remove the 10MB file size rejection and implement a tiered resize strategy for large images.

Current Behavior (v1.2.0)

  • Files >10MB rejected with error
  • Files <=10MB accepted and resized if >2048px

New Behavior (v1.4.0)

  • Files up to 50MB accepted
  • Tiered resize strategy based on input size
  • Final output always <=10MB after optimization
  • Reject if optimization cannot achieve target

Tiered Resize Strategy

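| Input Size | Max Dimension | Quality | Target Output |
|------------|---------------|---------|---------------|
| <=10MB | 2048px | 95% | <=5MB typical |
| 10-25MB | 1600px | 90% | <=5MB target |
| 25-50MB | 1280px | 85% | <=5MB target |
| >50MB | Rejected | - | Error message |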

Iterative Quality Reduction

If first pass produces >10MB output:

  1. Reduce max dimension by 20%
  2. Reduce quality by 5%
  3. Repeat until <=10MB or min quality (70%) reached
  4. If still >10MB at 70% quality, reject with error
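
The optimize_image() listing in this design applies these reductions in a particular order: quality drops in steps of 5 down to the minimum, then the dimension shrinks by 20% and quality resets to 85, and the upload is rejected once the max dimension falls below 640px. The sketch below enumerates that sequence of attempts so the number of encode passes is easy to see. `reduction_schedule()` is an illustrative helper, not a function in the design.

```python
from typing import Iterator, Tuple

MIN_QUALITY = 70     # from the constants in this design
MIN_DIMENSION = 640  # safety floor used by the optimize_image() listing


def reduction_schedule(max_dim: int, quality: int) -> Iterator[Tuple[int, int]]:
    """Yield each (max_dimension, quality) pair in the order it would be tried."""
    while max_dim >= MIN_DIMENSION:
        yield max_dim, quality
        if quality > MIN_QUALITY:
            # Reduce quality first
            quality -= 5
        else:
            # At minimum quality: shrink dimensions and reset quality
            max_dim = int(max_dim * 0.8)
            quality = 85


attempts = list(reduction_schedule(2048, 95))
```

For the smallest tier (2048px at 95%), this gives a worst case of a few dozen encode passes before rejection, which matters because variant generation is synchronous.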

File Modifications

/home/phil/Projects/starpunk/starpunk/media.py

Constants to modify:

# OLD
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10MB

# NEW
MAX_FILE_SIZE = 50 * 1024 * 1024  # 50MB
MAX_OUTPUT_SIZE = 10 * 1024 * 1024  # 10MB (target after optimization)
MIN_QUALITY = 70  # Minimum JPEG quality before rejection

New function: get_optimization_params()

def get_optimization_params(file_size: int) -> Tuple[int, int]:
    """
    Determine optimization parameters based on input file size

    Args:
        file_size: Original file size in bytes

    Returns:
        Tuple of (max_dimension, quality_percent)
    """
    if file_size <= 10 * 1024 * 1024:  # <=10MB
        return (2048, 95)
    elif file_size <= 25 * 1024 * 1024:  # 10-25MB
        return (1600, 90)
    else:  # 25-50MB
        return (1280, 85)

Modified function: validate_image()

Changes:

  • Update MAX_FILE_SIZE check to 50MB
  • Add animated GIF detection and specific error message
  • NOTE: Return signature remains unchanged (3-tuple). File size is computed in save_media().

def validate_image(file_data: bytes, filename: str) -> Tuple[str, int, int]:
    """
    Validate image file

    Returns:
        Tuple of (mime_type, width, height)
    """
    file_size = len(file_data)

    # Check file size (new 50MB limit)
    if file_size > MAX_FILE_SIZE:
        raise ValueError("File too large. Maximum size is 50MB")

    # ... existing validation code ...

    # Check for animated GIF (reject if >10MB since we can't resize)
    if img.format == 'GIF':
        try:
            img.seek(1)
            # It's animated
            if file_size > MAX_OUTPUT_SIZE:
                raise ValueError(
                    "Animated GIF too large. Maximum size for animated GIFs is 10MB. "
                    "Consider using a shorter clip or lower resolution."
                )
            img.seek(0)
        except EOFError:
            # Not animated, continue normally
            pass

    return mime_type, width, height

Modified function: optimize_image()

Changes:

  • Accept original_size parameter
  • Implement tiered resize strategy
  • Add iterative quality reduction loop

def optimize_image(image_data: bytes, original_size: int = None) -> Tuple[Image.Image, int, int, bytes]:
    """
    Optimize image for web display with size-aware strategy

    Args:
        image_data: Raw image bytes
        original_size: Original file size (for tiered optimization)

    Returns:
        Tuple of (optimized_image, width, height, optimized_bytes)

    Raises:
        ValueError: If image cannot be optimized to target size
    """
    if original_size is None:
        original_size = len(image_data)

    # Get initial optimization parameters
    max_dim, quality = get_optimization_params(original_size)

    img = Image.open(io.BytesIO(image_data))
    # Capture the format now: copy() and exif_transpose() return images whose
    # .format is None, which would otherwise force every save to JPEG
    original_format = img.format
    img = ImageOps.exif_transpose(img) if original_format != 'GIF' else img

    # Iterative optimization loop
    while True:
        # Create copy for this iteration
        work_img = img.copy()

        # Resize if needed
        if max(work_img.size) > max_dim:
            work_img.thumbnail((max_dim, max_dim), Image.Resampling.LANCZOS)

        # Save to bytes to check size
        output = io.BytesIO()
        save_format = original_format or 'JPEG'
        save_kwargs = {'optimize': True}

        # Only JPEG and WebP honor the quality setting
        if save_format in ('JPEG', 'WEBP'):
            save_kwargs['quality'] = quality

        work_img.save(output, format=save_format, **save_kwargs)
        output_bytes = output.getvalue()

        # Check output size
        if len(output_bytes) <= MAX_OUTPUT_SIZE:
            width, height = work_img.size
            return work_img, width, height, output_bytes

        # Need to reduce further
        if quality > MIN_QUALITY:
            # Reduce quality first
            quality -= 5
        else:
            # Already at min quality, reduce dimensions
            max_dim = int(max_dim * 0.8)
            quality = 85  # Reset quality for new dimension

            # Safety check: minimum dimension
            if max_dim < 640:
                raise ValueError(
                    "Image cannot be optimized to target size. "
                    "Please use a smaller or lower-resolution image."
                )

Modified function: save_media()

Changes:

  • Compute file_size = len(file_data) after validation (signature unchanged)
  • Pass original size to optimize_image()
  • Use returned bytes directly instead of re-saving

def save_media(file_data: bytes, filename: str) -> Dict:
    """Save uploaded media file with size-aware optimization"""

    # Validate image (returns 3-tuple, unchanged signature)
    mime_type, orig_width, orig_height = validate_image(file_data, filename)

    # Compute file size for optimization strategy
    file_size = len(file_data)

    # Optimize image with size-aware strategy
    optimized_img, width, height, optimized_bytes = optimize_image(file_data, file_size)

    # ... generate filename and path ...

    # Write optimized bytes directly (already saved during optimization)
    full_path.write_bytes(optimized_bytes)

    # ... database insert and return ...

Error Messages

  • "File too large. Maximum size is 50MB"
  • "Image cannot be optimized to target size. Please use a smaller or lower-resolution image."
  • "Animated GIF too large. Maximum size for animated GIFs is 10MB. Consider using a shorter clip or lower resolution."

Phase 2: Image Variants

Estimated Effort: 8-12 hours

Overview

Generate multiple renditions on upload for responsive image delivery and feed optimization.

Variant Specifications

| Variant | Dimensions | Method | Use Case |
|---------|------------|--------|----------|
| thumb | 150x150 (square) | Center crop | Thumbnails, previews |
| small | 320px width | Aspect preserve | Mobile, low bandwidth |
| medium | 640px width | Aspect preserve | Standard display |
| large | 1280px width | Aspect preserve | High-res display |
| original | As uploaded (<=2048px) | From optimization | Full quality |
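
To make the table concrete, the sketch below computes which variants would be produced for a given source size and what their dimensions would be. It follows the rules in this design (thumb is always generated; width-based variants are skipped when the source is narrower than the target). `planned_variant_sizes()` is an illustrative helper and does not touch Pillow.

```python
from typing import Dict, Tuple

# Widths and thumbnail size taken from the variant table above
VARIANT_WIDTHS = {'small': 320, 'medium': 640, 'large': 1280}
THUMB_SIZE = (150, 150)


def planned_variant_sizes(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Return {variant_type: (width, height)} for a source image of the given size."""
    sizes = {'thumb': THUMB_SIZE}  # center-cropped square, always generated
    for name, target_width in VARIANT_WIDTHS.items():
        if width < target_width:
            continue  # skip variants wider than the source
        ratio = target_width / width
        sizes[name] = (target_width, int(height * ratio))
    return sizes
```

For example, a 2048x1536 source yields all four variants, while a 500x400 source yields only thumb and small.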

Storage Structure

/data/media/2025/01/
    abc123.jpg           # Original/large (from optimization)
    abc123_medium.jpg    # 640px width
    abc123_small.jpg     # 320px width
    abc123_thumb.jpg     # 150x150 center crop

Database Schema

New table: media_variants

CREATE TABLE media_variants (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    media_id INTEGER NOT NULL,
    variant_type TEXT NOT NULL,  -- 'thumb', 'small', 'medium', 'large', 'original'
    path TEXT NOT NULL,
    width INTEGER NOT NULL,
    height INTEGER NOT NULL,
    size_bytes INTEGER NOT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (media_id) REFERENCES media(id) ON DELETE CASCADE,
    UNIQUE(media_id, variant_type)
);

CREATE INDEX idx_media_variants_media ON media_variants(media_id);

File Modifications

/home/phil/Projects/starpunk/starpunk/media.py

New constants:

# Variant specifications
VARIANT_SPECS = {
    'thumb': {'size': (150, 150), 'crop': True},
    'small': {'width': 320, 'crop': False},
    'medium': {'width': 640, 'crop': False},
    'large': {'width': 1280, 'crop': False},
}

New function: generate_variant()

def generate_variant(
    img: Image.Image,
    variant_type: str,
    base_path: Path,
    base_filename: str,
    file_ext: str
) -> Dict:
    """
    Generate a single image variant

    Args:
        img: Source PIL Image
        variant_type: One of 'thumb', 'small', 'medium', 'large'
        base_path: Directory to save to
        base_filename: Base filename (UUID without extension)
        file_ext: File extension (e.g., '.jpg')

    Returns:
        Dict with variant metadata (path, width, height, size_bytes)
    """
    spec = VARIANT_SPECS[variant_type]
    work_img = img.copy()

    if spec.get('crop'):
        # Center crop for thumbnails using ImageOps.fit()
        work_img = ImageOps.fit(
            work_img,
            spec['size'],
            method=Image.Resampling.LANCZOS,
            centering=(0.5, 0.5)
        )
    else:
        # Aspect-preserving resize
        target_width = spec['width']
        if work_img.width > target_width:
            ratio = target_width / work_img.width
            new_height = int(work_img.height * ratio)
            work_img = work_img.resize(
                (target_width, new_height),
                Image.Resampling.LANCZOS
            )

    # Generate variant filename
    variant_filename = f"{base_filename}_{variant_type}{file_ext}"
    variant_path = base_path / variant_filename

    # Determine format from extension
    save_format = 'JPEG' if file_ext.lower() in ['.jpg', '.jpeg'] else file_ext[1:].upper()

    # Save with appropriate quality (work_img.format is None after copy(),
    # so key the quality setting off the target format instead)
    save_kwargs = {'optimize': True}
    if save_format in ('JPEG', 'WEBP'):
        save_kwargs['quality'] = 85

    work_img.save(variant_path, format=save_format, **save_kwargs)

    return {
        'variant_type': variant_type,
        # base_path is <media root>/<year>/<month>, so two parents up is the media root
        'path': str(variant_path.relative_to(base_path.parent.parent)),
        'width': work_img.width,
        'height': work_img.height,
        'size_bytes': variant_path.stat().st_size
    }

New function: generate_all_variants()

def generate_all_variants(
    img: Image.Image,
    base_path: Path,
    base_filename: str,
    file_ext: str,
    media_id: int,
    year: str,
    month: str,
    optimized_bytes: bytes
) -> List[Dict]:
    """
    Generate all variants for an image and store in database

    Args:
        img: Source PIL Image (the optimized original)
        base_path: Directory containing the original
        base_filename: Base filename (UUID without extension)
        file_ext: File extension
        media_id: ID of parent media record
        year: Year string (e.g., '2025') for path calculation
        month: Month string (e.g., '01') for path calculation
        optimized_bytes: Bytes of optimized original (avoids re-reading file)

    Returns:
        List of variant metadata dicts
    """
    from starpunk.database import get_db

    variants = []
    db = get_db(current_app)
    created_files = []  # Track files for cleanup on failure

    try:
        # Generate each variant type
        for variant_type in ['thumb', 'small', 'medium', 'large']:
            # Skip if image is smaller than target
            spec = VARIANT_SPECS[variant_type]
            target_width = spec.get('width') or spec['size'][0]

            if img.width < target_width and variant_type != 'thumb':
                continue  # Skip variants larger than original

            variant = generate_variant(img, variant_type, base_path, base_filename, file_ext)
            variants.append(variant)
            created_files.append(base_path / f"{base_filename}_{variant_type}{file_ext}")

            # Insert into database
            db.execute(
                """
                INSERT INTO media_variants
                (media_id, variant_type, path, width, height, size_bytes)
                VALUES (?, ?, ?, ?, ?, ?)
                """,
                (media_id, variant['variant_type'], variant['path'],
                 variant['width'], variant['height'], variant['size_bytes'])
            )

        # Also record the original as 'original' variant
        # Use explicit year/month for path calculation (avoids fragile parent traversal)
        original_path = f"{year}/{month}/{base_filename}{file_ext}"
        db.execute(
            """
            INSERT INTO media_variants
            (media_id, variant_type, path, width, height, size_bytes)
            VALUES (?, ?, ?, ?, ?, ?)
            """,
            (media_id, 'original', original_path, img.width, img.height,
             len(optimized_bytes))  # Use passed bytes instead of file I/O
        )

        db.commit()
        return variants

    except Exception:
        # Clean up any created variant files on failure
        for file_path in created_files:
            try:
                if file_path.exists():
                    file_path.unlink()
            except OSError:
                pass  # Best effort cleanup
        raise  # Re-raise the original exception

Modified function: save_media()

Add variant generation after saving original:

def save_media(file_data: bytes, filename: str) -> Dict:
    """Save uploaded media file with variants"""

    # ... existing validation and optimization ...
    # (optimized_bytes is returned from optimize_image())

    # Generate path components (year/month already computed for file path)
    year = now.strftime('%Y')
    month = now.strftime('%m')

    # Save optimized original
    full_path.write_bytes(optimized_bytes)

    # ... database insert for media table ...

    # Generate variants (synchronous)
    # Pass year, month, and optimized_bytes to avoid fragile path traversal and file I/O
    base_filename = stored_filename.rsplit('.', 1)[0]
    variants = generate_all_variants(
        optimized_img,
        full_dir,
        base_filename,
        file_ext,
        media_id,
        year,
        month,
        optimized_bytes
    )

    return {
        'id': media_id,
        # ... existing fields ...
        'variants': variants
    }

Modified function: get_note_media()

Include variants in the response, adding the key only when variants exist (for backwards compatibility):

def get_note_media(note_id: int) -> List[Dict]:
    """Get all media attached to a note with variants"""

    # ... existing query ...

    media_list = []
    for row in rows:
        media_dict = {
            # ... existing fields ...
        }

        # Fetch variants for this media
        variants = db.execute(
            """
            SELECT variant_type, path, width, height, size_bytes
            FROM media_variants
            WHERE media_id = ?
            ORDER BY
                CASE variant_type
                    WHEN 'thumb' THEN 1
                    WHEN 'small' THEN 2
                    WHEN 'medium' THEN 3
                    WHEN 'large' THEN 4
                    WHEN 'original' THEN 5
                END
            """,
            (row[0],)
        ).fetchall()

        # Only add 'variants' key if variants exist (backwards compatibility)
        # Pre-v1.4.0 media won't have variants, and consumers shouldn't
        # expect the key to be present
        if variants:
            media_dict['variants'] = {
                v[0]: {
                    'path': v[1],
                    'width': v[2],
                    'height': v[3],
                    'size_bytes': v[4]
                }
                for v in variants
            }

        media_list.append(media_dict)

    return media_list
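
Because the 'variants' key is absent for pre-v1.4.0 media, callers need a fallback. The sketch below shows one way a template helper or feed generator might choose a path. `pick_media_path()` and its fallback order are assumptions for illustration, not part of the design.

```python
from typing import Dict


def pick_media_path(media: Dict, preferred: str = 'medium') -> str:
    """Return the path of the preferred variant, falling back to the legacy path."""
    variants = media.get('variants', {})  # key is missing for legacy media
    for name in (preferred, 'large', 'medium', 'small', 'original'):
        if name in variants:
            return variants[name]['path']
    # Pre-v1.4.0 media: only the top-level path exists
    return media['path']
```

Using `.get('variants', {})` rather than indexing keeps the same code path working for both old and new media records.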

Migration File

/home/phil/Projects/starpunk/migrations/009_add_media_variants.sql

-- Migration 009: Add media variants support
-- Version: 1.4.0 Phase 2
-- Per ADR-059: Full Feed Media Standardization (Phase A)

-- Media variants table for multiple image sizes
CREATE TABLE IF NOT EXISTS media_variants (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    media_id INTEGER NOT NULL,
    variant_type TEXT NOT NULL,  -- 'thumb', 'small', 'medium', 'large', 'original'
    path TEXT NOT NULL,
    width INTEGER NOT NULL,
    height INTEGER NOT NULL,
    size_bytes INTEGER NOT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (media_id) REFERENCES media(id) ON DELETE CASCADE,
    UNIQUE(media_id, variant_type)
);

-- Index for efficient variant lookup
CREATE INDEX IF NOT EXISTS idx_media_variants_media ON media_variants(media_id);
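
The two constraints in this schema can be exercised with an in-memory database. Note that SQLite only enforces foreign keys, including ON DELETE CASCADE, when `PRAGMA foreign_keys = ON` is set on the connection; whether StarPunk's get_db() does this is not stated in this document. The `media` table below is a simplified stand-in, not the real schema.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE to fire

# Simplified stand-in for the real media table (assumption)
db.execute("CREATE TABLE media (id INTEGER PRIMARY KEY, path TEXT NOT NULL)")
db.execute("""
    CREATE TABLE media_variants (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        media_id INTEGER NOT NULL,
        variant_type TEXT NOT NULL,
        path TEXT NOT NULL,
        width INTEGER NOT NULL,
        height INTEGER NOT NULL,
        size_bytes INTEGER NOT NULL,
        created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
        FOREIGN KEY (media_id) REFERENCES media(id) ON DELETE CASCADE,
        UNIQUE(media_id, variant_type)
    )
""")

insert_sql = """
    INSERT INTO media_variants (media_id, variant_type, path, width, height, size_bytes)
    VALUES (?, ?, ?, ?, ?, ?)
"""
thumb_row = (1, 'thumb', '2025/01/abc123_thumb.jpg', 150, 150, 4096)

db.execute("INSERT INTO media (id, path) VALUES (1, '2025/01/abc123.jpg')")
db.execute(insert_sql, thumb_row)

# UNIQUE(media_id, variant_type) rejects a second 'thumb' for the same media
try:
    db.execute(insert_sql, thumb_row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the parent removes its variants via the cascade
db.execute("DELETE FROM media WHERE id = 1")
remaining = db.execute("SELECT COUNT(*) FROM media_variants").fetchone()[0]
```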

Phase 3: Micropub Media Endpoint

Estimated Effort: 6-8 hours

Overview

Implement W3C Micropub media endpoint for external client uploads.

Endpoint Specification

Endpoint: POST /micropub/media

Request:

  • Content-Type: multipart/form-data
  • Single file part named file
  • Authorization: Bearer token with create scope (no new scope needed)

Response:

  • Success: 201 Created with Location header
  • Error: JSON with error and error_description
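
From the client side, a request matching this specification could be assembled as follows. This is a sketch using only the standard library; `build_media_upload()` is a hypothetical helper, and the endpoint URL and token in the usage note are placeholders.

```python
import urllib.request
import uuid


def build_media_upload(endpoint: str, token: str, filename: str,
                       data: bytes, mime_type: str) -> urllib.request.Request:
    """Build a multipart/form-data POST with a single part named 'file'."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {mime_type}\r\n\r\n".encode(),
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen()` should, on success, produce a 201 response whose `Location` header holds the media URL to use in a later `photo` property.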

File Modifications

/home/phil/Projects/starpunk/starpunk/routes/micropub.py

New route: /micropub/media

@bp.route('/media', methods=['POST'])
def media_endpoint():
    """
    Micropub media endpoint for file uploads

    W3C Micropub Specification compliant media upload.
    Accepts multipart/form-data with single file part named 'file'.

    Returns:
        201 Created with Location header on success
        4xx/5xx error responses per OAuth 2.0 format
    """
    from starpunk.media import save_media

    # Extract and verify token
    token = extract_bearer_token(request)
    if not token:
        return error_response("unauthorized", "No access token provided", 401)

    token_info = verify_external_token(token)
    if not token_info:
        return error_response("unauthorized", "Invalid or expired access token", 401)

    # Check scope (create scope allows media upload)
    if not check_scope("create", token_info.get("scope", "")):
        return error_response(
            "insufficient_scope",
            "Token lacks create scope",
            403
        )

    # Validate content type
    content_type = request.headers.get("Content-Type", "")
    if "multipart/form-data" not in content_type:
        return error_response(
            "invalid_request",
            "Content-Type must be multipart/form-data",
            400
        )

    # Extract file
    if 'file' not in request.files:
        return error_response(
            "invalid_request",
            "No file provided. Use 'file' as the form field name.",
            400
        )

    uploaded_file = request.files['file']

    if not uploaded_file.filename:
        return error_response(
            "invalid_request",
            "No filename provided",
            400
        )

    try:
        # Read file data
        file_data = uploaded_file.read()

        # Save media (validates, optimizes, generates variants)
        media = save_media(file_data, uploaded_file.filename)

        # Build media URL (normalize SITE_URL so a trailing slash cannot produce '//')
        site_url = current_app.config.get("SITE_URL", "http://localhost:5000").rstrip('/')
        media_url = f"{site_url}/media/{media['path']}"

        # Return 201 with Location header (per W3C Micropub spec)
        response = make_response("", 201)
        response.headers["Location"] = media_url
        return response

    except ValueError as e:
        # Validation errors (file too large, invalid format, etc.)
        return error_response("invalid_request", str(e), 400)

    except Exception as e:
        current_app.logger.error(f"Media upload failed: {e}")
        return error_response("server_error", "Failed to process upload", 500)

Import additions at top of file:

from flask import Blueprint, current_app, request, make_response  # NOTE: Add make_response to existing imports
from starpunk.auth_external import verify_external_token, check_scope

/home/phil/Projects/starpunk/starpunk/micropub.py

Modified function: handle_query() - Update q=config response

def handle_query(args: dict, token_info: dict):
    """Handle Micropub query endpoints"""
    q = args.get("q")

    if q == "config":
        # Return server configuration with media endpoint
        # (normalize SITE_URL so a trailing slash cannot produce '//')
        site_url = current_app.config.get("SITE_URL", "http://localhost:5000").rstrip('/')
        config = {
            "media-endpoint": f"{site_url}/micropub/media",  # NEW: Advertise media endpoint
            "syndicate-to": [],
            "post-types": [
                {"type": "note", "name": "Note", "properties": ["content"]},
                {"type": "photo", "name": "Photo", "properties": ["photo"]}  # NEW
            ],
        }
        return jsonify(config), 200

    # ... rest of handle_query unchanged ...

New function: extract_photos()

def extract_photos(properties: dict) -> List[Dict[str, str]]:
    """
    Extract photo URLs and alt text from Micropub properties

    Handles both simple URL strings and structured photo objects with alt text.

    Args:
        properties: Normalized Micropub properties dict

    Returns:
        List of dicts with 'url' and optional 'alt' keys

    Examples:
        >>> # Simple URL
        >>> extract_photos({'photo': ['https://example.com/photo.jpg']})
        [{'url': 'https://example.com/photo.jpg', 'alt': ''}]

        >>> # With alt text
        >>> extract_photos({'photo': [{'value': 'https://example.com/photo.jpg', 'alt': 'Sunset'}]})
        [{'url': 'https://example.com/photo.jpg', 'alt': 'Sunset'}]
    """
    photos = properties.get("photo", [])
    result = []

    for photo in photos:
        if isinstance(photo, str):
            # Simple URL string
            result.append({'url': photo, 'alt': ''})
        elif isinstance(photo, dict):
            # Structured object with value and alt
            url = photo.get('value') or photo.get('url', '')
            alt = photo.get('alt', '')
            if url:
                result.append({'url': url, 'alt': alt})

    return result

Modified function: handle_create()

Add photo property handling:

def handle_create(data: dict, token_info: dict):
    """Handle Micropub create action"""

    # ... existing scope check and property extraction ...

    # Extract photos (NEW)
    photos = extract_photos(properties)

    # Create note
    try:
        note = create_note(
            content=content,
            published=True,
            created_at=published_date,
            custom_slug=custom_slug,
            tags=tags if tags else None
        )

        # Attach photos if present (NEW)
        if photos:
            _attach_photos_to_note(note.id, photos)

        # ... rest unchanged ...

New function: _attach_photos_to_note()

def _attach_photos_to_note(note_id: int, photos: List[Dict[str, str]]) -> None:
    """
    Attach photos to a note by URL

    Photos must already exist on this server (uploaded via media endpoint).
    External URLs are logged and ignored (no download, no attachment).

    Args:
        note_id: ID of the note to attach to
        photos: List of dicts with 'url' and 'alt' keys
    """
    from starpunk.database import get_db
    from starpunk.media import attach_media_to_note

    # Normalize SITE_URL by stripping trailing slash for consistent comparison
    site_url = current_app.config.get("SITE_URL", "http://localhost:5000").rstrip('/')
    db = get_db(current_app)

    media_ids = []
    captions = []

    # Log warning if photos are being truncated
    if len(photos) > 4:
        current_app.logger.warning(
            f"Micropub create received {len(photos)} photos, truncating to 4 per ADR-057"
        )

    for photo in photos[:4]:  # Max 4 photos per ADR-057
        url = photo['url']
        alt = photo.get('alt', '')

        # Check if URL is on our server
        if url.startswith(site_url) or url.startswith('/media/'):
            # Extract path from URL
            if url.startswith(site_url):
                path = url[len(site_url):]
            else:
                path = url

            # Remove leading /media/ if present
            if path.startswith('/media/'):
                path = path[7:]

            # Look up media by path
            row = db.execute(
                "SELECT id FROM media WHERE path = ?",
                (path,)
            ).fetchone()

            if row:
                media_ids.append(row[0])
                captions.append(alt)
            else:
                current_app.logger.warning(f"Photo URL not found in media: {url}")
        else:
            # External URL - log but don't fail
            current_app.logger.info(f"External photo URL ignored: {url}")

    if media_ids:
        attach_media_to_note(note_id, media_ids, captions)
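
The URL-to-path step can be isolated as a pure function, which makes the SITE_URL normalization easy to test. The sketch below is slightly stricter than the listing above: it requires a '/' after the site URL, so a look-alike host such as `https://example.com.evil.test` is not treated as local. `media_path_from_url()` is an illustrative name, not part of the design.

```python
from typing import Optional


def media_path_from_url(url: str, site_url: str) -> Optional[str]:
    """Return the media-root-relative path for a local photo URL, else None."""
    base = site_url.rstrip('/')  # same normalization as the design
    if url.startswith(base + '/'):
        path = url[len(base):]
    elif url.startswith('/media/'):
        path = url
    else:
        return None  # external URL
    if not path.startswith('/media/'):
        return None  # on this site, but not under /media/
    return path[len('/media/'):]
```

The returned value is what would be matched against the `path` column of the media table.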

Phase 4: Enhanced Feed Media

Estimated Effort: 6-8 hours

Overview

Implement the complete Media RSS specification with multiple image sizes, and enhance JSON Feed with variant information.

RSS Feed Enhancements

/home/phil/Projects/starpunk/starpunk/feeds/rss.py

Changes to generate_rss_streaming():

def generate_rss_streaming(
    site_url: str,
    site_name: str,
    site_description: str,
    notes: list[Note],
    limit: int = 50,
):
    """Generate RSS 2.0 with full Media RSS support"""

    # ... existing header generation ...

    for note in notes[:limit]:
        # ... existing item generation ...

        # Enhanced media handling with variants
        if hasattr(note, 'media') and note.media:
            for media_item in note.media:
                variants = media_item.get('variants', {})

                # Use media:group for multiple sizes of same image
                if variants:
                    item_xml += '\n      <media:group>'

                    # Determine which variant is the default (largest available)
                    # Fallback order: large -> medium -> small
                    default_variant = None
                    for fallback in ['large', 'medium', 'small']:
                        if fallback in variants:
                            default_variant = fallback
                            break

                    # Add each variant as media:content
                    for variant_type in ['large', 'medium', 'small']:
                        if variant_type in variants:
                            v = variants[variant_type]
                            media_url = f"{site_url}/media/{v['path']}"
                            is_default = 'true' if variant_type == default_variant else 'false'
                            item_xml += f'''
        <media:content url="{_escape_xml(media_url)}"
                       type="{media_item.get('mime_type', 'image/jpeg')}"
                       medium="image"
                       isDefault="{is_default}"
                       width="{v['width']}"
                       height="{v['height']}"
                       fileSize="{v['size_bytes']}"/>'''

                    item_xml += '\n      </media:group>'

                    # Add media:thumbnail
                    if 'thumb' in variants:
                        thumb = variants['thumb']
                        thumb_url = f"{site_url}/media/{thumb['path']}"
                        item_xml += f'''
      <media:thumbnail url="{_escape_xml(thumb_url)}"
                       width="{thumb['width']}"
                       height="{thumb['height']}"/>'''

                    # Add media:title for caption
                    if media_item.get('caption'):
                        item_xml += f'''
      <media:title type="plain">{_escape_xml(media_item['caption'])}</media:title>'''

                else:
                    # Fallback for media without variants (legacy)
                    media_url = f"{site_url}/media/{media_item['path']}"
                    item_xml += f'''
      <media:content url="{_escape_xml(media_url)}"
                     type="{media_item.get('mime_type', 'image/jpeg')}"
                     medium="image"
                     fileSize="{media_item.get('size', 0)}"/>'''

        # ... rest of item generation ...
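
The isDefault fallback can be expressed as a small pure function. The sketch below mirrors the large, medium, small order used in the listing above; the helper names are illustrative only.

```python
from typing import Dict, Optional

FALLBACK_ORDER = ('large', 'medium', 'small')


def pick_default_variant(variants: Dict[str, dict]) -> Optional[str]:
    """Return the largest available size variant, or None if none are present."""
    for name in FALLBACK_ORDER:
        if name in variants:
            return name
    return None


def is_default_flags(variants: Dict[str, dict]) -> Dict[str, str]:
    """Map each size variant present to its isDefault attribute value."""
    default = pick_default_variant(variants)
    return {
        name: 'true' if name == default else 'false'
        for name in FALLBACK_ORDER
        if name in variants
    }
```

One case worth checking during implementation: for a source narrower than 320px only 'thumb' and 'original' are recorded, so `pick_default_variant()` returns None and the media:group in the listing above would be emitted with no media:content children.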

JSON Feed Enhancements

/home/phil/Projects/starpunk/starpunk/feeds/json_feed.py

Changes to _build_item_object():

def _build_item_object(site_url: str, note: Note) -> Dict[str, Any]:
    """Build JSON Feed item with enhanced media support"""

    # ... existing item construction ...

    # Enhanced _starpunk extension with variants
    # about URL is configurable via STARPUNK_ABOUT_URL config, with sensible default
    about_url = current_app.config.get(
        "STARPUNK_ABOUT_URL",
        "https://github.com/yourusername/starpunk"
    )
    starpunk_ext = {
        "permalink_path": note.permalink,
        "word_count": len(note.content.split()),
        "about": about_url  # Extension info URL (configurable)
    }

    # Add media variants if present
    if hasattr(note, 'media') and note.media:
        media_variants = []

        for media_item in note.media:
            variants = media_item.get('variants', {})

            if variants:
                media_info = {
                    "caption": media_item.get('caption', ''),
                    "variants": {}
                }

                for variant_type, variant_data in variants.items():
                    media_info["variants"][variant_type] = {
                        "url": f"{site_url}/media/{variant_data['path']}",
                        "width": variant_data['width'],
                        "height": variant_data['height'],
                        "size_in_bytes": variant_data['size_bytes']
                    }

                media_variants.append(media_info)

        if media_variants:
            starpunk_ext["media_variants"] = media_variants

    item["_starpunk"] = starpunk_ext

    return item
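On the consuming side, the variant data above can be used to choose an appropriately sized image. The following is a minimal sketch, not part of StarPunk: `pick_variant` and the sample payload are hypothetical, and it assumes only the `url`, `width`, `height`, and `size_in_bytes` keys written by `_build_item_object()`.

```python
from typing import Any, Dict, Optional


def pick_variant(media_info: Dict[str, Any], max_width: int) -> Optional[Dict[str, Any]]:
    """Return the widest variant that fits max_width, else the narrowest one."""
    variants = list(media_info.get("variants", {}).values())
    if not variants:
        return None
    fitting = [v for v in variants if v["width"] <= max_width]
    if fitting:
        return max(fitting, key=lambda v: v["width"])
    # Nothing fits: fall back to the smallest variant available.
    return min(variants, key=lambda v: v["width"])


# Hypothetical entry from _starpunk.media_variants (values are illustrative).
example = {
    "caption": "Sunset",
    "variants": {
        "thumb": {"url": "https://example.com/media/2025/01/abc_thumb.jpg",
                  "width": 150, "height": 150, "size_in_bytes": 9000},
        "small": {"url": "https://example.com/media/2025/01/abc_small.jpg",
                  "width": 320, "height": 240, "size_in_bytes": 20000},
        "medium": {"url": "https://example.com/media/2025/01/abc_medium.jpg",
                   "width": 640, "height": 480, "size_in_bytes": 60000},
    },
}

chosen = pick_variant(example, max_width=700)
```

Because variants larger than the source are skipped at upload time, a consumer should not assume that every variant type is present, which is why the sketch works from whatever keys exist.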

ATOM Feed Enhancements

/home/phil/Projects/starpunk/starpunk/feeds/atom.py

Changes to generate_atom_streaming():

Add enhanced enclosure links with proper attributes:

def generate_atom_streaming(...):
    # ... existing generation ...

    for note in notes[:limit]:
        # ... existing entry generation ...

        # Enhanced media enclosures with title attribute
        if hasattr(note, 'media') and note.media:
            for item in note.media:
                media_url = f"{site_url}/media/{item['path']}"
                mime_type = item.get('mime_type', 'image/jpeg')
                size = item.get('size', 0)
                caption = item.get('caption', '')

                # Include title attribute for caption
                title_attr = f' title="{_escape_xml(caption)}"' if caption else ''

                yield f'    <link rel="enclosure" type="{_escape_xml(mime_type)}" href="{_escape_xml(media_url)}" length="{size}"{title_attr}/>\n'

Phase 5: Testing & Documentation

Estimated Effort: 4-6 hours

Test Requirements

Unit Tests

/home/phil/Projects/starpunk/tests/test_media_v140.py

"""Tests for v1.4.0 media features"""

import pytest
from io import BytesIO
from PIL import Image


class TestLargeImageSupport:
    """Tests for large image (>10MB) handling"""

    def test_accept_file_up_to_50mb(self, app, client):
        """Files up to 50MB should be accepted"""
        pass

    def test_reject_file_over_50mb(self, app, client):
        """Files over 50MB should be rejected"""
        pass

    def test_tiered_resize_10_to_25mb(self, app, client):
        """Files 10-25MB should resize to 1600px max"""
        pass

    def test_tiered_resize_25_to_50mb(self, app, client):
        """Files 25-50MB should resize to 1280px max"""
        pass

    def test_iterative_quality_reduction(self, app, client):
        """Quality should reduce iteratively if output >10MB"""
        pass

    def test_reject_if_cannot_optimize(self, app, client):
        """Reject if optimization cannot achieve target size"""
        pass


class TestImageVariants:
    """Tests for image variant generation"""

    def test_generate_all_variants(self, app, client):
        """All four variants should be generated"""
        pass

    def test_thumb_is_square_crop(self, app, client):
        """Thumbnail should be 150x150 center crop"""
        pass

    def test_small_preserves_aspect(self, app, client):
        """Small variant should preserve aspect ratio"""
        pass

    def test_variants_stored_in_database(self, app, client):
        """Variants should be recorded in media_variants table"""
        pass

    def test_get_note_media_includes_variants(self, app, client):
        """get_note_media() should include variant data"""
        pass

    def test_skip_variant_larger_than_original(self, app, client):
        """Skip generating variants larger than source"""
        pass


class TestMicropubMediaEndpoint:
    """Tests for /micropub/media endpoint"""

    def test_upload_success_returns_201(self, app, client):
        """Successful upload returns 201 with Location header"""
        pass

    def test_upload_requires_auth(self, app, client):
        """Upload without token returns 401"""
        pass

    def test_upload_requires_create_scope(self, app, client):
        """Upload without create scope returns 403"""
        pass

    def test_upload_validates_content_type(self, app, client):
        """Non-multipart requests return 400"""
        pass

    def test_upload_requires_file_field(self, app, client):
        """Missing 'file' field returns 400"""
        pass

    def test_config_query_includes_media_endpoint(self, app, client):
        """q=config should include media-endpoint URL"""
        pass


class TestPhotoProperty:
    """Tests for Micropub photo property handling"""

    def test_photo_url_string(self, app, client):
        """Simple URL string in photo property"""
        pass

    def test_photo_with_alt_text(self, app, client):
        """Photo object with value and alt"""
        pass

    def test_multiple_photos(self, app, client):
        """Multiple photos in photo array"""
        pass

    def test_max_four_photos(self, app, client):
        """Only first 4 photos should be attached"""
        pass

    def test_external_url_logged_not_failed(self, app, client):
        """External URLs should log but not fail"""
        pass


class TestFeedMediaEnhancements:
    """Tests for enhanced feed media support"""

    def test_rss_media_group(self, app, client):
        """RSS should use media:group for variants"""
        pass

    def test_rss_media_thumbnail(self, app, client):
        """RSS should include media:thumbnail"""
        pass

    def test_rss_media_title_for_caption(self, app, client):
        """RSS should include media:title for captions"""
        pass

    def test_json_feed_starpunk_variants(self, app, client):
        """JSON Feed should include variants in _starpunk"""
        pass

    def test_json_feed_about_url(self, app, client):
        """JSON Feed _starpunk should include about URL"""
        pass

    def test_atom_enclosure_title(self, app, client):
        """ATOM enclosures should have title attribute"""
        pass

Integration Tests

/home/phil/Projects/starpunk/tests/integration/test_media_workflow.py

"""Integration tests for complete media workflow"""

class TestMediaWorkflow:
    """End-to-end media upload and display"""

    def test_upload_via_micropub_display_in_feed(self, app, client):
        """Upload via /micropub/media, create note with photo, verify feed"""
        pass

    def test_large_image_complete_workflow(self, app, client):
        """Upload large image, verify resize, verify variants, verify feed"""
        pass

Documentation Updates

  1. Update /home/phil/Projects/starpunk/docs/architecture/syndication-architecture.md

    • Add Media RSS variant support
    • Document _starpunk extension format
  2. Update /home/phil/Projects/starpunk/CHANGELOG.md

    • Add v1.4.0 section with all features
  3. Update /home/phil/Projects/starpunk/docs/standards/testing-checklist.md

    • Add media upload validation steps
    • Add feed validation for Media RSS

Database Schema

Complete Migration SQL

File: /home/phil/Projects/starpunk/migrations/009_add_media_variants.sql

-- Migration 009: Add media variants support
-- Version: 1.4.0 Phase 2
-- Per ADR-059: Full Feed Media Standardization (Phase A)

-- Media variants table for multiple image sizes
-- Each uploaded image gets thumb, small, medium, large, and original variants
CREATE TABLE IF NOT EXISTS media_variants (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    media_id INTEGER NOT NULL,
    variant_type TEXT NOT NULL CHECK (variant_type IN ('thumb', 'small', 'medium', 'large', 'original')),
    path TEXT NOT NULL,           -- Relative path: YYYY/MM/uuid_variant.ext
    width INTEGER NOT NULL,
    height INTEGER NOT NULL,
    size_bytes INTEGER NOT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (media_id) REFERENCES media(id) ON DELETE CASCADE,
    UNIQUE(media_id, variant_type)
);

-- Index for efficient variant lookup by media ID
CREATE INDEX IF NOT EXISTS idx_media_variants_media ON media_variants(media_id);
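One point worth checking when applying this migration: SQLite only enforces `ON DELETE CASCADE` when foreign key support is enabled on the connection. The sketch below demonstrates this with an in-memory database. The one-column-plus-id `media` table is a stand-in for illustration, not the real schema, and how StarPunk configures its connections is not shown in this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign key enforcement is a per-connection setting and is usually off by default.
conn.execute("PRAGMA foreign_keys = ON")

# Stand-in for the existing media table (assumption: only id matters here).
conn.execute("CREATE TABLE media (id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT NOT NULL)")
conn.execute("""
CREATE TABLE IF NOT EXISTS media_variants (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    media_id INTEGER NOT NULL,
    variant_type TEXT NOT NULL CHECK (variant_type IN ('thumb', 'small', 'medium', 'large', 'original')),
    path TEXT NOT NULL,
    width INTEGER NOT NULL,
    height INTEGER NOT NULL,
    size_bytes INTEGER NOT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (media_id) REFERENCES media(id) ON DELETE CASCADE,
    UNIQUE(media_id, variant_type)
)
""")


def variant_count(connection: sqlite3.Connection) -> int:
    """Count rows currently stored in media_variants."""
    return connection.execute("SELECT COUNT(*) FROM media_variants").fetchone()[0]


media_id = conn.execute(
    "INSERT INTO media (path) VALUES (?)", ("2025/01/abc.jpg",)
).lastrowid
conn.execute(
    "INSERT INTO media_variants (media_id, variant_type, path, width, height, size_bytes) "
    "VALUES (?, 'thumb', ?, 150, 150, 9000)",
    (media_id, "2025/01/abc_thumb.jpg"),
)

before = variant_count(conn)
conn.execute("DELETE FROM media WHERE id = ?", (media_id,))
after = variant_count(conn)  # the thumb row is removed along with its parent
```

Without the `PRAGMA`, the delete would succeed but leave the variant row orphaned, so a test for the "variants cascade-deleted" criterion should run against a connection configured the same way as production.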

Schema Diagram

+----------------+       +---------------+       +------------------+
|     notes      |       |  note_media   |       |      media       |
+----------------+       +---------------+       +------------------+
| id (PK)        |<------| note_id (FK)  |       | id (PK)          |
| slug           |       | media_id (FK) |------>| filename         |
| file_path      |       | display_order |       | stored_filename  |
| published      |       | caption       |       | path             |
| created_at     |       +---------------+       | mime_type        |
| updated_at     |                               | size             |
| deleted_at     |                               | width            |
| content_hash   |                               | height           |
+----------------+                               | uploaded_at      |
                                                 +------------------+
                                                         |
                                                         | 1:N
                                                         v
                                                 +------------------+
                                                 | media_variants   |
                                                 +------------------+
                                                 | id (PK)          |
                                                 | media_id (FK)    |
                                                 | variant_type     |
                                                 | path             |
                                                 | width            |
                                                 | height           |
                                                 | size_bytes       |
                                                 | created_at       |
                                                 +------------------+

API Specifications

Micropub Media Endpoint

Endpoint: POST /micropub/media

Request:

POST /micropub/media HTTP/1.1
Host: example.com
Authorization: Bearer {token}
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary

------WebKitFormBoundary
Content-Disposition: form-data; name="file"; filename="photo.jpg"
Content-Type: image/jpeg

[binary data]
------WebKitFormBoundary--

Success Response:

HTTP/1.1 201 Created
Location: https://example.com/media/2025/01/abc123-def456.jpg

Error Responses:

| Status | Error | Description |
|--------|-------|-------------|
| 400 | invalid_request | Missing file, invalid format, file too large |
| 401 | unauthorized | No token or invalid token |
| 403 | insufficient_scope | Token lacks create scope |
| 500 | server_error | Processing failure |
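For manual testing, a client request matching the format above can be assembled with the standard library alone. This is a rough sketch: `build_media_upload`, the token value, and the sample bytes are placeholders, and the request is built but not sent.

```python
import urllib.request
import uuid


def build_media_upload(endpoint: str, token: str, filename: str,
                       data: bytes, mime_type: str) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one part named 'file'."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {mime_type}\r\n\r\n".encode(),
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_media_upload(
    "https://example.com/micropub/media",  # placeholder endpoint
    "example-token",                       # placeholder bearer token
    "photo.jpg",
    b"\xff\xd8\xff",                       # placeholder bytes, not a valid JPEG
    "image/jpeg",
)

# Sending is left out of the sketch. On success the uploaded file's URL is
# expected in the Location header of the 201 response:
# with urllib.request.urlopen(request) as response:
#     media_url = response.headers["Location"]
```

The URL from `Location` is what a client would then pass as the photo property when creating the note.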

Micropub Config Query

Request:

GET /micropub?q=config HTTP/1.1
Authorization: Bearer {token}

Response:

{
  "media-endpoint": "https://example.com/micropub/media",
  "syndicate-to": [],
  "post-types": [
    {"type": "note", "name": "Note", "properties": ["content"]},
    {"type": "photo", "name": "Photo", "properties": ["photo"]}
  ]
}

Photo Property in Micropub Create

Simple URL:

POST /micropub HTTP/1.1
Content-Type: application/x-www-form-urlencoded

h=entry&content=My+photo&photo=https://example.com/media/2025/01/abc.jpg

JSON with Alt Text:

{
  "type": ["h-entry"],
  "properties": {
    "content": ["My photo post"],
    "photo": [{
      "value": "https://example.com/media/2025/01/abc.jpg",
      "alt": "A beautiful sunset over the ocean"
    }]
  }
}
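Both request shapes can be reduced to one internal form before attachment. The sketch below illustrates that idea only; `normalize_photos` is a hypothetical helper, not the project's `extract_photos()`, and it assumes the property has already been parsed into a string, an object, or a list of either.

```python
from typing import Any, Dict, List

MAX_PHOTOS = 4  # per-note limit stated in this document (ADR-057)


def normalize_photos(photo_property: Any) -> List[Dict[str, str]]:
    """Normalize a Micropub photo property into a list of {'url', 'alt'} dicts."""
    if photo_property is None:
        return []
    if not isinstance(photo_property, list):
        photo_property = [photo_property]

    photos = []
    for entry in photo_property:
        if isinstance(entry, str):
            # Simple form: a bare URL with no alt text.
            photos.append({"url": entry, "alt": ""})
        elif isinstance(entry, dict) and entry.get("value"):
            # Structured form: {"value": url, "alt": text}.
            photos.append({"url": entry["value"], "alt": entry.get("alt", "")})
        # Any other shape is skipped rather than treated as an error.
    return photos[:MAX_PHOTOS]


photos = normalize_photos([
    "https://example.com/media/2025/01/abc.jpg",
    {"value": "https://example.com/media/2025/01/def.jpg",
     "alt": "A beautiful sunset over the ocean"},
])
```

Truncating after normalization keeps the first four usable entries, so malformed items do not count against the limit.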

File Modifications Summary

Files to Create

| File | Purpose |
|------|---------|
| /home/phil/Projects/starpunk/migrations/009_add_media_variants.sql | Database migration for variants table |
| /home/phil/Projects/starpunk/tests/test_media_v140.py | Unit tests for new features |

Files to Modify

| File | Changes |
|------|---------|
| /home/phil/Projects/starpunk/starpunk/media.py | Large image support, variant generation |
| /home/phil/Projects/starpunk/starpunk/micropub.py | Photo property extraction, config update |
| /home/phil/Projects/starpunk/starpunk/routes/micropub.py | New media endpoint route |
| /home/phil/Projects/starpunk/starpunk/feeds/rss.py | Media RSS enhancements |
| /home/phil/Projects/starpunk/starpunk/feeds/json_feed.py | Variant info in _starpunk |
| /home/phil/Projects/starpunk/starpunk/feeds/atom.py | Enclosure title attribute |
| /home/phil/Projects/starpunk/starpunk/migrations.py | Add migration 009 detection (if needed) |

Developer Q&A

General Questions

Q1: What happens to existing media files when upgrading to v1.4.0?

A: Existing media files continue to work unchanged. Variants are only generated for new uploads after upgrading. Existing media will show in feeds without variant information - feeds gracefully handle both cases.

Q2: Can I retroactively generate variants for existing media?

A: Not automatically. A management command could be added post-release if needed, but it's not in scope for v1.4.0.

Q3: How much additional storage do variants use?

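A: Roughly 1.9x the size of the original, summing the approximate shares below:

  • Original: 100%
  • Large (1280px): ~50%
  • Medium (640px): ~25%
  • Small (320px): ~12%
  • Thumb (150x150): ~3%

For a typical 500KB optimized image, expect roughly 900-950KB total with variants. Actual sizes vary with image content, and variants larger than the source are skipped.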

Large Image Support

Q4: What if a user uploads a 45MB image that still can't fit in 10MB after optimization?

A: The iterative optimization will:

  1. Try resize to 1280px at 85% quality
  2. Reduce quality to 80%, 75%, 70%
  3. If still >10MB, reduce dimensions to 1024px at 85%
  4. Continue until success or minimum (640px at 70%)
  5. If 640px at 70% still >10MB, reject with error

This handles extreme edge cases like uncompressed TIFFs converted to JPEG.
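The fallback order described above can be sketched as a nested loop over dimensions and quality levels. This illustrates the control flow only, not the `save_media()` implementation: the 800px step is an assumption (the answer names only 1280px, 1024px, and 640px), and `encode` is a hypothetical callback standing in for the actual resize-and-save step.

```python
from typing import Callable, Iterator, Optional, Sequence, Tuple

MAX_OUTPUT_BYTES = 10 * 1024 * 1024  # 10MB output ceiling from this document


def optimization_attempts(
    dimensions: Sequence[int] = (1280, 1024, 800, 640),  # 800 is an assumed step
    qualities: Sequence[int] = (85, 80, 75, 70),
) -> Iterator[Tuple[int, int]]:
    """Yield (max_dimension, quality) pairs in the order they would be tried."""
    for dimension in dimensions:
        for quality in qualities:
            yield dimension, quality


def optimize(
    encode: Callable[[int, int], bytes],
    max_bytes: int = MAX_OUTPUT_BYTES,
) -> Optional[Tuple[int, int, bytes]]:
    """Return the first attempt that fits max_bytes, or None to signal rejection."""
    for dimension, quality in optimization_attempts():
        data = encode(dimension, quality)
        if len(data) <= max_bytes:
            return dimension, quality, data
    return None


def fake_encode(dimension: int, quality: int) -> bytes:
    """Stand-in encoder whose output size grows with dimension and quality."""
    return b"x" * (dimension * quality)


result = optimize(fake_encode, max_bytes=1024 * 80)
```

Quality is exhausted at each size before the dimensions shrink, which matches the numbered steps: every quality level at 1280px is tried before dropping to 1024px at 85%.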

Q5: Is the 50MB limit configurable?

A: In v1.4.0, it's a constant. Configuration could be added later if needed.

Micropub Media Endpoint

Q6: Do I need a new token scope for media uploads?

A: No. The existing create scope is sufficient. Per the confirmed decisions, tokens with create scope can upload media.

Q7: What happens if a Micropub client sends a photo URL that doesn't exist on my server?

A: The URL is logged and ignored. The note is still created, but without that photo attached. This prevents failures when clients reference external URLs.

Q8: Can I upload multiple files in one request?

A: No. The W3C Micropub spec defines a single file per request. Upload multiple files with multiple requests, then reference all URLs in the create request's photo property.

Q9: What's the maximum number of photos per note?

A: 4 photos, per ADR-057. This matches Twitter/Mastodon limits.

Feed Enhancements

Q10: How do feed readers handle the media:group element?

A: Feed readers that understand Media RSS (such as Feedly, Inoreader, and NewsBlur) typically:

  • Use the isDefault="true" variant for display
  • Allow users to view other sizes
  • Show thumbnails in list views

Older readers ignore the namespace and fall back to the HTML in description.

Q11: What's the _starpunk.about URL for?

A: Per JSON Feed extension best practices, custom namespaces should include an about URL that documents the extension. This helps consumers understand the data format.

Q12: Will Media RSS validation pass after these changes?

A: Yes. The implementation follows the Media RSS specification at https://www.rssboard.org/media-rss. Run the W3C Feed Validator to confirm.

Implementation Order

Q13: Can phases be implemented out of order?

A: Phases 1 and 2 should be done together (variants depend on large image support changes to save_media()). Phase 3 (Micropub) can be done independently. Phase 4 (feeds) requires Phase 2 completion.

Q14: What's the minimum viable v1.4.0?

A: If time is constrained, Phase 1 (large image support) alone provides significant user value and can ship independently. Other phases can be moved to v1.4.1.

Testing

Q15: How do I test with large images without storing them in the repo?

A: Generate test images programmatically:

import io
import math

import numpy as np
from PIL import Image


def create_test_image(width, height, target_size_mb, max_attempts=5):
    """Create a noisy JPEG of at least ~90% of target_size_mb (may overshoot)."""
    target_bytes = target_size_mb * 1024 * 1024
    for _ in range(max_attempts):
        # Random noise resists JPEG compression; solid colors compress
        # extremely well and would produce a tiny file.
        noise = np.random.randint(0, 256, (height, width, 3), dtype=np.uint8)
        output = io.BytesIO()
        Image.fromarray(noise).save(output, 'JPEG', quality=95)
        data = output.getvalue()
        if len(data) >= target_bytes * 0.9:  # Within 10% of target, or larger
            return data
        # Too small: noise cannot get denser, so scale the dimensions up.
        # File size grows roughly with pixel count, hence the square root.
        scale = math.sqrt(target_bytes / len(data)) * 1.05
        width, height = int(width * scale) + 1, int(height * scale) + 1
    return data  # Best effort after max_attempts

# Example usage in tests:
# large_image = create_test_image(4000, 3000, 15)  # ~15MB test image

Q16: How do I validate Media RSS output?

A: Use the W3C Feed Validator (https://validator.w3.org/feed/) and verify:

  • No errors for media: namespace elements
  • Proper attribute validation
  • Valid XML structure
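A quick local check can catch structural mistakes before a feed reaches the validator. The sketch below is a hypothetical test helper, not a replacement for the W3C validator: it only checks that each `media:group` has at least one `media:content`, that every `media:content` has a `url`, and that at most one is marked `isDefault="true"`.

```python
import xml.etree.ElementTree as ET
from typing import List

MEDIA_NS = "http://search.yahoo.com/mrss/"  # Media RSS namespace URI


def check_media_groups(feed_xml: str) -> List[str]:
    """Return a list of problems found in media:group elements (empty if none)."""
    problems = []
    root = ET.fromstring(feed_xml)
    for group in root.iter(f"{{{MEDIA_NS}}}group"):
        contents = group.findall(f"{{{MEDIA_NS}}}content")
        if not contents:
            problems.append("media:group without media:content")
        defaults = [c for c in contents if c.get("isDefault") == "true"]
        if len(defaults) > 1:
            problems.append("media:group with more than one isDefault content")
        for content in contents:
            if not content.get("url"):
                problems.append("media:content without url")
    return problems


# Illustrative feed fragment; URLs and sizes are made up.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example</title>
    <item>
      <title>Note</title>
      <media:group>
        <media:content url="https://example.com/media/2025/01/abc_medium.jpg"
                       type="image/jpeg" medium="image" isDefault="true"
                       width="640" height="480" fileSize="60000"/>
        <media:content url="https://example.com/media/2025/01/abc_small.jpg"
                       type="image/jpeg" medium="image" isDefault="false"
                       width="320" height="240" fileSize="20000"/>
      </media:group>
    </item>
  </channel>
</rss>"""

problems = check_media_groups(SAMPLE_FEED)
```

A helper like this fits naturally in the feed unit tests, with the online validator kept as the final manual check.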

Acceptance Criteria

Phase 1: Large Image Support

  • Files up to 50MB accepted
  • Files >50MB rejected with clear error
  • Tiered resize strategy applied based on input size
  • Iterative quality reduction works for edge cases
  • Final output always <=10MB
  • All existing tests pass

Phase 2: Image Variants

  • Migration 009 creates media_variants table
  • All four variants generated on upload
  • Thumbnail is center-cropped square
  • Variants larger than source not generated
  • get_note_media() returns variant data
  • Variants cascade-deleted with parent media

Phase 3: Micropub Media Endpoint

  • POST /micropub/media accepts uploads
  • Returns 201 with Location header on success
  • Requires valid bearer token with create scope
  • q=config includes media-endpoint URL
  • Photo property attaches images to notes
  • Alt text preserved as caption

Phase 4: Enhanced Feed Media

  • RSS uses media:group for variants
  • RSS includes media:thumbnail
  • RSS includes media:title for captions
  • JSON Feed _starpunk includes variants
  • JSON Feed _starpunk includes about URL
  • ATOM enclosures have title attribute
  • All feeds validate without errors

Phase 5: Testing & Documentation

  • All new tests pass
  • Test coverage maintained >80%
  • CHANGELOG updated
  • Architecture docs updated
  • Version bumped to 1.4.0

Implementation Notes

SITE_URL Normalization

Throughout this implementation, SITE_URL should be normalized by stripping trailing slashes before use. This ensures consistent URL construction:

# Standard pattern for SITE_URL normalization
site_url = current_app.config.get("SITE_URL", "http://localhost:5000").rstrip('/')
media_url = f"{site_url}/media/{path}"

This pattern is used in:

  • _attach_photos_to_note() for URL comparison
  • Media endpoint for Location header
  • Feed generation for media URLs
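For the URL comparison case, normalization matters because a prefix check against an unnormalized `SITE_URL` can fail on a trailing slash. A minimal sketch follows; `local_media_path` is a hypothetical helper, not the code in `_attach_photos_to_note()`.

```python
from typing import Optional


def local_media_path(photo_url: str, site_url: str) -> Optional[str]:
    """Return the path under /media/ if photo_url is hosted on this site, else None."""
    # Normalize first so "https://example.com" and "https://example.com/" behave alike.
    prefix = site_url.rstrip("/") + "/media/"
    if photo_url.startswith(prefix):
        return photo_url[len(prefix):]
    return None


local = local_media_path("https://example.com/media/2025/01/abc.jpg", "https://example.com/")
external = local_media_path("https://other.example.org/photo.jpg", "https://example.com/")
```

Including the `/media/` segment in the prefix also keeps look-alike hosts such as `https://example.com.evil.org/` from matching.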

Configuration Options

| Config Key | Default | Description |
|------------|---------|-------------|
| SITE_URL | http://localhost:5000 | Base URL for the site |
| STARPUNK_ABOUT_URL | https://github.com/yourusername/starpunk | URL documenting the _starpunk JSON Feed extension |

References