From 730eb8d58c4c44deff765fcd09fa27379b78f68c Mon Sep 17 00:00:00 2001 From: Phil Skentelbery Date: Tue, 16 Dec 2025 18:57:39 -0700 Subject: [PATCH] docs: Add v1.4.2 review documents and update backlog MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Fix stale docstring in media.py (4096 -> 12000) - Add developer review document - Add architect review document - Update backlog with identified issues πŸ€– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- .../v1.4.2/2025-12-16-architect-review.md | 527 ++++++++++++++++++ .../v1.4.2/2025-12-16-developer-review.md | 360 ++++++++++++ docs/projectplan/BACKLOG.md | 107 +++- starpunk/media.py | 2 +- 4 files changed, 981 insertions(+), 15 deletions(-) create mode 100644 docs/design/v1.4.2/2025-12-16-architect-review.md create mode 100644 docs/design/v1.4.2/2025-12-16-developer-review.md diff --git a/docs/design/v1.4.2/2025-12-16-architect-review.md b/docs/design/v1.4.2/2025-12-16-architect-review.md new file mode 100644 index 0000000..f902e60 --- /dev/null +++ b/docs/design/v1.4.2/2025-12-16-architect-review.md @@ -0,0 +1,527 @@ +# Architect Review: v1.4.2 Implementation and End-to-End Media Pipeline + +**Date**: 2025-12-16 +**Architect**: Claude (StarPunk Architect Agent) +**Scope**: v1.4.2 HEIC support implementation + comprehensive media pipeline review +**Status**: Complete + +--- + +## Executive Summary + +The v1.4.2 implementation is architecturally sound and follows the established design patterns. The HEIC support was implemented cleanly with minimal code changes. However, the comprehensive end-to-end media pipeline review reveals several architectural concerns that should be addressed in future releases, ranging from security hardening to consistency improvements. + +**Overall Assessment**: Acceptable with recommendations for future improvement. + +--- + +## Part 1: v1.4.2 Implementation Review + +### 1.1 Design Decisions Assessment + +| Decision | Assessment | Notes | +|----------|------------|-------| +| D1: Convert at validation time | **Acceptable** | Keeps change minimal; conversion in `validate_image()` is logical since it normalizes input formats | +| D2: Convert to JPEG | **Acceptable** | JPEG has universal browser support; appropriate for photographic content | +| D3: Graceful degradation | **Good** | Conditional import with `HEIC_SUPPORTED` flag enables runtime flexibility | +| D4: Quality 95 for conversion | **Acceptable** | High quality preserved; subsequent optimization will reduce if needed | +| D5: Return signature change | **Acceptable** | 4-tuple return is clean; API change is internal-only | + +### 1.2 Issues Found in v1.4.2 + +#### Issue 1.2.1: Module-Level Documentation Stale (LOW) + +**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 1-12 + +**Problem**: The module docstring still states "4096x4096 max dimensions" but `MAX_DIMENSION` was updated to 12000 in v1.4.2. + +**Impact**: Documentation mismatch causes confusion; could lead to incorrect assumptions by future developers. + +**Recommendation**: Update docstring to reflect current limits: +``` +- 12000x12000 max input dimensions (v1.4.2) +``` + +#### Issue 1.2.2: Debug File Storage Without Cleanup (MEDIUM) + +**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 133-137 + +**Problem**: Failed uploads are saved to `data/debug/` directory for analysis, but there is no mechanism to clean up these files. Over time, this could consume significant disk space, especially if under attack. + +```python +debug_dir = Path(current_app.config.get('DATA_PATH', 'data')) / 'debug' +debug_dir.mkdir(parents=True, exist_ok=True) +debug_file = debug_dir / f"failed_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{filename}" +debug_file.write_bytes(file_data) +``` + +**Impact**: +- Disk space exhaustion risk +- Potential storage of malicious payloads +- No visibility into debug file accumulation + +**Recommendation**: +1. Add a configuration option to enable/disable debug file saving +2. Implement automatic cleanup (e.g., files older than 7 days) +3. Add disk space check before saving +4. Consider rate limiting debug file creation + +#### Issue 1.2.3: Filename in Debug Path Not Sanitized (MEDIUM) + +**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, line 135 + +**Problem**: The original filename is used directly in the debug file path without sanitization. + +```python +debug_file = debug_dir / f"failed_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{filename}" +``` + +**Impact**: Path traversal or special character issues could occur if filename contains malicious patterns (though `pathlib` provides some protection). + +**Recommendation**: Sanitize filename before use: +```python +safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50] +``` + +#### Issue 1.2.4: Explicit HEIC Read After Pillow Failure (LOW) + +**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 111-139 + +**Problem**: When Pillow fails to open a file, the code attempts to read it explicitly as HEIC. This is a workaround for iOS files with wrong extensions, but the error handling is complex and could be clearer. + +**Impact**: Code complexity; potential for subtle bugs in error paths. + +**Recommendation**: Consider refactoring to a cleaner pattern: +``` +1. Try Pillow standard open +2. If fails and HEIC_SUPPORTED, try explicit HEIC +3. If both fail, provide clear diagnostic +``` + +### 1.3 v1.4.2 Architecture Compliance + +| Principle | Compliance | Notes | +|-----------|------------|-------| +| Minimal code | **Yes** | 41 lines added for significant functionality | +| Standards first | **Yes** | HEIC conversion preserves IndieWeb compatibility | +| No lock-in | **Yes** | JPEG output is universal | +| Single responsibility | **Yes** | Validation handles input normalization | +| Documentation | **Partial** | Design doc complete; module docstring stale | + +--- + +## Part 2: End-to-End Media Pipeline Review + +### 2.1 Architecture Diagram + +``` + MEDIA PIPELINE ARCHITECTURE + =========================== + + USER UPLOAD FEED DELIVERY + =========== ============ + + +-------------------+ + Admin UI | | + /admin/new ---------> | validate_image | ----+ + POST multipart | | | + +-------------------+ | + | + +-------------------+ | +------------------+ + Micropub Endpoint | | | | | + /media POST --------> | save_media |<----+---->| SQLite DB | + multipart/form-data | | | - media | + +-------------------+ | - note_media | + | | - media_vars | + v +------------------+ + +-------------------+ ^ + | | | + | optimize_image | | + | (resize, compress)| | + +-------------------+ | + | | + v | + +-------------------+ | + | | | + | generate_variants |--------------------+ + | (thumb/sm/md/lg) | + +-------------------+ + | + v + +-------------------+ + | | + | FILESYSTEM | + | data/media/ | + | YYYY/MM/uuid.ext| + +-------------------+ + | + | + +----------------------------+----------------------------+ + | | | + v v v ++--------+ +--------+ +--------+ +| RSS | | ATOM | | JSON | +| /feed | | /feed | | /feed | +| .xml | | .atom | | .json | ++--------+ +--------+ +--------+ + | | | + +----------------------------+----------------------------+ + | + v + +-------------------+ + | | + | /media/ | + | Static serving | + +-------------------+ +``` + +### 2.2 Stage-by-Stage Analysis + +#### Stage 1: Upload Entry Points + +**Admin Upload (`/home/phil/Projects/starpunk/starpunk/routes/admin.py`, lines 112-136)** + +| Aspect | Assessment | Notes | +|--------|------------|-------| +| Authentication | **Good** | `@require_auth` decorator enforced | +| Input validation | **Partial** | Relies on `save_media`; no pre-check on file count | +| Error handling | **Good** | Per-file errors collected; partial success allowed | +| Content-Type check | **Missing** | No verification of `multipart/form-data` | + +**Issues**: +- No maximum file count enforced at route level (relies on downstream check) +- `request.files.getlist('media_files')` could be empty list with misleading behavior + +**Micropub Upload (`/home/phil/Projects/starpunk/starpunk/routes/micropub.py`, lines 124-202)** + +| Aspect | Assessment | Notes | +|--------|------------|-------| +| Authentication | **Good** | Token extraction and verification | +| Scope check | **Good** | Requires `create` scope | +| Content-Type check | **Good** | Explicit `multipart/form-data` verification | +| Input validation | **Good** | Single file field validated | +| Error handling | **Good** | OAuth 2.0 error format | + +**Issues**: +- None significant; well-implemented endpoint + +#### Stage 2: Validation (`validate_image`) + +**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 79-219 + +| Check | Order | Assessment | +|-------|-------|------------| +| File size | 1st | **Good** - Early rejection before processing | +| Pillow verify | 2nd | **Good** - Validates image integrity | +| HEIC fallback | 3rd | **Acceptable** - Complex but necessary | +| Format conversion | 4th | **Good** - HEIC/MPO to JPEG | +| MIME type check | 5th | **Good** - Whitelist approach | +| Dimension check | 6th | **Good** - Prevents memory issues | +| Animated GIF check | 7th | **Good** - Special handling for animations | + +**Issues Found**: + +##### Issue 2.2.1: No Content-Type Header Validation (MEDIUM) + +**Problem**: The upload content-type from the HTTP request is not validated against the detected image format. + +**Impact**: Potential for MIME type confusion attacks. A file uploaded as `image/png` could actually be a JPEG. + +**Recommendation**: Log warning when declared MIME type differs from detected format. + +##### Issue 2.2.2: Missing WebP Animation Detection (LOW) + +**Problem**: Animated GIF detection exists but animated WebP is not handled. + +**Impact**: Large animated WebP files could bypass the animated image size check. + +**Recommendation**: Add animated WebP detection similar to GIF handling. + +##### Issue 2.2.3: Pillow Decompression Bomb Protection (LOW) + +**Problem**: No explicit `Image.MAX_IMAGE_PIXELS` configuration. + +**Impact**: Pillow has default protection, but explicit configuration documents intent. + +**Recommendation**: Add explicit `Image.MAX_IMAGE_PIXELS` setting or document reliance on default. + +#### Stage 3: Optimization (`optimize_image`) + +**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 222-304 + +| Aspect | Assessment | Notes | +|--------|------------|-------| +| Tiered strategy | **Good** | Size-aware quality/dimension selection | +| EXIF handling | **Good** | Orientation correction applied | +| GIF passthrough | **Good** | Animated GIFs preserved | +| Iterative reduction | **Good** | Quality then dimension reduction | +| Safety limits | **Good** | MIN_DIMENSION prevents infinite loop | + +**Issues Found**: + +##### Issue 2.2.4: PNG Optimization Limited (LOW) + +**Problem**: PNG files only get `optimize=True` flag; no palette reduction or other optimizations. + +**Impact**: Large PNG files may not compress well. + +**Recommendation**: Consider PNG-specific optimization (pngquant integration) for future release. + +##### Issue 2.2.5: No WebP Quality Handling (LOW) + +**Problem**: WebP gets same quality treatment as JPEG but WebP quality values behave differently. + +**Impact**: WebP files may not be optimized as expected. + +**Recommendation**: Consider WebP-specific quality mapping. + +#### Stage 4: Variant Generation (`generate_all_variants`) + +**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 306-452 + +| Aspect | Assessment | Notes | +|--------|------------|-------| +| Spec definitions | **Good** | Clear variant specifications | +| Skip logic | **Good** | Smaller images skip larger variants | +| Cleanup on failure | **Good** | Created files removed on error | +| Database integration | **Good** | Variants recorded with dimensions | + +**Issues Found**: + +##### Issue 2.2.6: Transaction Not Atomic (MEDIUM) + +**Problem**: Files are created on disk before database commit. If database fails, files remain orphaned (cleanup only happens on exception within the loop). + +```python +try: + for variant_type in ['thumb', 'small', 'medium', 'large']: + variant = generate_variant(...) # File created + variants.append(variant) + created_files.append(...) + db.execute(...) # DB insert + + db.execute(...) # Original variant + db.commit() # <-- If this fails, files already exist +``` + +**Impact**: Database failure after file creation could leave orphaned files. + +**Recommendation**: Consider writing to temp location first, then moving after commit. + +##### Issue 2.2.7: Variant Path Calculation Fragile (LOW) + +**Problem**: Line 363 calculates relative path using parent traversal: +```python +'path': str(variant_path.relative_to(base_path.parent.parent)) +``` + +**Impact**: Dependent on directory structure assumptions. + +**Recommendation**: Use explicit path construction as done for original (lines 428-429). + +#### Stage 5: Storage and Serving + +**Location**: `/home/phil/Projects/starpunk/starpunk/routes/public.py`, lines 174-221 + +| Aspect | Assessment | Notes | +|--------|------------|-------| +| Path traversal protection | **Good** | Resolve and prefix check | +| Cache headers | **Good** | 1 year immutable cache | +| File serving | **Good** | Uses Flask's `send_from_directory` | + +**Issues Found**: + +##### Issue 2.2.8: Symlink Following (LOW) + +**Problem**: `resolve()` follows symlinks, which could potentially escape the media directory if symlinks exist within. + +**Impact**: Low risk since symlinks would need to be created manually. + +**Recommendation**: Add `strict=True` to resolve for Python 3.6+ or check `is_symlink()`. + +#### Stage 6: Feed Integration + +**Locations**: +- `/home/phil/Projects/starpunk/starpunk/feeds/rss.py` +- `/home/phil/Projects/starpunk/starpunk/routes/public.py` (`_get_cached_notes`) + +| Aspect | Assessment | Notes | +|--------|------------|-------| +| Media attachment | **Good** | Media loaded and attached to notes | +| URL construction | **Good** | Consistent absolute URLs | +| Media RSS | **Good** | Proper namespace and elements | +| Enclosure element | **Good** | First image for RSS 2.0 spec | +| Variant selection | **Good** | Fallback order for default variant | + +**Issues Found**: + +##### Issue 2.2.9: N+1 Query Pattern in Feed Generation (MEDIUM) + +**Problem**: In `_get_cached_notes()`, media and tags are loaded per-note in loops: + +```python +for note in notes: + media = get_note_media(note.id) # DB query per note + object.__setattr__(note, 'media', media) + tags = get_note_tags(note.id) # DB query per note +``` + +**Impact**: For 50 notes, this is 100 additional queries. Performance degrades with more notes. + +**Recommendation**: Implement batch loading: +```python +def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[Dict]]: + # Single query with WHERE note_id IN (...) +``` + +##### Issue 2.2.10: Caption Not Escaped in RSS (LOW) + +**Problem**: In RSS generation, caption is used directly in alt attribute: +```python +html_content += f'{caption}' +``` + +**Impact**: If caption contains `"` or other HTML special characters, could break markup. + +**Recommendation**: Use `html.escape()` for caption in HTML context. + +### 2.3 Security Assessment + +| Category | Status | Notes | +|----------|--------|-------| +| **Authentication** | **Good** | Admin routes protected; Micropub uses token auth | +| **Authorization** | **Good** | Scope checking on Micropub | +| **File Type Validation** | **Good** | Whitelist + Pillow verification | +| **Path Traversal** | **Good** | Protected in media serving | +| **File Size Limits** | **Good** | 50MB upload, 10MB output | +| **Dimension Limits** | **Good** | 12000px max input | +| **Filename Handling** | **Good** | UUID-based storage filenames | +| **Debug File Exposure** | **Needs Attention** | Debug files may contain malicious content | +| **DoS Protection** | **Partial** | Limits exist but no rate limiting | + +**Security Recommendations**: + +1. **Add rate limiting** on media upload endpoints (medium priority) +2. **Disable debug file saving** in production or add access controls (medium priority) +3. **Log MIME type mismatches** for security monitoring (low priority) +4. **Consider Content-Security-Policy** headers for served media (low priority) + +### 2.4 Performance Assessment + +| Operation | Assessment | Notes | +|-----------|------------|-------| +| Upload processing | **Acceptable** | In-memory processing; synchronous variant generation | +| Feed generation | **Needs Improvement** | N+1 query pattern | +| Media serving | **Good** | Static file serving with long cache | +| Caching | **Good** | Feed caching with ETag support | + +**Performance Recommendations**: + +1. **Implement batch media loading** for feeds (high priority) +2. **Consider async variant generation** for large uploads (low priority) +3. **Add database query logging** in development mode (low priority) + +--- + +## Part 3: Recommendations Summary + +### 3.1 Immediate Fixes (None Critical) + +No blocking issues found. The implementation is production-ready. + +### 3.2 Future Improvements + +#### High Priority + +| Item | Description | Effort | +|------|-------------|--------| +| N+1 Query Fix | Batch load media/tags for feeds | Small | +| Debug File Controls | Config option + cleanup mechanism | Small | + +#### Medium Priority + +| Item | Description | Effort | +|------|-------------|--------| +| Rate Limiting | Add upload rate limits | Medium | +| Caption Escaping | HTML escape in feed generation | Small | +| Filename Sanitization | Sanitize debug filenames | Small | +| Transaction Atomicity | Temp files before commit | Medium | + +#### Low Priority + +| Item | Description | Effort | +|------|-------------|--------| +| WebP Animation | Detect animated WebP | Small | +| PNG Optimization | Enhanced PNG compression | Medium | +| Decompression Bomb Config | Explicit pixel limit | Trivial | +| Symlink Handling | Stricter path resolution | Small | +| Module Docstring | Update documentation | Trivial | + +### 3.3 Technical Debt Register + +| Item | Location | Notes | +|------|----------|-------| +| Complex HEIC fallback logic | `media.py:111-139` | Works but could be cleaner | +| Variant path calculation | `media.py:363` | Uses parent traversal | +| Debug file accumulation | `media.py:133-137` | No cleanup mechanism | +| N+1 queries in feed | `public.py:68-74` | Performance impact | + +--- + +## Part 4: Architectural Observations + +### 4.1 Strengths + +1. **Clean separation of concerns**: Validation, optimization, variant generation are distinct functions +2. **Defensive programming**: Extensive try/except blocks with proper cleanup +3. **Standards compliance**: Good adherence to IndieWeb, RSS, and W3C specs +4. **Logging**: Comprehensive logging at INFO and WARNING levels +5. **Backward compatibility**: Variants are optional; pre-v1.4.0 media works + +### 4.2 Design Pattern Compliance + +| Pattern | Usage | Assessment | +|---------|-------|------------| +| Route Adapter | Feed generation | **Good** | +| Graceful Degradation | HEIC support | **Good** | +| Early Rejection | File size check | **Good** | +| Iterative Optimization | Quality reduction | **Good** | +| UUID-based Storage | Collision avoidance | **Good** | + +### 4.3 Areas for Future ADRs + +1. **Media Processing Strategy**: Consider documenting the full media pipeline as an ADR +2. **Debug/Diagnostic Data Handling**: Policy for storing failed uploads +3. **Performance Targets**: Document expected query counts and response times + +--- + +## Conclusion + +The v1.4.2 implementation successfully addresses the iPhone HEIC upload issue with minimal, clean changes. The broader media pipeline is well-architected with appropriate security controls and error handling. The main areas for improvement are: + +1. **Performance**: N+1 query pattern in feed generation should be addressed +2. **Operations**: Debug file management needs cleanup mechanism +3. **Security**: Rate limiting would harden upload endpoints + +None of these issues block the v1.4.2 release. They should be tracked in the project backlog for future releases. + +**Recommendation**: Accept v1.4.2 as implemented. Create backlog items for identified improvements. + +--- + +## Appendix: Files Reviewed + +| File | Lines | Purpose | +|------|-------|---------| +| `starpunk/media.py` | 775 | Core media handling | +| `starpunk/routes/admin.py` | 603 | Admin upload endpoint | +| `starpunk/routes/micropub.py` | 203 | Micropub media endpoint | +| `starpunk/routes/public.py` | 567 | Media serving, feeds | +| `starpunk/feeds/rss.py` | 601 | RSS feed with media | +| `migrations/007_add_media_support.sql` | 38 | Media schema | +| `migrations/009_add_media_variants.sql` | 22 | Variant schema | +| `docs/decisions/ADR-057-media-attachment-model.md` | 110 | Media architecture | +| `docs/decisions/ADR-058-image-optimization-strategy.md` | 183 | Optimization strategy | +| `docs/design/v1.4.2/heic-support-design.md` | 220 | HEIC design spec | +| `docs/design/v1.4.2/2025-12-16-implementation-report.md` | 171 | Implementation report | diff --git a/docs/design/v1.4.2/2025-12-16-developer-review.md b/docs/design/v1.4.2/2025-12-16-developer-review.md new file mode 100644 index 0000000..3cbc4e9 --- /dev/null +++ b/docs/design/v1.4.2/2025-12-16-developer-review.md @@ -0,0 +1,360 @@ +# v1.4.2 Developer Review - HEIC/MPO Support and Dimension Limit Increase + +**Date**: 2025-12-16 +**Reviewer**: Claude (Fullstack Developer Agent - Review Mode) +**Implementation Report**: `/home/phil/Projects/starpunk/docs/design/v1.4.2/2025-12-16-implementation-report.md` +**Design Document**: `/home/phil/Projects/starpunk/docs/design/v1.4.2/heic-support-design.md` + +## Executive Summary + +The v1.4.2 implementation successfully addresses iPhone photo upload issues through HEIC/MPO format support and increased dimension limits. The implementation is **production-ready** with **minor recommendations** for improvements. All core functionality works correctly, tests pass, and backward compatibility is maintained. + +**Overall Assessment**: APPROVED with recommendations for follow-up enhancements + +## Implementation Summary + +### Changes Delivered + +1. **HEIC/HEIF Support** - Automatic detection and conversion to JPEG +2. **MPO Support** - Multi-Picture Object format handling for iPhone depth/portrait photos +3. **Dimension Limit Increase** - From 4096px to 12000px (supports 48MP+ cameras) +4. **Enhanced Error Handling** - Fallback HEIC detection, debug logging, and file saving +5. **Graceful Degradation** - Works without pillow-heif library (with clear errors) + +### Files Modified + +- `starpunk/media.py` - Core media handling logic +- `starpunk/config.py` - Added MAX_CONTENT_LENGTH +- `requirements.txt` - Added pillow-heif dependency, bumped Pillow version +- `tests/test_media_upload.py` - Added HEIC test coverage +- `CHANGELOG.md` - Release notes +- `starpunk/__init__.py` - Version bump to 1.4.2 + +## Code Quality Assessment + +### Strengths + +1. **Minimal Changes**: Implementation focused on specific problem without scope creep +2. **Error Handling**: Comprehensive fallback strategy for HEIC detection +3. **Debug Support**: Failed uploads saved to debug directory with magic bytes logged +4. **Test Coverage**: 5 new HEIC-specific tests, all passing (33/33 total) +5. **Documentation**: Clear inline comments explaining iPhone-specific behavior +6. **Backward Compatibility**: No breaking changes, existing uploads unaffected + +### Issues Found + +#### Critical Issues + +**None identified** + +#### Major Issues + +**None identified** + +#### Minor Issues + +**M1: MPO Format Has No Test Coverage** + +- **Location**: `starpunk/media.py` lines 163-173 +- **Issue**: MPO conversion code exists but no tests verify it works +- **Impact**: MPO handling is untested; potential for silent failures +- **Evidence**: Grep search found zero MPO test cases in test suite +- **Recommendation**: Add MPO test case to `TestHEICSupport` class: + ```python + def test_mpo_detection_and_conversion(self): + """Test MPO file is detected and converted to JPEG""" + # MPO format is used by iPhone for depth/portrait mode photos + # Create an MPO test image (Pillow supports creating these) + # ... test implementation + ``` +- **Severity**: Minor (MPO is relatively rare format, but advertised in CHANGELOG) + +**M2: Debug File Saving Could Fill Disk** + +- **Location**: `starpunk/media.py` lines 133-137 +- **Issue**: Failed uploads saved to `data/debug/` without size limits or cleanup +- **Impact**: Repeated failed uploads could accumulate and consume disk space +- **Scenario**: Malicious user repeatedly uploads invalid 50MB files +- **Recommendation**: Add cleanup strategy: + - Limit debug directory to 100MB total + - Auto-delete files older than 7 days + - Add config option `DEBUG_SAVE_FAILED_UPLOADS` (default: false in production) +- **Severity**: Minor (requires deliberate abuse; easily monitored) + +**M3: HEIC Conversion Quality Not Configurable** + +- **Location**: `starpunk/media.py` line 157 +- **Issue**: Hardcoded `quality=95` for HEICβ†’JPEG conversion +- **Impact**: Users with different quality/size tradeoff preferences cannot adjust +- **Current Behavior**: 95% quality always used, then subsequent optimization may reduce further +- **Recommendation**: Consider adding `HEIC_CONVERSION_QUALITY` config variable +- **Severity**: Minor (95% is reasonable default; optimization handles size anyway) + +**M4: Missing Dimension Limit in Config** + +- **Location**: `starpunk/media.py` line 41 +- **Issue**: `MAX_DIMENSION = 12000` is hardcoded constant +- **Impact**: Cannot adjust limit without code change +- **Rationale**: Design doc shows this was intentional (comment says "v1.4.2 - supports modern phone cameras") +- **Recommendation**: Consider making this configurable in future versions +- **Severity**: Minor (12000px is generous; few use cases for larger) + +**M5: Incomplete Documentation of MPO Format** + +- **Location**: `starpunk/media.py` lines 163-165 +- **Issue**: Comment says "extract primary image" but doesn't explain multi-frame nature +- **Missing Context**: MPO files contain multiple images (left/right for 3D, or depth map) +- **Recommendation**: Enhance comment: + ```python + # MPO (Multi-Picture Object) conversion (v1.4.2) + # MPO is used by iPhones for depth/portrait photos - contains multiple frames + # (e.g., original + depth map). We extract only the primary/first frame as JPEG. + # Depth information is discarded, which is acceptable for web display. + ``` +- **Severity**: Minor (code works correctly; just documentation clarity) + +## Security Review + +### Findings + +**S1: Debug File Naming Could Leak Timing Information** + +- **Location**: `starpunk/media.py` line 135 +- **Issue**: Filename includes timestamp with second precision +- **Pattern**: `failed_20251216_174532_photo.heic` +- **Concern**: Timing could reveal upload attempt patterns +- **Recommendation**: Use UUID instead of timestamp, or truncate to date only +- **Severity**: Very Low (minimal actual risk; defense in depth) + +**S2: Increased Attack Surface from pillow-heif** + +- **Location**: `requirements.txt` line 38 +- **Issue**: New dependency increases vulnerability surface area +- **Mitigation**: Version is pinned (`pillow-heif==0.18.*`) +- **Recommendation**: + - Monitor pillow-heif security advisories + - Document in security policy that HEIC support requires this dependency + - Consider making pillow-heif optional in future (already has graceful degradation) +- **Severity**: Low (acceptable tradeoff for iPhone support) + +**S3: No Magic Bytes Validation for HEIC** + +- **Location**: `starpunk/media.py` lines 115-121 +- **Issue**: Fallback HEIC detection uses `pillow_heif.read_heif()` without verifying magic bytes first +- **Risk**: Could attempt to parse arbitrary binary data as HEIC +- **Current Mitigation**: Exception handling catches parse failures +- **Recommendation**: Add magic bytes check before calling `read_heif()`: + ```python + # Check for HEIC/HEIF magic bytes (ftyp followed by heic/heix/hevc/etc) + if file_data[4:8] == b'ftyp' and file_data[8:12] in [b'heic', b'heix', b'hevc', b'heim']: + # Proceed with read_heif() + ``` +- **Severity**: Very Low (existing exception handling prevents exploitation) + +### Security Assessment: ACCEPTABLE + +No critical or major security issues identified. The increased attack surface from pillow-heif is an acceptable tradeoff for iPhone compatibility. + +## Performance Analysis + +### Observations + +1. **HEIC Conversion Overhead**: Implementation report notes ~100-300ms per image +2. **In-Memory Processing**: Uses BytesIO (no temp files) - good for security and performance +3. **Double Processing**: HEIC files processed twice (conversion + optimization) + - Could be optimized by combining steps, but current approach is cleaner +4. **Large File Handling**: 50MB files through HEIC conversion could spike memory + - Current implementation acceptable (server-grade machines) + +### Performance Assessment: ACCEPTABLE + +No significant performance concerns. The double-processing of HEIC files is a reasonable tradeoff for code simplicity and maintainability. + +## Test Coverage Analysis + +### Coverage Breakdown + +**HEIC Tests (5 cases):** +- βœ… Basic detection and conversion +- βœ… RGBA mode handling +- βœ… Dimension preservation +- βœ… Error without library +- βœ… Full upload flow + +**Missing Tests:** +- ❌ MPO format handling (see M1) +- ❌ Large HEIC file (40MB+) conversion +- ❌ Corrupted HEIC file handling +- ❌ Debug file saving behavior +- ❌ Fallback HEIC detection path (lines 115-121) + +### Test Coverage Assessment: GOOD + +HEIC support is well-tested. MPO support is the main gap. Recommend adding tests for: +1. MPO format (priority: high) +2. Fallback HEIC detection path (priority: medium) +3. Debug file saving (priority: low) + +## Edge Cases Review + +### Handled Correctly + +1. βœ… HEIC with alpha channel β†’ RGB conversion +2. βœ… HEIC with P mode β†’ RGB conversion +3. βœ… Missing pillow-heif library β†’ Clear error +4. βœ… HEIC files >50MB β†’ Rejected at validation +5. βœ… iOS mislabeled files (HEIC with .jpeg extension) β†’ Fallback detection + +### Not Handled / Unclear + +1. **Animated HEIC sequences**: Do they exist? If so, behavior undefined + - Recommendation: Document that only first frame is used (like MPO) +2. **HEIC with EXIF orientation**: Does conversion preserve EXIF? + - Code re-opens image after conversion, so EXIF lost + - Subsequent `optimize_image()` applies EXIF orientation, so this is OK +3. **Very large dimension HEIC (8000x12000)**: Will conversion succeed? + - Should work (within 12000px limit), but not explicitly tested + +## Documentation Review + +### CHANGELOG.md + +**Assessment**: GOOD + +- Clear feature descriptions +- Mentions both HEIC and MPO +- Documents dimension limit increase +- Lists dependency changes + +**Suggestion**: Add upgrade notes section: +```markdown +### Upgrade Notes +- Run `pip install -r requirements.txt` to install pillow-heif +- No configuration changes required +- Existing media uploads unaffected +``` + +### Inline Comments + +**Assessment**: GOOD + +- Clear version markers (v1.4.2) +- Explains WHY (browsers can't display HEIC, iOS uses MPO for depth) +- Links to relevant concepts (portrait mode, depth maps) + +### Implementation Report + +**Assessment**: EXCELLENT + +- Comprehensive technical decisions documented +- Clear before/after code examples +- Deployment notes included +- Performance considerations noted + +## Technical Debt Assessment + +### New Debt Introduced + +1. **Fallback HEIC Detection Complexity**: Lines 111-142 are complex nested exception handling + - Could be refactored into separate function for clarity + - Not urgent; works correctly +2. **Debug File Accumulation**: See M2 above +3. **Hardcoded Constants**: See M3 and M4 above + +### Debt Assessment: LOW + +Minimal new technical debt. The complex error handling is justified by the real-world issue it solves (iOS mislabeling files). + +## Dependencies Review + +### New Dependencies + +**pillow-heif==0.18.*** +- Purpose: HEIC/HEIF format support +- License: BSD-3-Clause (compatible) +- Maintenance: Active (last release 2024) +- Security: No known CVEs as of 2025-12-16 + +**Pillow 10.0.* β†’ 10.1.*** +- Reason: Required by pillow-heif +- Risk: Minor version bump +- Mitigation: Full test suite passed (879 tests) + +### Dependencies Assessment: ACCEPTABLE + +Dependencies are well-maintained, properly licensed, and version-pinned. + +## Recommendations + +### Immediate (Before Production Deployment) + +**None** - Implementation is production-ready as-is + +### Short-Term (Next Patch/Minor Release) + +1. **Add MPO test coverage** (M1) - Priority: HIGH + - Advertised in CHANGELOG but untested + - Create test case in `TestHEICSupport` class + +2. **Add debug file cleanup** (M2) - Priority: MEDIUM + - Prevent disk exhaustion from failed uploads + - Add size limit and age-based cleanup + +3. **Document edge cases in code** (M5) - Priority: LOW + - Enhance MPO comment to explain multi-frame nature + - Add comment about animated HEIC handling (if applicable) + +### Long-Term (Future Versions) + +1. **Make HEIC quality configurable** (M3) +2. **Make MAX_DIMENSION configurable** (M4) +3. **Add magic bytes validation** (S3) +4. **Refactor fallback detection** (Technical Debt #1) +5. **Consider WebP conversion target** (Better compression than JPEG) + +## Deployment Checklist + +- βœ… Dependencies updated in requirements.txt +- βœ… Version number incremented +- βœ… CHANGELOG.md updated +- βœ… Tests pass (33/33 media tests, 879/879 total) +- βœ… Backward compatibility maintained +- βœ… No database migrations required +- βœ… No configuration changes required +- ⚠️ MPO format untested (see M1) + +## Conclusion + +The v1.4.2 implementation successfully solves the iPhone photo upload problem with a clean, well-documented solution. The code quality is high, security concerns are minimal, and backward compatibility is maintained. + +**Recommendation**: **APPROVE for production deployment** + +The identified minor issues (particularly MPO test coverage) should be addressed in a follow-up patch, but do not block this release. The implementation solves a real user problem (iPhone uploads failing) and does so safely. + +### Confidence Level + +- **Code Quality**: HIGH (clean implementation, follows project standards) +- **Security**: MEDIUM-HIGH (acceptable risk from new dependency) +- **Test Coverage**: MEDIUM (HEIC well-tested, MPO untested) +- **Documentation**: HIGH (clear comments and design docs) +- **Overall**: HIGH (ready for production use) + +## Follow-Up Actions + +### For Developer + +1. Consider adding MPO test case before next release +2. Monitor pillow-heif for security updates +3. Add debug directory cleanup in v1.4.3 or v1.5.0 + +### For Architect + +1. Review recommendation to make MAX_DIMENSION configurable +2. Decide if WebP conversion should replace JPEG for HEIC (better compression) +3. Consider documenting animated HEIC handling policy + +--- + +**Review completed**: 2025-12-16 +**Reviewed by**: Claude (Fullstack Developer Agent) +**Status**: Implementation APPROVED with minor recommendations diff --git a/docs/projectplan/BACKLOG.md b/docs/projectplan/BACKLOG.md index 2ad262f..1b60f50 100644 --- a/docs/projectplan/BACKLOG.md +++ b/docs/projectplan/BACKLOG.md @@ -1,9 +1,20 @@ # StarPunk Backlog -**Last Updated**: 2025-12-10 +**Last Updated**: 2025-12-16 ## Recently Completed +### v1.4.2 - HEIC/MPO Support and Dimension Limit Increase (Complete) +- HEIC/HEIF format detection and conversion to JPEG +- MPO (Multi-Picture Object) format support for iPhone depth photos +- MAX_DIMENSION increased from 4096 to 12000 pixels +- Enhanced debug logging for failed uploads + +### v1.4.0/v1.4.1 - Media Variants (Complete) +- Image variant generation (thumb, small, medium, large) +- Micropub media endpoint +- Enhanced feed media support with Media RSS + ### v1.3.0 - Microformats2 Compliance and Tags (Complete) - Tag/Category system with database schema - p-category microformats2 markup @@ -29,31 +40,63 @@ ## High -### Enhanced Feed Media Support *(Scheduled: v1.4.0)* -- Multiple image sizes/thumbnails (150px, 320px, 640px, 1280px) -- Full Media RSS implementation (media:group, all attributes) -- Enhanced JSON Feed attachments -- ATOM enclosure links for all media -- See: ADR-059 +### MPO Format Test Coverage +- **Description**: MPO conversion code exists but has no test coverage. MPO is advertised in the CHANGELOG but the handling is untested. +- **Location**: `starpunk/media.py` lines 163-173 +- **Source**: Developer Review (M1) +- **Approach**: Add `test_mpo_detection_and_conversion()` to `TestHEICSupport` class in `tests/test_media_upload.py`. Create an MPO test image using Pillow's MPO support. -### POSSE +### POSSE - Native syndication to social networks - Supported networks: - - First iteration: + - First iteration: - Mastodon (and compatible services) - Bluesky - - Second iteration - - TBD + - Second iteration + - TBD - Solution should include a configuration UI for setup --- ## Medium -### Default slug change -- The default slug should be a date time stamp +### Debug File Storage Without Cleanup Mechanism +- **Description**: Failed uploads are saved to `data/debug/` directory for analysis, but there is no mechanism to clean up these files. This could consume significant disk space, especially if under attack. +- **Location**: `starpunk/media.py` lines 133-137 +- **Source**: Developer Review (M2), Architect Review (Issue 1.2.2) +- **Approach**: + 1. Add `DEBUG_SAVE_FAILED_UPLOADS` config option (default: false in production) + 2. Implement automatic cleanup (files older than 7 days) + 3. Add disk space check or size limit (e.g., 100MB max) + +### Filename Not Sanitized in Debug Path (Security) +- **Description**: The original filename is used directly in the debug file path without sanitization, which could cause path traversal or special character issues. +- **Location**: `starpunk/media.py` line 135 +- **Source**: Architect Review (Issue 1.2.3) +- **Approach**: Sanitize filename before use: `safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]` + +### N+1 Query Pattern in Feed Generation +- **Description**: In `_get_cached_notes()`, media and tags are loaded per-note in separate queries. For 50 notes, this is 100 additional database queries, degrading performance. +- **Location**: `starpunk/routes/public.py` lines 68-74 +- **Source**: Architect Review (Issue 2.2.9) +- **Approach**: Implement batch loading function `get_media_for_notes(note_ids: List[int])` using a single query with `WHERE note_id IN (...)`. + +### Transaction Not Atomic in Variant Generation +- **Description**: Files are written to disk before database commit. If the database commit fails, orphaned files remain on disk. +- **Location**: `starpunk/media.py` lines 404-440 +- **Source**: Architect Review (Issue 2.2.6) +- **Approach**: Write variant files to a temporary location first, then move to final location after successful database commit. + +### Rate Limiting on Upload Endpoints +- **Description**: No rate limiting exists on media upload endpoints, making them vulnerable to abuse. +- **Location**: `/admin/new` (admin.py), `/media` (micropub.py) +- **Source**: Architect Review (Security Assessment) +- **Approach**: Implement Flask-Limiter or similar rate limiting middleware for upload routes. + +### Default Slug Change +- The default slug should be a date time stamp - YYYYMMDDHHMMSS -- Edge case, if the slug would somehow be a duplicate append a "-x" e.g. -1 +- Edge case, if the slug would somehow be a duplicate append a "-x" e.g. -1 ### Tag Enhancements (v1.3.0 Follow-up) - Tag pagination on archive pages (when note count exceeds threshold) @@ -105,6 +148,42 @@ ## Low +### HEIC/MPO Conversion Quality Not Configurable +- **Description**: HEIC and MPO to JPEG conversion uses hardcoded `quality=95`. Users with different quality/size tradeoff preferences cannot adjust this. +- **Location**: `starpunk/media.py` line 157 +- **Source**: Developer Review (M3) +- **Approach**: Add `HEIC_CONVERSION_QUALITY` config variable with 95 as default. + +### MAX_DIMENSION Not Configurable +- **Description**: `MAX_DIMENSION = 12000` is a hardcoded constant. Cannot adjust limit without code change. +- **Location**: `starpunk/media.py` line 41 +- **Source**: Developer Review (M4) +- **Approach**: Make configurable via config variable, keeping 12000 as default. + +### Animated WebP Not Detected +- **Description**: Animated GIF detection exists but animated WebP is not handled, potentially bypassing animated image size checks. +- **Location**: `starpunk/media.py` (validate_image function) +- **Source**: Architect Review (Issue 2.2.2) +- **Approach**: Add animated WebP detection similar to existing GIF handling. + +### Caption Not Escaped in RSS HTML +- **Description**: In RSS generation, caption is used directly in alt attribute without HTML escaping. Could break markup if caption contains `"` or other special characters. +- **Location**: `starpunk/feeds/rss.py` line 136 +- **Source**: Architect Review (Issue 2.2.10) +- **Approach**: Use `html.escape()` for caption when generating HTML content. + +### Incomplete MPO Documentation +- **Description**: Code comment says "extract primary image" but doesn't explain the multi-frame nature of MPO files (contain multiple images for 3D or depth maps). +- **Location**: `starpunk/media.py` lines 163-165 +- **Source**: Developer Review (M5) +- **Approach**: Enhance inline comment to document that MPO files contain multiple frames and only the primary frame is extracted. + +### Module Docstring Stale +- **Description**: Module docstring still states "4096x4096 max dimensions" but MAX_DIMENSION was updated to 12000 in v1.4.2. +- **Location**: `starpunk/media.py` lines 1-12 +- **Source**: Architect Review (Issue 1.2.1) +- **Approach**: Update docstring to reflect current 12000px limit. + ### Flaky Migration Race Condition Tests - Improve `test_migration_race_condition.py::TestGraduatedLogging::test_debug_level_for_early_retries` - Test expects DEBUG retry messages but passes when migration succeeds without retries diff --git a/starpunk/media.py b/starpunk/media.py index dd5db7b..6b5bd63 100644 --- a/starpunk/media.py +++ b/starpunk/media.py @@ -7,7 +7,7 @@ Per ADR-057 and ADR-058: - 50MB max upload, 10MB max output (v1.4.0) - Image variants: thumb, small, medium, large (v1.4.0) - Tiered resize strategy based on input size (v1.4.0) -- 4096x4096 max dimensions +- 12000x12000 max dimensions (v1.4.2) - 4 images max per note """