Compare commits
8 Commits
v1.4.2-rc.
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
| 9dcc5c5710 | |||
| 7be2fb0f62 | |||
| 730eb8d58c | |||
| 6682339a86 | |||
| d416463242 | |||
| 25b8cbd79d | |||
| 042505d5a6 | |||
| 72f3d4a55e |
@@ -12,9 +12,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
||||
### Added
|
||||
|
||||
- HEIC/HEIF image format support for iPhone photo uploads
|
||||
- Automatic HEIC to JPEG conversion (browsers cannot display HEIC)
|
||||
- MPO (Multi-Picture Object) format support for iPhone depth/portrait photos
|
||||
- Automatic HEIC/MPO to JPEG conversion (browsers cannot display these formats)
|
||||
- Graceful error handling if pillow-heif library not installed
|
||||
|
||||
### Changed
|
||||
|
||||
- Increased maximum input image dimensions from 4096x4096 to 12000x12000 to support modern phone cameras (48MP+); images are still optimized to smaller sizes for web delivery
|
||||
|
||||
### Dependencies
|
||||
|
||||
- Added `pillow-heif` for HEIC image processing
|
||||
|
||||
197
docs/decisions/ADR-062-timestamp-based-slug-format.md
Normal file
197
docs/decisions/ADR-062-timestamp-based-slug-format.md
Normal file
@@ -0,0 +1,197 @@
|
||||
# ADR-062: Timestamp-Based Slug Format
|
||||
|
||||
## Status
|
||||
Accepted (Supersedes ADR-007)
|
||||
|
||||
## Context
|
||||
|
||||
ADR-007 established a content-based slug generation algorithm that extracts the first 5 words from note content to create URL slugs. While this approach provides readable, SEO-friendly URLs, it has drawbacks for a personal note-taking system:
|
||||
|
||||
1. **Privacy Concerns**: The slug reveals note content in the URL. Private thoughts or draft content become visible in URLs that may be shared or logged.
|
||||
|
||||
2. **SEO Irrelevance**: StarPunk is designed for personal IndieWeb notes, not public blogs. Notes are typically short-form content (similar to tweets or status updates) where SEO optimization provides no meaningful benefit.
|
||||
|
||||
3. **Unpredictable Slugs**: Users cannot predict what slug will be generated without examining their content carefully.
|
||||
|
||||
4. **Edge Case Handling**: Content-based slugs require complex fallback logic for short content, unicode-only content, or special characters.
|
||||
|
||||
5. **Collision Complexity**: Similar notes require random suffix generation (e.g., `hello-world-a7c9`), which undermines the readability goal.
|
||||
|
||||
The user has explicitly stated: "Notes don't need SEO."
|
||||
|
||||
## Decision
|
||||
|
||||
Change the default slug format from content-based to timestamp-based:
|
||||
|
||||
**New Default Format**: `YYYYMMDDHHMMSS`
|
||||
- Example: `20251216143052`
|
||||
- Compact, sortable, predictable
|
||||
- 14 characters total
|
||||
|
||||
**Collision Handling**: Sequential numeric suffix
|
||||
- First collision: `20251216143052-1`
|
||||
- Second collision: `20251216143052-2`
|
||||
- Simple, predictable, no randomness
|
||||
|
||||
**Custom Slugs**: Continue to support user-specified slugs via `mp-slug` Micropub property and the web UI custom slug field. When provided, custom slugs take precedence.
|
||||
|
||||
### Algorithm Specification
|
||||
|
||||
```python
|
||||
def generate_slug(custom_slug: str = None, created_at: datetime = None) -> str:
|
||||
"""Generate a URL-safe slug for a note.
|
||||
|
||||
Args:
|
||||
custom_slug: User-provided custom slug (takes precedence if provided)
|
||||
created_at: Note creation timestamp (defaults to now)
|
||||
|
||||
Returns:
|
||||
URL-safe slug string
|
||||
"""
|
||||
if custom_slug:
|
||||
return sanitize_slug(custom_slug)
|
||||
|
||||
# Default: timestamp-based
|
||||
timestamp = (created_at or datetime.now()).strftime("%Y%m%d%H%M%S")
|
||||
return ensure_unique_slug(timestamp)
|
||||
|
||||
def ensure_unique_slug(base_slug: str) -> str:
|
||||
"""Ensure slug is unique, adding numeric suffix if needed."""
|
||||
if not slug_exists(base_slug):
|
||||
return base_slug
|
||||
|
||||
suffix = 1
|
||||
while slug_exists(f"{base_slug}-{suffix}"):
|
||||
suffix += 1
|
||||
return f"{base_slug}-{suffix}"
|
||||
```
|
||||
|
||||
### Examples
|
||||
|
||||
| Scenario | Generated Slug |
|
||||
|----------|----------------|
|
||||
| Normal note at 2:30:52 PM on Dec 16, 2025 | `20251216143052` |
|
||||
| Second note in same second | `20251216143052-1` |
|
||||
| Third note in same second | `20251216143052-2` |
|
||||
| Note with custom slug "my-custom-slug" | `my-custom-slug` |
|
||||
|
||||
## Rationale
|
||||
|
||||
### Timestamp Format (Score: 9/10)
|
||||
|
||||
**Pros**:
|
||||
- **Privacy**: URL reveals nothing about content
|
||||
- **Predictability**: User knows exactly what slug format to expect
|
||||
- **Sortability**: Chronological sorting by URL is possible
|
||||
- **Simplicity**: No complex word extraction or normalization
|
||||
- **Collision Rarity**: Same-second creation is rare; handled cleanly when it occurs
|
||||
- **Compact**: 14 characters vs potentially 50+ for content-based
|
||||
|
||||
**Cons**:
|
||||
- **Not Memorable**: `20251216143052` is harder to remember than `hello-world`
|
||||
- **No SEO Value**: Search engines prefer descriptive URLs
|
||||
|
||||
**Why Cons Don't Matter**:
|
||||
- StarPunk is for personal notes, not public SEO-optimized content
|
||||
- Notes are accessed via UI, feeds, or bookmarks, not memorized URLs
|
||||
- Users wanting memorable URLs can use custom slugs
|
||||
|
||||
### Sequential Suffix (Score: 10/10)
|
||||
|
||||
**Pros**:
|
||||
- **Deterministic**: No randomness; same collision always gets same suffix
|
||||
- **Simple**: No cryptographic random generation needed
|
||||
- **Readable**: `-1`, `-2` are clear and obvious
|
||||
- **Debuggable**: Easy to understand collision resolution
|
||||
|
||||
**Cons**:
|
||||
- **Enumerable**: Sequential numbers could be probed
|
||||
- Not a real security concern for note slugs
|
||||
|
||||
### Comparison with ADR-007 Approach
|
||||
|
||||
| Aspect | ADR-007 (Content-Based) | ADR-062 (Timestamp) |
|
||||
|--------|------------------------|---------------------|
|
||||
| Privacy | Reveals content | Reveals only time |
|
||||
| Complexity | High (word extraction, normalization, unicode handling) | Low (strftime) |
|
||||
| SEO | Good | None |
|
||||
| Predictability | Low | High |
|
||||
| Collision handling | Random suffix | Sequential suffix |
|
||||
| Fallback cases | Many (short content, unicode, etc.) | None |
|
||||
| Code lines | ~50 | ~15 |
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **Simplified Code**: Remove complex word extraction, unicode normalization, and multiple fallback paths
|
||||
2. **Better Privacy**: Note content never appears in URLs
|
||||
3. **Predictable Output**: Users always know what slug format to expect
|
||||
4. **Fewer Edge Cases**: No special handling for short content, unicode, or special characters
|
||||
5. **Cleaner Collisions**: Sequential suffixes are more intuitive than random strings
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Migration**: Existing notes keep their content-based slugs (no migration needed)
|
||||
2. **Not Human-Readable**: URLs don't describe content
|
||||
3. **No SEO**: Search engines won't benefit from descriptive URLs
|
||||
|
||||
### Mitigations
|
||||
|
||||
**Human-Readable URLs**:
|
||||
- Users wanting descriptive URLs can use custom slugs via `mp-slug`
|
||||
- The web UI custom slug field remains available
|
||||
- This is opt-in rather than default
|
||||
|
||||
**SEO**:
|
||||
- IndieWeb notes are typically not SEO targets
|
||||
- Content is in the page body where search engines can index it
|
||||
- Microformats2 markup provides semantic meaning
|
||||
|
||||
## Backward Compatibility
|
||||
|
||||
- Existing notes retain their slugs (no data migration)
|
||||
- New notes use timestamp format by default
|
||||
- Custom slug functionality unchanged
|
||||
- All existing URLs remain valid
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
- Test default slug generation produces timestamp format
|
||||
- Test collision handling with sequential suffixes
|
||||
- Test custom slugs still take precedence
|
||||
- Test edge case: multiple notes in same second
|
||||
- Test reserved slug rejection still works
|
||||
- Verify existing tests for custom slugs pass
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
Changes required in:
|
||||
- `starpunk/slug_utils.py`: Update `generate_slug()` function
|
||||
- `starpunk/notes.py`: Remove content parameter from slug generation call
|
||||
- Tests: Update expected slug formats
|
||||
|
||||
Estimated effort: Small (1-2 hours implementation, 1 hour testing)
|
||||
|
||||
## References
|
||||
|
||||
- ADR-007: Slug Generation Algorithm (Superseded by this ADR)
|
||||
- ADR-035: Custom Slugs (Unchanged; complements this decision)
|
||||
- IndieWeb Permalink Best Practices: https://indieweb.org/permalink
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Default slugs use `YYYYMMDDHHMMSS` format
|
||||
- [ ] Collision handling uses sequential suffix (`-1`, `-2`, etc.)
|
||||
- [ ] Custom slugs via `mp-slug` continue to work
|
||||
- [ ] Custom slugs via web UI continue to work
|
||||
- [ ] Reserved slug validation unchanged
|
||||
- [ ] Existing notes unaffected
|
||||
- [ ] All tests pass
|
||||
- [ ] Code complexity reduced
|
||||
|
||||
---
|
||||
|
||||
**Approved**: 2025-12-16
|
||||
**Architect**: StarPunk Architect Agent
|
||||
**Supersedes**: ADR-007 (Slug Generation Algorithm)
|
||||
527
docs/design/v1.4.2/2025-12-16-architect-review.md
Normal file
527
docs/design/v1.4.2/2025-12-16-architect-review.md
Normal file
@@ -0,0 +1,527 @@
|
||||
# Architect Review: v1.4.2 Implementation and End-to-End Media Pipeline
|
||||
|
||||
**Date**: 2025-12-16
|
||||
**Architect**: Claude (StarPunk Architect Agent)
|
||||
**Scope**: v1.4.2 HEIC support implementation + comprehensive media pipeline review
|
||||
**Status**: Complete
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
The v1.4.2 implementation is architecturally sound and follows the established design patterns. The HEIC support was implemented cleanly with minimal code changes. However, the comprehensive end-to-end media pipeline review reveals several architectural concerns that should be addressed in future releases, ranging from security hardening to consistency improvements.
|
||||
|
||||
**Overall Assessment**: Acceptable with recommendations for future improvement.
|
||||
|
||||
---
|
||||
|
||||
## Part 1: v1.4.2 Implementation Review
|
||||
|
||||
### 1.1 Design Decisions Assessment
|
||||
|
||||
| Decision | Assessment | Notes |
|
||||
|----------|------------|-------|
|
||||
| D1: Convert at validation time | **Acceptable** | Keeps change minimal; conversion in `validate_image()` is logical since it normalizes input formats |
|
||||
| D2: Convert to JPEG | **Acceptable** | JPEG has universal browser support; appropriate for photographic content |
|
||||
| D3: Graceful degradation | **Good** | Conditional import with `HEIC_SUPPORTED` flag enables runtime flexibility |
|
||||
| D4: Quality 95 for conversion | **Acceptable** | High quality preserved; subsequent optimization will reduce if needed |
|
||||
| D5: Return signature change | **Acceptable** | 4-tuple return is clean; API change is internal-only |
|
||||
|
||||
### 1.2 Issues Found in v1.4.2
|
||||
|
||||
#### Issue 1.2.1: Module-Level Documentation Stale (LOW)
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 1-12
|
||||
|
||||
**Problem**: The module docstring still states "4096x4096 max dimensions" but `MAX_DIMENSION` was updated to 12000 in v1.4.2.
|
||||
|
||||
**Impact**: Documentation mismatch causes confusion; could lead to incorrect assumptions by future developers.
|
||||
|
||||
**Recommendation**: Update docstring to reflect current limits:
|
||||
```
|
||||
- 12000x12000 max input dimensions (v1.4.2)
|
||||
```
|
||||
|
||||
#### Issue 1.2.2: Debug File Storage Without Cleanup (MEDIUM)
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 133-137
|
||||
|
||||
**Problem**: Failed uploads are saved to `data/debug/` directory for analysis, but there is no mechanism to clean up these files. Over time, this could consume significant disk space, especially if under attack.
|
||||
|
||||
```python
|
||||
debug_dir = Path(current_app.config.get('DATA_PATH', 'data')) / 'debug'
|
||||
debug_dir.mkdir(parents=True, exist_ok=True)
|
||||
debug_file = debug_dir / f"failed_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{filename}"
|
||||
debug_file.write_bytes(file_data)
|
||||
```
|
||||
|
||||
**Impact**:
|
||||
- Disk space exhaustion risk
|
||||
- Potential storage of malicious payloads
|
||||
- No visibility into debug file accumulation
|
||||
|
||||
**Recommendation**:
|
||||
1. Add a configuration option to enable/disable debug file saving
|
||||
2. Implement automatic cleanup (e.g., files older than 7 days)
|
||||
3. Add disk space check before saving
|
||||
4. Consider rate limiting debug file creation
|
||||
|
||||
#### Issue 1.2.3: Filename in Debug Path Not Sanitized (MEDIUM)
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, line 135
|
||||
|
||||
**Problem**: The original filename is used directly in the debug file path without sanitization.
|
||||
|
||||
```python
|
||||
debug_file = debug_dir / f"failed_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{filename}"
|
||||
```
|
||||
|
||||
**Impact**: Path traversal or special character issues could occur if filename contains malicious patterns (though `pathlib` provides some protection).
|
||||
|
||||
**Recommendation**: Sanitize filename before use:
|
||||
```python
|
||||
safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]
|
||||
```
|
||||
|
||||
#### Issue 1.2.4: Explicit HEIC Read After Pillow Failure (LOW)
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 111-139
|
||||
|
||||
**Problem**: When Pillow fails to open a file, the code attempts to read it explicitly as HEIC. This is a workaround for iOS files with wrong extensions, but the error handling is complex and could be clearer.
|
||||
|
||||
**Impact**: Code complexity; potential for subtle bugs in error paths.
|
||||
|
||||
**Recommendation**: Consider refactoring to a cleaner pattern:
|
||||
```
|
||||
1. Try Pillow standard open
|
||||
2. If fails and HEIC_SUPPORTED, try explicit HEIC
|
||||
3. If both fail, provide clear diagnostic
|
||||
```
|
||||
|
||||
### 1.3 v1.4.2 Architecture Compliance
|
||||
|
||||
| Principle | Compliance | Notes |
|
||||
|-----------|------------|-------|
|
||||
| Minimal code | **Yes** | 41 lines added for significant functionality |
|
||||
| Standards first | **Yes** | HEIC conversion preserves IndieWeb compatibility |
|
||||
| No lock-in | **Yes** | JPEG output is universal |
|
||||
| Single responsibility | **Yes** | Validation handles input normalization |
|
||||
| Documentation | **Partial** | Design doc complete; module docstring stale |
|
||||
|
||||
---
|
||||
|
||||
## Part 2: End-to-End Media Pipeline Review
|
||||
|
||||
### 2.1 Architecture Diagram
|
||||
|
||||
```
|
||||
MEDIA PIPELINE ARCHITECTURE
|
||||
===========================
|
||||
|
||||
USER UPLOAD FEED DELIVERY
|
||||
=========== ============
|
||||
|
||||
+-------------------+
|
||||
Admin UI | |
|
||||
/admin/new ---------> | validate_image | ----+
|
||||
POST multipart | | |
|
||||
+-------------------+ |
|
||||
|
|
||||
+-------------------+ | +------------------+
|
||||
Micropub Endpoint | | | | |
|
||||
/media POST --------> | save_media |<----+---->| SQLite DB |
|
||||
multipart/form-data | | | - media |
|
||||
+-------------------+ | - note_media |
|
||||
| | - media_vars |
|
||||
v +------------------+
|
||||
+-------------------+ ^
|
||||
| | |
|
||||
| optimize_image | |
|
||||
| (resize, compress)| |
|
||||
+-------------------+ |
|
||||
| |
|
||||
v |
|
||||
+-------------------+ |
|
||||
| | |
|
||||
| generate_variants |--------------------+
|
||||
| (thumb/sm/md/lg) |
|
||||
+-------------------+
|
||||
|
|
||||
v
|
||||
+-------------------+
|
||||
| |
|
||||
| FILESYSTEM |
|
||||
| data/media/ |
|
||||
| YYYY/MM/uuid.ext|
|
||||
+-------------------+
|
||||
|
|
||||
|
|
||||
+----------------------------+----------------------------+
|
||||
| | |
|
||||
v v v
|
||||
+--------+ +--------+ +--------+
|
||||
| RSS | | ATOM | | JSON |
|
||||
| /feed | | /feed | | /feed |
|
||||
| .xml | | .atom | | .json |
|
||||
+--------+ +--------+ +--------+
|
||||
| | |
|
||||
+----------------------------+----------------------------+
|
||||
|
|
||||
v
|
||||
+-------------------+
|
||||
| |
|
||||
| /media/<path> |
|
||||
| Static serving |
|
||||
+-------------------+
|
||||
```
|
||||
|
||||
### 2.2 Stage-by-Stage Analysis
|
||||
|
||||
#### Stage 1: Upload Entry Points
|
||||
|
||||
**Admin Upload (`/home/phil/Projects/starpunk/starpunk/routes/admin.py`, lines 112-136)**
|
||||
|
||||
| Aspect | Assessment | Notes |
|
||||
|--------|------------|-------|
|
||||
| Authentication | **Good** | `@require_auth` decorator enforced |
|
||||
| Input validation | **Partial** | Relies on `save_media`; no pre-check on file count |
|
||||
| Error handling | **Good** | Per-file errors collected; partial success allowed |
|
||||
| Content-Type check | **Missing** | No verification of `multipart/form-data` |
|
||||
|
||||
**Issues**:
|
||||
- No maximum file count enforced at route level (relies on downstream check)
|
||||
- `request.files.getlist('media_files')` could be empty list with misleading behavior
|
||||
|
||||
**Micropub Upload (`/home/phil/Projects/starpunk/starpunk/routes/micropub.py`, lines 124-202)**
|
||||
|
||||
| Aspect | Assessment | Notes |
|
||||
|--------|------------|-------|
|
||||
| Authentication | **Good** | Token extraction and verification |
|
||||
| Scope check | **Good** | Requires `create` scope |
|
||||
| Content-Type check | **Good** | Explicit `multipart/form-data` verification |
|
||||
| Input validation | **Good** | Single file field validated |
|
||||
| Error handling | **Good** | OAuth 2.0 error format |
|
||||
|
||||
**Issues**:
|
||||
- None significant; well-implemented endpoint
|
||||
|
||||
#### Stage 2: Validation (`validate_image`)
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 79-219
|
||||
|
||||
| Check | Order | Assessment |
|
||||
|-------|-------|------------|
|
||||
| File size | 1st | **Good** - Early rejection before processing |
|
||||
| Pillow verify | 2nd | **Good** - Validates image integrity |
|
||||
| HEIC fallback | 3rd | **Acceptable** - Complex but necessary |
|
||||
| Format conversion | 4th | **Good** - HEIC/MPO to JPEG |
|
||||
| MIME type check | 5th | **Good** - Whitelist approach |
|
||||
| Dimension check | 6th | **Good** - Prevents memory issues |
|
||||
| Animated GIF check | 7th | **Good** - Special handling for animations |
|
||||
|
||||
**Issues Found**:
|
||||
|
||||
##### Issue 2.2.1: No Content-Type Header Validation (MEDIUM)
|
||||
|
||||
**Problem**: The upload content-type from the HTTP request is not validated against the detected image format.
|
||||
|
||||
**Impact**: Potential for MIME type confusion attacks. A file uploaded as `image/png` could actually be a JPEG.
|
||||
|
||||
**Recommendation**: Log warning when declared MIME type differs from detected format.
|
||||
|
||||
##### Issue 2.2.2: Missing WebP Animation Detection (LOW)
|
||||
|
||||
**Problem**: Animated GIF detection exists but animated WebP is not handled.
|
||||
|
||||
**Impact**: Large animated WebP files could bypass the animated image size check.
|
||||
|
||||
**Recommendation**: Add animated WebP detection similar to GIF handling.
|
||||
|
||||
##### Issue 2.2.3: Pillow Decompression Bomb Protection (LOW)
|
||||
|
||||
**Problem**: No explicit `Image.MAX_IMAGE_PIXELS` configuration.
|
||||
|
||||
**Impact**: Pillow has default protection, but explicit configuration documents intent.
|
||||
|
||||
**Recommendation**: Add explicit `Image.MAX_IMAGE_PIXELS` setting or document reliance on default.
|
||||
|
||||
#### Stage 3: Optimization (`optimize_image`)
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 222-304
|
||||
|
||||
| Aspect | Assessment | Notes |
|
||||
|--------|------------|-------|
|
||||
| Tiered strategy | **Good** | Size-aware quality/dimension selection |
|
||||
| EXIF handling | **Good** | Orientation correction applied |
|
||||
| GIF passthrough | **Good** | Animated GIFs preserved |
|
||||
| Iterative reduction | **Good** | Quality then dimension reduction |
|
||||
| Safety limits | **Good** | MIN_DIMENSION prevents infinite loop |
|
||||
|
||||
**Issues Found**:
|
||||
|
||||
##### Issue 2.2.4: PNG Optimization Limited (LOW)
|
||||
|
||||
**Problem**: PNG files only get `optimize=True` flag; no palette reduction or other optimizations.
|
||||
|
||||
**Impact**: Large PNG files may not compress well.
|
||||
|
||||
**Recommendation**: Consider PNG-specific optimization (pngquant integration) for future release.
|
||||
|
||||
##### Issue 2.2.5: No WebP Quality Handling (LOW)
|
||||
|
||||
**Problem**: WebP gets same quality treatment as JPEG but WebP quality values behave differently.
|
||||
|
||||
**Impact**: WebP files may not be optimized as expected.
|
||||
|
||||
**Recommendation**: Consider WebP-specific quality mapping.
|
||||
|
||||
#### Stage 4: Variant Generation (`generate_all_variants`)
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/media.py`, lines 306-452
|
||||
|
||||
| Aspect | Assessment | Notes |
|
||||
|--------|------------|-------|
|
||||
| Spec definitions | **Good** | Clear variant specifications |
|
||||
| Skip logic | **Good** | Smaller images skip larger variants |
|
||||
| Cleanup on failure | **Good** | Created files removed on error |
|
||||
| Database integration | **Good** | Variants recorded with dimensions |
|
||||
|
||||
**Issues Found**:
|
||||
|
||||
##### Issue 2.2.6: Transaction Not Atomic (MEDIUM)
|
||||
|
||||
**Problem**: Files are created on disk before database commit. If database fails, files remain orphaned (cleanup only happens on exception within the loop).
|
||||
|
||||
```python
|
||||
try:
|
||||
for variant_type in ['thumb', 'small', 'medium', 'large']:
|
||||
variant = generate_variant(...) # File created
|
||||
variants.append(variant)
|
||||
created_files.append(...)
|
||||
db.execute(...) # DB insert
|
||||
|
||||
db.execute(...) # Original variant
|
||||
db.commit() # <-- If this fails, files already exist
|
||||
```
|
||||
|
||||
**Impact**: Database failure after file creation could leave orphaned files.
|
||||
|
||||
**Recommendation**: Consider writing to temp location first, then moving after commit.
|
||||
|
||||
##### Issue 2.2.7: Variant Path Calculation Fragile (LOW)
|
||||
|
||||
**Problem**: Line 363 calculates relative path using parent traversal:
|
||||
```python
|
||||
'path': str(variant_path.relative_to(base_path.parent.parent))
|
||||
```
|
||||
|
||||
**Impact**: Dependent on directory structure assumptions.
|
||||
|
||||
**Recommendation**: Use explicit path construction as done for original (lines 428-429).
|
||||
|
||||
#### Stage 5: Storage and Serving
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/routes/public.py`, lines 174-221
|
||||
|
||||
| Aspect | Assessment | Notes |
|
||||
|--------|------------|-------|
|
||||
| Path traversal protection | **Good** | Resolve and prefix check |
|
||||
| Cache headers | **Good** | 1 year immutable cache |
|
||||
| File serving | **Good** | Uses Flask's `send_from_directory` |
|
||||
|
||||
**Issues Found**:
|
||||
|
||||
##### Issue 2.2.8: Symlink Following (LOW)
|
||||
|
||||
**Problem**: `resolve()` follows symlinks, which could potentially escape the media directory if symlinks exist within.
|
||||
|
||||
**Impact**: Low risk since symlinks would need to be created manually.
|
||||
|
||||
**Recommendation**: Add `strict=True` to resolve for Python 3.6+ or check `is_symlink()`.
|
||||
|
||||
#### Stage 6: Feed Integration
|
||||
|
||||
**Locations**:
|
||||
- `/home/phil/Projects/starpunk/starpunk/feeds/rss.py`
|
||||
- `/home/phil/Projects/starpunk/starpunk/routes/public.py` (`_get_cached_notes`)
|
||||
|
||||
| Aspect | Assessment | Notes |
|
||||
|--------|------------|-------|
|
||||
| Media attachment | **Good** | Media loaded and attached to notes |
|
||||
| URL construction | **Good** | Consistent absolute URLs |
|
||||
| Media RSS | **Good** | Proper namespace and elements |
|
||||
| Enclosure element | **Good** | First image for RSS 2.0 spec |
|
||||
| Variant selection | **Good** | Fallback order for default variant |
|
||||
|
||||
**Issues Found**:
|
||||
|
||||
##### Issue 2.2.9: N+1 Query Pattern in Feed Generation (MEDIUM)
|
||||
|
||||
**Problem**: In `_get_cached_notes()`, media and tags are loaded per-note in loops:
|
||||
|
||||
```python
|
||||
for note in notes:
|
||||
media = get_note_media(note.id) # DB query per note
|
||||
object.__setattr__(note, 'media', media)
|
||||
tags = get_note_tags(note.id) # DB query per note
|
||||
```
|
||||
|
||||
**Impact**: For 50 notes, this is 100 additional queries. Performance degrades with more notes.
|
||||
|
||||
**Recommendation**: Implement batch loading:
|
||||
```python
|
||||
def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[Dict]]:
|
||||
# Single query with WHERE note_id IN (...)
|
||||
```
|
||||
|
||||
##### Issue 2.2.10: Caption Not Escaped in RSS (LOW)
|
||||
|
||||
**Problem**: In RSS generation, caption is used directly in alt attribute:
|
||||
```python
|
||||
html_content += f'<img src="{media_url}" alt="{caption}" />'
|
||||
```
|
||||
|
||||
**Impact**: If caption contains `"` or other HTML special characters, could break markup.
|
||||
|
||||
**Recommendation**: Use `html.escape()` for caption in HTML context.
|
||||
|
||||
### 2.3 Security Assessment
|
||||
|
||||
| Category | Status | Notes |
|
||||
|----------|--------|-------|
|
||||
| **Authentication** | **Good** | Admin routes protected; Micropub uses token auth |
|
||||
| **Authorization** | **Good** | Scope checking on Micropub |
|
||||
| **File Type Validation** | **Good** | Whitelist + Pillow verification |
|
||||
| **Path Traversal** | **Good** | Protected in media serving |
|
||||
| **File Size Limits** | **Good** | 50MB upload, 10MB output |
|
||||
| **Dimension Limits** | **Good** | 12000px max input |
|
||||
| **Filename Handling** | **Good** | UUID-based storage filenames |
|
||||
| **Debug File Exposure** | **Needs Attention** | Debug files may contain malicious content |
|
||||
| **DoS Protection** | **Partial** | Limits exist but no rate limiting |
|
||||
|
||||
**Security Recommendations**:
|
||||
|
||||
1. **Add rate limiting** on media upload endpoints (medium priority)
|
||||
2. **Disable debug file saving** in production or add access controls (medium priority)
|
||||
3. **Log MIME type mismatches** for security monitoring (low priority)
|
||||
4. **Consider Content-Security-Policy** headers for served media (low priority)
|
||||
|
||||
### 2.4 Performance Assessment
|
||||
|
||||
| Operation | Assessment | Notes |
|
||||
|-----------|------------|-------|
|
||||
| Upload processing | **Acceptable** | In-memory processing; synchronous variant generation |
|
||||
| Feed generation | **Needs Improvement** | N+1 query pattern |
|
||||
| Media serving | **Good** | Static file serving with long cache |
|
||||
| Caching | **Good** | Feed caching with ETag support |
|
||||
|
||||
**Performance Recommendations**:
|
||||
|
||||
1. **Implement batch media loading** for feeds (high priority)
|
||||
2. **Consider async variant generation** for large uploads (low priority)
|
||||
3. **Add database query logging** in development mode (low priority)
|
||||
|
||||
---
|
||||
|
||||
## Part 3: Recommendations Summary
|
||||
|
||||
### 3.1 Immediate Fixes (None Critical)
|
||||
|
||||
No blocking issues found. The implementation is production-ready.
|
||||
|
||||
### 3.2 Future Improvements
|
||||
|
||||
#### High Priority
|
||||
|
||||
| Item | Description | Effort |
|
||||
|------|-------------|--------|
|
||||
| N+1 Query Fix | Batch load media/tags for feeds | Small |
|
||||
| Debug File Controls | Config option + cleanup mechanism | Small |
|
||||
|
||||
#### Medium Priority
|
||||
|
||||
| Item | Description | Effort |
|
||||
|------|-------------|--------|
|
||||
| Rate Limiting | Add upload rate limits | Medium |
|
||||
| Caption Escaping | HTML escape in feed generation | Small |
|
||||
| Filename Sanitization | Sanitize debug filenames | Small |
|
||||
| Transaction Atomicity | Temp files before commit | Medium |
|
||||
|
||||
#### Low Priority
|
||||
|
||||
| Item | Description | Effort |
|
||||
|------|-------------|--------|
|
||||
| WebP Animation | Detect animated WebP | Small |
|
||||
| PNG Optimization | Enhanced PNG compression | Medium |
|
||||
| Decompression Bomb Config | Explicit pixel limit | Trivial |
|
||||
| Symlink Handling | Stricter path resolution | Small |
|
||||
| Module Docstring | Update documentation | Trivial |
|
||||
|
||||
### 3.3 Technical Debt Register
|
||||
|
||||
| Item | Location | Notes |
|
||||
|------|----------|-------|
|
||||
| Complex HEIC fallback logic | `media.py:111-139` | Works but could be cleaner |
|
||||
| Variant path calculation | `media.py:363` | Uses parent traversal |
|
||||
| Debug file accumulation | `media.py:133-137` | No cleanup mechanism |
|
||||
| N+1 queries in feed | `public.py:68-74` | Performance impact |
|
||||
|
||||
---
|
||||
|
||||
## Part 4: Architectural Observations
|
||||
|
||||
### 4.1 Strengths
|
||||
|
||||
1. **Clean separation of concerns**: Validation, optimization, variant generation are distinct functions
|
||||
2. **Defensive programming**: Extensive try/except blocks with proper cleanup
|
||||
3. **Standards compliance**: Good adherence to IndieWeb, RSS, and W3C specs
|
||||
4. **Logging**: Comprehensive logging at INFO and WARNING levels
|
||||
5. **Backward compatibility**: Variants are optional; pre-v1.4.0 media works
|
||||
|
||||
### 4.2 Design Pattern Compliance
|
||||
|
||||
| Pattern | Usage | Assessment |
|
||||
|---------|-------|------------|
|
||||
| Route Adapter | Feed generation | **Good** |
|
||||
| Graceful Degradation | HEIC support | **Good** |
|
||||
| Early Rejection | File size check | **Good** |
|
||||
| Iterative Optimization | Quality reduction | **Good** |
|
||||
| UUID-based Storage | Collision avoidance | **Good** |
|
||||
|
||||
### 4.3 Areas for Future ADRs
|
||||
|
||||
1. **Media Processing Strategy**: Consider documenting the full media pipeline as an ADR
|
||||
2. **Debug/Diagnostic Data Handling**: Policy for storing failed uploads
|
||||
3. **Performance Targets**: Document expected query counts and response times
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
The v1.4.2 implementation successfully addresses the iPhone HEIC upload issue with minimal, clean changes. The broader media pipeline is well-architected with appropriate security controls and error handling. The main areas for improvement are:
|
||||
|
||||
1. **Performance**: N+1 query pattern in feed generation should be addressed
|
||||
2. **Operations**: Debug file management needs cleanup mechanism
|
||||
3. **Security**: Rate limiting would harden upload endpoints
|
||||
|
||||
None of these issues block the v1.4.2 release. They should be tracked in the project backlog for future releases.
|
||||
|
||||
**Recommendation**: Accept v1.4.2 as implemented. Create backlog items for identified improvements.
|
||||
|
||||
---
|
||||
|
||||
## Appendix: Files Reviewed
|
||||
|
||||
| File | Lines | Purpose |
|
||||
|------|-------|---------|
|
||||
| `starpunk/media.py` | 775 | Core media handling |
|
||||
| `starpunk/routes/admin.py` | 603 | Admin upload endpoint |
|
||||
| `starpunk/routes/micropub.py` | 203 | Micropub media endpoint |
|
||||
| `starpunk/routes/public.py` | 567 | Media serving, feeds |
|
||||
| `starpunk/feeds/rss.py` | 601 | RSS feed with media |
|
||||
| `migrations/007_add_media_support.sql` | 38 | Media schema |
|
||||
| `migrations/009_add_media_variants.sql` | 22 | Variant schema |
|
||||
| `docs/decisions/ADR-057-media-attachment-model.md` | 110 | Media architecture |
|
||||
| `docs/decisions/ADR-058-image-optimization-strategy.md` | 183 | Optimization strategy |
|
||||
| `docs/design/v1.4.2/heic-support-design.md` | 220 | HEIC design spec |
|
||||
| `docs/design/v1.4.2/2025-12-16-implementation-report.md` | 171 | Implementation report |
|
||||
360
docs/design/v1.4.2/2025-12-16-developer-review.md
Normal file
360
docs/design/v1.4.2/2025-12-16-developer-review.md
Normal file
@@ -0,0 +1,360 @@
|
||||
# v1.4.2 Developer Review - HEIC/MPO Support and Dimension Limit Increase
|
||||
|
||||
**Date**: 2025-12-16
|
||||
**Reviewer**: Claude (Fullstack Developer Agent - Review Mode)
|
||||
**Implementation Report**: `/home/phil/Projects/starpunk/docs/design/v1.4.2/2025-12-16-implementation-report.md`
|
||||
**Design Document**: `/home/phil/Projects/starpunk/docs/design/v1.4.2/heic-support-design.md`
|
||||
|
||||
## Executive Summary
|
||||
|
||||
The v1.4.2 implementation successfully addresses iPhone photo upload issues through HEIC/MPO format support and increased dimension limits. The implementation is **production-ready** with **minor recommendations** for improvements. All core functionality works correctly, tests pass, and backward compatibility is maintained.
|
||||
|
||||
**Overall Assessment**: APPROVED with recommendations for follow-up enhancements
|
||||
|
||||
## Implementation Summary
|
||||
|
||||
### Changes Delivered
|
||||
|
||||
1. **HEIC/HEIF Support** - Automatic detection and conversion to JPEG
|
||||
2. **MPO Support** - Multi-Picture Object format handling for iPhone depth/portrait photos
|
||||
3. **Dimension Limit Increase** - From 4096px to 12000px (supports 48MP+ cameras)
|
||||
4. **Enhanced Error Handling** - Fallback HEIC detection, debug logging, and file saving
|
||||
5. **Graceful Degradation** - Works without pillow-heif library (with clear errors)
|
||||
|
||||
### Files Modified
|
||||
|
||||
- `starpunk/media.py` - Core media handling logic
|
||||
- `starpunk/config.py` - Added MAX_CONTENT_LENGTH
|
||||
- `requirements.txt` - Added pillow-heif dependency, bumped Pillow version
|
||||
- `tests/test_media_upload.py` - Added HEIC test coverage
|
||||
- `CHANGELOG.md` - Release notes
|
||||
- `starpunk/__init__.py` - Version bump to 1.4.2
|
||||
|
||||
## Code Quality Assessment
|
||||
|
||||
### Strengths
|
||||
|
||||
1. **Minimal Changes**: Implementation focused on specific problem without scope creep
|
||||
2. **Error Handling**: Comprehensive fallback strategy for HEIC detection
|
||||
3. **Debug Support**: Failed uploads saved to debug directory with magic bytes logged
|
||||
4. **Test Coverage**: 5 new HEIC-specific tests, all passing (33/33 total)
|
||||
5. **Documentation**: Clear inline comments explaining iPhone-specific behavior
|
||||
6. **Backward Compatibility**: No breaking changes, existing uploads unaffected
|
||||
|
||||
### Issues Found
|
||||
|
||||
#### Critical Issues
|
||||
|
||||
**None identified**
|
||||
|
||||
#### Major Issues
|
||||
|
||||
**None identified**
|
||||
|
||||
#### Minor Issues
|
||||
|
||||
**M1: MPO Format Has No Test Coverage**
|
||||
|
||||
- **Location**: `starpunk/media.py` lines 163-173
|
||||
- **Issue**: MPO conversion code exists but no tests verify it works
|
||||
- **Impact**: MPO handling is untested; potential for silent failures
|
||||
- **Evidence**: Grep search found zero MPO test cases in test suite
|
||||
- **Recommendation**: Add MPO test case to `TestHEICSupport` class:
|
||||
```python
|
||||
def test_mpo_detection_and_conversion(self):
|
||||
"""Test MPO file is detected and converted to JPEG"""
|
||||
# MPO format is used by iPhone for depth/portrait mode photos
|
||||
# Create an MPO test image (Pillow supports creating these)
|
||||
# ... test implementation
|
||||
```
|
||||
- **Severity**: Minor (MPO is relatively rare format, but advertised in CHANGELOG)
|
||||
|
||||
**M2: Debug File Saving Could Fill Disk**
|
||||
|
||||
- **Location**: `starpunk/media.py` lines 133-137
|
||||
- **Issue**: Failed uploads saved to `data/debug/` without size limits or cleanup
|
||||
- **Impact**: Repeated failed uploads could accumulate and consume disk space
|
||||
- **Scenario**: Malicious user repeatedly uploads invalid 50MB files
|
||||
- **Recommendation**: Add cleanup strategy:
|
||||
- Limit debug directory to 100MB total
|
||||
- Auto-delete files older than 7 days
|
||||
- Add config option `DEBUG_SAVE_FAILED_UPLOADS` (default: false in production)
|
||||
- **Severity**: Minor (requires deliberate abuse; easily monitored)
|
||||
|
||||
**M3: HEIC Conversion Quality Not Configurable**
|
||||
|
||||
- **Location**: `starpunk/media.py` line 157
|
||||
- **Issue**: Hardcoded `quality=95` for HEIC→JPEG conversion
|
||||
- **Impact**: Users with different quality/size tradeoff preferences cannot adjust
|
||||
- **Current Behavior**: 95% quality always used, then subsequent optimization may reduce further
|
||||
- **Recommendation**: Consider adding `HEIC_CONVERSION_QUALITY` config variable
|
||||
- **Severity**: Minor (95% is reasonable default; optimization handles size anyway)
|
||||
|
||||
**M4: Missing Dimension Limit in Config**
|
||||
|
||||
- **Location**: `starpunk/media.py` line 41
|
||||
- **Issue**: `MAX_DIMENSION = 12000` is hardcoded constant
|
||||
- **Impact**: Cannot adjust limit without code change
|
||||
- **Rationale**: Design doc shows this was intentional (comment says "v1.4.2 - supports modern phone cameras")
|
||||
- **Recommendation**: Consider making this configurable in future versions
|
||||
- **Severity**: Minor (12000px is generous; few use cases for larger)
|
||||
|
||||
**M5: Incomplete Documentation of MPO Format**
|
||||
|
||||
- **Location**: `starpunk/media.py` lines 163-165
|
||||
- **Issue**: Comment says "extract primary image" but doesn't explain multi-frame nature
|
||||
- **Missing Context**: MPO files contain multiple images (left/right for 3D, or depth map)
|
||||
- **Recommendation**: Enhance comment:
|
||||
```python
|
||||
# MPO (Multi-Picture Object) conversion (v1.4.2)
|
||||
# MPO is used by iPhones for depth/portrait photos - contains multiple frames
|
||||
# (e.g., original + depth map). We extract only the primary/first frame as JPEG.
|
||||
# Depth information is discarded, which is acceptable for web display.
|
||||
```
|
||||
- **Severity**: Minor (code works correctly; just documentation clarity)
|
||||
|
||||
## Security Review
|
||||
|
||||
### Findings
|
||||
|
||||
**S1: Debug File Naming Could Leak Timing Information**
|
||||
|
||||
- **Location**: `starpunk/media.py` line 135
|
||||
- **Issue**: Filename includes timestamp with second precision
|
||||
- **Pattern**: `failed_20251216_174532_photo.heic`
|
||||
- **Concern**: Timing could reveal upload attempt patterns
|
||||
- **Recommendation**: Use UUID instead of timestamp, or truncate to date only
|
||||
- **Severity**: Very Low (minimal actual risk; defense in depth)
|
||||
|
||||
**S2: Increased Attack Surface from pillow-heif**
|
||||
|
||||
- **Location**: `requirements.txt` line 38
|
||||
- **Issue**: New dependency increases vulnerability surface area
|
||||
- **Mitigation**: Version is pinned (`pillow-heif==0.18.*`)
|
||||
- **Recommendation**:
|
||||
- Monitor pillow-heif security advisories
|
||||
- Document in security policy that HEIC support requires this dependency
|
||||
- Consider making pillow-heif optional in future (already has graceful degradation)
|
||||
- **Severity**: Low (acceptable tradeoff for iPhone support)
|
||||
|
||||
**S3: No Magic Bytes Validation for HEIC**
|
||||
|
||||
- **Location**: `starpunk/media.py` lines 115-121
|
||||
- **Issue**: Fallback HEIC detection uses `pillow_heif.read_heif()` without verifying magic bytes first
|
||||
- **Risk**: Could attempt to parse arbitrary binary data as HEIC
|
||||
- **Current Mitigation**: Exception handling catches parse failures
|
||||
- **Recommendation**: Add magic bytes check before calling `read_heif()`:
|
||||
```python
|
||||
# Check for HEIC/HEIF magic bytes (ftyp followed by heic/heix/hevc/etc)
|
||||
if file_data[4:8] == b'ftyp' and file_data[8:12] in [b'heic', b'heix', b'hevc', b'heim']:
|
||||
# Proceed with read_heif()
|
||||
```
|
||||
- **Severity**: Very Low (existing exception handling prevents exploitation)
|
||||
|
||||
### Security Assessment: ACCEPTABLE
|
||||
|
||||
No critical or major security issues identified. The increased attack surface from pillow-heif is an acceptable tradeoff for iPhone compatibility.
|
||||
|
||||
## Performance Analysis
|
||||
|
||||
### Observations
|
||||
|
||||
1. **HEIC Conversion Overhead**: Implementation report notes ~100-300ms per image
|
||||
2. **In-Memory Processing**: Uses BytesIO (no temp files) - good for security and performance
|
||||
3. **Double Processing**: HEIC files processed twice (conversion + optimization)
|
||||
- Could be optimized by combining steps, but current approach is cleaner
|
||||
4. **Large File Handling**: 50MB files through HEIC conversion could spike memory
|
||||
- Current implementation acceptable (server-grade machines)
|
||||
|
||||
### Performance Assessment: ACCEPTABLE
|
||||
|
||||
No significant performance concerns. The double-processing of HEIC files is a reasonable tradeoff for code simplicity and maintainability.
|
||||
|
||||
## Test Coverage Analysis
|
||||
|
||||
### Coverage Breakdown
|
||||
|
||||
**HEIC Tests (5 cases):**
|
||||
- ✅ Basic detection and conversion
|
||||
- ✅ RGBA mode handling
|
||||
- ✅ Dimension preservation
|
||||
- ✅ Error without library
|
||||
- ✅ Full upload flow
|
||||
|
||||
**Missing Tests:**
|
||||
- ❌ MPO format handling (see M1)
|
||||
- ❌ Large HEIC file (40MB+) conversion
|
||||
- ❌ Corrupted HEIC file handling
|
||||
- ❌ Debug file saving behavior
|
||||
- ❌ Fallback HEIC detection path (lines 115-121)
|
||||
|
||||
### Test Coverage Assessment: GOOD
|
||||
|
||||
HEIC support is well-tested. MPO support is the main gap. Recommend adding tests for:
|
||||
1. MPO format (priority: high)
|
||||
2. Fallback HEIC detection path (priority: medium)
|
||||
3. Debug file saving (priority: low)
|
||||
|
||||
## Edge Cases Review
|
||||
|
||||
### Handled Correctly
|
||||
|
||||
1. ✅ HEIC with alpha channel → RGB conversion
|
||||
2. ✅ HEIC with P mode → RGB conversion
|
||||
3. ✅ Missing pillow-heif library → Clear error
|
||||
4. ✅ HEIC files >50MB → Rejected at validation
|
||||
5. ✅ iOS mislabeled files (HEIC with .jpeg extension) → Fallback detection
|
||||
|
||||
### Not Handled / Unclear
|
||||
|
||||
1. **Animated HEIC sequences**: Do they exist? If so, behavior undefined
|
||||
- Recommendation: Document that only first frame is used (like MPO)
|
||||
2. **HEIC with EXIF orientation**: Does conversion preserve EXIF?
|
||||
- Code re-opens image after conversion, so EXIF lost
|
||||
- Subsequent `optimize_image()` applies EXIF orientation, so this is OK
|
||||
3. **Very large dimension HEIC (8000x12000)**: Will conversion succeed?
|
||||
- Should work (within 12000px limit), but not explicitly tested
|
||||
|
||||
## Documentation Review
|
||||
|
||||
### CHANGELOG.md
|
||||
|
||||
**Assessment**: GOOD
|
||||
|
||||
- Clear feature descriptions
|
||||
- Mentions both HEIC and MPO
|
||||
- Documents dimension limit increase
|
||||
- Lists dependency changes
|
||||
|
||||
**Suggestion**: Add upgrade notes section:
|
||||
```markdown
|
||||
### Upgrade Notes
|
||||
- Run `pip install -r requirements.txt` to install pillow-heif
|
||||
- No configuration changes required
|
||||
- Existing media uploads unaffected
|
||||
```
|
||||
|
||||
### Inline Comments
|
||||
|
||||
**Assessment**: GOOD
|
||||
|
||||
- Clear version markers (v1.4.2)
|
||||
- Explains WHY (browsers can't display HEIC, iOS uses MPO for depth)
|
||||
- Links to relevant concepts (portrait mode, depth maps)
|
||||
|
||||
### Implementation Report
|
||||
|
||||
**Assessment**: EXCELLENT
|
||||
|
||||
- Comprehensive technical decisions documented
|
||||
- Clear before/after code examples
|
||||
- Deployment notes included
|
||||
- Performance considerations noted
|
||||
|
||||
## Technical Debt Assessment
|
||||
|
||||
### New Debt Introduced
|
||||
|
||||
1. **Fallback HEIC Detection Complexity**: Lines 111-142 are complex nested exception handling
|
||||
- Could be refactored into separate function for clarity
|
||||
- Not urgent; works correctly
|
||||
2. **Debug File Accumulation**: See M2 above
|
||||
3. **Hardcoded Constants**: See M3 and M4 above
|
||||
|
||||
### Debt Assessment: LOW
|
||||
|
||||
Minimal new technical debt. The complex error handling is justified by the real-world issue it solves (iOS mislabeling files).
|
||||
|
||||
## Dependencies Review
|
||||
|
||||
### New Dependencies
|
||||
|
||||
**pillow-heif==0.18.***
|
||||
- Purpose: HEIC/HEIF format support
|
||||
- License: BSD-3-Clause (compatible)
|
||||
- Maintenance: Active (last release 2024)
|
||||
- Security: No known CVEs as of 2025-12-16
|
||||
|
||||
**Pillow 10.0.* → 10.1.***
|
||||
- Reason: Required by pillow-heif
|
||||
- Risk: Minor version bump
|
||||
- Mitigation: Full test suite passed (879 tests)
|
||||
|
||||
### Dependencies Assessment: ACCEPTABLE
|
||||
|
||||
Dependencies are well-maintained, properly licensed, and version-pinned.
|
||||
|
||||
## Recommendations
|
||||
|
||||
### Immediate (Before Production Deployment)
|
||||
|
||||
**None** - Implementation is production-ready as-is
|
||||
|
||||
### Short-Term (Next Patch/Minor Release)
|
||||
|
||||
1. **Add MPO test coverage** (M1) - Priority: HIGH
|
||||
- Advertised in CHANGELOG but untested
|
||||
- Create test case in `TestHEICSupport` class
|
||||
|
||||
2. **Add debug file cleanup** (M2) - Priority: MEDIUM
|
||||
- Prevent disk exhaustion from failed uploads
|
||||
- Add size limit and age-based cleanup
|
||||
|
||||
3. **Document edge cases in code** (M5) - Priority: LOW
|
||||
- Enhance MPO comment to explain multi-frame nature
|
||||
- Add comment about animated HEIC handling (if applicable)
|
||||
|
||||
### Long-Term (Future Versions)
|
||||
|
||||
1. **Make HEIC quality configurable** (M3)
|
||||
2. **Make MAX_DIMENSION configurable** (M4)
|
||||
3. **Add magic bytes validation** (S3)
|
||||
4. **Refactor fallback detection** (Technical Debt #1)
|
||||
5. **Consider WebP conversion target** (Better compression than JPEG)
|
||||
|
||||
## Deployment Checklist
|
||||
|
||||
- ✅ Dependencies updated in requirements.txt
|
||||
- ✅ Version number incremented
|
||||
- ✅ CHANGELOG.md updated
|
||||
- ✅ Tests pass (33/33 media tests, 879/879 total)
|
||||
- ✅ Backward compatibility maintained
|
||||
- ✅ No database migrations required
|
||||
- ✅ No configuration changes required
|
||||
- ⚠️ MPO format untested (see M1)
|
||||
|
||||
## Conclusion
|
||||
|
||||
The v1.4.2 implementation successfully solves the iPhone photo upload problem with a clean, well-documented solution. The code quality is high, security concerns are minimal, and backward compatibility is maintained.
|
||||
|
||||
**Recommendation**: **APPROVE for production deployment**
|
||||
|
||||
The identified minor issues (particularly MPO test coverage) should be addressed in a follow-up patch, but do not block this release. The implementation solves a real user problem (iPhone uploads failing) and does so safely.
|
||||
|
||||
### Confidence Level
|
||||
|
||||
- **Code Quality**: HIGH (clean implementation, follows project standards)
|
||||
- **Security**: MEDIUM-HIGH (acceptable risk from new dependency)
|
||||
- **Test Coverage**: MEDIUM (HEIC well-tested, MPO untested)
|
||||
- **Documentation**: HIGH (clear comments and design docs)
|
||||
- **Overall**: HIGH (ready for production use)
|
||||
|
||||
## Follow-Up Actions
|
||||
|
||||
### For Developer
|
||||
|
||||
1. Consider adding MPO test case before next release
|
||||
2. Monitor pillow-heif for security updates
|
||||
3. Add debug directory cleanup in v1.4.3 or v1.5.0
|
||||
|
||||
### For Architect
|
||||
|
||||
1. Review recommendation to make MAX_DIMENSION configurable
|
||||
2. Decide if WebP conversion should replace JPEG for HEIC (better compression)
|
||||
3. Consider documenting animated HEIC handling policy
|
||||
|
||||
---
|
||||
|
||||
**Review completed**: 2025-12-16
|
||||
**Reviewed by**: Claude (Fullstack Developer Agent)
|
||||
**Status**: Implementation APPROVED with minor recommendations
|
||||
412
docs/design/v1.5.0/2025-12-16-architect-responses.md
Normal file
412
docs/design/v1.5.0/2025-12-16-architect-responses.md
Normal file
@@ -0,0 +1,412 @@
|
||||
# Architect Responses - v1.5.0 Developer Questions
|
||||
|
||||
**Date**: 2025-12-16
|
||||
**Architect**: StarPunk Architect Agent
|
||||
**In Response To**: `docs/design/v1.5.0/2025-12-16-developer-questions.md`
|
||||
|
||||
---
|
||||
|
||||
## Response Summary
|
||||
|
||||
All 8 questions have been addressed. The implementation can proceed immediately after reading this document. Key decisions:
|
||||
|
||||
1. **Q1**: Keep existing pattern (pass `existing_slugs: Set[str]`). ADR-062 pseudocode is conceptual.
|
||||
2. **Q2**: First note gets base slug, first collision gets `-1`. The table in ADR-062 is correct.
|
||||
3. **Q3**: Create new function in `slug_utils.py`. Migration path provided.
|
||||
4. **Q4**: Keep reserved slug validation for custom slugs only. Safe to skip for timestamp slugs.
|
||||
5. **Q5**: Tests should use pattern matching for timestamps. Test strategy provided.
|
||||
6. **Q6**: Phase 0 is a hard dependency. Cannot proceed with Phase 1 until all tests pass.
|
||||
7. **Q7**: Slug format is opaque. No format-aware code needed.
|
||||
8. **Q8**: Phases 2-4 can be parallelized. Commit each phase separately.
|
||||
|
||||
---
|
||||
|
||||
## Detailed Responses
|
||||
|
||||
### Q1: Collision handling function location and signature
|
||||
|
||||
**Decision**: Keep the existing pattern of accepting `existing_slugs: Set[str]` as a parameter.
|
||||
|
||||
**Rationale**:
|
||||
The ADR-062 pseudocode (`slug_exists(base_slug)`) is conceptual illustration only. The actual implementation should:
|
||||
|
||||
1. Follow the existing pattern in the codebase (passing pre-fetched slug sets)
|
||||
2. Avoid introducing database coupling into utility functions
|
||||
3. Prevent N+1 queries when checking multiple candidates
|
||||
|
||||
**Implementation Specification**:
|
||||
|
||||
Create a new function in `/home/phil/Projects/starpunk/starpunk/slug_utils.py`:
|
||||
|
||||
```python
|
||||
def generate_timestamp_slug(
|
||||
created_at: datetime = None,
|
||||
existing_slugs: Set[str] = None
|
||||
) -> str:
|
||||
"""Generate a timestamp-based slug with collision handling.
|
||||
|
||||
Per ADR-062: Default format is YYYYMMDDHHMMSS with sequential
|
||||
suffix (-1, -2, etc.) for collisions.
|
||||
|
||||
Args:
|
||||
created_at: Note creation timestamp (defaults to now)
|
||||
existing_slugs: Set of existing slugs to check for collisions
|
||||
|
||||
Returns:
|
||||
Unique timestamp-based slug
|
||||
"""
|
||||
if created_at is None:
|
||||
created_at = datetime.utcnow()
|
||||
|
||||
if existing_slugs is None:
|
||||
existing_slugs = set()
|
||||
|
||||
base_slug = created_at.strftime("%Y%m%d%H%M%S")
|
||||
|
||||
# If no collision, return base slug
|
||||
if base_slug not in existing_slugs:
|
||||
return base_slug
|
||||
|
||||
# Sequential suffix for collisions
|
||||
suffix = 1
|
||||
while f"{base_slug}-{suffix}" in existing_slugs:
|
||||
suffix += 1
|
||||
|
||||
return f"{base_slug}-{suffix}"
|
||||
```
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/slug_utils.py`
|
||||
|
||||
**Do NOT modify**: The `ensure_unique_slug()` function signature in ADR-062 is pseudocode. Do not create a function with database coupling.
|
||||
|
||||
---
|
||||
|
||||
### Q2: Sequential suffix starting number
|
||||
|
||||
**Decision**: First note gets base slug (no suffix). First collision gets `-1`.
|
||||
|
||||
**Clarification**: The examples table in ADR-062 is correct. The text "first collision: `20251216143052-1`" means the first **duplicate** (collision) gets `-1`, not the first note overall.
|
||||
|
||||
**Example Scenario**:
|
||||
```
|
||||
Note A created at 14:30:52 -> slug = "20251216143052" (base, no suffix)
|
||||
Note B created at 14:30:52 -> slug = "20251216143052-1" (first collision)
|
||||
Note C created at 14:30:52 -> slug = "20251216143052-2" (second collision)
|
||||
```
|
||||
|
||||
**Implementation**: Sequential suffix loop should start at `1`, not `2`:
|
||||
|
||||
```python
|
||||
suffix = 1
|
||||
while f"{base_slug}-{suffix}" in existing_slugs:
|
||||
suffix += 1
|
||||
return f"{base_slug}-{suffix}"
|
||||
```
|
||||
|
||||
**Note**: This differs from the existing `make_slug_unique_with_suffix()` function in `slug_utils.py` which starts at `-2`. The new timestamp slug function should start at `-1` per ADR-062.
|
||||
|
||||
---
|
||||
|
||||
### Q3: Two `generate_slug()` functions with different signatures
|
||||
|
||||
**Decision**: Create a new function in `slug_utils.py`. Do not modify or remove the existing function in `utils.py`.
|
||||
|
||||
**Migration Path**:
|
||||
|
||||
1. **Create new function** in `/home/phil/Projects/starpunk/starpunk/slug_utils.py`:
|
||||
- Name: `generate_timestamp_slug()` (as specified in Q1 response)
|
||||
- Do not name it `generate_slug()` to avoid confusion during migration
|
||||
|
||||
2. **Update `/home/phil/Projects/starpunk/starpunk/notes.py`** (lines 228-234):
|
||||
|
||||
**Current code** (line 228-234):
|
||||
```python
|
||||
else:
|
||||
# Generate base slug from content
|
||||
base_slug = generate_slug(content, created_at)
|
||||
|
||||
# Make unique if collision
|
||||
slug = make_slug_unique(base_slug, existing_slugs)
|
||||
```
|
||||
|
||||
**Replace with**:
|
||||
```python
|
||||
else:
|
||||
# Generate timestamp-based slug (ADR-062)
|
||||
from starpunk.slug_utils import generate_timestamp_slug
|
||||
slug = generate_timestamp_slug(created_at, existing_slugs)
|
||||
```
|
||||
|
||||
3. **Keep the existing function** in `utils.py` unchanged. It may be used elsewhere or for future content-based slug options.
|
||||
|
||||
4. **Update imports** in `notes.py`:
|
||||
- Remove `generate_slug` from the `utils` import line (line 34) if no longer needed
|
||||
- Verify no other callers use `generate_slug()` for default note creation
|
||||
|
||||
**Rationale**:
|
||||
- Creating a distinctly-named function makes the transition explicit
|
||||
- Preserves backward compatibility if content-based slugs are ever needed
|
||||
- Allows gradual migration with clear audit trail
|
||||
|
||||
---
|
||||
|
||||
### Q4: Reserved slug handling in timestamp context
|
||||
|
||||
**Decision**:
|
||||
- **Default timestamp slugs**: Safe to skip reserved slug validation (timestamps cannot collide with reserved words)
|
||||
- **Custom slugs**: Must retain reserved slug validation
|
||||
|
||||
**Implementation**:
|
||||
|
||||
The new `generate_timestamp_slug()` function does NOT need to check reserved slugs because:
|
||||
- Reserved slugs are alphabetic (`api`, `admin`, `feed`, etc.)
|
||||
- Timestamp slugs are purely numeric (`20251216143052`)
|
||||
- These sets are mutually exclusive by construction
|
||||
|
||||
**However**, the existing custom slug validation in `validate_and_sanitize_custom_slug()` must retain reserved slug checking because user-provided slugs could be anything.
|
||||
|
||||
**Code Impact**:
|
||||
- `generate_timestamp_slug()`: No reserved slug check needed
|
||||
- `validate_and_sanitize_custom_slug()`: Unchanged (already handles reserved slugs)
|
||||
- `validate_slug()` in `utils.py`: Unchanged (still validates reserved slugs for custom slugs)
|
||||
|
||||
**Simplification**: You may remove the defensive `validate_slug()` call after timestamp generation in `notes.py` (line 236-237) since timestamp slugs are guaranteed valid by construction:
|
||||
```python
|
||||
# This check is no longer needed for timestamp slugs:
|
||||
# if not validate_slug(slug):
|
||||
# raise InvalidNoteDataError(...)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Q5: Test update strategy for slug format changes
|
||||
|
||||
**Decision**: Use pattern matching for timestamp assertions. Preserve content-based tests in a separate test file or clearly marked section.
|
||||
|
||||
**Test Strategy**:
|
||||
|
||||
1. **For new timestamp slug tests**, use regex pattern matching:
|
||||
```python
|
||||
import re
|
||||
|
||||
TIMESTAMP_SLUG_PATTERN = re.compile(r'^\d{14}(-\d+)?$')
|
||||
|
||||
def test_default_slug_is_timestamp():
|
||||
note = create_note("Some content here")
|
||||
assert TIMESTAMP_SLUG_PATTERN.match(note.slug)
|
||||
|
||||
def test_slug_collision_adds_suffix():
|
||||
# Create two notes at same timestamp
|
||||
fixed_time = datetime(2025, 12, 16, 14, 30, 52)
|
||||
note1 = create_note("First note", created_at=fixed_time)
|
||||
note2 = create_note("Second note", created_at=fixed_time)
|
||||
|
||||
assert note1.slug == "20251216143052"
|
||||
assert note2.slug == "20251216143052-1"
|
||||
```
|
||||
|
||||
2. **For timestamp-specific assertions**, use fixed timestamps in tests:
|
||||
```python
|
||||
def test_timestamp_slug_format():
|
||||
"""Test that timestamp slug matches exact format."""
|
||||
fixed_time = datetime(2025, 12, 16, 14, 30, 52)
|
||||
slug = generate_timestamp_slug(fixed_time, set())
|
||||
assert slug == "20251216143052"
|
||||
```
|
||||
|
||||
3. **Content-based tests**: Move to a separate section or file named `test_legacy_slug_generation.py` with a comment:
|
||||
```python
|
||||
"""
|
||||
Legacy tests for content-based slug generation.
|
||||
|
||||
These test the generate_slug() function in utils.py which is
|
||||
preserved for backward compatibility. New notes use timestamp-based
|
||||
slugs per ADR-062.
|
||||
"""
|
||||
```
|
||||
|
||||
4. **Update existing tests that assert specific slug formats**:
|
||||
- Replace exact string assertions (`assert slug == "hello-world"`) with pattern matching
|
||||
- OR use fixed timestamps and assert exact expected values
|
||||
|
||||
**Affected Test Files** (estimate from your question):
|
||||
- `/home/phil/Projects/starpunk/tests/test_utils.py`: ~20 tests
|
||||
- `/home/phil/Projects/starpunk/tests/test_notes.py`: ~10 tests
|
||||
- `/home/phil/Projects/starpunk/tests/test_micropub.py`: ~10 tests
|
||||
|
||||
**Helper Function** (add to test utilities or conftest.py):
|
||||
```python
|
||||
def assert_valid_timestamp_slug(slug: str) -> None:
|
||||
"""Assert slug matches timestamp format per ADR-062."""
|
||||
pattern = re.compile(r'^\d{14}(-\d+)?$')
|
||||
assert pattern.match(slug), f"Slug '{slug}' does not match timestamp format"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Q6: Phase 0 priority - are tests blocking?
|
||||
|
||||
**Decision**: Phase 0 is a **hard dependency**. Phase 1 cannot proceed until all tests pass.
|
||||
|
||||
**Rationale**:
|
||||
|
||||
1. **Failing tests mask regressions**: If you implement Phase 1 (slug changes) while 19 tests are failing, you cannot distinguish between:
|
||||
- Pre-existing failures
|
||||
- New failures caused by your changes
|
||||
|
||||
2. **CI/CD integrity**: The project should maintain a green build. Committing Phase 1 work with known failing tests violates this principle.
|
||||
|
||||
3. **Debugging complexity**: If slug changes interact unexpectedly with the failing test areas (e.g., feed generation, content negotiation), debugging becomes significantly harder.
|
||||
|
||||
**Implementation Order**:
|
||||
|
||||
```
|
||||
Phase 0 (Test Fixes) <-- MUST COMPLETE FIRST
|
||||
|
|
||||
v
|
||||
+-- Phase 1 (Timestamp Slugs)
|
||||
|
|
||||
+-- Phase 2 (Debug Files) <-- Can run in parallel
|
||||
|
|
||||
+-- Phase 3 (N+1 Query Fix) <-- after Phase 0
|
||||
|
|
||||
+-- Phase 4 (Atomic Variants)
|
||||
|
|
||||
v
|
||||
Phase 5 (Test Coverage) <-- LAST
|
||||
```
|
||||
|
||||
**Exception**: If fixing a Phase 0 test reveals it requires slug-related changes, document the circular dependency and bring to architect attention.
|
||||
|
||||
**Recommendation**: Start Phase 0 immediately. Many of those 19 failures may be simple test maintenance issues (timeouts, assertions that need updating, etc.).
|
||||
|
||||
---
|
||||
|
||||
### Q7: Backwards compatibility for existing notes
|
||||
|
||||
**Decision**: Slug format is **opaque**. No code needs to be format-aware.
|
||||
|
||||
**Clarification**:
|
||||
|
||||
1. **No format detection needed**: Do not create an `is_timestamp_slug()` helper function. The system should treat all slugs identically regardless of format.
|
||||
|
||||
2. **Code paths are format-agnostic**:
|
||||
- Database queries use slug as opaque string key
|
||||
- URL routing uses slug as path parameter
|
||||
- Feed generation includes slug verbatim
|
||||
- Templates display slug without interpretation
|
||||
|
||||
3. **Coexistence is automatic**: Content-based slugs (`my-first-post`) and timestamp slugs (`20251216143052`) are both valid slug patterns. They coexist naturally.
|
||||
|
||||
4. **Sorting behavior**:
|
||||
- Database ORDER BY on `created_at` is unaffected
|
||||
- Alphabetical slug sorting would interleave formats (numbers sort before letters)
|
||||
- This is acceptable; no UI sorts by slug alphabetically
|
||||
|
||||
5. **Display considerations**:
|
||||
- Timestamps are less readable in URLs, but URLs are rarely human-read
|
||||
- UI displays note content, not slugs
|
||||
- RSS feeds include full URLs; format doesn't affect feed readers
|
||||
|
||||
**No audit required**: You do not need to audit slug usages. The design ensures format-agnosticism.
|
||||
|
||||
---
|
||||
|
||||
### Q8: Phase ordering rationale
|
||||
|
||||
**Decision**:
|
||||
- Phases 2, 3, and 4 **can be parallelized** after Phase 0 completes
|
||||
- Phase 1 (Timestamp Slugs) should be done **before** Phases 2-4 for cleaner testing
|
||||
- Each phase should be **committed separately**
|
||||
|
||||
**Recommended Order**:
|
||||
|
||||
```
|
||||
Day 1: Phase 0 (Test Fixes) - Get to green build
|
||||
Day 2: Phase 1 (Timestamp Slugs) - Quick win, user-visible improvement
|
||||
Day 3: Phase 2 (Debug Files) or Phase 3 (N+1) or Phase 4 (Atomic) - Any order
|
||||
Day 4: Remaining phases from above
|
||||
Day 5: Phase 5 (Test Coverage) - Must be last
|
||||
```
|
||||
|
||||
**Commit Strategy**:
|
||||
|
||||
Each phase gets its own commit or small series of commits:
|
||||
```
|
||||
feat(slugs): Implement timestamp-based slugs per ADR-062
|
||||
fix(tests): Resolve 19 failing tests in Phase 0
|
||||
feat(media): Add debug file cleanup and configuration
|
||||
perf(feed): Batch load media and tags to fix N+1 query
|
||||
feat(media): Make variant generation atomic with database
|
||||
test: Expand coverage to 90% for v1.5.0
|
||||
```
|
||||
|
||||
**Why Not Batch**: Batching commits makes bisecting regressions harder and obscures the changelog.
|
||||
|
||||
**Parallelization Note**: If you have capacity to work on multiple phases simultaneously (e.g., waiting for test runs), Phases 2-4 have no code overlap and can be developed in parallel branches:
|
||||
- `feature/v1.5.0-debug-files`
|
||||
- `feature/v1.5.0-n+1-fix`
|
||||
- `feature/v1.5.0-atomic-variants`
|
||||
|
||||
Then merge all into main before Phase 5.
|
||||
|
||||
---
|
||||
|
||||
## Observations Responses
|
||||
|
||||
### O1: Code duplication between `utils.py` and `slug_utils.py`
|
||||
|
||||
**Acknowledged**. The duplicate `RESERVED_SLUGS` definitions should be consolidated in a future cleanup. For v1.5.0:
|
||||
|
||||
- Use the more comprehensive list in `slug_utils.py` (12 items) as the canonical source
|
||||
- Do not consolidate in this release (scope creep)
|
||||
- Add to BACKLOG.md for v1.5.1 or v1.6.0
|
||||
|
||||
### O2: Timestamp format consistency
|
||||
|
||||
**Confirmed**: The hyphen removal in ADR-062 (`YYYYMMDDHHMMSS`) vs existing fallback (`YYYYMMDD-HHMMSS`) is intentional.
|
||||
|
||||
- ADR-062 format: `20251216143052` (14 characters, no separator)
|
||||
- Old fallback format: `20251216-143052` (15 characters, with hyphen)
|
||||
|
||||
**Rationale**:
|
||||
- One fewer character
|
||||
- Consistent with ISO 8601 compact format
|
||||
- No functional difference; purely aesthetic
|
||||
|
||||
**Action**: Use the unhyphenated format per ADR-062.
|
||||
|
||||
### O3: Test coverage gap mentioned in Phase 5
|
||||
|
||||
**Clarification**: This is intentional duplication for visibility.
|
||||
|
||||
- BACKLOG.md lists the gap for tracking
|
||||
- Phase 5 in RELEASE.md lists it for implementation planning
|
||||
- Both point to the same work item
|
||||
|
||||
No action needed; this is not an error.
|
||||
|
||||
---
|
||||
|
||||
## ADR-062 Updates Required
|
||||
|
||||
No updates to ADR-062 are required based on these questions. The ADR's pseudocode is understood to be conceptual, and all implementation details have been clarified in this response document.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
Based on these responses, the developer can now proceed with:
|
||||
|
||||
- [ ] Phase 0: Fix 19 failing tests
|
||||
- [ ] Create `generate_timestamp_slug()` in `slug_utils.py`
|
||||
- [ ] Update `notes.py` to use new timestamp slug function
|
||||
- [ ] Update tests with pattern matching strategy
|
||||
- [ ] Proceed with Phases 2-4 (order flexible)
|
||||
- [ ] Complete Phase 5 test coverage last
|
||||
|
||||
---
|
||||
|
||||
**Architect**: StarPunk Architect Agent
|
||||
**Status**: Questions Answered - Ready for Implementation
|
||||
**Next Action**: Developer to begin Phase 0 (Test Fixes)
|
||||
272
docs/design/v1.5.0/2025-12-16-developer-questions.md
Normal file
272
docs/design/v1.5.0/2025-12-16-developer-questions.md
Normal file
@@ -0,0 +1,272 @@
|
||||
# Developer Review Questions - v1.5.0 Release Plan
|
||||
|
||||
**Date**: 2025-12-16
|
||||
**Reviewer**: StarPunk Developer Agent
|
||||
**Documents Reviewed**:
|
||||
- `docs/decisions/ADR-062-timestamp-based-slug-format.md`
|
||||
- `docs/projectplan/v1.5.0/RELEASE.md`
|
||||
- `docs/projectplan/BACKLOG.md`
|
||||
|
||||
## Executive Summary
|
||||
|
||||
After thorough review of the v1.5.0 release plan and ADR-062, I have identified **8 questions** that need clarification from the architect before implementation can begin. These questions fall into three categories:
|
||||
|
||||
1. **Slug implementation details** (Questions 1-4)
|
||||
2. **Test structure and expectations** (Questions 5-6)
|
||||
3. **Phase dependencies and ordering** (Questions 7-8)
|
||||
|
||||
The majority of questions concern Phase 1 (Timestamp-Based Slugs), which has some ambiguities around collision handling and interaction with existing code paths.
|
||||
|
||||
---
|
||||
|
||||
## Questions
|
||||
|
||||
### Phase 1: Timestamp-Based Slugs
|
||||
|
||||
#### Q1: Collision handling function location and signature
|
||||
|
||||
**Issue**: ADR-062 specifies an `ensure_unique_slug(base_slug: str) -> str` function that checks `slug_exists(base_slug)`, but the implementation location and database interaction pattern are unclear.
|
||||
|
||||
**Current State**:
|
||||
- `starpunk/slug_utils.py` has `make_slug_unique_with_suffix(base_slug, existing_slugs: Set[str])` that takes a pre-fetched set
|
||||
- `starpunk/utils.py` has `make_slug_unique(base_slug, existing_slugs: set[str])` with random suffix logic
|
||||
- `starpunk/notes.py` line 220 calls `_get_existing_slugs(db)` to fetch all slugs, then passes to uniqueness check
|
||||
|
||||
**Questions**:
|
||||
1. Should `ensure_unique_slug()` perform its own database queries (violates current pattern)?
|
||||
2. Or should it follow the existing pattern of accepting `existing_slugs: Set[str]` parameter?
|
||||
3. Should this function live in `slug_utils.py` or `utils.py`?
|
||||
4. The ADR shows `slug_exists()` as a separate function - should this be implemented, or is this pseudocode?
|
||||
|
||||
**Implementation Impact**: This affects the function signature and which module needs modification. The current codebase consistently passes pre-fetched slug sets to avoid N+1 queries.
|
||||
|
||||
**Suggested Clarification**: Confirm whether to:
|
||||
- Keep existing pattern: `ensure_unique_slug(base_slug: str, existing_slugs: Set[str]) -> str`
|
||||
- Or introduce database coupling: `ensure_unique_slug(base_slug: str, db: Connection) -> str`
|
||||
|
||||
---
|
||||
|
||||
#### Q2: Sequential suffix starting number
|
||||
|
||||
**Issue**: ADR-062 says "first collision: `20251216143052-1`" but the examples table shows the second note as `-1`, implying the first note has no suffix.
|
||||
|
||||
**Current State**:
|
||||
- Existing `make_slug_unique_with_suffix()` starts at `-2` for first collision (base slug is attempt #1)
|
||||
- This matches table in ADR-062 examples
|
||||
|
||||
**Questions**:
|
||||
1. Does "first collision" mean the first duplicate attempt (which should get `-1`)?
|
||||
2. Or does it mean the first note gets the base slug, and second note gets `-1`?
|
||||
|
||||
**Example Scenario**:
|
||||
```python
|
||||
# Two notes created at exactly 2025-12-16 14:30:52
|
||||
Note A created first: slug = ?
|
||||
Note B created second: slug = ?
|
||||
|
||||
# Is it:
|
||||
Option 1: A = 20251216143052, B = 20251216143052-1 (matches table)
|
||||
Option 2: A = 20251216143052-1, B = 20251216143052-2 (matches "first collision" text)
|
||||
```
|
||||
|
||||
**Implementation Impact**: Determines whether sequential suffix loop starts at 1 or 2.
|
||||
|
||||
**Suggested Clarification**: Confirm that first note gets base slug (no suffix), first collision gets `-1`.
|
||||
|
||||
---
|
||||
|
||||
#### Q3: Two `generate_slug()` functions with different signatures
|
||||
|
||||
**Issue**: The codebase has two `generate_slug()` functions:
|
||||
|
||||
**Current State**:
|
||||
- `starpunk/utils.py` line 173: `generate_slug(content: str, created_at: Optional[datetime]) -> str`
|
||||
- Content-based slug generation (extracts words)
|
||||
- Used by `notes.py` line 230
|
||||
- ADR-062 specifies: `generate_slug(custom_slug: str = None, created_at: datetime = None) -> str`
|
||||
- Timestamp-based, no content parameter
|
||||
|
||||
**Questions**:
|
||||
1. Should I replace the function in `utils.py`?
|
||||
2. Should I create a new function in `slug_utils.py` and leave `utils.py` as deprecated?
|
||||
3. Should both functions coexist with different names?
|
||||
4. The removal of the `content` parameter is a breaking change for `notes.py` - should `notes.py` be updated to not pass content at all?
|
||||
|
||||
**Implementation Impact**:
|
||||
- If replacing `utils.py` function: ~15 test files will need updates
|
||||
- If creating new function: need to update all callers to use new function
|
||||
- `notes.py` line 230 currently calls `generate_slug(content, created_at)` - this will break
|
||||
|
||||
**Suggested Clarification**: Specify exact function location and migration path for callers.
|
||||
|
||||
---
|
||||
|
||||
#### Q4: Reserved slug handling in timestamp context
|
||||
|
||||
**Issue**: ADR-062 doesn't address what happens if a timestamp-based slug conflicts with a reserved slug.
|
||||
|
||||
**Current State**:
|
||||
- Reserved slugs: `['api', 'admin', 'auth', 'feed', ...]` (all alphabetic)
|
||||
- Timestamp format: `20251216143052` (all numeric)
|
||||
- **These can never collide**
|
||||
|
||||
**Questions**:
|
||||
1. Should reserved slug validation be removed from the default slug path since timestamps can't conflict?
|
||||
2. Should it remain for safety/future-proofing?
|
||||
3. Custom slugs still need reserved validation - should this logic split?
|
||||
|
||||
**Implementation Impact**: May simplify code by removing unnecessary validation from timestamp path.
|
||||
|
||||
**Suggested Clarification**: Confirm whether reserved slug check is still needed for default slugs.
|
||||
|
||||
---
|
||||
|
||||
### Phase 0: Test Fixes
|
||||
|
||||
#### Q5: Test update strategy for slug format changes
|
||||
|
||||
**Issue**: Phase 1 will change default slug format from content-based to timestamp-based. Many tests currently assert content-based slugs.
|
||||
|
||||
**Current State**:
|
||||
- `tests/test_utils.py` has 20+ tests asserting content-based slug behavior (e.g., `assert slug == "hello-world-this-is-my"`)
|
||||
- Phase 1 acceptance criteria: "Update expected slug formats in test assertions"
|
||||
|
||||
**Questions**:
|
||||
1. Should these tests be updated to assert timestamp format (e.g., `assert slug.startswith("2025")`)?
|
||||
2. Should the old content-based tests be preserved but marked as testing legacy behavior?
|
||||
3. Should tests be split: "default slug generation" vs "content-based slug generation" (if we keep the old function)?
|
||||
4. Do you want all slug assertions to use pattern matching (timestamp format) or fixed timestamps in tests?
|
||||
|
||||
**Implementation Impact**:
|
||||
- Affects ~30 test assertions in `test_utils.py`
|
||||
- Affects ~10 integration tests in `test_notes.py`, `test_micropub.py`
|
||||
- May require test helper functions for deterministic timestamps
|
||||
|
||||
**Suggested Clarification**: Provide guidance on test update strategy.
|
||||
|
||||
---
|
||||
|
||||
#### Q6: Phase 0 priority - are tests blocking?
|
||||
|
||||
**Issue**: Phase 0 is marked "Must complete first - unblocks all other phases" but the 19 failing tests appear unrelated to slug changes.
|
||||
|
||||
**Current State**:
|
||||
- Failing tests: Migration performance, feed streaming, content negotiation, search security
|
||||
- None directly related to slug generation
|
||||
- Phase 1 (slugs) could theoretically proceed with these tests failing
|
||||
|
||||
**Questions**:
|
||||
1. Is Phase 0 truly blocking for Phase 1, or can Phase 1 proceed if slug-related tests pass?
|
||||
2. Should Phase 0 be completed before ANY other phase, or just before phases that depend on those specific test areas?
|
||||
3. If I discover that fixing some Phase 0 tests requires changes that conflict with Phase 1 work, what's the priority?
|
||||
|
||||
**Implementation Impact**: Affects work sequencing and potential merge conflicts.
|
||||
|
||||
**Suggested Clarification**: Confirm hard dependency of Phase 0 on all other phases, or allow parallel work.
|
||||
|
||||
---
|
||||
|
||||
### General Implementation Questions
|
||||
|
||||
#### Q7: Backwards compatibility for existing notes
|
||||
|
||||
**Issue**: ADR-062 states "Existing notes retain their slugs (no data migration)" but doesn't specify how the dual slug formats coexist.
|
||||
|
||||
**Current State**:
|
||||
- Database has notes with content-based slugs (e.g., `my-first-post`)
|
||||
- New notes will have timestamp slugs (e.g., `20251216143052`)
|
||||
- Both slug formats will exist simultaneously
|
||||
|
||||
**Questions**:
|
||||
1. Do any code paths need to detect/handle the two formats differently?
|
||||
2. Should there be a helper function `is_timestamp_slug(slug: str) -> bool` for future use?
|
||||
3. Are there any edge cases where code assumes slug format (e.g., sorting, display, URLs)?
|
||||
4. Does feed generation or UI need updates to handle mixed slug formats?
|
||||
|
||||
**Implementation Impact**: May require audit of all slug usages to ensure format-agnostic handling.
|
||||
|
||||
**Suggested Clarification**: Confirm if any code needs to be format-aware, or if slug format is truly opaque.
|
||||
|
||||
---
|
||||
|
||||
#### Q8: Phase ordering rationale
|
||||
|
||||
**Issue**: Phases 2-4 are all marked "Phase 0 complete" as dependency, but no interdependencies are noted between them.
|
||||
|
||||
**Current State**:
|
||||
- Phase 2: Debug file management
|
||||
- Phase 3: N+1 query fix (feed generation)
|
||||
- Phase 4: Atomic variant generation
|
||||
- All independent of each other
|
||||
|
||||
**Questions**:
|
||||
1. Can Phases 2-4 be implemented in parallel after Phase 0?
|
||||
2. Is there a preferred order (2→3→4) for any reason?
|
||||
3. Should each phase be committed separately or can they be batched?
|
||||
|
||||
**Implementation Impact**: Affects work planning and commit structure.
|
||||
|
||||
**Suggested Clarification**: Confirm whether Phases 2-4 can be parallelized or should be sequential.
|
||||
|
||||
---
|
||||
|
||||
## Additional Observations (Not Blocking)
|
||||
|
||||
These observations don't block implementation but may inform architectural decisions:
|
||||
|
||||
### O1: Code duplication between `utils.py` and `slug_utils.py`
|
||||
|
||||
**Observation**: Reserved slugs are defined in both:
|
||||
- `utils.py` line 24: `RESERVED_SLUGS = {"admin", "api", ...}` (5 items)
|
||||
- `slug_utils.py` line 24: `RESERVED_SLUGS = frozenset([...])` (12 items, more comprehensive)
|
||||
|
||||
**Impact**: Low - but may cause confusion. Consider consolidating in v1.5.1.
|
||||
|
||||
---
|
||||
|
||||
### O2: Timestamp format consistency
|
||||
|
||||
**Observation**: ADR-062 uses `YYYYMMDDHHMMSS` (no separator) but existing fallback in `utils.py` line 220 uses `YYYYMMDD-HHMMSS` (with hyphen).
|
||||
|
||||
**Questions**:
|
||||
- Is the hyphen removal intentional (shorter, more compact)?
|
||||
- Current: `20241118-143045` (15 chars)
|
||||
- ADR-062: `20241118143045` (14 chars)
|
||||
|
||||
**Impact**: Minor - affects slug length by 1 character.
|
||||
|
||||
---
|
||||
|
||||
### O3: Test coverage gap mentioned in Phase 5
|
||||
|
||||
**Observation**: Phase 5 mentions "MPO format handling (untested)" but this is already in BACKLOG as "High priority - Scheduled for v1.5.0".
|
||||
|
||||
**Question**: Is this intentionally listed in both BACKLOG and Phase 5, or is there duplication?
|
||||
|
||||
**Impact**: None - just a clarity question.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
**Implementation Readiness**: 6/10
|
||||
|
||||
The release plan is well-structured with clear phases and acceptance criteria. However, **Phase 1 (Timestamp-Based Slugs)** has enough ambiguity in implementation details (Questions 1-4) that I cannot confidently proceed without architect clarification.
|
||||
|
||||
**Recommended Next Steps**:
|
||||
1. Architect addresses Questions 1-4 (Phase 1 blocking issues)
|
||||
2. Architect clarifies Questions 5-8 (helps inform implementation approach)
|
||||
3. Developer proceeds with Phase 0 (test fixes) while awaiting Q1-Q4 answers
|
||||
4. Developer implements Phases 1-4 after clarifications received
|
||||
|
||||
**Estimated Time to Implementation-Ready**: 1-2 hours for architect to address questions.
|
||||
|
||||
---
|
||||
|
||||
## Approval
|
||||
|
||||
Once these questions are addressed, I am confident the implementation can proceed smoothly with the 18-29 hour estimate provided in the release plan.
|
||||
|
||||
**Developer**: StarPunk Developer Agent
|
||||
**Status**: Awaiting Architect Response
|
||||
**Next Action**: Architect to review and answer questions 1-8
|
||||
@@ -1,9 +1,20 @@
|
||||
# StarPunk Backlog
|
||||
|
||||
**Last Updated**: 2025-12-10
|
||||
**Last Updated**: 2025-12-17
|
||||
|
||||
## Recently Completed
|
||||
|
||||
### v1.4.2 - HEIC/MPO Support and Dimension Limit Increase (Complete)
|
||||
- HEIC/HEIF format detection and conversion to JPEG
|
||||
- MPO (Multi-Picture Object) format support for iPhone depth photos
|
||||
- MAX_DIMENSION increased from 4096 to 12000 pixels
|
||||
- Enhanced debug logging for failed uploads
|
||||
|
||||
### v1.4.0/v1.4.1 - Media Variants (Complete)
|
||||
- Image variant generation (thumb, small, medium, large)
|
||||
- Micropub media endpoint
|
||||
- Enhanced feed media support with Media RSS
|
||||
|
||||
### v1.3.0 - Microformats2 Compliance and Tags (Complete)
|
||||
- Tag/Category system with database schema
|
||||
- p-category microformats2 markup
|
||||
@@ -29,31 +40,76 @@
|
||||
|
||||
## High
|
||||
|
||||
### Enhanced Feed Media Support *(Scheduled: v1.4.0)*
|
||||
- Multiple image sizes/thumbnails (150px, 320px, 640px, 1280px)
|
||||
- Full Media RSS implementation (media:group, all attributes)
|
||||
- Enhanced JSON Feed attachments
|
||||
- ATOM enclosure links for all media
|
||||
- See: ADR-059
|
||||
### MPO Format Test Coverage *(Scheduled: v1.5.0)*
|
||||
- **Description**: MPO conversion code exists but has no test coverage. MPO is advertised in the CHANGELOG but the handling is untested.
|
||||
- **Location**: `starpunk/media.py` lines 163-173
|
||||
- **Source**: Developer Review (M1)
|
||||
- **Approach**: Add `test_mpo_detection_and_conversion()` to `TestHEICSupport` class in `tests/test_media_upload.py`. Create an MPO test image using Pillow's MPO support.
|
||||
|
||||
### POSSE
|
||||
### POSSE
|
||||
- Native syndication to social networks
|
||||
- Supported networks:
|
||||
- First iteration:
|
||||
- First iteration:
|
||||
- Mastodon (and compatible services)
|
||||
- Bluesky
|
||||
- Second iteration
|
||||
- TBD
|
||||
- Second iteration
|
||||
- TBD
|
||||
- Solution should include a configuration UI for setup
|
||||
|
||||
---
|
||||
|
||||
## Medium
|
||||
|
||||
### Default slug change
|
||||
- The default slug should be a date time stamp
|
||||
### Debug File Storage Without Cleanup Mechanism *(Scheduled: v1.5.0)*
|
||||
- **Description**: Failed uploads are saved to `data/debug/` directory for analysis, but there is no mechanism to clean up these files. This could consume significant disk space, especially if under attack.
|
||||
- **Location**: `starpunk/media.py` lines 133-137
|
||||
- **Source**: Developer Review (M2), Architect Review (Issue 1.2.2)
|
||||
- **Approach**:
|
||||
1. Add `DEBUG_SAVE_FAILED_UPLOADS` config option (default: false in production)
|
||||
2. Implement automatic cleanup (files older than 7 days)
|
||||
3. Add disk space check or size limit (e.g., 100MB max)
|
||||
|
||||
### Filename Not Sanitized in Debug Path (Security) *(Scheduled: v1.5.0)*
|
||||
- **Description**: The original filename is used directly in the debug file path without sanitization, which could cause path traversal or special character issues.
|
||||
- **Location**: `starpunk/media.py` line 135
|
||||
- **Source**: Architect Review (Issue 1.2.3)
|
||||
- **Approach**: Sanitize filename before use: `safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]`
|
||||
|
||||
### N+1 Query Pattern in Feed Generation *(Scheduled: v1.5.0 - Partial)*
|
||||
- **Description**: In `_get_cached_notes()`, media and tags are loaded per-note in separate queries. For 50 notes, this is 100 additional database queries, degrading performance.
|
||||
- **Location**: `starpunk/routes/public.py` lines 68-74
|
||||
- **Source**: Architect Review (Issue 2.2.9)
|
||||
- **Approach**: Implement batch loading function `get_media_for_notes(note_ids: List[int])` using a single query with `WHERE note_id IN (...)`.
|
||||
- **v1.5.0 Scope**: Only `_get_cached_notes()` will be fixed in v1.5.0. Other N+1 patterns deferred (see Low Priority section).
|
||||
|
||||
### N+1 Query Patterns - Deferred Locations
|
||||
- **Description**: N+1 query patterns exist in multiple locations beyond `_get_cached_notes()`. These are lower priority due to lower traffic or single-note contexts.
|
||||
- **Deferred Locations**:
|
||||
1. **Admin Edit Page** (`starpunk/routes/admin.py` line 186): Single note context, low impact
|
||||
2. **Tag Archive Pages** (`starpunk/routes/public.py` lines 243-248, 332): Lower traffic than main feed
|
||||
3. **Individual Note View** (`starpunk/routes/public.py` lines 282-289): Single note, negligible
|
||||
4. **Note Model Property** (`starpunk/models.py` lines 380-381): On-demand loading, acceptable
|
||||
5. **Tag Module** (`starpunk/tags.py` line 197): Internal helper, low volume
|
||||
- **Source**: Architect Review, v1.5.0 scoping decision
|
||||
- **Approach**: Future optimization if performance issues arise. Consider batch loading patterns established in v1.5.0.
|
||||
- **Priority**: Deferred to post-v1.5.0
|
||||
|
||||
### Transaction Not Atomic in Variant Generation *(Scheduled: v1.5.0)*
|
||||
- **Description**: Files are written to disk before database commit. If the database commit fails, orphaned files remain on disk.
|
||||
- **Location**: `starpunk/media.py` lines 404-440
|
||||
- **Source**: Architect Review (Issue 2.2.6)
|
||||
- **Approach**: Write variant files to a temporary location first, then move to final location after successful database commit.
|
||||
|
||||
### Rate Limiting on Upload Endpoints
|
||||
- **Description**: No rate limiting exists on media upload endpoints, making them vulnerable to abuse.
|
||||
- **Location**: `/admin/new` (admin.py), `/media` (micropub.py)
|
||||
- **Source**: Architect Review (Security Assessment)
|
||||
- **Approach**: Implement Flask-Limiter or similar rate limiting middleware for upload routes.
|
||||
|
||||
### Default Slug Change *(Scheduled: v1.5.0)*
|
||||
- The default slug should be a date time stamp
|
||||
- YYYYMMDDHHMMSS
|
||||
- Edge case, if the slug would somehow be a duplicate append a "-x" e.g. -1
|
||||
- Edge case, if the slug would somehow be a duplicate append a "-x" e.g. -1
|
||||
|
||||
### Tag Enhancements (v1.3.0 Follow-up)
|
||||
- Tag pagination on archive pages (when note count exceeds threshold)
|
||||
@@ -105,6 +161,42 @@
|
||||
|
||||
## Low
|
||||
|
||||
### HEIC/MPO Conversion Quality Not Configurable
|
||||
- **Description**: HEIC and MPO to JPEG conversion uses hardcoded `quality=95`. Users with different quality/size tradeoff preferences cannot adjust this.
|
||||
- **Location**: `starpunk/media.py` line 157
|
||||
- **Source**: Developer Review (M3)
|
||||
- **Approach**: Add `HEIC_CONVERSION_QUALITY` config variable with 95 as default.
|
||||
|
||||
### MAX_DIMENSION Not Configurable
|
||||
- **Description**: `MAX_DIMENSION = 12000` is a hardcoded constant. Cannot adjust limit without code change.
|
||||
- **Location**: `starpunk/media.py` line 41
|
||||
- **Source**: Developer Review (M4)
|
||||
- **Approach**: Make configurable via config variable, keeping 12000 as default.
|
||||
|
||||
### Animated WebP Not Detected
|
||||
- **Description**: Animated GIF detection exists but animated WebP is not handled, potentially bypassing animated image size checks.
|
||||
- **Location**: `starpunk/media.py` (validate_image function)
|
||||
- **Source**: Architect Review (Issue 2.2.2)
|
||||
- **Approach**: Add animated WebP detection similar to existing GIF handling.
|
||||
|
||||
### Caption Not Escaped in RSS HTML
|
||||
- **Description**: In RSS generation, caption is used directly in alt attribute without HTML escaping. Could break markup if caption contains `"` or other special characters.
|
||||
- **Location**: `starpunk/feeds/rss.py` line 136
|
||||
- **Source**: Architect Review (Issue 2.2.10)
|
||||
- **Approach**: Use `html.escape()` for caption when generating HTML content.
|
||||
|
||||
### Incomplete MPO Documentation
|
||||
- **Description**: Code comment says "extract primary image" but doesn't explain the multi-frame nature of MPO files (contain multiple images for 3D or depth maps).
|
||||
- **Location**: `starpunk/media.py` lines 163-165
|
||||
- **Source**: Developer Review (M5)
|
||||
- **Approach**: Enhance inline comment to document that MPO files contain multiple frames and only the primary frame is extracted.
|
||||
|
||||
### Module Docstring Stale
|
||||
- **Description**: Module docstring still states "4096x4096 max dimensions" but MAX_DIMENSION was updated to 12000 in v1.4.2.
|
||||
- **Location**: `starpunk/media.py` lines 1-12
|
||||
- **Source**: Architect Review (Issue 1.2.1)
|
||||
- **Approach**: Update docstring to reflect current 12000px limit.
|
||||
|
||||
### Flaky Migration Race Condition Tests
|
||||
- Improve `test_migration_race_condition.py::TestGraduatedLogging::test_debug_level_for_early_retries`
|
||||
- Test expects DEBUG retry messages but passes when migration succeeds without retries
|
||||
|
||||
346
docs/projectplan/v1.5.0/RELEASE.md
Normal file
346
docs/projectplan/v1.5.0/RELEASE.md
Normal file
@@ -0,0 +1,346 @@
|
||||
# StarPunk v1.5.0 Release Plan
|
||||
|
||||
**Status**: Approved
|
||||
**Codename**: "Trigger"
|
||||
**Focus**: Stability, Test Coverage, and Technical Debt Reduction
|
||||
**Last Updated**: 2025-12-17
|
||||
|
||||
## Overview
|
||||
|
||||
v1.5.0 is a quality-focused release that addresses failing tests, increases test coverage to 90%, implements critical fixes from the v1.4.x review cycle, and resolves technical debt. No new user-facing features are planned.
|
||||
|
||||
## Goals
|
||||
|
||||
1. **Fix Failing Tests** - Resolve all 19 currently failing tests
|
||||
2. **90% Test Coverage** - Increase overall test coverage to 90% minimum
|
||||
3. **Technical Debt Reduction** - Address 6 specific backlog items
|
||||
4. **Code Quality** - Better error handling, atomicity, and performance
|
||||
|
||||
## Phased Implementation Plan
|
||||
|
||||
### Phase 0: Test Fixes (Critical Path)
|
||||
**Priority**: Must complete first - unblocks all other phases
|
||||
|
||||
#### Scope
|
||||
Fix the 19 failing tests identified in the current test suite:
|
||||
|
||||
| Category | Count | Tests |
|
||||
|----------|-------|-------|
|
||||
| Migration Performance | 2 | `test_single_worker_performance`, `test_concurrent_workers_performance` |
|
||||
| Feed Route (Streaming) | 1 | `test_feed_route_streaming` |
|
||||
| Feed Endpoints | 3 | `test_feed_rss_endpoint`, `test_feed_json_endpoint`, `test_feed_xml_legacy_endpoint` |
|
||||
| Content Negotiation | 6 | `test_accept_rss`, `test_accept_json_feed`, `test_accept_json_generic`, `test_accept_wildcard`, `test_no_accept_header`, `test_quality_factor_json_wins` |
|
||||
| Backward Compatibility | 1 | `test_feed_xml_contains_rss` |
|
||||
| Search Security | 1 | `test_search_escapes_html_in_note_content` |
|
||||
|
||||
#### Approach
|
||||
1. Investigate each failing test category
|
||||
2. Determine if failure is test issue or code issue
|
||||
3. Fix appropriately (prefer fixing tests over changing working code)
|
||||
4. Document any behavioral changes
|
||||
|
||||
#### Acceptance Criteria
|
||||
- [ ] All 879 tests pass
|
||||
- [ ] No test skips added (unless justified)
|
||||
- [ ] No test timeouts
|
||||
|
||||
#### Dependencies
|
||||
None - this is the first phase
|
||||
|
||||
---
|
||||
|
||||
### Phase 1: Timestamp-Based Slugs
|
||||
**Priority**: High - Addresses user-reported issue
|
||||
|
||||
#### Scope
|
||||
Implement ADR-062 to change default slug format from content-based to timestamp-based.
|
||||
|
||||
#### Implementation
|
||||
1. Update `starpunk/slug_utils.py`:
|
||||
- Change `generate_slug()` to use timestamp format `YYYYMMDDHHMMSS`
|
||||
- Update collision handling to use sequential suffix (`-1`, `-2`, etc.)
|
||||
- Preserve custom slug functionality
|
||||
|
||||
2. Update `starpunk/notes.py`:
|
||||
- Remove content parameter from slug generation calls
|
||||
|
||||
3. Update tests:
|
||||
- Modify expected slug formats in test assertions
|
||||
- Add tests for new timestamp format
|
||||
- Add tests for sequential collision handling
|
||||
|
||||
#### Acceptance Criteria
|
||||
- [ ] Default slugs use `YYYYMMDDHHMMSS` format
|
||||
- [ ] Collision handling uses `-1`, `-2` suffix
|
||||
- [ ] Custom slugs via `mp-slug` work unchanged
|
||||
- [ ] Custom slugs via web UI work unchanged
|
||||
- [ ] Existing notes unaffected
|
||||
- [ ] ADR-062 referenced in code comments
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete (all tests passing)
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Debug File Management
|
||||
**Priority**: Medium - Security and operations concern
|
||||
|
||||
#### Scope
|
||||
Implement cleanup mechanism for failed upload debug files and add configuration controls.
|
||||
|
||||
#### Implementation
|
||||
1. Add configuration options:
|
||||
```python
|
||||
DEBUG_SAVE_FAILED_UPLOADS = False # Default: disabled in production
|
||||
DEBUG_FILE_MAX_AGE_DAYS = 7 # Auto-delete threshold
|
||||
DEBUG_FILE_MAX_SIZE_MB = 100 # Maximum debug folder size
|
||||
```
|
||||
|
||||
2. Implement cleanup logic in `starpunk/media.py`:
|
||||
- Check config before saving debug files
|
||||
- Implement `cleanup_old_debug_files()` function
|
||||
- Add size limit check before saving
|
||||
|
||||
3. Add startup cleanup:
|
||||
- Run cleanup on application startup
|
||||
- Log cleanup actions
|
||||
|
||||
4. Sanitize filenames:
|
||||
- Sanitize filename before debug path construction
|
||||
- Pattern: `"".join(c for c in filename if c.isalnum() or c in "._-")[:50]`
|
||||
|
||||
#### Acceptance Criteria
|
||||
- [ ] Debug files disabled by default
|
||||
- [ ] Files older than 7 days auto-deleted when enabled
|
||||
- [ ] Folder size limited to 100MB
|
||||
- [ ] Filenames sanitized (no path traversal)
|
||||
- [ ] Cleanup runs on startup
|
||||
- [ ] Tests cover all scenarios
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: N+1 Query Fix (Feed Generation)
|
||||
**Priority**: Medium - Performance improvement
|
||||
|
||||
#### Scope
|
||||
Fix N+1 query pattern in `_get_cached_notes()` only. Other N+1 patterns are deferred (documented in BACKLOG.md).
|
||||
|
||||
#### Implementation
|
||||
1. Create batch loading functions in `starpunk/media.py`:
|
||||
```python
|
||||
def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
|
||||
"""Batch load media for multiple notes in single query."""
|
||||
```
|
||||
|
||||
2. Create batch loading functions in `starpunk/tags.py`:
|
||||
```python
|
||||
def get_tags_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
|
||||
"""Batch load tags for multiple notes in single query."""
|
||||
```
|
||||
|
||||
3. Update `_get_cached_notes()` in `starpunk/routes/public.py`:
|
||||
- Use batch loading instead of per-note queries
|
||||
- Maintain same output format
|
||||
|
||||
#### Acceptance Criteria
|
||||
- [ ] Feed generation uses batch queries
|
||||
- [ ] Query count reduced from O(n) to O(1) for media/tags
|
||||
- [ ] No change to API behavior
|
||||
- [ ] Performance improvement verified in tests
|
||||
- [ ] Other N+1 locations documented in BACKLOG.md (not fixed)
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Atomic Variant Generation
|
||||
**Priority**: Medium - Data integrity
|
||||
|
||||
#### Scope
|
||||
Make variant file generation atomic with database commits to prevent orphaned files.
|
||||
|
||||
#### Implementation
|
||||
1. Modify `generate_all_variants()` in `starpunk/media.py`:
|
||||
- Write variants to temporary directory first
|
||||
- Perform database inserts in transaction
|
||||
- Move files to final location after successful commit
|
||||
- Clean up temp files on any failure
|
||||
|
||||
2. Add startup recovery:
|
||||
- Detect orphaned variant files on startup
|
||||
- Log warnings for orphans found
|
||||
- Optionally clean up orphans
|
||||
|
||||
#### Flow
|
||||
```
|
||||
1. Generate variants to temp directory (data/media/.tmp/)
|
||||
2. BEGIN TRANSACTION
|
||||
3. INSERT media record
|
||||
4. INSERT variant records
|
||||
5. COMMIT
|
||||
6. Move files from temp to final location
|
||||
7. On failure: ROLLBACK, delete temp files
|
||||
```
|
||||
|
||||
#### Acceptance Criteria
|
||||
- [ ] No orphaned files on database failures
|
||||
- [ ] No orphaned DB records on file failures
|
||||
- [ ] Atomic operation for all media saves
|
||||
- [ ] Startup recovery detects orphans
|
||||
- [ ] Tests simulate failure scenarios
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Test Coverage Expansion
|
||||
**Priority**: Medium - Quality assurance
|
||||
|
||||
#### Scope
|
||||
Increase overall test coverage to 90% minimum.
|
||||
|
||||
#### Approach
|
||||
1. Run coverage report: `uv run pytest --cov=starpunk --cov-report=html`
|
||||
2. Identify modules below 90% coverage
|
||||
3. Prioritize based on risk and complexity
|
||||
4. Write tests for uncovered paths
|
||||
|
||||
#### Known Coverage Gaps (to verify)
|
||||
- MPO format handling (untested)
|
||||
- Edge cases in error paths
|
||||
- Configuration validation paths
|
||||
- Startup/shutdown hooks
|
||||
|
||||
#### Specific Test Additions
|
||||
1. **MPO Format Tests** (`tests/test_media_upload.py`):
|
||||
- `test_mpo_detection_and_conversion()`
|
||||
- `test_mpo_corrupted_file()`
|
||||
- `test_mpo_single_frame()`
|
||||
|
||||
2. **Debug File Tests** (new test file or extend `test_media_upload.py`):
|
||||
- `test_debug_file_disabled_by_default()`
|
||||
- `test_debug_file_cleanup_old_files()`
|
||||
- `test_debug_file_size_limit()`
|
||||
- `test_debug_filename_sanitization()`
|
||||
|
||||
3. **Batch Loading Tests**:
|
||||
- `test_get_media_for_notes_batch()`
|
||||
- `test_get_tags_for_notes_batch()`
|
||||
- `test_batch_with_empty_list()`
|
||||
|
||||
#### Acceptance Criteria
|
||||
- [ ] Overall coverage >= 90%
|
||||
- [ ] No module below 85% coverage
|
||||
- [ ] All new code in v1.5.0 has 100% coverage
|
||||
- [ ] MPO handling fully tested
|
||||
|
||||
#### Dependencies
|
||||
- Phases 1-4 complete (tests cover new functionality)
|
||||
|
||||
---
|
||||
|
||||
## Out of Scope
|
||||
|
||||
Items explicitly excluded from v1.5.0:
|
||||
|
||||
| Item | Reason |
|
||||
|------|--------|
|
||||
| Rate limiting | Handled by reverse proxy (Caddy/Nginx) |
|
||||
| Schema changes | Not needed for v1.5.0 fixes |
|
||||
| New user features | Quality-focused release |
|
||||
| N+1 fixes in admin/search | Low traffic, deferred to BACKLOG |
|
||||
| UI changes | No frontend work planned |
|
||||
|
||||
---
|
||||
|
||||
## Recommendation: Single Release vs. Multiple Patches
|
||||
|
||||
**Recommendation: Single v1.5.0 Release**
|
||||
|
||||
### Rationale
|
||||
|
||||
1. **Phase Dependencies**: Most phases depend on Phase 0 (test fixes). Splitting would create artificial release boundaries.
|
||||
|
||||
2. **Cognitive Overhead**: Multiple patch releases (1.5.1, 1.5.2, etc.) require:
|
||||
- Multiple changelog entries
|
||||
- Multiple version bumps
|
||||
- Multiple release notes
|
||||
- More git tags/branches
|
||||
|
||||
3. **Test Coverage Integration**: Test coverage work (Phase 5) tests functionality from Phases 1-4. Separating them creates incomplete test coverage.
|
||||
|
||||
4. **User Experience**: Users prefer fewer, more significant updates over many small patches.
|
||||
|
||||
5. **Scope Alignment**: All v1.5.0 work is internally focused (no external API changes). Users see one "quality improvement" release.
|
||||
|
||||
### Exception
|
||||
|
||||
If Phase 0 (test fixes) reveals critical bugs affecting production, those fixes should be:
|
||||
- Backported to a v1.4.3 patch release on the current branch
|
||||
- Then merged forward to v1.5.0
|
||||
|
||||
### Alternative Considered
|
||||
|
||||
Splitting into:
|
||||
- v1.5.0: Phase 0 (test fixes) + Phase 1 (slugs)
|
||||
- v1.5.1: Phase 2-4 (technical debt)
|
||||
- v1.5.2: Phase 5 (test coverage)
|
||||
|
||||
**Rejected** because test coverage work must test the new functionality, making separation counterproductive.
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
| # | Criterion | Verification |
|
||||
|---|-----------|--------------|
|
||||
| 1 | All tests pass | `uv run pytest` shows 0 failures |
|
||||
| 2 | Coverage >= 90% | `uv run pytest --cov=starpunk` |
|
||||
| 3 | MPO tested | MPO tests in test suite |
|
||||
| 4 | Debug cleanup works | Manual verification + tests |
|
||||
| 5 | N+1 fixed in feed | Performance tests show improvement |
|
||||
| 6 | Variants atomic | Failure simulation tests pass |
|
||||
| 7 | Slugs timestamp-based | New notes use `YYYYMMDDHHMMSS` format |
|
||||
| 8 | No regressions | Full test suite passes |
|
||||
| 9 | ADRs documented | ADR-062 in `/docs/decisions/` |
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- ADR-062: Timestamp-Based Slug Format (supersedes ADR-007)
|
||||
- ADR-007: Slug Generation Algorithm (superseded)
|
||||
- BACKLOG.md: Deferred N+1 query locations documented
|
||||
- v1.4.2 Architect Review: Source of many v1.5.0 items
|
||||
|
||||
---
|
||||
|
||||
## Implementation Timeline
|
||||
|
||||
| Phase | Estimated Effort | Dependencies |
|
||||
|-------|------------------|--------------|
|
||||
| Phase 0: Test Fixes | 2-4 hours | None |
|
||||
| Phase 1: Timestamp Slugs | 2-3 hours | Phase 0 |
|
||||
| Phase 2: Debug Files | 3-4 hours | Phase 0 |
|
||||
| Phase 3: N+1 Fix | 3-4 hours | Phase 0 |
|
||||
| Phase 4: Atomic Variants | 4-6 hours | Phase 0 |
|
||||
| Phase 5: Coverage | 4-8 hours | Phases 1-4 |
|
||||
|
||||
**Total Estimated**: 18-29 hours
|
||||
|
||||
---
|
||||
|
||||
## Post-Release
|
||||
|
||||
After v1.5.0 ships, update BACKLOG.md to move completed items to "Recently Completed" section:
|
||||
- MPO Format Test Coverage
|
||||
- Debug File Storage Cleanup
|
||||
- Filename Sanitization
|
||||
- N+1 Query Fix (Feed Generation - partial)
|
||||
- Atomic Variant Generation
|
||||
- Default Slug Change
|
||||
@@ -86,6 +86,10 @@ def load_config(app, config_override=None):
|
||||
app.config["FEED_CACHE_ENABLED"] = os.getenv("FEED_CACHE_ENABLED", "true").lower() == "true"
|
||||
app.config["FEED_CACHE_MAX_SIZE"] = int(os.getenv("FEED_CACHE_MAX_SIZE", "50"))
|
||||
|
||||
# Upload limits (v1.4.2)
|
||||
# Flask MAX_CONTENT_LENGTH limits request body size (matches media.py MAX_FILE_SIZE)
|
||||
app.config["MAX_CONTENT_LENGTH"] = 50 * 1024 * 1024 # 50MB
|
||||
|
||||
# Metrics configuration (v1.1.2 Phase 1)
|
||||
app.config["METRICS_ENABLED"] = os.getenv("METRICS_ENABLED", "true").lower() == "true"
|
||||
app.config["METRICS_SLOW_QUERY_THRESHOLD"] = float(os.getenv("METRICS_SLOW_QUERY_THRESHOLD", "1.0"))
|
||||
|
||||
@@ -7,7 +7,7 @@ Per ADR-057 and ADR-058:
|
||||
- 50MB max upload, 10MB max output (v1.4.0)
|
||||
- Image variants: thumb, small, medium, large (v1.4.0)
|
||||
- Tiered resize strategy based on input size (v1.4.0)
|
||||
- 4096x4096 max dimensions
|
||||
- 12000x12000 max dimensions (v1.4.2)
|
||||
- 4 images max per note
|
||||
"""
|
||||
|
||||
@@ -38,7 +38,7 @@ ALLOWED_MIME_TYPES = {
|
||||
# Limits per Q&A and ADR-058 (updated in v1.4.0)
|
||||
MAX_FILE_SIZE = 50 * 1024 * 1024 # 50MB (v1.4.0)
|
||||
MAX_OUTPUT_SIZE = 10 * 1024 * 1024 # 10MB target after optimization (v1.4.0)
|
||||
MAX_DIMENSION = 4096 # 4096x4096 max
|
||||
MAX_DIMENSION = 12000 # 12000x12000 max input (v1.4.2 - supports modern phone cameras)
|
||||
RESIZE_DIMENSION = 2048 # Auto-resize to 2048px (default)
|
||||
MIN_QUALITY = 70 # Minimum JPEG quality before rejection (v1.4.0)
|
||||
MIN_DIMENSION = 640 # Minimum dimension before rejection (v1.4.0)
|
||||
@@ -122,13 +122,19 @@ def validate_image(file_data: bytes, filename: str) -> Tuple[bytes, str, int, in
|
||||
# Mark as HEIF so conversion happens below
|
||||
img.format = 'HEIF'
|
||||
except Exception as heic_error:
|
||||
# Log the magic bytes for debugging (if in app context)
|
||||
# Log the magic bytes and save file for debugging (if in app context)
|
||||
try:
|
||||
magic = file_data[:12].hex() if len(file_data) >= 12 else file_data.hex()
|
||||
current_app.logger.warning(
|
||||
f'Media upload failed both Pillow and HEIC: filename="{filename}", '
|
||||
f'magic_bytes={magic}, pillow_error="{e}", heic_error="{heic_error}"'
|
||||
)
|
||||
# Save failed file for analysis
|
||||
debug_dir = Path(current_app.config.get('DATA_PATH', 'data')) / 'debug'
|
||||
debug_dir.mkdir(parents=True, exist_ok=True)
|
||||
debug_file = debug_dir / f"failed_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{filename}"
|
||||
debug_file.write_bytes(file_data)
|
||||
current_app.logger.info(f'Saved failed upload for analysis: {debug_file}')
|
||||
except RuntimeError:
|
||||
pass # Outside app context (e.g., tests)
|
||||
raise ValueError(f"Invalid or corrupted image: {e}")
|
||||
@@ -154,6 +160,18 @@ def validate_image(file_data: bytes, filename: str) -> Tuple[bytes, str, int, in
|
||||
file_data = output.getvalue()
|
||||
img = Image.open(io.BytesIO(file_data))
|
||||
|
||||
# MPO (Multi-Picture Object) conversion (v1.4.2)
|
||||
# MPO is used by iPhones for depth/portrait photos - extract primary image as JPEG
|
||||
if img.format == 'MPO':
|
||||
output = io.BytesIO()
|
||||
# Convert to RGB if needed
|
||||
if img.mode in ('RGBA', 'P'):
|
||||
img = img.convert('RGB')
|
||||
img.save(output, format='JPEG', quality=95)
|
||||
output.seek(0)
|
||||
file_data = output.getvalue()
|
||||
img = Image.open(io.BytesIO(file_data))
|
||||
|
||||
# Check format is allowed
|
||||
if img.format:
|
||||
format_lower = img.format.lower()
|
||||
@@ -164,11 +182,20 @@ def validate_image(file_data: bytes, filename: str) -> Tuple[bytes, str, int, in
|
||||
mime_type = 'image/jpeg'
|
||||
|
||||
if mime_type not in ALLOWED_MIME_TYPES:
|
||||
raise ValueError(f"Invalid image format. Accepted: JPEG, PNG, GIF, WebP")
|
||||
# Log the detected format for debugging (v1.4.2)
|
||||
try:
|
||||
current_app.logger.warning(
|
||||
f'Media upload rejected format: filename="{filename}", '
|
||||
f'detected_format="{img.format}", mime_type="{mime_type}"'
|
||||
)
|
||||
except RuntimeError:
|
||||
pass # Outside app context
|
||||
raise ValueError(f"Invalid image format '{img.format}'. Accepted: JPEG, PNG, GIF, WebP")
|
||||
else:
|
||||
raise ValueError("Could not determine image format")
|
||||
|
||||
# Check dimensions
|
||||
# Check dimensions (v1.4.2: increased to 12000px to support modern phone cameras)
|
||||
# Images will be resized by optimize_image() anyway
|
||||
width, height = img.size
|
||||
if max(width, height) > MAX_DIMENSION:
|
||||
raise ValueError(f"Image dimensions too large. Maximum is {MAX_DIMENSION}x{MAX_DIMENSION} pixels")
|
||||
|
||||
@@ -127,8 +127,8 @@ class TestImageValidation:
|
||||
assert "File too large" in str(exc_info.value)
|
||||
|
||||
def test_dimensions_too_large(self):
|
||||
"""Test rejection of >4096px image (per ADR-058)"""
|
||||
large_image = create_test_image(5000, 5000, 'PNG')
|
||||
"""Test rejection of >12000px image (v1.4.2: increased from 4096)"""
|
||||
large_image = create_test_image(13000, 13000, 'PNG')
|
||||
|
||||
with pytest.raises(ValueError) as exc_info:
|
||||
validate_image(large_image, 'huge.png')
|
||||
|
||||
Reference in New Issue
Block a user