docs: v1.5.0 planning - ADR-062, release plan, and design docs

- ADR-062: Timestamp-based slug format (supersedes ADR-007)
- Updated v1.5.0 RELEASE.md with 6-phase plan
- Updated BACKLOG.md with deferred N+1 query locations
- Developer questions and architect responses for Phase 1

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-16 19:38:01 -07:00
parent 7be2fb0f62
commit 9dcc5c5710
5 changed files with 1161 additions and 178 deletions
--- a/docs/projectplan/v1.5.0/RELEASE.md
+++ b/docs/projectplan/v1.5.0/RELEASE.md
@@ -1,257 +1,346 @@
-# StarPunk v1.5.0 Release
+# StarPunk v1.5.0 Release Plan

-**Status**: Planning
+**Status**: Approved
 **Codename**: "Trigger"
-**Focus**: Cleanup, Test Coverage, and Quality of Life Improvements
+**Focus**: Stability, Test Coverage, and Technical Debt Reduction
+**Last Updated**: 2025-12-17

 ## Overview

-This minor release focuses on technical debt reduction, improving test coverage to 90%, and addressing quality-of-life issues identified in previous release reviews. No new user-facing features are planned.
+v1.5.0 is a quality-focused release that addresses failing tests, increases test coverage to 90%, implements critical fixes from the v1.4.x review cycle, and resolves technical debt. No new user-facing features are planned.

 ## Goals

-1. **90% Test Coverage** - Increase overall test coverage from current baseline to 90%
-2. **Address Backlog Technical Debt** - Resolve 6 specific backlog items
-3. **Improve Code Quality** - Better error handling, atomicity, and performance
+1. **Fix Failing Tests** - Resolve all 19 currently failing tests
+2. **90% Test Coverage** - Increase overall test coverage to 90% minimum
+3. **Technical Debt Reduction** - Address 6 specific backlog items
+4. **Code Quality** - Better error handling, atomicity, and performance

-## Features
+## Phased Implementation Plan

-### 1. Test Coverage Target: 90%
-
-Increase overall test coverage to 90% minimum across all modules.
+### Phase 0: Test Fixes (Critical Path)
+**Priority**: Must complete first - unblocks all other phases

 #### Scope
- Identify modules with coverage below 90%
- Write missing unit tests for uncovered code paths
- Add integration tests for critical workflows
- Ensure edge cases and error paths are tested
+Fix the 19 failing tests identified in the current test suite (the table below names 14 of them by category):
+
+| Category | Count | Tests |
+|----------|-------|-------|
+| Migration Performance | 2 | `test_single_worker_performance`, `test_concurrent_workers_performance` |
+| Feed Route (Streaming) | 1 | `test_feed_route_streaming` |
+| Feed Endpoints | 3 | `test_feed_rss_endpoint`, `test_feed_json_endpoint`, `test_feed_xml_legacy_endpoint` |
+| Content Negotiation | 6 | `test_accept_rss`, `test_accept_json_feed`, `test_accept_json_generic`, `test_accept_wildcard`, `test_no_accept_header`, `test_quality_factor_json_wins` |
+| Backward Compatibility | 1 | `test_feed_xml_contains_rss` |
+| Search Security | 1 | `test_search_escapes_html_in_note_content` |
+
+#### Approach
+1. Investigate each failing test category
+2. Determine if failure is test issue or code issue
+3. Fix appropriately (prefer fixing tests over changing working code)
+4. Document any behavioral changes

 #### Acceptance Criteria
- `uv run pytest --cov=starpunk` reports ≥90% overall coverage
- No module below 85% coverage
- All new code in this release has 100% coverage
+- [ ] All 879 tests pass
+- [ ] No test skips added (unless justified)
+- [ ] No test timeouts
+
+#### Dependencies
+None - this is the first phase

 ---

-### 2. MPO Format Test Coverage
+### Phase 1: Timestamp-Based Slugs
+**Priority**: High - Addresses user-reported issue

-**Source**: Backlog - High Priority (Developer Review M1)
-
-Add test coverage for MPO (Multi-Picture Object) format handling.
-
-#### Current State
- MPO conversion code exists at `starpunk/media.py` lines 163-173
- Advertised in CHANGELOG but untested
+#### Scope
+Implement ADR-062 to change default slug format from content-based to timestamp-based.

 #### Implementation
- Add `test_mpo_detection_and_conversion()` to `TestHEICSupport` class
- Create MPO test image using Pillow's MPO support
- Test primary image extraction
- Test conversion to JPEG
+1. Update `starpunk/slug_utils.py`:
+   - Change `generate_slug()` to use timestamp format `YYYYMMDDHHMMSS`
+   - Update collision handling to use sequential suffix (`-1`, `-2`, etc.)
+   - Preserve custom slug functionality
+
+2. Update `starpunk/notes.py`:
+   - Remove content parameter from slug generation calls
+
+3. Update tests:
+   - Modify expected slug formats in test assertions
+   - Add tests for new timestamp format
+   - Add tests for sequential collision handling
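
The slug steps above can be illustrated with a minimal sketch. It assumes the uniqueness check can be modeled as a set lookup; the function name `generate_timestamp_slug` and the `existing_slugs` and `now` parameters are hypothetical and are not taken from `starpunk/slug_utils.py`.

```python
from datetime import datetime
from typing import Optional, Set


def generate_timestamp_slug(
    existing_slugs: Set[str], now: Optional[datetime] = None
) -> str:
    """Return a YYYYMMDDHHMMSS slug, appending -1, -2, ... on collision."""
    # `existing_slugs` stands in for the real uniqueness lookup (assumption).
    base = (now or datetime.now()).strftime("%Y%m%d%H%M%S")
    if base not in existing_slugs:
        return base
    suffix = 1
    while f"{base}-{suffix}" in existing_slugs:
        suffix += 1
    return f"{base}-{suffix}"


# Three notes created within the same second.
moment = datetime(2025, 12, 16, 14, 30, 52)
first = generate_timestamp_slug(set(), moment)
second = generate_timestamp_slug({first}, moment)
third = generate_timestamp_slug({first, second}, moment)
print(first, second, third)
```

Passing `now` explicitly keeps the collision path deterministic in tests, which may help with the "sequential collision handling" test cases listed above.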

 #### Acceptance Criteria
- MPO detection tested
- MPO to JPEG conversion tested
- Edge cases (corrupted MPO, single-frame MPO) tested
+- [ ] Default slugs use `YYYYMMDDHHMMSS` format
+- [ ] Collision handling uses `-1`, `-2` suffix
+- [ ] Custom slugs via `mp-slug` work unchanged
+- [ ] Custom slugs via web UI work unchanged
+- [ ] Existing notes unaffected
+- [ ] ADR-062 referenced in code comments
+
+#### Dependencies
+- Phase 0 complete (all tests passing)

 ---

-### 3. Debug File Storage Cleanup
+### Phase 2: Debug File Management
+**Priority**: Medium - Security and operations concern

-**Source**: Backlog - Medium Priority (Developer Review M2, Architect Review 1.2.2)
-
-Implement cleanup mechanism for failed upload debug files.
-
-#### Current State
- Failed uploads saved to `data/debug/` directory
- No cleanup mechanism exists
- Potential disk space exhaustion
+#### Scope
+Implement cleanup mechanism for failed upload debug files and add configuration controls.

 #### Implementation
-1. Add `DEBUG_SAVE_FAILED_UPLOADS` config option (default: `false` in production)
-2. Implement automatic cleanup for files older than 7 days
-3. Add disk space check or size limit (100MB max for debug folder)
-4. Cleanup runs on application startup and periodically
+1. Add configuration options:
+   ```python
+   DEBUG_SAVE_FAILED_UPLOADS = False  # Default: disabled in production
+   DEBUG_FILE_MAX_AGE_DAYS = 7        # Auto-delete threshold
+   DEBUG_FILE_MAX_SIZE_MB = 100       # Maximum debug folder size
+   ```

-#### Configuration
-```python
-DEBUG_SAVE_FAILED_UPLOADS = False  # Enable only for debugging
-DEBUG_FILE_MAX_AGE_DAYS = 7        # Auto-delete after 7 days
-DEBUG_FILE_MAX_SIZE_MB = 100       # Maximum debug folder size
-```
+2. Implement cleanup logic in `starpunk/media.py`:
+   - Check config before saving debug files
+   - Implement `cleanup_old_debug_files()` function
+   - Add size limit check before saving
+
+3. Add startup cleanup:
+   - Run cleanup on application startup
+   - Log cleanup actions
+
+4. Sanitize filenames:
+   - Sanitize filename before debug path construction
+   - Pattern: `"".join(c for c in filename if c.isalnum() or c in "._-")[:50]`
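
To show what the sanitization pattern above does to hostile input, here is a small sketch that wraps it in a function. The name `sanitize_debug_filename` is hypothetical; only the expression itself comes from the plan.

```python
def sanitize_debug_filename(filename: str) -> str:
    """Keep alphanumerics plus '.', '_' and '-', capped at 50 characters."""
    return "".join(c for c in filename if c.isalnum() or c in "._-")[:50]


traversal = sanitize_debug_filename("../../etc/passwd")
spaced = sanitize_debug_filename("my photo (1).jpg")
long_name = sanitize_debug_filename("a" * 80 + ".jpg")
print(traversal, spaced, len(long_name))
```

Path separators are dropped, so `../../etc/passwd` collapses to `....etcpasswd`. Note that a name made only of dots, such as `..`, passes through unchanged, and an input with no permitted characters becomes an empty string, so the caller may want to add a fixed prefix before joining the result to the debug directory.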

 #### Acceptance Criteria
- Debug files disabled by default in production
- Old files automatically cleaned up when enabled
- Disk space protected with size limit
- Tests cover all cleanup scenarios
+- [ ] Debug files disabled by default
+- [ ] Files older than 7 days auto-deleted when enabled
+- [ ] Folder size limited to 100MB
+- [ ] Filenames sanitized (no path traversal)
+- [ ] Cleanup runs on startup
+- [ ] Tests cover all scenarios
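
One possible shape for `cleanup_old_debug_files()` is sketched below, under the assumption that file age is judged by modification time and that the size cap is enforced by deleting the oldest surviving files first. The parameters and the return value are assumptions, not the planned signature.

```python
import os
import tempfile
import time
from pathlib import Path


def cleanup_old_debug_files(
    debug_dir: Path, max_age_days: int = 7, max_size_mb: int = 100
) -> int:
    """Delete expired files, then oldest-first until under the size cap."""
    if not debug_dir.is_dir():
        return 0
    cutoff = time.time() - max_age_days * 86400
    removed = 0
    survivors = []
    for path in debug_dir.iterdir():
        if not path.is_file():
            continue
        stat = path.stat()
        if stat.st_mtime < cutoff:
            path.unlink()
            removed += 1
        else:
            survivors.append((stat.st_mtime, stat.st_size, path))
    total = sum(size for _, size, _ in survivors)
    limit = max_size_mb * 1024 * 1024
    for _, size, path in sorted(survivors, key=lambda item: item[0]):
        if total <= limit:
            break
        path.unlink()
        total -= size
        removed += 1
    return removed


# Demo: one file aged past the threshold, one fresh file.
with tempfile.TemporaryDirectory() as tmp:
    debug_dir = Path(tmp)
    old_file = debug_dir / "old.bin"
    new_file = debug_dir / "new.bin"
    old_file.write_bytes(b"x")
    new_file.write_bytes(b"y")
    eight_days_ago = time.time() - 8 * 86400
    os.utime(old_file, (eight_days_ago, eight_days_ago))
    removed_count = cleanup_old_debug_files(debug_dir)
    old_exists, new_exists = old_file.exists(), new_file.exists()
```

Returning the number of deleted files gives the startup hook something concrete to log.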
+
+#### Dependencies
+- Phase 0 complete

 ---

-### 4. Filename Sanitization in Debug Path
+### Phase 3: N+1 Query Fix (Feed Generation)
+**Priority**: Medium - Performance improvement

-**Source**: Backlog - Medium Priority (Architect Review 1.2.3)
-
-Sanitize filenames before use in debug file paths.
-
-#### Current State
- Original filename used directly at `starpunk/media.py` line 135
- Path traversal or special character issues possible
+#### Scope
+Fix N+1 query pattern in `_get_cached_notes()` only. Other N+1 patterns are deferred (documented in BACKLOG.md).

 #### Implementation
-```python
-safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]
-```
+1. Create a batch loading function in `starpunk/media.py`:
+   ```python
+   def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
+       """Batch load media for multiple notes in single query."""
+   ```
+
+2. Create a batch loading function in `starpunk/tags.py`:
+   ```python
+   def get_tags_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
+       """Batch load tags for multiple notes in single query."""
+   ```
+
+3. Update `_get_cached_notes()` in `starpunk/routes/public.py`:
+   - Use batch loading instead of per-note queries
+   - Maintain same output format
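
The batch loading idea can be sketched as a single `WHERE note_id IN (...)` query whose rows are grouped by note. This is a sketch under assumptions: the explicit `conn` argument, the `note_media` columns, and the returned dict shape are illustrative and do not reflect the actual StarPunk schema.

```python
import sqlite3
from typing import Dict, List


def get_media_for_notes(
    conn: sqlite3.Connection, note_ids: List[int]
) -> Dict[int, List[dict]]:
    """Batch load media for multiple notes in a single query."""
    # Every requested note gets an entry, even when it has no media.
    result: Dict[int, List[dict]] = {note_id: [] for note_id in note_ids}
    if not note_ids:
        return result  # nothing to look up, so skip the query entirely
    placeholders = ",".join("?" * len(note_ids))
    rows = conn.execute(
        f"SELECT note_id, path FROM note_media "
        f"WHERE note_id IN ({placeholders}) ORDER BY note_id, id",
        note_ids,
    )
    for note_id, path in rows:
        result[note_id].append({"path": path})
    return result


# Demo with an in-memory database and a hypothetical table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE note_media (id INTEGER PRIMARY KEY, note_id INTEGER, path TEXT)"
)
conn.executemany(
    "INSERT INTO note_media (note_id, path) VALUES (?, ?)",
    [(1, "a.jpg"), (1, "b.jpg"), (3, "c.jpg")],
)
media = get_media_for_notes(conn, [1, 2, 3])
empty = get_media_for_notes(conn, [])
conn.close()
```

Only the placeholders are interpolated into the SQL string; the IDs themselves are bound as parameters. SQLite caps the number of bound parameters per statement, so a very large ID list may need to be chunked.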

 #### Acceptance Criteria
- Filenames sanitized before debug path construction
- Path traversal attempts neutralized
- Special characters removed
- Filename length limited
- Tests cover malicious filename patterns
+- [ ] Feed generation uses batch queries
+- [ ] Query count reduced from O(n) to O(1) for media/tags
+- [ ] No change to API behavior
+- [ ] Performance improvement verified in tests
+- [ ] Other N+1 locations documented in BACKLOG.md (not fixed)
+
+#### Dependencies
+- Phase 0 complete

 ---

-### 5. N+1 Query Pattern Fix
+### Phase 4: Atomic Variant Generation
+**Priority**: Medium - Data integrity

-**Source**: Backlog - Medium Priority (Architect Review 2.2.9)
-
-Eliminate N+1 query pattern in feed generation.
-
-#### Current State
- `_get_cached_notes()` loads media and tags per-note
- For 50 notes: 100 additional database queries
- Performance degrades with note count
+#### Scope
+Make variant file generation atomic with database commits to prevent orphaned files.

 #### Implementation
-1. Create `get_media_for_notes(note_ids: List[int])` batch function
-2. Create `get_tags_for_notes(note_ids: List[int])` batch function
-3. Use single query with `WHERE note_id IN (...)`
-4. Update `_get_cached_notes()` to use batch loading
+1. Modify `generate_all_variants()` in `starpunk/media.py`:
+   - Write variants to temporary directory first
+   - Perform database inserts in transaction
+   - Move files to final location after successful commit
+   - Clean up temp files on any failure

-#### Example
-```python
-def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
-    """Batch load media for multiple notes."""
-    # Single query: SELECT * FROM note_media WHERE note_id IN (?)
-    # Returns: {note_id: [media_list]}
-```
-
-#### Acceptance Criteria
- Feed generation uses batch queries
- Query count reduced from O(n) to O(1) for media/tags
- Performance improvement measurable in tests
- No change to API behavior
-
---
-
-### 6. Atomic Variant Generation
-
-**Source**: Backlog - Medium Priority (Architect Review 2.2.6)
-
-Make variant file generation atomic with database commits.
-
-#### Current State
- Files written to disk before database commit
- Database commit failure leaves orphaned files
-
-#### Implementation
-1. Write variant files to temporary location first
-2. Database insert in transaction
-3. Move files to final location after successful commit
-4. Cleanup temp files on any failure
+2. Add startup recovery:
+   - Detect orphaned variant files on startup
+   - Log warnings for orphans found
+   - Optionally clean up orphans

 #### Flow
 ```
-1. Generate variants to temp directory
+1. Generate variants to temp directory (data/media/.tmp/)
 2. BEGIN TRANSACTION
 3. INSERT media record
 4. INSERT variant records
 5. COMMIT
 6. Move files from temp to final location
-7. On failure: rollback DB, delete temp files
+7. On failure: ROLLBACK, delete temp files
 ```
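
The flow above can be sketched with the standard library alone. This is an illustration under assumptions, not the planned `generate_all_variants()` change: the function name, the single `media_variants` table, and the `variants` mapping are all hypothetical.

```python
import os
import shutil
import sqlite3
import tempfile
from pathlib import Path
from typing import Dict


def save_variants_atomically(
    conn: sqlite3.Connection, media_dir: Path, variants: Dict[str, bytes]
) -> None:
    """Stage files in a temp dir, commit DB rows, then move files into place."""
    tmp_root = media_dir / ".tmp"
    tmp_root.mkdir(parents=True, exist_ok=True)
    staging = Path(tempfile.mkdtemp(dir=tmp_root))
    try:
        for name, data in variants.items():  # step 1: write to temp
            (staging / name).write_bytes(data)
        with conn:  # steps 2-5: commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO media_variants (filename) VALUES (?)",
                [(name,) for name in variants],
            )
        for name in variants:  # step 6: move into final location
            os.replace(staging / name, media_dir / name)
    finally:
        shutil.rmtree(staging, ignore_errors=True)  # step 7: drop temp files


# Demo: one successful save, then a save whose insert violates UNIQUE.
with tempfile.TemporaryDirectory() as root:
    media_dir = Path(root) / "media"
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE media_variants (filename TEXT UNIQUE)")
    save_variants_atomically(conn, media_dir, {"thumb.jpg": b"t"})
    saved = (media_dir / "thumb.jpg").exists()
    try:
        save_variants_atomically(
            conn, media_dir, {"small.jpg": b"s", "thumb.jpg": b"t2"}
        )
        failed = False
    except sqlite3.IntegrityError:
        failed = True
    orphan = (media_dir / "small.jpg").exists()
    row_count = conn.execute("SELECT COUNT(*) FROM media_variants").fetchone()[0]
    leftovers = list((media_dir / ".tmp").iterdir())
    conn.close()
```

Because the move happens after the commit, a failure during step 6 would leave committed rows without files. Meeting the "no orphaned DB records on file failures" criterion would likely need a compensating delete or the startup recovery pass described above.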

 #### Acceptance Criteria
- No orphaned files on database failures
- No orphaned database records on file failures
- Atomic operation for all media saves
- Tests simulate failure scenarios
+- [ ] No orphaned files on database failures
+- [ ] No orphaned DB records on file failures
+- [ ] Atomic operation for all media saves
+- [ ] Startup recovery detects orphans
+- [ ] Tests simulate failure scenarios
+
+#### Dependencies
+- Phase 0 complete

 ---

-### 7. Default Slug Format Change
+### Phase 5: Test Coverage Expansion
+**Priority**: Medium - Quality assurance

-**Source**: Backlog - Medium Priority
+#### Scope
+Increase overall test coverage to 90% minimum.

-Change default slug format from content-based to timestamp-based.
+#### Approach
+1. Run coverage report: `uv run pytest --cov=starpunk --cov-report=html`
+2. Identify modules below 90% coverage
+3. Prioritize based on risk and complexity
+4. Write tests for uncovered paths

-#### Current State
- Slugs generated from note content
- Can produce unwanted slugs from private content
+#### Known Coverage Gaps (to verify)
+- MPO format handling (untested)
+- Edge cases in error paths
+- Configuration validation paths
+- Startup/shutdown hooks

-#### New Format
- Default: `YYYYMMDDHHMMSS` (e.g., `20251216143052`)
- Duplicate handling: append `-1`, `-2`, etc.
- Custom slugs still supported via `mp-slug`
+#### Specific Test Additions
+1. **MPO Format Tests** (`tests/test_media_upload.py`):
+   - `test_mpo_detection_and_conversion()`
+   - `test_mpo_corrupted_file()`
+   - `test_mpo_single_frame()`

-#### Implementation
-1. Update `generate_slug()` function in `starpunk/slug_utils.py`
-2. Generate timestamp-based slug by default
-3. Check for duplicates and append suffix if needed
-4. Preserve custom slug functionality
+2. **Debug File Tests** (new test file or extend `test_media_upload.py`):
+   - `test_debug_file_disabled_by_default()`
+   - `test_debug_file_cleanup_old_files()`
+   - `test_debug_file_size_limit()`
+   - `test_debug_filename_sanitization()`

-#### Example
-```python
-def generate_slug(content: str = None, custom_slug: str = None) -> str:
-    if custom_slug:
-        return sanitize_slug(custom_slug)
-    # Default: timestamp-based
-    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
-    return ensure_unique_slug(timestamp)
-```
+3. **Batch Loading Tests**:
+   - `test_get_media_for_notes_batch()`
+   - `test_get_tags_for_notes_batch()`
+   - `test_batch_with_empty_list()`

 #### Acceptance Criteria
- New notes use timestamp slugs by default
- Duplicate timestamps handled with suffix
- Custom `mp-slug` still works
- Existing notes unchanged
- Tests cover all slug generation scenarios
+- [ ] Overall coverage >= 90%
+- [ ] No module below 85% coverage
+- [ ] All new code in v1.5.0 has 100% coverage
+- [ ] MPO handling fully tested
+
+#### Dependencies
+- Phases 1-4 complete (tests cover new functionality)

 ---

 ## Out of Scope

- New user-facing features
- UI changes
- New feed formats
- Micropub extensions
- Database schema changes (except for bug fixes)
+Items explicitly excluded from v1.5.0:

-## Dependencies
+| Item | Reason |
+|------|--------|
+| Rate limiting | Handled by reverse proxy (Caddy/Nginx) |
+| Schema changes | Not needed for v1.5.0 fixes |
+| New user features | Quality-focused release |
+| N+1 fixes in admin/search | Low traffic, deferred to BACKLOG |
+| UI changes | No frontend work planned |

- v1.4.x complete
- No new external dependencies expected
+---
+
+## Recommendation: Single Release vs. Multiple Patches
+
+**Recommendation: Single v1.5.0 Release**
+
+### Rationale
+
+1. **Phase Dependencies**: Phases 1-4 each depend on Phase 0 (test fixes), and Phase 5 depends on Phases 1-4. Splitting would create artificial release boundaries.
+
+2. **Cognitive Overhead**: Multiple patch releases (1.5.1, 1.5.2, etc.) require:
+   - Multiple changelog entries
+   - Multiple version bumps
+   - Multiple release notes
+   - More git tags/branches
+
+3. **Test Coverage Integration**: Test coverage work (Phase 5) tests functionality from Phases 1-4. Separating them creates incomplete test coverage.
+
+4. **User Experience**: Users prefer fewer, more significant updates over many small patches.
+
+5. **Scope Alignment**: All v1.5.0 work is internally focused (no external API changes). Users see one "quality improvement" release.
+
+### Exception
+
+If Phase 0 (test fixes) reveals critical bugs affecting production, those fixes should be:
+- Backported to a v1.4.3 patch release on the current branch
+- Then merged forward to v1.5.0
+
+### Alternative Considered
+
+Splitting into:
+- v1.5.0: Phase 0 (test fixes) + Phase 1 (slugs)
+- v1.5.1: Phases 2-4 (technical debt)
+- v1.5.2: Phase 5 (test coverage)
+
+**Rejected** because test coverage work must test the new functionality, making separation counterproductive.
+
+---

 ## Success Criteria

-1. ✅ Test coverage ≥90% overall
-2. ✅ MPO format has test coverage
-3. ✅ Debug file cleanup implemented and tested
-4. ✅ Filename sanitization implemented and tested
-5. ✅ N+1 query pattern eliminated
-6. ✅ Variant generation is atomic
-7. ✅ Default slugs are timestamp-based
-8. ✅ All existing tests continue to pass
-9. ✅ No regressions in functionality
+| # | Criterion | Verification |
+|---|-----------|--------------|
+| 1 | All tests pass | `uv run pytest` shows 0 failures |
+| 2 | Coverage >= 90% | `uv run pytest --cov=starpunk` |
+| 3 | MPO tested | MPO tests in test suite |
+| 4 | Debug cleanup works | Manual verification + tests |
+| 5 | N+1 fixed in feed | Performance tests show improvement |
+| 6 | Variants atomic | Failure simulation tests pass |
+| 7 | Slugs timestamp-based | New notes use `YYYYMMDDHHMMSS` format |
+| 8 | No regressions | Full test suite passes |
+| 9 | ADRs documented | ADR-062 in `/docs/decisions/` |

-## Related Backlog Items
+---

-After completion, the following backlog items should be marked complete or moved to "Recently Completed":
+## Related Documentation

- MPO Format Test Coverage (High)
- Debug File Storage Without Cleanup Mechanism (Medium)
- Filename Not Sanitized in Debug Path (Medium)
- N+1 Query Pattern in Feed Generation (Medium)
- Transaction Not Atomic in Variant Generation (Medium)
- Default Slug Change (Medium)
+- ADR-062: Timestamp-Based Slug Format (supersedes ADR-007)
+- ADR-007: Slug Generation Algorithm (superseded)
+- BACKLOG.md: Deferred N+1 query locations documented
+- v1.4.2 Architect Review: Source of many v1.5.0 items
+
+---
+
+## Implementation Timeline
+
+| Phase | Estimated Effort | Dependencies |
+|-------|------------------|--------------|
+| Phase 0: Test Fixes | 2-4 hours | None |
+| Phase 1: Timestamp Slugs | 2-3 hours | Phase 0 |
+| Phase 2: Debug Files | 3-4 hours | Phase 0 |
+| Phase 3: N+1 Fix | 3-4 hours | Phase 0 |
+| Phase 4: Atomic Variants | 4-6 hours | Phase 0 |
+| Phase 5: Coverage | 4-8 hours | Phases 1-4 |
+
+**Total Estimated**: 18-29 hours
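
The total can be checked by summing the low and high ends of each row. The figures below are copied from the timeline table; the variable names are illustrative only.

```python
# (low, high) hour estimates per phase, copied from the timeline table.
estimates = {
    "Phase 0: Test Fixes": (2, 4),
    "Phase 1: Timestamp Slugs": (2, 3),
    "Phase 2: Debug Files": (3, 4),
    "Phase 3: N+1 Fix": (3, 4),
    "Phase 4: Atomic Variants": (4, 6),
    "Phase 5: Coverage": (4, 8),
}

total_low = sum(low for low, _ in estimates.values())
total_high = sum(high for _, high in estimates.values())
print(f"{total_low}-{total_high} hours")
```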
+
+---
+
+## Post-Release
+
+After v1.5.0 ships, update BACKLOG.md to move completed items to the "Recently Completed" section:
+- MPO Format Test Coverage
+- Debug File Storage Cleanup
+- Filename Sanitization
+- N+1 Query Fix (Feed Generation - partial)
+- Atomic Variant Generation
+- Default Slug Change