From 9dcc5c571012308eca81ae9911328d9241794069 Mon Sep 17 00:00:00 2001 From: Phil Skentelbery Date: Tue, 16 Dec 2025 19:38:01 -0700 Subject: [PATCH] docs: v1.5.0 planning - ADR-062, release plan, and design docs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - ADR-062: Timestamp-based slug format (supersedes ADR-007) - Updated v1.5.0 RELEASE.md with 6-phase plan - Updated BACKLOG.md with deferred N+1 query locations - Developer questions and architect responses for Phase 1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- .../ADR-062-timestamp-based-slug-format.md | 197 ++++++++ .../v1.5.0/2025-12-16-architect-responses.md | 412 ++++++++++++++++ .../v1.5.0/2025-12-16-developer-questions.md | 272 +++++++++++ docs/projectplan/BACKLOG.md | 17 +- docs/projectplan/v1.5.0/RELEASE.md | 441 +++++++++++------- 5 files changed, 1161 insertions(+), 178 deletions(-) create mode 100644 docs/decisions/ADR-062-timestamp-based-slug-format.md create mode 100644 docs/design/v1.5.0/2025-12-16-architect-responses.md create mode 100644 docs/design/v1.5.0/2025-12-16-developer-questions.md diff --git a/docs/decisions/ADR-062-timestamp-based-slug-format.md b/docs/decisions/ADR-062-timestamp-based-slug-format.md new file mode 100644 index 0000000..13672cf --- /dev/null +++ b/docs/decisions/ADR-062-timestamp-based-slug-format.md @@ -0,0 +1,197 @@ +# ADR-062: Timestamp-Based Slug Format + +## Status +Accepted (Supersedes ADR-007) + +## Context + +ADR-007 established a content-based slug generation algorithm that extracts the first 5 words from note content to create URL slugs. While this approach provides readable, SEO-friendly URLs, it has drawbacks for a personal note-taking system: + +1. **Privacy Concerns**: The slug reveals note content in the URL. Private thoughts or draft content become visible in URLs that may be shared or logged. + +2. **SEO Irrelevance**: StarPunk is designed for personal IndieWeb notes, not public blogs. Notes are typically short-form content (similar to tweets or status updates) where SEO optimization provides no meaningful benefit. + +3. **Unpredictable Slugs**: Users cannot predict what slug will be generated without examining their content carefully. + +4. **Edge Case Handling**: Content-based slugs require complex fallback logic for short content, unicode-only content, or special characters. + +5. **Collision Complexity**: Similar notes require random suffix generation (e.g., `hello-world-a7c9`), which undermines the readability goal. + +The user has explicitly stated: "Notes don't need SEO." + +## Decision + +Change the default slug format from content-based to timestamp-based: + +**New Default Format**: `YYYYMMDDHHMMSS` +- Example: `20251216143052` +- Compact, sortable, predictable +- 14 characters total + +**Collision Handling**: Sequential numeric suffix +- First collision: `20251216143052-1` +- Second collision: `20251216143052-2` +- Simple, predictable, no randomness + +**Custom Slugs**: Continue to support user-specified slugs via `mp-slug` Micropub property and the web UI custom slug field. When provided, custom slugs take precedence. + +### Algorithm Specification + +```python +def generate_slug(custom_slug: str = None, created_at: datetime = None) -> str: + """Generate a URL-safe slug for a note. + + Args: + custom_slug: User-provided custom slug (takes precedence if provided) + created_at: Note creation timestamp (defaults to now) + + Returns: + URL-safe slug string + """ + if custom_slug: + return sanitize_slug(custom_slug) + + # Default: timestamp-based + timestamp = (created_at or datetime.now()).strftime("%Y%m%d%H%M%S") + return ensure_unique_slug(timestamp) + +def ensure_unique_slug(base_slug: str) -> str: + """Ensure slug is unique, adding numeric suffix if needed.""" + if not slug_exists(base_slug): + return base_slug + + suffix = 1 + while slug_exists(f"{base_slug}-{suffix}"): + suffix += 1 + return f"{base_slug}-{suffix}" +``` + +### Examples + +| Scenario | Generated Slug | +|----------|----------------| +| Normal note at 2:30:52 PM on Dec 16, 2025 | `20251216143052` | +| Second note in same second | `20251216143052-1` | +| Third note in same second | `20251216143052-2` | +| Note with custom slug "my-custom-slug" | `my-custom-slug` | + +## Rationale + +### Timestamp Format (Score: 9/10) + +**Pros**: +- **Privacy**: URL reveals nothing about content +- **Predictability**: User knows exactly what slug format to expect +- **Sortability**: Chronological sorting by URL is possible +- **Simplicity**: No complex word extraction or normalization +- **Collision Rarity**: Same-second creation is rare; handled cleanly when it occurs +- **Compact**: 14 characters vs potentially 50+ for content-based + +**Cons**: +- **Not Memorable**: `20251216143052` is harder to remember than `hello-world` +- **No SEO Value**: Search engines prefer descriptive URLs + +**Why Cons Don't Matter**: +- StarPunk is for personal notes, not public SEO-optimized content +- Notes are accessed via UI, feeds, or bookmarks, not memorized URLs +- Users wanting memorable URLs can use custom slugs + +### Sequential Suffix (Score: 10/10) + +**Pros**: +- **Deterministic**: No randomness; same collision always gets same suffix +- **Simple**: No cryptographic random generation needed +- **Readable**: `-1`, `-2` are clear and obvious +- **Debuggable**: Easy to understand collision resolution + +**Cons**: +- **Enumerable**: Sequential numbers could be probed +- Not a real security concern for note slugs + +### Comparison with ADR-007 Approach + +| Aspect | ADR-007 (Content-Based) | ADR-062 (Timestamp) | +|--------|------------------------|---------------------| +| Privacy | Reveals content | Reveals only time | +| Complexity | High (word extraction, normalization, unicode handling) | Low (strftime) | +| SEO | Good | None | +| Predictability | Low | High | +| Collision handling | Random suffix | Sequential suffix | +| Fallback cases | Many (short content, unicode, etc.) | None | +| Code lines | ~50 | ~15 | + +## Consequences + +### Positive + +1. **Simplified Code**: Remove complex word extraction, unicode normalization, and multiple fallback paths +2. **Better Privacy**: Note content never appears in URLs +3. **Predictable Output**: Users always know what slug format to expect +4. **Fewer Edge Cases**: No special handling for short content, unicode, or special characters +5. **Cleaner Collisions**: Sequential suffixes are more intuitive than random strings + +### Negative + +1. **Migration**: Existing notes keep their content-based slugs (no migration needed) +2. **Not Human-Readable**: URLs don't describe content +3. **No SEO**: Search engines won't benefit from descriptive URLs + +### Mitigations + +**Human-Readable URLs**: +- Users wanting descriptive URLs can use custom slugs via `mp-slug` +- The web UI custom slug field remains available +- This is opt-in rather than default + +**SEO**: +- IndieWeb notes are typically not SEO targets +- Content is in the page body where search engines can index it +- Microformats2 markup provides semantic meaning + +## Backward Compatibility + +- Existing notes retain their slugs (no data migration) +- New notes use timestamp format by default +- Custom slug functionality unchanged +- All existing URLs remain valid + +## Testing Requirements + +- Test default slug generation produces timestamp format +- Test collision handling with sequential suffixes +- Test custom slugs still take precedence +- Test edge case: multiple notes in same second +- Test reserved slug rejection still works +- Verify existing tests for custom slugs pass + +## Implementation Notes + +Changes required in: +- `starpunk/slug_utils.py`: Update `generate_slug()` function +- `starpunk/notes.py`: Remove content parameter from slug generation call +- Tests: Update expected slug formats + +Estimated effort: Small (1-2 hours implementation, 1 hour testing) + +## References + +- ADR-007: Slug Generation Algorithm (Superseded by this ADR) +- ADR-035: Custom Slugs (Unchanged; complements this decision) +- IndieWeb Permalink Best Practices: https://indieweb.org/permalink + +## Acceptance Criteria + +- [ ] Default slugs use `YYYYMMDDHHMMSS` format +- [ ] Collision handling uses sequential suffix (`-1`, `-2`, etc.) +- [ ] Custom slugs via `mp-slug` continue to work +- [ ] Custom slugs via web UI continue to work +- [ ] Reserved slug validation unchanged +- [ ] Existing notes unaffected +- [ ] All tests pass +- [ ] Code complexity reduced + +--- + +**Approved**: 2025-12-16 +**Architect**: StarPunk Architect Agent +**Supersedes**: ADR-007 (Slug Generation Algorithm) diff --git a/docs/design/v1.5.0/2025-12-16-architect-responses.md b/docs/design/v1.5.0/2025-12-16-architect-responses.md new file mode 100644 index 0000000..1668367 --- /dev/null +++ b/docs/design/v1.5.0/2025-12-16-architect-responses.md @@ -0,0 +1,412 @@ +# Architect Responses - v1.5.0 Developer Questions + +**Date**: 2025-12-16 +**Architect**: StarPunk Architect Agent +**In Response To**: `docs/design/v1.5.0/2025-12-16-developer-questions.md` + +--- + +## Response Summary + +All 8 questions have been addressed. The implementation can proceed immediately after reading this document. Key decisions: + +1. **Q1**: Keep existing pattern (pass `existing_slugs: Set[str]`). ADR-062 pseudocode is conceptual. +2. **Q2**: First note gets base slug, first collision gets `-1`. The table in ADR-062 is correct. +3. **Q3**: Create new function in `slug_utils.py`. Migration path provided. +4. **Q4**: Keep reserved slug validation for custom slugs only. Safe to skip for timestamp slugs. +5. **Q5**: Tests should use pattern matching for timestamps. Test strategy provided. +6. **Q6**: Phase 0 is a hard dependency. Cannot proceed with Phase 1 until all tests pass. +7. **Q7**: Slug format is opaque. No format-aware code needed. +8. **Q8**: Phases 2-4 can be parallelized. Commit each phase separately. + +--- + +## Detailed Responses + +### Q1: Collision handling function location and signature + +**Decision**: Keep the existing pattern of accepting `existing_slugs: Set[str]` as a parameter. + +**Rationale**: +The ADR-062 pseudocode (`slug_exists(base_slug)`) is conceptual illustration only. The actual implementation should: + +1. Follow the existing pattern in the codebase (passing pre-fetched slug sets) +2. Avoid introducing database coupling into utility functions +3. Prevent N+1 queries when checking multiple candidates + +**Implementation Specification**: + +Create a new function in `/home/phil/Projects/starpunk/starpunk/slug_utils.py`: + +```python +def generate_timestamp_slug( + created_at: datetime = None, + existing_slugs: Set[str] = None +) -> str: + """Generate a timestamp-based slug with collision handling. + + Per ADR-062: Default format is YYYYMMDDHHMMSS with sequential + suffix (-1, -2, etc.) for collisions. + + Args: + created_at: Note creation timestamp (defaults to now) + existing_slugs: Set of existing slugs to check for collisions + + Returns: + Unique timestamp-based slug + """ + if created_at is None: + created_at = datetime.utcnow() + + if existing_slugs is None: + existing_slugs = set() + + base_slug = created_at.strftime("%Y%m%d%H%M%S") + + # If no collision, return base slug + if base_slug not in existing_slugs: + return base_slug + + # Sequential suffix for collisions + suffix = 1 + while f"{base_slug}-{suffix}" in existing_slugs: + suffix += 1 + + return f"{base_slug}-{suffix}" +``` + +**Location**: `/home/phil/Projects/starpunk/starpunk/slug_utils.py` + +**Do NOT modify**: The `ensure_unique_slug()` function signature in ADR-062 is pseudocode. Do not create a function with database coupling. + +--- + +### Q2: Sequential suffix starting number + +**Decision**: First note gets base slug (no suffix). First collision gets `-1`. + +**Clarification**: The examples table in ADR-062 is correct. The text "first collision: `20251216143052-1`" means the first **duplicate** (collision) gets `-1`, not the first note overall. + +**Example Scenario**: +``` +Note A created at 14:30:52 -> slug = "20251216143052" (base, no suffix) +Note B created at 14:30:52 -> slug = "20251216143052-1" (first collision) +Note C created at 14:30:52 -> slug = "20251216143052-2" (second collision) +``` + +**Implementation**: Sequential suffix loop should start at `1`, not `2`: + +```python +suffix = 1 +while f"{base_slug}-{suffix}" in existing_slugs: + suffix += 1 +return f"{base_slug}-{suffix}" +``` + +**Note**: This differs from the existing `make_slug_unique_with_suffix()` function in `slug_utils.py` which starts at `-2`. The new timestamp slug function should start at `-1` per ADR-062. + +--- + +### Q3: Two `generate_slug()` functions with different signatures + +**Decision**: Create a new function in `slug_utils.py`. Do not modify or remove the existing function in `utils.py`. + +**Migration Path**: + +1. **Create new function** in `/home/phil/Projects/starpunk/starpunk/slug_utils.py`: + - Name: `generate_timestamp_slug()` (as specified in Q1 response) + - Do not name it `generate_slug()` to avoid confusion during migration + +2. **Update `/home/phil/Projects/starpunk/starpunk/notes.py`** (lines 228-234): + + **Current code** (line 228-234): + ```python + else: + # Generate base slug from content + base_slug = generate_slug(content, created_at) + + # Make unique if collision + slug = make_slug_unique(base_slug, existing_slugs) + ``` + + **Replace with**: + ```python + else: + # Generate timestamp-based slug (ADR-062) + from starpunk.slug_utils import generate_timestamp_slug + slug = generate_timestamp_slug(created_at, existing_slugs) + ``` + +3. **Keep the existing function** in `utils.py` unchanged. It may be used elsewhere or for future content-based slug options. + +4. **Update imports** in `notes.py`: + - Remove `generate_slug` from the `utils` import line (line 34) if no longer needed + - Verify no other callers use `generate_slug()` for default note creation + +**Rationale**: +- Creating a distinctly-named function makes the transition explicit +- Preserves backward compatibility if content-based slugs are ever needed +- Allows gradual migration with clear audit trail + +--- + +### Q4: Reserved slug handling in timestamp context + +**Decision**: +- **Default timestamp slugs**: Safe to skip reserved slug validation (timestamps cannot collide with reserved words) +- **Custom slugs**: Must retain reserved slug validation + +**Implementation**: + +The new `generate_timestamp_slug()` function does NOT need to check reserved slugs because: +- Reserved slugs are alphabetic (`api`, `admin`, `feed`, etc.) +- Timestamp slugs are purely numeric (`20251216143052`) +- These sets are mutually exclusive by construction + +**However**, the existing custom slug validation in `validate_and_sanitize_custom_slug()` must retain reserved slug checking because user-provided slugs could be anything. + +**Code Impact**: +- `generate_timestamp_slug()`: No reserved slug check needed +- `validate_and_sanitize_custom_slug()`: Unchanged (already handles reserved slugs) +- `validate_slug()` in `utils.py`: Unchanged (still validates reserved slugs for custom slugs) + +**Simplification**: You may remove the defensive `validate_slug()` call after timestamp generation in `notes.py` (line 236-237) since timestamp slugs are guaranteed valid by construction: +```python +# This check is no longer needed for timestamp slugs: +# if not validate_slug(slug): +# raise InvalidNoteDataError(...) +``` + +--- + +### Q5: Test update strategy for slug format changes + +**Decision**: Use pattern matching for timestamp assertions. Preserve content-based tests in a separate test file or clearly marked section. + +**Test Strategy**: + +1. **For new timestamp slug tests**, use regex pattern matching: + ```python + import re + + TIMESTAMP_SLUG_PATTERN = re.compile(r'^\d{14}(-\d+)?$') + + def test_default_slug_is_timestamp(): + note = create_note("Some content here") + assert TIMESTAMP_SLUG_PATTERN.match(note.slug) + + def test_slug_collision_adds_suffix(): + # Create two notes at same timestamp + fixed_time = datetime(2025, 12, 16, 14, 30, 52) + note1 = create_note("First note", created_at=fixed_time) + note2 = create_note("Second note", created_at=fixed_time) + + assert note1.slug == "20251216143052" + assert note2.slug == "20251216143052-1" + ``` + +2. **For timestamp-specific assertions**, use fixed timestamps in tests: + ```python + def test_timestamp_slug_format(): + """Test that timestamp slug matches exact format.""" + fixed_time = datetime(2025, 12, 16, 14, 30, 52) + slug = generate_timestamp_slug(fixed_time, set()) + assert slug == "20251216143052" + ``` + +3. **Content-based tests**: Move to a separate section or file named `test_legacy_slug_generation.py` with a comment: + ```python + """ + Legacy tests for content-based slug generation. + + These test the generate_slug() function in utils.py which is + preserved for backward compatibility. New notes use timestamp-based + slugs per ADR-062. + """ + ``` + +4. **Update existing tests that assert specific slug formats**: + - Replace exact string assertions (`assert slug == "hello-world"`) with pattern matching + - OR use fixed timestamps and assert exact expected values + +**Affected Test Files** (estimate from your question): +- `/home/phil/Projects/starpunk/tests/test_utils.py`: ~20 tests +- `/home/phil/Projects/starpunk/tests/test_notes.py`: ~10 tests +- `/home/phil/Projects/starpunk/tests/test_micropub.py`: ~10 tests + +**Helper Function** (add to test utilities or conftest.py): +```python +def assert_valid_timestamp_slug(slug: str) -> None: + """Assert slug matches timestamp format per ADR-062.""" + pattern = re.compile(r'^\d{14}(-\d+)?$') + assert pattern.match(slug), f"Slug '{slug}' does not match timestamp format" +``` + +--- + +### Q6: Phase 0 priority - are tests blocking? + +**Decision**: Phase 0 is a **hard dependency**. Phase 1 cannot proceed until all tests pass. + +**Rationale**: + +1. **Failing tests mask regressions**: If you implement Phase 1 (slug changes) while 19 tests are failing, you cannot distinguish between: + - Pre-existing failures + - New failures caused by your changes + +2. **CI/CD integrity**: The project should maintain a green build. Committing Phase 1 work with known failing tests violates this principle. + +3. **Debugging complexity**: If slug changes interact unexpectedly with the failing test areas (e.g., feed generation, content negotiation), debugging becomes significantly harder. + +**Implementation Order**: + +``` +Phase 0 (Test Fixes) <-- MUST COMPLETE FIRST + | + v ++-- Phase 1 (Timestamp Slugs) +| ++-- Phase 2 (Debug Files) <-- Can run in parallel +| ++-- Phase 3 (N+1 Query Fix) <-- after Phase 0 +| ++-- Phase 4 (Atomic Variants) + | + v +Phase 5 (Test Coverage) <-- LAST +``` + +**Exception**: If fixing a Phase 0 test reveals it requires slug-related changes, document the circular dependency and bring to architect attention. + +**Recommendation**: Start Phase 0 immediately. Many of those 19 failures may be simple test maintenance issues (timeouts, assertions that need updating, etc.). + +--- + +### Q7: Backwards compatibility for existing notes + +**Decision**: Slug format is **opaque**. No code needs to be format-aware. + +**Clarification**: + +1. **No format detection needed**: Do not create an `is_timestamp_slug()` helper function. The system should treat all slugs identically regardless of format. + +2. **Code paths are format-agnostic**: + - Database queries use slug as opaque string key + - URL routing uses slug as path parameter + - Feed generation includes slug verbatim + - Templates display slug without interpretation + +3. **Coexistence is automatic**: Content-based slugs (`my-first-post`) and timestamp slugs (`20251216143052`) are both valid slug patterns. They coexist naturally. + +4. **Sorting behavior**: + - Database ORDER BY on `created_at` is unaffected + - Alphabetical slug sorting would interleave formats (numbers sort before letters) + - This is acceptable; no UI sorts by slug alphabetically + +5. **Display considerations**: + - Timestamps are less readable in URLs, but URLs are rarely human-read + - UI displays note content, not slugs + - RSS feeds include full URLs; format doesn't affect feed readers + +**No audit required**: You do not need to audit slug usages. The design ensures format-agnosticism. + +--- + +### Q8: Phase ordering rationale + +**Decision**: +- Phases 2, 3, and 4 **can be parallelized** after Phase 0 completes +- Phase 1 (Timestamp Slugs) should be done **before** Phases 2-4 for cleaner testing +- Each phase should be **committed separately** + +**Recommended Order**: + +``` +Day 1: Phase 0 (Test Fixes) - Get to green build +Day 2: Phase 1 (Timestamp Slugs) - Quick win, user-visible improvement +Day 3: Phase 2 (Debug Files) or Phase 3 (N+1) or Phase 4 (Atomic) - Any order +Day 4: Remaining phases from above +Day 5: Phase 5 (Test Coverage) - Must be last +``` + +**Commit Strategy**: + +Each phase gets its own commit or small series of commits: +``` +feat(slugs): Implement timestamp-based slugs per ADR-062 +fix(tests): Resolve 19 failing tests in Phase 0 +feat(media): Add debug file cleanup and configuration +perf(feed): Batch load media and tags to fix N+1 query +feat(media): Make variant generation atomic with database +test: Expand coverage to 90% for v1.5.0 +``` + +**Why Not Batch**: Batching commits makes bisecting regressions harder and obscures the changelog. + +**Parallelization Note**: If you have capacity to work on multiple phases simultaneously (e.g., waiting for test runs), Phases 2-4 have no code overlap and can be developed in parallel branches: +- `feature/v1.5.0-debug-files` +- `feature/v1.5.0-n+1-fix` +- `feature/v1.5.0-atomic-variants` + +Then merge all into main before Phase 5. + +--- + +## Observations Responses + +### O1: Code duplication between `utils.py` and `slug_utils.py` + +**Acknowledged**. The duplicate `RESERVED_SLUGS` definitions should be consolidated in a future cleanup. For v1.5.0: + +- Use the more comprehensive list in `slug_utils.py` (12 items) as the canonical source +- Do not consolidate in this release (scope creep) +- Add to BACKLOG.md for v1.5.1 or v1.6.0 + +### O2: Timestamp format consistency + +**Confirmed**: The hyphen removal in ADR-062 (`YYYYMMDDHHMMSS`) vs existing fallback (`YYYYMMDD-HHMMSS`) is intentional. + +- ADR-062 format: `20251216143052` (14 characters, no separator) +- Old fallback format: `20251216-143052` (15 characters, with hyphen) + +**Rationale**: +- One fewer character +- Consistent with ISO 8601 compact format +- No functional difference; purely aesthetic + +**Action**: Use the unhyphenated format per ADR-062. + +### O3: Test coverage gap mentioned in Phase 5 + +**Clarification**: This is intentional duplication for visibility. + +- BACKLOG.md lists the gap for tracking +- Phase 5 in RELEASE.md lists it for implementation planning +- Both point to the same work item + +No action needed; this is not an error. + +--- + +## ADR-062 Updates Required + +No updates to ADR-062 are required based on these questions. The ADR's pseudocode is understood to be conceptual, and all implementation details have been clarified in this response document. + +--- + +## Implementation Checklist + +Based on these responses, the developer can now proceed with: + +- [ ] Phase 0: Fix 19 failing tests +- [ ] Create `generate_timestamp_slug()` in `slug_utils.py` +- [ ] Update `notes.py` to use new timestamp slug function +- [ ] Update tests with pattern matching strategy +- [ ] Proceed with Phases 2-4 (order flexible) +- [ ] Complete Phase 5 test coverage last + +--- + +**Architect**: StarPunk Architect Agent +**Status**: Questions Answered - Ready for Implementation +**Next Action**: Developer to begin Phase 0 (Test Fixes) diff --git a/docs/design/v1.5.0/2025-12-16-developer-questions.md b/docs/design/v1.5.0/2025-12-16-developer-questions.md new file mode 100644 index 0000000..af1228b --- /dev/null +++ b/docs/design/v1.5.0/2025-12-16-developer-questions.md @@ -0,0 +1,272 @@ +# Developer Review Questions - v1.5.0 Release Plan + +**Date**: 2025-12-16 +**Reviewer**: StarPunk Developer Agent +**Documents Reviewed**: +- `docs/decisions/ADR-062-timestamp-based-slug-format.md` +- `docs/projectplan/v1.5.0/RELEASE.md` +- `docs/projectplan/BACKLOG.md` + +## Executive Summary + +After thorough review of the v1.5.0 release plan and ADR-062, I have identified **8 questions** that need clarification from the architect before implementation can begin. These questions fall into three categories: + +1. **Slug implementation details** (Questions 1-4) +2. **Test structure and expectations** (Questions 5-6) +3. **Phase dependencies and ordering** (Questions 7-8) + +The majority of questions concern Phase 1 (Timestamp-Based Slugs), which has some ambiguities around collision handling and interaction with existing code paths. + +--- + +## Questions + +### Phase 1: Timestamp-Based Slugs + +#### Q1: Collision handling function location and signature + +**Issue**: ADR-062 specifies an `ensure_unique_slug(base_slug: str) -> str` function that checks `slug_exists(base_slug)`, but the implementation location and database interaction pattern are unclear. + +**Current State**: +- `starpunk/slug_utils.py` has `make_slug_unique_with_suffix(base_slug, existing_slugs: Set[str])` that takes a pre-fetched set +- `starpunk/utils.py` has `make_slug_unique(base_slug, existing_slugs: set[str])` with random suffix logic +- `starpunk/notes.py` line 220 calls `_get_existing_slugs(db)` to fetch all slugs, then passes to uniqueness check + +**Questions**: +1. Should `ensure_unique_slug()` perform its own database queries (violates current pattern)? +2. Or should it follow the existing pattern of accepting `existing_slugs: Set[str]` parameter? +3. Should this function live in `slug_utils.py` or `utils.py`? +4. The ADR shows `slug_exists()` as a separate function - should this be implemented, or is this pseudocode? + +**Implementation Impact**: This affects the function signature and which module needs modification. The current codebase consistently passes pre-fetched slug sets to avoid N+1 queries. + +**Suggested Clarification**: Confirm whether to: +- Keep existing pattern: `ensure_unique_slug(base_slug: str, existing_slugs: Set[str]) -> str` +- Or introduce database coupling: `ensure_unique_slug(base_slug: str, db: Connection) -> str` + +--- + +#### Q2: Sequential suffix starting number + +**Issue**: ADR-062 says "first collision: `20251216143052-1`" but the examples table shows the second note as `-1`, implying the first note has no suffix. + +**Current State**: +- Existing `make_slug_unique_with_suffix()` starts at `-2` for first collision (base slug is attempt #1) +- This matches table in ADR-062 examples + +**Questions**: +1. Does "first collision" mean the first duplicate attempt (which should get `-1`)? +2. Or does it mean the first note gets the base slug, and second note gets `-1`? + +**Example Scenario**: +```python +# Two notes created at exactly 2025-12-16 14:30:52 +Note A created first: slug = ? +Note B created second: slug = ? + +# Is it: +Option 1: A = 20251216143052, B = 20251216143052-1 (matches table) +Option 2: A = 20251216143052-1, B = 20251216143052-2 (matches "first collision" text) +``` + +**Implementation Impact**: Determines whether sequential suffix loop starts at 1 or 2. + +**Suggested Clarification**: Confirm that first note gets base slug (no suffix), first collision gets `-1`. + +--- + +#### Q3: Two `generate_slug()` functions with different signatures + +**Issue**: The codebase has two `generate_slug()` functions: + +**Current State**: +- `starpunk/utils.py` line 173: `generate_slug(content: str, created_at: Optional[datetime]) -> str` + - Content-based slug generation (extracts words) + - Used by `notes.py` line 230 +- ADR-062 specifies: `generate_slug(custom_slug: str = None, created_at: datetime = None) -> str` + - Timestamp-based, no content parameter + +**Questions**: +1. Should I replace the function in `utils.py`? +2. Should I create a new function in `slug_utils.py` and leave `utils.py` as deprecated? +3. Should both functions coexist with different names? +4. The removal of the `content` parameter is a breaking change for `notes.py` - should `notes.py` be updated to not pass content at all? + +**Implementation Impact**: +- If replacing `utils.py` function: ~15 test files will need updates +- If creating new function: need to update all callers to use new function +- `notes.py` line 230 currently calls `generate_slug(content, created_at)` - this will break + +**Suggested Clarification**: Specify exact function location and migration path for callers. + +--- + +#### Q4: Reserved slug handling in timestamp context + +**Issue**: ADR-062 doesn't address what happens if a timestamp-based slug conflicts with a reserved slug. + +**Current State**: +- Reserved slugs: `['api', 'admin', 'auth', 'feed', ...]` (all alphabetic) +- Timestamp format: `20251216143052` (all numeric) +- **These can never collide** + +**Questions**: +1. Should reserved slug validation be removed from the default slug path since timestamps can't conflict? +2. Should it remain for safety/future-proofing? +3. Custom slugs still need reserved validation - should this logic split? + +**Implementation Impact**: May simplify code by removing unnecessary validation from timestamp path. + +**Suggested Clarification**: Confirm whether reserved slug check is still needed for default slugs. + +--- + +### Phase 0: Test Fixes + +#### Q5: Test update strategy for slug format changes + +**Issue**: Phase 1 will change default slug format from content-based to timestamp-based. Many tests currently assert content-based slugs. + +**Current State**: +- `tests/test_utils.py` has 20+ tests asserting content-based slug behavior (e.g., `assert slug == "hello-world-this-is-my"`) +- Phase 1 acceptance criteria: "Update expected slug formats in test assertions" + +**Questions**: +1. Should these tests be updated to assert timestamp format (e.g., `assert slug.startswith("2025")`)? +2. Should the old content-based tests be preserved but marked as testing legacy behavior? +3. Should tests be split: "default slug generation" vs "content-based slug generation" (if we keep the old function)? +4. Do you want all slug assertions to use pattern matching (timestamp format) or fixed timestamps in tests? + +**Implementation Impact**: +- Affects ~30 test assertions in `test_utils.py` +- Affects ~10 integration tests in `test_notes.py`, `test_micropub.py` +- May require test helper functions for deterministic timestamps + +**Suggested Clarification**: Provide guidance on test update strategy. + +--- + +#### Q6: Phase 0 priority - are tests blocking? + +**Issue**: Phase 0 is marked "Must complete first - unblocks all other phases" but the 19 failing tests appear unrelated to slug changes. + +**Current State**: +- Failing tests: Migration performance, feed streaming, content negotiation, search security +- None directly related to slug generation +- Phase 1 (slugs) could theoretically proceed with these tests failing + +**Questions**: +1. Is Phase 0 truly blocking for Phase 1, or can Phase 1 proceed if slug-related tests pass? +2. Should Phase 0 be completed before ANY other phase, or just before phases that depend on those specific test areas? +3. If I discover that fixing some Phase 0 tests requires changes that conflict with Phase 1 work, what's the priority? + +**Implementation Impact**: Affects work sequencing and potential merge conflicts. + +**Suggested Clarification**: Confirm hard dependency of Phase 0 on all other phases, or allow parallel work. + +--- + +### General Implementation Questions + +#### Q7: Backwards compatibility for existing notes + +**Issue**: ADR-062 states "Existing notes retain their slugs (no data migration)" but doesn't specify how the dual slug formats coexist. + +**Current State**: +- Database has notes with content-based slugs (e.g., `my-first-post`) +- New notes will have timestamp slugs (e.g., `20251216143052`) +- Both slug formats will exist simultaneously + +**Questions**: +1. Do any code paths need to detect/handle the two formats differently? +2. Should there be a helper function `is_timestamp_slug(slug: str) -> bool` for future use? +3. Are there any edge cases where code assumes slug format (e.g., sorting, display, URLs)? +4. Does feed generation or UI need updates to handle mixed slug formats? + +**Implementation Impact**: May require audit of all slug usages to ensure format-agnostic handling. + +**Suggested Clarification**: Confirm if any code needs to be format-aware, or if slug format is truly opaque. + +--- + +#### Q8: Phase ordering rationale + +**Issue**: Phases 2-4 are all marked "Phase 0 complete" as dependency, but no interdependencies are noted between them. + +**Current State**: +- Phase 2: Debug file management +- Phase 3: N+1 query fix (feed generation) +- Phase 4: Atomic variant generation +- All independent of each other + +**Questions**: +1. Can Phases 2-4 be implemented in parallel after Phase 0? +2. Is there a preferred order (2→3→4) for any reason? +3. Should each phase be committed separately or can they be batched? + +**Implementation Impact**: Affects work planning and commit structure. + +**Suggested Clarification**: Confirm whether Phases 2-4 can be parallelized or should be sequential. + +--- + +## Additional Observations (Not Blocking) + +These observations don't block implementation but may inform architectural decisions: + +### O1: Code duplication between `utils.py` and `slug_utils.py` + +**Observation**: Reserved slugs are defined in both: +- `utils.py` line 24: `RESERVED_SLUGS = {"admin", "api", ...}` (5 items) +- `slug_utils.py` line 24: `RESERVED_SLUGS = frozenset([...])` (12 items, more comprehensive) + +**Impact**: Low - but may cause confusion. Consider consolidating in v1.5.1. + +--- + +### O2: Timestamp format consistency + +**Observation**: ADR-062 uses `YYYYMMDDHHMMSS` (no separator) but existing fallback in `utils.py` line 220 uses `YYYYMMDD-HHMMSS` (with hyphen). + +**Questions**: +- Is the hyphen removal intentional (shorter, more compact)? +- Current: `20241118-143045` (15 chars) +- ADR-062: `20241118143045` (14 chars) + +**Impact**: Minor - affects slug length by 1 character. + +--- + +### O3: Test coverage gap mentioned in Phase 5 + +**Observation**: Phase 5 mentions "MPO format handling (untested)" but this is already in BACKLOG as "High priority - Scheduled for v1.5.0". + +**Question**: Is this intentionally listed in both BACKLOG and Phase 5, or is there duplication? + +**Impact**: None - just a clarity question. + +--- + +## Summary + +**Implementation Readiness**: 6/10 + +The release plan is well-structured with clear phases and acceptance criteria. However, **Phase 1 (Timestamp-Based Slugs)** has enough ambiguity in implementation details (Questions 1-4) that I cannot confidently proceed without architect clarification. + +**Recommended Next Steps**: +1. Architect addresses Questions 1-4 (Phase 1 blocking issues) +2. Architect clarifies Questions 5-8 (helps inform implementation approach) +3. Developer proceeds with Phase 0 (test fixes) while awaiting Q1-Q4 answers +4. Developer implements Phases 1-4 after clarifications received + +**Estimated Time to Implementation-Ready**: 1-2 hours for architect to address questions. + +--- + +## Approval + +Once these questions are addressed, I am confident the implementation can proceed smoothly with the 18-29 hour estimate provided in the release plan. + +**Developer**: StarPunk Developer Agent +**Status**: Awaiting Architect Response +**Next Action**: Architect to review and answer questions 1-8 diff --git a/docs/projectplan/BACKLOG.md b/docs/projectplan/BACKLOG.md index fc30fc4..9630dcb 100644 --- a/docs/projectplan/BACKLOG.md +++ b/docs/projectplan/BACKLOG.md @@ -1,6 +1,6 @@ # StarPunk Backlog -**Last Updated**: 2025-12-16 +**Last Updated**: 2025-12-17 ## Recently Completed @@ -75,11 +75,24 @@ - **Source**: Architect Review (Issue 1.2.3) - **Approach**: Sanitize filename before use: `safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]` -### N+1 Query Pattern in Feed Generation *(Scheduled: v1.5.0)* +### N+1 Query Pattern in Feed Generation *(Scheduled: v1.5.0 - Partial)* - **Description**: In `_get_cached_notes()`, media and tags are loaded per-note in separate queries. For 50 notes, this is 100 additional database queries, degrading performance. - **Location**: `starpunk/routes/public.py` lines 68-74 - **Source**: Architect Review (Issue 2.2.9) - **Approach**: Implement batch loading function `get_media_for_notes(note_ids: List[int])` using a single query with `WHERE note_id IN (...)`. +- **v1.5.0 Scope**: Only `_get_cached_notes()` will be fixed in v1.5.0. Other N+1 patterns deferred (see Low Priority section). + +### N+1 Query Patterns - Deferred Locations +- **Description**: N+1 query patterns exist in multiple locations beyond `_get_cached_notes()`. These are lower priority due to lower traffic or single-note contexts. +- **Deferred Locations**: + 1. **Admin Edit Page** (`starpunk/routes/admin.py` line 186): Single note context, low impact + 2. **Tag Archive Pages** (`starpunk/routes/public.py` lines 243-248, 332): Lower traffic than main feed + 3. **Individual Note View** (`starpunk/routes/public.py` lines 282-289): Single note, negligible + 4. **Note Model Property** (`starpunk/models.py` lines 380-381): On-demand loading, acceptable + 5. **Tag Module** (`starpunk/tags.py` line 197): Internal helper, low volume +- **Source**: Architect Review, v1.5.0 scoping decision +- **Approach**: Future optimization if performance issues arise. Consider batch loading patterns established in v1.5.0. +- **Priority**: Deferred to post-v1.5.0 ### Transaction Not Atomic in Variant Generation *(Scheduled: v1.5.0)* - **Description**: Files are written to disk before database commit. If the database commit fails, orphaned files remain on disk. diff --git a/docs/projectplan/v1.5.0/RELEASE.md b/docs/projectplan/v1.5.0/RELEASE.md index 0ea172c..8c28961 100644 --- a/docs/projectplan/v1.5.0/RELEASE.md +++ b/docs/projectplan/v1.5.0/RELEASE.md @@ -1,257 +1,346 @@ -# StarPunk v1.5.0 Release +# StarPunk v1.5.0 Release Plan -**Status**: Planning +**Status**: Approved **Codename**: "Trigger" -**Focus**: Cleanup, Test Coverage, and Quality of Life Improvements +**Focus**: Stability, Test Coverage, and Technical Debt Reduction +**Last Updated**: 2025-12-17 ## Overview -This minor release focuses on technical debt reduction, improving test coverage to 90%, and addressing quality-of-life issues identified in previous release reviews. No new user-facing features are planned. +v1.5.0 is a quality-focused release that addresses failing tests, increases test coverage to 90%, implements critical fixes from the v1.4.x review cycle, and resolves technical debt. No new user-facing features are planned. ## Goals -1. **90% Test Coverage** - Increase overall test coverage from current baseline to 90% -2. **Address Backlog Technical Debt** - Resolve 6 specific backlog items -3. **Improve Code Quality** - Better error handling, atomicity, and performance +1. **Fix Failing Tests** - Resolve all 19 currently failing tests +2. **90% Test Coverage** - Increase overall test coverage to 90% minimum +3. **Technical Debt Reduction** - Address 6 specific backlog items +4. **Code Quality** - Better error handling, atomicity, and performance -## Features +## Phased Implementation Plan -### 1. Test Coverage Target: 90% - -Increase overall test coverage to 90% minimum across all modules. +### Phase 0: Test Fixes (Critical Path) +**Priority**: Must complete first - unblocks all other phases #### Scope -- Identify modules with coverage below 90% -- Write missing unit tests for uncovered code paths -- Add integration tests for critical workflows -- Ensure edge cases and error paths are tested +Fix the 19 failing tests identified in the current test suite: + +| Category | Count | Tests | +|----------|-------|-------| +| Migration Performance | 2 | `test_single_worker_performance`, `test_concurrent_workers_performance` | +| Feed Route (Streaming) | 1 | `test_feed_route_streaming` | +| Feed Endpoints | 3 | `test_feed_rss_endpoint`, `test_feed_json_endpoint`, `test_feed_xml_legacy_endpoint` | +| Content Negotiation | 6 | `test_accept_rss`, `test_accept_json_feed`, `test_accept_json_generic`, `test_accept_wildcard`, `test_no_accept_header`, `test_quality_factor_json_wins` | +| Backward Compatibility | 1 | `test_feed_xml_contains_rss` | +| Search Security | 1 | `test_search_escapes_html_in_note_content` | + +#### Approach +1. Investigate each failing test category +2. Determine if failure is test issue or code issue +3. Fix appropriately (prefer fixing tests over changing working code) +4. Document any behavioral changes #### Acceptance Criteria -- `uv run pytest --cov=starpunk` reports ≥90% overall coverage -- No module below 85% coverage -- All new code in this release has 100% coverage +- [ ] All 879 tests pass +- [ ] No test skips added (unless justified) +- [ ] No test timeouts + +#### Dependencies +None - this is the first phase --- -### 2. MPO Format Test Coverage +### Phase 1: Timestamp-Based Slugs +**Priority**: High - Addresses user-reported issue -**Source**: Backlog - High Priority (Developer Review M1) - -Add test coverage for MPO (Multi-Picture Object) format handling. - -#### Current State -- MPO conversion code exists at `starpunk/media.py` lines 163-173 -- Advertised in CHANGELOG but untested +#### Scope +Implement ADR-062 to change default slug format from content-based to timestamp-based. #### Implementation -- Add `test_mpo_detection_and_conversion()` to `TestHEICSupport` class -- Create MPO test image using Pillow's MPO support -- Test primary image extraction -- Test conversion to JPEG +1. Update `starpunk/slug_utils.py`: + - Change `generate_slug()` to use timestamp format `YYYYMMDDHHMMSS` + - Update collision handling to use sequential suffix (`-1`, `-2`, etc.) + - Preserve custom slug functionality + +2. Update `starpunk/notes.py`: + - Remove content parameter from slug generation calls + +3. Update tests: + - Modify expected slug formats in test assertions + - Add tests for new timestamp format + - Add tests for sequential collision handling #### Acceptance Criteria -- MPO detection tested -- MPO to JPEG conversion tested -- Edge cases (corrupted MPO, single-frame MPO) tested +- [ ] Default slugs use `YYYYMMDDHHMMSS` format +- [ ] Collision handling uses `-1`, `-2` suffix +- [ ] Custom slugs via `mp-slug` work unchanged +- [ ] Custom slugs via web UI work unchanged +- [ ] Existing notes unaffected +- [ ] ADR-062 referenced in code comments + +#### Dependencies +- Phase 0 complete (all tests passing) --- -### 3. Debug File Storage Cleanup +### Phase 2: Debug File Management +**Priority**: Medium - Security and operations concern -**Source**: Backlog - Medium Priority (Developer Review M2, Architect Review 1.2.2) - -Implement cleanup mechanism for failed upload debug files. - -#### Current State -- Failed uploads saved to `data/debug/` directory -- No cleanup mechanism exists -- Potential disk space exhaustion +#### Scope +Implement cleanup mechanism for failed upload debug files and add configuration controls. #### Implementation -1. Add `DEBUG_SAVE_FAILED_UPLOADS` config option (default: `false` in production) -2. Implement automatic cleanup for files older than 7 days -3. Add disk space check or size limit (100MB max for debug folder) -4. Cleanup runs on application startup and periodically +1. Add configuration options: + ```python + DEBUG_SAVE_FAILED_UPLOADS = False # Default: disabled in production + DEBUG_FILE_MAX_AGE_DAYS = 7 # Auto-delete threshold + DEBUG_FILE_MAX_SIZE_MB = 100 # Maximum debug folder size + ``` -#### Configuration -```python -DEBUG_SAVE_FAILED_UPLOADS = False # Enable only for debugging -DEBUG_FILE_MAX_AGE_DAYS = 7 # Auto-delete after 7 days -DEBUG_FILE_MAX_SIZE_MB = 100 # Maximum debug folder size -``` +2. Implement cleanup logic in `starpunk/media.py`: + - Check config before saving debug files + - Implement `cleanup_old_debug_files()` function + - Add size limit check before saving + +3. Add startup cleanup: + - Run cleanup on application startup + - Log cleanup actions + +4. Sanitize filenames: + - Sanitize filename before debug path construction + - Pattern: `"".join(c for c in filename if c.isalnum() or c in "._-")[:50]` #### Acceptance Criteria -- Debug files disabled by default in production -- Old files automatically cleaned up when enabled -- Disk space protected with size limit -- Tests cover all cleanup scenarios +- [ ] Debug files disabled by default +- [ ] Files older than 7 days auto-deleted when enabled +- [ ] Folder size limited to 100MB +- [ ] Filenames sanitized (no path traversal) +- [ ] Cleanup runs on startup +- [ ] Tests cover all scenarios + +#### Dependencies +- Phase 0 complete --- -### 4. Filename Sanitization in Debug Path +### Phase 3: N+1 Query Fix (Feed Generation) +**Priority**: Medium - Performance improvement -**Source**: Backlog - Medium Priority (Architect Review 1.2.3) - -Sanitize filenames before use in debug file paths. - -#### Current State -- Original filename used directly at `starpunk/media.py` line 135 -- Path traversal or special character issues possible +#### Scope +Fix N+1 query pattern in `_get_cached_notes()` only. Other N+1 patterns are deferred (documented in BACKLOG.md). #### Implementation -```python -safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50] -``` +1. Create batch loading functions in `starpunk/media.py`: + ```python + def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]: + """Batch load media for multiple notes in single query.""" + ``` + +2. Create batch loading functions in `starpunk/tags.py`: + ```python + def get_tags_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]: + """Batch load tags for multiple notes in single query.""" + ``` + +3. Update `_get_cached_notes()` in `starpunk/routes/public.py`: + - Use batch loading instead of per-note queries + - Maintain same output format #### Acceptance Criteria -- Filenames sanitized before debug path construction -- Path traversal attempts neutralized -- Special characters removed -- Filename length limited -- Tests cover malicious filename patterns +- [ ] Feed generation uses batch queries +- [ ] Query count reduced from O(n) to O(1) for media/tags +- [ ] No change to API behavior +- [ ] Performance improvement verified in tests +- [ ] Other N+1 locations documented in BACKLOG.md (not fixed) + +#### Dependencies +- Phase 0 complete --- -### 5. N+1 Query Pattern Fix +### Phase 4: Atomic Variant Generation +**Priority**: Medium - Data integrity -**Source**: Backlog - Medium Priority (Architect Review 2.2.9) - -Eliminate N+1 query pattern in feed generation. - -#### Current State -- `_get_cached_notes()` loads media and tags per-note -- For 50 notes: 100 additional database queries -- Performance degrades with note count +#### Scope +Make variant file generation atomic with database commits to prevent orphaned files. #### Implementation -1. Create `get_media_for_notes(note_ids: List[int])` batch function -2. Create `get_tags_for_notes(note_ids: List[int])` batch function -3. Use single query with `WHERE note_id IN (...)` -4. Update `_get_cached_notes()` to use batch loading +1. Modify `generate_all_variants()` in `starpunk/media.py`: + - Write variants to temporary directory first + - Perform database inserts in transaction + - Move files to final location after successful commit + - Clean up temp files on any failure -#### Example -```python -def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]: - """Batch load media for multiple notes.""" - # Single query: SELECT * FROM note_media WHERE note_id IN (?) - # Returns: {note_id: [media_list]} -``` - -#### Acceptance Criteria -- Feed generation uses batch queries -- Query count reduced from O(n) to O(1) for media/tags -- Performance improvement measurable in tests -- No change to API behavior - ---- - -### 6. Atomic Variant Generation - -**Source**: Backlog - Medium Priority (Architect Review 2.2.6) - -Make variant file generation atomic with database commits. - -#### Current State -- Files written to disk before database commit -- Database commit failure leaves orphaned files - -#### Implementation -1. Write variant files to temporary location first -2. Database insert in transaction -3. Move files to final location after successful commit -4. Cleanup temp files on any failure +2. Add startup recovery: + - Detect orphaned variant files on startup + - Log warnings for orphans found + - Optionally clean up orphans #### Flow ``` -1. Generate variants to temp directory +1. Generate variants to temp directory (data/media/.tmp/) 2. BEGIN TRANSACTION 3. INSERT media record 4. INSERT variant records 5. COMMIT 6. Move files from temp to final location -7. On failure: rollback DB, delete temp files +7. On failure: ROLLBACK, delete temp files ``` #### Acceptance Criteria -- No orphaned files on database failures -- No orphaned database records on file failures -- Atomic operation for all media saves -- Tests simulate failure scenarios +- [ ] No orphaned files on database failures +- [ ] No orphaned DB records on file failures +- [ ] Atomic operation for all media saves +- [ ] Startup recovery detects orphans +- [ ] Tests simulate failure scenarios + +#### Dependencies +- Phase 0 complete --- -### 7. Default Slug Format Change +### Phase 5: Test Coverage Expansion +**Priority**: Medium - Quality assurance -**Source**: Backlog - Medium Priority +#### Scope +Increase overall test coverage to 90% minimum. -Change default slug format from content-based to timestamp-based. +#### Approach +1. Run coverage report: `uv run pytest --cov=starpunk --cov-report=html` +2. Identify modules below 90% coverage +3. Prioritize based on risk and complexity +4. Write tests for uncovered paths -#### Current State -- Slugs generated from note content -- Can produce unwanted slugs from private content +#### Known Coverage Gaps (to verify) +- MPO format handling (untested) +- Edge cases in error paths +- Configuration validation paths +- Startup/shutdown hooks -#### New Format -- Default: `YYYYMMDDHHMMSS` (e.g., `20251216143052`) -- Duplicate handling: append `-1`, `-2`, etc. -- Custom slugs still supported via `mp-slug` +#### Specific Test Additions +1. **MPO Format Tests** (`tests/test_media_upload.py`): + - `test_mpo_detection_and_conversion()` + - `test_mpo_corrupted_file()` + - `test_mpo_single_frame()` -#### Implementation -1. Update `generate_slug()` function in `starpunk/slug_utils.py` -2. Generate timestamp-based slug by default -3. Check for duplicates and append suffix if needed -4. Preserve custom slug functionality +2. **Debug File Tests** (new test file or extend `test_media_upload.py`): + - `test_debug_file_disabled_by_default()` + - `test_debug_file_cleanup_old_files()` + - `test_debug_file_size_limit()` + - `test_debug_filename_sanitization()` -#### Example -```python -def generate_slug(content: str = None, custom_slug: str = None) -> str: - if custom_slug: - return sanitize_slug(custom_slug) - # Default: timestamp-based - timestamp = datetime.now().strftime("%Y%m%d%H%M%S") - return ensure_unique_slug(timestamp) -``` +3. **Batch Loading Tests**: + - `test_get_media_for_notes_batch()` + - `test_get_tags_for_notes_batch()` + - `test_batch_with_empty_list()` #### Acceptance Criteria -- New notes use timestamp slugs by default -- Duplicate timestamps handled with suffix -- Custom `mp-slug` still works -- Existing notes unchanged -- Tests cover all slug generation scenarios +- [ ] Overall coverage >= 90% +- [ ] No module below 85% coverage +- [ ] All new code in v1.5.0 has 100% coverage +- [ ] MPO handling fully tested + +#### Dependencies +- Phases 1-4 complete (tests cover new functionality) --- ## Out of Scope -- New user-facing features -- UI changes -- New feed formats -- Micropub extensions -- Database schema changes (except for bug fixes) +Items explicitly excluded from v1.5.0: -## Dependencies +| Item | Reason | +|------|--------| +| Rate limiting | Handled by reverse proxy (Caddy/Nginx) | +| Schema changes | Not needed for v1.5.0 fixes | +| New user features | Quality-focused release | +| N+1 fixes in admin/search | Low traffic, deferred to BACKLOG | +| UI changes | No frontend work planned | -- v1.4.x complete -- No new external dependencies expected +--- + +## Recommendation: Single Release vs. Multiple Patches + +**Recommendation: Single v1.5.0 Release** + +### Rationale + +1. **Phase Dependencies**: Most phases depend on Phase 0 (test fixes). Splitting would create artificial release boundaries. + +2. **Cognitive Overhead**: Multiple patch releases (1.5.1, 1.5.2, etc.) require: + - Multiple changelog entries + - Multiple version bumps + - Multiple release notes + - More git tags/branches + +3. **Test Coverage Integration**: Test coverage work (Phase 5) tests functionality from Phases 1-4. Separating them creates incomplete test coverage. + +4. **User Experience**: Users prefer fewer, more significant updates over many small patches. + +5. **Scope Alignment**: All v1.5.0 work is internally focused (no external API changes). Users see one "quality improvement" release. + +### Exception + +If Phase 0 (test fixes) reveals critical bugs affecting production, those fixes should be: +- Backported to a v1.4.3 patch release on the current branch +- Then merged forward to v1.5.0 + +### Alternative Considered + +Splitting into: +- v1.5.0: Phase 0 (test fixes) + Phase 1 (slugs) +- v1.5.1: Phase 2-4 (technical debt) +- v1.5.2: Phase 5 (test coverage) + +**Rejected** because test coverage work must test the new functionality, making separation counterproductive. + +--- ## Success Criteria -1. ✅ Test coverage ≥90% overall -2. ✅ MPO format has test coverage -3. ✅ Debug file cleanup implemented and tested -4. ✅ Filename sanitization implemented and tested -5. ✅ N+1 query pattern eliminated -6. ✅ Variant generation is atomic -7. ✅ Default slugs are timestamp-based -8. ✅ All existing tests continue to pass -9. ✅ No regressions in functionality +| # | Criterion | Verification | +|---|-----------|--------------| +| 1 | All tests pass | `uv run pytest` shows 0 failures | +| 2 | Coverage >= 90% | `uv run pytest --cov=starpunk` | +| 3 | MPO tested | MPO tests in test suite | +| 4 | Debug cleanup works | Manual verification + tests | +| 5 | N+1 fixed in feed | Performance tests show improvement | +| 6 | Variants atomic | Failure simulation tests pass | +| 7 | Slugs timestamp-based | New notes use `YYYYMMDDHHMMSS` format | +| 8 | No regressions | Full test suite passes | +| 9 | ADRs documented | ADR-062 in `/docs/decisions/` | -## Related Backlog Items +--- -After completion, the following backlog items should be marked complete or moved to "Recently Completed": +## Related Documentation -- MPO Format Test Coverage (High) -- Debug File Storage Without Cleanup Mechanism (Medium) -- Filename Not Sanitized in Debug Path (Medium) -- N+1 Query Pattern in Feed Generation (Medium) -- Transaction Not Atomic in Variant Generation (Medium) -- Default Slug Change (Medium) +- ADR-062: Timestamp-Based Slug Format (supersedes ADR-007) +- ADR-007: Slug Generation Algorithm (superseded) +- BACKLOG.md: Deferred N+1 query locations documented +- v1.4.2 Architect Review: Source of many v1.5.0 items + +--- + +## Implementation Timeline + +| Phase | Estimated Effort | Dependencies | +|-------|------------------|--------------| +| Phase 0: Test Fixes | 2-4 hours | None | +| Phase 1: Timestamp Slugs | 2-3 hours | Phase 0 | +| Phase 2: Debug Files | 3-4 hours | Phase 0 | +| Phase 3: N+1 Fix | 3-4 hours | Phase 0 | +| Phase 4: Atomic Variants | 4-6 hours | Phase 0 | +| Phase 5: Coverage | 4-8 hours | Phases 1-4 | + +**Total Estimated**: 18-29 hours + +--- + +## Post-Release + +After v1.5.0 ships, update BACKLOG.md to move completed items to "Recently Completed" section: +- MPO Format Test Coverage +- Debug File Storage Cleanup +- Filename Sanitization +- N+1 Query Fix (Feed Generation - partial) +- Atomic Variant Generation +- Default Slug Change