docs: v1.5.0 planning - ADR-062, release plan, and design docs
- ADR-062: Timestamp-based slug format (supersedes ADR-007) - Updated v1.5.0 RELEASE.md with 6-phase plan - Updated BACKLOG.md with deferred N+1 query locations - Developer questions and architect responses for Phase 1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
197
docs/decisions/ADR-062-timestamp-based-slug-format.md
Normal file
197
docs/decisions/ADR-062-timestamp-based-slug-format.md
Normal file
@@ -0,0 +1,197 @@
|
||||
# ADR-062: Timestamp-Based Slug Format
|
||||
|
||||
## Status
|
||||
Accepted (Supersedes ADR-007)
|
||||
|
||||
## Context
|
||||
|
||||
ADR-007 established a content-based slug generation algorithm that extracts the first 5 words from note content to create URL slugs. While this approach provides readable, SEO-friendly URLs, it has drawbacks for a personal note-taking system:
|
||||
|
||||
1. **Privacy Concerns**: The slug reveals note content in the URL. Private thoughts or draft content become visible in URLs that may be shared or logged.
|
||||
|
||||
2. **SEO Irrelevance**: StarPunk is designed for personal IndieWeb notes, not public blogs. Notes are typically short-form content (similar to tweets or status updates) where SEO optimization provides no meaningful benefit.
|
||||
|
||||
3. **Unpredictable Slugs**: Users cannot predict what slug will be generated without examining their content carefully.
|
||||
|
||||
4. **Edge Case Handling**: Content-based slugs require complex fallback logic for short content, unicode-only content, or special characters.
|
||||
|
||||
5. **Collision Complexity**: Similar notes require random suffix generation (e.g., `hello-world-a7c9`), which undermines the readability goal.
|
||||
|
||||
The user has explicitly stated: "Notes don't need SEO."
|
||||
|
||||
## Decision
|
||||
|
||||
Change the default slug format from content-based to timestamp-based:
|
||||
|
||||
**New Default Format**: `YYYYMMDDHHMMSS`
|
||||
- Example: `20251216143052`
|
||||
- Compact, sortable, predictable
|
||||
- 14 characters total
|
||||
|
||||
**Collision Handling**: Sequential numeric suffix
|
||||
- First collision: `20251216143052-1`
|
||||
- Second collision: `20251216143052-2`
|
||||
- Simple, predictable, no randomness
|
||||
|
||||
**Custom Slugs**: Continue to support user-specified slugs via `mp-slug` Micropub property and the web UI custom slug field. When provided, custom slugs take precedence.
|
||||
|
||||
### Algorithm Specification
|
||||
|
||||
```python
|
||||
def generate_slug(custom_slug: str = None, created_at: datetime = None) -> str:
|
||||
"""Generate a URL-safe slug for a note.
|
||||
|
||||
Args:
|
||||
custom_slug: User-provided custom slug (takes precedence if provided)
|
||||
created_at: Note creation timestamp (defaults to now)
|
||||
|
||||
Returns:
|
||||
URL-safe slug string
|
||||
"""
|
||||
if custom_slug:
|
||||
return sanitize_slug(custom_slug)
|
||||
|
||||
# Default: timestamp-based
|
||||
timestamp = (created_at or datetime.now()).strftime("%Y%m%d%H%M%S")
|
||||
return ensure_unique_slug(timestamp)
|
||||
|
||||
def ensure_unique_slug(base_slug: str) -> str:
|
||||
"""Ensure slug is unique, adding numeric suffix if needed."""
|
||||
if not slug_exists(base_slug):
|
||||
return base_slug
|
||||
|
||||
suffix = 1
|
||||
while slug_exists(f"{base_slug}-{suffix}"):
|
||||
suffix += 1
|
||||
return f"{base_slug}-{suffix}"
|
||||
```
|
||||
|
||||
### Examples
|
||||
|
||||
| Scenario | Generated Slug |
|
||||
|----------|----------------|
|
||||
| Normal note at 2:30:52 PM on Dec 16, 2025 | `20251216143052` |
|
||||
| Second note in same second | `20251216143052-1` |
|
||||
| Third note in same second | `20251216143052-2` |
|
||||
| Note with custom slug "my-custom-slug" | `my-custom-slug` |
|
||||
|
||||
## Rationale
|
||||
|
||||
### Timestamp Format (Score: 9/10)
|
||||
|
||||
**Pros**:
|
||||
- **Privacy**: URL reveals nothing about content
|
||||
- **Predictability**: User knows exactly what slug format to expect
|
||||
- **Sortability**: Chronological sorting by URL is possible
|
||||
- **Simplicity**: No complex word extraction or normalization
|
||||
- **Collision Rarity**: Same-second creation is rare; handled cleanly when it occurs
|
||||
- **Compact**: 14 characters vs potentially 50+ for content-based
|
||||
|
||||
**Cons**:
|
||||
- **Not Memorable**: `20251216143052` is harder to remember than `hello-world`
|
||||
- **No SEO Value**: Search engines prefer descriptive URLs
|
||||
|
||||
**Why Cons Don't Matter**:
|
||||
- StarPunk is for personal notes, not public SEO-optimized content
|
||||
- Notes are accessed via UI, feeds, or bookmarks, not memorized URLs
|
||||
- Users wanting memorable URLs can use custom slugs
|
||||
|
||||
### Sequential Suffix (Score: 10/10)
|
||||
|
||||
**Pros**:
|
||||
- **Deterministic**: No randomness; same collision always gets same suffix
|
||||
- **Simple**: No cryptographic random generation needed
|
||||
- **Readable**: `-1`, `-2` are clear and obvious
|
||||
- **Debuggable**: Easy to understand collision resolution
|
||||
|
||||
**Cons**:
|
||||
- **Enumerable**: Sequential numbers could be probed
|
||||
- Not a real security concern for note slugs
|
||||
|
||||
### Comparison with ADR-007 Approach
|
||||
|
||||
| Aspect | ADR-007 (Content-Based) | ADR-062 (Timestamp) |
|
||||
|--------|------------------------|---------------------|
|
||||
| Privacy | Reveals content | Reveals only time |
|
||||
| Complexity | High (word extraction, normalization, unicode handling) | Low (strftime) |
|
||||
| SEO | Good | None |
|
||||
| Predictability | Low | High |
|
||||
| Collision handling | Random suffix | Sequential suffix |
|
||||
| Fallback cases | Many (short content, unicode, etc.) | None |
|
||||
| Code lines | ~50 | ~15 |
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **Simplified Code**: Remove complex word extraction, unicode normalization, and multiple fallback paths
|
||||
2. **Better Privacy**: Note content never appears in URLs
|
||||
3. **Predictable Output**: Users always know what slug format to expect
|
||||
4. **Fewer Edge Cases**: No special handling for short content, unicode, or special characters
|
||||
5. **Cleaner Collisions**: Sequential suffixes are more intuitive than random strings
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Migration**: Existing notes keep their content-based slugs (no migration needed)
|
||||
2. **Not Human-Readable**: URLs don't describe content
|
||||
3. **No SEO**: Search engines won't benefit from descriptive URLs
|
||||
|
||||
### Mitigations
|
||||
|
||||
**Human-Readable URLs**:
|
||||
- Users wanting descriptive URLs can use custom slugs via `mp-slug`
|
||||
- The web UI custom slug field remains available
|
||||
- This is opt-in rather than default
|
||||
|
||||
**SEO**:
|
||||
- IndieWeb notes are typically not SEO targets
|
||||
- Content is in the page body where search engines can index it
|
||||
- Microformats2 markup provides semantic meaning
|
||||
|
||||
## Backward Compatibility
|
||||
|
||||
- Existing notes retain their slugs (no data migration)
|
||||
- New notes use timestamp format by default
|
||||
- Custom slug functionality unchanged
|
||||
- All existing URLs remain valid
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
- Test default slug generation produces timestamp format
|
||||
- Test collision handling with sequential suffixes
|
||||
- Test custom slugs still take precedence
|
||||
- Test edge case: multiple notes in same second
|
||||
- Test reserved slug rejection still works
|
||||
- Verify existing tests for custom slugs pass
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
Changes required in:
|
||||
- `starpunk/slug_utils.py`: Update `generate_slug()` function
|
||||
- `starpunk/notes.py`: Remove content parameter from slug generation call
|
||||
- Tests: Update expected slug formats
|
||||
|
||||
Estimated effort: Small (1-2 hours implementation, 1 hour testing)
|
||||
|
||||
## References
|
||||
|
||||
- ADR-007: Slug Generation Algorithm (Superseded by this ADR)
|
||||
- ADR-035: Custom Slugs (Unchanged; complements this decision)
|
||||
- IndieWeb Permalink Best Practices: https://indieweb.org/permalink
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Default slugs use `YYYYMMDDHHMMSS` format
|
||||
- [ ] Collision handling uses sequential suffix (`-1`, `-2`, etc.)
|
||||
- [ ] Custom slugs via `mp-slug` continue to work
|
||||
- [ ] Custom slugs via web UI continue to work
|
||||
- [ ] Reserved slug validation unchanged
|
||||
- [ ] Existing notes unaffected
|
||||
- [ ] All tests pass
|
||||
- [ ] Code complexity reduced
|
||||
|
||||
---
|
||||
|
||||
**Approved**: 2025-12-16
|
||||
**Architect**: StarPunk Architect Agent
|
||||
**Supersedes**: ADR-007 (Slug Generation Algorithm)
|
||||
412
docs/design/v1.5.0/2025-12-16-architect-responses.md
Normal file
412
docs/design/v1.5.0/2025-12-16-architect-responses.md
Normal file
@@ -0,0 +1,412 @@
|
||||
# Architect Responses - v1.5.0 Developer Questions
|
||||
|
||||
**Date**: 2025-12-16
|
||||
**Architect**: StarPunk Architect Agent
|
||||
**In Response To**: `docs/design/v1.5.0/2025-12-16-developer-questions.md`
|
||||
|
||||
---
|
||||
|
||||
## Response Summary
|
||||
|
||||
All 8 questions have been addressed. The implementation can proceed immediately after reading this document. Key decisions:
|
||||
|
||||
1. **Q1**: Keep existing pattern (pass `existing_slugs: Set[str]`). ADR-062 pseudocode is conceptual.
|
||||
2. **Q2**: First note gets base slug, first collision gets `-1`. The table in ADR-062 is correct.
|
||||
3. **Q3**: Create new function in `slug_utils.py`. Migration path provided.
|
||||
4. **Q4**: Keep reserved slug validation for custom slugs only. Safe to skip for timestamp slugs.
|
||||
5. **Q5**: Tests should use pattern matching for timestamps. Test strategy provided.
|
||||
6. **Q6**: Phase 0 is a hard dependency. Cannot proceed with Phase 1 until all tests pass.
|
||||
7. **Q7**: Slug format is opaque. No format-aware code needed.
|
||||
8. **Q8**: Phases 2-4 can be parallelized. Commit each phase separately.
|
||||
|
||||
---
|
||||
|
||||
## Detailed Responses
|
||||
|
||||
### Q1: Collision handling function location and signature
|
||||
|
||||
**Decision**: Keep the existing pattern of accepting `existing_slugs: Set[str]` as a parameter.
|
||||
|
||||
**Rationale**:
|
||||
The ADR-062 pseudocode (`slug_exists(base_slug)`) is conceptual illustration only. The actual implementation should:
|
||||
|
||||
1. Follow the existing pattern in the codebase (passing pre-fetched slug sets)
|
||||
2. Avoid introducing database coupling into utility functions
|
||||
3. Prevent N+1 queries when checking multiple candidates
|
||||
|
||||
**Implementation Specification**:
|
||||
|
||||
Create a new function in `/home/phil/Projects/starpunk/starpunk/slug_utils.py`:
|
||||
|
||||
```python
|
||||
def generate_timestamp_slug(
|
||||
created_at: datetime = None,
|
||||
existing_slugs: Set[str] = None
|
||||
) -> str:
|
||||
"""Generate a timestamp-based slug with collision handling.
|
||||
|
||||
Per ADR-062: Default format is YYYYMMDDHHMMSS with sequential
|
||||
suffix (-1, -2, etc.) for collisions.
|
||||
|
||||
Args:
|
||||
created_at: Note creation timestamp (defaults to now)
|
||||
existing_slugs: Set of existing slugs to check for collisions
|
||||
|
||||
Returns:
|
||||
Unique timestamp-based slug
|
||||
"""
|
||||
if created_at is None:
|
||||
created_at = datetime.utcnow()
|
||||
|
||||
if existing_slugs is None:
|
||||
existing_slugs = set()
|
||||
|
||||
base_slug = created_at.strftime("%Y%m%d%H%M%S")
|
||||
|
||||
# If no collision, return base slug
|
||||
if base_slug not in existing_slugs:
|
||||
return base_slug
|
||||
|
||||
# Sequential suffix for collisions
|
||||
suffix = 1
|
||||
while f"{base_slug}-{suffix}" in existing_slugs:
|
||||
suffix += 1
|
||||
|
||||
return f"{base_slug}-{suffix}"
|
||||
```
|
||||
|
||||
**Location**: `/home/phil/Projects/starpunk/starpunk/slug_utils.py`
|
||||
|
||||
**Do NOT modify**: The `ensure_unique_slug()` function signature in ADR-062 is pseudocode. Do not create a function with database coupling.
|
||||
|
||||
---
|
||||
|
||||
### Q2: Sequential suffix starting number
|
||||
|
||||
**Decision**: First note gets base slug (no suffix). First collision gets `-1`.
|
||||
|
||||
**Clarification**: The examples table in ADR-062 is correct. The text "first collision: `20251216143052-1`" means the first **duplicate** (collision) gets `-1`, not the first note overall.
|
||||
|
||||
**Example Scenario**:
|
||||
```
|
||||
Note A created at 14:30:52 -> slug = "20251216143052" (base, no suffix)
|
||||
Note B created at 14:30:52 -> slug = "20251216143052-1" (first collision)
|
||||
Note C created at 14:30:52 -> slug = "20251216143052-2" (second collision)
|
||||
```
|
||||
|
||||
**Implementation**: Sequential suffix loop should start at `1`, not `2`:
|
||||
|
||||
```python
|
||||
suffix = 1
|
||||
while f"{base_slug}-{suffix}" in existing_slugs:
|
||||
suffix += 1
|
||||
return f"{base_slug}-{suffix}"
|
||||
```
|
||||
|
||||
**Note**: This differs from the existing `make_slug_unique_with_suffix()` function in `slug_utils.py` which starts at `-2`. The new timestamp slug function should start at `-1` per ADR-062.
|
||||
|
||||
---
|
||||
|
||||
### Q3: Two `generate_slug()` functions with different signatures
|
||||
|
||||
**Decision**: Create a new function in `slug_utils.py`. Do not modify or remove the existing function in `utils.py`.
|
||||
|
||||
**Migration Path**:
|
||||
|
||||
1. **Create new function** in `/home/phil/Projects/starpunk/starpunk/slug_utils.py`:
|
||||
- Name: `generate_timestamp_slug()` (as specified in Q1 response)
|
||||
- Do not name it `generate_slug()` to avoid confusion during migration
|
||||
|
||||
2. **Update `/home/phil/Projects/starpunk/starpunk/notes.py`** (lines 228-234):
|
||||
|
||||
**Current code** (line 228-234):
|
||||
```python
|
||||
else:
|
||||
# Generate base slug from content
|
||||
base_slug = generate_slug(content, created_at)
|
||||
|
||||
# Make unique if collision
|
||||
slug = make_slug_unique(base_slug, existing_slugs)
|
||||
```
|
||||
|
||||
**Replace with**:
|
||||
```python
|
||||
else:
|
||||
# Generate timestamp-based slug (ADR-062)
|
||||
from starpunk.slug_utils import generate_timestamp_slug
|
||||
slug = generate_timestamp_slug(created_at, existing_slugs)
|
||||
```
|
||||
|
||||
3. **Keep the existing function** in `utils.py` unchanged. It may be used elsewhere or for future content-based slug options.
|
||||
|
||||
4. **Update imports** in `notes.py`:
|
||||
- Remove `generate_slug` from the `utils` import line (line 34) if no longer needed
|
||||
- Verify no other callers use `generate_slug()` for default note creation
|
||||
|
||||
**Rationale**:
|
||||
- Creating a distinctly-named function makes the transition explicit
|
||||
- Preserves backward compatibility if content-based slugs are ever needed
|
||||
- Allows gradual migration with clear audit trail
|
||||
|
||||
---
|
||||
|
||||
### Q4: Reserved slug handling in timestamp context
|
||||
|
||||
**Decision**:
|
||||
- **Default timestamp slugs**: Safe to skip reserved slug validation (timestamps cannot collide with reserved words)
|
||||
- **Custom slugs**: Must retain reserved slug validation
|
||||
|
||||
**Implementation**:
|
||||
|
||||
The new `generate_timestamp_slug()` function does NOT need to check reserved slugs because:
|
||||
- Reserved slugs are alphabetic (`api`, `admin`, `feed`, etc.)
|
||||
- Timestamp slugs are purely numeric (`20251216143052`)
|
||||
- These sets are mutually exclusive by construction
|
||||
|
||||
**However**, the existing custom slug validation in `validate_and_sanitize_custom_slug()` must retain reserved slug checking because user-provided slugs could be anything.
|
||||
|
||||
**Code Impact**:
|
||||
- `generate_timestamp_slug()`: No reserved slug check needed
|
||||
- `validate_and_sanitize_custom_slug()`: Unchanged (already handles reserved slugs)
|
||||
- `validate_slug()` in `utils.py`: Unchanged (still validates reserved slugs for custom slugs)
|
||||
|
||||
**Simplification**: You may remove the defensive `validate_slug()` call after timestamp generation in `notes.py` (line 236-237) since timestamp slugs are guaranteed valid by construction:
|
||||
```python
|
||||
# This check is no longer needed for timestamp slugs:
|
||||
# if not validate_slug(slug):
|
||||
# raise InvalidNoteDataError(...)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Q5: Test update strategy for slug format changes
|
||||
|
||||
**Decision**: Use pattern matching for timestamp assertions. Preserve content-based tests in a separate test file or clearly marked section.
|
||||
|
||||
**Test Strategy**:
|
||||
|
||||
1. **For new timestamp slug tests**, use regex pattern matching:
|
||||
```python
|
||||
import re
|
||||
|
||||
TIMESTAMP_SLUG_PATTERN = re.compile(r'^\d{14}(-\d+)?$')
|
||||
|
||||
def test_default_slug_is_timestamp():
|
||||
note = create_note("Some content here")
|
||||
assert TIMESTAMP_SLUG_PATTERN.match(note.slug)
|
||||
|
||||
def test_slug_collision_adds_suffix():
|
||||
# Create two notes at same timestamp
|
||||
fixed_time = datetime(2025, 12, 16, 14, 30, 52)
|
||||
note1 = create_note("First note", created_at=fixed_time)
|
||||
note2 = create_note("Second note", created_at=fixed_time)
|
||||
|
||||
assert note1.slug == "20251216143052"
|
||||
assert note2.slug == "20251216143052-1"
|
||||
```
|
||||
|
||||
2. **For timestamp-specific assertions**, use fixed timestamps in tests:
|
||||
```python
|
||||
def test_timestamp_slug_format():
|
||||
"""Test that timestamp slug matches exact format."""
|
||||
fixed_time = datetime(2025, 12, 16, 14, 30, 52)
|
||||
slug = generate_timestamp_slug(fixed_time, set())
|
||||
assert slug == "20251216143052"
|
||||
```
|
||||
|
||||
3. **Content-based tests**: Move to a separate section or file named `test_legacy_slug_generation.py` with a comment:
|
||||
```python
|
||||
"""
|
||||
Legacy tests for content-based slug generation.
|
||||
|
||||
These test the generate_slug() function in utils.py which is
|
||||
preserved for backward compatibility. New notes use timestamp-based
|
||||
slugs per ADR-062.
|
||||
"""
|
||||
```
|
||||
|
||||
4. **Update existing tests that assert specific slug formats**:
|
||||
- Replace exact string assertions (`assert slug == "hello-world"`) with pattern matching
|
||||
- OR use fixed timestamps and assert exact expected values
|
||||
|
||||
**Affected Test Files** (estimate from your question):
|
||||
- `/home/phil/Projects/starpunk/tests/test_utils.py`: ~20 tests
|
||||
- `/home/phil/Projects/starpunk/tests/test_notes.py`: ~10 tests
|
||||
- `/home/phil/Projects/starpunk/tests/test_micropub.py`: ~10 tests
|
||||
|
||||
**Helper Function** (add to test utilities or conftest.py):
|
||||
```python
|
||||
def assert_valid_timestamp_slug(slug: str) -> None:
|
||||
"""Assert slug matches timestamp format per ADR-062."""
|
||||
pattern = re.compile(r'^\d{14}(-\d+)?$')
|
||||
assert pattern.match(slug), f"Slug '{slug}' does not match timestamp format"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Q6: Phase 0 priority - are tests blocking?
|
||||
|
||||
**Decision**: Phase 0 is a **hard dependency**. Phase 1 cannot proceed until all tests pass.
|
||||
|
||||
**Rationale**:
|
||||
|
||||
1. **Failing tests mask regressions**: If you implement Phase 1 (slug changes) while 19 tests are failing, you cannot distinguish between:
|
||||
- Pre-existing failures
|
||||
- New failures caused by your changes
|
||||
|
||||
2. **CI/CD integrity**: The project should maintain a green build. Committing Phase 1 work with known failing tests violates this principle.
|
||||
|
||||
3. **Debugging complexity**: If slug changes interact unexpectedly with the failing test areas (e.g., feed generation, content negotiation), debugging becomes significantly harder.
|
||||
|
||||
**Implementation Order**:
|
||||
|
||||
```
|
||||
Phase 0 (Test Fixes) <-- MUST COMPLETE FIRST
|
||||
|
|
||||
v
|
||||
+-- Phase 1 (Timestamp Slugs)
|
||||
|
|
||||
+-- Phase 2 (Debug Files) <-- Can run in parallel
|
||||
|
|
||||
+-- Phase 3 (N+1 Query Fix) <-- after Phase 0
|
||||
|
|
||||
+-- Phase 4 (Atomic Variants)
|
||||
|
|
||||
v
|
||||
Phase 5 (Test Coverage) <-- LAST
|
||||
```
|
||||
|
||||
**Exception**: If fixing a Phase 0 test reveals it requires slug-related changes, document the circular dependency and bring to architect attention.
|
||||
|
||||
**Recommendation**: Start Phase 0 immediately. Many of those 19 failures may be simple test maintenance issues (timeouts, assertions that need updating, etc.).
|
||||
|
||||
---
|
||||
|
||||
### Q7: Backwards compatibility for existing notes
|
||||
|
||||
**Decision**: Slug format is **opaque**. No code needs to be format-aware.
|
||||
|
||||
**Clarification**:
|
||||
|
||||
1. **No format detection needed**: Do not create an `is_timestamp_slug()` helper function. The system should treat all slugs identically regardless of format.
|
||||
|
||||
2. **Code paths are format-agnostic**:
|
||||
- Database queries use slug as opaque string key
|
||||
- URL routing uses slug as path parameter
|
||||
- Feed generation includes slug verbatim
|
||||
- Templates display slug without interpretation
|
||||
|
||||
3. **Coexistence is automatic**: Content-based slugs (`my-first-post`) and timestamp slugs (`20251216143052`) are both valid slug patterns. They coexist naturally.
|
||||
|
||||
4. **Sorting behavior**:
|
||||
- Database ORDER BY on `created_at` is unaffected
|
||||
- Alphabetical slug sorting would interleave formats (numbers sort before letters)
|
||||
- This is acceptable; no UI sorts by slug alphabetically
|
||||
|
||||
5. **Display considerations**:
|
||||
- Timestamps are less readable in URLs, but URLs are rarely human-read
|
||||
- UI displays note content, not slugs
|
||||
- RSS feeds include full URLs; format doesn't affect feed readers
|
||||
|
||||
**No audit required**: You do not need to audit slug usages. The design ensures format-agnosticism.
|
||||
|
||||
---
|
||||
|
||||
### Q8: Phase ordering rationale
|
||||
|
||||
**Decision**:
|
||||
- Phases 2, 3, and 4 **can be parallelized** after Phase 0 completes
|
||||
- Phase 1 (Timestamp Slugs) should be done **before** Phases 2-4 for cleaner testing
|
||||
- Each phase should be **committed separately**
|
||||
|
||||
**Recommended Order**:
|
||||
|
||||
```
|
||||
Day 1: Phase 0 (Test Fixes) - Get to green build
|
||||
Day 2: Phase 1 (Timestamp Slugs) - Quick win, user-visible improvement
|
||||
Day 3: Phase 2 (Debug Files) or Phase 3 (N+1) or Phase 4 (Atomic) - Any order
|
||||
Day 4: Remaining phases from above
|
||||
Day 5: Phase 5 (Test Coverage) - Must be last
|
||||
```
|
||||
|
||||
**Commit Strategy**:
|
||||
|
||||
Each phase gets its own commit or small series of commits:
|
||||
```
|
||||
feat(slugs): Implement timestamp-based slugs per ADR-062
|
||||
fix(tests): Resolve 19 failing tests in Phase 0
|
||||
feat(media): Add debug file cleanup and configuration
|
||||
perf(feed): Batch load media and tags to fix N+1 query
|
||||
feat(media): Make variant generation atomic with database
|
||||
test: Expand coverage to 90% for v1.5.0
|
||||
```
|
||||
|
||||
**Why Not Batch**: Batching commits makes bisecting regressions harder and obscures the changelog.
|
||||
|
||||
**Parallelization Note**: If you have capacity to work on multiple phases simultaneously (e.g., waiting for test runs), Phases 2-4 have no code overlap and can be developed in parallel branches:
|
||||
- `feature/v1.5.0-debug-files`
|
||||
- `feature/v1.5.0-n+1-fix`
|
||||
- `feature/v1.5.0-atomic-variants`
|
||||
|
||||
Then merge all into main before Phase 5.
|
||||
|
||||
---
|
||||
|
||||
## Observations Responses
|
||||
|
||||
### O1: Code duplication between `utils.py` and `slug_utils.py`
|
||||
|
||||
**Acknowledged**. The duplicate `RESERVED_SLUGS` definitions should be consolidated in a future cleanup. For v1.5.0:
|
||||
|
||||
- Use the more comprehensive list in `slug_utils.py` (12 items) as the canonical source
|
||||
- Do not consolidate in this release (scope creep)
|
||||
- Add to BACKLOG.md for v1.5.1 or v1.6.0
|
||||
|
||||
### O2: Timestamp format consistency
|
||||
|
||||
**Confirmed**: The hyphen removal in ADR-062 (`YYYYMMDDHHMMSS`) vs existing fallback (`YYYYMMDD-HHMMSS`) is intentional.
|
||||
|
||||
- ADR-062 format: `20251216143052` (14 characters, no separator)
|
||||
- Old fallback format: `20251216-143052` (15 characters, with hyphen)
|
||||
|
||||
**Rationale**:
|
||||
- One fewer character
|
||||
- Consistent with ISO 8601 compact format
|
||||
- No functional difference; purely aesthetic
|
||||
|
||||
**Action**: Use the unhyphenated format per ADR-062.
|
||||
|
||||
### O3: Test coverage gap mentioned in Phase 5
|
||||
|
||||
**Clarification**: This is intentional duplication for visibility.
|
||||
|
||||
- BACKLOG.md lists the gap for tracking
|
||||
- Phase 5 in RELEASE.md lists it for implementation planning
|
||||
- Both point to the same work item
|
||||
|
||||
No action needed; this is not an error.
|
||||
|
||||
---
|
||||
|
||||
## ADR-062 Updates Required
|
||||
|
||||
No updates to ADR-062 are required based on these questions. The ADR's pseudocode is understood to be conceptual, and all implementation details have been clarified in this response document.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
Based on these responses, the developer can now proceed with:
|
||||
|
||||
- [ ] Phase 0: Fix 19 failing tests
|
||||
- [ ] Create `generate_timestamp_slug()` in `slug_utils.py`
|
||||
- [ ] Update `notes.py` to use new timestamp slug function
|
||||
- [ ] Update tests with pattern matching strategy
|
||||
- [ ] Proceed with Phases 2-4 (order flexible)
|
||||
- [ ] Complete Phase 5 test coverage last
|
||||
|
||||
---
|
||||
|
||||
**Architect**: StarPunk Architect Agent
|
||||
**Status**: Questions Answered - Ready for Implementation
|
||||
**Next Action**: Developer to begin Phase 0 (Test Fixes)
|
||||
272
docs/design/v1.5.0/2025-12-16-developer-questions.md
Normal file
272
docs/design/v1.5.0/2025-12-16-developer-questions.md
Normal file
@@ -0,0 +1,272 @@
|
||||
# Developer Review Questions - v1.5.0 Release Plan
|
||||
|
||||
**Date**: 2025-12-16
|
||||
**Reviewer**: StarPunk Developer Agent
|
||||
**Documents Reviewed**:
|
||||
- `docs/decisions/ADR-062-timestamp-based-slug-format.md`
|
||||
- `docs/projectplan/v1.5.0/RELEASE.md`
|
||||
- `docs/projectplan/BACKLOG.md`
|
||||
|
||||
## Executive Summary
|
||||
|
||||
After thorough review of the v1.5.0 release plan and ADR-062, I have identified **8 questions** that need clarification from the architect before implementation can begin. These questions fall into three categories:
|
||||
|
||||
1. **Slug implementation details** (Questions 1-4)
|
||||
2. **Test structure and expectations** (Questions 5-6)
|
||||
3. **Phase dependencies and ordering** (Questions 7-8)
|
||||
|
||||
The majority of questions concern Phase 1 (Timestamp-Based Slugs), which has some ambiguities around collision handling and interaction with existing code paths.
|
||||
|
||||
---
|
||||
|
||||
## Questions
|
||||
|
||||
### Phase 1: Timestamp-Based Slugs
|
||||
|
||||
#### Q1: Collision handling function location and signature
|
||||
|
||||
**Issue**: ADR-062 specifies an `ensure_unique_slug(base_slug: str) -> str` function that checks `slug_exists(base_slug)`, but the implementation location and database interaction pattern are unclear.
|
||||
|
||||
**Current State**:
|
||||
- `starpunk/slug_utils.py` has `make_slug_unique_with_suffix(base_slug, existing_slugs: Set[str])` that takes a pre-fetched set
|
||||
- `starpunk/utils.py` has `make_slug_unique(base_slug, existing_slugs: set[str])` with random suffix logic
|
||||
- `starpunk/notes.py` line 220 calls `_get_existing_slugs(db)` to fetch all slugs, then passes to uniqueness check
|
||||
|
||||
**Questions**:
|
||||
1. Should `ensure_unique_slug()` perform its own database queries (violates current pattern)?
|
||||
2. Or should it follow the existing pattern of accepting `existing_slugs: Set[str]` parameter?
|
||||
3. Should this function live in `slug_utils.py` or `utils.py`?
|
||||
4. The ADR shows `slug_exists()` as a separate function - should this be implemented, or is this pseudocode?
|
||||
|
||||
**Implementation Impact**: This affects the function signature and which module needs modification. The current codebase consistently passes pre-fetched slug sets to avoid N+1 queries.
|
||||
|
||||
**Suggested Clarification**: Confirm whether to:
|
||||
- Keep existing pattern: `ensure_unique_slug(base_slug: str, existing_slugs: Set[str]) -> str`
|
||||
- Or introduce database coupling: `ensure_unique_slug(base_slug: str, db: Connection) -> str`
|
||||
|
||||
---
|
||||
|
||||
#### Q2: Sequential suffix starting number
|
||||
|
||||
**Issue**: ADR-062 says "first collision: `20251216143052-1`" but the examples table shows the second note as `-1`, implying the first note has no suffix.
|
||||
|
||||
**Current State**:
|
||||
- Existing `make_slug_unique_with_suffix()` starts at `-2` for first collision (base slug is attempt #1)
|
||||
- This matches table in ADR-062 examples
|
||||
|
||||
**Questions**:
|
||||
1. Does "first collision" mean the first duplicate attempt (which should get `-1`)?
|
||||
2. Or does it mean the first note gets the base slug, and second note gets `-1`?
|
||||
|
||||
**Example Scenario**:
|
||||
```python
|
||||
# Two notes created at exactly 2025-12-16 14:30:52
|
||||
Note A created first: slug = ?
|
||||
Note B created second: slug = ?
|
||||
|
||||
# Is it:
|
||||
Option 1: A = 20251216143052, B = 20251216143052-1 (matches table)
|
||||
Option 2: A = 20251216143052-1, B = 20251216143052-2 (matches "first collision" text)
|
||||
```
|
||||
|
||||
**Implementation Impact**: Determines whether sequential suffix loop starts at 1 or 2.
|
||||
|
||||
**Suggested Clarification**: Confirm that first note gets base slug (no suffix), first collision gets `-1`.
|
||||
|
||||
---
|
||||
|
||||
#### Q3: Two `generate_slug()` functions with different signatures
|
||||
|
||||
**Issue**: The codebase has two `generate_slug()` functions:
|
||||
|
||||
**Current State**:
|
||||
- `starpunk/utils.py` line 173: `generate_slug(content: str, created_at: Optional[datetime]) -> str`
|
||||
- Content-based slug generation (extracts words)
|
||||
- Used by `notes.py` line 230
|
||||
- ADR-062 specifies: `generate_slug(custom_slug: str = None, created_at: datetime = None) -> str`
|
||||
- Timestamp-based, no content parameter
|
||||
|
||||
**Questions**:
|
||||
1. Should I replace the function in `utils.py`?
|
||||
2. Should I create a new function in `slug_utils.py` and leave `utils.py` as deprecated?
|
||||
3. Should both functions coexist with different names?
|
||||
4. The removal of the `content` parameter is a breaking change for `notes.py` - should `notes.py` be updated to not pass content at all?
|
||||
|
||||
**Implementation Impact**:
|
||||
- If replacing `utils.py` function: ~15 test files will need updates
|
||||
- If creating new function: need to update all callers to use new function
|
||||
- `notes.py` line 230 currently calls `generate_slug(content, created_at)` - this will break
|
||||
|
||||
**Suggested Clarification**: Specify exact function location and migration path for callers.
|
||||
|
||||
---
|
||||
|
||||
#### Q4: Reserved slug handling in timestamp context
|
||||
|
||||
**Issue**: ADR-062 doesn't address what happens if a timestamp-based slug conflicts with a reserved slug.
|
||||
|
||||
**Current State**:
|
||||
- Reserved slugs: `['api', 'admin', 'auth', 'feed', ...]` (all alphabetic)
|
||||
- Timestamp format: `20251216143052` (all numeric)
|
||||
- **These can never collide**
|
||||
|
||||
**Questions**:
|
||||
1. Should reserved slug validation be removed from the default slug path since timestamps can't conflict?
|
||||
2. Should it remain for safety/future-proofing?
|
||||
3. Custom slugs still need reserved validation - should this logic split?
|
||||
|
||||
**Implementation Impact**: May simplify code by removing unnecessary validation from timestamp path.
|
||||
|
||||
**Suggested Clarification**: Confirm whether reserved slug check is still needed for default slugs.
|
||||
|
||||
---
|
||||
|
||||
### Phase 0: Test Fixes
|
||||
|
||||
#### Q5: Test update strategy for slug format changes
|
||||
|
||||
**Issue**: Phase 1 will change default slug format from content-based to timestamp-based. Many tests currently assert content-based slugs.
|
||||
|
||||
**Current State**:
|
||||
- `tests/test_utils.py` has 20+ tests asserting content-based slug behavior (e.g., `assert slug == "hello-world-this-is-my"`)
|
||||
- Phase 1 acceptance criteria: "Update expected slug formats in test assertions"
|
||||
|
||||
**Questions**:
|
||||
1. Should these tests be updated to assert timestamp format (e.g., `assert slug.startswith("2025")`)?
|
||||
2. Should the old content-based tests be preserved but marked as testing legacy behavior?
|
||||
3. Should tests be split: "default slug generation" vs "content-based slug generation" (if we keep the old function)?
|
||||
4. Do you want all slug assertions to use pattern matching (timestamp format) or fixed timestamps in tests?
|
||||
|
||||
**Implementation Impact**:
|
||||
- Affects ~30 test assertions in `test_utils.py`
|
||||
- Affects ~10 integration tests in `test_notes.py`, `test_micropub.py`
|
||||
- May require test helper functions for deterministic timestamps
|
||||
|
||||
**Suggested Clarification**: Provide guidance on test update strategy.
|
||||
|
||||
---
|
||||
|
||||
#### Q6: Phase 0 priority - are tests blocking?
|
||||
|
||||
**Issue**: Phase 0 is marked "Must complete first - unblocks all other phases" but the 19 failing tests appear unrelated to slug changes.
|
||||
|
||||
**Current State**:
|
||||
- Failing tests: Migration performance, feed streaming, content negotiation, search security
|
||||
- None directly related to slug generation
|
||||
- Phase 1 (slugs) could theoretically proceed with these tests failing
|
||||
|
||||
**Questions**:
|
||||
1. Is Phase 0 truly blocking for Phase 1, or can Phase 1 proceed if slug-related tests pass?
|
||||
2. Should Phase 0 be completed before ANY other phase, or just before phases that depend on those specific test areas?
|
||||
3. If I discover that fixing some Phase 0 tests requires changes that conflict with Phase 1 work, what's the priority?
|
||||
|
||||
**Implementation Impact**: Affects work sequencing and potential merge conflicts.
|
||||
|
||||
**Suggested Clarification**: Confirm hard dependency of Phase 0 on all other phases, or allow parallel work.
|
||||
|
||||
---
|
||||
|
||||
### General Implementation Questions
|
||||
|
||||
#### Q7: Backwards compatibility for existing notes
|
||||
|
||||
**Issue**: ADR-062 states "Existing notes retain their slugs (no data migration)" but doesn't specify how the dual slug formats coexist.
|
||||
|
||||
**Current State**:
|
||||
- Database has notes with content-based slugs (e.g., `my-first-post`)
|
||||
- New notes will have timestamp slugs (e.g., `20251216143052`)
|
||||
- Both slug formats will exist simultaneously
|
||||
|
||||
**Questions**:
|
||||
1. Do any code paths need to detect/handle the two formats differently?
|
||||
2. Should there be a helper function `is_timestamp_slug(slug: str) -> bool` for future use?
|
||||
3. Are there any edge cases where code assumes slug format (e.g., sorting, display, URLs)?
|
||||
4. Does feed generation or UI need updates to handle mixed slug formats?
|
||||
|
||||
**Implementation Impact**: May require audit of all slug usages to ensure format-agnostic handling.
|
||||
|
||||
**Suggested Clarification**: Confirm if any code needs to be format-aware, or if slug format is truly opaque.
|
||||
|
||||
---
|
||||
|
||||
#### Q8: Phase ordering rationale
|
||||
|
||||
**Issue**: Phases 2-4 are all marked "Phase 0 complete" as dependency, but no interdependencies are noted between them.
|
||||
|
||||
**Current State**:
|
||||
- Phase 2: Debug file management
|
||||
- Phase 3: N+1 query fix (feed generation)
|
||||
- Phase 4: Atomic variant generation
|
||||
- All independent of each other
|
||||
|
||||
**Questions**:
|
||||
1. Can Phases 2-4 be implemented in parallel after Phase 0?
|
||||
2. Is there a preferred order (2→3→4) for any reason?
|
||||
3. Should each phase be committed separately or can they be batched?
|
||||
|
||||
**Implementation Impact**: Affects work planning and commit structure.
|
||||
|
||||
**Suggested Clarification**: Confirm whether Phases 2-4 can be parallelized or should be sequential.
|
||||
|
||||
---
|
||||
|
||||
## Additional Observations (Not Blocking)
|
||||
|
||||
These observations don't block implementation but may inform architectural decisions:
|
||||
|
||||
### O1: Code duplication between `utils.py` and `slug_utils.py`
|
||||
|
||||
**Observation**: Reserved slugs are defined in both:
|
||||
- `utils.py` line 24: `RESERVED_SLUGS = {"admin", "api", ...}` (5 items)
|
||||
- `slug_utils.py` line 24: `RESERVED_SLUGS = frozenset([...])` (12 items, more comprehensive)
|
||||
|
||||
**Impact**: Low - but may cause confusion. Consider consolidating in v1.5.1.
|
||||
|
||||
---
|
||||
|
||||
### O2: Timestamp format consistency
|
||||
|
||||
**Observation**: ADR-062 uses `YYYYMMDDHHMMSS` (no separator) but existing fallback in `utils.py` line 220 uses `YYYYMMDD-HHMMSS` (with hyphen).
|
||||
|
||||
**Questions**:
|
||||
- Is the hyphen removal intentional (shorter, more compact)?
|
||||
- Current: `20241118-143045` (15 chars)
|
||||
- ADR-062: `20241118143045` (14 chars)
|
||||
|
||||
**Impact**: Minor - affects slug length by 1 character.
|
||||
|
||||
---
|
||||
|
||||
### O3: Test coverage gap mentioned in Phase 5
|
||||
|
||||
**Observation**: Phase 5 mentions "MPO format handling (untested)" but this is already in BACKLOG as "High priority - Scheduled for v1.5.0".
|
||||
|
||||
**Question**: Is this intentionally listed in both BACKLOG and Phase 5, or is there duplication?
|
||||
|
||||
**Impact**: None - just a clarity question.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
**Implementation Readiness**: 6/10
|
||||
|
||||
The release plan is well-structured with clear phases and acceptance criteria. However, **Phase 1 (Timestamp-Based Slugs)** has enough ambiguity in implementation details (Questions 1-4) that I cannot confidently proceed without architect clarification.
|
||||
|
||||
**Recommended Next Steps**:
|
||||
1. Architect addresses Questions 1-4 (Phase 1 blocking issues)
|
||||
2. Architect clarifies Questions 5-8 (helps inform implementation approach)
|
||||
3. Developer proceeds with Phase 0 (test fixes) while awaiting Q1-Q4 answers
|
||||
4. Developer implements Phases 1-4 after clarifications received
|
||||
|
||||
**Estimated Time to Implementation-Ready**: 1-2 hours for architect to address questions.
|
||||
|
||||
---
|
||||
|
||||
## Approval
|
||||
|
||||
Once these questions are addressed, I am confident the implementation can proceed smoothly with the 18-29 hour estimate provided in the release plan.
|
||||
|
||||
**Developer**: StarPunk Developer Agent
|
||||
**Status**: Awaiting Architect Response
|
||||
**Next Action**: Architect to review and answer questions 1-8
|
||||
@@ -1,6 +1,6 @@
|
||||
# StarPunk Backlog
|
||||
|
||||
**Last Updated**: 2025-12-16
|
||||
**Last Updated**: 2025-12-17
|
||||
|
||||
## Recently Completed
|
||||
|
||||
@@ -75,11 +75,24 @@
|
||||
- **Source**: Architect Review (Issue 1.2.3)
|
||||
- **Approach**: Sanitize filename before use: `safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]`
|
||||
|
||||
### N+1 Query Pattern in Feed Generation *(Scheduled: v1.5.0)*
|
||||
### N+1 Query Pattern in Feed Generation *(Scheduled: v1.5.0 - Partial)*
|
||||
- **Description**: In `_get_cached_notes()`, media and tags are loaded per-note in separate queries. For 50 notes, this is 100 additional database queries, degrading performance.
|
||||
- **Location**: `starpunk/routes/public.py` lines 68-74
|
||||
- **Source**: Architect Review (Issue 2.2.9)
|
||||
- **Approach**: Implement batch loading function `get_media_for_notes(note_ids: List[int])` using a single query with `WHERE note_id IN (...)`.
|
||||
- **v1.5.0 Scope**: Only `_get_cached_notes()` will be fixed in v1.5.0. Other N+1 patterns deferred (see Low Priority section).
|
||||
|
||||
### N+1 Query Patterns - Deferred Locations
|
||||
- **Description**: N+1 query patterns exist in multiple locations beyond `_get_cached_notes()`. These are lower priority due to lower traffic or single-note contexts.
|
||||
- **Deferred Locations**:
|
||||
1. **Admin Edit Page** (`starpunk/routes/admin.py` line 186): Single note context, low impact
|
||||
2. **Tag Archive Pages** (`starpunk/routes/public.py` lines 243-248, 332): Lower traffic than main feed
|
||||
3. **Individual Note View** (`starpunk/routes/public.py` lines 282-289): Single note, negligible
|
||||
4. **Note Model Property** (`starpunk/models.py` lines 380-381): On-demand loading, acceptable
|
||||
5. **Tag Module** (`starpunk/tags.py` line 197): Internal helper, low volume
|
||||
- **Source**: Architect Review, v1.5.0 scoping decision
|
||||
- **Approach**: Future optimization if performance issues arise. Consider batch loading patterns established in v1.5.0.
|
||||
- **Priority**: Deferred to post-v1.5.0
|
||||
|
||||
### Transaction Not Atomic in Variant Generation *(Scheduled: v1.5.0)*
|
||||
- **Description**: Files are written to disk before database commit. If the database commit fails, orphaned files remain on disk.
|
||||
|
||||
@@ -1,257 +1,346 @@
|
||||
# StarPunk v1.5.0 Release
|
||||
# StarPunk v1.5.0 Release Plan
|
||||
|
||||
**Status**: Planning
|
||||
**Status**: Approved
|
||||
**Codename**: "Trigger"
|
||||
**Focus**: Cleanup, Test Coverage, and Quality of Life Improvements
|
||||
**Focus**: Stability, Test Coverage, and Technical Debt Reduction
|
||||
**Last Updated**: 2025-12-17
|
||||
|
||||
## Overview
|
||||
|
||||
This minor release focuses on technical debt reduction, improving test coverage to 90%, and addressing quality-of-life issues identified in previous release reviews. No new user-facing features are planned.
|
||||
v1.5.0 is a quality-focused release that addresses failing tests, increases test coverage to 90%, implements critical fixes from the v1.4.x review cycle, and resolves technical debt. No new user-facing features are planned.
|
||||
|
||||
## Goals
|
||||
|
||||
1. **90% Test Coverage** - Increase overall test coverage from current baseline to 90%
|
||||
2. **Address Backlog Technical Debt** - Resolve 6 specific backlog items
|
||||
3. **Improve Code Quality** - Better error handling, atomicity, and performance
|
||||
1. **Fix Failing Tests** - Resolve all 19 currently failing tests
|
||||
2. **90% Test Coverage** - Increase overall test coverage to 90% minimum
|
||||
3. **Technical Debt Reduction** - Address 6 specific backlog items
|
||||
4. **Code Quality** - Better error handling, atomicity, and performance
|
||||
|
||||
## Features
|
||||
## Phased Implementation Plan
|
||||
|
||||
### 1. Test Coverage Target: 90%
|
||||
|
||||
Increase overall test coverage to 90% minimum across all modules.
|
||||
### Phase 0: Test Fixes (Critical Path)
|
||||
**Priority**: Must complete first - unblocks all other phases
|
||||
|
||||
#### Scope
|
||||
- Identify modules with coverage below 90%
|
||||
- Write missing unit tests for uncovered code paths
|
||||
- Add integration tests for critical workflows
|
||||
- Ensure edge cases and error paths are tested
|
||||
Fix the 19 failing tests identified in the current test suite:
|
||||
|
||||
| Category | Count | Tests |
|
||||
|----------|-------|-------|
|
||||
| Migration Performance | 2 | `test_single_worker_performance`, `test_concurrent_workers_performance` |
|
||||
| Feed Route (Streaming) | 1 | `test_feed_route_streaming` |
|
||||
| Feed Endpoints | 3 | `test_feed_rss_endpoint`, `test_feed_json_endpoint`, `test_feed_xml_legacy_endpoint` |
|
||||
| Content Negotiation | 6 | `test_accept_rss`, `test_accept_json_feed`, `test_accept_json_generic`, `test_accept_wildcard`, `test_no_accept_header`, `test_quality_factor_json_wins` |
|
||||
| Backward Compatibility | 1 | `test_feed_xml_contains_rss` |
|
||||
| Search Security | 1 | `test_search_escapes_html_in_note_content` |
|
||||
|
||||
#### Approach
|
||||
1. Investigate each failing test category
|
||||
2. Determine if failure is test issue or code issue
|
||||
3. Fix appropriately (prefer fixing tests over changing working code)
|
||||
4. Document any behavioral changes
|
||||
|
||||
#### Acceptance Criteria
|
||||
- `uv run pytest --cov=starpunk` reports ≥90% overall coverage
|
||||
- No module below 85% coverage
|
||||
- All new code in this release has 100% coverage
|
||||
- [ ] All 879 tests pass
|
||||
- [ ] No test skips added (unless justified)
|
||||
- [ ] No test timeouts
|
||||
|
||||
#### Dependencies
|
||||
None - this is the first phase
|
||||
|
||||
---
|
||||
|
||||
### 2. MPO Format Test Coverage
|
||||
### Phase 1: Timestamp-Based Slugs
|
||||
**Priority**: High - Addresses user-reported issue
|
||||
|
||||
**Source**: Backlog - High Priority (Developer Review M1)
|
||||
|
||||
Add test coverage for MPO (Multi-Picture Object) format handling.
|
||||
|
||||
#### Current State
|
||||
- MPO conversion code exists at `starpunk/media.py` lines 163-173
|
||||
- Advertised in CHANGELOG but untested
|
||||
#### Scope
|
||||
Implement ADR-062 to change default slug format from content-based to timestamp-based.
|
||||
|
||||
#### Implementation
|
||||
- Add `test_mpo_detection_and_conversion()` to `TestHEICSupport` class
|
||||
- Create MPO test image using Pillow's MPO support
|
||||
- Test primary image extraction
|
||||
- Test conversion to JPEG
|
||||
1. Update `starpunk/slug_utils.py`:
|
||||
- Change `generate_slug()` to use timestamp format `YYYYMMDDHHMMSS`
|
||||
- Update collision handling to use sequential suffix (`-1`, `-2`, etc.)
|
||||
- Preserve custom slug functionality
|
||||
|
||||
2. Update `starpunk/notes.py`:
|
||||
- Remove content parameter from slug generation calls
|
||||
|
||||
3. Update tests:
|
||||
- Modify expected slug formats in test assertions
|
||||
- Add tests for new timestamp format
|
||||
- Add tests for sequential collision handling
|
||||
|
||||
#### Acceptance Criteria
|
||||
- MPO detection tested
|
||||
- MPO to JPEG conversion tested
|
||||
- Edge cases (corrupted MPO, single-frame MPO) tested
|
||||
- [ ] Default slugs use `YYYYMMDDHHMMSS` format
|
||||
- [ ] Collision handling uses `-1`, `-2` suffix
|
||||
- [ ] Custom slugs via `mp-slug` work unchanged
|
||||
- [ ] Custom slugs via web UI work unchanged
|
||||
- [ ] Existing notes unaffected
|
||||
- [ ] ADR-062 referenced in code comments
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete (all tests passing)
|
||||
|
||||
---
|
||||
|
||||
### 3. Debug File Storage Cleanup
|
||||
### Phase 2: Debug File Management
|
||||
**Priority**: Medium - Security and operations concern
|
||||
|
||||
**Source**: Backlog - Medium Priority (Developer Review M2, Architect Review 1.2.2)
|
||||
|
||||
Implement cleanup mechanism for failed upload debug files.
|
||||
|
||||
#### Current State
|
||||
- Failed uploads saved to `data/debug/` directory
|
||||
- No cleanup mechanism exists
|
||||
- Potential disk space exhaustion
|
||||
#### Scope
|
||||
Implement cleanup mechanism for failed upload debug files and add configuration controls.
|
||||
|
||||
#### Implementation
|
||||
1. Add `DEBUG_SAVE_FAILED_UPLOADS` config option (default: `false` in production)
|
||||
2. Implement automatic cleanup for files older than 7 days
|
||||
3. Add disk space check or size limit (100MB max for debug folder)
|
||||
4. Cleanup runs on application startup and periodically
|
||||
1. Add configuration options:
|
||||
```python
|
||||
DEBUG_SAVE_FAILED_UPLOADS = False # Default: disabled in production
|
||||
DEBUG_FILE_MAX_AGE_DAYS = 7 # Auto-delete threshold
|
||||
DEBUG_FILE_MAX_SIZE_MB = 100 # Maximum debug folder size
|
||||
```
|
||||
|
||||
#### Configuration
|
||||
```python
|
||||
DEBUG_SAVE_FAILED_UPLOADS = False # Enable only for debugging
|
||||
DEBUG_FILE_MAX_AGE_DAYS = 7 # Auto-delete after 7 days
|
||||
DEBUG_FILE_MAX_SIZE_MB = 100 # Maximum debug folder size
|
||||
```
|
||||
2. Implement cleanup logic in `starpunk/media.py`:
|
||||
- Check config before saving debug files
|
||||
- Implement `cleanup_old_debug_files()` function
|
||||
- Add size limit check before saving
|
||||
|
||||
3. Add startup cleanup:
|
||||
- Run cleanup on application startup
|
||||
- Log cleanup actions
|
||||
|
||||
4. Sanitize filenames:
|
||||
- Sanitize filename before debug path construction
|
||||
- Pattern: `"".join(c for c in filename if c.isalnum() or c in "._-")[:50]`
|
||||
|
||||
#### Acceptance Criteria
|
||||
- Debug files disabled by default in production
|
||||
- Old files automatically cleaned up when enabled
|
||||
- Disk space protected with size limit
|
||||
- Tests cover all cleanup scenarios
|
||||
- [ ] Debug files disabled by default
|
||||
- [ ] Files older than 7 days auto-deleted when enabled
|
||||
- [ ] Folder size limited to 100MB
|
||||
- [ ] Filenames sanitized (no path traversal)
|
||||
- [ ] Cleanup runs on startup
|
||||
- [ ] Tests cover all scenarios
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete
|
||||
|
||||
---
|
||||
|
||||
### 4. Filename Sanitization in Debug Path
|
||||
### Phase 3: N+1 Query Fix (Feed Generation)
|
||||
**Priority**: Medium - Performance improvement
|
||||
|
||||
**Source**: Backlog - Medium Priority (Architect Review 1.2.3)
|
||||
|
||||
Sanitize filenames before use in debug file paths.
|
||||
|
||||
#### Current State
|
||||
- Original filename used directly at `starpunk/media.py` line 135
|
||||
- Path traversal or special character issues possible
|
||||
#### Scope
|
||||
Fix N+1 query pattern in `_get_cached_notes()` only. Other N+1 patterns are deferred (documented in BACKLOG.md).
|
||||
|
||||
#### Implementation
|
||||
```python
|
||||
safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]
|
||||
```
|
||||
1. Create batch loading functions in `starpunk/media.py`:
|
||||
```python
|
||||
def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
|
||||
"""Batch load media for multiple notes in single query."""
|
||||
```
|
||||
|
||||
2. Create batch loading functions in `starpunk/tags.py`:
|
||||
```python
|
||||
def get_tags_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
|
||||
"""Batch load tags for multiple notes in single query."""
|
||||
```
|
||||
|
||||
3. Update `_get_cached_notes()` in `starpunk/routes/public.py`:
|
||||
- Use batch loading instead of per-note queries
|
||||
- Maintain same output format
|
||||
|
||||
#### Acceptance Criteria
|
||||
- Filenames sanitized before debug path construction
|
||||
- Path traversal attempts neutralized
|
||||
- Special characters removed
|
||||
- Filename length limited
|
||||
- Tests cover malicious filename patterns
|
||||
- [ ] Feed generation uses batch queries
|
||||
- [ ] Query count reduced from O(n) to O(1) for media/tags
|
||||
- [ ] No change to API behavior
|
||||
- [ ] Performance improvement verified in tests
|
||||
- [ ] Other N+1 locations documented in BACKLOG.md (not fixed)
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete
|
||||
|
||||
---
|
||||
|
||||
### 5. N+1 Query Pattern Fix
|
||||
### Phase 4: Atomic Variant Generation
|
||||
**Priority**: Medium - Data integrity
|
||||
|
||||
**Source**: Backlog - Medium Priority (Architect Review 2.2.9)
|
||||
|
||||
Eliminate N+1 query pattern in feed generation.
|
||||
|
||||
#### Current State
|
||||
- `_get_cached_notes()` loads media and tags per-note
|
||||
- For 50 notes: 100 additional database queries
|
||||
- Performance degrades with note count
|
||||
#### Scope
|
||||
Make variant file generation atomic with database commits to prevent orphaned files.
|
||||
|
||||
#### Implementation
|
||||
1. Create `get_media_for_notes(note_ids: List[int])` batch function
|
||||
2. Create `get_tags_for_notes(note_ids: List[int])` batch function
|
||||
3. Use single query with `WHERE note_id IN (...)`
|
||||
4. Update `_get_cached_notes()` to use batch loading
|
||||
1. Modify `generate_all_variants()` in `starpunk/media.py`:
|
||||
- Write variants to temporary directory first
|
||||
- Perform database inserts in transaction
|
||||
- Move files to final location after successful commit
|
||||
- Clean up temp files on any failure
|
||||
|
||||
#### Example
|
||||
```python
|
||||
def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[dict]]:
|
||||
"""Batch load media for multiple notes."""
|
||||
# Single query: SELECT * FROM note_media WHERE note_id IN (?)
|
||||
# Returns: {note_id: [media_list]}
|
||||
```
|
||||
|
||||
#### Acceptance Criteria
|
||||
- Feed generation uses batch queries
|
||||
- Query count reduced from O(n) to O(1) for media/tags
|
||||
- Performance improvement measurable in tests
|
||||
- No change to API behavior
|
||||
|
||||
---
|
||||
|
||||
### 6. Atomic Variant Generation
|
||||
|
||||
**Source**: Backlog - Medium Priority (Architect Review 2.2.6)
|
||||
|
||||
Make variant file generation atomic with database commits.
|
||||
|
||||
#### Current State
|
||||
- Files written to disk before database commit
|
||||
- Database commit failure leaves orphaned files
|
||||
|
||||
#### Implementation
|
||||
1. Write variant files to temporary location first
|
||||
2. Database insert in transaction
|
||||
3. Move files to final location after successful commit
|
||||
4. Cleanup temp files on any failure
|
||||
2. Add startup recovery:
|
||||
- Detect orphaned variant files on startup
|
||||
- Log warnings for orphans found
|
||||
- Optionally clean up orphans
|
||||
|
||||
#### Flow
|
||||
```
|
||||
1. Generate variants to temp directory
|
||||
1. Generate variants to temp directory (data/media/.tmp/)
|
||||
2. BEGIN TRANSACTION
|
||||
3. INSERT media record
|
||||
4. INSERT variant records
|
||||
5. COMMIT
|
||||
6. Move files from temp to final location
|
||||
7. On failure: rollback DB, delete temp files
|
||||
7. On failure: ROLLBACK, delete temp files
|
||||
```
|
||||
|
||||
#### Acceptance Criteria
|
||||
- No orphaned files on database failures
|
||||
- No orphaned database records on file failures
|
||||
- Atomic operation for all media saves
|
||||
- Tests simulate failure scenarios
|
||||
- [ ] No orphaned files on database failures
|
||||
- [ ] No orphaned DB records on file failures
|
||||
- [ ] Atomic operation for all media saves
|
||||
- [ ] Startup recovery detects orphans
|
||||
- [ ] Tests simulate failure scenarios
|
||||
|
||||
#### Dependencies
|
||||
- Phase 0 complete
|
||||
|
||||
---
|
||||
|
||||
### 7. Default Slug Format Change
|
||||
### Phase 5: Test Coverage Expansion
|
||||
**Priority**: Medium - Quality assurance
|
||||
|
||||
**Source**: Backlog - Medium Priority
|
||||
#### Scope
|
||||
Increase overall test coverage to 90% minimum.
|
||||
|
||||
Change default slug format from content-based to timestamp-based.
|
||||
#### Approach
|
||||
1. Run coverage report: `uv run pytest --cov=starpunk --cov-report=html`
|
||||
2. Identify modules below 90% coverage
|
||||
3. Prioritize based on risk and complexity
|
||||
4. Write tests for uncovered paths
|
||||
|
||||
#### Current State
|
||||
- Slugs generated from note content
|
||||
- Can produce unwanted slugs from private content
|
||||
#### Known Coverage Gaps (to verify)
|
||||
- MPO format handling (untested)
|
||||
- Edge cases in error paths
|
||||
- Configuration validation paths
|
||||
- Startup/shutdown hooks
|
||||
|
||||
#### New Format
|
||||
- Default: `YYYYMMDDHHMMSS` (e.g., `20251216143052`)
|
||||
- Duplicate handling: append `-1`, `-2`, etc.
|
||||
- Custom slugs still supported via `mp-slug`
|
||||
#### Specific Test Additions
|
||||
1. **MPO Format Tests** (`tests/test_media_upload.py`):
|
||||
- `test_mpo_detection_and_conversion()`
|
||||
- `test_mpo_corrupted_file()`
|
||||
- `test_mpo_single_frame()`
|
||||
|
||||
#### Implementation
|
||||
1. Update `generate_slug()` function in `starpunk/slug_utils.py`
|
||||
2. Generate timestamp-based slug by default
|
||||
3. Check for duplicates and append suffix if needed
|
||||
4. Preserve custom slug functionality
|
||||
2. **Debug File Tests** (new test file or extend `test_media_upload.py`):
|
||||
- `test_debug_file_disabled_by_default()`
|
||||
- `test_debug_file_cleanup_old_files()`
|
||||
- `test_debug_file_size_limit()`
|
||||
- `test_debug_filename_sanitization()`
|
||||
|
||||
#### Example
|
||||
```python
|
||||
def generate_slug(content: str = None, custom_slug: str = None) -> str:
|
||||
if custom_slug:
|
||||
return sanitize_slug(custom_slug)
|
||||
# Default: timestamp-based
|
||||
timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
|
||||
return ensure_unique_slug(timestamp)
|
||||
```
|
||||
3. **Batch Loading Tests**:
|
||||
- `test_get_media_for_notes_batch()`
|
||||
- `test_get_tags_for_notes_batch()`
|
||||
- `test_batch_with_empty_list()`
|
||||
|
||||
#### Acceptance Criteria
|
||||
- New notes use timestamp slugs by default
|
||||
- Duplicate timestamps handled with suffix
|
||||
- Custom `mp-slug` still works
|
||||
- Existing notes unchanged
|
||||
- Tests cover all slug generation scenarios
|
||||
- [ ] Overall coverage >= 90%
|
||||
- [ ] No module below 85% coverage
|
||||
- [ ] All new code in v1.5.0 has 100% coverage
|
||||
- [ ] MPO handling fully tested
|
||||
|
||||
#### Dependencies
|
||||
- Phases 1-4 complete (tests cover new functionality)
|
||||
|
||||
---
|
||||
|
||||
## Out of Scope
|
||||
|
||||
- New user-facing features
|
||||
- UI changes
|
||||
- New feed formats
|
||||
- Micropub extensions
|
||||
- Database schema changes (except for bug fixes)
|
||||
Items explicitly excluded from v1.5.0:
|
||||
|
||||
## Dependencies
|
||||
| Item | Reason |
|
||||
|------|--------|
|
||||
| Rate limiting | Handled by reverse proxy (Caddy/Nginx) |
|
||||
| Schema changes | Not needed for v1.5.0 fixes |
|
||||
| New user features | Quality-focused release |
|
||||
| N+1 fixes in admin/search | Low traffic, deferred to BACKLOG |
|
||||
| UI changes | No frontend work planned |
|
||||
|
||||
- v1.4.x complete
|
||||
- No new external dependencies expected
|
||||
---
|
||||
|
||||
## Recommendation: Single Release vs. Multiple Patches
|
||||
|
||||
**Recommendation: Single v1.5.0 Release**
|
||||
|
||||
### Rationale
|
||||
|
||||
1. **Phase Dependencies**: Most phases depend on Phase 0 (test fixes). Splitting would create artificial release boundaries.
|
||||
|
||||
2. **Cognitive Overhead**: Multiple patch releases (1.5.1, 1.5.2, etc.) require:
|
||||
- Multiple changelog entries
|
||||
- Multiple version bumps
|
||||
- Multiple release notes
|
||||
- More git tags/branches
|
||||
|
||||
3. **Test Coverage Integration**: Test coverage work (Phase 5) tests functionality from Phases 1-4. Separating them creates incomplete test coverage.
|
||||
|
||||
4. **User Experience**: Users prefer fewer, more significant updates over many small patches.
|
||||
|
||||
5. **Scope Alignment**: All v1.5.0 work is internally focused (no external API changes). Users see one "quality improvement" release.
|
||||
|
||||
### Exception
|
||||
|
||||
If Phase 0 (test fixes) reveals critical bugs affecting production, those fixes should be:
|
||||
- Backported to a v1.4.3 patch release on the current branch
|
||||
- Then merged forward to v1.5.0
|
||||
|
||||
### Alternative Considered
|
||||
|
||||
Splitting into:
|
||||
- v1.5.0: Phase 0 (test fixes) + Phase 1 (slugs)
|
||||
- v1.5.1: Phase 2-4 (technical debt)
|
||||
- v1.5.2: Phase 5 (test coverage)
|
||||
|
||||
**Rejected** because test coverage work must test the new functionality, making separation counterproductive.
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
1. ✅ Test coverage ≥90% overall
|
||||
2. ✅ MPO format has test coverage
|
||||
3. ✅ Debug file cleanup implemented and tested
|
||||
4. ✅ Filename sanitization implemented and tested
|
||||
5. ✅ N+1 query pattern eliminated
|
||||
6. ✅ Variant generation is atomic
|
||||
7. ✅ Default slugs are timestamp-based
|
||||
8. ✅ All existing tests continue to pass
|
||||
9. ✅ No regressions in functionality
|
||||
| # | Criterion | Verification |
|
||||
|---|-----------|--------------|
|
||||
| 1 | All tests pass | `uv run pytest` shows 0 failures |
|
||||
| 2 | Coverage >= 90% | `uv run pytest --cov=starpunk` |
|
||||
| 3 | MPO tested | MPO tests in test suite |
|
||||
| 4 | Debug cleanup works | Manual verification + tests |
|
||||
| 5 | N+1 fixed in feed | Performance tests show improvement |
|
||||
| 6 | Variants atomic | Failure simulation tests pass |
|
||||
| 7 | Slugs timestamp-based | New notes use `YYYYMMDDHHMMSS` format |
|
||||
| 8 | No regressions | Full test suite passes |
|
||||
| 9 | ADRs documented | ADR-062 in `/docs/decisions/` |
|
||||
|
||||
## Related Backlog Items
|
||||
---
|
||||
|
||||
After completion, the following backlog items should be marked complete or moved to "Recently Completed":
|
||||
## Related Documentation
|
||||
|
||||
- MPO Format Test Coverage (High)
|
||||
- Debug File Storage Without Cleanup Mechanism (Medium)
|
||||
- Filename Not Sanitized in Debug Path (Medium)
|
||||
- N+1 Query Pattern in Feed Generation (Medium)
|
||||
- Transaction Not Atomic in Variant Generation (Medium)
|
||||
- Default Slug Change (Medium)
|
||||
- ADR-062: Timestamp-Based Slug Format (supersedes ADR-007)
|
||||
- ADR-007: Slug Generation Algorithm (superseded)
|
||||
- BACKLOG.md: Deferred N+1 query locations documented
|
||||
- v1.4.2 Architect Review: Source of many v1.5.0 items
|
||||
|
||||
---
|
||||
|
||||
## Implementation Timeline
|
||||
|
||||
| Phase | Estimated Effort | Dependencies |
|
||||
|-------|------------------|--------------|
|
||||
| Phase 0: Test Fixes | 2-4 hours | None |
|
||||
| Phase 1: Timestamp Slugs | 2-3 hours | Phase 0 |
|
||||
| Phase 2: Debug Files | 3-4 hours | Phase 0 |
|
||||
| Phase 3: N+1 Fix | 3-4 hours | Phase 0 |
|
||||
| Phase 4: Atomic Variants | 4-6 hours | Phase 0 |
|
||||
| Phase 5: Coverage | 4-8 hours | Phases 1-4 |
|
||||
|
||||
**Total Estimated**: 18-29 hours
|
||||
|
||||
---
|
||||
|
||||
## Post-Release
|
||||
|
||||
After v1.5.0 ships, update BACKLOG.md to move completed items to "Recently Completed" section:
|
||||
- MPO Format Test Coverage
|
||||
- Debug File Storage Cleanup
|
||||
- Filename Sanitization
|
||||
- N+1 Query Fix (Feed Generation - partial)
|
||||
- Atomic Variant Generation
|
||||
- Default Slug Change
|
||||
|
||||
Reference in New Issue
Block a user