310 lines
8.3 KiB
Markdown
310 lines
8.3 KiB
Markdown
# Phase 1.1 Quick Reference: Core Utilities
|
|
|
|
## Quick Start
|
|
|
|
**File**: `starpunk/utils.py`
|
|
**Tests**: `tests/test_utils.py`
|
|
**Estimated Time**: 2-3 hours
|
|
|
|
## Implementation Order
|
|
|
|
1. Constants and imports
|
|
2. Helper functions (extract_first_words, normalize_slug_text, generate_random_suffix)
|
|
3. Slug functions (generate_slug, make_slug_unique, validate_slug)
|
|
4. Content hashing (calculate_content_hash)
|
|
5. Path functions (generate_note_path, ensure_note_directory, validate_note_path)
|
|
6. File operations (write_note_file, read_note_file, delete_note_file)
|
|
7. Date/time functions (format_rfc822, format_iso8601, parse_iso8601)
|
|
|
|
## Function Checklist
|
|
|
|
### Slug Generation (3 functions)
|
|
- [ ] `generate_slug(content: str, created_at: Optional[datetime] = None) -> str`
|
|
- [ ] `make_slug_unique(base_slug: str, existing_slugs: Set[str]) -> str`
|
|
- [ ] `validate_slug(slug: str) -> bool`
|
|
|
|
### Content Hashing (1 function)
|
|
- [ ] `calculate_content_hash(content: str) -> str`
|
|
|
|
### Path Operations (3 functions)
|
|
- [ ] `generate_note_path(slug: str, created_at: datetime, data_dir: Path) -> Path`
|
|
- [ ] `ensure_note_directory(note_path: Path) -> Path`
|
|
- [ ] `validate_note_path(file_path: Path, data_dir: Path) -> bool`
|
|
|
|
### File Operations (3 functions)
|
|
- [ ] `write_note_file(file_path: Path, content: str) -> None`
|
|
- [ ] `read_note_file(file_path: Path) -> str`
|
|
- [ ] `delete_note_file(file_path: Path, soft: bool = False, data_dir: Optional[Path] = None) -> None`
|
|
|
|
### Date/Time (3 functions)
|
|
- [ ] `format_rfc822(dt: datetime) -> str`
|
|
- [ ] `format_iso8601(dt: datetime) -> str`
|
|
- [ ] `parse_iso8601(date_string: str) -> datetime`
|
|
|
|
### Helper Functions (3 functions)
|
|
- [ ] `extract_first_words(text: str, max_words: int = 5) -> str`
|
|
- [ ] `normalize_slug_text(text: str) -> str`
|
|
- [ ] `generate_random_suffix(length: int = 4) -> str`
|
|
|
|
**Total**: 16 functions
|
|
|
|
## Constants Required
|
|
|
|
```python
|
|
# Slug configuration
|
|
MAX_SLUG_LENGTH = 100
|
|
MIN_SLUG_LENGTH = 1
|
|
SLUG_WORDS_COUNT = 5
|
|
RANDOM_SUFFIX_LENGTH = 4
|
|
|
|
# File operations
|
|
TEMP_FILE_SUFFIX = '.tmp'
|
|
TRASH_DIR_NAME = '.trash'
|
|
|
|
# Hashing
|
|
CONTENT_HASH_ALGORITHM = 'sha256'
|
|
|
|
# Regex patterns
|
|
SLUG_PATTERN = re.compile(r'^[a-z0-9]+(?:-[a-z0-9]+)*$')
|
|
SAFE_SLUG_PATTERN = re.compile(r'[^a-z0-9-]')
|
|
MULTIPLE_HYPHENS_PATTERN = re.compile(r'-+')
|
|
|
|
# Character set
|
|
RANDOM_CHARS = 'abcdefghijklmnopqrstuvwxyz0123456789'
|
|
```
|
|
|
|
## Key Algorithms
|
|
|
|
### Slug Generation Algorithm
|
|
|
|
```
|
|
1. Extract first 5 words from content
|
|
2. Convert to lowercase
|
|
3. Replace spaces with hyphens
|
|
4. Remove all characters except a-z, 0-9, hyphens
|
|
5. Collapse multiple hyphens to single hyphen
|
|
6. Strip leading/trailing hyphens
|
|
7. Truncate to 100 characters
|
|
8. If empty or too short → timestamp fallback (YYYYMMDD-HHMMSS)
|
|
9. Return slug
|
|
```
|
|
|
|
### Atomic File Write Algorithm
|
|
|
|
```
|
|
1. Create temp file path: file_path.with_suffix('.tmp')
|
|
2. Write content to temp file
|
|
3. Atomically rename temp to final path
|
|
4. On error: delete temp file, re-raise exception
|
|
```
|
|
|
|
### Path Validation Algorithm
|
|
|
|
```
|
|
1. Resolve both paths to absolute
|
|
2. Check if file_path.is_relative_to(data_dir)
|
|
3. Return boolean
|
|
```
|
|
|
|
## Test Coverage Requirements
|
|
|
|
- Minimum 90% code coverage
|
|
- Test all functions
|
|
- Test edge cases (empty, whitespace, unicode, special chars)
|
|
- Test error cases (invalid input, file errors)
|
|
- Test security (path traversal)
|
|
|
|
## Example Test Structure
|
|
|
|
```python
|
|
class TestSlugGeneration:
|
|
def test_generate_slug_from_content(self): pass
|
|
def test_generate_slug_empty_content(self): pass
|
|
def test_generate_slug_special_characters(self): pass
|
|
def test_make_slug_unique_no_collision(self): pass
|
|
def test_make_slug_unique_with_collision(self): pass
|
|
def test_validate_slug_valid(self): pass
|
|
def test_validate_slug_invalid(self): pass
|
|
|
|
class TestContentHashing:
|
|
def test_calculate_content_hash_consistency(self): pass
|
|
def test_calculate_content_hash_different(self): pass
|
|
def test_calculate_content_hash_empty(self): pass
|
|
|
|
class TestFilePathOperations:
|
|
def test_generate_note_path(self): pass
|
|
def test_validate_note_path_safe(self): pass
|
|
def test_validate_note_path_traversal(self): pass
|
|
|
|
class TestAtomicFileOperations:
|
|
def test_write_and_read_note_file(self): pass
|
|
def test_write_note_file_atomic(self): pass
|
|
def test_delete_note_file_hard(self): pass
|
|
def test_delete_note_file_soft(self): pass
|
|
|
|
class TestDateTimeFormatting:
|
|
def test_format_rfc822(self): pass
|
|
def test_format_iso8601(self): pass
|
|
def test_parse_iso8601(self): pass
|
|
```
|
|
|
|
## Common Pitfalls to Avoid
|
|
|
|
1. **Don't use `random` module** → Use `secrets` for security
|
|
2. **Don't forget path validation** → Always validate before file operations
|
|
3. **Don't use magic numbers** → Define as constants
|
|
4. **Don't skip temp file cleanup** → Use try/finally
|
|
5. **Don't use bare `except:`** → Catch specific exceptions
|
|
6. **Don't forget type hints** → All functions need type hints
|
|
7. **Don't skip docstrings** → All functions need docstrings with examples
|
|
8. **Don't forget edge cases** → Test empty, whitespace, unicode, special chars
|
|
|
|
## Security Checklist
|
|
|
|
- [ ] Path validation prevents directory traversal
|
|
- [ ] Use `secrets` module for random generation
|
|
- [ ] Validate all external input
|
|
- [ ] Use atomic file writes
|
|
- [ ] Handle symlinks correctly (resolve paths)
|
|
- [ ] No hardcoded credentials or paths
|
|
- [ ] Error messages don't leak sensitive info
|
|
|
|
## Performance Targets
|
|
|
|
- Slug generation: < 1ms
|
|
- File write: < 10ms
|
|
- File read: < 5ms
|
|
- Path validation: < 1ms
|
|
- Hash calculation: < 5ms for 10KB content
|
|
|
|
## Module Structure Template
|
|
|
|
```python
|
|
"""
|
|
Core utility functions for StarPunk
|
|
|
|
This module provides essential utilities for slug generation, file operations,
|
|
hashing, and date/time handling.
|
|
"""
|
|
|
|
# Standard library
|
|
import hashlib
|
|
import re
|
|
import secrets
|
|
from datetime import datetime
|
|
from pathlib import Path
|
|
from typing import Optional
|
|
|
|
# Third-party
|
|
# (none for utils.py)
|
|
|
|
# Constants
|
|
MAX_SLUG_LENGTH = 100
|
|
# ... more constants
|
|
|
|
# Helper functions
|
|
def extract_first_words(text: str, max_words: int = 5) -> str:
|
|
"""Extract first N words from text."""
|
|
pass
|
|
|
|
# ... more helpers
|
|
|
|
# Slug functions
|
|
def generate_slug(content: str, created_at: Optional[datetime] = None) -> str:
|
|
"""Generate URL-safe slug from content."""
|
|
pass
|
|
|
|
# ... more slug functions
|
|
|
|
# Content hashing
|
|
def calculate_content_hash(content: str) -> str:
|
|
"""Calculate SHA-256 hash of content."""
|
|
pass
|
|
|
|
# Path operations
|
|
def generate_note_path(slug: str, created_at: datetime, data_dir: Path) -> Path:
|
|
"""Generate file path for note."""
|
|
pass
|
|
|
|
# ... more path functions
|
|
|
|
# File operations
|
|
def write_note_file(file_path: Path, content: str) -> None:
|
|
"""Write note content to file atomically."""
|
|
pass
|
|
|
|
# ... more file functions
|
|
|
|
# Date/time functions
|
|
def format_rfc822(dt: datetime) -> str:
|
|
"""Format datetime as RFC-822 string."""
|
|
pass
|
|
|
|
# ... more date/time functions
|
|
```
|
|
|
|
## Verification Checklist
|
|
|
|
Before marking Phase 1.1 complete:
|
|
|
|
- [ ] All 16 functions implemented
|
|
- [ ] All functions have type hints
|
|
- [ ] All functions have docstrings with examples
|
|
- [ ] All constants defined
|
|
- [ ] Test file created with >90% coverage
|
|
- [ ] All tests pass
|
|
- [ ] Code formatted with Black
|
|
- [ ] Code passes flake8
|
|
- [ ] No security issues
|
|
- [ ] No hardcoded values
|
|
- [ ] Error messages are clear
|
|
- [ ] Performance targets met
|
|
|
|
## Next Steps After Implementation
|
|
|
|
Once `starpunk/utils.py` is complete:
|
|
|
|
1. Move to Phase 1.2: Data Models (`starpunk/models.py`)
|
|
2. Models will import and use these utilities
|
|
3. Integration tests will verify utilities work with models
|
|
|
|
## References
|
|
|
|
- Full design: `/home/phil/Projects/starpunk/docs/design/phase-1.1-core-utilities.md`
|
|
- ADR-007: Slug generation algorithm
|
|
- Python coding standards
|
|
- Utility function patterns
|
|
|
|
## Quick Command Reference
|
|
|
|
```bash
|
|
# Run tests
|
|
pytest tests/test_utils.py -v
|
|
|
|
# Run tests with coverage
|
|
pytest tests/test_utils.py --cov=starpunk.utils --cov-report=term-missing
|
|
|
|
# Format code
|
|
black starpunk/utils.py tests/test_utils.py
|
|
|
|
# Lint code
|
|
flake8 starpunk/utils.py tests/test_utils.py
|
|
|
|
# Type check (optional)
|
|
mypy starpunk/utils.py
|
|
```
|
|
|
|
## Estimated Time Breakdown
|
|
|
|
- Constants and imports: 10 minutes
|
|
- Helper functions: 20 minutes
|
|
- Slug functions: 30 minutes
|
|
- Content hashing: 10 minutes
|
|
- Path functions: 25 minutes
|
|
- File operations: 35 minutes
|
|
- Date/time functions: 15 minutes
|
|
- Tests: 60-90 minutes
|
|
- Documentation review: 15 minutes
|
|
|
|
**Total**: 2-3 hours
|