that initial commit

2025-11-18 19:21:31 -07:00
commit a68fd570c7
69 changed files with 31070 additions and 0 deletions

1017
docs/design/initial-files.md Normal file


@@ -0,0 +1,309 @@
# Phase 1.1 Quick Reference: Core Utilities
## Quick Start
**File**: `starpunk/utils.py`
**Tests**: `tests/test_utils.py`
**Estimated Time**: 3.5-4 hours
## Implementation Order
1. Constants and imports
2. Helper functions (extract_first_words, normalize_slug_text, generate_random_suffix)
3. Slug functions (generate_slug, make_slug_unique, validate_slug)
4. Content hashing (calculate_content_hash)
5. Path functions (generate_note_path, ensure_note_directory, validate_note_path)
6. File operations (write_note_file, read_note_file, delete_note_file)
7. Date/time functions (format_rfc822, format_iso8601, parse_iso8601)
## Function Checklist
### Slug Generation (3 functions)
- [ ] `generate_slug(content: str, created_at: Optional[datetime] = None) -> str`
- [ ] `make_slug_unique(base_slug: str, existing_slugs: Set[str]) -> str`
- [ ] `validate_slug(slug: str) -> bool`
### Content Hashing (1 function)
- [ ] `calculate_content_hash(content: str) -> str`
### Path Operations (3 functions)
- [ ] `generate_note_path(slug: str, created_at: datetime, data_dir: Path) -> Path`
- [ ] `ensure_note_directory(note_path: Path) -> Path`
- [ ] `validate_note_path(file_path: Path, data_dir: Path) -> bool`
### File Operations (3 functions)
- [ ] `write_note_file(file_path: Path, content: str) -> None`
- [ ] `read_note_file(file_path: Path) -> str`
- [ ] `delete_note_file(file_path: Path, soft: bool = False, data_dir: Optional[Path] = None) -> None`
### Date/Time (3 functions)
- [ ] `format_rfc822(dt: datetime) -> str`
- [ ] `format_iso8601(dt: datetime) -> str`
- [ ] `parse_iso8601(date_string: str) -> datetime`
### Helper Functions (3 functions)
- [ ] `extract_first_words(text: str, max_words: int = 5) -> str`
- [ ] `normalize_slug_text(text: str) -> str`
- [ ] `generate_random_suffix(length: int = 4) -> str`
**Total**: 16 functions
## Constants Required
```python
# Slug configuration
MAX_SLUG_LENGTH = 100
MIN_SLUG_LENGTH = 1
SLUG_WORDS_COUNT = 5
RANDOM_SUFFIX_LENGTH = 4
# File operations
TEMP_FILE_SUFFIX = '.tmp'
TRASH_DIR_NAME = '.trash'
# Hashing
CONTENT_HASH_ALGORITHM = 'sha256'
# Regex patterns
SLUG_PATTERN = re.compile(r'^[a-z0-9]+(?:-[a-z0-9]+)*$')
SAFE_SLUG_PATTERN = re.compile(r'[^a-z0-9-]')
MULTIPLE_HYPHENS_PATTERN = re.compile(r'-+')
# Character set
RANDOM_CHARS = 'abcdefghijklmnopqrstuvwxyz0123456789'
```
## Key Algorithms
### Slug Generation Algorithm
```
1. Extract first 5 words from content
2. Convert to lowercase
3. Replace spaces with hyphens
4. Remove all characters except a-z, 0-9, hyphens
5. Collapse multiple hyphens to single hyphen
6. Strip leading/trailing hyphens
7. Truncate to 100 characters
8. If empty or too short → timestamp fallback (YYYYMMDD-HHMMSS)
9. Return slug
```
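The nine steps above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the constants mirror the ones listed under "Constants Required", the fallback uses the current UTC time when `created_at` is not given (an assumption), and the extra `rstrip("-")` after truncation is an added safeguard that the algorithm does not mention.

```python
import re
from datetime import datetime, timezone
from typing import Optional

MAX_SLUG_LENGTH = 100
MIN_SLUG_LENGTH = 1
SLUG_WORDS_COUNT = 5
SAFE_SLUG_PATTERN = re.compile(r"[^a-z0-9-]")
MULTIPLE_HYPHENS_PATTERN = re.compile(r"-+")


def generate_slug(content: str, created_at: Optional[datetime] = None) -> str:
    """Sketch of the slug algorithm; step numbers refer to the list above."""
    words = content.split()[:SLUG_WORDS_COUNT]        # step 1: first 5 words
    slug = "-".join(words).lower()                    # steps 2-3: lowercase, hyphenate
    slug = SAFE_SLUG_PATTERN.sub("", slug)            # step 4: drop unsafe characters
    slug = MULTIPLE_HYPHENS_PATTERN.sub("-", slug)    # step 5: collapse hyphen runs
    slug = slug.strip("-")                            # step 6: trim edge hyphens
    # Step 7: truncate; the rstrip is an added safeguard (assumption) so a cut
    # that lands right after a hyphen does not leave a trailing one.
    slug = slug[:MAX_SLUG_LENGTH].rstrip("-")
    if len(slug) < MIN_SLUG_LENGTH:                   # step 8: timestamp fallback
        moment = created_at or datetime.now(timezone.utc)
        slug = moment.strftime("%Y%m%d-%H%M%S")
    return slug                                       # step 9
```

Because step 4 removes every character outside `a-z0-9-`, non-ASCII letters are dropped rather than transliterated, which is why content made only of punctuation or non-Latin text ends up on the timestamp fallback.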
### Atomic File Write Algorithm
```
1. Create temp file path: file_path.with_suffix('.tmp')
2. Write content to temp file
3. Atomically rename temp to final path
4. On error: delete temp file, re-raise exception
```
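A minimal sketch of the four steps above, assuming `os.replace` is an acceptable way to perform the atomic rename (it is atomic only when the temp file and the target live on the same filesystem). Note that `with_suffix` swaps the existing extension, so `note.md` is staged as `note.tmp`.

```python
import os
from pathlib import Path

TEMP_FILE_SUFFIX = ".tmp"


def write_note_file(file_path: Path, content: str) -> None:
    """Write content to file_path via a temp file and an atomic rename."""
    temp_path = file_path.with_suffix(TEMP_FILE_SUFFIX)    # step 1
    try:
        temp_path.write_text(content, encoding="utf-8")    # step 2
        os.replace(temp_path, file_path)                   # step 3
    except Exception:
        # Step 4: remove the partial temp file, then re-raise the original error.
        temp_path.unlink(missing_ok=True)
        raise
```

Readers of `file_path` see either the old content or the new content, never a half-written file, because the rename is the only operation that touches the final path.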
### Path Validation Algorithm
```
1. Resolve both paths to absolute
2. Check if file_path.is_relative_to(data_dir)
3. Return boolean
```
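The check above might look like this. It is a sketch that assumes Python 3.9 or newer, where `Path.is_relative_to` is available; resolving both paths first is what neutralizes `..` segments and symlinks.

```python
from pathlib import Path


def validate_note_path(file_path: Path, data_dir: Path) -> bool:
    """Return True only if file_path resolves to a location inside data_dir."""
    resolved_file = file_path.resolve()   # step 1: absolute, '..' and symlinks resolved
    resolved_root = data_dir.resolve()
    return resolved_file.is_relative_to(resolved_root)   # steps 2-3
```

Comparing the unresolved paths would accept something like `data_dir / ".." / "secrets.md"`, since it textually starts with `data_dir`.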
## Test Coverage Requirements
- Minimum 90% code coverage
- Test all functions
- Test edge cases (empty, whitespace, unicode, special chars)
- Test error cases (invalid input, file errors)
- Test security (path traversal)
## Example Test Structure
```python
class TestSlugGeneration:
    def test_generate_slug_from_content(self): pass
    def test_generate_slug_empty_content(self): pass
    def test_generate_slug_special_characters(self): pass
    def test_make_slug_unique_no_collision(self): pass
    def test_make_slug_unique_with_collision(self): pass
    def test_validate_slug_valid(self): pass
    def test_validate_slug_invalid(self): pass

class TestContentHashing:
    def test_calculate_content_hash_consistency(self): pass
    def test_calculate_content_hash_different(self): pass
    def test_calculate_content_hash_empty(self): pass

class TestFilePathOperations:
    def test_generate_note_path(self): pass
    def test_validate_note_path_safe(self): pass
    def test_validate_note_path_traversal(self): pass

class TestAtomicFileOperations:
    def test_write_and_read_note_file(self): pass
    def test_write_note_file_atomic(self): pass
    def test_delete_note_file_hard(self): pass
    def test_delete_note_file_soft(self): pass

class TestDateTimeFormatting:
    def test_format_rfc822(self): pass
    def test_format_iso8601(self): pass
    def test_parse_iso8601(self): pass
```
## Common Pitfalls to Avoid
1. **Don't use `random` module** → Use `secrets` for security
2. **Don't forget path validation** → Always validate before file operations
3. **Don't use magic numbers** → Define as constants
4. **Don't skip temp file cleanup** → Use try/finally
5. **Don't use bare `except:`** → Catch specific exceptions
6. **Don't forget type hints** → All functions need type hints
7. **Don't skip docstrings** → All functions need docstrings with examples
8. **Don't forget edge cases** → Test empty, whitespace, unicode, special chars
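
Pitfalls 1 and 3 can be illustrated together. The sketch below assumes that `make_slug_unique` resolves collisions by appending a hyphen plus a random suffix and retrying until the candidate is free; the retry loop is an assumption, not something this reference specifies.

```python
import secrets
from typing import Set

RANDOM_CHARS = "abcdefghijklmnopqrstuvwxyz0123456789"
RANDOM_SUFFIX_LENGTH = 4


def generate_random_suffix(length: int = RANDOM_SUFFIX_LENGTH) -> str:
    """Build a suffix with the secrets module instead of random."""
    return "".join(secrets.choice(RANDOM_CHARS) for _ in range(length))


def make_slug_unique(base_slug: str, existing_slugs: Set[str]) -> str:
    """Return base_slug unchanged, or with a random suffix on collision."""
    if base_slug not in existing_slugs:
        return base_slug
    while True:  # assumed retry strategy: keep drawing until the slug is free
        candidate = f"{base_slug}-{generate_random_suffix()}"
        if candidate not in existing_slugs:
            return candidate
```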
## Security Checklist
- [ ] Path validation prevents directory traversal
- [ ] Use `secrets` module for random generation
- [ ] Validate all external input
- [ ] Use atomic file writes
- [ ] Handle symlinks correctly (resolve paths)
- [ ] No hardcoded credentials or paths
- [ ] Error messages don't leak sensitive info
## Performance Targets
- Slug generation: < 1ms
- File write: < 10ms
- File read: < 5ms
- Path validation: < 1ms
- Hash calculation: < 5ms for 10KB content
## Module Structure Template
```python
"""
Core utility functions for StarPunk

This module provides essential utilities for slug generation, file operations,
hashing, and date/time handling.
"""
# Standard library
import hashlib
import re
import secrets
from datetime import datetime
from pathlib import Path
from typing import Optional, Set  # Set is needed by make_slug_unique

# Third-party
# (none for utils.py)

# Constants
MAX_SLUG_LENGTH = 100
# ... more constants

# Helper functions
def extract_first_words(text: str, max_words: int = 5) -> str:
    """Extract first N words from text."""
    pass
# ... more helpers

# Slug functions
def generate_slug(content: str, created_at: Optional[datetime] = None) -> str:
    """Generate URL-safe slug from content."""
    pass
# ... more slug functions

# Content hashing
def calculate_content_hash(content: str) -> str:
    """Calculate SHA-256 hash of content."""
    pass

# Path operations
def generate_note_path(slug: str, created_at: datetime, data_dir: Path) -> Path:
    """Generate file path for note."""
    pass
# ... more path functions

# File operations
def write_note_file(file_path: Path, content: str) -> None:
    """Write note content to file atomically."""
    pass
# ... more file functions

# Date/time functions
def format_rfc822(dt: datetime) -> str:
    """Format datetime as RFC-822 string."""
    pass
# ... more date/time functions
```
## Verification Checklist
Before marking Phase 1.1 complete:
- [ ] All 16 functions implemented
- [ ] All functions have type hints
- [ ] All functions have docstrings with examples
- [ ] All constants defined
- [ ] Test file created with >90% coverage
- [ ] All tests pass
- [ ] Code formatted with Black
- [ ] Code passes flake8
- [ ] No security issues
- [ ] No hardcoded values
- [ ] Error messages are clear
- [ ] Performance targets met
## Next Steps After Implementation
Once `starpunk/utils.py` is complete:
1. Move to Phase 1.2: Data Models (`starpunk/models.py`)
2. Models will import and use these utilities
3. Integration tests will verify utilities work with models
## References
- Full design: `/home/phil/Projects/starpunk/docs/design/phase-1.1-core-utilities.md`
- ADR-007: Slug generation algorithm
- Python coding standards
- Utility function patterns
## Quick Command Reference
```bash
# Run tests
pytest tests/test_utils.py -v
# Run tests with coverage
pytest tests/test_utils.py --cov=starpunk.utils --cov-report=term-missing
# Format code
black starpunk/utils.py tests/test_utils.py
# Lint code
flake8 starpunk/utils.py tests/test_utils.py
# Type check (optional)
mypy starpunk/utils.py
```
## Estimated Time Breakdown
- Constants and imports: 10 minutes
- Helper functions: 20 minutes
- Slug functions: 30 minutes
- Content hashing: 10 minutes
- Path functions: 25 minutes
- File operations: 35 minutes
- Date/time functions: 15 minutes
- Tests: 60-90 minutes
- Documentation review: 15 minutes
**Total**: 220-250 minutes (roughly 3.5-4 hours)


@@ -0,0 +1,599 @@
# Phase 1.2 Quick Reference: Data Models
## Quick Start
**File**: `starpunk/models.py`
**Tests**: `tests/test_models.py`
**Estimated Time**: 4.5-5 hours
**Dependencies**: `starpunk/utils.py`, `starpunk/database.py`
## Implementation Order
1. Module docstring, imports, and constants
2. Note model (most complex)
3. Session model
4. Token model
5. AuthState model (simplest)
6. Tests for all models
## Model Checklist
### Note Model (10 items)
- [ ] Dataclass structure with all fields
- [ ] `from_row(row: dict, data_dir: Path) -> Note` class method
- [ ] `content` property (lazy loading from file)
- [ ] `html` property (markdown rendering + caching)
- [ ] `title` property (extract from content)
- [ ] `excerpt` property (first 200 chars)
- [ ] `permalink` property (URL path)
- [ ] `is_published` property (alias)
- [ ] `to_dict(include_content, include_html) -> dict` method
- [ ] `verify_integrity() -> bool` method
### Session Model (6 items)
- [ ] Dataclass structure with all fields
- [ ] `from_row(row: dict) -> Session` class method
- [ ] `is_expired` property
- [ ] `is_valid() -> bool` method
- [ ] `with_updated_last_used() -> Session` method
- [ ] `to_dict() -> dict` method
### Token Model (6 items)
- [ ] Dataclass structure with all fields
- [ ] `from_row(row: dict) -> Token` class method
- [ ] `scopes` property (list of scope strings)
- [ ] `has_scope(required_scope: str) -> bool` method
- [ ] `is_valid(required_scope: Optional[str]) -> bool` method
- [ ] `to_dict() -> dict` method
### AuthState Model (5 items)
- [ ] Dataclass structure with all fields
- [ ] `from_row(row: dict) -> AuthState` class method
- [ ] `is_expired` property
- [ ] `is_valid() -> bool` method
- [ ] `to_dict() -> dict` method
**Total**: 4 models, 27 methods/properties
## Constants Required
```python
# Session configuration
DEFAULT_SESSION_EXPIRY_DAYS = 30
SESSION_EXTENSION_ON_USE = True
# Auth state configuration
DEFAULT_AUTH_STATE_EXPIRY_MINUTES = 5
# Token configuration
DEFAULT_TOKEN_EXPIRY_DAYS = 90
# Markdown rendering
MARKDOWN_EXTENSIONS = ['extra', 'codehilite', 'nl2br']
# Content limits
MAX_TITLE_LENGTH = 200
EXCERPT_LENGTH = 200
```
## Key Design Patterns
### Frozen Dataclasses
```python
@dataclass(frozen=True)
class Note:
    # Core fields
    id: int
    slug: str
    # ... more fields
    # Internal fields (not from database)
    _data_dir: Path = field(repr=False, compare=False)
    _cached_content: Optional[str] = field(
        default=None,
        repr=False,
        compare=False,
        init=False
    )
```
### Lazy Loading Pattern
```python
@property
def content(self) -> str:
    """Lazy-load content from file"""
    if self._cached_content is None:
        # Read from file
        file_path = self._data_dir / self.file_path
        content = read_note_file(file_path)
        # Cache it (use object.__setattr__ for frozen dataclass)
        object.__setattr__(self, '_cached_content', content)
    return self._cached_content
```
### from_row Pattern
```python
@classmethod
def from_row(cls, row: dict, data_dir: Optional[Path] = None) -> 'Note':
    """Create instance from database row"""
    # Handle sqlite3.Row or dict (both expose keys())
    if hasattr(row, 'keys'):
        data = {key: row[key] for key in row.keys()}
    else:
        data = row
    # Convert timestamps if needed
    if isinstance(data['created_at'], str):
        data['created_at'] = datetime.fromisoformat(data['created_at'])
    return cls(
        id=data['id'],
        slug=data['slug'],
        # ... more fields
        _data_dir=data_dir
    )
```
### Immutable Update Pattern
```python
def with_updated_last_used(self) -> 'Session':
    """Create new session with updated timestamp"""
    from dataclasses import replace
    return replace(self, last_used_at=datetime.utcnow())
```
## Test Coverage Requirements
- Minimum 90% code coverage
- Test all model creation (from_row)
- Test all properties and methods
- Test lazy loading behavior
- Test caching behavior
- Test edge cases (empty content, expired sessions, etc.)
- Test error cases (file not found, invalid data)
## Example Test Structure
```python
class TestNoteModel:
    def test_from_row(self): pass
    def test_content_lazy_loading(self, tmp_path): pass
    def test_content_caching(self, tmp_path): pass
    def test_html_rendering(self, tmp_path): pass
    def test_html_caching(self, tmp_path): pass
    def test_title_extraction(self, tmp_path): pass
    def test_title_fallback_to_slug(self): pass
    def test_excerpt_generation(self, tmp_path): pass
    def test_permalink(self): pass
    def test_to_dict_basic(self): pass
    def test_to_dict_with_content(self, tmp_path): pass
    def test_verify_integrity_success(self, tmp_path): pass
    def test_verify_integrity_failure(self, tmp_path): pass

class TestSessionModel:
    def test_from_row(self): pass
    def test_is_expired_false(self): pass
    def test_is_expired_true(self): pass
    def test_is_valid_active(self): pass
    def test_is_valid_expired(self): pass
    def test_with_updated_last_used(self): pass
    def test_to_dict(self): pass

class TestTokenModel:
    def test_from_row(self): pass
    def test_scopes_property(self): pass
    def test_scopes_empty(self): pass
    def test_has_scope_true(self): pass
    def test_has_scope_false(self): pass
    def test_is_expired_never(self): pass
    def test_is_expired_yes(self): pass
    def test_is_valid(self): pass
    def test_is_valid_with_scope(self): pass
    def test_to_dict(self): pass

class TestAuthStateModel:
    def test_from_row(self): pass
    def test_is_expired_false(self): pass
    def test_is_expired_true(self): pass
    def test_is_valid(self): pass
    def test_to_dict(self): pass
```
## Common Pitfalls to Avoid
1. **Don't modify frozen dataclasses directly** → Use `object.__setattr__()` for caching
2. **Don't load content in `__init__`** → Use lazy loading properties
3. **Don't forget to cache expensive operations** → HTML rendering should be cached
4. **Don't validate paths in models** → Models trust caller has validated
5. **Don't put business logic in models** → Models are data only
6. **Don't forget datetime conversion** → Database may return strings
7. **Don't expose sensitive data in to_dict()** → Exclude tokens, passwords
8. **Don't forget to test with tmp_path** → Use pytest tmp_path fixture
## Security Checklist
- [ ] Session token excluded from to_dict()
- [ ] Token value excluded from to_dict()
- [ ] File paths not directly exposed (use properties)
- [ ] Expiry checked before using sessions/tokens/states
- [ ] No SQL injection (models don't query, but be aware)
- [ ] File reading errors propagate (don't hide)
- [ ] Datetime comparisons use UTC
## Performance Targets
- Model creation (from_row): < 1ms
- Content loading (first access): < 5ms
- HTML rendering (first access): < 10ms
- Cached property access: < 0.1ms
- to_dict() serialization: < 1ms
## Module Structure Template
```python
"""
Data models for StarPunk

This module provides data model classes that wrap database rows and provide
clean interfaces for working with notes, sessions, tokens, and authentication
state. All models are immutable and use dataclasses.
"""
# Standard library
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

# Third-party
import markdown

# Local
from starpunk.utils import read_note_file, calculate_content_hash

# Constants
DEFAULT_SESSION_EXPIRY_DAYS = 30
# ... more constants

# Models
@dataclass(frozen=True)
class Note:
    """Represents a note/post"""
    # Fields here
    pass

@dataclass(frozen=True)
class Session:
    """Represents an authenticated session"""
    # Fields here
    pass

@dataclass(frozen=True)
class Token:
    """Represents a Micropub access token"""
    # Fields here
    pass

@dataclass(frozen=True)
class AuthState:
    """Represents an OAuth state token"""
    # Fields here
    pass
```
## Quick Algorithm Reference
### Title Extraction Algorithm
```
1. Get content (lazy load if needed)
2. Split on newlines
3. Take first non-empty line
4. Strip markdown heading syntax (# , ## , etc.)
5. Limit to MAX_TITLE_LENGTH
6. If empty, use slug as fallback
```
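As a rough sketch, the title steps could be written as a standalone function. The name `extract_title` and the `HEADING_PREFIX` regex are hypothetical; in the model this logic would live in the `title` property and read `self.content` and `self.slug`.

```python
import re

MAX_TITLE_LENGTH = 200
# Hypothetical helper: matches 1-6 leading '#' characters followed by whitespace.
HEADING_PREFIX = re.compile(r"^#{1,6}\s+")


def extract_title(content: str, slug: str) -> str:
    """Return the first non-empty line without heading markers, else the slug."""
    for line in content.splitlines():                    # step 2
        stripped = line.strip()
        if stripped:                                     # step 3
            title = HEADING_PREFIX.sub("", stripped).strip()    # step 4
            return title[:MAX_TITLE_LENGTH] if title else slug  # steps 5-6
    return slug                                          # step 6: no usable line
```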
### Excerpt Generation Algorithm
```
1. Get content (lazy load if needed)
2. Remove markdown syntax (simple regex)
3. Take first EXCERPT_LENGTH characters
4. Truncate to word boundary
5. Add ellipsis if truncated
```
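A possible shape for the excerpt logic, shown as a free function for brevity. The function name, the character class used as the "simple regex", and the ASCII `...` ellipsis are all assumptions; the reference does not specify them.

```python
import re

EXCERPT_LENGTH = 200
# Assumed "simple regex": strips common inline Markdown punctuation only.
MARKDOWN_SYNTAX = re.compile(r"[#*_`>\[\]]")


def make_excerpt(content: str, length: int = EXCERPT_LENGTH) -> str:
    """Return plain text cut at a word boundary, with an ellipsis if shortened."""
    plain = MARKDOWN_SYNTAX.sub("", content)      # step 2
    plain = " ".join(plain.split())               # normalize whitespace (added step)
    if len(plain) <= length:
        return plain                              # nothing to truncate
    cut = plain[:length]                          # step 3
    if plain[length] != " " and " " in cut:       # step 4: the cut fell inside a word
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip() + "..."                   # step 5
```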
### Lazy Loading Algorithm
```
1. Check if cached value is None
2. If None:
   a. Perform expensive operation (file read, HTML render)
   b. Store result in cache using object.__setattr__()
3. Return cached value
```
### Session Validation Algorithm
```
1. Check if session is expired
2. Check if session_token is not empty
3. Check if 'me' URL is valid (basic check)
4. Return True only if all checks pass
```
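The four checks might translate to something like the following. This is a trimmed-down `Session` with only the fields the checks need; the `utc_now` helper and the use of `urlparse` for the "basic check" are assumptions, and timestamps are assumed to be naive UTC values.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlparse


def utc_now() -> datetime:
    """Hypothetical helper: current time as a naive UTC datetime."""
    return datetime.now(timezone.utc).replace(tzinfo=None)


@dataclass(frozen=True)
class Session:
    session_token: str
    me: str
    expires_at: datetime  # assumed to be stored as naive UTC

    @property
    def is_expired(self) -> bool:
        return utc_now() >= self.expires_at

    def is_valid(self) -> bool:
        if self.is_expired:                # step 1
            return False
        if not self.session_token:         # step 2
            return False
        parsed = urlparse(self.me)         # step 3: scheme and host must be present
        return parsed.scheme in ("http", "https") and bool(parsed.netloc)  # step 4
```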
### Token Scope Checking Algorithm
```
1. Parse scope string into list (split on whitespace)
2. Check if required_scope in list
3. Return boolean
```
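A short sketch of the scope check as free functions (the names `parse_scopes` and `has_scope` with a `scope` string argument are illustrative; the model exposes these as a property and a method). Splitting before testing membership matters, because a substring test on the raw string would wrongly match partial scope names.

```python
from typing import List, Optional


def parse_scopes(scope: Optional[str]) -> List[str]:
    """Step 1: split the space-separated scope string into a list."""
    return scope.split() if scope else []


def has_scope(scope: Optional[str], required_scope: str) -> bool:
    """Steps 2-3: exact membership test against the parsed list."""
    return required_scope in parse_scopes(scope)
```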
## Database Row Format Reference
### Note Row
```python
{
'id': 1,
'slug': 'my-note',
'file_path': 'notes/2024/11/my-note.md',
'published': 1, # or True
'created_at': '2024-11-18T14:30:00' or datetime(...),
'updated_at': '2024-11-18T14:30:00' or datetime(...),
'content_hash': 'abc123...'
}
```
### Session Row
```python
{
'id': 1,
'session_token': 'xyz789...',
'me': 'https://alice.example.com',
'created_at': datetime(...),
'expires_at': datetime(...),
'last_used_at': datetime(...) or None
}
```
### Token Row
```python
{
'token': 'abc123...',
'me': 'https://alice.example.com',
'client_id': 'https://quill.p3k.io',
'scope': 'create update',
'created_at': datetime(...),
'expires_at': datetime(...) or None
}
```
### AuthState Row
```python
{
'state': 'random123...',
'created_at': datetime(...),
'expires_at': datetime(...)
}
```
## Verification Checklist
Before marking Phase 1.2 complete:
- [ ] All 4 models implemented
- [ ] All models are frozen dataclasses
- [ ] All models have from_row() class method
- [ ] All models have to_dict() method
- [ ] Note model lazy-loads content
- [ ] Note model renders HTML with caching
- [ ] Session model validates expiry
- [ ] Token model validates scopes
- [ ] All properties have type hints
- [ ] All methods have docstrings with examples
- [ ] Test file created with >90% coverage
- [ ] All tests pass
- [ ] Code formatted with Black
- [ ] Code passes flake8
- [ ] No security issues
- [ ] Integration with utils.py works
## Usage Quick Examples
### Note Model
```python
# Create from database
row = db.execute("SELECT * FROM notes WHERE slug = ?", (slug,)).fetchone()
note = Note.from_row(row, data_dir=Path("data"))
# Access metadata (fast)
print(note.slug, note.published)
# Lazy load content (slow first time, cached after)
content = note.content
# Render HTML (slow first time, cached after)
html = note.html
# Extract metadata
title = note.title
permalink = note.permalink
# Serialize for JSON/templates
data = note.to_dict(include_content=True)
```
### Session Model
```python
# Create from database
row = db.execute("SELECT * FROM sessions WHERE session_token = ?", (token,)).fetchone()
session = Session.from_row(row)
# Validate
if session.is_valid():
    # Update last used
    updated = session.with_updated_last_used()
    # Save to database
    db.execute("UPDATE sessions SET last_used_at = ? WHERE id = ?",
               (updated.last_used_at, updated.id))
```
### Token Model
```python
# Create from database
row = db.execute("SELECT * FROM tokens WHERE token = ?", (token,)).fetchone()
token_obj = Token.from_row(row)
# Validate with required scope
if token_obj.is_valid(required_scope='create'):
    # Allow request
    pass
```
### AuthState Model
```python
# Create from database
row = db.execute("SELECT * FROM auth_state WHERE state = ?", (state,)).fetchone()
auth_state = AuthState.from_row(row)
# Validate
if auth_state.is_valid():
    # Delete (single-use)
    db.execute("DELETE FROM auth_state WHERE state = ?", (state,))
```
## Next Steps After Implementation
Once `starpunk/models.py` is complete:
1. Move to Phase 2.1: Notes Management (`starpunk/notes.py`)
2. Notes module will use Note model extensively
3. Integration tests will verify models work with database
## References
- Full design: `/home/phil/Projects/starpunk/docs/design/phase-1.2-data-models.md`
- Database schema: `/home/phil/Projects/starpunk/starpunk/database.py`
- Utilities: `/home/phil/Projects/starpunk/starpunk/utils.py`
- Python dataclasses: https://docs.python.org/3/library/dataclasses.html
## Quick Command Reference
```bash
# Run tests
pytest tests/test_models.py -v
# Run tests with coverage
pytest tests/test_models.py --cov=starpunk.models --cov-report=term-missing
# Format code
black starpunk/models.py tests/test_models.py
# Lint code
flake8 starpunk/models.py tests/test_models.py
# Type check (optional)
mypy starpunk/models.py
```
## Estimated Time Breakdown
- Module docstring and imports: 10 minutes
- Constants: 5 minutes
- Note model: 80 minutes
  - Basic structure: 15 minutes
  - from_row: 10 minutes
  - Lazy loading properties: 20 minutes
  - HTML rendering: 15 minutes
  - Metadata extraction: 15 minutes
  - Other methods: 5 minutes
- Session model: 30 minutes
- Token model: 30 minutes
- AuthState model: 20 minutes
- Tests (all models): 90-120 minutes
- Documentation review: 15 minutes
**Total**: 280-310 minutes (roughly 4.5-5 hours)
## Implementation Tips
### Frozen Dataclass Caching
Since dataclasses with `frozen=True` are immutable, use this pattern for caching:
```python
@property
def content(self) -> str:
    if self._cached_content is None:
        content = read_note_file(self._data_dir / self.file_path)
        # Use object.__setattr__ to bypass frozen restriction
        object.__setattr__(self, '_cached_content', content)
    return self._cached_content
```
### Datetime Handling
Database may return strings or datetime objects:
```python
@classmethod
def from_row(cls, row: dict) -> 'Note':
    created_at = row['created_at']
    if isinstance(created_at, str):
        created_at = datetime.fromisoformat(created_at.replace('Z', '+00:00'))
    # ... rest of method
```
### Testing with tmp_path
Use pytest's tmp_path fixture for file operations:
```python
def test_content_loading(tmp_path):
    # Create test file
    note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
    note_file.parent.mkdir(parents=True)
    note_file.write_text('# Test')
    # Create note with tmp_path as data_dir
    note = Note(
        id=1,
        slug='test',
        file_path='notes/2024/11/test.md',
        published=True,
        created_at=datetime.utcnow(),
        updated_at=datetime.utcnow(),
        _data_dir=tmp_path
    )
    assert '# Test' in note.content
```
### Markdown Extension Configuration
```python
import markdown
html = markdown.markdown(
    content,
    extensions=['extra', 'codehilite', 'nl2br'],
    extension_configs={
        'codehilite': {'css_class': 'highlight'}
    }
)
```


@@ -0,0 +1,616 @@
# Phase 2.1: Notes Management - Quick Reference
## Overview
Quick reference guide for implementing Phase 2.1: Notes Management (CRUD operations) in StarPunk.
**File**: `starpunk/notes.py`
**Estimated Time**: 6-8 hours
**Dependencies**: `utils.py`, `models.py`, `database.py`
---
## Function Checklist
### Required Functions
- [ ] **create_note(content, published=False, created_at=None) -> Note**
  - Generate unique slug
  - Write file atomically
  - Insert database record
  - Return Note object
- [ ] **get_note(slug=None, id=None, load_content=True) -> Optional[Note]**
  - Query database by slug or id
  - Load content from file if requested
  - Return Note or None
- [ ] **list_notes(published_only=False, limit=50, offset=0, order_by='created_at', order_dir='DESC') -> list[Note]**
  - Query database with filters
  - Support pagination
  - No file I/O (metadata only)
  - Return list of Notes
- [ ] **update_note(slug=None, id=None, content=None, published=None) -> Note**
  - Update file if content changed
  - Update database record
  - Return updated Note
- [ ] **delete_note(slug=None, id=None, soft=True) -> None**
  - Soft delete: mark deleted_at in database
  - Hard delete: remove file and database record
  - Return None
### Custom Exceptions
- [ ] **NoteNotFoundError(Exception)**
  - Raised when note doesn't exist
- [ ] **InvalidNoteDataError(Exception)**
  - Raised for invalid content/parameters
- [ ] **NoteSyncError(Exception)**
  - Raised when file/database sync fails
---
## Implementation Order
### Step 1: Module Setup (15 minutes)
```python
# starpunk/notes.py
"""Notes management for StarPunk"""
# Imports
from datetime import datetime
from pathlib import Path
from typing import Optional
from flask import current_app
from starpunk.database import get_db
from starpunk.models import Note
from starpunk.utils import (
    generate_slug, make_slug_unique, generate_note_path,
    ensure_note_directory, write_note_file, read_note_file,
    delete_note_file, calculate_content_hash,
    validate_note_path, validate_slug
)
# Exception classes (define all 3)
```
**Time**: 15 minutes
### Step 2: create_note() (90 minutes)
**Algorithm**:
1. Validate content not empty
2. Set created_at to now if not provided
3. Query existing slugs from database
4. Generate unique slug
5. Generate file path
6. Validate path (security)
7. Calculate content hash
8. Write file (ensure_note_directory + write_note_file)
9. Insert database record
10. If DB fails: delete file, raise NoteSyncError
11. If success: commit, return Note object
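
The file-then-database ordering in steps 8-11 can be sketched in isolation. Everything here is simplified and hypothetical: `create_note_sketch` takes an explicit connection, data directory, and an already-unique slug (steps 3-4 are assumed done), the table has only three columns, and the real function would use the `utils.py` helpers and return a `Note`.

```python
import hashlib
import sqlite3
from pathlib import Path


class InvalidNoteDataError(Exception):
    """Raised for invalid content/parameters."""


class NoteSyncError(Exception):
    """Raised when file/database sync fails."""


def create_note_sketch(db: sqlite3.Connection, data_dir: Path,
                       slug: str, content: str) -> Path:
    """Write the note file first, then insert the row; undo the file on DB failure."""
    if not content.strip():                                   # step 1
        raise InvalidNoteDataError("Content cannot be empty")
    note_path = data_dir / "notes" / f"{slug}.md"             # step 5 (simplified layout)
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()  # step 7
    note_path.parent.mkdir(parents=True, exist_ok=True)       # step 8
    note_path.write_text(content, encoding="utf-8")
    try:
        db.execute(                                           # step 9
            "INSERT INTO notes (slug, file_path, content_hash) VALUES (?, ?, ?)",
            (slug, str(note_path.relative_to(data_dir)), content_hash),
        )
        db.commit()                                           # step 11
    except sqlite3.Error as exc:                              # step 10
        db.rollback()
        note_path.unlink(missing_ok=True)  # remove the file this call just created
        raise NoteSyncError(f"Database insert failed for slug {slug!r}") from exc
    return note_path
```

The cleanup is only safe because the slug is assumed unique at this point; if the path could belong to an existing note, deleting it on failure would destroy that note's file.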
**Testing**:
- Create basic note
- Create with empty content (should fail)
- Create with duplicate slug (should add suffix)
- Create with specific timestamp
- File write fails (should not create DB record)
**Time**: 90 minutes (45 min implementation + 45 min testing)
### Step 3: get_note() (45 minutes)
**Algorithm**:
1. Validate parameters (exactly one of slug or id)
2. Query database
3. Return None if not found
4. Create Note.from_row()
5. Optionally verify integrity (log warning if mismatch)
6. Return Note
**Testing**:
- Get by slug
- Get by id
- Get nonexistent (returns None)
- Get with both parameters (should fail)
- Get without loading content
**Time**: 45 minutes (25 min implementation + 20 min testing)
### Step 4: list_notes() (60 minutes)
**Algorithm**:
1. Validate order_by (whitelist check)
2. Validate order_dir (ASC/DESC)
3. Validate limit (max 1000)
4. Build SQL query with filters
5. Add ORDER BY and LIMIT/OFFSET
6. Execute query
7. Create Note objects (don't load content)
8. Return list
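
Steps 1-6 could look roughly like this. The function name `list_note_rows`, the explicit `db` argument, and the `deleted_at IS NULL` filter are assumptions for the sake of a self-contained example; the real `list_notes()` would wrap each row with `Note.from_row()`. Column names and sort direction cannot be bound as `?` parameters, which is why they are checked against a whitelist before being interpolated.

```python
import sqlite3
from typing import List

ALLOWED_ORDER_FIELDS = ("id", "slug", "created_at", "updated_at")
MAX_LIMIT = 1000


def list_note_rows(db: sqlite3.Connection, published_only: bool = False,
                   limit: int = 50, offset: int = 0,
                   order_by: str = "created_at",
                   order_dir: str = "DESC") -> List[sqlite3.Row]:
    """Return note metadata rows; no file I/O is performed."""
    if order_by not in ALLOWED_ORDER_FIELDS:                  # step 1
        raise ValueError(f"Invalid order_by: {order_by}")
    order_dir = order_dir.upper()
    if order_dir not in ("ASC", "DESC"):                      # step 2
        raise ValueError(f"Invalid order_dir: {order_dir}")
    if not 1 <= limit <= MAX_LIMIT or offset < 0:             # step 3
        raise ValueError("limit must be 1-1000 and offset must be >= 0")
    query = "SELECT * FROM notes WHERE deleted_at IS NULL"    # step 4
    if published_only:
        query += " AND published = 1"
    # Step 5: identifiers were validated above; values are still bound as parameters.
    query += f" ORDER BY {order_by} {order_dir} LIMIT ? OFFSET ?"
    return db.execute(query, (limit, offset)).fetchall()      # step 6
```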
**Testing**:
- List all notes
- List published only
- List with pagination
- List with different ordering
- Invalid order_by (should fail - SQL injection test)
**Time**: 60 minutes (35 min implementation + 25 min testing)
### Step 5: update_note() (90 minutes)
**Algorithm**:
1. Validate parameters
2. Get existing note (raises NoteNotFoundError if missing)
3. Validate content if provided
4. Setup paths and timestamps
5. If content changed: write new file, calculate new hash
6. Build UPDATE query for changed fields
7. Execute database update
8. If DB fails: log error, raise NoteSyncError
9. If success: commit, return updated Note
**Testing**:
- Update content only
- Update published only
- Update both
- Update nonexistent (should fail)
- Update with empty content (should fail)
- Update with no changes (should fail)
**Time**: 90 minutes (50 min implementation + 40 min testing)
### Step 6: delete_note() (60 minutes)
**Algorithm**:
1. Validate parameters
2. Get existing note (return if None - idempotent)
3. Validate path
4. If soft delete:
   - UPDATE notes SET deleted_at = now WHERE id = ?
   - Optionally move file to trash (best effort)
5. If hard delete:
   - DELETE FROM notes WHERE id = ?
   - Delete file (best effort)
6. Return None
**Testing**:
- Soft delete
- Hard delete
- Delete nonexistent (should succeed)
- Delete already deleted (should succeed)
**Time**: 60 minutes (35 min implementation + 25 min testing)
### Step 7: Integration Tests (60 minutes)
**Full CRUD cycle test**:
1. Create note
2. Retrieve note
3. Update content
4. Update published status
5. List notes (verify appears)
6. Delete note
7. Verify gone
**Sync tests**:
- Verify file exists after create
- Verify DB record exists after create
- Verify file updated after update
- Verify file deleted after hard delete
- Verify DB record deleted after hard delete
**Time**: 60 minutes
### Step 8: Documentation and Cleanup (30 minutes)
- Review all docstrings
- Format with Black
- Run flake8
- Check type hints
- Review error messages
**Time**: 30 minutes
---
## Common Pitfalls
### 1. Forgetting to Commit Transactions
```python
# BAD - no commit
db.execute("INSERT INTO notes ...")
# GOOD - explicit commit
db.execute("INSERT INTO notes ...")
db.commit()
```
### 2. Not Cleaning Up on Failure
```python
# BAD - orphaned file if DB fails
write_note_file(path, content)
db.execute("INSERT ...") # What if this fails?
# GOOD - cleanup on failure
write_note_file(path, content)
try:
    db.execute("INSERT ...")
    db.commit()
except Exception as e:
    path.unlink()  # Delete file we created
    raise NoteSyncError(...) from e  # Keep the original error as the cause
```
### 3. SQL Injection in ORDER BY
```python
# BAD - SQL injection risk
order_by = request.args.get('order')
query = f"SELECT * FROM notes ORDER BY {order_by}"
# GOOD - whitelist validation
ALLOWED = ['id', 'slug', 'created_at', 'updated_at']
if order_by not in ALLOWED:
    raise ValueError(f"Invalid order_by: {order_by}")
query = f"SELECT * FROM notes ORDER BY {order_by}"
```
### 4. Not Using Parameterized Queries
```python
# BAD - SQL injection risk
slug = request.args.get('slug')
query = f"SELECT * FROM notes WHERE slug = '{slug}'"
# GOOD - parameterized query
query = "SELECT * FROM notes WHERE slug = ?"
db.execute(query, (slug,))
```
### 5. Forgetting Path Validation
```python
# BAD - directory traversal risk
note_path = data_dir / file_path
write_note_file(note_path, content)
# GOOD - validate path
note_path = data_dir / file_path
if not validate_note_path(note_path, data_dir):
    raise NoteSyncError(...)
write_note_file(note_path, content)
```
### 6. Not Handling None in Optional Parameters
```python
# BAD - truthiness check misses the "neither provided" case
# and treats id=0 or slug='' as if they were not provided
if slug and id:
    raise ValueError(...)
# GOOD - explicit None checks
if slug is None and id is None:
    raise ValueError("Must provide slug or id")
if slug is not None and id is not None:
    raise ValueError("Cannot provide both")
```
---
## Testing Checklist
### Unit Tests
- [ ] create_note with valid content
- [ ] create_note with empty content (fail)
- [ ] create_note with duplicate slug (unique suffix)
- [ ] create_note with specific timestamp
- [ ] create_note with unicode content
- [ ] create_note file write failure (no DB record)
- [ ] get_note by slug
- [ ] get_note by id
- [ ] get_note nonexistent (returns None)
- [ ] get_note with invalid parameters
- [ ] get_note without loading content
- [ ] list_notes all
- [ ] list_notes published only
- [ ] list_notes with pagination
- [ ] list_notes with ordering
- [ ] list_notes with invalid order_by (fail)
- [ ] update_note content
- [ ] update_note published
- [ ] update_note both
- [ ] update_note nonexistent (fail)
- [ ] update_note empty content (fail)
- [ ] delete_note soft
- [ ] delete_note hard
- [ ] delete_note nonexistent (succeed)
### Integration Tests
- [ ] Full CRUD cycle (create → read → update → delete)
- [ ] File-database sync maintained throughout lifecycle
- [ ] Orphaned files cleaned up on DB failure
- [ ] Soft-deleted notes excluded from queries
- [ ] Hard-deleted notes removed from DB and filesystem
### Performance Tests
- [ ] list_notes with 1000 notes (< 10ms)
- [ ] get_note (< 10ms)
- [ ] create_note (< 20ms)
---
## Time Estimates
| Task | Time |
|------|------|
| Module setup | 15 min |
| create_note() | 90 min |
| get_note() | 45 min |
| list_notes() | 60 min |
| update_note() | 90 min |
| delete_note() | 60 min |
| Integration tests | 60 min |
| Documentation/cleanup | 30 min |
| **Total** | **7.5 hours** |
Add 30-60 minutes for unexpected issues and debugging.
---
## Key Design Decisions
### 1. File Operations Before Database
**Rationale**: Fail fast on disk issues before database changes.
**Pattern**:
```python
# Write file first
write_note_file(path, content)
# Then update database
db.execute("INSERT ...")
db.commit()
# If DB fails, cleanup file
```
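The cleanup step in the last comment can be sketched as follows. This is an illustration under stated assumptions: the `notes` table with `slug` and `file_path` columns and the function name `create_note_sketch` are hypothetical, and the real `create_note()` is expected to wrap the failure in `NoteSyncError` rather than re-raise the SQLite error.

```python
import sqlite3
import tempfile
from pathlib import Path


def create_note_sketch(
    db: sqlite3.Connection, note_path: Path, slug: str, content: str
) -> None:
    # 1. Write the file first so disk errors surface before any DB change
    note_path.parent.mkdir(parents=True, exist_ok=True)
    note_path.write_text(content, encoding="utf-8")
    try:
        # 2. Record the metadata
        db.execute(
            "INSERT INTO notes (slug, file_path) VALUES (?, ?)",
            (slug, str(note_path)),
        )
        db.commit()
    except sqlite3.Error:
        # 3. Undo the transaction and remove the file so no orphan remains
        db.rollback()
        note_path.unlink(missing_ok=True)
        raise


# Usage with an in-memory database and a throwaway directory
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE notes (slug TEXT PRIMARY KEY, file_path TEXT)")
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes" / "2024" / "11" / "first-note.md"
    create_note_sketch(db, path, "first-note", "Hello, world")
    created = path.exists()
print(created)
```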
### 2. Best-Effort File Cleanup
**Rationale**: Database is source of truth. Missing files can be recreated or cleaned up later.
**Pattern**:
```python
try:
    path.unlink()
except OSError:
    logger.warning("Cleanup failed")
    # Don't fail - log and continue
```
### 3. Idempotent Delete
**Rationale**: DELETE operations should succeed even if already deleted.
**Pattern**:
```python
note = get_note(slug=slug)
if note is None:
    return  # Already deleted, that's fine
# ... proceed with delete
```
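A runnable sketch of an idempotent soft delete, assuming a `notes` table with a nullable `deleted_at` column. Both the schema and the name `soft_delete_sketch` are illustrative assumptions, not the project's actual schema or API.

```python
import sqlite3


def soft_delete_sketch(db: sqlite3.Connection, slug: str) -> None:
    row = db.execute(
        "SELECT id FROM notes WHERE slug = ? AND deleted_at IS NULL", (slug,)
    ).fetchone()
    if row is None:
        return  # Missing or already deleted: treat as success
    db.execute(
        "UPDATE notes SET deleted_at = CURRENT_TIMESTAMP WHERE id = ?", (row[0],)
    )
    db.commit()


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, deleted_at TEXT)"
)
db.execute("INSERT INTO notes (slug) VALUES (?)", ("first-note",))

soft_delete_sketch(db, "first-note")
soft_delete_sketch(db, "first-note")  # Second call is a no-op
soft_delete_sketch(db, "never-existed")  # Also a no-op, no exception
```

Callers can therefore retry a delete safely without first checking whether the note still exists.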
### 4. Lazy Content Loading
**Rationale**: list_notes() should not trigger file I/O for every note.
**Pattern**:
```python
# list_notes creates Notes without loading content
notes = [Note.from_row(row, data_dir) for row in rows]
# Content loaded on access
for note in notes:
    print(note.slug)  # Fast (metadata)
    print(note.content)  # Triggers file I/O
```
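A minimal sketch of how a lazily loaded `content` property might work, assuming the note caches the file contents after the first read. `LazyNote` and its `reads` counter are hypothetical stand-ins for illustration; the real `Note` model is defined in the models design.

```python
import tempfile
from pathlib import Path
from typing import Optional


class LazyNote:
    """Hypothetical stand-in for Note: metadata is eager, content is lazy."""

    def __init__(self, slug: str, file_path: Path) -> None:
        self.slug = slug
        self.file_path = file_path
        self._content: Optional[str] = None
        self.reads = 0  # Counts file reads, for demonstration only

    @property
    def content(self) -> str:
        if self._content is None:
            # First access: read from disk and cache the result
            self.reads += 1
            self._content = self.file_path.read_text(encoding="utf-8")
        return self._content


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "first-note.md"
    path.write_text("Hello", encoding="utf-8")
    note = LazyNote("first-note", path)
    reads_before = note.reads      # Accessing metadata does no I/O
    first = note.content           # Triggers the single file read
    second = note.content          # Served from the cache
    reads_after = note.reads
```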
### 5. Parameterized Queries Only
**Rationale**: Prevent SQL injection.
**Pattern**:
```python
# Always use parameter binding
db.execute("SELECT * FROM notes WHERE slug = ?", (slug,))
# Never use string interpolation
db.execute(f"SELECT * FROM notes WHERE slug = '{slug}'") # NO!
```
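The difference can be observed directly with an in-memory SQLite database. This is a demonstration sketch with a simplified one-column `notes` table (an assumption); the interpolated query appears only to show what goes wrong.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE notes (slug TEXT)")
db.execute("INSERT INTO notes (slug) VALUES (?)", ("first-note",))

hostile = "x' OR '1'='1"

# Bound parameter: the whole string is compared as data, so nothing matches
safe_rows = db.execute(
    "SELECT slug FROM notes WHERE slug = ?", (hostile,)
).fetchall()

# Interpolation (never do this): the quote ends the literal and the OR clause runs
unsafe_rows = db.execute(
    f"SELECT slug FROM notes WHERE slug = '{hostile}'"
).fetchall()

print(len(safe_rows), len(unsafe_rows))
```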
---
## Dependencies Reference
### From utils.py
```python
generate_slug(content, created_at) -> str
make_slug_unique(base_slug, existing_slugs) -> str
validate_slug(slug) -> bool
generate_note_path(slug, created_at, data_dir) -> Path
ensure_note_directory(note_path) -> Path
write_note_file(file_path, content) -> None
read_note_file(file_path) -> str
delete_note_file(file_path, soft=False, data_dir=None) -> None
calculate_content_hash(content) -> str
validate_note_path(file_path, data_dir) -> bool
```
### From models.py
```python
Note.from_row(row, data_dir) -> Note
Note.content -> str (property, lazy-loaded)
Note.to_dict(include_content=False) -> dict
Note.verify_integrity() -> bool
```
### From database.py
```python
get_db() -> sqlite3.Connection
db.execute(query, params) -> Cursor
db.commit() -> None
db.rollback() -> None
```
---
## Error Handling Quick Reference
### When to Raise vs Return None
| Scenario | Action |
|----------|--------|
| Note not found in get_note() | Return None |
| Note not found in update_note() | Raise NoteNotFoundError |
| Note not found in delete_note() | Return None (idempotent) |
| Empty content | Raise InvalidNoteDataError |
| File write fails | Raise NoteSyncError |
| Database fails | Raise NoteSyncError |
| Invalid parameters | Raise ValueError |
### Error Message Examples
```python
# NoteNotFoundError
raise NoteNotFoundError(
    slug,
    f"Note '{slug}' does not exist or has been deleted"
)
# InvalidNoteDataError
raise InvalidNoteDataError(
    'content',
    content,
    "Note content cannot be empty. Please provide markdown content."
)
# NoteSyncError
raise NoteSyncError(
    'create',
    f"Database insert failed: {e}",
    "Failed to create note. File written but database update failed."
)
```
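The call sites above imply constructor shapes for the three exceptions. The following is a sketch inferred from those calls, not the definitive definitions: the attribute names (`identifier`, `field`, `value`, `operation`, `details`) and the default messages are assumptions.

```python
class NoteNotFoundError(Exception):
    def __init__(self, identifier, message=None):
        self.identifier = identifier
        super().__init__(message or f"Note not found: {identifier}")


class InvalidNoteDataError(Exception):
    def __init__(self, field, value, message=None):
        self.field = field
        self.value = value
        super().__init__(message or f"Invalid value for {field}")


class NoteSyncError(Exception):
    def __init__(self, operation, details, message=None):
        self.operation = operation
        self.details = details
        super().__init__(message or f"Sync failed during {operation}: {details}")


try:
    raise NoteSyncError("create", "disk full", "Failed to create note.")
except NoteSyncError as exc:
    summary = (exc.operation, exc.details, str(exc))
```

Keeping the structured fields on the exception lets callers log the technical detail while showing only the user-facing message.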
---
## Security Checklist
- [ ] All SQL queries use parameterized binding (no string interpolation)
- [ ] order_by field validated against whitelist
- [ ] All file paths validated with validate_note_path()
- [ ] No symlinks followed in path operations
- [ ] Content validated (not empty)
- [ ] Slug validated before use in file paths
- [ ] No code execution from user content
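Column names cannot be passed as bound parameters, which is why `order_by` needs a whitelist instead. A minimal sketch, assuming the column names shown; `ALLOWED_ORDER_FIELDS` and `build_list_query` are hypothetical names and the real field list may differ.

```python
# Assumed column names, for illustration only
ALLOWED_ORDER_FIELDS = {"id", "slug", "created_at", "updated_at"}


def build_list_query(order_by: str = "created_at", order_dir: str = "DESC") -> str:
    """Build a list query, interpolating only values checked against whitelists."""
    if order_by not in ALLOWED_ORDER_FIELDS:
        raise ValueError(f"Invalid order_by field: {order_by!r}")
    direction = order_dir.upper()
    if direction not in ("ASC", "DESC"):
        raise ValueError(f"Invalid order direction: {order_dir!r}")
    # Safe to interpolate: both values come from fixed sets.
    # LIMIT and OFFSET stay as bound parameters.
    return f"SELECT * FROM notes ORDER BY {order_by} {direction} LIMIT ? OFFSET ?"


query = build_list_query("updated_at", "asc")
print(query)
```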
---
## Final Checks
Before submitting Phase 2.1 as complete:
- [ ] All 5 functions implemented
- [ ] All 3 exceptions implemented
- [ ] Full type hints on all functions
- [ ] Comprehensive docstrings with examples
- [ ] Test coverage > 90%
- [ ] All tests passing
- [ ] Black formatting applied
- [ ] flake8 linting passes (no errors)
- [ ] Integration test passes (full CRUD cycle)
- [ ] No orphaned files in test runs
- [ ] No orphaned database records in test runs
- [ ] Error messages are clear and actionable
- [ ] Performance targets met:
- [ ] create_note < 20ms
- [ ] get_note < 10ms
- [ ] list_notes < 10ms
- [ ] update_note < 20ms
- [ ] delete_note < 10ms
---
## Quick Command Reference
```bash
# Run tests
pytest tests/test_notes.py -v
# Check coverage
pytest tests/test_notes.py --cov=starpunk.notes --cov-report=term-missing
# Format code
black starpunk/notes.py tests/test_notes.py
# Lint code
flake8 starpunk/notes.py --max-line-length=100
# Type check (optional)
mypy starpunk/notes.py
```
---
## What's Next?
After completing Phase 2.1:
**Phase 3: Authentication**
- IndieLogin OAuth flow
- Session management
- Admin access control
**Phase 4: Web Routes**
- Admin interface (create/edit/delete notes)
- Public note views
- Template rendering
**Phase 5: Micropub**
- Micropub endpoint
- Token validation
- IndieWeb API compliance
**Phase 6: RSS Feed**
- Feed generation
- RFC-822 date formatting
- Published notes only
---
## Help & Resources
- **Full Design Doc**: [phase-2.1-notes-management.md](/home/phil/Projects/starpunk/docs/design/phase-2.1-notes-management.md)
- **Utilities Design**: [phase-1.1-core-utilities.md](/home/phil/Projects/starpunk/docs/design/phase-1.1-core-utilities.md)
- **Models Design**: [phase-1.2-data-models.md](/home/phil/Projects/starpunk/docs/design/phase-1.2-data-models.md)
- **Coding Standards**: [python-coding-standards.md](/home/phil/Projects/starpunk/docs/standards/python-coding-standards.md)
- **Implementation Plan**: [implementation-plan.md](/home/phil/Projects/starpunk/docs/projectplan/v1/implementation-plan.md)
Remember: "Every line of code must justify its existence. When in doubt, leave it out."


@@ -0,0 +1,795 @@
# Project Structure Design
## Purpose
This document defines the complete directory and file structure for the StarPunk project. It provides the exact layout that developer agents should create, including all directories, their purposes, file organization, and naming conventions.
## Philosophy
The project structure follows these principles:
- **Flat is better than nested**: Avoid deep directory hierarchies
- **Conventional is better than clever**: Use standard Python project layout
- **Obvious is better than hidden**: Clear directory names over abbreviations
- **Data is sacred**: User data isolated in dedicated directory
## Complete Directory Structure
```
starpunk/
├── .venv/  # Python virtual environment (gitignored)
├── .env  # Environment configuration (gitignored)
├── .env.example  # Configuration template (committed)
├── .gitignore  # Git ignore rules
├── README.md  # Project documentation
├── CLAUDE.MD  # AI agent requirements document
├── LICENSE  # Project license
├── requirements.txt  # Python dependencies
├── requirements-dev.txt  # Development dependencies
├── app.py  # Main application entry point
├── starpunk/  # Application package
│   ├── __init__.py  # Package initialization
│   ├── config.py  # Configuration management
│   ├── database.py  # Database operations
│   ├── models.py  # Data models
│   ├── auth.py  # Authentication logic
│   ├── micropub.py  # Micropub endpoint
│   ├── feed.py  # RSS feed generation
│   ├── notes.py  # Note management
│   └── utils.py  # Helper functions
├── static/  # Static assets
│   ├── css/
│   │   └── style.css  # Main stylesheet (~200 lines)
│   └── js/
│       └── preview.js  # Optional markdown preview
├── templates/  # Jinja2 templates
│   ├── base.html  # Base layout template
│   ├── index.html  # Homepage (note list)
│   ├── note.html  # Single note view
│   ├── feed.xml  # RSS feed template
│   └── admin/
│       ├── base.html  # Admin base layout
│       ├── login.html  # Login form
│       ├── dashboard.html  # Admin dashboard
│       ├── new.html  # Create note form
│       └── edit.html  # Edit note form
├── data/  # User data directory (gitignored)
│   ├── notes/  # Markdown note files
│   │   ├── 2024/
│   │   │   ├── 11/
│   │   │   │   ├── first-note.md
│   │   │   │   └── second-note.md
│   │   │   └── 12/
│   │   │       └── december-note.md
│   │   └── 2025/
│   │       └── 01/
│   │           └── new-year.md
│   └── starpunk.db  # SQLite database
├── tests/  # Test suite
│   ├── __init__.py
│   ├── conftest.py  # Pytest configuration and fixtures
│   ├── test_auth.py  # Authentication tests
│   ├── test_database.py  # Database tests
│   ├── test_micropub.py  # Micropub endpoint tests
│   ├── test_feed.py  # RSS feed tests
│   ├── test_notes.py  # Note management tests
│   └── test_utils.py  # Utility function tests
└── docs/  # Architecture documentation
    ├── architecture/
    │   ├── overview.md
    │   ├── components.md
    │   ├── data-flow.md
    │   ├── security.md
    │   ├── deployment.md
    │   └── technology-stack.md
    ├── decisions/
    │   ├── ADR-001-python-web-framework.md
    │   ├── ADR-002-flask-extensions.md
    │   ├── ADR-003-frontend-technology.md
    │   ├── ADR-004-file-based-note-storage.md
    │   ├── ADR-005-indielogin-authentication.md
    │   └── ADR-006-python-virtual-environment-uv.md
    ├── design/
    │   ├── project-structure.md (this file)
    │   ├── database-schema.md
    │   ├── api-contracts.md
    │   ├── initial-files.md
    │   └── component-interfaces.md
    └── standards/
        ├── documentation-organization.md
        ├── python-coding-standards.md
        ├── development-setup.md
        └── testing-strategy.md
```
## Directory Purposes
### Root Directory (`/`)
**Purpose**: Project root containing configuration and entry point.
**Key Files**:
- `app.py` - Main Flask application entry point (imports `create_app` from the `starpunk` package)
- `.env` - Environment variables (NEVER commit)
- `.env.example` - Template for configuration
- `requirements.txt` - Production dependencies
- `requirements-dev.txt` - Development tools
- `README.md` - User-facing documentation
- `CLAUDE.MD` - AI agent instructions
- `LICENSE` - Open source license
**Rationale**: Flat root with minimal files makes project easy to understand at a glance.
---
### Application Package (`starpunk/`)
**Purpose**: Core application code organized as Python package.
**Structure**: Flat module layout (no sub-packages in V1)
**Modules**:
| Module | Purpose | Approximate LOC |
|--------|---------|-----------------|
| `__init__.py` | Package init, create Flask app | 50 |
| `config.py` | Load configuration from .env | 75 |
| `database.py` | SQLite operations, schema management | 200 |
| `models.py` | Data models (Note, Session, Token) | 150 |
| `auth.py` | IndieAuth flow, session management | 200 |
| `micropub.py` | Micropub endpoint implementation | 250 |
| `feed.py` | RSS feed generation | 100 |
| `notes.py` | Note CRUD, file operations | 300 |
| `utils.py` | Slug generation, hashing, helpers | 100 |
**Total Estimated**: ~1,425 LOC for core application
**Naming Convention**:
- Lowercase with underscores
- Singular nouns for single-purpose modules (`config.py`, not `configs.py`)
- Plural for collections (`notes.py` manages many notes)
- Descriptive names over abbreviations (`database.py`, not `db.py`)
**Import Strategy**:
```python
# In app.py
from starpunk import create_app
app = create_app()
```
**Rationale**: A flat package structure is simpler than nested sub-packages. All modules are peers, which keeps import paths short and makes circular dependencies easier to spot and avoid.
---
### Static Assets (`static/`)
**Purpose**: CSS, JavaScript, and other static files served directly.
**Organization**:
```
static/
├── css/
│   └── style.css  # Single stylesheet
└── js/
    └── preview.js  # Optional markdown preview
```
**CSS Structure** (`style.css`):
- ~200 lines total
- CSS custom properties for theming
- Mobile-first responsive design
- Semantic HTML with minimal classes
**JavaScript Structure** (`preview.js`):
- Optional enhancement only
- Vanilla JavaScript (no frameworks)
- Markdown preview using marked.js from CDN
- Degrades gracefully if disabled
**URL Pattern**: `/static/{type}/{file}`
- Example: `/static/css/style.css`
- Example: `/static/js/preview.js`
**Future Assets** (V2+):
- Favicon: `static/favicon.ico`
- Icons: `static/icons/`
- Images: `static/images/` (if needed)
**Rationale**: Standard Flask static file convention. Simple, flat structure since we have minimal assets.
---
### Templates (`templates/`)
**Purpose**: Jinja2 HTML templates for server-side rendering.
**Organization**:
```
templates/
├── base.html # Public site base layout
├── index.html # Homepage (extends base.html)
├── note.html # Single note (extends base.html)
├── feed.xml # RSS feed (no layout)
└── admin/
    ├── base.html  # Admin base layout
    ├── login.html  # Login form
    ├── dashboard.html  # Admin dashboard
    ├── new.html  # Create note
    └── edit.html  # Edit note
```
**Template Hierarchy**:
```
base.html (public)
├── index.html (note list)
└── note.html (single note)
admin/base.html
├── admin/dashboard.html
├── admin/new.html
└── admin/edit.html
admin/login.html (no base, standalone)
feed.xml (no base, XML output)
```
**Naming Convention**:
- Lowercase with hyphens for multi-word names
- Descriptive names (`dashboard.html`, not `dash.html`)
- Base templates clearly named (`base.html`)
**Template Features**:
- Microformats2 markup (h-entry, h-card)
- Semantic HTML5
- Accessible markup (ARIA labels)
- Mobile-responsive
- Progressive enhancement
**Rationale**: Standard Flask template convention. Admin templates in subdirectory keeps them organized.
---
### Data Directory (`data/`)
**Purpose**: User-created content and database. This is the sacred directory.
**Structure**:
```
data/
├── notes/
│   └── {YYYY}/
│       └── {MM}/
│           └── {slug}.md
└── starpunk.db
```
**Notes Directory** (`data/notes/`):
- **Pattern**: Year/Month hierarchy (`YYYY/MM/`)
- **Files**: Markdown files with slug names
- **Example**: `data/notes/2024/11/my-first-note.md`
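The year/month pattern above can be sketched with `pathlib`. This is an illustration that mirrors the signature of `generate_note_path(slug, created_at, data_dir)`; the function name `note_path_sketch` is hypothetical and the real helper may add validation.

```python
from datetime import datetime
from pathlib import Path


def note_path_sketch(slug: str, created_at: datetime, data_dir: Path) -> Path:
    """Build data_dir/notes/YYYY/MM/slug.md from the creation timestamp."""
    year = f"{created_at:%Y}"
    month = f"{created_at:%m}"  # Zero-padded, e.g. "01" for January
    return data_dir / "notes" / year / month / f"{slug}.md"


example = note_path_sketch("my-first-note", datetime(2024, 11, 18), Path("data"))
print(example.as_posix())
```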
**Database File** (`data/starpunk.db`):
- SQLite database
- Contains metadata, sessions, tokens
- NOT content (content is in .md files)
**Permissions**:
- Directory: 755 (rwxr-xr-x)
- Files: 644 (rw-r--r--)
- Database: 644 (rw-r--r--)
**Backup Strategy**:
```bash
# Simple backup
tar -czf starpunk-backup-$(date +%Y%m%d).tar.gz data/
# Or rsync
rsync -av data/ /backup/starpunk/
```
**Gitignore**: ENTIRE `data/` directory must be gitignored.
**Rationale**: User data is completely isolated. Easy to backup. Portable across systems.
---
### Tests (`tests/`)
**Purpose**: Complete test suite using pytest.
**Organization**:
```
tests/
├── __init__.py # Empty, marks as package
├── conftest.py # Pytest fixtures and configuration
├── test_auth.py # Authentication tests
├── test_database.py # Database operations tests
├── test_micropub.py # Micropub endpoint tests
├── test_feed.py # RSS feed generation tests
├── test_notes.py # Note management tests
└── test_utils.py # Utility function tests
```
**Naming Convention**:
- All test files: `test_{module}.py`
- All test functions: `def test_{function_name}():`
- Fixtures in `conftest.py`
**Test Organization**:
- One test file per application module
- Integration tests in same file as unit tests
- Use fixtures for common setup (database, app context)
**Coverage Target**: >80% for V1
**Rationale**: Standard pytest convention. Easy to run all tests or specific modules.
---
### Documentation (`docs/`)
**Purpose**: Architecture, decisions, designs, and standards.
**Organization**: See [Documentation Organization Standard](/home/phil/Projects/starpunk/docs/standards/documentation-organization.md)
```
docs/
├── architecture/ # System-level architecture
├── decisions/ # ADRs (Architecture Decision Records)
├── design/ # Detailed technical designs
└── standards/ # Coding and process standards
```
**Key Documents**:
- `architecture/overview.md` - System architecture
- `architecture/technology-stack.md` - Complete tech stack
- `decisions/ADR-*.md` - All architectural decisions
- `design/project-structure.md` - This document
- `standards/python-coding-standards.md` - Code style
**Rationale**: Clear separation of document types. Easy to find relevant documentation.
---
## File Naming Conventions
### Python Files
- **Modules**: `lowercase_with_underscores.py`
- **Packages**: `lowercase` (no underscores if possible)
- **Classes**: `PascalCase` (in code, not filenames)
- **Functions**: `lowercase_with_underscores` (in code)
**Examples**:
```
starpunk/database.py # Good
starpunk/Database.py # Bad
starpunk/db.py # Bad (use full word)
starpunk/note_manager.py # Good if needed
```
### Markdown Files (Notes)
- **Pattern**: `{slug}.md`
- **Slug format**: `lowercase-with-hyphens`
- **Valid characters**: `a-z`, `0-9`, `-` (hyphen)
- **No spaces, no underscores, no special characters**
**Examples**:
```
first-note.md # Good
my-thoughts-on-python.md # Good
2024-11-18-daily-note.md # Good (date prefix okay)
First Note.md # Bad (spaces)
first_note.md # Bad (underscores)
first-note.markdown # Bad (use .md)
```
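The character rules above translate into a simple regular expression. A minimal sketch: `looks_like_valid_slug` is a hypothetical name, and it enforces only the stated character set. Stricter rules, such as rejecting leading or doubled hyphens or enforcing a length limit, are not specified here and are left out.

```python
import re

# Lowercase letters, digits, and hyphens only
SLUG_PATTERN = re.compile(r"[a-z0-9-]+")


def looks_like_valid_slug(slug: str) -> bool:
    """Return True if slug contains only a-z, 0-9, and hyphens."""
    return SLUG_PATTERN.fullmatch(slug) is not None


for candidate in ["first-note", "2024-11-18-daily-note", "First Note", "first_note"]:
    print(candidate, looks_like_valid_slug(candidate))
```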
### Template Files
- **Pattern**: `lowercase.html` or `lowercase-name.html`
- **XML**: `.xml` extension for RSS feed
**Examples**:
```
base.html # Good
note.html # Good
admin/dashboard.html # Good
admin/new-note.html # Good (if multi-word)
admin/NewNote.html # Bad
```
### Documentation Files
- **Pattern**: `lowercase-with-hyphens.md`
- **ADRs**: `ADR-{NNN}-{short-title}.md`
**Examples**:
```
project-structure.md # Good
database-schema.md # Good
ADR-001-python-web-framework.md # Good
ProjectStructure.md # Bad
project_structure.md # Bad (use hyphens)
```
### Configuration Files
- **Pattern**: Standard conventions (`.env`, `requirements.txt`, etc.)
**Examples**:
```
.env # Good
.env.example # Good
requirements.txt # Good
requirements-dev.txt # Good
.gitignore # Good
```
---
## Gitignore Requirements
The `.gitignore` file MUST include the following:
```gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
# Virtual Environment
.venv/
venv/
env/
ENV/
.venv.*
# Environment Configuration
.env
*.env
!.env.example
# User Data (CRITICAL - NEVER COMMIT)
data/
*.db
*.sqlite
*.sqlite3
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store
# Testing
.pytest_cache/
.coverage
htmlcov/
*.cover
.hypothesis/
# Distribution
dist/
build/
*.egg-info/
*.egg
# Logs
*.log
logs/
```
**Critical Rules**:
1. **NEVER commit `data/` directory** - contains user data
2. **NEVER commit `.env` file** - contains secrets
3. **NEVER commit `.venv/`** - virtual environment is local
4. **ALWAYS commit `.env.example`** - template is safe
---
## Directory Creation Order
When initializing the project, create files and directories in this order:
1. **Root files**:
```bash
touch .gitignore
touch .env.example
touch README.md
touch CLAUDE.MD
touch LICENSE
touch requirements.txt
touch requirements-dev.txt
```
2. **Application package**:
```bash
mkdir -p starpunk
touch starpunk/__init__.py
```
3. **Static assets**:
```bash
mkdir -p static/css static/js
touch static/css/style.css
touch static/js/preview.js
```
4. **Templates**:
```bash
mkdir -p templates/admin
touch templates/base.html
touch templates/index.html
touch templates/note.html
touch templates/feed.xml
touch templates/admin/base.html
touch templates/admin/login.html
touch templates/admin/dashboard.html
touch templates/admin/new.html
touch templates/admin/edit.html
```
5. **Data directory** (created on first run):
```bash
mkdir -p data/notes
```
6. **Tests**:
```bash
mkdir -p tests
touch tests/__init__.py
touch tests/conftest.py
```
7. **Documentation** (mostly exists):
```bash
mkdir -p docs/architecture docs/decisions docs/design docs/standards
```
---
## Path Standards
### Absolute vs Relative Paths
**In Code**:
- Use relative paths from project root
- Let Flask handle path resolution
- Use `pathlib.Path` for file operations
**In Configuration**:
- Support both absolute and relative paths
- Relative paths are relative to project root
- Document clearly in `.env.example`
**In Documentation**:
- Use absolute paths for clarity
- Example: `/home/phil/Projects/starpunk/data/notes`
**In Agent Operations**:
- ALWAYS use absolute paths
- Never rely on current working directory
- See ADR-006 for details
### Path Construction Examples
```python
from pathlib import Path
# Project root (assumes this file lives one directory below it, e.g. starpunk/config.py)
PROJECT_ROOT = Path(__file__).parent.parent
# Data directory
DATA_DIR = PROJECT_ROOT / "data"
NOTES_DIR = DATA_DIR / "notes"
DB_PATH = DATA_DIR / "starpunk.db"
# Static files
STATIC_DIR = PROJECT_ROOT / "static"
CSS_DIR = STATIC_DIR / "css"
# Templates (handled by Flask)
TEMPLATE_DIR = PROJECT_ROOT / "templates"
```
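The "In Configuration" rule, that both absolute and relative paths are accepted and relative ones are taken from the project root, could be implemented along these lines. A sketch only: `resolve_config_path` is a hypothetical helper, not part of the documented API.

```python
from pathlib import Path


def resolve_config_path(value: str, project_root: Path) -> Path:
    """Return absolute paths unchanged; anchor relative ones at project_root."""
    path = Path(value).expanduser()
    if path.is_absolute():
        return path
    return project_root / path


root = Path("/srv/starpunk")
relative = resolve_config_path("data", root)
absolute = resolve_config_path("/var/lib/starpunk/data", root)
print(relative, absolute)
```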
---
## File Size Guidelines
### Target Sizes
| File Type | Target Size | Maximum Size | Rationale |
|-----------|-------------|--------------|-----------|
| Python module | 100-300 LOC | 500 LOC | Keep modules focused |
| Template | 50-100 lines | 200 lines | Use template inheritance |
| CSS | 200 LOC total | 300 LOC | Minimal styling only |
| JavaScript | 100 LOC | 200 LOC | Optional feature only |
| Markdown note | 50-500 words | No limit | User content |
| Test file | 100-200 LOC | 400 LOC | One module per file |
**If a file exceeds its maximum**: Consider splitting it into multiple modules or refactoring.
---
## Module Organization
### starpunk/__init__.py
**Purpose**: Package initialization and Flask app factory.
**Contents**:
- `create_app()` function
- Blueprint registration
- Configuration loading
**Example**:
```python
from flask import Flask
def create_app():
    app = Flask(__name__)
    # Load configuration
    from starpunk.config import load_config
    load_config(app)
    # Initialize database
    from starpunk.database import init_db
    init_db(app)
    # Register blueprints
    # (starpunk.routes is illustrative; it is not part of the V1 directory layout)
    from starpunk.routes import public, admin, api
    app.register_blueprint(public.bp)
    app.register_blueprint(admin.bp)
    app.register_blueprint(api.bp)
    return app
```
### app.py (Root)
**Purpose**: Application entry point for Flask.
**Contents**:
```python
from starpunk import create_app
app = create_app()
if __name__ == '__main__':
    app.run(debug=True)  # Development only; never enable debug in production
```
**Rationale**: Minimal entry point. All logic in package.
---
## Import Organization
Follow PEP 8 import ordering:
1. Standard library imports
2. Third-party imports
3. Local application imports
**Example**:
```python
# Standard library
import os
import sqlite3
from pathlib import Path
from datetime import datetime
# Third-party
from flask import Flask, render_template
import markdown
import httpx
# Local
from starpunk.config import load_config
from starpunk.database import get_db
from starpunk.models import Note
```
---
## Future Structure Extensions (V2+)
### Potential Additions
**Media Uploads**:
```
data/
├── notes/
├── media/
│   └── {YYYY}/{MM}/
│       └── {filename}
└── starpunk.db
```
**Migrations**:
```
migrations/
├── 001_initial_schema.sql
├── 002_add_media_table.sql
└── 003_add_tags.sql
```
**Deployment**:
```
deploy/
├── systemd/
│   └── starpunk.service
├── nginx/
│   └── starpunk.conf
└── docker/
    └── Dockerfile
```
**Do NOT create these in V1** - only when needed.
---
## Verification Checklist
After creating project structure, verify:
- [ ] All directories exist
- [ ] `.gitignore` is configured correctly
- [ ] `.env.example` exists (`.env` does not)
- [ ] `data/` directory is gitignored
- [ ] All `__init__.py` files exist in Python packages
- [ ] Template hierarchy is correct
- [ ] Static files are in correct locations
- [ ] Tests directory mirrors application structure
- [ ] Documentation is organized correctly
- [ ] No placeholder files committed (except templates)
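Part of this checklist can be automated. The sketch below checks that a handful of expected paths exist; `REQUIRED_PATHS` is an illustrative subset chosen for this example rather than an authoritative list, and `missing_paths` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import List

# Illustrative subset of the structure described in this document
REQUIRED_PATHS = [
    ".gitignore",
    ".env.example",
    "starpunk/__init__.py",
    "static/css",
    "static/js",
    "templates/admin",
    "tests/__init__.py",
    "tests/conftest.py",
    "docs",
]


def missing_paths(project_root: Path) -> List[str]:
    """Return the required paths that do not exist under project_root."""
    return [rel for rel in REQUIRED_PATHS if not (project_root / rel).exists()]


with tempfile.TemporaryDirectory() as tmp:
    missing_in_empty_dir = missing_paths(Path(tmp))
print(len(missing_in_empty_dir))
```

An empty result means every listed path is present; anything returned points at a step in the creation order that was skipped.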
---
## Anti-Patterns to Avoid
**Don't**:
- Create deeply nested directories (max 3 levels)
- Use abbreviations in directory names (`tpl/` instead of `templates/`)
- Mix code and data (keep `data/` separate)
- Put configuration in code (use `.env`)
- Create empty placeholder directories
- Use both `src/` and package name (pick one - we chose package)
- Create `scripts/`, `bin/`, `tools/` directories in V1 (YAGNI)
- Put tests inside application package (keep separate)
**Do**:
- Keep structure flat where possible
- Use standard conventions (Flask, Python, pytest)
- Separate concerns (code, data, tests, docs, static)
- Make structure obvious to newcomers
- Follow principle: "Every directory must justify its existence"
---
## Summary
**Total Top-Level Entries**: 16 (7 directories and 9 files, counting the gitignored `.venv/`, `.env`, and `data/`)
**Total Python Modules**: ~9 in starpunk package
**Total Templates**: 9 HTML/XML files
**Total LOC Estimate**: ~1,500 LOC application + 500 LOC tests = 2,000 total
**Philosophy**: Simple, flat, conventional structure. Every file and directory has a clear purpose. User data is isolated and portable. Documentation is comprehensive and organized.
## References
- [Python Packaging Guide](https://packaging.python.org/)
- [Flask Project Layout](https://flask.palletsprojects.com/en/3.0.x/tutorial/layout/)
- [Pytest Good Practices](https://docs.pytest.org/en/stable/goodpractices.html)
- [PEP 8: Style Guide for Python Code](https://peps.python.org/pep-0008/)
- [ADR-004: File-Based Note Storage](/home/phil/Projects/starpunk/docs/decisions/ADR-004-file-based-note-storage.md)
- [ADR-006: Python Virtual Environment](/home/phil/Projects/starpunk/docs/decisions/ADR-006-python-virtual-environment-uv.md)
- [Documentation Organization Standard](/home/phil/Projects/starpunk/docs/standards/documentation-organization.md)