# Phase 1.2: Data Models Design

## Overview

This document provides a complete, implementation-ready design for Phase 1.2 of the StarPunk V1 implementation plan: Data Models. The models module (`starpunk/models.py`) provides data model classes that wrap database rows and provide clean interfaces for working with notes, sessions, tokens, and authentication state.

**Priority**: CRITICAL - Used by all feature modules
**Estimated Effort**: 3-4 hours
**Dependencies**: `starpunk/utils.py`, `starpunk/database.py`
**File**: `starpunk/models.py`

## Design Principles

1. **Immutability** - Model instances are immutable after creation
2. **Type safety** - Full type hints on all properties and methods
3. **Lazy loading** - Expensive operations (file I/O, HTML rendering) only happen when needed
4. **Clean interfaces** - Properties for data access, methods for operations
5. **No business logic** - Models represent data, not behavior (behavior goes in `notes.py`, `auth.py`)
6. **Testable** - Easy to construct for testing, no hidden dependencies

## Architecture Decision: Dataclasses vs Regular Classes

After evaluating options, we'll use **Python dataclasses with `frozen=True`** for immutability:

**Advantages**:
- Automatic `__init__`, `__repr__`, `__eq__`
- Type hints built-in
- Immutability via `frozen=True`
- Clean, minimal boilerplate
- Standard library (no dependencies)

**Alternatives Considered**:
- Named tuples: Too limited, no methods
- Regular classes: Too much boilerplate
- Pydantic: Overkill, adds dependency

## Module Structure

```python
"""
Data models for StarPunk

This module provides data model classes that wrap database rows and provide
clean interfaces for working with notes, sessions, tokens, and authentication
state.

All models are immutable and use dataclasses for clean structure.
"""

# Standard library imports
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

# Third-party imports
import markdown

# Local imports
from starpunk.utils import (
    read_note_file,
    calculate_content_hash,
    validate_note_path
)

# Constants
DEFAULT_SESSION_EXPIRY_DAYS = 30
DEFAULT_AUTH_STATE_EXPIRY_MINUTES = 5
DEFAULT_TOKEN_EXPIRY_DAYS = 90
MARKDOWN_EXTENSIONS = ['extra', 'codehilite', 'nl2br']

# Model classes (defined below)
```

## Model Specifications

### 1. Note Model

#### Purpose

Represents a note/post with metadata and lazy-loaded content. The Note model:

- Wraps a database row from the `notes` table
- Provides access to all note metadata
- Lazy-loads markdown content from files
- Lazy-renders HTML with caching
- Generates permalinks and extracts metadata

#### Database Schema Reference

```sql
CREATE TABLE notes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    slug TEXT UNIQUE NOT NULL,
    file_path TEXT UNIQUE NOT NULL,
    published BOOLEAN DEFAULT 0,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    content_hash TEXT
);
```

#### Type Signature

```python
@dataclass(frozen=True)
class Note:
    """
    Represents a note/post

    This is an immutable data model that wraps a database row and provides
    access to note metadata and lazy-loaded content. Content is read from
    files on-demand, and HTML rendering is cached.

    Attributes:
        id: Database ID (primary key)
        slug: URL-safe slug (unique)
        file_path: Path to markdown file (relative to data directory)
        published: Whether note is published (visible publicly)
        created_at: Creation timestamp (UTC)
        updated_at: Last update timestamp (UTC)
        content_hash: SHA-256 hash of content (for integrity checking)
        _data_dir: Base data directory path (used for file loading)
        _cached_content: Cached markdown content (lazy-loaded)
        _cached_html: Cached rendered HTML (lazy-loaded)

    Properties:
        content: Markdown content (loaded from file, cached)
        html: Rendered HTML content (cached)
        title: Extracted title (first line or slug)
        excerpt: Short excerpt for previews
        permalink: Public URL path
        is_published: Alias for published (more readable)

    Methods:
        from_row: Create Note from database row
        to_dict: Serialize to dictionary (for JSON)
        verify_integrity: Check if file content matches hash

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM notes WHERE slug = ?", (slug,)).fetchone()
        >>> note = Note.from_row(row, data_dir=Path("data"))

        >>> # Access metadata
        >>> note.slug
        'my-first-note'
        >>> note.published
        True

        >>> # Lazy-load content
        >>> content = note.content  # Reads file on first access
        >>> content = note.content  # Returns cached value on subsequent access

        >>> # Render HTML
        >>> html = note.html  # Renders markdown on first access
        >>> html = note.html  # Returns cached value

        >>> # Extract metadata
        >>> title = note.title
        >>> permalink = note.permalink
    """

    # Core fields from database
    id: int
    slug: str
    file_path: str
    published: bool
    created_at: datetime
    updated_at: datetime

    # Internal field (not from database). Declared before content_hash because
    # dataclass fields without defaults must precede fields with defaults;
    # callers should pass it by keyword.
    _data_dir: Path = field(repr=False, compare=False)

    # Optional field from database
    content_hash: Optional[str] = None

    # Internal caches (not from database)
    _cached_content: Optional[str] = field(default=None, repr=False, compare=False, init=False)
    _cached_html: Optional[str] = field(default=None, repr=False, compare=False, init=False)

    @classmethod
    def from_row(cls, row: dict, data_dir: Path) -> 'Note':
        """
        Create Note instance from database row

        Args:
            row: Database row (sqlite3.Row or dict with column names)
            data_dir: Base data directory path

        Returns:
            Note instance

        Examples:
            >>> row = db.execute("SELECT * FROM notes WHERE id = ?", (1,)).fetchone()
            >>> note = Note.from_row(row, Path("data"))
        """
        pass

    @property
    def content(self) -> str:
        """
        Get note content (lazy-loaded from file)

        Reads markdown content from file on first access, then caches.
        Subsequent accesses return cached value.

        Returns:
            Markdown content as string

        Raises:
            FileNotFoundError: If note file doesn't exist
            OSError: If file cannot be read

        Examples:
            >>> content = note.content
            >>> print(content)
            This is my note content...
        """
        pass

    @property
    def html(self) -> str:
        """
        Get rendered HTML content (lazy-rendered and cached)

        Renders markdown to HTML on first access, then caches. Uses
        Python-Markdown with extensions for code highlighting, tables,
        and other features.

        Returns:
            Rendered HTML as string

        Examples:
            >>> html = note.html
            >>> print(html)
            <p>This is my note content...</p>
        """
        pass

    @property
    def title(self) -> str:
        """
        Extract title from content

        Returns first line of content, or uses slug as fallback.
        Strips markdown heading syntax (# ) if present.

        Returns:
            Title string

        Examples:
            >>> # Content: "# My First Note\n\nContent here..."
            >>> note.title
            'My First Note'

            >>> # Content: "Just a note without heading"
            >>> note.title
            'Just a note without heading'
        """
        pass

    @property
    def excerpt(self) -> str:
        """
        Generate short excerpt for previews

        Returns first 200 characters of content (plain text, no markdown).
        Strips markdown formatting and adds ellipsis if truncated.

        Returns:
            Excerpt string

        Examples:
            >>> note.excerpt
            'This is my note content. It has some interesting points...'
        """
        pass

    @property
    def permalink(self) -> str:
        """
        Generate permalink (public URL path)

        Returns:
            URL path string (e.g., '/note/my-first-note')

        Examples:
            >>> note.permalink
            '/note/my-first-note'
        """
        pass

    @property
    def is_published(self) -> bool:
        """
        Alias for published (more readable)

        Returns:
            True if note is published, False otherwise
        """
        pass

    def to_dict(self, include_content: bool = False, include_html: bool = False) -> dict:
        """
        Serialize note to dictionary

        Converts note to dictionary for JSON serialization or template
        rendering. Can optionally include content and rendered HTML.

        Args:
            include_content: Include markdown content in output
            include_html: Include rendered HTML in output

        Returns:
            Dictionary with note data

        Examples:
            >>> note.to_dict()
            {
                'id': 1,
                'slug': 'my-first-note',
                'title': 'My First Note',
                'published': True,
                'created_at': '2024-11-18T14:30:00Z',
                'updated_at': '2024-11-18T14:30:00Z',
                'permalink': '/note/my-first-note'
            }

            >>> note.to_dict(include_content=True, include_html=True)
            {
                # ... same as above, plus:
                'content': 'Markdown content...',
                'html': '<p>Rendered HTML...</p>'
            }
        """
        pass

    def verify_integrity(self) -> bool:
        """
        Verify content matches stored hash

        Reads content from file, calculates hash, and compares with stored
        content_hash. Used to detect external file modifications.

        Returns:
            True if hash matches, False otherwise

        Examples:
            >>> note.verify_integrity()
            True  # File has not been modified

            >>> # Someone edits file externally
            >>> note.verify_integrity()
            False  # Hash mismatch detected
        """
        pass
```

#### Implementation Details

**Lazy Loading Strategy**:
- `_cached_content` and `_cached_html` are private fields
- Use `object.__setattr__()` to set cached values (frozen dataclass workaround)
- Check if cached value is None before loading

**HTML Rendering**:
- Use `markdown.markdown()` with extensions
- Extensions: `extra` (tables, code blocks), `codehilite` (syntax highlighting), `nl2br` (newlines to `<br>`)
- No sanitization needed (user controls all content)

**Title Extraction**:
- Split content on newlines
- Take first non-empty line
- Strip markdown heading syntax: `# `, `## `, etc.
- Fallback to slug if content is empty

**Excerpt Generation**:
- Remove markdown syntax (simple regex)
- Take first 200 characters
- Add ellipsis if truncated
- Strip to word boundary

**Edge Cases**:
- File doesn't exist → raise FileNotFoundError
- File is empty → return empty string for content
- No heading in content → use first line as title
- Content all whitespace → use slug as title

---

### 2. Session Model

#### Purpose

Represents an authenticated user session for admin access via IndieLogin.

#### Database Schema Reference

```sql
CREATE TABLE sessions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_token TEXT UNIQUE NOT NULL,
    me TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL,
    last_used_at TIMESTAMP
);
```

#### Type Signature

```python
@dataclass(frozen=True)
class Session:
    """
    Represents an authenticated session

    Sessions are created after successful IndieLogin authentication and
    stored in the database. They have a configurable expiry time and can be
    extended on use.

    Attributes:
        id: Database ID (primary key)
        session_token: Unique session token (stored in cookie)
        me: Authenticated user's URL (from IndieLogin)
        created_at: Session creation timestamp (UTC)
        expires_at: Session expiration timestamp (UTC)
        last_used_at: Last activity timestamp (UTC, nullable)

    Properties:
        is_expired: Check if session has expired
        is_active: Check if session is not expired
        age: Age of session (timedelta)
        time_until_expiry: Time remaining until expiry (timedelta)

    Methods:
        from_row: Create Session from database row
        to_dict: Serialize to dictionary
        is_valid: Comprehensive validation (checks expiry, token format)
        with_updated_last_used: Create new session with updated last_used_at

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM sessions WHERE session_token = ?", (token,)).fetchone()
        >>> session = Session.from_row(row)

        >>> # Check if expired
        >>> if session.is_expired:
        ...     print("Session expired")

        >>> # Check validity
        >>> if session.is_valid():
        ...     print("Session is valid")

        >>> # Update last used
        >>> updated_session = session.with_updated_last_used()
    """

    # Core fields from database
    id: int
    session_token: str
    me: str
    created_at: datetime
    expires_at: datetime
    last_used_at: Optional[datetime] = None

    @classmethod
    def from_row(cls, row: dict) -> 'Session':
        """
        Create Session instance from database row

        Args:
            row: Database row (sqlite3.Row or dict)

        Returns:
            Session instance

        Examples:
            >>> row = db.execute("SELECT * FROM sessions WHERE id = ?", (1,)).fetchone()
            >>> session = Session.from_row(row)
        """
        pass

    @property
    def is_expired(self) -> bool:
        """
        Check if session has expired

        Compares expires_at with current UTC time.

        Returns:
            True if expired, False otherwise

        Examples:
            >>> session.is_expired
            False
        """
        pass

    @property
    def is_active(self) -> bool:
        """
        Check if session is active (not expired)

        Returns:
            True if not expired, False otherwise

        Examples:
            >>> session.is_active
            True
        """
        pass

    @property
    def age(self) -> timedelta:
        """
        Get age of session

        Returns:
            Timedelta since session creation

        Examples:
            >>> session.age
            datetime.timedelta(days=2, seconds=3600)
        """
        pass

    @property
    def time_until_expiry(self) -> timedelta:
        """
        Get time remaining until expiry

        Returns:
            Timedelta until expiry (negative if already expired)

        Examples:
            >>> session.time_until_expiry
            datetime.timedelta(days=28)
        """
        pass

    def is_valid(self) -> bool:
        """
        Comprehensive session validation

        Checks:
        - Session is not expired
        - Session token is not empty
        - User 'me' URL is valid

        Returns:
            True if session is valid, False otherwise

        Examples:
            >>> session.is_valid()
            True
        """
        pass

    def with_updated_last_used(self) -> 'Session':
        """
        Create new session with updated last_used_at

        Since sessions are immutable, this returns a new Session instance
        with last_used_at set to current UTC time.

        Returns:
            New Session instance with updated timestamp

        Examples:
            >>> updated = session.with_updated_last_used()
            >>> updated.last_used_at
            datetime.datetime(2024, 11, 18, 15, 30, 0)
        """
        pass

    def to_dict(self) -> dict:
        """
        Serialize session to dictionary

        Returns:
            Dictionary with session data (excludes sensitive token)

        Examples:
            >>> session.to_dict()
            {
                'id': 1,
                'me': 'https://alice.example.com',
                'created_at': '2024-11-18T14:30:00Z',
                'expires_at': '2024-12-18T14:30:00Z',
                'is_active': True
            }
        """
        pass
```

#### Implementation Details

**Expiry Checking**:
- Use `datetime.utcnow()` for current time
- Compare with `expires_at` field
- Return boolean (no exceptions)

**Validation**:
- Check expiry first
- Validate token is not empty
- Validate 'me' URL format (basic check)
- Return False if any check fails

**Immutability**:
- Use dataclass `frozen=True`
- `with_updated_last_used()` returns new instance
- Use `dataclasses.replace()` for creating modified copy

**Edge Cases**:
- Expired session → is_valid() returns False
- Empty token → is_valid() returns False
- Negative time_until_expiry → session expired
- None last_used_at → valid (just created)

---

### 3. Token Model

#### Purpose

Represents a Micropub access token with scope permissions.

#### Database Schema Reference

```sql
CREATE TABLE tokens (
    token TEXT PRIMARY KEY,
    me TEXT NOT NULL,
    client_id TEXT,
    scope TEXT,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP
);
```

#### Type Signature

```python
@dataclass(frozen=True)
class Token:
    """
    Represents a Micropub access token

    Tokens are used to authenticate Micropub API requests. They have
    associated scopes that determine what actions the token can perform.

    Attributes:
        token: Token string (primary key, used in Authorization header)
        me: User's URL
        client_id: Client application URL (optional)
        scope: Space-separated scope string (e.g., "create update delete")
        created_at: Token creation timestamp (UTC)
        expires_at: Token expiration timestamp (UTC, nullable for no expiry)

    Properties:
        is_expired: Check if token has expired
        is_active: Check if token is not expired
        scopes: List of individual scopes

    Methods:
        from_row: Create Token from database row
        to_dict: Serialize to dictionary
        has_scope: Check if token has specific scope
        is_valid: Comprehensive validation

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM tokens WHERE token = ?", (token,)).fetchone()
        >>> token_obj = Token.from_row(row)

        >>> # Check scope
        >>> if token_obj.has_scope('create'):
        ...     print("Can create posts")

        >>> # Check if expired
        >>> if token_obj.is_active:
        ...     print("Token is active")
    """

    # Core fields from database
    token: str
    me: str
    client_id: Optional[str] = None
    scope: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.utcnow())
    expires_at: Optional[datetime] = None

    @classmethod
    def from_row(cls, row: dict) -> 'Token':
        """
        Create Token instance from database row

        Args:
            row: Database row (sqlite3.Row or dict)

        Returns:
            Token instance

        Examples:
            >>> row = db.execute("SELECT * FROM tokens WHERE token = ?", (token,)).fetchone()
            >>> token_obj = Token.from_row(row)
        """
        pass

    @property
    def is_expired(self) -> bool:
        """
        Check if token has expired

        If expires_at is None, token never expires. Otherwise, compares
        with current UTC time.

        Returns:
            True if expired, False otherwise

        Examples:
            >>> token_obj.is_expired
            False
        """
        pass

    @property
    def is_active(self) -> bool:
        """
        Check if token is active (not expired)

        Returns:
            True if not expired, False otherwise
        """
        pass

    @property
    def scopes(self) -> list[str]:
        """
        Get list of individual scopes

        Splits scope string on whitespace.

        Returns:
            List of scope strings

        Examples:
            >>> # scope = "create update delete"
            >>> token_obj.scopes
            ['create', 'update', 'delete']

            >>> # scope = None
            >>> token_obj.scopes
            []
        """
        pass

    def has_scope(self, required_scope: str) -> bool:
        """
        Check if token has required scope

        Args:
            required_scope: Scope to check for (e.g., 'create', 'update')

        Returns:
            True if token has scope, False otherwise

        Examples:
            >>> token_obj.has_scope('create')
            True
            >>> token_obj.has_scope('delete')
            False
        """
        pass

    def is_valid(self, required_scope: Optional[str] = None) -> bool:
        """
        Comprehensive token validation

        Checks:
        - Token is not expired
        - Token string is not empty
        - If required_scope provided, token has that scope

        Args:
            required_scope: Optional scope to check

        Returns:
            True if token is valid, False otherwise

        Examples:
            >>> token_obj.is_valid()
            True
            >>> token_obj.is_valid(required_scope='create')
            True
        """
        pass

    def to_dict(self) -> dict:
        """
        Serialize token to dictionary

        Returns:
            Dictionary with token data (excludes sensitive token value)

        Examples:
            >>> token_obj.to_dict()
            {
                'me': 'https://alice.example.com',
                'client_id': 'https://quill.p3k.io',
                'scope': 'create update',
                'created_at': '2024-11-18T14:30:00Z',
                'is_active': True
            }
        """
        pass
```

#### Implementation Details

**Scope Handling**:
- Scopes stored as space-separated string
- Split on whitespace to get list
- Case-sensitive comparison
- Empty/None scope → empty list

**Expiry Handling**:
- None expires_at → never expires
- Compare with current UTC time
- Return boolean

**Validation**:
- Check expiry first
- Validate token not empty
- If required_scope provided, check has_scope()
- Return False if any check fails

**Edge Cases**:
- None expires_at → is_active returns True
- Empty scope → has_scope() always returns False
- None scope → treated as empty
- Whitespace in scope string → handled by split()

---

### 4. AuthState Model

#### Purpose

Represents a CSRF state token for OAuth authentication flows.
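
As a rough illustration of the lifecycle such a state token goes through, the sketch below issues a token with a 5-minute expiry and then verifies it exactly once. It is not the model defined in this document: `StateRecord`, `issue_state`, `verify_state`, and the in-memory `store` dict are hypothetical stand-ins for the real model and the `auth_state` table, and `secrets.token_urlsafe` is just one reasonable way for a caller to generate the random value.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: mirrors DEFAULT_AUTH_STATE_EXPIRY_MINUTES = 5
STATE_TTL = timedelta(minutes=5)


@dataclass(frozen=True)
class StateRecord:
    """Hypothetical stand-in for a row in the auth_state table."""
    state: str
    created_at: datetime
    expires_at: datetime

    def is_expired(self, now: datetime) -> bool:
        # Expired once the current time reaches expires_at
        return now >= self.expires_at


def issue_state(now: datetime) -> StateRecord:
    """Create a random state token that expires STATE_TTL after `now`."""
    return StateRecord(
        state=secrets.token_urlsafe(32),
        created_at=now,
        expires_at=now + STATE_TTL,
    )


def verify_state(store: dict, presented: str, now: datetime) -> bool:
    """Check a state value from the callback and consume it (single-use)."""
    # pop() removes the record whether or not it is still valid,
    # so the same state can never be accepted twice.
    record = store.pop(presented, None)
    return record is not None and not record.is_expired(now)
```

Passing `now` explicitly keeps the sketch deterministic in tests; a real caller would supply the current UTC time and delete the database row instead of popping from a dict.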

#### Database Schema Reference

```sql
CREATE TABLE auth_state (
    state TEXT PRIMARY KEY,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL
);
```

#### Type Signature

```python
@dataclass(frozen=True)
class AuthState:
    """
    Represents an OAuth state token (CSRF protection)

    State tokens are short-lived (5 minutes) and single-use. They prevent
    CSRF attacks during the IndieLogin authentication flow.

    Attributes:
        state: Random state token string (primary key)
        created_at: Token creation timestamp (UTC)
        expires_at: Token expiration timestamp (UTC)

    Properties:
        is_expired: Check if state token has expired
        is_active: Check if state token is not expired
        age: Age of state token

    Methods:
        from_row: Create AuthState from database row
        to_dict: Serialize to dictionary
        is_valid: Comprehensive validation

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM auth_state WHERE state = ?", (state,)).fetchone()
        >>> auth_state = AuthState.from_row(row)

        >>> # Check if expired
        >>> if auth_state.is_active:
        ...     print("State token is still valid")

        >>> # Check validity
        >>> if auth_state.is_valid():
        ...     print("Valid state token")
    """

    # Core fields from database
    state: str
    created_at: datetime
    expires_at: datetime

    @classmethod
    def from_row(cls, row: dict) -> 'AuthState':
        """
        Create AuthState instance from database row

        Args:
            row: Database row (sqlite3.Row or dict)

        Returns:
            AuthState instance

        Examples:
            >>> row = db.execute("SELECT * FROM auth_state WHERE state = ?", (state,)).fetchone()
            >>> auth_state = AuthState.from_row(row)
        """
        pass

    @property
    def is_expired(self) -> bool:
        """
        Check if state token has expired

        State tokens have short expiry (5 minutes default).

        Returns:
            True if expired, False otherwise

        Examples:
            >>> auth_state.is_expired
            False
        """
        pass

    @property
    def is_active(self) -> bool:
        """
        Check if state token is active (not expired)

        Returns:
            True if not expired, False otherwise
        """
        pass

    @property
    def age(self) -> timedelta:
        """
        Get age of state token

        Returns:
            Timedelta since token creation

        Examples:
            >>> auth_state.age
            datetime.timedelta(seconds=120)
        """
        pass

    def is_valid(self) -> bool:
        """
        Comprehensive state token validation

        Checks:
        - Token is not expired
        - Token string is not empty

        Returns:
            True if valid, False otherwise

        Examples:
            >>> auth_state.is_valid()
            True
        """
        pass

    def to_dict(self) -> dict:
        """
        Serialize state token to dictionary

        Returns:
            Dictionary with state data

        Examples:
            >>> auth_state.to_dict()
            {
                'created_at': '2024-11-18T14:30:00Z',
                'expires_at': '2024-11-18T14:35:00Z',
                'is_active': True
            }
        """
        pass
```

#### Implementation Details

**Expiry Handling**:
- Short expiry: 5 minutes from creation
- Use `datetime.utcnow()` for comparison
- Return boolean

**Validation**:
- Check expiry
- Validate state not empty
- Return False if any check fails

**Security Notes**:
- State tokens are single-use (caller must delete after verification)
- Short expiry prevents replay attacks
- Random generation (handled by caller, not model)

**Edge Cases**:
- Expired state → is_valid() returns False
- Empty state → is_valid() returns False
- Age > expiry → is_expired returns True

---

## Constants

```python
# Session configuration
DEFAULT_SESSION_EXPIRY_DAYS = 30
SESSION_EXTENSION_ON_USE = True

# Auth state configuration
DEFAULT_AUTH_STATE_EXPIRY_MINUTES = 5

# Token configuration
DEFAULT_TOKEN_EXPIRY_DAYS = 90

# Markdown rendering
MARKDOWN_EXTENSIONS = ['extra', 'codehilite', 'nl2br']

# Content limits
MAX_TITLE_LENGTH = 200
EXCERPT_LENGTH = 200
```

## Error Handling

### Exception Types

Models use standard Python exceptions:

- **FileNotFoundError**: Note file doesn't exist (Note.content property)
- **OSError**: File read error (Note.content property)
- **ValueError**: Invalid data (from_row with malformed data)
- **TypeError**: Wrong type passed to method

### Error Messages

```python
# Good
raise FileNotFoundError(
    f"Note file not found: {file_path}. "
    f"Database and filesystem may be out of sync."
)

raise ValueError(
    f"Invalid datetime format for created_at: {row['created_at']}"
)

# Bad
raise FileNotFoundError("File not found")
raise ValueError("Bad data")
```

## Testing Strategy

### Test Coverage Requirements

- Minimum 90% code coverage for models.py
- Test all model creation methods
- Test all properties and methods
- Test edge cases and error conditions
- Test lazy loading behavior
- Test caching behavior

### Test Organization

File: `tests/test_models.py`

```python
"""
Tests for data models

Organized by model:
- Note model tests
- Session model tests
- Token model tests
- AuthState model tests
"""

import pytest
from datetime import datetime, timedelta
from pathlib import Path
from starpunk.models import Note, Session, Token, AuthState


class TestNoteModel:
    """Test Note model"""

    def test_from_row(self):
        """Test creating Note from database row"""
        row = {
            'id': 1,
            'slug': 'test-note',
            'file_path': 'notes/2024/11/test-note.md',
            'published': True,
            'created_at': datetime(2024, 11, 18, 14, 30),
            'updated_at': datetime(2024, 11, 18, 14, 30),
            'content_hash': 'abc123...'
        }
        note = Note.from_row(row, Path('data'))
        assert note.slug == 'test-note'
        assert note.published is True

    def test_content_lazy_loading(self, tmp_path):
        """Test content is lazy-loaded from file"""
        # Create test note file
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('# Test Note\n\nContent here.')

        # Create note instance
        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        # Content should be loaded on first access
        content = note.content
        assert '# Test Note' in content

        # Second access should return cached value
        content2 = note.content
        assert content2 == content

    def test_html_rendering(self, tmp_path):
        """Test HTML rendering with caching"""
        # Create test note
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('# Heading\n\nParagraph here.')

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        html = note.html
        assert '<h1>' in html or '<h1 ' in html  # Depends on markdown extensions
        assert '<p>Paragraph here.</p>' in html

    def test_title_extraction(self, tmp_path):
        """Test title extraction from content"""
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('# My Note Title\n\nContent.')

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        assert note.title == 'My Note Title'

    def test_title_fallback_to_slug(self):
        """Test title falls back to slug if no heading"""
        # Note with no file (will fail to load content)
        note = Note(
            id=1,
            slug='my-test-note',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=Path('/nonexistent')
        )

        # Should fall back to slug
        # (actual implementation may vary)
        # This tests the fallback logic

    def test_permalink(self):
        """Test permalink generation"""
        note = Note(
            id=1,
            slug='my-note',
            file_path='notes/2024/11/my-note.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=Path('data')
        )

        assert note.permalink == '/note/my-note'

    def test_to_dict(self):
        """Test serialization to dictionary"""
        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime(2024, 11, 18, 14, 30),
            updated_at=datetime(2024, 11, 18, 14, 30),
            _data_dir=Path('data')
        )

        data = note.to_dict()
        assert data['slug'] == 'test'
        assert data['published'] is True
        assert 'content' not in data  # Not included by default

    def test_to_dict_with_content(self, tmp_path):
        """Test serialization includes content when requested"""
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('Test content')

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        data = note.to_dict(include_content=True)
        assert 'content' in data
        assert data['content'] == 'Test content'

    def test_verify_integrity(self, tmp_path):
        """Test content integrity verification"""
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        content = 'Test content'
        note_file.write_text(content)

        # Calculate correct hash
        from starpunk.utils import calculate_content_hash
        content_hash = calculate_content_hash(content)

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            content_hash=content_hash,
            _data_dir=tmp_path
        )

        # Should verify successfully
        assert note.verify_integrity() is True

        # Modify file
        note_file.write_text('Modified content')

        # Should fail verification
        assert note.verify_integrity() is False


class TestSessionModel:
    """Test Session model"""

    def test_from_row(self):
        """Test creating Session from database row"""
        row = {
            'id': 1,
            'session_token': 'abc123',
            'me': 'https://alice.example.com',
            'created_at': datetime(2024, 11, 18, 14, 30),
            'expires_at': datetime(2024, 12, 18, 14, 30),
            'last_used_at': None
        }
        session = Session.from_row(row)
        assert session.session_token == 'abc123'
        assert session.me == 'https://alice.example.com'

    def test_is_expired_false(self):
        """Test is_expired returns False for active session"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(days=30)
        )
        assert session.is_expired is False
        assert session.is_active is True

    def test_is_expired_true(self):
        """Test is_expired returns True for expired session"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow() - timedelta(days=31),
            expires_at=datetime.utcnow() - timedelta(days=1)
        )
        assert session.is_expired is True
        assert session.is_active is False

    def test_is_valid(self):
        """Test comprehensive validation"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(days=30)
        )
        assert session.is_valid() is True

    def test_is_valid_expired(self):
        """Test validation fails for expired session"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow() - timedelta(days=31),
            expires_at=datetime.utcnow() - timedelta(days=1)
        )
        assert session.is_valid() is False

    def test_with_updated_last_used(self):
        """Test creating session with updated timestamp"""
        original = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(days=30),
            last_used_at=None
        )

        updated = original.with_updated_last_used()
        assert updated.last_used_at is not None
        assert updated.session_token == original.session_token

    def test_age(self):
        """Test age calculation"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow() - timedelta(hours=2),
            expires_at=datetime.utcnow() + timedelta(days=30)
        )
        age = session.age
        assert age.total_seconds() >= 7200  # At least 2 hours

    def test_to_dict(self):
        """Test serialization to dictionary"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime(2024, 11, 18, 14, 30),
            expires_at=datetime(2024, 12, 18, 14, 30)
        )
        data = session.to_dict()
        assert 'me' in data
        assert 'session_token' not in data  # Excluded for security


class TestTokenModel:
    """Test Token model"""

    def test_from_row(self):
        """Test creating Token from database row"""
        row = {
            'token': 'xyz789',
            'me': 'https://alice.example.com',
            'client_id': 'https://quill.p3k.io',
            'scope': 'create update',
            'created_at': datetime(2024, 11, 18, 14, 30),
            'expires_at': None
        }
        token = Token.from_row(row)
        assert token.token == 'xyz789'
        assert token.scope == 'create update'

    def test_scopes_property(self):
        """Test scope parsing"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create update delete'
        )
        assert token.scopes == ['create', 'update', 'delete']

    def test_scopes_empty(self):
        """Test empty scope"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope=None
        )
        assert token.scopes == []

    def test_has_scope(self):
        """Test scope checking"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create update'
        )
        assert token.has_scope('create') is True
        assert token.has_scope('update') is True
        assert token.has_scope('delete') is False

    def test_is_expired_never_expires(self):
        """Test token with no expiry"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            expires_at=None
        )
        assert token.is_expired is False
        assert token.is_active is True

    def test_is_expired_with_expiry(self):
        """Test token expiry"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            expires_at=datetime.utcnow() - timedelta(days=1)
        )
        assert token.is_expired is True
        assert token.is_active is False

    def test_is_valid(self):
        """Test validation"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create'
        )
        assert token.is_valid() is True

    def test_is_valid_with_required_scope(self):
        """Test validation with scope requirement"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create update'
        )
        assert token.is_valid(required_scope='create') is True
        assert token.is_valid(required_scope='delete') is False


class TestAuthStateModel:
    """Test AuthState model"""

    def test_from_row(self):
        """Test creating AuthState from database row"""
        row = {
            'state': 'random123',
            'created_at': datetime(2024, 11, 18, 14, 30),
            'expires_at': datetime(2024, 11, 18, 14, 35)
        }
        auth_state = AuthState.from_row(row)
        assert auth_state.state == 'random123'

    def test_is_expired(self):
        """Test expiry checking"""
        # Active state
        auth_state = AuthState(
            state='random123',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(minutes=5)
        )
        assert auth_state.is_expired is False
        assert auth_state.is_active is True

        # Expired state
        expired = AuthState(
            state='random123',
            created_at=datetime.utcnow() - timedelta(minutes=10),
            expires_at=datetime.utcnow() - timedelta(minutes=5)
        )
        assert expired.is_expired is True
        assert expired.is_active is False

    def test_is_valid(self):
        """Test validation"""
        auth_state = AuthState(
            state='random123',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(minutes=5)
        )
        assert auth_state.is_valid() is True

    def test_age(self):
        """Test age calculation"""
        auth_state = AuthState(
            state='random123',
            created_at=datetime.utcnow() - timedelta(minutes=2),
            expires_at=datetime.utcnow() + timedelta(minutes=3)
        )
        age = auth_state.age
        assert age.total_seconds() >= 120  # At least 2 minutes
```

## Usage Examples

### Creating and Using a Note

```python
from pathlib import Path
from starpunk.models import Note
from starpunk.database import get_db

# Get note from database
db = get_db(app)
row = db.execute("SELECT * FROM notes WHERE slug = ?", ("my-note",)).fetchone()

# Create Note instance
note = Note.from_row(row, data_dir=Path("data"))

# Access metadata (no file I/O)
print(note.slug)        # "my-note"
print(note.published)   # True
print(note.created_at)  # datetime object

# Lazy-load content (reads file on first access)
content = note.content  # File I/O happens here
print(content)          # "# My Note\n\nContent here..."

# Render HTML (uses cached content, renders on first access)
html = note.html  # Markdown rendering happens here
print(html)       # "<h1>My Note</h1>\n<p>Content here...</p>"

# Extract metadata
title = note.title          # "My Note"
permalink = note.permalink  # "/note/my-note"
excerpt = note.excerpt      # "Content here..."

# Serialize for templates
data = note.to_dict(include_content=True, include_html=True)
# Use in template: render_template('note.html', note=data)
```

### Validating a Session

```python
# Fragment from inside a request handler: `app`, `g`, and `session_token`
# come from the surrounding code.
from starpunk.models import Session
from starpunk.database import get_db

# Get session from database
db = get_db(app)
row = db.execute(
    "SELECT * FROM sessions WHERE session_token = ?",
    (session_token,)
).fetchone()

if row is None:
    # Session not found
    return False

# Create Session instance
session = Session.from_row(row)

# Validate session
if not session.is_valid():
    # Session expired or invalid
    return False

# Update last used timestamp
updated_session = session.with_updated_last_used()

# Save to database
db.execute(
    "UPDATE sessions SET last_used_at = ? WHERE id = ?",
    (updated_session.last_used_at, updated_session.id)
)
db.commit()

# Session is valid, store user info
g.user_me = session.me
```

### Checking Token Scope

```python
# Fragment from inside a request handler: `app` and `request` come from
# the surrounding code.
from starpunk.models import Token
from starpunk.database import get_db

# Get token from Authorization header
auth_header = request.headers.get('Authorization', '')
if not auth_header.startswith('Bearer '):
    return {'error': 'unauthorized'}, 401

token_value = auth_header[7:]  # Remove "Bearer "

# Get from database
db = get_db(app)
row = db.execute("SELECT * FROM tokens WHERE token = ?", (token_value,)).fetchone()

if row is None:
    return {'error': 'invalid_token'}, 401

# Create Token instance
token = Token.from_row(row)

# Validate with required scope
if not token.is_valid(required_scope='create'):
    return {'error': 'insufficient_scope'}, 403

# Token is valid with required scope
# Proceed with request
```

### Verifying Auth State

```python
# Fragment from inside a request handler: `app` and `request` come from
# the surrounding code.
from starpunk.models import AuthState
from starpunk.database import get_db

# Get state from callback parameter
state_param = request.args.get('state')
if not state_param:
    return {'error': 'missing_state'}, 400

# Get from database
db = get_db(app)
row = db.execute("SELECT * FROM auth_state WHERE state = ?", (state_param,)).fetchone()

if row is None:
    return {'error': 'invalid_state'}, 400

# Create AuthState instance
auth_state = AuthState.from_row(row)

# Validate (checks expiry)
if not auth_state.is_valid():
    return {'error': 'expired_state'}, 400

# Delete state (single-use)
db.execute("DELETE FROM auth_state WHERE state = ?", (state_param,))
db.commit()

# State is valid, continue OAuth flow
```

## Integration with Other Modules

### Integration with notes.py

```python
# In starpunk/notes.py
from pathlib import Path
from typing import Optional

from starpunk.models import Note
from starpunk.database import get_db
from starpunk.utils import generate_slug, make_slug_unique


def get_note(slug: str, data_dir: Path) -> Optional[Note]:
    """Get note by slug"""
    db = get_db(app)
    row = db.execute(
        "SELECT * FROM notes WHERE slug = ?",
        (slug,)
    ).fetchone()

    if row is None:
        return None

    return Note.from_row(row, data_dir=data_dir)


# data_dir is keyword-only: a parameter without a default cannot follow
# one with a default in a positional signature.
def list_notes(published_only: bool = True, *, data_dir: Path) -> list[Note]:
    """List notes"""
    db = get_db(app)

    query = "SELECT * FROM notes"
    if published_only:
        query += " WHERE published = 1"
    query += " ORDER BY created_at DESC"

    rows = db.execute(query).fetchall()
    return [Note.from_row(row, data_dir=data_dir) for row in rows]
```

### Integration with auth.py

```python
# In starpunk/auth.py
from typing import Optional

from starpunk.models import Session, Token, AuthState
from starpunk.database import get_db


def validate_session(session_token: str) -> Optional[Session]:
    """Validate session token"""
    db = get_db(app)
    row = db.execute(
        "SELECT * FROM sessions WHERE session_token = ?",
        (session_token,)
    ).fetchone()

    if row is None:
        return None

    session = Session.from_row(row)

    if not session.is_valid():
        return None

    # Update last used
    updated = session.with_updated_last_used()
    db.execute(
        "UPDATE sessions SET last_used_at = ? WHERE id = ?",
        (updated.last_used_at, updated.id)
    )
    db.commit()

    return updated


def validate_micropub_token(token_value: str, required_scope: str) -> Optional[Token]:
    """Validate Micropub token"""
    db = get_db(app)
    row = db.execute(
        "SELECT * FROM tokens WHERE token = ?",
        (token_value,)
    ).fetchone()

    if row is None:
        return None

    token = Token.from_row(row)

    if not token.is_valid(required_scope=required_scope):
        return None

    return token
```

## Performance Considerations

### Lazy Loading Benefits

- **Note content**: Only loaded when accessed, not on model creation
- **HTML rendering**: Only rendered when accessed, cached afterward
- **Memory efficiency**: Can create many Note instances without loading all content
- **Database-only operations**: Can list notes without reading files

### Caching Strategy

- **Content caching**: First `note.content` access reads file, caches in `_cached_content`
- **HTML caching**: First `note.html` access renders markdown, caches in `_cached_html`
- **Cache invalidation**: Not needed (models are immutable, represent point-in-time)

### Performance Targets

- **Model creation**: < 1ms (just data assignment)
- **`from_row()`**: < 1ms (datetime parsing is fast)
- **Content loading**: < 5ms for typical note
- **HTML rendering**: < 10ms for typical note
- **Property access**: < 0.1ms (cached)

## Security Considerations

### Session Security

- **Token exposure**: Never include `session_token` in `to_dict()` output
- **Expiry enforcement**: Always check `is_expired` before using session
- **Last used tracking**: Update `last_used_at` on every use
- **Secure comparison**: Use constant-time comparison for tokens (in `auth.py`, not model)

### Token Security

- **Scope validation**: Always validate required scopes
- **Expiry checking**: Check expiry before accepting token
- **Token exposure**: Exclude token value from `to_dict()` output
- **Scope parsing**: Handle malformed scope strings gracefully

### File Path Security

- **Path validation**: Note model doesn't validate
  paths (caller must use `validate_note_path`)
- **File reading**: Let exceptions propagate (`FileNotFoundError`, `OSError`)
- **No writes**: Models are read-only, never write files

### State Token Security

- **Single-use**: Caller must delete state after verification
- **Short expiry**: 5-minute expiry narrows the window for replay attacks
- **Expiry enforcement**: Always check `is_expired` before using

## Dependencies

### Standard Library

- `dataclasses` - Dataclass decorator and utilities
- `datetime` - Datetime and timedelta
- `pathlib` - Path operations
- `typing` - Type hints

### Third-Party

- `markdown` - Markdown to HTML rendering

### Internal

- `starpunk.utils` - File operations, content hashing
- `starpunk.database` - Not imported (models are independent of DB)

## Future Enhancements (V2+)

### Note Tags

```python
@dataclass(frozen=True)
class Note:
    # ... existing fields
    tags: list[str] = field(default_factory=list)

    @property
    def tag_string(self) -> str:
        """Get comma-separated tag string"""
        return ', '.join(self.tags)
```

### Note Replies/Comments

```python
@dataclass(frozen=True)
class Note:
    # ... existing fields
    in_reply_to: Optional[str] = None  # URL being replied to

    @property
    def is_reply(self) -> bool:
        return self.in_reply_to is not None
```

### Media Attachments

```python
@dataclass(frozen=True)
class Media:
    """Represents a media attachment"""
    id: int
    filename: str
    file_path: str
    mime_type: str
    created_at: datetime
    note_id: Optional[int] = None
```

### Webmentions

```python
@dataclass(frozen=True)
class Webmention:
    """Represents a received webmention"""
    id: int
    source: str
    target: str
    verified: bool
    created_at: datetime
    note_id: Optional[int] = None
```

## Acceptance Criteria

- [ ] All four models implemented (Note, Session, Token, AuthState)
- [ ] All models use frozen dataclasses
- [ ] All models have `from_row()` class method
- [ ] All models have `to_dict()` method
- [ ] Note model implements lazy loading for content
- [ ] Note model implements HTML rendering with caching
- [ ] Note model extracts title and excerpt
- [ ] Session model validates expiry
- [ ] Token model validates scopes
- [ ] AuthState model validates expiry
- [ ] All properties have type hints
- [ ] All methods have comprehensive docstrings
- [ ] Test coverage >90%
- [ ] All tests pass
- [ ] Code formatted with Black
- [ ] Code passes flake8 linting
- [ ] No security issues
- [ ] Integration examples work

## References

- [ADR-004: File-Based Note Storage](/home/phil/Projects/starpunk/docs/decisions/ADR-004-file-based-note-storage.md)
- [Database Schema](/home/phil/Projects/starpunk/starpunk/database.py)
- [Phase 1.1: Core Utilities](/home/phil/Projects/starpunk/docs/design/phase-1.1-core-utilities.md)
- [Python Coding Standards](/home/phil/Projects/starpunk/docs/standards/python-coding-standards.md)
- [Python Dataclasses Documentation](https://docs.python.org/3/library/dataclasses.html)
- [Python-Markdown Documentation](https://python-markdown.github.io/)

## Implementation Checklist

When implementing `starpunk/models.py`, complete in this order:

1. [ ] Create file with module docstring
2. [ ] Add imports and constants
3. [ ] Implement Note model
   - [ ] Basic dataclass structure
   - [ ] from_row() class method
   - [ ] content property (lazy loading)
   - [ ] html property (rendering + caching)
   - [ ] title property
   - [ ] excerpt property
   - [ ] permalink property
   - [ ] to_dict() method
   - [ ] verify_integrity() method
4. [ ] Implement Session model
   - [ ] Basic dataclass structure
   - [ ] from_row() class method
   - [ ] is_expired property
   - [ ] is_valid() method
   - [ ] with_updated_last_used() method
   - [ ] to_dict() method
5. [ ] Implement Token model
   - [ ] Basic dataclass structure
   - [ ] from_row() class method
   - [ ] scopes property
   - [ ] has_scope() method
   - [ ] is_valid() method
   - [ ] to_dict() method
6. [ ] Implement AuthState model
   - [ ] Basic dataclass structure
   - [ ] from_row() class method
   - [ ] is_expired property
   - [ ] is_valid() method
   - [ ] to_dict() method
7. [ ] Create tests/test_models.py
8. [ ] Write tests for all models
9. [ ] Run tests and achieve >90% coverage
10. [ ] Format with Black
11. [ ] Lint with flake8
12. [ ] Review all docstrings
13. [ ] Test integration with utils.py

**Estimated Time**: 3-4 hours for implementation + tests
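
### Implementation Note: Caching in a Frozen Dataclass

The caching strategy described under Performance Considerations has to coexist with `frozen=True`, which blocks ordinary attribute assignment. The following is a minimal sketch of one way that could work, not the actual `Note` implementation. It assumes the cache lives in a non-init field named `_cached_content` and is written through `object.__setattr__`. The `LazyNote` class and the `demo()` helper are hypothetical names used only for illustration.

```python
import tempfile
from dataclasses import FrozenInstanceError, dataclass, field
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class LazyNote:
    """Hypothetical stand-in for Note, showing only the lazy content cache."""

    slug: str
    file_path: Path
    # Excluded from __init__, __repr__ and __eq__ so it stays an internal detail.
    _cached_content: Optional[str] = field(
        default=None, init=False, repr=False, compare=False
    )

    @property
    def content(self) -> str:
        """Read the file on first access, then serve the cached text."""
        if self._cached_content is None:
            text = self.file_path.read_text(encoding="utf-8")
            # frozen=True blocks normal assignment; object.__setattr__ bypasses it.
            object.__setattr__(self, "_cached_content", text)
        return self._cached_content


def demo() -> tuple[str, str, bool]:
    """Return (first read, second read after file removal, frozen check)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "my-note.md"
        path.write_text("# My Note\n\nContent here...", encoding="utf-8")
        note = LazyNote(slug="my-note", file_path=path)
        first = note.content
        path.unlink()  # Remove the file: a second read must come from the cache.
        second = note.content
    try:
        note.slug = "other"  # Public fields remain immutable.
        frozen = False
    except FrozenInstanceError:
        frozen = True
    return first, second, frozen
```

Because the cache field is excluded from comparison, two instances built from the same row still compare equal whether or not either one has loaded its content. The trade-off is that the instance is only immutable from the caller's point of view: internally it mutates once, on first access.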