StarPunk/docs/design/phase-1.2-data-models.md
2025-11-18 19:21:31 -07:00

Phase 1.2: Data Models Design

Overview

This document provides a complete, implementation-ready design for Phase 1.2 of the StarPunk V1 implementation plan: Data Models. The models module (starpunk/models.py) provides data model classes that wrap database rows and provide clean interfaces for working with notes, sessions, tokens, and authentication state.

Priority: CRITICAL - Used by all feature modules
Estimated Effort: 3-4 hours
Dependencies: starpunk/utils.py, starpunk/database.py
File: starpunk/models.py

Design Principles

  1. Immutability - Model instances are immutable after creation
  2. Type safety - Full type hints on all properties and methods
  3. Lazy loading - Expensive operations (file I/O, HTML rendering) only happen when needed
  4. Clean interfaces - Properties for data access, methods for operations
  5. No business logic - Models represent data, not behavior (behavior goes in notes.py, auth.py)
  6. Testable - Easy to construct for testing, no hidden dependencies

Architecture Decision: Dataclasses vs Regular Classes

After evaluating options, we'll use Python dataclasses with frozen=True for immutability:

Advantages:

  • Automatic __init__, __repr__, __eq__
  • Type hints built-in
  • Immutability via frozen=True
  • Clean, minimal boilerplate
  • Standard library (no dependencies)

Alternatives Considered:

  • Named tuples: Too limited, no methods
  • Regular classes: Too much boilerplate
  • Pydantic: Overkill, adds dependency
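
As a minimal sketch of what frozen=True provides, the snippet below uses a hypothetical Point class (not part of StarPunk) to show the generated __eq__ and __repr__ and the error raised on attempted mutation:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    """Hypothetical stand-in for a model class; not part of StarPunk."""
    x: int
    y: int = 0


p = Point(1, 2)
q = Point(1, 2)

# __eq__ and __repr__ are generated from the field definitions
same = (p == q)
text = repr(p)

# Assigning to a field on a frozen instance raises FrozenInstanceError
try:
    p.x = 10
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the generated __init__ is the only place fields are set, tests can construct models directly with keyword arguments and no database.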

Module Structure

"""
Data models for StarPunk

This module provides data model classes that wrap database rows and provide
clean interfaces for working with notes, sessions, tokens, and authentication
state. All models are immutable and use dataclasses for clean structure.
"""

# Standard library imports
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

# Third-party imports
import markdown

# Local imports
from starpunk.utils import (
    read_note_file,
    calculate_content_hash,
    validate_note_path
)

# Constants
DEFAULT_SESSION_EXPIRY_DAYS = 30
DEFAULT_AUTH_STATE_EXPIRY_MINUTES = 5
DEFAULT_TOKEN_EXPIRY_DAYS = 90
MARKDOWN_EXTENSIONS = ['extra', 'codehilite', 'nl2br']

# Model classes (defined below)

Model Specifications

1. Note Model

Purpose

Represents a note/post with metadata and lazy-loaded content. The Note model:

  • Wraps a database row from the notes table
  • Provides access to all note metadata
  • Lazy-loads markdown content from files
  • Lazy-renders HTML with caching
  • Generates permalinks and extracts metadata

Database Schema Reference

CREATE TABLE notes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    slug TEXT UNIQUE NOT NULL,
    file_path TEXT UNIQUE NOT NULL,
    published BOOLEAN DEFAULT 0,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    content_hash TEXT
);

Type Signature

@dataclass(frozen=True)
class Note:
    """
    Represents a note/post

    This is an immutable data model that wraps a database row and provides
    access to note metadata and lazy-loaded content. Content is read from
    files on-demand, and HTML rendering is cached.

    Attributes:
        id: Database ID (primary key)
        slug: URL-safe slug (unique)
        file_path: Path to markdown file (relative to data directory)
        published: Whether note is published (visible publicly)
        created_at: Creation timestamp (UTC)
        updated_at: Last update timestamp (UTC)
        content_hash: SHA-256 hash of content (for integrity checking)
        _data_dir: Base data directory path (used for file loading)
        _cached_content: Cached markdown content (lazy-loaded)
        _cached_html: Cached rendered HTML (lazy-loaded)

    Properties:
        content: Markdown content (loaded from file, cached)
        html: Rendered HTML content (cached)
        title: Extracted title (first line or slug)
        excerpt: Short excerpt for previews
        permalink: Public URL path
        is_published: Alias for published (more readable)

    Methods:
        from_row: Create Note from database row
        to_dict: Serialize to dictionary (for JSON)
        verify_integrity: Check if file content matches hash

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM notes WHERE slug = ?", (slug,)).fetchone()
        >>> note = Note.from_row(row, data_dir=Path("data"))

        >>> # Access metadata
        >>> print(note.slug)
        'my-first-note'
        >>> print(note.published)
        True

        >>> # Lazy-load content
        >>> content = note.content  # Reads file on first access
        >>> content = note.content  # Returns cached value on subsequent access

        >>> # Render HTML
        >>> html = note.html  # Renders markdown on first access
        >>> html = note.html  # Returns cached value

        >>> # Extract metadata
        >>> title = note.title
        >>> permalink = note.permalink
    """
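    # Core fields from database (required)
    id: int
    slug: str
    file_path: str
    published: bool
    created_at: datetime
    updated_at: datetime

    # Internal field (not from database). It has no default, so it must be
    # declared before any field with a default; otherwise the dataclass
    # decorator raises TypeError at class creation.
    _data_dir: Path = field(repr=False, compare=False)

    # Optional database field
    content_hash: Optional[str] = None

    # Internal cache fields (not from database, excluded from __init__)
    _cached_content: Optional[str] = field(default=None, repr=False, compare=False, init=False)
    _cached_html: Optional[str] = field(default=None, repr=False, compare=False, init=False)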

    @classmethod
    def from_row(cls, row: dict, data_dir: Path) -> 'Note':
        """
        Create Note instance from database row

        Args:
            row: Database row (sqlite3.Row or dict with column names)
            data_dir: Base data directory path

        Returns:
            Note instance

        Examples:
            >>> row = db.execute("SELECT * FROM notes WHERE id = ?", (1,)).fetchone()
            >>> note = Note.from_row(row, Path("data"))
        """
        pass

    @property
    def content(self) -> str:
        """
        Get note content (lazy-loaded from file)

        Reads markdown content from file on first access, then caches.
        Subsequent accesses return cached value.

        Returns:
            Markdown content as string

        Raises:
            FileNotFoundError: If note file doesn't exist
            OSError: If file cannot be read

        Examples:
            >>> content = note.content
            >>> print(content)
            This is my note content...
        """
        pass

    @property
    def html(self) -> str:
        """
        Get rendered HTML content (lazy-rendered and cached)

        Renders markdown to HTML on first access, then caches.
        Uses Python-Markdown with extensions for code highlighting,
        tables, and other features.

        Returns:
            Rendered HTML as string

        Examples:
            >>> html = note.html
            >>> print(html)
            <p>This is my note content...</p>
        """
        pass

    @property
    def title(self) -> str:
        """
        Extract title from content

        Returns first line of content, or uses slug as fallback.
        Strips markdown heading syntax (# ) if present.

        Returns:
            Title string

        Examples:
            >>> # Content: "# My First Note\n\nContent here..."
            >>> note.title
            'My First Note'

            >>> # Content: "Just a note without heading"
            >>> note.title
            'Just a note without heading'
        """
        pass

    @property
    def excerpt(self) -> str:
        """
        Generate short excerpt for previews

        Returns first 200 characters of content (plain text, no markdown).
        Strips markdown formatting and adds ellipsis if truncated.

        Returns:
            Excerpt string

        Examples:
            >>> note.excerpt
            'This is my note content. It has some interesting points...'
        """
        pass

    @property
    def permalink(self) -> str:
        """
        Generate permalink (public URL path)

        Returns:
            URL path string (e.g., '/note/my-first-note')

        Examples:
            >>> note.permalink
            '/note/my-first-note'
        """
        pass

    @property
    def is_published(self) -> bool:
        """
        Alias for published (more readable)

        Returns:
            True if note is published, False otherwise
        """
        pass

    def to_dict(self, include_content: bool = False, include_html: bool = False) -> dict:
        """
        Serialize note to dictionary

        Converts note to dictionary for JSON serialization or template rendering.
        Can optionally include content and rendered HTML.

        Args:
            include_content: Include markdown content in output
            include_html: Include rendered HTML in output

        Returns:
            Dictionary with note data

        Examples:
            >>> note.to_dict()
            {
                'id': 1,
                'slug': 'my-first-note',
                'title': 'My First Note',
                'published': True,
                'created_at': '2024-11-18T14:30:00Z',
                'updated_at': '2024-11-18T14:30:00Z',
                'permalink': '/note/my-first-note'
            }

            >>> note.to_dict(include_content=True, include_html=True)
            {
                # ... same as above, plus:
                'content': 'Markdown content...',
                'html': '<p>Rendered HTML...</p>'
            }
        """
        pass

    def verify_integrity(self) -> bool:
        """
        Verify content matches stored hash

        Reads content from file, calculates hash, and compares with
        stored content_hash. Used to detect external file modifications.

        Returns:
            True if hash matches, False otherwise

        Examples:
            >>> note.verify_integrity()
            True  # File has not been modified

            >>> # Someone edits file externally
            >>> note.verify_integrity()
            False  # Hash mismatch detected
        """
        pass

Implementation Details

Lazy Loading Strategy:

  • _cached_content and _cached_html are private fields
  • Use object.__setattr__() to set cached values (frozen dataclass workaround)
  • Check if cached value is None before loading
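
The caching pattern described above can be sketched as follows. LazyText and its upper property are hypothetical and stand in for Note.content and Note.html; only the object.__setattr__() technique is the point here:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class LazyText:
    """Hypothetical model illustrating the caching pattern only."""
    source: str
    # Excluded from __init__, repr, and equality so it stays an internal detail
    _cached_upper: Optional[str] = field(
        default=None, init=False, repr=False, compare=False
    )

    @property
    def upper(self) -> str:
        # Compute only when nothing has been cached yet
        if self._cached_upper is None:
            # frozen=True blocks normal assignment, so bypass it explicitly
            object.__setattr__(self, "_cached_upper", self.source.upper())
        return self._cached_upper
```

The instance still rejects ordinary attribute assignment, so the cache is the only state that changes after construction.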

HTML Rendering:

  • Use markdown.markdown() with extensions
  • Extensions: extra (tables, code blocks), codehilite (syntax highlighting), nl2br (newlines to <br>)
  • No sanitization needed (user controls all content)

Title Extraction:

  • Split content on newlines
  • Take first non-empty line
  • Strip markdown heading syntax: # , ## , etc.
  • Fallback to slug if content is empty
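
A minimal sketch of these steps, assuming a standalone helper named extract_title (hypothetical; in the design this logic lives in the Note.title property):

```python
import re


def extract_title(content: str, slug: str) -> str:
    """Return the first non-empty line without heading markers, else the slug."""
    for line in content.splitlines():
        stripped = line.strip()
        if stripped:
            # Strip one to six leading '#' characters followed by whitespace
            title = re.sub(r"^#{1,6}\s+", "", stripped).strip()
            return title or slug
    # Empty or whitespace-only content has no usable line
    return slug
```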

Excerpt Generation:

  • Remove markdown syntax (simple regex)
  • Take first 200 characters
  • If truncated, trim back to the last word boundary
  • Add ellipsis if truncated
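
One possible shape for this, assuming a hypothetical helper named make_excerpt. The regular expressions are illustrative and deliberately simple; they do not cover all markdown syntax (for example, they also strip underscores inside ordinary words):

```python
import re

EXCERPT_LENGTH = 200  # mirrors the constant listed in this design


def make_excerpt(content: str, length: int = EXCERPT_LENGTH) -> str:
    """Rough plain-text excerpt; the regexes are illustrative, not exhaustive."""
    # Remove heading markers at the start of lines
    text = re.sub(r"^#{1,6}\s+", "", content, flags=re.MULTILINE)
    # Turn [text](url) links into just their text
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Drop emphasis and inline-code markers
    text = re.sub(r"[*_`]", "", text)
    # Collapse all runs of whitespace, including newlines
    text = " ".join(text.split())
    if len(text) <= length:
        return text
    cut = text[:length]
    # Back up to the last space so a word is not split, if there is one
    if " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut + "..."
```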

Edge Cases:

  • File doesn't exist → raise FileNotFoundError
  • File is empty → return empty string for content
  • No heading in content → use first line as title
  • Content all whitespace → use slug as title
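
For verify_integrity, the comparison against content_hash might look like the sketch below. It uses hashlib directly because the exact behavior of starpunk.utils.calculate_content_hash is not specified in this document; the helper names, the UTF-8 encoding, and the treatment of a missing hash as a failed check are all assumptions:

```python
import hashlib
from typing import Optional


def sha256_hex(content: str) -> str:
    """SHA-256 of the UTF-8 encoded content, as a lowercase hex string."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def matches_hash(content: str, stored_hash: Optional[str]) -> bool:
    # Assumption: a note without a stored hash cannot be verified
    if stored_hash is None:
        return False
    return sha256_hex(content) == stored_hash
```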

2. Session Model

Purpose

Represents an authenticated user session for admin access via IndieLogin.

Database Schema Reference

CREATE TABLE sessions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_token TEXT UNIQUE NOT NULL,
    me TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL,
    last_used_at TIMESTAMP
);

Type Signature

@dataclass(frozen=True)
class Session:
    """
    Represents an authenticated session

    Sessions are created after successful IndieLogin authentication and
    stored in the database. They have a configurable expiry time and can
    be extended on use.

    Attributes:
        id: Database ID (primary key)
        session_token: Unique session token (stored in cookie)
        me: Authenticated user's URL (from IndieLogin)
        created_at: Session creation timestamp (UTC)
        expires_at: Session expiration timestamp (UTC)
        last_used_at: Last activity timestamp (UTC, nullable)

    Properties:
        is_expired: Check if session has expired
        is_active: Check if session is not expired
        age: Age of session (timedelta)
        time_until_expiry: Time remaining until expiry (timedelta)

    Methods:
        from_row: Create Session from database row
        to_dict: Serialize to dictionary
        is_valid: Comprehensive validation (checks expiry, token format)
        with_updated_last_used: Create new session with updated last_used_at

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM sessions WHERE session_token = ?", (token,)).fetchone()
        >>> session = Session.from_row(row)

        >>> # Check if expired
        >>> if session.is_expired:
        ...     print("Session expired")

        >>> # Check validity
        >>> if session.is_valid():
        ...     print("Session is valid")

        >>> # Update last used
        >>> updated_session = session.with_updated_last_used()
    """
    # Core fields from database
    id: int
    session_token: str
    me: str
    created_at: datetime
    expires_at: datetime
    last_used_at: Optional[datetime] = None

    @classmethod
    def from_row(cls, row: dict) -> 'Session':
        """
        Create Session instance from database row

        Args:
            row: Database row (sqlite3.Row or dict)

        Returns:
            Session instance

        Examples:
            >>> row = db.execute("SELECT * FROM sessions WHERE id = ?", (1,)).fetchone()
            >>> session = Session.from_row(row)
        """
        pass

    @property
    def is_expired(self) -> bool:
        """
        Check if session has expired

        Compares expires_at with current UTC time.

        Returns:
            True if expired, False otherwise

        Examples:
            >>> session.is_expired
            False
        """
        pass

    @property
    def is_active(self) -> bool:
        """
        Check if session is active (not expired)

        Returns:
            True if not expired, False otherwise

        Examples:
            >>> session.is_active
            True
        """
        pass

    @property
    def age(self) -> timedelta:
        """
        Get age of session

        Returns:
            Timedelta since session creation

        Examples:
            >>> session.age
            datetime.timedelta(days=2, seconds=3600)
        """
        pass

    @property
    def time_until_expiry(self) -> timedelta:
        """
        Get time remaining until expiry

        Returns:
            Timedelta until expiry (negative if already expired)

        Examples:
            >>> session.time_until_expiry
            datetime.timedelta(days=28)
        """
        pass

    def is_valid(self) -> bool:
        """
        Comprehensive session validation

        Checks:
        - Session is not expired
        - Session token is not empty
        - User 'me' URL is valid

        Returns:
            True if session is valid, False otherwise

        Examples:
            >>> session.is_valid()
            True
        """
        pass

    def with_updated_last_used(self) -> 'Session':
        """
        Create new session with updated last_used_at

        Since sessions are immutable, this returns a new Session instance
        with last_used_at set to current UTC time.

        Returns:
            New Session instance with updated timestamp

        Examples:
            >>> updated = session.with_updated_last_used()
            >>> updated.last_used_at
            datetime.datetime(2024, 11, 18, 15, 30, 0)
        """
        pass

    def to_dict(self) -> dict:
        """
        Serialize session to dictionary

        Returns:
            Dictionary with session data (excludes sensitive token)

        Examples:
            >>> session.to_dict()
            {
                'id': 1,
                'me': 'https://alice.example.com',
                'created_at': '2024-11-18T14:30:00Z',
                'expires_at': '2024-12-18T14:30:00Z',
                'is_active': True
            }
        """
        pass

Implementation Details

Expiry Checking:

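  • Use datetime.utcnow() for current time (naive UTC; note that utcnow() is deprecated since Python 3.12 in favor of datetime.now(timezone.utc))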
  • Compare with expires_at field
  • Return boolean (no exceptions)
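
The expiry properties can be sketched on a reduced, hypothetical model that carries only the timestamp fields. utc_now() is an assumed helper that yields the same naive UTC value as datetime.utcnow() without the deprecation warning on Python 3.12+. Treating the exact expiry instant as already expired (>=) is also an assumption:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Naive UTC 'now', comparable with naive timestamps read from SQLite."""
    return datetime.now(timezone.utc).replace(tzinfo=None)


@dataclass(frozen=True)
class ExpiringRecord:
    """Hypothetical reduced model with only the timestamp fields."""
    created_at: datetime
    expires_at: datetime

    @property
    def is_expired(self) -> bool:
        # Plain comparison; never raises for well-formed datetimes
        return utc_now() >= self.expires_at

    @property
    def is_active(self) -> bool:
        return not self.is_expired

    @property
    def age(self) -> timedelta:
        return utc_now() - self.created_at

    @property
    def time_until_expiry(self) -> timedelta:
        # Negative once the record has expired
        return self.expires_at - utc_now()
```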

Validation:

  • Check expiry first
  • Validate token is not empty
  • Validate 'me' URL format (basic check)
  • Return False if any check fails
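
A sketch of this check order, written as free functions so it runs on its own. The function names are hypothetical, and the "basic check" on the me URL is assumed here to mean an http or https scheme plus a host, which is not full IndieAuth profile URL validation:

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def is_plausible_me_url(me: str) -> bool:
    """Basic format check: http(s) scheme plus a host."""
    parsed = urlparse(me)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def session_is_valid(session_token: str, me: str, expires_at: datetime) -> bool:
    now = datetime.now(timezone.utc).replace(tzinfo=None)  # naive UTC
    # 1. Expiry first
    if now >= expires_at:
        return False
    # 2. Token must not be empty or whitespace-only
    if not session_token or not session_token.strip():
        return False
    # 3. Basic 'me' URL format check
    return is_plausible_me_url(me)
```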

Immutability:

  • Use dataclass frozen=True
  • with_updated_last_used() returns new instance
  • Use dataclasses.replace() for creating modified copy
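
The copy-on-update approach can be illustrated with a reduced, hypothetical MiniSession. Unlike the method in the type signature above, this sketch takes the timestamp as a parameter so the result is deterministic:

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class MiniSession:
    """Hypothetical reduced Session with just enough fields for the example."""
    id: int
    session_token: str
    last_used_at: Optional[datetime] = None

    def with_updated_last_used(self, now: datetime) -> "MiniSession":
        # replace() builds a new instance; the original is left untouched
        return replace(self, last_used_at=now)
```

The caller remains responsible for persisting the new last_used_at value; the model only produces the updated copy.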

Edge Cases:

  • Expired session → is_valid() returns False
  • Empty token → is_valid() returns False
  • Negative time_until_expiry → session expired
  • None last_used_at → valid (just created)

3. Token Model

Purpose

Represents a Micropub access token with scope permissions.

Database Schema Reference

CREATE TABLE tokens (
    token TEXT PRIMARY KEY,
    me TEXT NOT NULL,
    client_id TEXT,
    scope TEXT,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP
);

Type Signature

@dataclass(frozen=True)
class Token:
    """
    Represents a Micropub access token

    Tokens are used to authenticate Micropub API requests. They have
    associated scopes that determine what actions the token can perform.

    Attributes:
        token: Token string (primary key, used in Authorization header)
        me: User's URL
        client_id: Client application URL (optional)
        scope: Space-separated scope string (e.g., "create update delete")
        created_at: Token creation timestamp (UTC)
        expires_at: Token expiration timestamp (UTC, nullable for no expiry)

    Properties:
        is_expired: Check if token has expired
        is_active: Check if token is not expired
        scopes: List of individual scopes

    Methods:
        from_row: Create Token from database row
        to_dict: Serialize to dictionary
        has_scope: Check if token has specific scope
        is_valid: Comprehensive validation

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM tokens WHERE token = ?", (token,)).fetchone()
        >>> token_obj = Token.from_row(row)

        >>> # Check scope
        >>> if token_obj.has_scope('create'):
        ...     print("Can create posts")

        >>> # Check if expired
        >>> if token_obj.is_active:
        ...     print("Token is active")
    """
    # Core fields from database
    token: str
    me: str
    client_id: Optional[str] = None
    scope: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.utcnow())
    expires_at: Optional[datetime] = None

    @classmethod
    def from_row(cls, row: dict) -> 'Token':
        """
        Create Token instance from database row

        Args:
            row: Database row (sqlite3.Row or dict)

        Returns:
            Token instance

        Examples:
            >>> row = db.execute("SELECT * FROM tokens WHERE token = ?", (token,)).fetchone()
            >>> token_obj = Token.from_row(row)
        """
        pass

    @property
    def is_expired(self) -> bool:
        """
        Check if token has expired

        If expires_at is None, token never expires.
        Otherwise, compares with current UTC time.

        Returns:
            True if expired, False otherwise

        Examples:
            >>> token_obj.is_expired
            False
        """
        pass

    @property
    def is_active(self) -> bool:
        """
        Check if token is active (not expired)

        Returns:
            True if not expired, False otherwise
        """
        pass

    @property
    def scopes(self) -> list[str]:
        """
        Get list of individual scopes

        Splits scope string on whitespace.

        Returns:
            List of scope strings

        Examples:
            >>> # scope = "create update delete"
            >>> token_obj.scopes
            ['create', 'update', 'delete']

            >>> # scope = None
            >>> token_obj.scopes
            []
        """
        pass

    def has_scope(self, required_scope: str) -> bool:
        """
        Check if token has required scope

        Args:
            required_scope: Scope to check for (e.g., 'create', 'update')

        Returns:
            True if token has scope, False otherwise

        Examples:
            >>> token_obj.has_scope('create')
            True

            >>> token_obj.has_scope('delete')
            False
        """
        pass

    def is_valid(self, required_scope: Optional[str] = None) -> bool:
        """
        Comprehensive token validation

        Checks:
        - Token is not expired
        - Token string is not empty
        - If required_scope provided, token has that scope

        Args:
            required_scope: Optional scope to check

        Returns:
            True if token is valid, False otherwise

        Examples:
            >>> token_obj.is_valid()
            True

            >>> token_obj.is_valid(required_scope='create')
            True
        """
        pass

    def to_dict(self) -> dict:
        """
        Serialize token to dictionary

        Returns:
            Dictionary with token data (excludes sensitive token value)

        Examples:
            >>> token_obj.to_dict()
            {
                'me': 'https://alice.example.com',
                'client_id': 'https://quill.p3k.io',
                'scope': 'create update',
                'created_at': '2024-11-18T14:30:00Z',
                'is_active': True
            }
        """
        pass

Implementation Details

Scope Handling:

  • Scopes stored as space-separated string
  • Split on whitespace to get list
  • Case-sensitive comparison
  • Empty/None scope → empty list
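
These rules reduce to a call to str.split() with no arguments, which splits on any run of whitespace and ignores leading and trailing whitespace. A minimal sketch with hypothetical free functions (in the design these are the Token.scopes property and Token.has_scope method):

```python
from typing import Optional


def parse_scopes(scope: Optional[str]) -> list[str]:
    """Split a space-separated scope string; None or empty yields []."""
    return scope.split() if scope else []


def has_scope(scope: Optional[str], required_scope: str) -> bool:
    # Case-sensitive membership test against the parsed list
    return required_scope in parse_scopes(scope)
```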

Expiry Handling:

  • None expires_at → never expires
  • Compare with current UTC time
  • Return boolean

Validation:

  • Check expiry first
  • Validate token not empty
  • If required_scope provided, check has_scope()
  • Return False if any check fails
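
Taken together, the expiry and validation rules for tokens might be sketched as below. The function names and parameters are hypothetical, and comparing with >= at the exact expiry instant is an assumption:

```python
from datetime import datetime, timezone
from typing import Optional


def token_is_expired(expires_at: Optional[datetime]) -> bool:
    # None means the token never expires
    if expires_at is None:
        return False
    now = datetime.now(timezone.utc).replace(tzinfo=None)  # naive UTC
    return now >= expires_at


def token_is_valid(
    token: str,
    scope: Optional[str],
    expires_at: Optional[datetime],
    required_scope: Optional[str] = None,
) -> bool:
    # 1. Expiry first
    if token_is_expired(expires_at):
        return False
    # 2. Token string must not be empty
    if not token:
        return False
    # 3. Scope check only when the caller asks for one
    if required_scope is not None and required_scope not in (scope or "").split():
        return False
    return True
```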

Edge Cases:

  • None expires_at → is_active returns True
  • Empty scope → has_scope() always returns False
  • None scope → treated as empty
  • Whitespace in scope string → handled by split()

4. AuthState Model

Purpose

Represents a CSRF state token for OAuth authentication flows.

Database Schema Reference

CREATE TABLE auth_state (
    state TEXT PRIMARY KEY,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL
);

Type Signature

@dataclass(frozen=True)
class AuthState:
    """
    Represents an OAuth state token (CSRF protection)

    State tokens are short-lived (5 minutes) and single-use. They prevent
    CSRF attacks during the IndieLogin authentication flow.

    Attributes:
        state: Random state token string (primary key)
        created_at: Token creation timestamp (UTC)
        expires_at: Token expiration timestamp (UTC)

    Properties:
        is_expired: Check if state token has expired
        is_active: Check if state token is not expired
        age: Age of state token

    Methods:
        from_row: Create AuthState from database row
        to_dict: Serialize to dictionary
        is_valid: Comprehensive validation

    Examples:
        >>> # Create from database row
        >>> row = db.execute("SELECT * FROM auth_state WHERE state = ?", (state,)).fetchone()
        >>> auth_state = AuthState.from_row(row)

        >>> # Check if expired
        >>> if auth_state.is_active:
        ...     print("State token is still valid")

        >>> # Check validity
        >>> if auth_state.is_valid():
        ...     print("Valid state token")
    """
    # Core fields from database
    state: str
    created_at: datetime
    expires_at: datetime

    @classmethod
    def from_row(cls, row: dict) -> 'AuthState':
        """
        Create AuthState instance from database row

        Args:
            row: Database row (sqlite3.Row or dict)

        Returns:
            AuthState instance

        Examples:
            >>> row = db.execute("SELECT * FROM auth_state WHERE state = ?", (state,)).fetchone()
            >>> auth_state = AuthState.from_row(row)
        """
        pass

    @property
    def is_expired(self) -> bool:
        """
        Check if state token has expired

        State tokens have short expiry (5 minutes default).

        Returns:
            True if expired, False otherwise

        Examples:
            >>> auth_state.is_expired
            False
        """
        pass

    @property
    def is_active(self) -> bool:
        """
        Check if state token is active (not expired)

        Returns:
            True if not expired, False otherwise
        """
        pass

    @property
    def age(self) -> timedelta:
        """
        Get age of state token

        Returns:
            Timedelta since token creation

        Examples:
            >>> auth_state.age
            datetime.timedelta(seconds=120)
        """
        pass

    def is_valid(self) -> bool:
        """
        Comprehensive state token validation

        Checks:
        - Token is not expired
        - Token string is not empty

        Returns:
            True if valid, False otherwise

        Examples:
            >>> auth_state.is_valid()
            True
        """
        pass

    def to_dict(self) -> dict:
        """
        Serialize state token to dictionary

        Returns:
            Dictionary with state data

        Examples:
            >>> auth_state.to_dict()
            {
                'created_at': '2024-11-18T14:30:00Z',
                'expires_at': '2024-11-18T14:35:00Z',
                'is_active': True
            }
        """
        pass

Implementation Details

Expiry Handling:

  • Short expiry: 5 minutes from creation
  • Use datetime.utcnow() for comparison
  • Return boolean

Validation:

  • Check expiry
  • Validate state not empty
  • Return False if any check fails

Security Notes:

  • State tokens are single-use (caller must delete after verification)
  • Short expiry limits the window in which a captured state token could be replayed
  • Random generation (handled by caller, not model)

Edge Cases:

  • Expired state → is_valid() returns False
  • Empty state → is_valid() returns False
  • Age > expiry → is_expired returns True

Constants

# Session configuration
DEFAULT_SESSION_EXPIRY_DAYS = 30
SESSION_EXTENSION_ON_USE = True

# Auth state configuration
DEFAULT_AUTH_STATE_EXPIRY_MINUTES = 5

# Token configuration
DEFAULT_TOKEN_EXPIRY_DAYS = 90

# Markdown rendering
MARKDOWN_EXTENSIONS = ['extra', 'codehilite', 'nl2br']

# Content limits
MAX_TITLE_LENGTH = 200
EXCERPT_LENGTH = 200

Error Handling

Exception Types

Models use standard Python exceptions:

  • FileNotFoundError: Note file doesn't exist (Note.content property)
  • OSError: File read error (Note.content property)
  • ValueError: Invalid data (from_row with malformed data)
  • TypeError: Wrong type passed to method

Error Messages

# Good
raise FileNotFoundError(
    f"Note file not found: {file_path}. "
    f"Database and filesystem may be out of sync."
)

raise ValueError(
    f"Invalid datetime format for created_at: {row['created_at']}"
)

# Bad
raise FileNotFoundError("File not found")
raise ValueError("Bad data")
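
As an illustration of where such messages come from, from_row will likely need to turn timestamp columns into datetime objects. The helper below is a hypothetical sketch, assuming the row may hold either datetime objects or ISO-style text such as SQLite's CURRENT_TIMESTAMP output:

```python
from datetime import datetime
from typing import Any


def parse_timestamp(value: Any, column: str) -> datetime:
    """Hypothetical helper for from_row: accept datetime or ISO-style text."""
    if isinstance(value, datetime):
        return value
    if isinstance(value, str):
        try:
            # Accepts 'YYYY-MM-DD HH:MM:SS' as well as the 'T'-separated form
            return datetime.fromisoformat(value)
        except ValueError:
            # Name the column and show the offending value
            raise ValueError(
                f"Invalid datetime format for {column}: {value!r}"
            ) from None
    raise TypeError(
        f"Expected datetime or str for {column}, got {type(value).__name__}"
    )
```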

Testing Strategy

Test Coverage Requirements

  • Minimum 90% code coverage for models.py
  • Test all model creation methods
  • Test all properties and methods
  • Test edge cases and error conditions
  • Test lazy loading behavior
  • Test caching behavior

Test Organization

File: tests/test_models.py

"""
Tests for data models

Organized by model:
- Note model tests
- Session model tests
- Token model tests
- AuthState model tests
"""

import pytest
from datetime import datetime, timedelta
from pathlib import Path
from starpunk.models import Note, Session, Token, AuthState


class TestNoteModel:
    """Test Note model"""

    def test_from_row(self):
        """Test creating Note from database row"""
        row = {
            'id': 1,
            'slug': 'test-note',
            'file_path': 'notes/2024/11/test-note.md',
            'published': True,
            'created_at': datetime(2024, 11, 18, 14, 30),
            'updated_at': datetime(2024, 11, 18, 14, 30),
            'content_hash': 'abc123...'
        }
        note = Note.from_row(row, Path('data'))
        assert note.slug == 'test-note'
        assert note.published is True

    def test_content_lazy_loading(self, tmp_path):
        """Test content is lazy-loaded from file"""
        # Create test note file
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('# Test Note\n\nContent here.')

        # Create note instance
        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        # Content should be loaded on first access
        content = note.content
        assert '# Test Note' in content

        # Second access should return cached value
        content2 = note.content
        assert content2 == content

    def test_html_rendering(self, tmp_path):
        """Test HTML rendering with caching"""
        # Create test note
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('# Heading\n\nParagraph here.')

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        html = note.html
        assert '<h1>' in html or '<h1 id="heading">' in html  # Depends on markdown extensions
        assert '<p>Paragraph here.</p>' in html

    def test_title_extraction(self, tmp_path):
        """Test title extraction from content"""
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('# My Note Title\n\nContent.')

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        assert note.title == 'My Note Title'

    def test_title_fallback_to_slug(self, tmp_path):
        """Test title falls back to slug when content is all whitespace"""
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('   \n\n')

        note = Note(
            id=1,
            slug='my-test-note',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        # Whitespace-only content has no usable first line, so the slug is used
        assert note.title == 'my-test-note'

    def test_permalink(self):
        """Test permalink generation"""
        note = Note(
            id=1,
            slug='my-note',
            file_path='notes/2024/11/my-note.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=Path('data')
        )

        assert note.permalink == '/note/my-note'

    def test_to_dict(self):
        """Test serialization to dictionary"""
        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime(2024, 11, 18, 14, 30),
            updated_at=datetime(2024, 11, 18, 14, 30),
            _data_dir=Path('data')
        )

        data = note.to_dict()
        assert data['slug'] == 'test'
        assert data['published'] is True
        assert 'content' not in data  # Not included by default

    def test_to_dict_with_content(self, tmp_path):
        """Test serialization includes content when requested"""
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        note_file.write_text('Test content')

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            _data_dir=tmp_path
        )

        data = note.to_dict(include_content=True)
        assert 'content' in data
        assert data['content'] == 'Test content'

    def test_verify_integrity(self, tmp_path):
        """Test content integrity verification"""
        note_file = tmp_path / 'notes' / '2024' / '11' / 'test.md'
        note_file.parent.mkdir(parents=True)
        content = 'Test content'
        note_file.write_text(content)

        # Calculate correct hash
        from starpunk.utils import calculate_content_hash
        content_hash = calculate_content_hash(content)

        note = Note(
            id=1,
            slug='test',
            file_path='notes/2024/11/test.md',
            published=True,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
            content_hash=content_hash,
            _data_dir=tmp_path
        )

        # Should verify successfully
        assert note.verify_integrity() is True

        # Modify file
        note_file.write_text('Modified content')

        # Should fail verification
        assert note.verify_integrity() is False


class TestSessionModel:
    """Test Session model"""

    def test_from_row(self):
        """Test creating Session from database row"""
        row = {
            'id': 1,
            'session_token': 'abc123',
            'me': 'https://alice.example.com',
            'created_at': datetime(2024, 11, 18, 14, 30),
            'expires_at': datetime(2024, 12, 18, 14, 30),
            'last_used_at': None
        }
        session = Session.from_row(row)
        assert session.session_token == 'abc123'
        assert session.me == 'https://alice.example.com'

    def test_is_expired_false(self):
        """Test is_expired returns False for active session"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(days=30)
        )
        assert session.is_expired is False
        assert session.is_active is True

    def test_is_expired_true(self):
        """Test is_expired returns True for expired session"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow() - timedelta(days=31),
            expires_at=datetime.utcnow() - timedelta(days=1)
        )
        assert session.is_expired is True
        assert session.is_active is False

    def test_is_valid(self):
        """Test comprehensive validation"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(days=30)
        )
        assert session.is_valid() is True

    def test_is_valid_expired(self):
        """Test validation fails for expired session"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow() - timedelta(days=31),
            expires_at=datetime.utcnow() - timedelta(days=1)
        )
        assert session.is_valid() is False

    def test_with_updated_last_used(self):
        """Test creating session with updated timestamp"""
        original = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(days=30),
            last_used_at=None
        )

        updated = original.with_updated_last_used()
        assert updated.last_used_at is not None
        assert updated.session_token == original.session_token

    def test_age(self):
        """Test age calculation"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime.utcnow() - timedelta(hours=2),
            expires_at=datetime.utcnow() + timedelta(days=30)
        )
        age = session.age
        assert age.total_seconds() >= 7200  # At least 2 hours

    def test_to_dict(self):
        """Test serialization to dictionary"""
        session = Session(
            id=1,
            session_token='abc123',
            me='https://alice.example.com',
            created_at=datetime(2024, 11, 18, 14, 30),
            expires_at=datetime(2024, 12, 18, 14, 30)
        )
        data = session.to_dict()
        assert 'me' in data
        assert 'session_token' not in data  # Excluded for security


class TestTokenModel:
    """Test Token model"""

    def test_from_row(self):
        """Test creating Token from database row"""
        row = {
            'token': 'xyz789',
            'me': 'https://alice.example.com',
            'client_id': 'https://quill.p3k.io',
            'scope': 'create update',
            'created_at': datetime(2024, 11, 18, 14, 30),
            'expires_at': None
        }
        token = Token.from_row(row)
        assert token.token == 'xyz789'
        assert token.scope == 'create update'

    def test_scopes_property(self):
        """Test scope parsing"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create update delete'
        )
        assert token.scopes == ['create', 'update', 'delete']

    def test_scopes_empty(self):
        """Test empty scope"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope=None
        )
        assert token.scopes == []

    def test_has_scope(self):
        """Test scope checking"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create update'
        )
        assert token.has_scope('create') is True
        assert token.has_scope('update') is True
        assert token.has_scope('delete') is False

    def test_is_expired_never_expires(self):
        """Test token with no expiry"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            expires_at=None
        )
        assert token.is_expired is False
        assert token.is_active is True

    def test_is_expired_with_expiry(self):
        """Test token expiry"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            expires_at=datetime.utcnow() - timedelta(days=1)
        )
        assert token.is_expired is True
        assert token.is_active is False

    def test_is_valid(self):
        """Test validation"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create'
        )
        assert token.is_valid() is True

    def test_is_valid_with_required_scope(self):
        """Test validation with scope requirement"""
        token = Token(
            token='xyz789',
            me='https://alice.example.com',
            scope='create update'
        )
        assert token.is_valid(required_scope='create') is True
        assert token.is_valid(required_scope='delete') is False


class TestAuthStateModel:
    """Test AuthState model"""

    def test_from_row(self):
        """Test creating AuthState from database row"""
        row = {
            'state': 'random123',
            'created_at': datetime(2024, 11, 18, 14, 30),
            'expires_at': datetime(2024, 11, 18, 14, 35)
        }
        auth_state = AuthState.from_row(row)
        assert auth_state.state == 'random123'

    def test_is_expired(self):
        """Test expiry checking"""
        # Active state
        auth_state = AuthState(
            state='random123',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(minutes=5)
        )
        assert auth_state.is_expired is False
        assert auth_state.is_active is True

        # Expired state
        expired = AuthState(
            state='random123',
            created_at=datetime.utcnow() - timedelta(minutes=10),
            expires_at=datetime.utcnow() - timedelta(minutes=5)
        )
        assert expired.is_expired is True
        assert expired.is_active is False

    def test_is_valid(self):
        """Test validation"""
        auth_state = AuthState(
            state='random123',
            created_at=datetime.utcnow(),
            expires_at=datetime.utcnow() + timedelta(minutes=5)
        )
        assert auth_state.is_valid() is True

    def test_age(self):
        """Test age calculation"""
        auth_state = AuthState(
            state='random123',
            created_at=datetime.utcnow() - timedelta(minutes=2),
            expires_at=datetime.utcnow() + timedelta(minutes=3)
        )
        age = auth_state.age
        assert age.total_seconds() >= 120  # At least 2 minutes

Usage Examples

Creating and Using a Note

from pathlib import Path
from starpunk.models import Note
from starpunk.database import get_db

# Get note from database
db = get_db(app)
row = db.execute("SELECT * FROM notes WHERE slug = ?", ("my-note",)).fetchone()

# Create Note instance
note = Note.from_row(row, data_dir=Path("data"))

# Access metadata (no file I/O)
print(note.slug)  # "my-note"
print(note.published)  # True
print(note.created_at)  # datetime object

# Lazy-load content (reads file on first access)
content = note.content  # File I/O happens here
print(content)  # "# My Note\n\nContent here..."

# Render HTML (uses cached content, renders on first access)
html = note.html  # Markdown rendering happens here
print(html)  # "<h1>My Note</h1>\n<p>Content here...</p>"

# Extract metadata
title = note.title  # "My Note"
permalink = note.permalink  # "/note/my-note"
excerpt = note.excerpt  # "Content here..."

# Serialize for templates
data = note.to_dict(include_content=True, include_html=True)
# Use in template: render_template('note.html', note=data)

Validating a Session

from flask import g

from starpunk.models import Session
from starpunk.database import get_db

# Get session from database
db = get_db(app)
row = db.execute(
    "SELECT * FROM sessions WHERE session_token = ?",
    (session_token,)
).fetchone()

if row is None:
    # Session not found
    return False

# Create Session instance
session = Session.from_row(row)

# Validate session
if not session.is_valid():
    # Session expired or invalid
    return False

# Update last used timestamp
updated_session = session.with_updated_last_used()

# Save to database
db.execute(
    "UPDATE sessions SET last_used_at = ? WHERE id = ?",
    (updated_session.last_used_at, updated_session.id)
)
db.commit()

# Session is valid, store user info
g.user_me = session.me

Checking Token Scope

from flask import request

from starpunk.models import Token
from starpunk.database import get_db

# Get token from Authorization header
auth_header = request.headers.get('Authorization', '')
if not auth_header.startswith('Bearer '):
    return {'error': 'unauthorized'}, 401

token_value = auth_header[7:]  # Remove "Bearer "

# Get from database
db = get_db(app)
row = db.execute("SELECT * FROM tokens WHERE token = ?", (token_value,)).fetchone()

if row is None:
    return {'error': 'invalid_token'}, 401

# Create Token instance
token = Token.from_row(row)

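# Reject expired tokens first so they are not reported as a scope problem
if not token.is_valid():
    return {'error': 'invalid_token'}, 401

# Then check the required scope
if not token.has_scope('create'):
    return {'error': 'insufficient_scope'}, 403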

# Token is valid with required scope
# Proceed with request

Verifying Auth State

from flask import request

from starpunk.models import AuthState
from starpunk.database import get_db

# Get state from callback parameter
state_param = request.args.get('state')
if not state_param:
    return {'error': 'missing_state'}, 400

# Get from database
db = get_db(app)
row = db.execute("SELECT * FROM auth_state WHERE state = ?", (state_param,)).fetchone()

if row is None:
    return {'error': 'invalid_state'}, 400

# Create AuthState instance
auth_state = AuthState.from_row(row)

# Validate (checks expiry)
if not auth_state.is_valid():
    return {'error': 'expired_state'}, 400

# Delete state (single-use)
db.execute("DELETE FROM auth_state WHERE state = ?", (state_param,))
db.commit()

# State is valid, continue OAuth flow

Integration with Other Modules

Integration with notes.py

# In starpunk/notes.py
from pathlib import Path
from typing import Optional

from starpunk.models import Note
from starpunk.database import get_db
from starpunk.utils import generate_slug, make_slug_unique

def get_note(slug: str, data_dir: Path) -> Optional[Note]:
    """Get note by slug"""
    db = get_db(app)
    row = db.execute(
        "SELECT * FROM notes WHERE slug = ?",
        (slug,)
    ).fetchone()

    if row is None:
        return None

    return Note.from_row(row, data_dir=data_dir)

def list_notes(data_dir: Path, published_only: bool = True) -> list[Note]:
    """List notes"""
    db = get_db(app)

    query = "SELECT * FROM notes"
    if published_only:
        query += " WHERE published = 1"
    query += " ORDER BY created_at DESC"

    rows = db.execute(query).fetchall()

    return [Note.from_row(row, data_dir=data_dir) for row in rows]

Integration with auth.py

# In starpunk/auth.py
from typing import Optional

from starpunk.models import Session, Token, AuthState
from starpunk.database import get_db

def validate_session(session_token: str) -> Optional[Session]:
    """Validate session token"""
    db = get_db(app)
    row = db.execute(
        "SELECT * FROM sessions WHERE session_token = ?",
        (session_token,)
    ).fetchone()

    if row is None:
        return None

    session = Session.from_row(row)

    if not session.is_valid():
        return None

    # Update last used
    updated = session.with_updated_last_used()
    db.execute(
        "UPDATE sessions SET last_used_at = ? WHERE id = ?",
        (updated.last_used_at, updated.id)
    )
    db.commit()

    return updated

def validate_micropub_token(token_value: str, required_scope: str) -> Optional[Token]:
    """Validate Micropub token"""
    db = get_db(app)
    row = db.execute(
        "SELECT * FROM tokens WHERE token = ?",
        (token_value,)
    ).fetchone()

    if row is None:
        return None

    token = Token.from_row(row)

    if not token.is_valid(required_scope=required_scope):
        return None

    return token

Performance Considerations

Lazy Loading Benefits

  • Note content: Only loaded when accessed, not on model creation
  • HTML rendering: Only rendered when accessed, cached afterward
  • Memory efficiency: Can create many Note instances without loading all content
  • Database-only operations: Can list notes without reading files

Caching Strategy

  • Content caching: First note.content access reads file, caches in _cached_content
  • HTML caching: First note.html access renders markdown, caches in _cached_html
  • Cache invalidation: Not needed (models are immutable, represent point-in-time)
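
One way the content cache could coexist with frozen=True is sketched below. LazyNote is a hypothetical stand-in for Note, cut down to the fields the cache needs; the use of object.__setattr__ is an assumption about how the cache slot gets filled, not a confirmed detail of the real model.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class LazyNote:
    """Hypothetical stand-in for Note, reduced to the caching behavior."""

    file_path: str
    _data_dir: Path = field(repr=False, compare=False)
    # Cache slot: not settable through __init__, ignored by __eq__ and __repr__
    _cached_content: Optional[str] = field(
        default=None, init=False, repr=False, compare=False
    )

    @property
    def content(self) -> str:
        """Read the file on first access, then serve the cached text."""
        if self._cached_content is None:
            text = (self._data_dir / self.file_path).read_text(encoding="utf-8")
            # frozen=True blocks normal assignment, so write the slot directly
            object.__setattr__(self, "_cached_content", text)
        return self._cached_content
```

Because the cache is never refreshed, an instance keeps returning the text it first read even if the file changes later, which matches the point-in-time semantics described above.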

Performance Targets

  • Model creation: < 1ms (just data assignment)
  • from_row(): < 1ms (datetime parsing is fast)
  • Content loading: < 5ms for typical note
  • HTML rendering: < 10ms for typical note
  • Property access: < 0.1ms (cached)
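
A rough way to check these targets during development is sketched here. The helper name best_time_ms is hypothetical, and wall-clock timings vary by machine, so the numbers are a guide rather than a pass/fail gate.

```python
import time
from typing import Callable


def best_time_ms(fn: Callable[[], object], repeat: int = 5) -> float:
    """Return the fastest of `repeat` wall-clock timings of fn(), in milliseconds."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    # The minimum is the run least disturbed by other activity on the machine
    return min(timings)
```

For example, best_time_ms(lambda: note.html) on a freshly constructed note would approximate the rendering cost, while a second call on the same instance would measure cached property access.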

Security Considerations

Session Security

  • Token exposure: Never include session_token in to_dict() output
  • Expiry enforcement: Always check is_expired before using session
  • Last used tracking: Update last_used_at on every use
  • Secure comparison: Use constant-time comparison for tokens (in auth.py, not model)
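
The constant-time comparison mentioned above could look like the following in auth.py. The function name tokens_match is hypothetical; hmac.compare_digest is the standard library call for this purpose.

```python
import hmac


def tokens_match(presented: str, stored: str) -> bool:
    """Compare two token strings without stopping at the first differing character."""
    # Encode to bytes so non-ASCII input is handled instead of raising TypeError
    return hmac.compare_digest(presented.encode("utf-8"), stored.encode("utf-8"))
```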

Token Security

  • Scope validation: Always validate required scopes
  • Expiry checking: Check expiry before accepting token
  • Token exposure: Exclude token value from to_dict() output
  • Scope parsing: Handle malformed scope strings gracefully
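
A minimal sketch of tolerant scope parsing, assuming scopes are stored as a single whitespace-separated string as in the tests above. The helper name parse_scopes is hypothetical; in the model this logic would sit behind the scopes property.

```python
from typing import Optional


def parse_scopes(scope: Optional[str]) -> list[str]:
    """Split a scope string into individual scopes, tolerating None and stray whitespace."""
    if not scope:
        return []
    # str.split() with no argument collapses runs of spaces, tabs, and newlines
    return scope.split()


def has_scope(scope: Optional[str], required: str) -> bool:
    """Check for an exact scope match, never a substring match."""
    return required in parse_scopes(scope)
```

Matching against the parsed list rather than the raw string means a token with scope 'create' does not accidentally satisfy a check for 'creat'.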

File Path Security

  • Path validation: Note model doesn't validate paths (caller must use validate_note_path)
  • File reading: Let exceptions propagate (FileNotFoundError, OSError)
  • No writes: Models are read-only, never write files
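
For illustration, a caller-side containment check might look like the sketch below. This is not the validate_note_path function from starpunk.utils, whose signature is not shown in this document; is_inside_data_dir is a hypothetical name used only to show the idea.

```python
from pathlib import Path


def is_inside_data_dir(relative_path: str, data_dir: Path) -> bool:
    """Return True only if the path, once resolved, stays under data_dir."""
    base = data_dir.resolve()
    # Joining with an absolute path discards base, so that case is rejected too
    candidate = (base / relative_path).resolve()
    return base in candidate.parents
```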

State Token Security

  • Single-use: Caller must delete state after verification
  • Short expiry: 5 minutes limits the window for replay attacks
  • Expiry enforcement: Always check is_expired before using
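
The single-use and expiry rules can be enforced together in one statement, as sketched below. The function name consume_state is hypothetical, and the sketch assumes expires_at is stored as ISO-8601 text in a consistent format so that string comparison orders timestamps correctly.

```python
import sqlite3
from datetime import datetime


def consume_state(db: sqlite3.Connection, state: str, now: datetime) -> bool:
    """Delete the state row if it exists and has not expired; report success."""
    cursor = db.execute(
        "DELETE FROM auth_state WHERE state = ? AND expires_at > ?",
        (state, now.isoformat(sep=" ")),
    )
    db.commit()
    # rowcount is 1 only when a live row was found and removed
    return cursor.rowcount == 1
```

Deleting and checking in one step means a second callback carrying the same state value finds nothing to delete and is rejected.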

Dependencies

Standard Library

  • dataclasses - Dataclass decorator and utilities
  • datetime - Datetime and timedelta
  • pathlib - Path operations
  • typing - Type hints

Third-Party

  • markdown - Markdown to HTML rendering

Internal

  • starpunk.utils - File operations, content hashing
  • starpunk.database - Not imported (models are independent of DB)
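
Keeping the models independent of the database module works if from_row() relies only on key access, which both plain dicts (as used in the tests) and sqlite3.Row support. The sketch below shows the idea with a hypothetical StateRecord class standing in for AuthState; the to_datetime helper is likewise an assumption about how string timestamps might be handled.

```python
import sqlite3
from dataclasses import dataclass
from datetime import datetime
from typing import Union


def to_datetime(value: Union[str, datetime]) -> datetime:
    """Accept a datetime as-is, or parse an ISO-8601 string."""
    if isinstance(value, datetime):
        return value
    return datetime.fromisoformat(value)


@dataclass(frozen=True)
class StateRecord:
    """Hypothetical stand-in for AuthState, reduced to three fields."""

    state: str
    created_at: datetime
    expires_at: datetime

    @classmethod
    def from_row(cls, row) -> "StateRecord":
        # Only row["key"] access is used, so dicts and sqlite3.Row both work
        return cls(
            state=row["state"],
            created_at=to_datetime(row["created_at"]),
            expires_at=to_datetime(row["expires_at"]),
        )
```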

Future Enhancements (V2+)

Note Tags

@dataclass(frozen=True)
class Note:
    # ... existing fields
    tags: list[str] = field(default_factory=list)

    @property
    def tag_string(self) -> str:
        """Get comma-separated tag string"""
        return ', '.join(self.tags)

Note Replies/Comments

@dataclass(frozen=True)
class Note:
    # ... existing fields
    in_reply_to: Optional[str] = None  # URL being replied to

    @property
    def is_reply(self) -> bool:
        return self.in_reply_to is not None

Media Attachments

@dataclass(frozen=True)
class Media:
    """Represents a media attachment"""
    id: int
    filename: str
    file_path: str
    mime_type: str
    created_at: datetime
    note_id: Optional[int] = None

Webmentions

@dataclass(frozen=True)
class Webmention:
    """Represents a received webmention"""
    id: int
    source: str
    target: str
    verified: bool
    created_at: datetime
    note_id: Optional[int] = None

Acceptance Criteria

  • All four models implemented (Note, Session, Token, AuthState)
  • All models use frozen dataclasses
  • All models have from_row() class method
  • All models have to_dict() method
  • Note model implements lazy loading for content
  • Note model implements HTML rendering with caching
  • Note model extracts title and excerpt
  • Session model validates expiry
  • Token model validates scopes
  • AuthState model validates expiry
  • All properties have type hints
  • All methods have comprehensive docstrings
  • Test coverage >90%
  • All tests pass
  • Code formatted with Black
  • Code passes flake8 linting
  • No security issues
  • Integration examples work

Implementation Checklist

When implementing starpunk/models.py, complete in this order:

  1. Create file with module docstring
  2. Add imports and constants
  3. Implement Note model
    • Basic dataclass structure
    • from_row() class method
    • content property (lazy loading)
    • html property (rendering + caching)
    • title property
    • excerpt property
    • permalink property
    • to_dict() method
    • verify_integrity() method
  4. Implement Session model
    • Basic dataclass structure
    • from_row() class method
    • is_expired property
    • is_valid() method
    • with_updated_last_used() method
    • to_dict() method
  5. Implement Token model
    • Basic dataclass structure
    • from_row() class method
    • scopes property
    • has_scope() method
    • is_valid() method
    • to_dict() method
  6. Implement AuthState model
    • Basic dataclass structure
    • from_row() class method
    • is_expired property
    • is_valid() method
    • to_dict() method
  7. Create tests/test_models.py
  8. Write tests for all models
  9. Run tests and achieve >90% coverage
  10. Format with Black
  11. Lint with flake8
  12. Review all docstrings
  13. Test integration with utils.py

Estimated Time: 3-4 hours for implementation + tests