# Phase 2.1: Notes Management Design

## Overview

This document provides a complete, implementation-ready design for Phase 2.1 of the StarPunk V1 implementation plan: Notes Management (CRUD Operations). The notes module (`starpunk/notes.py`) implements all create, read, update, and delete operations for notes, with critical emphasis on maintaining file and database synchronization.

**Priority**: CRITICAL - Core application functionality
**Estimated Effort**: 6-8 hours
**Dependencies**: `starpunk/utils.py`, `starpunk/models.py`, `starpunk/database.py`
**File**: `starpunk/notes.py`

## Design Principles

1. **Atomic Transactions** - File and database operations succeed or fail together
2. **File-Database Sync** - Maintain perfect consistency between filesystem and database
3. **Type Safety** - Full type hints on all functions and parameters
4. **Error Recovery** - Handle failures gracefully with proper rollback
5. **Security First** - Validate all paths, prevent SQL injection, handle malicious input
6. **Testability** - Functions designed for easy testing and mocking
7. **Transaction Patterns** - Consistent use of database transactions for all write operations

## Core Responsibility

The notes module has ONE critical responsibility:

**Maintain perfect synchronization between markdown files on disk and note metadata in the database.**

Every write operation (create, update, delete) must ensure that:

- If the database operation succeeds, the file operation must have succeeded
- If the file operation fails, the database must be rolled back
- If the database operation fails, the file must be cleaned up
- No orphaned files (file exists, no database record)
- No orphaned records (database record exists, no file)

## Module Structure

```python
"""
Notes management for StarPunk

This module provides CRUD operations for notes with atomic file+database
synchronization. All write operations use database transactions to ensure
files and database records stay in sync.
Functions:
    create_note: Create new note with file and database entry
    get_note: Retrieve note by slug or ID
    list_notes: List notes with filtering and pagination
    update_note: Update note content and/or metadata
    delete_note: Delete note (soft or hard delete)

Exceptions:
    NoteNotFoundError: Note does not exist
    InvalidNoteDataError: Invalid content or parameters
    NoteSyncError: File/database synchronization failure
"""

# Standard library imports
from datetime import datetime
from pathlib import Path
from typing import Optional

# Third-party imports
from flask import current_app

# Local imports
from starpunk.database import get_db
from starpunk.models import Note
from starpunk.utils import (
    generate_slug,
    make_slug_unique,
    generate_note_path,
    ensure_note_directory,
    write_note_file,
    read_note_file,
    delete_note_file,
    calculate_content_hash,
    validate_note_path,
    validate_slug
)

# Custom exceptions (defined below)
class NoteNotFoundError(Exception):
    """Raised when a note cannot be found"""
    pass

class InvalidNoteDataError(Exception):
    """Raised when note data is invalid"""
    pass

class NoteSyncError(Exception):
    """Raised when file/database synchronization fails"""
    pass
```

---

## Custom Exceptions

### Exception Hierarchy

```python
class NoteNotFoundError(Exception):
    """
    Raised when a note cannot be found

    This exception is raised when attempting to retrieve, update, or delete
    a note that doesn't exist in the database.

    Attributes:
        identifier: The slug or ID used to search for the note
        message: Human-readable error message
    """
    def __init__(self, identifier: str | int, message: Optional[str] = None):
        self.identifier = identifier
        if message is None:
            message = f"Note not found: {identifier}"
        super().__init__(message)


class InvalidNoteDataError(Exception):
    """
    Raised when note data is invalid

    This exception is raised when attempting to create or update a note
    with invalid data (empty content, invalid slug, etc.)
    Attributes:
        field: The field that failed validation
        value: The invalid value
        message: Human-readable error message
    """
    def __init__(self, field: str, value: object, message: Optional[str] = None):
        self.field = field
        self.value = value
        if message is None:
            message = f"Invalid {field}: {value}"
        super().__init__(message)


class NoteSyncError(Exception):
    """
    Raised when file/database synchronization fails

    This exception is raised when a file operation and database operation
    cannot be kept in sync (e.g., file written but database insert failed).

    Attributes:
        operation: The operation that failed ('create', 'update', 'delete')
        details: Additional details about the failure
        message: Human-readable error message
    """
    def __init__(self, operation: str, details: str, message: Optional[str] = None):
        self.operation = operation
        self.details = details
        if message is None:
            message = f"Sync error during {operation}: {details}"
        super().__init__(message)
```

---

## Function Specifications

### 1. create_note()

#### Purpose

Create a new note with markdown content, writing both file and database record atomically.

#### Type Signature

```python
def create_note(
    content: str,
    published: bool = False,
    created_at: Optional[datetime] = None
) -> Note:
    """
    Create a new note

    Creates a new note by generating a unique slug, writing the markdown
    content to a file, and inserting a database record. File and database
    operations are atomic - if either fails, both are rolled back.

    Args:
        content: Markdown content for the note (must not be empty)
        published: Whether the note should be published (default: False)
        created_at: Creation timestamp (default: current UTC time)

    Returns:
        Note object with all metadata and content loaded

    Raises:
        InvalidNoteDataError: If content is empty or whitespace-only
        NoteSyncError: If the file cannot be written (permissions, disk full,
            etc.; the underlying OSError is wrapped), or if the file write
            succeeds but the database insert fails
        KeyError: If DATA_PATH is missing from the application config

    Examples:
        >>> # Create unpublished draft
        >>> note = create_note("# My First Note\\n\\nContent here.", published=False)
        >>> print(note.slug)
        my-first-note

        >>> # Create published note
        >>> note = create_note(
        ...     "Just published this!",
        ...     published=True
        ... )
        >>> print(note.published)
        True

        >>> # Create with specific timestamp
        >>> from datetime import datetime
        >>> note = create_note(
        ...     "Backdated note",
        ...     created_at=datetime(2024, 1, 1, 12, 0, 0)
        ... )

    Transaction Safety:
        1. Validates content (before any changes)
        2. Generates unique slug (database query)
        3. Writes file to disk
        4. Begins database transaction
        5. Inserts database record
        6. If database fails: deletes file, raises NoteSyncError
        7. If successful: commits transaction, returns Note

    Notes:
        - Slug is generated from first 5 words of content
        - Random suffix added if slug already exists
        - File path follows pattern: data/notes/YYYY/MM/slug.md
        - Content hash calculated and stored for integrity checking
        - created_at and updated_at set to same value initially
    """
```

#### Implementation Algorithm

```python
def create_note(
    content: str,
    published: bool = False,
    created_at: Optional[datetime] = None
) -> Note:
    # 1. VALIDATION (before any changes)
    if not content or not content.strip():
        raise InvalidNoteDataError('content', content, 'Content cannot be empty or whitespace-only')

    # 2. SETUP
    if created_at is None:
        created_at = datetime.utcnow()
    updated_at = created_at  # Same as created_at for new notes
    data_dir = Path(current_app.config['DATA_PATH'])

    # 3. GENERATE UNIQUE SLUG
    # Query all existing slugs from database
    db = get_db()
    existing_slugs_rows = db.execute("SELECT slug FROM notes").fetchall()
    existing_slugs = {row['slug'] for row in existing_slugs_rows}

    # Generate base slug from content
    base_slug = generate_slug(content, created_at)

    # Make unique if collision
    slug = make_slug_unique(base_slug, existing_slugs)

    # Validate final slug (defensive check)
    if not validate_slug(slug):
        raise InvalidNoteDataError('slug', slug, f'Generated slug is invalid: {slug}')

    # 4. GENERATE FILE PATH
    note_path = generate_note_path(slug, created_at, data_dir)

    # Security: Validate path stays within data directory
    if not validate_note_path(note_path, data_dir):
        raise NoteSyncError(
            'create',
            f'Generated path outside data directory: {note_path}',
            'Path validation failed'
        )

    # 5. CALCULATE CONTENT HASH
    content_hash = calculate_content_hash(content)

    # 6. WRITE FILE (before database to fail fast on disk issues)
    try:
        ensure_note_directory(note_path)
        write_note_file(note_path, content)
    except OSError as e:
        # File write failed, nothing to clean up
        raise NoteSyncError(
            'create',
            f'Failed to write file: {e}',
            f'Could not write note file: {note_path}'
        ) from e

    # 7. INSERT DATABASE RECORD (transaction starts here)
    file_path_rel = str(note_path.relative_to(data_dir))
    try:
        db.execute(
            """
            INSERT INTO notes (slug, file_path, published, created_at, updated_at, content_hash)
            VALUES (?, ?, ?, ?, ?, ?)
            """,
            (slug, file_path_rel, published, created_at, updated_at, content_hash)
        )
        db.commit()
    except Exception as e:
        # Database insert failed, delete the file we created
        try:
            note_path.unlink()
        except OSError:
            # Log warning but don't fail - file cleanup is best effort
            current_app.logger.warning(f'Failed to clean up file after DB error: {note_path}')

        # Raise sync error
        raise NoteSyncError(
            'create',
            f'Database insert failed: {e}',
            f'Failed to create note: {slug}'
        ) from e

    # 8. RETRIEVE AND RETURN NOTE OBJECT
    # Get the auto-generated ID
    note_id = db.execute("SELECT last_insert_rowid()").fetchone()[0]

    # Fetch the complete record
    row = db.execute(
        "SELECT * FROM notes WHERE id = ?",
        (note_id,)
    ).fetchone()

    # Create Note object
    note = Note.from_row(row, data_dir)
    return note
```

#### Edge Cases

| Case | Handling |
|------|----------|
| Empty content | Raise InvalidNoteDataError before any operations |
| Whitespace-only content | Raise InvalidNoteDataError (strip and check) |
| Very long content (>10MB) | Allowed (may hit filesystem limits; a write failure surfaces as NoteSyncError) |
| Unicode/emoji in content | Fully supported (UTF-8 encoding in write_note_file) |
| Slug collision | Handled by make_slug_unique (adds random suffix) |
| Disk full | OSError during file write is wrapped in NoteSyncError, before any database operation |
| Permission denied | OSError during file write is wrapped in NoteSyncError, before any database operation |
| Database locked | SQLite error during insert, file cleaned up, NoteSyncError raised |
| Missing DATA_PATH config | KeyError raised when accessing `current_app.config['DATA_PATH']` |

#### Error Recovery

```python
# Scenario 1: File write fails
# - No database operation has occurred
# - No cleanup needed
# - Raise NoteSyncError with details

# Scenario 2: Database insert fails after file write
# - File exists on disk
# - Delete file (best effort)
# - Raise NoteSyncError with details

# Scenario 3: Cleanup fails after database error
# - Log warning (orphaned file)
# - Still raise original NoteSyncError
# - Orphaned file can be cleaned up later by maintenance task
```

---

### 2. get_note()

#### Purpose

Retrieve a note by slug or ID, loading metadata from database and optionally loading content from file.

#### Type Signature

```python
def get_note(
    slug: Optional[str] = None,
    id: Optional[int] = None,
    load_content: bool = True
) -> Optional[Note]:
    """
    Get a note by slug or ID

    Retrieves note metadata from database and optionally loads content
    from file. Exactly one of slug or id must be provided.
    Args:
        slug: Note slug (unique identifier in URLs)
        id: Note database ID (primary key)
        load_content: Whether to load file content (default: True)

    Returns:
        Note object with metadata and optionally content, or None if not found

    Raises:
        ValueError: If both slug and id provided, or neither provided
        OSError: If file cannot be read (when load_content=True)
        FileNotFoundError: If note file doesn't exist (when load_content=True)

    Examples:
        >>> # Get by slug
        >>> note = get_note(slug="my-first-note")
        >>> if note:
        ...     print(note.content)  # Content loaded
        ... else:
        ...     print("Note not found")

        >>> # Get by ID
        >>> note = get_note(id=42)

        >>> # Get metadata only (no file I/O)
        >>> note = get_note(slug="my-note", load_content=False)
        >>> print(note.slug)  # Works
        >>> print(note.content)  # Will trigger file load on access

        >>> # Check if note exists
        >>> if get_note(slug="maybe-exists"):
        ...     print("Note exists")

    Performance:
        - Metadata retrieval: Single database query, <1ms
        - Content loading: File I/O, typically <5ms for normal notes
        - Use load_content=False for list operations to avoid file I/O

    Notes:
        - Returns None if note not found (does not raise exception)
        - Content hash verification is optional (logs warning if mismatch)
        - Note.content property will lazy-load if load_content=False
        - Soft-deleted notes (deleted_at IS NOT NULL) are excluded
    """
```

#### Implementation Algorithm

```python
def get_note(
    slug: Optional[str] = None,
    id: Optional[int] = None,
    load_content: bool = True
) -> Optional[Note]:
    # 1. VALIDATE PARAMETERS
    if slug is None and id is None:
        raise ValueError("Must provide either slug or id")
    if slug is not None and id is not None:
        raise ValueError("Cannot provide both slug and id")

    # 2. QUERY DATABASE
    db = get_db()
    if slug is not None:
        # Query by slug
        row = db.execute(
            "SELECT * FROM notes WHERE slug = ? AND deleted_at IS NULL",
            (slug,)
        ).fetchone()
    else:
        # Query by ID
        row = db.execute(
            "SELECT * FROM notes WHERE id = ? AND deleted_at IS NULL",
            (id,)
        ).fetchone()

    # 3. CHECK IF FOUND
    if row is None:
        return None

    # 4. CREATE NOTE OBJECT
    data_dir = Path(current_app.config['DATA_PATH'])
    note = Note.from_row(row, data_dir)

    # 5. OPTIONALLY VERIFY INTEGRITY
    # This is a passive check - log warning but don't fail
    if load_content and note.content_hash:
        try:
            if not note.verify_integrity():
                current_app.logger.warning(
                    f'Content hash mismatch for note {note.slug}. '
                    f'File may have been modified externally.'
                )
        except Exception as e:
            current_app.logger.warning(
                f'Failed to verify integrity for note {note.slug}: {e}'
            )

    # 6. RETURN NOTE
    # If load_content=False, content will be lazy-loaded on first access
    # If load_content=True, content is already loaded by Note model
    return note
```

#### Edge Cases

| Case | Handling |
|------|----------|
| Both slug and id provided | Raise ValueError |
| Neither slug nor id provided | Raise ValueError |
| Note not found | Return None (not an error) |
| Note soft-deleted | Return None (excluded by query) |
| File missing | FileNotFoundError when accessing note.content |
| File modified externally | Log warning on hash mismatch, return note anyway |
| load_content=False | Return note, content loaded on demand via property |

---

### 3. list_notes()

#### Purpose

List notes with filtering, sorting, and pagination. Returns metadata only (no file I/O) for performance.

#### Type Signature

```python
def list_notes(
    published_only: bool = False,
    limit: int = 50,
    offset: int = 0,
    order_by: str = 'created_at',
    order_dir: str = 'DESC'
) -> list[Note]:
    """
    List notes with filtering and pagination

    Retrieves notes from database with optional filtering by published
    status, sorting, and pagination. Does not load file content for
    performance - use note.content to lazy-load when needed.
    Args:
        published_only: If True, only return published notes (default: False)
        limit: Maximum number of notes to return (default: 50, max: 1000)
        offset: Number of notes to skip for pagination (default: 0)
        order_by: Field to sort by (default: 'created_at')
        order_dir: Sort direction, 'ASC' or 'DESC' (default: 'DESC')

    Returns:
        List of Note objects with metadata only (content not loaded)

    Raises:
        ValueError: If order_by is not a valid column name (SQL injection prevention)
        ValueError: If order_dir is not 'ASC' or 'DESC'
        ValueError: If limit is below 1 or exceeds the maximum, or offset is negative

    Examples:
        >>> # List recent published notes
        >>> notes = list_notes(published_only=True, limit=10)
        >>> for note in notes:
        ...     print(note.slug, note.created_at)

        >>> # List all notes, oldest first
        >>> notes = list_notes(order_dir='ASC')

        >>> # Pagination (page 2, 20 per page)
        >>> notes = list_notes(limit=20, offset=20)

        >>> # List by update time
        >>> notes = list_notes(order_by='updated_at')

    Performance:
        - Single database query
        - No file I/O (content not loaded)
        - Efficient for large result sets with pagination
        - Typical query time: <10ms for 1000s of notes

    Pagination Example:
        >>> page = 1
        >>> per_page = 20
        >>> notes = list_notes(
        ...     published_only=True,
        ...     limit=per_page,
        ...     offset=(page - 1) * per_page
        ... )

    Notes:
        - Excludes soft-deleted notes (only rows where deleted_at IS NULL are returned)
        - Content is lazy-loaded when accessed via note.content
        - order_by values are validated to prevent SQL injection
        - Default sort is newest first (created_at DESC)
    """
```

#### Implementation Algorithm

```python
def list_notes(
    published_only: bool = False,
    limit: int = 50,
    offset: int = 0,
    order_by: str = 'created_at',
    order_dir: str = 'DESC'
) -> list[Note]:
    # 1. VALIDATE PARAMETERS
    # Prevent SQL injection - validate order_by column
    ALLOWED_ORDER_FIELDS = ['id', 'slug', 'created_at', 'updated_at', 'published']
    if order_by not in ALLOWED_ORDER_FIELDS:
        raise ValueError(
            f"Invalid order_by field: {order_by}. "
            f"Allowed: {', '.join(ALLOWED_ORDER_FIELDS)}"
        )

    # Validate order direction
    order_dir = order_dir.upper()
    if order_dir not in ['ASC', 'DESC']:
        raise ValueError(f"Invalid order_dir: {order_dir}. Must be 'ASC' or 'DESC'")

    # Validate limit (prevent excessive queries)
    MAX_LIMIT = 1000
    if limit > MAX_LIMIT:
        raise ValueError(f"Limit {limit} exceeds maximum {MAX_LIMIT}")
    if limit < 1:
        raise ValueError("Limit must be >= 1")
    if offset < 0:
        raise ValueError("Offset must be >= 0")

    # 2. BUILD QUERY
    # Start with base query
    query = "SELECT * FROM notes WHERE deleted_at IS NULL"

    # Add filters
    params = []
    if published_only:
        query += " AND published = 1"

    # Add ordering (safe because order_by validated above)
    query += f" ORDER BY {order_by} {order_dir}"

    # Add pagination
    query += " LIMIT ? OFFSET ?"
    params.extend([limit, offset])

    # 3. EXECUTE QUERY
    db = get_db()
    rows = db.execute(query, params).fetchall()

    # 4. CREATE NOTE OBJECTS (without loading content)
    data_dir = Path(current_app.config['DATA_PATH'])
    notes = [Note.from_row(row, data_dir) for row in rows]
    return notes
```

#### Edge Cases

| Case | Handling |
|------|----------|
| Invalid order_by field | Raise ValueError (SQL injection prevention) |
| Invalid order_dir | Raise ValueError |
| Limit too large (>1000) | Raise ValueError |
| Limit zero or negative | Raise ValueError |
| Offset negative | Raise ValueError |
| No notes match filter | Return empty list |
| Offset beyond results | Return empty list |

#### Performance Considerations

- **No file I/O**: Content not loaded, only database query
- **Indexed queries**: Ensure created_at and updated_at have database indexes
- **Pagination**: Use LIMIT/OFFSET for efficient large result sets
- **Typical performance**: <10ms for queries on tables with thousands of notes

---

### 4. update_note()

#### Purpose

Update note content and/or published status, maintaining file-database synchronization.
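The file-first, database-second ordering that this function relies on can be tried in isolation. The following is a minimal, self-contained sketch rather than StarPunk's implementation: it uses an in-memory SQLite database and a temporary directory, the helper names `atomic_write`, `update_with_sync`, and `demo` are hypothetical, and SHA-256 is assumed as the content hash (the actual algorithm behind `calculate_content_hash` is not specified here).

```python
import hashlib
import os
import sqlite3
import tempfile
from pathlib import Path


def atomic_write(path: Path, content: str) -> None:
    """Write to a temp file in the same directory, then rename it over path."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        # Remove the temp file if anything failed before the rename
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def update_with_sync(db: sqlite3.Connection, path: Path, slug: str, content: str) -> str:
    """Overwrite the file first, then record the new hash in the database."""
    atomic_write(path, content)
    new_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()  # assumed algorithm
    try:
        db.execute("UPDATE notes SET content_hash = ? WHERE slug = ?", (new_hash, slug))
        db.commit()
    except sqlite3.Error:
        # The file already holds the new content; only the database is rolled back
        db.rollback()
        raise
    return new_hash


def demo() -> tuple[str, str]:
    """Run the sketch against throwaway storage; returns (file text, stored hash)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "my-note.md"
        path.write_text("old", encoding="utf-8")
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE notes (slug TEXT PRIMARY KEY, content_hash TEXT)")
        db.execute("INSERT INTO notes VALUES (?, ?)", ("my-note", "stale"))
        db.commit()
        update_with_sync(db, path, "my-note", "new text")
        stored = db.execute(
            "SELECT content_hash FROM notes WHERE slug = ?", ("my-note",)
        ).fetchone()[0]
        text = path.read_text(encoding="utf-8")
        db.close()
        return text, stored
```

Calling `demo()` exercises the success path only. The failure path illustrates the trade-off in this ordering: if the `UPDATE` fails, the file already contains the new content and the stored hash is stale until the update is retried or the hash is recalculated.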
#### Type Signature

```python
def update_note(
    slug: Optional[str] = None,
    id: Optional[int] = None,
    content: Optional[str] = None,
    published: Optional[bool] = None
) -> Note:
    """
    Update a note's content and/or published status

    Updates note content and/or metadata, keeping file and database in
    sync. At least one of content or published must be provided.

    Args:
        slug: Note slug to update (mutually exclusive with id)
        id: Note ID to update (mutually exclusive with slug)
        content: New markdown content (None = no change)
        published: New published status (None = no change)

    Returns:
        Updated Note object with new content and metadata

    Raises:
        ValueError: If both slug and id provided, or neither provided
        ValueError: If neither content nor published provided (no changes)
        NoteNotFoundError: If note doesn't exist
        InvalidNoteDataError: If content is empty/whitespace (when provided)
        NoteSyncError: If the file cannot be written (the underlying OSError
            is wrapped), or if the file update succeeds but the database
            update fails

    Examples:
        >>> # Update content only
        >>> note = update_note(
        ...     slug="my-note",
        ...     content="# Updated content\\n\\nNew text here."
        ... )

        >>> # Publish a draft
        >>> note = update_note(slug="draft-note", published=True)

        >>> # Update both content and status
        >>> note = update_note(
        ...     id=42,
        ...     content="New content",
        ...     published=True
        ... )

        >>> # Unpublish a note
        >>> note = update_note(slug="old-post", published=False)

    Transaction Safety:
        1. Validates parameters
        2. Retrieves existing note from database
        3. If content changed: overwrites the file atomically (old content
           is not preserved)
        4. Begins database transaction
        5. Updates database record
        6. If database fails: logs the error and raises NoteSyncError (the
           file already holds the new content and is not rolled back)
        7. If successful: commits transaction, returns updated Note

    Notes:
        - Slug cannot be changed (use delete + create for that)
        - updated_at is automatically set to current time
        - Content hash recalculated if content changes
        - File is overwritten atomically (temp file + rename)
        - Old file content is lost (no backup by default)
    """
```

#### Implementation Algorithm

```python
def update_note(
    slug: Optional[str] = None,
    id: Optional[int] = None,
    content: Optional[str] = None,
    published: Optional[bool] = None
) -> Note:
    # 1. VALIDATE PARAMETERS
    if slug is None and id is None:
        raise ValueError("Must provide either slug or id")
    if slug is not None and id is not None:
        raise ValueError("Cannot provide both slug and id")
    if content is None and published is None:
        raise ValueError("Must provide at least one of content or published to update")

    # Validate content if provided
    if content is not None:
        if not content or not content.strip():
            raise InvalidNoteDataError(
                'content',
                content,
                'Content cannot be empty or whitespace-only'
            )

    # 2. GET EXISTING NOTE
    existing_note = get_note(slug=slug, id=id, load_content=False)
    if existing_note is None:
        identifier = slug if slug is not None else id
        raise NoteNotFoundError(identifier)

    # 3. SETUP
    updated_at = datetime.utcnow()
    data_dir = Path(current_app.config['DATA_PATH'])
    note_path = data_dir / existing_note.file_path

    # Validate path (security check)
    if not validate_note_path(note_path, data_dir):
        raise NoteSyncError(
            'update',
            f'Note file path outside data directory: {note_path}',
            'Path validation failed'
        )

    # 4. UPDATE FILE (if content changed)
    new_content_hash = existing_note.content_hash
    if content is not None:
        try:
            # Write new content atomically
            write_note_file(note_path, content)
            # Calculate new hash
            new_content_hash = calculate_content_hash(content)
        except OSError as e:
            raise NoteSyncError(
                'update',
                f'Failed to write file: {e}',
                f'Could not update note file: {note_path}'
            ) from e

    # 5. UPDATE DATABASE
    db = get_db()

    # Build update query based on what changed
    update_fields = ['updated_at = ?']
    params = [updated_at]

    if content is not None:
        update_fields.append('content_hash = ?')
        params.append(new_content_hash)

    if published is not None:
        update_fields.append('published = ?')
        params.append(published)

    # Add WHERE clause parameter
    if slug is not None:
        where_clause = "slug = ?"
        params.append(slug)
    else:
        where_clause = "id = ?"
        params.append(id)

    query = f"UPDATE notes SET {', '.join(update_fields)} WHERE {where_clause}"

    try:
        db.execute(query, params)
        db.commit()
    except Exception as e:
        # Database update failed
        # File has been updated, but we can't roll that back easily
        # Log error and raise
        current_app.logger.error(
            f'Database update failed for note {existing_note.slug}: {e}'
        )
        raise NoteSyncError(
            'update',
            f'Database update failed: {e}',
            f'Failed to update note: {existing_note.slug}'
        ) from e

    # 6. RETURN UPDATED NOTE
    updated_note = get_note(slug=existing_note.slug, load_content=True)
    return updated_note
```

#### Edge Cases

| Case | Handling |
|------|----------|
| Note not found | Raise NoteNotFoundError |
| Empty content provided | Raise InvalidNoteDataError |
| No changes provided | Raise ValueError |
| File write fails | Raise NoteSyncError before database update |
| Database update fails | Raise NoteSyncError (file already updated) |
| Note soft-deleted | get_note returns None, raises NoteNotFoundError |

#### Transaction Safety Considerations

**Problem**: Update is not fully atomic because file write happens before database update.

**Risk**: If database update fails, file has already been modified.

**Mitigation Options**:

1. **Accept the risk** (Recommended for V1)
   - File system is source of truth
   - Database can be rebuilt from files
   - Database update failures are rare
   - Risk is acceptable for single-user system

2. **Backup before update** (Future enhancement)
   - Copy old file to `.backup/` before writing
   - If database fails, restore from backup
   - Adds complexity and storage overhead

3. **Database-first approach** (Alternative design)
   - Update database first
   - Then update file
   - Rollback database if file fails
   - Problem: Database shows new hash before file is updated

**V1 Decision**: Accept risk, log errors, rely on file system as source of truth.

---

### 5. delete_note()

#### Purpose

Delete a note, either soft delete (mark as deleted) or hard delete (remove completely).

#### Type Signature

```python
def delete_note(
    slug: Optional[str] = None,
    id: Optional[int] = None,
    soft: bool = True
) -> None:
    """
    Delete a note (soft or hard delete)

    Deletes a note either by marking it as deleted (soft delete) or by
    permanently removing the file and database record (hard delete).

    Args:
        slug: Note slug to delete (mutually exclusive with id)
        id: Note ID to delete (mutually exclusive with slug)
        soft: If True, soft delete (mark deleted_at); if False, hard delete (default: True)

    Returns:
        None

    Raises:
        ValueError: If both slug and id provided, or neither provided
        NoteSyncError: If path validation fails, or if the database update
            or delete fails

    Examples:
        >>> # Soft delete (default)
        >>> delete_note(slug="old-note")
        >>> # Note marked as deleted, file remains

        >>> # Hard delete
        >>> delete_note(slug="spam-note", soft=False)
        >>> # Note and file permanently removed

        >>> # Delete by ID
        >>> delete_note(id=42, soft=False)

    Soft Delete:
        - Sets deleted_at timestamp in database
        - File remains on disk (optionally moved to .trash/)
        - Note excluded from normal queries (which filter on deleted_at IS NULL)
        - Can be undeleted by clearing deleted_at (future feature)

    Hard Delete:
        - Removes database record permanently
        - Deletes file from disk
        - Cannot be recovered
        - Use for spam, test data, or confirmed deletions

    Transaction Safety:
        Soft delete:
            1. Updates database (sets deleted_at)
            2. Optionally moves file to .trash/
            3. If move fails: log warning but succeed (database is source of truth)

        Hard delete:
            1. Deletes database record
            2. Deletes file from disk
            3. If file delete fails: log warning but succeed (record already gone)

    Notes:
        - Soft delete is default and recommended
        - Hard delete is permanent and cannot be undone
        - Missing files during hard delete are not errors (idempotent)
        - Deleting a missing or already-deleted note returns successfully
          without raising (idempotent)
        - File move/delete failures are logged, not raised
    """
```

#### Implementation Algorithm

```python
def delete_note(
    slug: Optional[str] = None,
    id: Optional[int] = None,
    soft: bool = True
) -> None:
    # 1. VALIDATE PARAMETERS
    if slug is None and id is None:
        raise ValueError("Must provide either slug or id")
    if slug is not None and id is not None:
        raise ValueError("Cannot provide both slug and id")

    # 2. GET EXISTING NOTE
    # For soft delete, exclude already soft-deleted notes
    # For hard delete, get note even if soft-deleted
    if soft:
        existing_note = get_note(slug=slug, id=id, load_content=False)
    else:
        # Hard delete: query including soft-deleted notes
        db = get_db()
        if slug is not None:
            row = db.execute(
                "SELECT * FROM notes WHERE slug = ?",
                (slug,)
            ).fetchone()
        else:
            row = db.execute(
                "SELECT * FROM notes WHERE id = ?",
                (id,)
            ).fetchone()

        if row is None:
            existing_note = None
        else:
            data_dir = Path(current_app.config['DATA_PATH'])
            existing_note = Note.from_row(row, data_dir)

    # 3. CHECK IF NOTE EXISTS
    if existing_note is None:
        # Note not found - could already be deleted
        # For idempotency, don't raise error - just return
        return

    # 4. SETUP
    data_dir = Path(current_app.config['DATA_PATH'])
    note_path = data_dir / existing_note.file_path

    # Validate path (security check)
    if not validate_note_path(note_path, data_dir):
        raise NoteSyncError(
            'delete',
            f'Note file path outside data directory: {note_path}',
            'Path validation failed'
        )

    # 5. PERFORM DELETION
    db = get_db()

    if soft:
        # SOFT DELETE: Mark as deleted in database
        deleted_at = datetime.utcnow()
        try:
            db.execute(
                "UPDATE notes SET deleted_at = ? WHERE id = ?",
                (deleted_at, existing_note.id)
            )
            db.commit()
        except Exception as e:
            raise NoteSyncError(
                'delete',
                f'Database update failed: {e}',
                f'Failed to soft delete note: {existing_note.slug}'
            ) from e

        # Optionally move file to trash (best effort)
        # This is optional and failure is not critical
        try:
            delete_note_file(note_path, soft=True, data_dir=data_dir)
        except Exception as e:
            current_app.logger.warning(
                f'Failed to move file to trash for note {existing_note.slug}: {e}'
            )
            # Don't fail - database update succeeded

    else:
        # HARD DELETE: Remove from database and filesystem
        try:
            db.execute(
                "DELETE FROM notes WHERE id = ?",
                (existing_note.id,)
            )
            db.commit()
        except Exception as e:
            raise NoteSyncError(
                'delete',
                f'Database delete failed: {e}',
                f'Failed to delete note: {existing_note.slug}'
            ) from e

        # Delete file (best effort)
        try:
            delete_note_file(note_path, soft=False)
        except FileNotFoundError:
            # File already gone - that's fine
            current_app.logger.info(
                f'File already deleted for note {existing_note.slug}'
            )
        except Exception as e:
            current_app.logger.warning(
                f'Failed to delete file for note {existing_note.slug}: {e}'
            )
            # Don't fail - database record already deleted

    # 6. RETURN (no value)
    return None
```

#### Edge Cases

| Case | Handling |
|------|----------|
| Note not found | Return successfully (idempotent) |
| Note already soft-deleted | Soft delete returns successfully; hard delete proceeds |
| File missing during delete | Log at INFO level, continue (idempotent) |
| File delete fails | Log warning, continue (database is source of truth) |
| Database delete fails | Raise NoteSyncError |

#### Soft vs Hard Delete Decision Matrix

| Scenario | Recommended Action |
|----------|-------------------|
| User deletes note from UI | Soft delete (can undo) |
| Cleanup of test data | Hard delete |
| Removing spam | Hard delete |
| Bulk operations | Soft delete (safer) |
| Storage cleanup | Hard delete old soft-deleted notes |

---

## Database Transaction Patterns

### Transaction Principles

1. **File operations before database operations** - Fail fast on disk issues (applies to create and update; deletes change the database first)
2. **Commit explicitly** - Never rely on auto-commit
3. **Rollback on error** - Clean up partial changes
4. **Cleanup orphaned files** - Best effort, log failures
5. **Idempotent operations** - Safe to retry

### Pattern 1: Create Operation

```python
# 1. Write file (before database)
write_note_file(path, content)

try:
    # 2. Begin transaction (implicit in SQLite)
    db.execute("INSERT INTO notes ...")
    # 3a. Success path
    db.commit()
except Exception:
    # 3b. Error path: delete the file we created
    try:
        path.unlink()
    except OSError:
        # Log but don't fail
        logger.warning("Failed to cleanup")
    raise NoteSyncError(...)

return note
```

### Pattern 2: Update Operation

```python
# 1. Write file (before database)
write_note_file(path, new_content)

try:
    # 2. Update database
    db.execute("UPDATE notes ...")
    # 3a. Success path
    db.commit()
except Exception:
    # 3b. Error path
    # File already updated, can't rollback easily
    # Accept inconsistency, log error
    logger.error("Database update failed")
    raise NoteSyncError(...)

return note
```

### Pattern 3: Delete Operation (Soft)

```python
# 1. Update database first (soft delete)
db.execute("UPDATE notes SET deleted_at = ...")
db.commit()

# 2. Move file to trash (best effort)
try:
    delete_note_file(path, soft=True)
except Exception:
    logger.warning("Failed to move to trash")
    # Continue - database update succeeded
```

### Pattern 4: Delete Operation (Hard)

```python
# 1. Delete database record
db.execute("DELETE FROM notes ...")
db.commit()

# 2. Delete file (best effort)
try:
    delete_note_file(path, soft=False)
except FileNotFoundError:
    # Already gone, that's fine
    pass
except Exception:
    logger.warning("Failed to delete file")
    # Continue - database delete succeeded
```

---

## Error Handling Strategy

### Error Categories

1. **User Errors** (400-level)
   - Empty content
   - Invalid parameters
   - Note not found
   - Action: Return clear error message, don't log as error

2. **System Errors** (500-level)
   - Disk full
   - Permission denied
   - Database locked
   - Action: Log error, return generic message to user

3. **Sync Errors** (Consistency)
   - File write succeeded, database failed
   - Database succeeded, file failed
   - Action: Log detailed error, attempt cleanup, raise NoteSyncError

### Error Messages

**Good Error Messages**:

```python
raise NoteNotFoundError(
    slug,
    f"Note '{slug}' does not exist or has been deleted"
)

raise InvalidNoteDataError(
    'content',
    content,
    "Note content cannot be empty. Please provide markdown content."
)

raise NoteSyncError(
    'create',
    f"Database insert failed: {str(e)}",
    f"Failed to create note. The file was written successfully but "
    f"could not be registered in the database. Please check database "
    f"connectivity and try again."
)
```

**Bad Error Messages**:

```python
raise Exception("Error")
raise ValueError("Invalid")
raise RuntimeError("Failed")
```

### Logging Strategy

```python
# User errors: INFO level (normal operation)
logger.info(f"Note not found: {slug}")

# Sync issues: WARNING level (cleanup failures)
logger.warning(f"Failed to cleanup orphaned file: {path}")

# System errors: ERROR level (unexpected failures)
logger.error(f"Database update failed for note {slug}: {e}")

# Integrity issues: WARNING level (hash mismatch)
logger.warning(f"Content hash mismatch for note {slug}")
```

---

## File/Database Sync Strategy

### Synchronization Guarantee

**Goal**: Ensure every database record has a corresponding file, and vice versa.

**Approach**: Transactional operations with cleanup on failure.

### Order of Operations

#### Create

```
1. Write file
2. Insert database
3. If database fails: delete file
```

#### Update

```
1. Write new file content
2. Update database
3. If database fails: log error (file already updated)
```

#### Delete (Soft)

```
1. Update database (set deleted_at)
2. Move file to trash (optional, best effort)
```

#### Delete (Hard)

```
1. Delete database record
2.
   Delete file (best effort)
```

### Consistency Checks

**Orphaned Files** (file exists, no database record):
- Can occur if database insert fails and cleanup fails
- Detection: List all files, check if in database
- Resolution: Delete file or import into database
- Prevention: Reliable cleanup in create_note()

**Orphaned Records** (database record exists, no file):
- Can occur if file is deleted externally
- Detection: Query database, check if file exists
- Resolution: Recreate file from backup or delete record
- Prevention: File permissions, path validation

**Hash Mismatches** (file content doesn't match stored hash):
- Can occur if file is edited externally
- Detection: verify_integrity() check
- Resolution: Recalculate hash, update database
- Prevention: File permissions, user education

### Recovery Procedures

**Manual Recovery** (future admin tool):

```python
def check_sync_status():
    """Check for orphaned files and records"""
    # Find orphaned files
    # Find orphaned records
    # Find hash mismatches
    # Return report


def cleanup_orphans():
    """Clean up orphaned files and records"""
    # Delete orphaned files
    # Delete orphaned records (or attempt to recreate files)
    # Recalculate hashes
```

---

## Security Considerations

### SQL Injection Prevention

**Risk**: User input in SQL queries

**Mitigation**:

```python
# GOOD: Parameterized query
db.execute("SELECT * FROM notes WHERE slug = ?", (slug,))

# BAD: String interpolation
db.execute(f"SELECT * FROM notes WHERE slug = '{slug}'")
```

**Validation**:
- All `order_by` fields validated against whitelist
- All user inputs passed as parameters, never interpolated
- SQLite parameter binding prevents injection

### Path Traversal Prevention

**Risk**: Malicious slugs that escape data directory

**Mitigation**:

```python
# Always validate paths
if not validate_note_path(note_path, data_dir):
    raise NoteSyncError(...)
# validate_note_path uses Path.resolve() and is_relative_to()
# This prevents ../../../etc/passwd style attacks
```

**Validation**:
- All file paths validated before use
- Paths resolved to absolute before checking
- Path must be within data_dir

### Content Validation

**Risk**: Malicious content

**Mitigation**:
- Markdown is safe by design (no code execution)
- HTML rendering happens on display (template escaping)
- No execution of user content
- Content hash ensures integrity

**Note**: StarPunk is single-user, so content trust is high. User is trusted to not attack themselves.

### File System Security

**Permissions**:
- Data directory: 755 (rwxr-xr-x)
- Note files: 644 (rw-r--r--)
- Trash directory: 755 (rwxr-xr-x)

**Isolation**:
- All notes within data_dir
- No symlinks followed
- Path validation prevents escape

---

## Integration with Other Modules

### Using utils.py

```python
from starpunk.utils import (
    generate_slug,           # Create slug from content
    make_slug_unique,        # Add suffix if collision
    validate_slug,           # Check slug format
    generate_note_path,      # Build file path
    ensure_note_directory,   # Create directories
    write_note_file,         # Atomic file write
    read_note_file,          # Read file content
    delete_note_file,        # Delete or trash file
    calculate_content_hash,  # SHA-256 hash
    validate_note_path       # Security check
)
```

### Using models.py

```python
from starpunk.models import Note

# Create Note from database row
note = Note.from_row(row, data_dir)

# Access properties
note.slug       # URL slug
note.content    # Lazy-loaded markdown content
note.html       # Rendered HTML
note.title      # Extracted title
note.permalink  # /note/slug
note.published  # Boolean status

# Serialize
note.to_dict(include_content=True)
```

### Using database.py

```python
from starpunk.database import get_db

# Get database connection
db = get_db()

# Execute queries
row = db.execute("SELECT * FROM notes WHERE slug = ?", (slug,)).fetchone()
rows = db.execute("SELECT * FROM notes").fetchall()

# Transactions
db.execute(
    "INSERT INTO "
    "notes ..."
)
db.commit()  # Success

db.execute("UPDATE notes ...")
db.rollback()  # Error recovery
```

### Used by routes.py (Phase 4)

```python
from flask import abort, flash, redirect, render_template, request

from starpunk.notes import (
    create_note,
    get_note,
    list_notes,
    update_note,
    delete_note
)

# Example route handlers
@app.route('/admin/notes', methods=['POST'])
def admin_create_note():
    content = request.form.get('content')
    published = request.form.get('published') == 'true'

    try:
        note = create_note(content, published)
        return redirect(f'/admin/notes/{note.slug}')
    except InvalidNoteDataError as e:
        flash(str(e))
        return redirect('/admin/notes/new')


@app.route('/note/<slug>')
def view_note(slug):
    note = get_note(slug=slug)
    if note is None or not note.published:
        abort(404)
    return render_template('note.html', note=note)
```

---

## Testing Strategy

### Test Organization

File: `tests/test_notes.py`

```python
"""
Tests for notes management module

Test categories:
- Note creation tests
- Note retrieval tests
- Note listing tests
- Note update tests
- Note deletion tests
- Edge case tests
- Error handling tests
- Integration tests
"""

import pytest
from pathlib import Path
from datetime import datetime

from starpunk.notes import (
    create_note,
    get_note,
    list_notes,
    update_note,
    delete_note,
    NoteNotFoundError,
    InvalidNoteDataError,
    NoteSyncError
)
from starpunk.database import get_db
```

### Test Categories

#### 1. Create Note Tests

```python
class TestCreateNote:
    """Test note creation"""

    def test_create_simple_note(self, app, client):
        """Test creating a basic note"""
        note = create_note("# Test Note\n\nContent here.", published=False)

        assert note.slug is not None
        assert note.published is False
        assert note.content == "# Test Note\n\nContent here."
        # Verify file exists
        data_dir = Path(app.config['DATA_PATH'])
        note_path = data_dir / note.file_path
        assert note_path.exists()

        # Verify database record
        db = get_db()
        row = db.execute("SELECT * FROM notes WHERE slug = ?", (note.slug,)).fetchone()
        assert row is not None

    def test_create_published_note(self, app, client):
        """Test creating a published note"""
        note = create_note("Published content", published=True)
        assert note.published is True

    def test_create_with_timestamp(self, app, client):
        """Test creating note with specific timestamp"""
        created_at = datetime(2024, 1, 1, 12, 0, 0)
        note = create_note("Backdated note", created_at=created_at)
        assert note.created_at == created_at

    def test_create_generates_unique_slug(self, app, client):
        """Test slug uniqueness enforcement"""
        note1 = create_note("# Same Title\n\nContent 1")
        note2 = create_note("# Same Title\n\nContent 2")

        assert note1.slug != note2.slug
        assert note2.slug.startswith(note1.slug.rsplit('-', 1)[0])

    def test_create_empty_content_fails(self, app, client):
        """Test empty content raises error"""
        with pytest.raises(InvalidNoteDataError) as exc:
            create_note("")

        # Inspect the exception after the block; code following the raising
        # call inside the `with` body would never run
        assert 'content' in str(exc.value).lower()

    def test_create_whitespace_content_fails(self, app, client):
        """Test whitespace-only content raises error"""
        with pytest.raises(InvalidNoteDataError):
            create_note(" \n\t ")

    def test_create_unicode_content(self, app, client):
        """Test unicode content is handled correctly"""
        note = create_note("# 你好世界\n\nTest unicode 🚀")
        assert "你好世界" in note.content
        assert "🚀" in note.content

    def test_create_very_long_content(self, app, client):
        """Test handling very long content"""
        long_content = "x" * 1_000_000  # 1MB
        note = create_note(long_content)
        assert len(note.content) == 1_000_000

    def test_create_file_write_failure_rollback(self, app, client, monkeypatch):
        """Test database rollback if file write fails"""
        # Mock write_note_file to fail (accept any call signature)
        def mock_write_fail(*args, **kwargs):
            raise OSError("Disk full")
        monkeypatch.setattr('starpunk.notes.write_note_file', mock_write_fail)

        with pytest.raises(NoteSyncError):
            create_note("Test content")

        # Verify no database record created
        db = get_db()
        count = db.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
        assert count == 0
```

#### 2. Get Note Tests

```python
class TestGetNote:
    """Test note retrieval"""

    def test_get_by_slug(self, app, client):
        """Test retrieving note by slug"""
        created = create_note("Test content")
        retrieved = get_note(slug=created.slug)

        assert retrieved is not None
        assert retrieved.slug == created.slug
        assert retrieved.content == "Test content"

    def test_get_by_id(self, app, client):
        """Test retrieving note by ID"""
        created = create_note("Test content")
        retrieved = get_note(id=created.id)

        assert retrieved is not None
        assert retrieved.id == created.id

    def test_get_nonexistent_returns_none(self, app, client):
        """Test getting nonexistent note returns None"""
        note = get_note(slug="does-not-exist")
        assert note is None

    def test_get_without_identifier_raises_error(self, app, client):
        """Test error when neither slug nor id provided"""
        with pytest.raises(ValueError):
            get_note()

    def test_get_with_both_identifiers_raises_error(self, app, client):
        """Test error when both slug and id provided"""
        with pytest.raises(ValueError):
            get_note(slug="test", id=42)

    def test_get_without_loading_content(self, app, client):
        """Test getting note without loading content"""
        created = create_note("Test content")
        retrieved = get_note(slug=created.slug, load_content=False)

        assert retrieved is not None
        # Content will be lazy-loaded on access
        assert retrieved.content == "Test content"
```

#### 3.
List Notes Tests

```python
class TestListNotes:
    """Test note listing"""

    def test_list_all_notes(self, app, client):
        """Test listing all notes"""
        create_note("Note 1", published=True)
        create_note("Note 2", published=False)

        notes = list_notes()
        assert len(notes) == 2

    def test_list_published_only(self, app, client):
        """Test filtering published notes"""
        create_note("Published", published=True)
        create_note("Draft", published=False)

        notes = list_notes(published_only=True)
        assert len(notes) == 1
        assert notes[0].published is True

    def test_list_with_pagination(self, app, client):
        """Test pagination"""
        for i in range(25):
            create_note(f"Note {i}")

        # First page
        page1 = list_notes(limit=10, offset=0)
        assert len(page1) == 10

        # Second page
        page2 = list_notes(limit=10, offset=10)
        assert len(page2) == 10

        # Third page
        page3 = list_notes(limit=10, offset=20)
        assert len(page3) == 5

    def test_list_ordering(self, app, client):
        """Test ordering by different fields"""
        note1 = create_note("First", created_at=datetime(2024, 1, 1))
        note2 = create_note("Second", created_at=datetime(2024, 1, 2))

        # Newest first (default)
        notes = list_notes(order_by='created_at', order_dir='DESC')
        assert notes[0].slug == note2.slug

        # Oldest first
        notes = list_notes(order_by='created_at', order_dir='ASC')
        assert notes[0].slug == note1.slug

    def test_list_invalid_order_field(self, app, client):
        """Test invalid order_by field raises error"""
        with pytest.raises(ValueError) as exc:
            list_notes(order_by='malicious; DROP TABLE notes;')

        # Checked after the block so the assertion actually runs
        assert 'Invalid order_by' in str(exc.value)

    def test_list_invalid_order_direction(self, app, client):
        """Test invalid order direction raises error"""
        with pytest.raises(ValueError):
            list_notes(order_dir='INVALID')
```

#### 4.
Update Note Tests

```python
class TestUpdateNote:
    """Test note updates"""

    def test_update_content(self, app, client):
        """Test updating note content"""
        note = create_note("Original content")
        updated = update_note(slug=note.slug, content="Updated content")

        assert updated.content == "Updated content"
        assert updated.updated_at > note.updated_at

    def test_update_published_status(self, app, client):
        """Test updating published status"""
        note = create_note("Draft", published=False)
        updated = update_note(slug=note.slug, published=True)

        assert updated.published is True

    def test_update_both_content_and_status(self, app, client):
        """Test updating content and status together"""
        note = create_note("Draft", published=False)
        updated = update_note(
            slug=note.slug,
            content="Published content",
            published=True
        )

        assert updated.content == "Published content"
        assert updated.published is True

    def test_update_nonexistent_raises_error(self, app, client):
        """Test updating nonexistent note raises error"""
        with pytest.raises(NoteNotFoundError):
            update_note(slug="does-not-exist", content="New content")

    def test_update_empty_content_fails(self, app, client):
        """Test updating with empty content raises error"""
        note = create_note("Original")

        with pytest.raises(InvalidNoteDataError):
            update_note(slug=note.slug, content="")

    def test_update_no_changes_fails(self, app, client):
        """Test updating with no changes raises error"""
        note = create_note("Content")

        with pytest.raises(ValueError):
            update_note(slug=note.slug)
```

#### 5.
Delete Note Tests

```python
class TestDeleteNote:
    """Test note deletion"""

    def test_soft_delete(self, app, client):
        """Test soft deletion"""
        note = create_note("To be deleted")
        delete_note(slug=note.slug, soft=True)

        # Note not found in normal queries
        retrieved = get_note(slug=note.slug)
        assert retrieved is None

        # But record still in database with deleted_at set
        db = get_db()
        row = db.execute(
            "SELECT * FROM notes WHERE slug = ?",
            (note.slug,)
        ).fetchone()
        assert row is not None
        assert row['deleted_at'] is not None

    def test_hard_delete(self, app, client):
        """Test hard deletion"""
        note = create_note("To be deleted")
        data_dir = Path(app.config['DATA_PATH'])
        note_path = data_dir / note.file_path

        delete_note(slug=note.slug, soft=False)

        # Note not in database
        db = get_db()
        row = db.execute(
            "SELECT * FROM notes WHERE slug = ?",
            (note.slug,)
        ).fetchone()
        assert row is None

        # File deleted
        assert not note_path.exists()

    def test_delete_nonexistent_succeeds(self, app, client):
        """Test deleting nonexistent note is idempotent"""
        # Should not raise error
        delete_note(slug="does-not-exist", soft=True)
        delete_note(slug="does-not-exist", soft=False)

    def test_delete_already_deleted_succeeds(self, app, client):
        """Test deleting already-deleted note is idempotent"""
        note = create_note("Test")
        delete_note(slug=note.slug, soft=True)

        # Delete again - should succeed
        delete_note(slug=note.slug, soft=True)
```

#### 6.
Integration Tests

```python
class TestNoteLifecycle:
    """Test complete note lifecycle"""

    def test_create_read_update_delete_cycle(self, app, client):
        """Test full CRUD cycle"""
        # Create
        note = create_note("Initial content", published=False)
        assert note.slug is not None

        # Read
        retrieved = get_note(slug=note.slug)
        assert retrieved.content == "Initial content"
        assert retrieved.published is False

        # Update content
        updated = update_note(slug=note.slug, content="Updated content")
        assert updated.content == "Updated content"

        # Publish
        published = update_note(slug=note.slug, published=True)
        assert published.published is True

        # List (should appear)
        notes = list_notes(published_only=True)
        assert any(n.slug == note.slug for n in notes)

        # Delete
        delete_note(slug=note.slug, soft=False)

        # Verify gone
        retrieved = get_note(slug=note.slug)
        assert retrieved is None

    def test_file_database_sync_maintained(self, app, client):
        """Test file and database stay in sync"""
        data_dir = Path(app.config['DATA_PATH'])

        # Create note
        note = create_note("Sync test")
        note_path = data_dir / note.file_path

        # File exists
        assert note_path.exists()

        # Database record exists
        db = get_db()
        row = db.execute("SELECT * FROM notes WHERE slug = ?", (note.slug,)).fetchone()
        assert row is not None

        # Update note
        update_note(slug=note.slug, content="Updated")

        # File updated
        assert note_path.read_text() == "Updated"

        # Database updated
        row = db.execute("SELECT * FROM notes WHERE slug = ?", (note.slug,)).fetchone()
        assert row['updated_at'] > row['created_at']

        # Delete note
        delete_note(slug=note.slug, soft=False)

        # File deleted
        assert not note_path.exists()

        # Database deleted
        row = db.execute("SELECT * FROM notes WHERE slug = ?", (note.slug,)).fetchone()
        assert row is None
```

### Test Coverage Requirements

- Minimum 90% code coverage for notes.py
- All functions tested with multiple scenarios
- All error paths tested
- All edge cases covered
- Integration tests for CRUD cycles
- Performance tests for list operations with
  large datasets

### Test Fixtures

```python
@pytest.fixture
def app():
    """Create test Flask app"""
    # Setup test app with test database
    # Return app


@pytest.fixture
def client(app):
    """Create test client"""
    return app.test_client()


@pytest.fixture
def db(app):
    """Get test database"""
    return get_db()
```

---

## Performance Considerations

### Lazy Loading Benefits

- **list_notes()**: No file I/O, only database queries
- **get_note(load_content=False)**: Metadata only, fast
- **Pagination**: LIMIT/OFFSET efficient for large datasets

### Database Indexes

Ensure these indexes exist:

```sql
CREATE INDEX idx_notes_slug ON notes(slug);
CREATE INDEX idx_notes_created_at ON notes(created_at);
CREATE INDEX idx_notes_updated_at ON notes(updated_at);
CREATE INDEX idx_notes_published ON notes(published);
CREATE INDEX idx_notes_deleted_at ON notes(deleted_at);
```

### Performance Targets

| Operation | Target | Dataset / Breakdown |
|-----------|--------|---------------------|
| create_note() | < 20ms | Any size |
| get_note() | < 10ms | < 5ms DB + < 5ms file I/O |
| list_notes() | < 10ms | 1000s of notes |
| update_note() | < 20ms | Any size |
| delete_note() | < 10ms | Any size |

### Optimization Opportunities (Future)

1. **Content Caching**: Cache rendered HTML in memory
2. **Database Connection Pooling**: Reuse connections
3. **Batch Operations**: Bulk create/update for imports
4.
   **Async File I/O**: Non-blocking file operations

---

## Configuration

### Required Config Values

```python
# Flask app config
DATA_PATH = '/path/to/data'  # Base directory for notes

# Optional config
MAX_CONTENT_SIZE = 10 * 1024 * 1024  # 10MB default
TRASH_ENABLED = True  # Enable soft delete file trash
BACKUP_ON_UPDATE = False  # Backup old content before update
```

### Usage

```python
from pathlib import Path

from flask import current_app

data_dir = Path(current_app.config['DATA_PATH'])
max_size = current_app.config.get('MAX_CONTENT_SIZE', 10 * 1024 * 1024)
```

---

## Acceptance Criteria

Phase 2.1 is complete when:

- [ ] All 5 functions implemented (create, get, list, update, delete)
- [ ] All functions have full type hints
- [ ] All functions have comprehensive docstrings with examples
- [ ] All custom exceptions defined
- [ ] File-database synchronization works correctly
- [ ] Transactions used for all write operations
- [ ] Error handling comprehensive (all edge cases covered)
- [ ] Path validation prevents directory traversal
- [ ] SQL injection prevention (parameterized queries, validated order_by)
- [ ] Test coverage >90%
- [ ] All tests pass
- [ ] Integration tests demonstrate full CRUD cycle
- [ ] Code formatted with Black
- [ ] Code passes flake8 linting
- [ ] No orphaned files or database records in tests
- [ ] Performance targets met for typical operations
- [ ] Error messages are clear and actionable
- [ ] Documentation complete and accurate

---

## References

- [ADR-004: File-Based Note Storage](/home/phil/Projects/starpunk/docs/decisions/ADR-004-file-based-note-storage.md)
- [ADR-007: Slug Generation Algorithm](/home/phil/Projects/starpunk/docs/decisions/ADR-007-slug-generation-algorithm.md)
- [Phase 1.1: Core Utilities](/home/phil/Projects/starpunk/docs/design/phase-1.1-core-utilities.md)
- [Phase 1.2: Data Models](/home/phil/Projects/starpunk/docs/design/phase-1.2-data-models.md)
- [Database Schema](/home/phil/Projects/starpunk/starpunk/database.py)
- [Python Coding
  Standards](/home/phil/Projects/starpunk/docs/standards/python-coding-standards.md)
- [Architecture Overview](/home/phil/Projects/starpunk/docs/architecture/overview.md)
- [Security Architecture](/home/phil/Projects/starpunk/docs/architecture/security.md)

---

## Next Steps

After completing Phase 2.1:

1. **Phase 3: Authentication** (IndieLogin integration)
2. **Phase 4: Web Routes** (Admin UI + Public views)
3. **Phase 5: Micropub** (API endpoint)
4. **Phase 6: RSS Feed** (Syndication)

Phase 2.1 provides the foundation for all note-related operations in the application.
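
As a closing illustration of the consistency checks described under File/Database Sync Strategy, the detection step for orphaned files and orphaned records could be sketched roughly as follows. This is a minimal sketch, not the planned admin tool: the function name `find_orphans`, the assumption that the `notes` table has a `file_path` column holding paths relative to the data directory, and the assumption that note files are the `*.md` files under that directory are all hypothetical choices made for the example.

```python
import sqlite3
from pathlib import Path


def find_orphans(db: sqlite3.Connection, data_dir: Path) -> tuple[set[str], set[str]]:
    """Return (orphaned_files, orphaned_records) as sets of relative paths.

    Hypothetical helper: assumes notes.file_path is relative to data_dir
    and that every note is stored as a *.md file somewhere below data_dir.
    """
    # Paths the database believes exist
    recorded = {row[0] for row in db.execute("SELECT file_path FROM notes")}

    # Paths actually present on disk, normalised to forward slashes
    on_disk = {
        path.relative_to(data_dir).as_posix()
        for path in data_dir.rglob("*.md")
    }

    orphaned_files = on_disk - recorded    # file exists, no database record
    orphaned_records = recorded - on_disk  # database record exists, no file
    return orphaned_files, orphaned_records
```

A report built from these two sets only detects drift; whether to delete, import, or recreate each entry remains a separate decision, as listed in the resolution options above. A real implementation would probably also need to skip the trash directory, which this sketch does not do.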