Bug Fixes and Edge Cases Specification

Overview

This specification details the bug fixes and edge case handling improvements planned for v1.1.1, focusing on test stability, Unicode handling, memory optimization, and session management.

Bug Fixes

1. Migration Race Condition in Tests

Problem

10 tests exhibit flaky behavior due to race conditions during database migration execution. Tests occasionally fail when migrations are executed concurrently or when the test database isn't properly initialized.

Root Cause

  • Concurrent test execution without proper isolation
  • Shared database state between tests
  • Migration lock not properly acquired
  • Test fixtures not waiting for migration completion

Solution

# starpunk/testing/fixtures.py
import os
import sqlite3
import tempfile
import threading
import unittest
from contextlib import contextmanager

# Global lock for test database operations
_test_db_lock = threading.Lock()

@contextmanager
def isolated_test_database():
    """Create isolated database for testing"""
    with _test_db_lock:
        # Create unique temp database
        temp_db = tempfile.NamedTemporaryFile(
            suffix='.db',
            delete=False
        )
        db_path = temp_db.name
        temp_db.close()

        try:
            # Initialize database with migrations
            run_migrations_sync(db_path)

            # Yield database for test
            yield db_path
        finally:
            # Cleanup
            try:
                os.unlink(db_path)
            except OSError:
                pass

def run_migrations_sync(db_path: str):
    """Run migrations synchronously with proper locking"""
    conn = sqlite3.connect(db_path)

    # Use exclusive lock during migrations
    conn.execute("BEGIN EXCLUSIVE")

    try:
        migrator = DatabaseMigrator(conn)
        migrator.run_all()
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()

# Test base class
class StarPunkTestCase(unittest.TestCase):
    """Base test case with proper database isolation"""

    def setUp(self):
        """Set up test with isolated database"""
        self.db_context = isolated_test_database()
        self.db_path = self.db_context.__enter__()
        self.app = create_app(database=self.db_path)
        self.client = self.app.test_client()

    def tearDown(self):
        """Clean up test database"""
        self.db_context.__exit__(None, None, None)

# Example test with proper isolation
class TestMigrations(StarPunkTestCase):
    def test_migration_idempotency(self):
        """Test that migrations can be run multiple times"""
        # First run happens in setUp

        # Second run should be safe
        run_migrations_sync(self.db_path)

        # Verify database state
        with sqlite3.connect(self.db_path) as conn:
            tables = conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table'"
            ).fetchall()
            self.assertIn(('notes',), tables)

Test Timing Improvements

# starpunk/testing/wait.py
import time
from typing import Callable

def wait_for_condition(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.1
) -> bool:
    """Wait for condition to become true"""
    start = time.time()

    while time.time() - start < timeout:
        if condition():
            return True
        time.sleep(interval)

    return False

# Usage in tests
def test_async_operation(self):
    """Test with proper waiting"""
    self.client.post('/notes', data={'content': 'Test'})

    # Wait for indexing to complete
    success = wait_for_condition(
        lambda: search_index_updated(),
        timeout=2.0
    )
    self.assertTrue(success)

2. Unicode Edge Cases in Slug Generation

Problem

Slug generation fails or produces invalid slugs for certain Unicode inputs, including emoji, RTL text, and combining characters.

Current Issues

  • Emoji in titles break slug generation
  • RTL languages produce confusing slugs
  • Combining characters aren't normalized
  • Zero-width characters remain in slugs

Solution

# starpunk/utils/slugify.py
import unicodedata
import re

def generate_slug(text: str, max_length: int = 50) -> str:
    """Generate URL-safe slug from text with Unicode handling"""

    if not text:
        return generate_random_slug()

    # Normalize Unicode (NFKD = compatibility decomposition)
    text = unicodedata.normalize('NFKD', text)

    # Remove non-ASCII characters but keep numbers and letters
    text = text.encode('ascii', 'ignore').decode('ascii')

    # Convert to lowercase
    text = text.lower()

    # Replace spaces and punctuation with hyphens
    text = re.sub(r'[^a-z0-9]+', '-', text)

    # Remove leading/trailing hyphens
    text = text.strip('-')

    # Collapse multiple hyphens
    text = re.sub(r'-+', '-', text)

    # Truncate to max length (at word boundary if possible)
    if len(text) > max_length:
        text = text[:max_length].rsplit('-', 1)[0]

    # If we end up with empty string, generate random
    if not text:
        return generate_random_slug()

    return text

def generate_random_slug() -> str:
    """Generate random slug when text-based generation fails"""
    import random
    import string

    return 'note-' + ''.join(
        random.choices(string.ascii_lowercase + string.digits, k=8)
    )

# Extended test cases
TEST_CASES = [
    ("Hello World", "hello-world"),
    ("Hello 👋 World", "hello-world"),  # Emoji removed
    ("مرحبا بالعالم", "note-a1b2c3d4"),  # Arabic -> random
    ("Ĥëłłö Ŵöŕłđ", "hello-world"),  # Diacritics removed
    ("Hello\u200bWorld", "helloworld"),  # Zero-width space
    ("---Hello---", "hello"),  # Multiple hyphens
    ("123", "123"),  # Numbers only
    ("!@#$%", "note-x1y2z3a4"),  # Special chars -> random
    ("a" * 100, "a" * 50),  # Truncation
    ("", "note-r4nd0m12"),  # Empty -> random
]

def test_slug_generation():
    """Test slug generation with Unicode edge cases"""
    for input_text, expected in TEST_CASES:
        result = generate_slug(input_text)
        if expected.startswith("note-"):
            # Random slug - just check format
            assert result.startswith("note-")
            assert len(result) == 13
        else:
            assert result == expected

3. RSS Feed Memory Optimization

Problem

RSS feed generation for sites with thousands of notes causes high memory usage and slow response times.

Current Issues

  • Loading all notes into memory at once
  • No pagination or limits
  • Inefficient XML building
  • No caching of generated feeds

Solution

# starpunk/feeds/rss.py
from typing import Iterator
import sqlite3

class OptimizedRSSGenerator:
    """Memory-efficient RSS feed generator"""

    def __init__(self, base_url: str, limit: int = 50):
        self.base_url = base_url
        self.limit = limit

    def generate_feed(self) -> str:
        """Generate RSS feed with streaming"""
        # Use string builder for efficiency
        parts = []
        parts.append(self._generate_header())

        # Stream notes from database
        for note in self._stream_recent_notes():
            parts.append(self._generate_item(note))

        parts.append(self._generate_footer())

        return ''.join(parts)

    def _generate_header(self) -> str:
        """Open the feed (minimal sketch; channel title/description come from site config)"""
        return (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<rss version="2.0"><channel>\n'
            f'<link>{self.base_url}</link>\n'
        )

    def _generate_footer(self) -> str:
        """Close the channel and rss elements"""
        return '</channel></rss>\n'

    def _stream_recent_notes(self) -> Iterator[dict]:
        """Stream notes without loading all into memory"""
        with get_db() as conn:
            # Use server-side cursor equivalent
            conn.row_factory = sqlite3.Row

            cursor = conn.execute(
                """
                SELECT
                    id,
                    content,
                    slug,
                    created_at,
                    updated_at
                FROM notes
                WHERE published = 1
                ORDER BY created_at DESC
                LIMIT ?
                """,
                (self.limit,)
            )

            # Yield one at a time
            for row in cursor:
                yield dict(row)

    def _generate_item(self, note: dict) -> str:
        """Generate single RSS item efficiently"""
        # Pre-calculate values once; link notes by slug (selected in the query above)
        title = extract_title(note['content'])
        url = f"{self.base_url}/notes/{note['slug']}"

        # Use string formatting for efficiency
        return f"""
        <item>
            <title>{escape_xml(title)}</title>
            <link>{url}</link>
            <guid isPermaLink="true">{url}</guid>
            <description>{escape_xml(note['content'][:500])}</description>
            <pubDate>{format_rfc822(note['created_at'])}</pubDate>
        </item>
        """

# Caching layer
from datetime import datetime, timedelta

class CachedRSSFeed:
    """RSS feed with caching"""

    def __init__(self):
        self.cache = {}
        self.cache_duration = timedelta(minutes=5)

    def get_feed(self) -> str:
        """Get RSS feed with caching"""
        now = datetime.now()

        # Check cache
        if 'feed' in self.cache:
            cached_feed, cached_time = self.cache['feed']
            if now - cached_time < self.cache_duration:
                return cached_feed

        # Generate new feed
        generator = OptimizedRSSGenerator(
            base_url=config.BASE_URL,
            limit=config.RSS_ITEM_LIMIT
        )
        feed = generator.generate_feed()

        # Update cache
        self.cache['feed'] = (feed, now)

        return feed

    def invalidate(self):
        """Invalidate cache when notes change"""
        self.cache.clear()

# Memory-efficient XML escaping
def escape_xml(text: str) -> str:
    """Escape XML special characters efficiently"""
    if not text:
        return ""

    # Use replace instead of xml.sax.saxutils for efficiency
    return (
        text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace('"', "&quot;")
            .replace("'", "&apos;")
    )
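
The acceptance criteria require cache invalidation whenever notes change. A minimal sketch of how the write path might be wired to the cache, assuming a module-level rss_cache instance; save_note_to_db is a hypothetical persistence helper standing in for the real one:

# starpunk/notes/hooks.py (sketch; module path and helper names are assumptions)
rss_cache = CachedRSSFeed()

def create_note(content: str) -> dict:
    """Create a note, then drop the cached feed so readers see it immediately"""
    note = save_note_to_db(content)  # hypothetical persistence helper

    # Any write that changes published notes must clear the feed cache;
    # otherwise readers can see a stale feed for up to cache_duration
    rss_cache.invalidate()

    return note

The same invalidate() call belongs in the update and delete paths.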

4. Session Timeout Handling

Problem

Sessions don't properly time out, leading to security issues and stale session accumulation.

Current Issues

  • No automatic session expiration
  • No cleanup of old sessions
  • Session extension not working
  • No timeout configuration

Solution

# starpunk/auth/session_improved.py
import logging
import threading
import time
from datetime import datetime, timedelta
from typing import Optional

logger = logging.getLogger(__name__)

class ImprovedSessionManager:
    """Session manager with proper timeout handling"""

    def __init__(self):
        self.timeout = config.SESSION_TIMEOUT
        self.cleanup_interval = 3600  # 1 hour
        self._start_cleanup_thread()

    def _start_cleanup_thread(self):
        """Start background cleanup thread"""
        def cleanup_loop():
            while True:
                try:
                    self.cleanup_expired_sessions()
                except Exception as e:
                    logger.error(f"Session cleanup error: {e}")
                time.sleep(self.cleanup_interval)

        thread = threading.Thread(target=cleanup_loop)
        thread.daemon = True
        thread.start()

    def create_session(self, user_id: str, remember: bool = False) -> dict:
        """Create session with appropriate timeout"""
        session_id = generate_secure_token()

        # Longer timeout for "remember me"
        if remember:
            timeout = config.SESSION_TIMEOUT_REMEMBER
        else:
            timeout = self.timeout

        expires_at = datetime.now() + timedelta(seconds=timeout)

        with get_db() as conn:
            conn.execute(
                """
                INSERT INTO sessions (
                    id, user_id, expires_at, created_at, last_activity
                )
                VALUES (?, ?, ?, ?, ?)
                """,
                (
                    session_id,
                    user_id,
                    expires_at,
                    datetime.now(),
                    datetime.now()
                )
            )

        logger.info(f"Session created for user {user_id}")

        return {
            'session_id': session_id,
            'expires_at': expires_at.isoformat(),
            'timeout': timeout
        }

    def validate_and_extend(self, session_id: str) -> Optional[str]:
        """Validate session and extend timeout on activity"""
        now = datetime.now()

        with get_db() as conn:
            # Fetch the session; named column access below assumes
            # get_db() sets row_factory = sqlite3.Row
            result = conn.execute(
                """
                SELECT user_id, expires_at, last_activity
                FROM sessions
                WHERE id = ? AND expires_at > ?
                """,
                (session_id, now)
            ).fetchone()

            if not result:
                return None

            user_id = result['user_id']
            last_activity = datetime.fromisoformat(result['last_activity'])

            # Throttle database writes: only extend the session when the
            # last recorded activity is more than five minutes old
            if now - last_activity > timedelta(minutes=5):
                new_expires = now + timedelta(seconds=self.timeout)

                conn.execute(
                    """
                    UPDATE sessions
                    SET expires_at = ?, last_activity = ?
                    WHERE id = ?
                    """,
                    (new_expires, now, session_id)
                )

                logger.debug(f"Session extended for user {user_id}")

            return user_id

    def cleanup_expired_sessions(self):
        """Remove expired sessions from database"""
        with get_db() as conn:
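            # RETURNING requires SQLite 3.35+; older versions would need cursor.rowcount instead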
            result = conn.execute(
                """
                DELETE FROM sessions
                WHERE expires_at < ?
                RETURNING id
                """,
                (datetime.now(),)
            )

            deleted_count = len(result.fetchall())

            if deleted_count > 0:
                logger.info(f"Cleaned up {deleted_count} expired sessions")

    def invalidate_session(self, session_id: str):
        """Explicitly invalidate a session"""
        with get_db() as conn:
            conn.execute(
                "DELETE FROM sessions WHERE id = ?",
                (session_id,)
            )

        # Log only a token prefix to avoid leaking full session ids
        logger.info(f"Session {session_id[:8]} invalidated")

    def get_active_sessions(self, user_id: str) -> list:
        """Get all active sessions for a user"""
        with get_db() as conn:
            result = conn.execute(
                """
                SELECT id, created_at, last_activity, expires_at
                FROM sessions
                WHERE user_id = ? AND expires_at > ?
                ORDER BY last_activity DESC
                """,
                (user_id, datetime.now())
            )

            return [dict(row) for row in result]

# Session middleware
@app.before_request
def check_session():
    """Check and extend session on each request"""
    session_id = request.cookies.get('session_id')

    if session_id:
        user_id = session_manager.validate_and_extend(session_id)

        if user_id:
            g.user_id = user_id
            g.authenticated = True
        else:
            # Clear invalid session cookie
            g.clear_session = True
            g.authenticated = False
    else:
        g.authenticated = False

@app.after_request
def update_session_cookie(response):
    """Update session cookie if needed"""
    if hasattr(g, 'clear_session') and g.clear_session:
        response.set_cookie(
            'session_id',
            '',
            expires=0,
            secure=config.SESSION_SECURE,
            httponly=True,
            samesite='Lax'
        )

    return response
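
Schema and Configuration

The session manager assumes a sessions table and several config keys that are referenced but not shown above. A sketch of both, inferred from the columns and names the code uses (exact names and values are illustrative assumptions, not the project's actual settings):

# Migration sketch: sessions table inferred from the queries above
SESSIONS_SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id            TEXT PRIMARY KEY,   -- secure random token
    user_id       TEXT NOT NULL,
    created_at    TIMESTAMP NOT NULL,
    last_activity TIMESTAMP NOT NULL,
    expires_at    TIMESTAMP NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_sessions_expires ON sessions (expires_at);
"""

# Config defaults (illustrative values)
SESSION_TIMEOUT = 60 * 60 * 24                 # 24 hours
SESSION_TIMEOUT_REMEMBER = 60 * 60 * 24 * 30   # 30 days for "remember me"
SESSION_SECURE = True                          # Secure flag on the session cookie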

Testing Strategy

Test Stability Improvements

# starpunk/testing/stability.py
from itertools import cycle
from unittest.mock import patch

import pytest

from starpunk.testing.fixtures import isolated_test_database

@pytest.fixture
def stable_test_env():
    """Provide stable test environment"""
    with patch('time.time', return_value=1234567890):
        with patch('random.choice', side_effect=cycle('abcd')):
            with isolated_test_database() as db:
                yield db

def test_with_stability(stable_test_env):
    """Test with predictable environment"""
    # Time and randomness are now deterministic
    pass

Unicode Test Suite

# starpunk/testing/unicode.py
import pytest

UNICODE_TEST_STRINGS = [
    "Simple ASCII",
    "Émoji 😀🎉🚀",
    "العربية",
    "中文字符",
    "🏳️‍🌈 flags",
    "Math: ∑∏∫",
    "Ñoño",
    "Combining: é (e + ́)",
]

@pytest.mark.parametrize("text", UNICODE_TEST_STRINGS)
def test_unicode_handling(text):
    """Test Unicode handling throughout system"""
    # Test slug generation
    slug = generate_slug(text)
    assert slug  # Should always produce something

    # Test note creation
    note = create_note(content=text)
    assert note.content == text

    # Test search
    results = search_notes(text)
    # Should not crash

    # Test RSS
    feed = generate_rss_feed()
    # Should be valid XML

Performance Testing

Memory Usage Tests

def test_rss_memory_usage():
    """Test RSS generation memory usage"""
    import tracemalloc

    # Create many notes
    for i in range(10000):
        create_note(content=f"Note {i}")

    # Measure peak memory during RSS generation
    tracemalloc.start()

    feed = generate_rss_feed()

    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    peak_mb = peak / 1024 / 1024  # peak traced bytes -> MB

    assert peak_mb < 10  # Should use less than 10MB
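
Response Time Tests

The acceptance criteria also set a 500ms response-time budget for the feed. A minimal timing sketch for the uncached path, using the OptimizedRSSGenerator from above (base URL and limit are placeholder values):

def test_rss_response_time():
    """Test that cold (uncached) feed generation stays within the 500ms budget"""
    import time

    generator = OptimizedRSSGenerator(base_url="http://example.test", limit=50)

    start = time.perf_counter()
    feed = generator.generate_feed()
    elapsed_ms = (time.perf_counter() - start) * 1000

    assert elapsed_ms < 500  # Matches the acceptance criterion below
    assert feed  # Sanity check that something was produced

Cached requests should be far faster, so the cold path is the one worth measuring.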

Acceptance Criteria

Race Condition Fixes

  1. All 10 flaky tests pass consistently
  2. Test isolation properly implemented
  3. Migration locks prevent concurrent execution
  4. Test fixtures properly synchronized

Unicode Handling

  1. Slug generation handles all Unicode input
  2. Never produces invalid/empty slugs
  3. Emoji and special characters handled gracefully
  4. RTL languages don't break system

RSS Memory Optimization

  1. Memory usage stays under 10MB for 10,000 notes
  2. Response time under 500ms
  3. Streaming implementation works correctly
  4. Cache invalidation on note changes

Session Management

  1. Sessions expire after configured timeout
  2. Expired sessions automatically cleaned up
  3. Active sessions properly extended
  4. Session invalidation works correctly

Risk Mitigation

  1. Test Stability: Run test suite 100 times to verify (see the driver sketch after this list)
  2. Unicode Compatibility: Test with real-world data
  3. Memory Leaks: Monitor long-running instances
  4. Session Security: Security review of implementation
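
A small driver for the 100-run stability check in item 1, assuming pytest is the test runner. Note that repeated in-process pytest.main() calls share module state; a shell loop over subprocesses is the more isolated alternative:

# scripts/check_stability.py (hypothetical helper script)
import sys

import pytest

def main(runs: int = 100) -> int:
    """Run the full test suite repeatedly, stopping at the first failure"""
    for i in range(runs):
        exit_code = pytest.main(["-q", "tests/"])
        if exit_code != 0:
            print(f"Run {i + 1}/{runs} failed with exit code {exit_code}")
            return int(exit_code)
    print(f"All {runs} runs passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())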