# Bug Fixes and Edge Cases Specification

## Overview

This specification details the bug fixes and edge case handling improvements planned for v1.1.1, focusing on test stability, Unicode handling, memory optimization, and session management.

## Bug Fixes

### 1. Migration Race Condition in Tests

#### Problem

10 tests exhibit flaky behavior due to race conditions during database migration execution. Tests occasionally fail when migrations are executed concurrently or when the test database isn't properly initialized.

#### Root Cause

- Concurrent test execution without proper isolation
- Shared database state between tests
- Migration lock not properly acquired
- Test fixtures not waiting for migration completion

#### Solution

```python
# starpunk/testing/fixtures.py
import os
import sqlite3
import tempfile
import threading
import unittest
from contextlib import contextmanager

# Global lock for test database operations
_test_db_lock = threading.Lock()


@contextmanager
def isolated_test_database():
    """Create isolated database for testing"""
    with _test_db_lock:
        # Create unique temp database
        temp_db = tempfile.NamedTemporaryFile(suffix='.db', delete=False)
        db_path = temp_db.name
        temp_db.close()

        try:
            # Initialize database with migrations
            run_migrations_sync(db_path)

            # Yield database for test
            yield db_path
        finally:
            # Cleanup
            try:
                os.unlink(db_path)
            except OSError:
                pass


def run_migrations_sync(db_path: str):
    """Run migrations synchronously with proper locking"""
    conn = sqlite3.connect(db_path)

    # Use exclusive lock during migrations
    conn.execute("BEGIN EXCLUSIVE")

    try:
        migrator = DatabaseMigrator(conn)
        migrator.run_all()
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


# Test base class
class StarPunkTestCase(unittest.TestCase):
    """Base test case with proper database isolation"""

    def setUp(self):
        """Set up test with isolated database"""
        self.db_context = isolated_test_database()
        self.db_path = self.db_context.__enter__()
        self.app = create_app(database=self.db_path)
        self.client = self.app.test_client()

    def tearDown(self):
        """Clean up test database"""
        self.db_context.__exit__(None, None, None)


# Example test with proper isolation
class TestMigrations(StarPunkTestCase):
    def test_migration_idempotency(self):
        """Test that migrations can be run multiple times"""
        # First run happens in setUp
        # Second run should be safe
        run_migrations_sync(self.db_path)

        # Verify database state
        with sqlite3.connect(self.db_path) as conn:
            tables = conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table'"
            ).fetchall()
            self.assertIn(('notes',), tables)
```

#### Test Timing Improvements

```python
# starpunk/testing/wait.py
import time
from typing import Callable


def wait_for_condition(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.1
) -> bool:
    """Wait for condition to become true"""
    start = time.time()
    while time.time() - start < timeout:
        if condition():
            return True
        time.sleep(interval)
    return False


# Usage in tests
def test_async_operation(self):
    """Test with proper waiting"""
    self.client.post('/notes', data={'content': 'Test'})

    # Wait for indexing to complete
    success = wait_for_condition(
        lambda: search_index_updated(),
        timeout=2.0
    )
    self.assertTrue(success)
```

### 2. Unicode Edge Cases in Slug Generation

#### Problem

Slug generation fails or produces invalid slugs for certain Unicode inputs, including emoji, RTL text, and combining characters.
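To see why these inputs are problematic, here is a minimal sketch of a naive slugifier (`naive_slug` is a hypothetical stand-in for the pre-fix behavior, not actual StarPunk code):

```python
import re

def naive_slug(text: str) -> str:
    # Naive approach: lowercase and replace whitespace runs with hyphens
    return re.sub(r"\s+", "-", text.strip().lower())

# Emoji survive into the slug
print(naive_slug("Hello 👋 World"))  # hello-👋-world

# Two visually identical titles produce different slugs because
# precomposed and combining forms are never normalized
print(naive_slug("café") == naive_slug("cafe\u0301"))  # False

# Zero-width space (U+200B) is not matched by \s, so it stays in the slug
print("\u200b" in naive_slug("Hello\u200bWorld"))  # True
```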
#### Current Issues

- Emoji in titles break slug generation
- RTL languages produce confusing slugs
- Combining characters aren't normalized
- Zero-width characters remain in slugs

#### Solution

```python
# starpunk/utils/slugify.py
import re
import unicodedata


def generate_slug(text: str, max_length: int = 50) -> str:
    """Generate URL-safe slug from text with Unicode handling"""
    if not text:
        return generate_random_slug()

    # Normalize Unicode (NFKD = compatibility decomposition)
    text = unicodedata.normalize('NFKD', text)

    # Remove non-ASCII characters but keep numbers and letters
    text = text.encode('ascii', 'ignore').decode('ascii')

    # Convert to lowercase
    text = text.lower()

    # Replace spaces and punctuation with hyphens
    text = re.sub(r'[^a-z0-9]+', '-', text)

    # Remove leading/trailing hyphens
    text = text.strip('-')

    # Collapse multiple hyphens
    text = re.sub(r'-+', '-', text)

    # Truncate to max length (at word boundary if possible)
    if len(text) > max_length:
        text = text[:max_length].rsplit('-', 1)[0]

    # If we end up with empty string, generate random
    if not text:
        return generate_random_slug()

    return text


def generate_random_slug() -> str:
    """Generate random slug when text-based generation fails"""
    import random
    import string
    return 'note-' + ''.join(
        random.choices(string.ascii_lowercase + string.digits, k=8)
    )


# Extended test cases
TEST_CASES = [
    ("Hello World", "hello-world"),
    ("Hello 👋 World", "hello-world"),    # Emoji removed
    ("مرحبا بالعالم", "note-a1b2c3d4"),   # Arabic -> random
    ("Héllö Wörld", "hello-world"),       # Diacritics removed
    ("Hello\u200bWorld", "helloworld"),   # Zero-width space
    ("---Hello---", "hello"),             # Multiple hyphens
    ("123", "123"),                       # Numbers only
    ("!@#$%", "note-x1y2z3a4"),           # Special chars -> random
    ("a" * 100, "a" * 50),                # Truncation
    ("", "note-r4nd0m12"),                # Empty -> random
]


def test_slug_generation():
    """Test slug generation with Unicode edge cases"""
    for input_text, expected in TEST_CASES:
        result = generate_slug(input_text)
        if expected.startswith("note-"):
            # Random slug - just check format
            assert result.startswith("note-")
            assert len(result) == 13
        else:
            assert result == expected
```

### 3. RSS Feed Memory Optimization

#### Problem

RSS feed generation for sites with thousands of notes causes high memory usage and slow response times.

#### Current Issues

- Loading all notes into memory at once
- No pagination or limits
- Inefficient XML building
- No caching of generated feeds

#### Solution

```python
# starpunk/feeds/rss.py
import sqlite3
from typing import Iterator


class OptimizedRSSGenerator:
    """Memory-efficient RSS feed generator"""

    def __init__(self, base_url: str, limit: int = 50):
        self.base_url = base_url
        self.limit = limit

    def generate_feed(self) -> str:
        """Generate RSS feed with streaming"""
        # Use string builder for efficiency
        parts = []
        parts.append(self._generate_header())

        # Stream notes from database
        for note in self._stream_recent_notes():
            parts.append(self._generate_item(note))

        parts.append(self._generate_footer())
        return ''.join(parts)

    def _stream_recent_notes(self) -> Iterator[dict]:
        """Stream notes without loading all into memory"""
        with get_db() as conn:
            # Use server-side cursor equivalent
            conn.row_factory = sqlite3.Row
            cursor = conn.execute(
                """
                SELECT id, content, slug, created_at, updated_at
                FROM notes
                WHERE published = 1
                ORDER BY created_at DESC
                LIMIT ?
""", (self.limit,) ) # Yield one at a time for row in cursor: yield dict(row) def _generate_item(self, note: dict) -> str: """Generate single RSS item efficiently""" # Pre-calculate values once title = extract_title(note['content']) url = f"{self.base_url}/notes/{note['id']}" # Use string formatting for efficiency return f""" {escape_xml(title)} {url} {url} {escape_xml(note['content'][:500])} {format_rfc822(note['created_at'])} """ # Caching layer from functools import lru_cache from datetime import datetime, timedelta class CachedRSSFeed: """RSS feed with caching""" def __init__(self): self.cache = {} self.cache_duration = timedelta(minutes=5) def get_feed(self) -> str: """Get RSS feed with caching""" now = datetime.now() # Check cache if 'feed' in self.cache: cached_feed, cached_time = self.cache['feed'] if now - cached_time < self.cache_duration: return cached_feed # Generate new feed generator = OptimizedRSSGenerator( base_url=config.BASE_URL, limit=config.RSS_ITEM_LIMIT ) feed = generator.generate_feed() # Update cache self.cache['feed'] = (feed, now) return feed def invalidate(self): """Invalidate cache when notes change""" self.cache.clear() # Memory-efficient XML escaping def escape_xml(text: str) -> str: """Escape XML special characters efficiently""" if not text: return "" # Use replace instead of xml.sax.saxutils for efficiency return ( text.replace("&", "&") .replace("<", "<") .replace(">", ">") .replace('"', """) .replace("'", "'") ) ``` ### 4. Session Timeout Handling #### Problem Sessions don't properly timeout, leading to security issues and stale session accumulation. 
#### Current Issues

- No automatic session expiration
- No cleanup of old sessions
- Session extension not working
- No timeout configuration

#### Solution

```python
# starpunk/auth/session_improved.py
import sqlite3
import threading
import time
from datetime import datetime, timedelta
from typing import Optional


class ImprovedSessionManager:
    """Session manager with proper timeout handling"""

    def __init__(self):
        self.timeout = config.SESSION_TIMEOUT
        self.cleanup_interval = 3600  # 1 hour
        self._start_cleanup_thread()

    def _start_cleanup_thread(self):
        """Start background cleanup thread"""
        def cleanup_loop():
            while True:
                try:
                    self.cleanup_expired_sessions()
                except Exception as e:
                    logger.error(f"Session cleanup error: {e}")
                time.sleep(self.cleanup_interval)

        thread = threading.Thread(target=cleanup_loop)
        thread.daemon = True
        thread.start()

    def create_session(self, user_id: str, remember: bool = False) -> dict:
        """Create session with appropriate timeout"""
        session_id = generate_secure_token()

        # Longer timeout for "remember me"
        if remember:
            timeout = config.SESSION_TIMEOUT_REMEMBER
        else:
            timeout = self.timeout

        expires_at = datetime.now() + timedelta(seconds=timeout)

        with get_db() as conn:
            conn.execute(
                """
                INSERT INTO sessions (
                    id, user_id, expires_at, created_at, last_activity
                ) VALUES (?, ?, ?, ?, ?)
                """,
                (
                    session_id, user_id, expires_at,
                    datetime.now(), datetime.now()
                )
            )

        logger.info(f"Session created for user {user_id}")
        return {
            'session_id': session_id,
            'expires_at': expires_at.isoformat(),
            'timeout': timeout
        }

    def validate_and_extend(self, session_id: str) -> Optional[str]:
        """Validate session and extend timeout on activity"""
        now = datetime.now()

        with get_db() as conn:
            conn.row_factory = sqlite3.Row  # access columns by name

            # Get session
            result = conn.execute(
                """
                SELECT user_id, expires_at, last_activity
                FROM sessions
                WHERE id = ? AND expires_at > ?
""", (session_id, now) ).fetchone() if not result: return None user_id = result['user_id'] last_activity = datetime.fromisoformat(result['last_activity']) # Extend session if active if now - last_activity > timedelta(minutes=5): # Only extend if there's been recent activity new_expires = now + timedelta(seconds=self.timeout) conn.execute( """ UPDATE sessions SET expires_at = ?, last_activity = ? WHERE id = ? """, (new_expires, now, session_id) ) logger.debug(f"Session extended for user {user_id}") return user_id def cleanup_expired_sessions(self): """Remove expired sessions from database""" with get_db() as conn: result = conn.execute( """ DELETE FROM sessions WHERE expires_at < ? RETURNING id """, (datetime.now(),) ) deleted_count = len(result.fetchall()) if deleted_count > 0: logger.info(f"Cleaned up {deleted_count} expired sessions") def invalidate_session(self, session_id: str): """Explicitly invalidate a session""" with get_db() as conn: conn.execute( "DELETE FROM sessions WHERE id = ?", (session_id,) ) logger.info(f"Session {session_id} invalidated") def get_active_sessions(self, user_id: str) -> list: """Get all active sessions for a user""" with get_db() as conn: result = conn.execute( """ SELECT id, created_at, last_activity, expires_at FROM sessions WHERE user_id = ? AND expires_at > ? 
                ORDER BY last_activity DESC
                """,
                (user_id, datetime.now())
            )
            return [dict(row) for row in result]


# Session middleware
@app.before_request
def check_session():
    """Check and extend session on each request"""
    session_id = request.cookies.get('session_id')

    if session_id:
        user_id = session_manager.validate_and_extend(session_id)
        if user_id:
            g.user_id = user_id
            g.authenticated = True
        else:
            # Clear invalid session cookie
            g.clear_session = True
            g.authenticated = False
    else:
        g.authenticated = False


@app.after_request
def update_session_cookie(response):
    """Update session cookie if needed"""
    if hasattr(g, 'clear_session') and g.clear_session:
        response.set_cookie(
            'session_id', '',
            expires=0,
            secure=config.SESSION_SECURE,
            httponly=True,
            samesite='Lax'
        )
    return response
```

## Testing Strategy

### Test Stability Improvements

```python
# starpunk/testing/stability.py
from itertools import cycle
from unittest.mock import patch

import pytest


@pytest.fixture
def stable_test_env():
    """Provide stable test environment"""
    with patch('time.time', return_value=1234567890):
        with patch('random.choice', side_effect=cycle('abcd')):
            with isolated_test_database() as db:
                yield db


def test_with_stability(stable_test_env):
    """Test with predictable environment"""
    # Time and randomness are now deterministic
    pass
```

### Unicode Test Suite

```python
# starpunk/testing/unicode.py
import pytest

UNICODE_TEST_STRINGS = [
    "Simple ASCII",
    "Émoji 😀🎉🚀",
    "العربية",
    "中文字符",
    "🏳️‍🌈 flags",
    "Math: ∑∏∫",
    "Ñoño",
    "Combining: é (e + \u0301)",
]


@pytest.mark.parametrize("text", UNICODE_TEST_STRINGS)
def test_unicode_handling(text):
    """Test Unicode handling throughout system"""
    # Test slug generation
    slug = generate_slug(text)
    assert slug  # Should always produce something

    # Test note creation
    note = create_note(content=text)
    assert note.content == text

    # Test search
    results = search_notes(text)  # Should not crash

    # Test RSS
    feed = generate_rss_feed()  # Should be valid XML
```

## Performance Testing
### Memory Usage Tests

```python
def test_rss_memory_usage():
    """Test RSS generation memory usage"""
    import tracemalloc

    # Create many notes
    for i in range(10000):
        create_note(content=f"Note {i}")

    # Measure memory for RSS generation
    tracemalloc.start()
    feed = generate_rss_feed()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    memory_used = peak / 1024 / 1024  # MB
    assert memory_used < 10  # Should use less than 10MB
```

## Acceptance Criteria

### Race Condition Fixes

1. ✅ All 10 flaky tests pass consistently
2. ✅ Test isolation properly implemented
3. ✅ Migration locks prevent concurrent execution
4. ✅ Test fixtures properly synchronized

### Unicode Handling

1. ✅ Slug generation handles all Unicode input
2. ✅ Never produces invalid/empty slugs
3. ✅ Emoji and special characters handled gracefully
4. ✅ RTL languages don't break system

### RSS Memory Optimization

1. ✅ Memory usage stays under 10MB for 10,000 notes
2. ✅ Response time under 500ms
3. ✅ Streaming implementation works correctly
4. ✅ Cache invalidation on note changes

### Session Management

1. ✅ Sessions expire after configured timeout
2. ✅ Expired sessions automatically cleaned up
3. ✅ Active sessions properly extended
4. ✅ Session invalidation works correctly

## Risk Mitigation

1. **Test Stability**: Run test suite 100 times to verify
2. **Unicode Compatibility**: Test with real-world data
3. **Memory Leaks**: Monitor long-running instances
4. **Session Security**: Security review of implementation
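The 100-run stability check can be scripted; a minimal sketch (the `pytest -q` invocation is an assumption about the project's test runner):

```python
import subprocess

def run_repeatedly(cmd: list[str], runs: int = 100) -> int:
    """Run a test command repeatedly and return the number of failing runs."""
    fails = 0
    for _ in range(runs):
        # A nonzero exit code counts as one failed run
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            fails += 1
    return fails

if __name__ == "__main__":
    print(f"failing runs: {run_repeatedly(['pytest', '-q'])}")
```

A flaky test surfaces as a small nonzero failure count; the fixes in section 1 are accepted only when this reports zero.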