# ADR-012: Flaky Test Removal and Test Quality Standards

## Status

Proposed

## Context

The test suite contains several categories of flaky tests that pass inconsistently. These tests consume developer time without providing proportional value. Per the project philosophy ("Every line of code must justify its existence"), we must evaluate whether these tests should be kept, fixed, or removed.

## Analysis by Test Category

### 1. Migration Race Condition Tests (`test_migration_race_condition.py`)

**Failing Tests:**

- `test_debug_level_for_early_retries` - Log message matching
- `test_new_connection_per_retry` - Connection count assertions
- `test_concurrent_workers_barrier_sync` - Multiprocessing pickle errors
- `test_sequential_worker_startup` - Missing table errors
- `test_worker_late_arrival` - Missing table errors
- `test_single_worker_performance` - Missing table errors
- `test_concurrent_workers_performance` - Pickle errors

**Value Analysis:**

- The migration retry logic with exponential backoff is *critical* for production deployments with multiple Gunicorn workers
- However, the flaky tests are testing implementation details (log levels, exact connection counts) rather than behavior
- The multiprocessing tests fundamentally cannot work reliably because:
  1. `multiprocessing.Manager().Barrier()` objects cannot be pickled for `Pool.map()`
  2. The worker functions require Flask app context that doesn't transfer across processes
  3. SQLite database files in temp directories may not be accessible across process boundaries

**Root Cause:** Test design is flawed. These are attempting integration/stress tests using unit test infrastructure.

**Recommendation: REMOVE the multiprocessing tests entirely.
KEEP and FIX the unit tests.**

Specifically:

- **REMOVE:** `TestConcurrentExecution` class (all 3 tests) - fundamentally broken by design
- **REMOVE:** `TestPerformance` class (both tests) - same multiprocessing issues
- **KEEP:** `TestRetryLogic` - valuable, just needs mock fixes
- **KEEP:** `TestGraduatedLogging` - valuable, needs logger configuration fixes
- **KEEP:** `TestConnectionManagement` - valuable, needs assertion fixes
- **KEEP:** `TestErrorHandling` - valuable, tests critical rollback behavior
- **KEEP:** `TestBeginImmediateTransaction` - valuable, tests locking mechanism

**Rationale for removal:** If we need to test concurrent migration behavior, that requires:

1. A proper integration test framework (not pytest unit tests)
2. External process spawning (not `multiprocessing.Pool`)
3. Real filesystem isolation

This work is out of scope for V1: the code works; the tests are the problem.

---

### 2. Feed Route Tests (`test_routes_feeds.py`)

**Failing Assertions:**

- Tests checking for exact `