StarPunk/docs/design/v1.5.0/2025-12-16-phase-0-implementation-report.md
Phil Skentelbery 92e7bdd342 feat(tests): Phase 0 - Fix flaky and broken tests
Implements Phase 0 of v1.5.0 per ADR-012 and RELEASE.md.

Changes:
- Remove 5 broken multiprocessing tests (TestConcurrentExecution, TestPerformance)
- Fix brittle XML assertion tests (check semantics not quote style)
- Fix test_debug_level_for_early_retries logger configuration
- Rename test_feed_route_streaming to test_feed_route_caching (correct name)

Results:
- Test count: 879 → 874 (5 removed as planned)
- All tests pass consistently (verified across 3 runs)
- No flakiness detected

References:
- ADR-012: Flaky Test Removal and Test Quality Standards
- docs/projectplan/v1.5.0/RELEASE.md Phase 0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-17 09:24:12 -07:00

Phase 0 Implementation Report - Test Fixes

Date: 2025-12-16
Developer: Developer Agent
Phase: v1.5.0 Phase 0 - Test Fixes
Status: Complete

Overview

Phase 0 of v1.5.0 is implemented as specified in ADR-012 (Flaky Test Removal) and the v1.5.0 RELEASE.md. All test fixes are complete and were verified across 3 full test runs, with no flakiness detected.

Changes Made

1. Removed 5 Broken Multiprocessing Tests

File: tests/test_migration_race_condition.py

Removed the following tests that fundamentally cannot work due to Python multiprocessing limitations:

  • test_concurrent_workers_barrier_sync - Cannot pickle Barrier objects for Pool.map()
  • test_sequential_worker_startup - Missing Flask app context across processes
  • test_worker_late_arrival - Missing Flask app context across processes
  • test_single_worker_performance - Cannot pickle local functions
  • test_concurrent_workers_performance - Cannot pickle local functions

Action Taken:

  • Removed entire TestConcurrentExecution class (3 tests)
  • Removed entire TestPerformance class (2 tests)
  • Removed unused module-level worker functions (_barrier_worker, _simple_worker)
  • Removed unused imports (time, multiprocessing, Barrier)
  • Added explanatory comments documenting why tests were removed

Justification: Per ADR-012, these tests have architectural issues that make them unreliable. The migration retry logic they attempt to test is proven to work in production with multi-worker Gunicorn deployments. The tests are the problem, not the code.
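
The pickling limitation behind the two removed performance tests can be reproduced with the standard library alone. The sketch below is illustrative and is not taken from the StarPunk test suite; `make_local_worker` and `can_pickle` are hypothetical names. `Pool.map()` has to pickle the callable it sends to worker processes, and a function defined inside another function has no importable name that pickle can reference.

```python
import pickle


def make_local_worker():
    """Return a function defined inside another function (a "local" function)."""
    def worker(x):
        return x * 2
    return worker


def can_pickle(obj) -> bool:
    """Report whether pickle can serialize obj, as Pool.map() would need to."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type differs between Python versions.
        return False


if __name__ == "__main__":
    print(can_pickle(abs))                  # built-in, pickled by reference
    print(can_pickle(make_local_worker()))  # local function, cannot be pickled
```

Moving a worker to module level makes it picklable, but it does not address the missing Flask app context noted for the other removed tests.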

2. Fixed Brittle Feed XML Assertions

File: tests/test_routes_feeds.py

Fixed assertions that were checking implementation details (quote style) rather than semantics (valid XML):

Changes:

  • test_feed_atom_endpoint: Changed from checking <?xml version="1.0" to checking <?xml version= and encoding= separately
  • test_feed_json_endpoint: Changed Content-Type assertion from exact match to .startswith('application/feed+json') so that a charset parameter is not required
  • test_accept_json_feed: Same Content-Type fix as above
  • test_accept_json_generic: Same Content-Type fix as above
  • test_quality_factor_json_wins: Same Content-Type fix as above

Rationale: XML generators may use single or double quotes. Tests should verify semantics (valid XML with correct encoding), not formatting details. Similarly, Content-Type may or may not include a charset parameter, depending on the framework version.
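
A minimal sketch of quote-agnostic checks in the spirit of the changes above. The helper names `looks_like_xml_feed` and `is_json_feed` are hypothetical and do not appear in tests/test_routes_feeds.py; the well-formedness parse is an extra step assumed here, not something the report says the tests perform.

```python
import xml.etree.ElementTree as ET


def looks_like_xml_feed(body: bytes) -> bool:
    """Check for an XML declaration and well-formed XML, ignoring quote style."""
    declaration = body.split(b"?>", 1)[0]
    # Check version= and encoding= separately, without assuming ' or ".
    if not declaration.startswith(b"<?xml version="):
        return False
    if b"encoding=" not in declaration:
        return False
    try:
        ET.fromstring(body)  # raises ParseError if the document is not well-formed
    except ET.ParseError:
        return False
    return True


def is_json_feed(content_type: str) -> bool:
    """Accept the JSON Feed media type with or without a charset parameter."""
    return content_type.startswith("application/feed+json")
```

Both `<?xml version='1.0' encoding='UTF-8'?>` and `<?xml version="1.0" encoding="UTF-8"?>` pass the first check, as do `application/feed+json` and `application/feed+json; charset=utf-8` for the second.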

3. Fixed test_debug_level_for_early_retries

File: tests/test_migration_race_condition.py

Issue: Logger not configured to capture DEBUG level messages.

Fix: Simplified logger configuration by:

  • Removing unnecessary caplog.clear() calls
  • Ensuring caplog.at_level(logging.DEBUG, logger='starpunk.migrations') wraps the actual test execution
  • Removing redundant clearing inside the context manager

Result: Test now reliably captures DEBUG level log messages from the migrations module.
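
The pattern described above can be sketched as follows. `simulate_retries` is a hypothetical stand-in for the migration retry loop, and the log message text is an assumption; only the logger name `starpunk.migrations` and the use of `caplog.at_level` come from this report.

```python
import logging

logger = logging.getLogger("starpunk.migrations")


def simulate_retries(attempts: int) -> None:
    """Hypothetical stand-in: log one DEBUG message per early retry."""
    for retry_count in range(attempts):
        logger.debug("Retry %d: database locked, retrying", retry_count + 1)


def test_debug_level_for_early_retries(caplog):
    # The context manager wraps the code under test, so DEBUG records
    # from the named logger are captured for its whole duration.
    with caplog.at_level(logging.DEBUG, logger="starpunk.migrations"):
        simulate_retries(3)

    debug_records = [r for r in caplog.records if r.levelno == logging.DEBUG]
    assert len(debug_records) == 3
```

Keeping the assertions outside the `with` block and not calling `caplog.clear()` inside it means that no captured record is discarded before it is checked.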

4. Verified test_new_connection_per_retry

File: tests/test_migration_race_condition.py

Finding: This test works correctly as written. It expects 10 connection attempts (retry_count 0-9), which matches the implementation (while retry_count < max_retries, where max_retries = 10).

Action: No changes needed. Test runs successfully and correctly verifies that a new connection is created for each retry attempt.
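
The arithmetic behind the expected count can be illustrated with a small sketch. This is not the StarPunk migration code: `run_with_retries` and `count_connection_attempts` are hypothetical, and only the loop condition and `max_retries = 10` are taken from the finding above.

```python
import sqlite3


def run_with_retries(connect, migrate, max_retries=10):
    """Open a fresh connection for each attempt; give up after max_retries."""
    retry_count = 0
    while retry_count < max_retries:
        conn = connect()  # new connection on every attempt
        try:
            migrate(conn)
            return True
        except sqlite3.OperationalError:
            retry_count += 1
        finally:
            conn.close()
    return False


def count_connection_attempts(max_retries=10):
    """Count connections opened when every attempt fails with a lock error."""
    opened = []

    def connect():
        conn = sqlite3.connect(":memory:")
        opened.append(conn)
        return conn

    def always_locked(conn):
        raise sqlite3.OperationalError("database is locked")

    succeeded = run_with_retries(connect, always_locked, max_retries)
    return succeeded, len(opened)
```

With `retry_count` starting at 0 and the loop ending once it reaches 10, a permanently failing migration opens exactly 10 connections, one per value 0 through 9.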

5. Renamed Misleading Test

File: tests/test_routes_feed.py

Change: Renamed test_feed_route_streaming to test_feed_route_caching

Rationale: The test name said "streaming" but the implementation actually uses caching (Phase 3 feed optimization). The test correctly verifies ETag presence, which is a caching feature. The name was misleading but the test logic was correct.

Test Results

Ran full test suite 3 times to verify no flakiness:

Run 1: 874 passed, 1 warning in 375.92s
Run 2: 874 passed, 1 warning in 386.40s
Run 3: 874 passed, 1 warning in 375.68s

Test Count: Reduced from 879 to 874 (5 tests removed as planned)
Flakiness: None detected across 3 runs
Warnings: 1 expected warning about DecompressionBombWarning (intentional test of large image handling)
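
A repeat-run comparison like this one can be automated. The sketch below parses short pytest summary lines of the form shown in this report and checks that the passed and warning counts agree across runs. `parse_summary` and `runs_are_consistent` are hypothetical helpers; the sketch assumes the short summary format only and deliberately rejects any line that reports failures, errors, or skips.

```python
import re

# Matches only the short "N passed[, M warning(s)] in S.SSs" form.
SUMMARY = re.compile(r"(\d+) passed(?:, (\d+) warnings?)? in ([\d.]+)s")


def parse_summary(line: str):
    """Return (passed, warnings, seconds) from a short pytest summary line."""
    match = SUMMARY.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized pytest summary: {line!r}")
    passed = int(match.group(1))
    warnings = int(match.group(2) or 0)
    seconds = float(match.group(3))
    return passed, warnings, seconds


def runs_are_consistent(lines) -> bool:
    """True if every run reports the same passed and warning counts."""
    counts = {parse_summary(line)[:2] for line in lines}
    return len(counts) == 1
```

Run durations are parsed but ignored in the comparison, since timing naturally varies between runs.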

Acceptance Criteria

| Criterion | Status | Evidence |
|---|---|---|
| All remaining tests pass consistently | Met | 3 successful test runs |
| 5 broken tests removed | Met | Test count: 879 → 874 |
| No new test skips added | Met | No @pytest.mark.skip added |
| Test count reduced to 874 | Met | Verified in all 3 runs |

Files Modified

  • /home/phil/Projects/starpunk/tests/test_migration_race_condition.py
  • /home/phil/Projects/starpunk/tests/test_routes_feeds.py
  • /home/phil/Projects/starpunk/tests/test_routes_feed.py

Next Steps

Phase 0 is complete and ready for architect review. Once approved:

  • Commit changes with reference to ADR-012
  • Proceed to Phase 1 (Timestamp-Based Slugs)

Notes

The test suite is now more reliable and maintainable:

  • Removed tests that cannot work reliably due to Python limitations
  • Fixed tests that checked implementation details instead of behavior
  • Improved test isolation and logger configuration
  • Clearer test names that reflect actual behavior being tested

All changes align with the project philosophy: "Every line of code must justify its existence." Tests that fail unreliably do not justify their existence.