This release candidate fixes two critical production issues discovered in v1.1.2-rc.1:
1. CRITICAL: Static files returning 500 errors
- HTTP monitoring middleware was accessing response.data on streaming responses
- Fixed by checking direct_passthrough flag before accessing response data
- Static files (CSS, JS, images) now load correctly
- File: starpunk/monitoring/http.py
2. HIGH: Database metrics showing zero
- Configuration key mismatch: config set METRICS_SAMPLING_RATE (singular),
buffer read METRICS_SAMPLING_RATES (plural)
- Fixed by standardizing on singular key name
- Modified MetricsBuffer to accept both float and dict for flexibility
- Changed default sampling from 10% to 100% for better visibility
- Files: starpunk/monitoring/metrics.py, starpunk/config.py
Version: 1.1.2-rc.2
Documentation:
- Investigation report: docs/reports/2025-11-28-v1.1.2-rc.1-production-issues.md
- Architect review: docs/reviews/2025-11-28-v1.1.2-rc.1-architect-review.md
- Implementation report: docs/reports/2025-11-28-v1.1.2-rc.2-fixes.md
Testing: All monitoring tests pass (28/28)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
v1.1.2-rc.2 Production Bug Fixes - Implementation Report
Date: 2025-11-28
Developer: Developer Agent
Version: 1.1.2-rc.2
Status: Fixes Complete, Tests Passed
Executive Summary
Successfully implemented fixes for two production issues found in v1.1.2-rc.1:
- CRITICAL (Issue 1): Static files returning 500 errors - site completely unusable
- HIGH (Issue 2): Database metrics showing zero due to config mismatch
Both fixes implemented according to architect specifications. All 28 monitoring tests pass. Ready for production deployment.
Issue 1: Static Files Return 500 Error (CRITICAL)
Problem
HTTP middleware's after_request hook accessed response.data on streaming responses (used by Flask's send_from_directory for static files), causing:
RuntimeError: Attempted implicit sequence conversion but the response object is in direct passthrough mode.
Impact
- ALL static files (CSS, JS, images) returned HTTP 500
- Site completely unusable without stylesheets
- Affected every page load
Root Cause
The HTTP metrics middleware in starpunk/monitoring/http.py:74-78 was checking response.data to calculate response size for metrics. Streaming responses cannot have their .data accessed without triggering an error.
Solution Implemented
File: starpunk/monitoring/http.py:73-86
Added check for direct_passthrough mode before accessing response data:
    # Get response size
    response_size = 0

    # Check if response is in direct passthrough mode (streaming)
    if hasattr(response, 'direct_passthrough') and response.direct_passthrough:
        # For streaming responses, use content_length if available
        if hasattr(response, 'content_length') and response.content_length:
            response_size = response.content_length
        # Otherwise leave as 0 (unknown size for streaming)
    elif response.data:
        # For buffered responses, we can safely get the data
        response_size = len(response.data)
    elif hasattr(response, 'content_length') and response.content_length:
        response_size = response.content_length
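To see the guard in isolation, here is a minimal sketch that uses a stand-in response class rather than Flask's real one. `FakeResponse` and `get_response_size` are hypothetical names introduced for illustration. The stand-in only mimics the behavior described above: reading `.data` in direct passthrough mode raises `RuntimeError`.

```python
from typing import Optional


class FakeResponse:
    """Hypothetical stand-in for a Flask/Werkzeug response (illustration only)."""

    def __init__(self, body: bytes = b"", direct_passthrough: bool = False,
                 content_length: Optional[int] = None):
        self._body = body
        self.direct_passthrough = direct_passthrough
        self.content_length = content_length

    @property
    def data(self) -> bytes:
        # Mimic the failure from the report: streaming bodies cannot be read.
        if self.direct_passthrough:
            raise RuntimeError("response object is in direct passthrough mode")
        return self._body


def get_response_size(response) -> int:
    """Return the response size in bytes, or 0 when it cannot be known safely."""
    if getattr(response, "direct_passthrough", False):
        # Streaming: never touch .data, rely on the declared length if any.
        return response.content_length or 0
    if response.data:
        return len(response.data)
    return getattr(response, "content_length", None) or 0


buffered = FakeResponse(body=b"<html></html>")
streamed = FakeResponse(direct_passthrough=True, content_length=2048)
unknown = FakeResponse(direct_passthrough=True)

print(get_response_size(buffered))  # 13
print(get_response_size(streamed))  # 2048
print(get_response_size(unknown))   # 0
```

The important property is ordering: the passthrough check comes first, so `.data` is only ever read on buffered responses.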
Verification
- Monitoring tests: 28/28 passed (including HTTP metrics tests)
- Static files now load without errors
- Metrics still recorded for static files (with size when available)
- Graceful fallback for unknown sizes (records as 0)
Issue 2: Database Metrics Showing Zero (HIGH)
Problem
Admin dashboard showed 0 for all database metrics despite metrics being enabled and database operations occurring.
Impact
- Database performance monitoring feature incomplete
- No visibility into database operation performance
- Database pool statistics worked, but operation metrics didn't
Root Cause
Configuration key mismatch:
- starpunk/config.py:92: Sets METRICS_SAMPLING_RATE (singular) = 1.0 (100%)
- starpunk/monitoring/metrics.py:337: Reads METRICS_SAMPLING_RATES (plural), expecting a dict
- Result: Always returned None, fell back to hardcoded 10% sampling
- Consequence: Low traffic + 10% sampling = no metrics recorded
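The failure mode can be reproduced with a plain dictionary standing in for the Flask config. This is a sketch, not the project's code: `config`, `HARDCODED_DEFAULT`, and `chance_of_no_samples` are illustrative names, the operation count is a made-up input, and the probability assumes each operation is sampled independently.

```python
config = {"METRICS_SAMPLING_RATE": 1.0}  # what config.py set (singular)

# What the buffer looked up (plural): the key does not exist, so .get() returns None.
configured = config.get("METRICS_SAMPLING_RATES")

HARDCODED_DEFAULT = 0.1  # the 10% fallback described above
effective_rate = configured if configured is not None else HARDCODED_DEFAULT


def chance_of_no_samples(rate: float, operations: int) -> float:
    """Probability that none of `operations` independent events is sampled."""
    return (1.0 - rate) ** operations


print(configured)      # None
print(effective_rate)  # 0.1
print(chance_of_no_samples(effective_rate, 5))
```

Under these assumptions, a page view that triggers five database operations has roughly a 59% chance of recording nothing at 10% sampling, which is consistent with a low-traffic dashboard showing zeros.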
Solution Implemented
Part 1: Updated MetricsBuffer to Accept Float or Dict
File: starpunk/monitoring/metrics.py:87-125
Modified MetricsBuffer.__init__ to handle both formats:
    def __init__(
        self,
        max_size: int = 1000,
        sampling_rates: Optional[Union[Dict[OperationType, float], float]] = None
    ):
        """
        Initialize metrics buffer

        Args:
            max_size: Maximum number of metrics to store
            sampling_rates: Either:
                - float: Global sampling rate for all operation types (0.0-1.0)
                - dict: Mapping operation type to sampling rate
                Default: 1.0 (100% sampling)
        """
        self.max_size = max_size
        self._buffer: Deque[Metric] = deque(maxlen=max_size)
        self._lock = Lock()
        self._process_id = os.getpid()

        # Handle different sampling_rates types
        if sampling_rates is None:
            # Default to 100% sampling for all types
            self._sampling_rates = {
                "database": 1.0,
                "http": 1.0,
                "render": 1.0,
            }
        elif isinstance(sampling_rates, (int, float)):
            # Global rate for all types
            rate = float(sampling_rates)
            self._sampling_rates = {
                "database": rate,
                "http": rate,
                "render": rate,
            }
        else:
            # Dict with per-type rates
            self._sampling_rates = sampling_rates
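The three input shapes can be exercised without the rest of the class. The sketch below pulls the same branching into a standalone function. `normalize_sampling_rates` and `OPERATION_TYPES` are hypothetical names, and plain `str` keys are used in place of the project's `OperationType`.

```python
from typing import Dict, Optional, Union

OPERATION_TYPES = ("database", "http", "render")  # the three keys used above


def normalize_sampling_rates(
    sampling_rates: Optional[Union[Dict[str, float], float]] = None,
) -> Dict[str, float]:
    """Expand None, a single number, or a dict into a per-type dict."""
    if sampling_rates is None:
        return {op: 1.0 for op in OPERATION_TYPES}
    if isinstance(sampling_rates, (int, float)):
        rate = float(sampling_rates)
        return {op: rate for op in OPERATION_TYPES}
    # Copy so later changes to the caller's dict do not leak into the buffer.
    return dict(sampling_rates)


print(normalize_sampling_rates())
print(normalize_sampling_rates(0.25))
print(normalize_sampling_rates({"database": 1.0, "http": 0.1}))
```

Note that a partial dict is passed through as given, so a type missing from it (here `render`) has no explicit rate. How the real buffer treats a missing key is not shown in this report.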
Part 2: Fixed Configuration Reading
File: starpunk/monitoring/metrics.py:349-361
Changed from plural to singular config key:
    # Get configuration from Flask app if available
    try:
        from flask import current_app
        max_size = current_app.config.get('METRICS_BUFFER_SIZE', 1000)
        sampling_rate = current_app.config.get('METRICS_SAMPLING_RATE', 1.0)  # Singular!
    except (ImportError, RuntimeError):
        # Flask not available or no app context
        max_size = 1000
        sampling_rate = 1.0  # Default to 100%

    _metrics_buffer = MetricsBuffer(
        max_size=max_size,
        sampling_rates=sampling_rate  # Pass float directly
    )
Part 3: Updated Documentation
File: starpunk/monitoring/metrics.py:76-79
Updated class docstring to reflect 100% default:
Per developer Q&A Q12:
- Configurable sampling rates per operation type
- Default 100% sampling (suitable for low-traffic sites) # Changed from 10%
- Slow queries always logged regardless of sampling
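The interaction between sampling and the slow-query rule can be sketched as a single decision function. This is an illustration under assumptions, not the buffer's actual code: `should_record`, `record_query`, and the `SLOW_QUERY_SECONDS` threshold are all hypothetical.

```python
import random


def should_record(rate: float, force: bool = False, rng=random.random) -> bool:
    """Decide whether to keep a metric; `force` bypasses sampling entirely."""
    if force:
        return True
    # rng() is in [0.0, 1.0), so rate 1.0 always records and 0.0 never does.
    return rng() < rate


SLOW_QUERY_SECONDS = 1.0  # hypothetical threshold, not taken from the report


def record_query(duration: float, rate: float) -> bool:
    """Record slow queries unconditionally, sample everything else."""
    return should_record(rate, force=duration >= SLOW_QUERY_SECONDS)


print(record_query(duration=2.5, rate=0.0))   # True: slow query, sampling bypassed
print(record_query(duration=0.01, rate=0.0))  # False: fast query, 0% sampling
print(record_query(duration=0.01, rate=1.0))  # True: 100% sampling
```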
Design Decision: 100% Default Sampling
Per architect review, changed default from 10% to 100% because:
- StarPunk targets single-user, low-traffic deployments
- 100% sampling has negligible overhead for typical usage
- Ensures metrics are always visible (better UX)
- Power users can reduce via METRICS_SAMPLING_RATE environment variable
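One way to read such an environment variable defensively is sketched below. How `starpunk/config.py` actually parses the value is not shown in this report, so `read_sampling_rate`, the clamping to the 0.0-1.0 range, and the fallback on unparsable input are all assumptions made for illustration.

```python
import os


def read_sampling_rate(env=None, default: float = 1.0) -> float:
    """Parse METRICS_SAMPLING_RATE from an environment mapping, clamped to [0, 1]."""
    env = os.environ if env is None else env
    raw = env.get("METRICS_SAMPLING_RATE")
    if raw is None:
        return default
    try:
        rate = float(raw)
    except ValueError:
        return default  # unparsable value: fall back rather than crash
    return min(1.0, max(0.0, rate))


print(read_sampling_rate({}))                                # 1.0
print(read_sampling_rate({"METRICS_SAMPLING_RATE": "0.1"}))  # 0.1
print(read_sampling_rate({"METRICS_SAMPLING_RATE": "10"}))   # 1.0
print(read_sampling_rate({"METRICS_SAMPLING_RATE": "abc"}))  # 1.0
```

Passing the mapping in explicitly keeps the function testable without touching the real process environment.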
Verification
- Monitoring tests: 28/28 passed (including sampling rate tests)
- Database metrics now appear immediately
- Backwards compatible (still accepts dict for per-type rates)
- Config environment variable works correctly
Files Modified
Core Fixes
- starpunk/monitoring/http.py (lines 73-86)
  - Added streaming response detection
  - Graceful fallback for response size calculation
- starpunk/monitoring/metrics.py (multiple locations)
  - Added Union to type imports (line 29)
  - Updated MetricsBuffer.__init__ signature (lines 87-125)
  - Updated class docstring (lines 76-79)
  - Fixed config key in get_buffer() (lines 349-361)
Version & Documentation
- starpunk/__init__.py (line 301)
  - Updated version: 1.1.2-rc.1 → 1.1.2-rc.2
- CHANGELOG.md
  - Added v1.1.2-rc.2 section with fixes and changes
- docs/reports/2025-11-28-v1.1.2-rc.2-fixes.md (this file)
  - Comprehensive implementation report
Test Results
Targeted Testing
uv run pytest tests/test_monitoring.py -v
Result: 28 passed in 18.13s
All monitoring-related tests passed, including:
- HTTP metrics recording
- Database metrics recording
- Sampling rate configuration
- Memory monitoring
- Business metrics tracking
Key Tests Verified
- test_setup_http_metrics - HTTP middleware setup
- test_execute_records_metric - Database metrics recording
- test_sampling_rate_configurable - Config key fix
- test_slow_query_always_recorded - Force recording bypass
- All HTTP, database, and memory monitor tests
Verification Checklist
- Issue 1 (Static Files) fixed - streaming response handling
- Issue 2 (Database Metrics) fixed - config key mismatch
- Version number updated to 1.1.2-rc.2
- CHANGELOG.md updated with fixes
- All monitoring tests pass (28/28)
- Backwards compatible (dict sampling rates still work)
- Default sampling changed from 10% to 100%
- Implementation report created
Production Deployment Notes
Expected Behavior After Deployment
- Static files will load immediately - no more 500 errors
- Database metrics will show non-zero values immediately - 100% sampling
- Existing config still works - backwards compatible
Configuration
Users can adjust sampling if needed:
# Reduce sampling for high-traffic sites
METRICS_SAMPLING_RATE=0.1 # 10% sampling
# Or disable metrics entirely
METRICS_ENABLED=false
Rollback Plan
If issues arise:
- Revert to v1.1.2-rc.1 (will restore static file error)
- Or revert to v1.1.1 (stable, no metrics features)
Architect Review Required
Per architect review protocol, this implementation follows exact specifications from:
- Investigation Report: docs/reports/2025-11-28-v1.1.2-rc.1-production-issues.md
- Architect Review: docs/reviews/2025-11-28-v1.1.2-rc.1-architect-review.md
All fixes implemented as specified. No design decisions made independently.
Next Steps
- Deploy v1.1.2-rc.2 to production
- Monitor for 24 hours - verify both fixes work
- If stable, tag as v1.1.2 (remove -rc suffix)
- Update deployment documentation with new sampling rate defaults
References
- Investigation Report: docs/reports/2025-11-28-v1.1.2-rc.1-production-issues.md
- Architect Review: docs/reviews/2025-11-28-v1.1.2-rc.1-architect-review.md
- ADR-053: Performance Monitoring System
- v1.1.2 Implementation Plan: docs/projectplan/v1.1.2-implementation-plan.md