StarPunk/docs/decisions/ADR-054-structured-logging-architecture.md
Phil Skentelbery e589f5bd6c docs: Fix ADR numbering conflicts and create comprehensive documentation indices
2025-11-25 13:28:56 -07:00


ADR-054: Structured Logging Architecture

Status

Accepted

Context

StarPunk currently uses print statements and basic logging without structure. For production deployments, we need:

  • Consistent log formatting
  • Appropriate log levels
  • Structured data for parsing
  • Correlation IDs for request tracking
  • Performance-conscious logging

We need a logging architecture that is simple, follows Python best practices, and provides production-grade observability.

Decision

Implement structured logging using Python's built-in logging module with JSON formatting and contextual information.

Logging Architecture

Application Code
      ↓
Logger Interface → Filters → Formatters → Handlers → Output
      ↑                                              ↓
Context Injection                            (stdout/file)

Log Levels

Following the standard Python logging levels, which parallel syslog severities:

Level     Value  Usage
CRITICAL  50     System failures requiring immediate attention
ERROR     40     Errors that need investigation
WARNING   30     Unexpected conditions that might cause issues
INFO      20     Normal operation events
DEBUG     10     Detailed diagnostic information

Log Structure

JSON format for production, human-readable for development:

{
  "timestamp": "2025-11-25T10:30:45.123Z",
  "level": "INFO",
  "logger": "starpunk.micropub",
  "message": "Note created",
  "request_id": "a1b2c3d4",
  "user": "alice@example.com",
  "context": {
    "note_id": 123,
    "slug": "my-note",
    "word_count": 42
  },
  "performance": {
    "duration_ms": 45
  }
}
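
Because each production entry is a single JSON object, a few lines of code are enough to filter or aggregate entries. The sketch below assumes one JSON object per line; the helper name `entries_for_request` and the sample lines are hypothetical and only illustrate the idea.

```python
import json

def entries_for_request(lines, request_id):
    """Return parsed log entries whose request_id matches.

    Assumes one JSON object per line; lines that fail to parse are skipped.
    """
    matches = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a stray human-readable line
        if isinstance(entry, dict) and entry.get("request_id") == request_id:
            matches.append(entry)
    return matches

# Illustrative sample data (not real StarPunk output)
sample_lines = [
    '{"level": "INFO", "logger": "starpunk.micropub", "message": "Note created", "request_id": "a1b2c3d4"}',
    '{"level": "DEBUG", "logger": "starpunk.database", "message": "Query executed", "request_id": "ffff0000"}',
    'not json at all',
    '{"level": "ERROR", "logger": "starpunk.database", "message": "Write failed", "request_id": "a1b2c3d4"}',
]

for entry in entries_for_request(sample_lines, "a1b2c3d4"):
    print(entry["level"], entry["logger"], entry["message"])
```

The same pattern extends to any field in the structure, such as grouping by `logger` or summing `performance.duration_ms`.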

Logger Hierarchy

starpunk (root logger)
├── starpunk.auth        # Authentication/authorization
├── starpunk.micropub    # Micropub endpoint
├── starpunk.database    # Database operations
├── starpunk.search      # Search functionality
├── starpunk.web        # Web interface
├── starpunk.rss        # RSS generation
├── starpunk.monitoring # Performance monitoring
└── starpunk.migration  # Database migrations
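
The hierarchy relies on the logging module's dotted-name propagation: a handler attached to `starpunk` receives records from every child logger, and each subsystem can still have its own level. A minimal sketch, in which `ListHandler` is a hypothetical helper used only to make the result visible:

```python
import logging

class ListHandler(logging.Handler):
    """Hypothetical handler that stores formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(
            f"{record.name}:{record.levelname}:{record.getMessage()}"
        )

parent = logging.getLogger("starpunk")
parent.setLevel(logging.DEBUG)
collector = ListHandler()
parent.addHandler(collector)

# Child loggers have no handlers of their own; records propagate upward.
logging.getLogger("starpunk.auth").info("login ok")

# Quieten one noisy subsystem without touching the others.
database_logger = logging.getLogger("starpunk.database")
database_logger.setLevel(logging.WARNING)
database_logger.debug("SELECT 1")      # dropped by the child's level
database_logger.warning("slow query")  # kept

print(collector.messages)
```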

Implementation Pattern

# starpunk/logging.py
import logging
import json
import sys
from datetime import datetime, timezone
from contextvars import ContextVar

# Request context for correlation
request_id: ContextVar[str] = ContextVar('request_id', default='')

class StructuredFormatter(logging.Formatter):
    """JSON formatter for structured logging"""

    def format(self, record):
        # Use the record's own creation time (UTC, millisecond precision)
        # rather than the time at which formatting happens
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        log_obj = {
            'timestamp': created.isoformat(timespec='milliseconds').replace('+00:00', 'Z'),
            'level': record.levelname,
            'logger': record.name,
            'message': record.getMessage(),
            'request_id': request_id.get()
        }

        # Add extra fields
        if hasattr(record, 'context'):
            log_obj['context'] = record.context

        if hasattr(record, 'performance'):
            log_obj['performance'] = record.performance

        # Add exception info if present
        if record.exc_info:
            log_obj['exception'] = self.formatException(record.exc_info)

        # default=str keeps non-JSON-serializable context values from raising
        return json.dumps(log_obj, default=str)

def setup_logging(level='INFO', format_type='json'):
    """Configure logging for the application"""
    root_logger = logging.getLogger('starpunk')
    root_logger.setLevel(level)

    handler = logging.StreamHandler(sys.stdout)

    if format_type == 'json':
        formatter = StructuredFormatter()
    else:
        # Human-readable for development
        formatter = logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        )

    handler.setFormatter(formatter)
    # Replace existing handlers so repeated calls do not duplicate output
    root_logger.handlers.clear()
    root_logger.addHandler(handler)

    return root_logger

# Usage pattern
logger = logging.getLogger('starpunk.micropub')

def create_note(content, user):
    logger.info(
        "Creating note",
        extra={
            'context': {
                'user': user,
                'content_length': len(content)
            }
        }
    )
    # ... implementation

What to Log

Always Log (INFO+)

  • Authentication attempts (success/failure)
  • Note CRUD operations
  • Configuration changes
  • Startup/shutdown
  • External API calls
  • Migration execution
  • Search queries

Error Conditions (ERROR)

  • Database connection failures
  • Invalid Micropub requests
  • Authentication failures
  • File system errors
  • Configuration errors

Warnings (WARNING)

  • Slow queries
  • High memory usage
  • Deprecated feature usage
  • Missing optional configuration
  • FTS5 unavailability

Debug Information (DEBUG)

  • SQL queries executed
  • Request/response bodies
  • Template rendering details
  • Cache operations
  • Detailed timing data

What NOT to Log

  • Passwords or tokens
  • Full note content (unless debug)
  • Personal information (PII)
  • Request headers with auth
  • Database connection strings
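
One way to enforce part of this list mechanically is a filter that masks known-sensitive keys before a record is formatted. The sketch below is an assumption, not part of the decision: the key list and the `RedactContextFilter` name are hypothetical, and it only inspects top-level keys of the `context` dict.

```python
import logging

# Assumed list of sensitive key names; adjust to the application's fields
SENSITIVE_KEYS = {"password", "token", "access_token", "authorization"}

class RedactContextFilter(logging.Filter):
    """Mask sensitive top-level keys in a record's `context` dict."""

    def filter(self, record):
        context = getattr(record, "context", None)
        if isinstance(context, dict):
            record.context = {
                key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else value
                for key, value in context.items()
            }
        return True  # never drop the record, only scrub it

# Build a record by hand to show the effect
record = logging.LogRecord(
    "starpunk.auth", logging.INFO, "example.py", 0, "Token exchanged", None, None
)
record.context = {"note_id": 123, "token": "secret-value"}
RedactContextFilter().filter(record)
print(record.context)
```

Attaching the filter to the handler rather than to a logger matters here: logger-level filters are not applied to records that propagate up from child loggers, while handler-level filters see every record the handler emits.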

Performance Considerations

  1. Lazy Evaluation: Use lazy % formatting

    logger.debug("Processing note %s", note_id)  # Good
    logger.debug(f"Processing note {note_id}")   # Bad
    
  2. Level Checking: Check before expensive operations

    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Data: %s", expensive_serialization())
    
  3. Async Logging: For high-volume scenarios (future)

  4. Sampling: For very frequent operations

    if random.random() < 0.1:  # Log 10% (requires "import random")
        logger.debug("High frequency operation")
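
For the async logging item, the standard library already provides `QueueHandler` and `QueueListener`, so no new dependency would be needed. The following is a minimal sketch under that assumption; `setup_queued_logging` is a hypothetical name, and an in-memory buffer stands in for stdout so the result can be inspected.

```python
import io
import logging
import logging.handlers
import queue

def setup_queued_logging(target_handler):
    """Route 'starpunk' records through a queue to `target_handler`.

    Returns the listener; call listener.stop() at shutdown to flush.
    """
    log_queue = queue.Queue(-1)  # unbounded queue
    logger = logging.getLogger("starpunk")
    # Application threads only enqueue; the listener thread does the I/O.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return listener

buffer = io.StringIO()
listener = setup_queued_logging(logging.StreamHandler(buffer))
logging.getLogger("starpunk").setLevel(logging.INFO)
logging.getLogger("starpunk.web").info("Request handled")
listener.stop()  # blocks until queued records have been written
print(buffer.getvalue(), end="")
```

The trade-off is that records are written slightly later than they are produced, so the listener must be stopped cleanly at shutdown to avoid losing the tail of the log.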
    

Rationale

Why Standard Logging Module?

  1. No Dependencies: Built into Python
  2. Industry Standard: Well understood
  3. Flexible: Handlers, formatters, filters
  4. Battle-tested: Proven in production
  5. Integration: Works with existing tools

Why JSON Format?

  1. Parseable: Easy for log aggregators
  2. Structured: Consistent field access
  3. Flexible: Can add fields without breaking
  4. Standard: Widely supported

Why Not Alternatives?

structlog:

  • Additional dependency
  • More complex API
  • Overkill for our needs

loguru:

  • Third-party dependency
  • Non-standard API
  • Not necessary for our scale

Print statements:

  • No levels
  • No structure
  • No filtering
  • Not production-ready

Consequences

Positive

  1. Production Ready: Professional logging
  2. Debuggable: Rich context in logs
  3. Parseable: Integration with log tools
  4. Performant: Minimal overhead
  5. Configurable: Adjust without code changes
  6. Correlatable: Request tracking via IDs

Negative

  1. Verbosity: More code for logging
  2. Learning: Developers must understand levels
  3. Size: JSON logs are larger than plain text
  4. Complexity: More setup than prints

Mitigations

  1. Provide logging utilities/helpers
  2. Document logging guidelines
  3. Use log rotation for size management
  4. Create developer-friendly formatter option
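
As an illustration of the first mitigation, a small helper can hide most of the `extra={...}` boilerplate. The sketch below assumes the `context` and `performance` attribute names used by the formatter in this ADR; `timed_operation` and `CaptureHandler` are hypothetical names.

```python
import logging
import time
from contextlib import contextmanager

@contextmanager
def timed_operation(logger, message, **context):
    """Log `message` with elapsed time once the wrapped block finishes."""
    start = time.perf_counter()

    def extra():
        elapsed_ms = round((time.perf_counter() - start) * 1000, 2)
        return {"context": context, "performance": {"duration_ms": elapsed_ms}}

    try:
        yield
    except Exception:
        # Record the failure with traceback, then let the caller handle it
        logger.exception("%s (failed)", message, extra=extra())
        raise
    logger.info(message, extra=extra())

class CaptureHandler(logging.Handler):
    """Hypothetical handler that keeps records for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("starpunk.micropub")
logger.setLevel(logging.INFO)
capture = CaptureHandler()
logger.addHandler(capture)

with timed_operation(logger, "Note created", note_id=123, slug="my-note"):
    pass  # ... create the note here

print(capture.records[0].context, capture.records[0].performance)
```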

Alternatives Considered

1. Continue with Print Statements

Pros: Simplest possible
Cons: Not production-ready
Decision: Inadequate for production

2. Custom Logging Solution

Pros: Exactly what we need
Cons: Reinventing the wheel
Decision: Standard library is sufficient

3. External Logging Service

Pros: No local storage needed
Cons: Privacy, dependency, cost
Decision: Conflicts with self-hosted philosophy

4. Syslog Integration

Pros: Standard Unix logging
Cons: Platform-specific, complexity
Decision: Can add as handler if needed

Implementation Notes

Bootstrap Logging

# Application startup
import logging
import os
from starpunk.logging import setup_logging

# Configure based on environment
if os.environ.get('STARPUNK_ENV') == 'production':
    setup_logging(level='INFO', format_type='json')
else:
    setup_logging(level='DEBUG', format_type='human')

Request Correlation

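# Middleware sets request ID
from contextvars import copy_context
from uuid import uuid4

from starpunk.logging import request_id

def middleware(request):
    def run():
        # Set the ID inside the copied context so it does not leak
        # into the caller's context or into other requests
        request_id.set(str(uuid4())[:8])
        return handler(request)  # handler: the application's request handler

    return copy_context().run(run)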
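
The Filters stage of the architecture is one place where context injection could happen, which would make the request ID available to the human-readable formatter as well as the JSON one. A sketch under that assumption follows; `RequestIdFilter` is a hypothetical name, and the `"-"` default is chosen here for readability rather than taken from the implementation pattern.

```python
import io
import logging
from contextvars import ContextVar, copy_context

# Assumed default of "-" so log lines outside a request stay readable
request_id: ContextVar[str] = ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every record as `record.request_id`."""

    def filter(self, record):
        record.request_id = request_id.get()
        return True

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RequestIdFilter())
handler.setFormatter(logging.Formatter("[%(request_id)s] %(name)s %(message)s"))

logger = logging.getLogger("starpunk.web")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

def handle(request_label):
    request_id.set(request_label)  # visible only inside this context
    logger.info("handling request")

# Each request runs in its own copied context, so IDs do not leak.
copy_context().run(handle, "req-0001")
copy_context().run(handle, "req-0002")
logger.info("outside any request")
print(stream.getvalue(), end="")
```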

Migration Strategy

  1. Phase 1: Add logging module, keep prints
  2. Phase 2: Convert prints to logger calls
  3. Phase 3: Remove print statements
  4. Phase 4: Add structured context

Testing Strategy

  1. Unit Tests: Mock logger, verify calls
  2. Integration Tests: Verify log output format
  3. Performance Tests: Measure logging overhead
  4. Configuration Tests: Test different levels/formats
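
The first two strategies might look like the following with `unittest`. `JsonFormatter` is a stand-in so the example runs on its own; a real test would presumably import `StructuredFormatter` from `starpunk.logging` instead. The logger name `starpunk.test_format` is also hypothetical.

```python
import io
import json
import logging
import unittest

class JsonFormatter(logging.Formatter):
    """Stand-in for StructuredFormatter so this example is self-contained."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if hasattr(record, "context"):
            payload["context"] = record.context
        return json.dumps(payload)

class LogFormatTest(unittest.TestCase):
    def setUp(self):
        self.stream = io.StringIO()
        self.handler = logging.StreamHandler(self.stream)
        self.handler.setFormatter(JsonFormatter())
        self.logger = logging.getLogger("starpunk.test_format")
        self.logger.setLevel(logging.DEBUG)
        self.logger.propagate = False
        self.logger.addHandler(self.handler)

    def tearDown(self):
        self.logger.removeHandler(self.handler)

    def test_output_is_valid_json(self):
        # Integration style: inspect what the handler actually wrote
        self.logger.info("Note created", extra={"context": {"note_id": 123}})
        entry = json.loads(self.stream.getvalue())
        self.assertEqual(entry["level"], "INFO")
        self.assertEqual(entry["context"], {"note_id": 123})

    def test_assert_logs_captures_calls(self):
        # Unit style: verify the call without inspecting handler output
        with self.assertLogs("starpunk.test_format", level="WARNING") as captured:
            self.logger.warning("slow query")
        self.assertEqual(
            captured.output, ["WARNING:starpunk.test_format:slow query"]
        )
```

Saved as a test module, this can be run with `python -m unittest`.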

Configuration

Environment variables:

  • STARPUNK_LOG_LEVEL: DEBUG|INFO|WARNING|ERROR|CRITICAL
  • STARPUNK_LOG_FORMAT: json|human
  • STARPUNK_LOG_FILE: Path to log file (optional)
  • STARPUNK_LOG_ROTATION: Enable rotation (optional)
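
A sketch of how these variables could be read and validated is shown below. The defaults (`INFO`, `json`), the accepted truthy values for rotation, and the rotation size and backup count are assumptions; `read_logging_config` and `build_handler` are hypothetical names.

```python
import logging
import logging.handlers
import os
import sys

VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
VALID_FORMATS = {"json", "human"}

def read_logging_config(environ=os.environ):
    """Read the STARPUNK_LOG_* variables, falling back to assumed defaults."""
    level = environ.get("STARPUNK_LOG_LEVEL", "INFO").upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"Invalid STARPUNK_LOG_LEVEL: {level!r}")
    log_format = environ.get("STARPUNK_LOG_FORMAT", "json").lower()
    if log_format not in VALID_FORMATS:
        raise ValueError(f"Invalid STARPUNK_LOG_FORMAT: {log_format!r}")
    rotation = environ.get("STARPUNK_LOG_ROTATION", "").lower() in {"1", "true", "yes"}
    return {
        "level": level,
        "format": log_format,
        "file": environ.get("STARPUNK_LOG_FILE") or None,
        "rotation": rotation,
    }

def build_handler(config):
    """Pick stdout, a plain file, or a rotating file handler."""
    if not config["file"]:
        return logging.StreamHandler(sys.stdout)
    if config["rotation"]:
        # Assumed policy: 10 MiB per file, keep 5 backups
        return logging.handlers.RotatingFileHandler(
            config["file"], maxBytes=10 * 1024 * 1024, backupCount=5
        )
    return logging.FileHandler(config["file"])

print(read_logging_config({"STARPUNK_LOG_LEVEL": "debug", "STARPUNK_LOG_FORMAT": "human"}))
```

Failing fast on an invalid value at startup is a design choice in this sketch; falling back to a default with a warning would be an equally reasonable alternative.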

Security Considerations

  1. Never log sensitive data
  2. Sanitize user input in logs
  3. Rate limit log output
  4. Monitor for log injection attacks
  5. Secure log file permissions
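
Items 2 and 4 can be illustrated with a small sanitizer for untrusted values. JSON output already escapes newlines, so this matters most for the human-readable format, where an embedded line break could forge an extra log entry. The function name and the 200-character limit below are assumptions.

```python
import re

_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
MAX_LOGGED_LENGTH = 200  # assumed limit

def _escape(match):
    """Render one control character as a visible escape such as \\n."""
    return match.group().encode("unicode_escape").decode("ascii")

def sanitize_for_log(value, max_length=MAX_LOGGED_LENGTH):
    """Make untrusted text safe to embed in a single log line.

    Control characters (including CR and LF) are replaced with visible
    escapes, and long values are truncated.
    """
    text = _CONTROL_CHARS.sub(_escape, str(value))
    if len(text) > max_length:
        text = text[:max_length] + "...[truncated]"
    return text

# A slug crafted to look like a second log entry stays on one line
print(sanitize_for_log("my-note\n2025-11-25 ERROR forged entry"))
```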

Document History

  • 2025-11-25: Initial draft for v1.1.1 release planning