# Python Coding Standard

## Overview

This document defines coding standards for the IndieAuth server implementation in Python. The primary goal is maintainability and clarity over cleverness.

## Python Version

- **Target**: Python 3.10+ (for modern type hints and async support)
- Use only stable language features
- Avoid deprecated patterns

## Code Style

### Formatting

- Use **Black** for automatic code formatting (line length: 88)
- Use **isort** for import sorting
- No manual formatting - let tools handle it

### Linting

- Use **flake8** with the following configuration:

```ini
# .flake8
[flake8]
max-line-length = 88
extend-ignore = E203, W503
exclude = .git,__pycache__,docs,build,dist
```

- Use **mypy** for static type checking:

```ini
# mypy.ini
[mypy]
python_version = 3.10
warn_return_any = True
warn_unused_configs = True
disallow_untyped_defs = True
```

## Project Structure

```
indieauth/
├── __init__.py
├── main.py              # Application entry point
├── config.py            # Configuration management
├── models/              # Data models
│   ├── __init__.py
│   ├── client.py
│   ├── token.py
│   └── user.py
├── endpoints/           # HTTP endpoint handlers
│   ├── __init__.py
│   ├── authorization.py
│   ├── token.py
│   └── registration.py
├── services/            # Business logic
│   ├── __init__.py
│   ├── auth_service.py
│   ├── token_service.py
│   └── client_service.py
├── storage/             # Data persistence
│   ├── __init__.py
│   ├── base.py
│   └── sqlite.py
├── utils/               # Utility functions
│   ├── __init__.py
│   ├── crypto.py
│   └── validation.py
└── exceptions.py        # Custom exceptions
```

## Naming Conventions

### General Rules

- Use descriptive names - clarity over brevity
- Avoid abbreviations except well-known ones (url, id, db)
- Use American English spelling

### Specific Conventions

- **Modules**: `lowercase_with_underscores.py`
- **Classes**: `PascalCase`
- **Functions/Methods**: `lowercase_with_underscores()`
- **Constants**: `UPPERCASE_WITH_UNDERSCORES`
- **Private**: Prefix with single underscore `_private_method()`
- **Internal**: Prefix with double underscore `__internal_var` (triggers name mangling inside classes)

### Examples

```python
# Good
class ClientRegistration:
    MAX_REDIRECT_URIS = 10

    def validate_redirect_uri(self, uri: str) -> bool:
        pass


# Bad
class client_reg:  # Wrong case
    maxURIs = 10  # Wrong case, abbreviation

    def checkURI(self, u):  # Unclear naming, missing types
        pass
```

## Type Hints

Type hints are **mandatory** for all functions and methods:

```python
from datetime import datetime
from typing import Dict, Optional, Union


def generate_token(
    client_id: str,
    scope: Optional[str] = None,
    expires_in: int = 3600
) -> Dict[str, Union[str, int, datetime]]:
    """Generate an access token for the client."""
    pass
```

## Docstrings

Use Google-style docstrings for all public modules, classes, and functions:

```python
def exchange_code(
    code: str,
    client_id: str,
    code_verifier: Optional[str] = None
) -> Token:
    """
    Exchange authorization code for access token.

    Args:
        code: The authorization code received from auth endpoint
        client_id: The client identifier
        code_verifier: PKCE code verifier if PKCE was used

    Returns:
        Access token with associated metadata

    Raises:
        InvalidCodeError: If code is invalid or expired
        InvalidClientError: If client_id doesn't match code
        PKCERequiredError: If PKCE is required but not provided
    """
```

## Error Handling

### Custom Exceptions

Define specific exceptions in `exceptions.py`:

```python
class IndieAuthError(Exception):
    """Base exception for IndieAuth errors."""
    pass


class InvalidClientError(IndieAuthError):
    """Raised when client authentication fails."""
    pass


class InvalidTokenError(IndieAuthError):
    """Raised when token validation fails."""
    pass
```

### Error Handling Pattern

```python
# Good - Specific exception handling
try:
    token = validate_token(bearer_token)
except InvalidTokenError as e:
    logger.warning(f"Token validation failed: {e}")
    return error_response(401, "invalid_token")
except Exception as e:
    logger.error(f"Unexpected error during validation: {e}")
    return error_response(500, "internal_error")

# Bad - Catching all exceptions
try:
    token = validate_token(bearer_token)
except:  # Never use bare except
    return error_response(400, "error")
```

## Logging

Use the standard `logging` module:

```python
import logging

logger = logging.getLogger(__name__)


class TokenService:
    def create_token(self, client_id: str) -> str:
        logger.debug(f"Creating token for client: {client_id}")
        token = self._generate_token()
        logger.info(f"Token created for client: {client_id}")
        return token
```

### Logging Levels

- **DEBUG**: Detailed diagnostic information
- **INFO**: General informational messages
- **WARNING**: Warning messages for potentially harmful situations
- **ERROR**: Error messages for failures
- **CRITICAL**: Critical problems that require immediate attention

### Sensitive Data

Never log sensitive data:

```python
# Bad
logger.info(f"User logged in with password: {password}")
logger.debug(f"Generated token: {access_token}")

# Good
logger.info(f"User logged in: {user_id}")
logger.debug(f"Token generated for client: {client_id}")
```

## Configuration Management

Use environment variables for configuration:

```python
# config.py
import os


class Config:
    """Application configuration."""

    # Required settings
    SECRET_KEY: str = os.environ["INDIEAUTH_SECRET_KEY"]
    DATABASE_URL: str = os.environ["INDIEAUTH_DATABASE_URL"]

    # Optional settings with defaults
    TOKEN_EXPIRY: int = int(os.getenv("INDIEAUTH_TOKEN_EXPIRY", "3600"))
    RATE_LIMIT: int = int(os.getenv("INDIEAUTH_RATE_LIMIT", "100"))
    DEBUG: bool = os.getenv("INDIEAUTH_DEBUG", "false").lower() == "true"

    @classmethod
    def validate(cls) -> None:
        """Validate configuration on startup."""
        if not cls.SECRET_KEY:
            raise ValueError("INDIEAUTH_SECRET_KEY must be set")
        if len(cls.SECRET_KEY) < 32:
            raise ValueError("INDIEAUTH_SECRET_KEY must be at least 32 characters")
```

## Dependency Management

### Requirements Files

```
requirements.txt       # Production dependencies only
requirements-dev.txt   # Development dependencies (includes requirements.txt)
requirements-test.txt  # Test dependencies (includes requirements.txt)
```

### Dependency Principles

- Pin exact versions in requirements.txt
- Minimize dependencies - prefer standard library
- Audit dependencies for security vulnerabilities
- Document why each dependency is needed

## Security Practices

### Input Validation

Always validate and sanitize input:

```python
from urllib.parse import urlparse

# DEBUG and BLACKLISTED_DOMAINS are assumed to come from application configuration


def validate_redirect_uri(uri: str) -> bool:
    """Validate that redirect URI is safe."""
    parsed = urlparse(uri)

    # Must be absolute URI
    if not parsed.scheme or not parsed.netloc:
        return False

    # Must be HTTPS in production
    if not DEBUG and parsed.scheme != "https":
        return False

    # Prevent open redirects
    if parsed.netloc in BLACKLISTED_DOMAINS:
        return False

    return True
```

### Secrets Management

```python
import secrets


def generate_token() -> str:
    """Generate cryptographically secure token."""
    return secrets.token_urlsafe(32)


def constant_time_compare(a: str, b: str) -> bool:
    """Compare strings in constant time to prevent timing attacks."""
    return secrets.compare_digest(a, b)
```

## Performance Considerations

### Async/Await

Use async for I/O operations when beneficial:

```python
async def verify_client(client_id: str) -> Optional[Client]:
    """Verify client exists and is valid."""
    client = await db.get_client(client_id)
    if client and not client.is_revoked:
        return client
    return None
```

### Caching

Cache expensive operations appropriately:

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def get_client_metadata(client_id: str) -> dict:
    """Fetch and cache client metadata."""
    # Expensive operation
    return fetch_client_metadata(client_id)
```

## Module Documentation

Each module should have a header docstring:

```python
"""
Authorization endpoint implementation.

This module handles the OAuth 2.0 authorization endpoint as specified in the
IndieAuth specification. It processes authorization requests, validates
client information, and generates authorization codes.
"""
```

## Comments

### When to Comment

- Complex algorithms or business logic
- Workarounds or non-obvious solutions
- TODO items with issue references
- Security-critical code sections

### Comment Style

```python
# Good comments explain WHY, not WHAT

# Bad - Explains what the code does
counter = counter + 1  # Increment counter

# Good - Explains why
counter = counter + 1  # Track attempts for rate limiting

# Security-critical sections need extra attention
# SECURITY: Validate redirect_uri to prevent open redirect attacks
# See: https://owasp.org/www-project-web-security-testing-guide/
if not validate_redirect_uri(redirect_uri):
    raise SecurityError("Invalid redirect URI")
```

## Code Organization Principles

1. **Single Responsibility**: Each module/class/function does one thing
2. **Dependency Injection**: Pass dependencies, don't hard-code them
3. **Composition over Inheritance**: Prefer composition for code reuse
4. **Fail Fast**: Validate input early and fail with clear errors
5. **Explicit over Implicit**: Clear interfaces over magic behavior

## Security Practices

### Secure Logging Guidelines

#### Never Log Sensitive Data

The following must NEVER appear in logs:

- Full tokens (authorization codes, access tokens, refresh tokens)
- Passwords or secrets
- Full authorization codes
- Private keys or certificates
- Personally identifiable information (PII) beyond user identifiers, such as email addresses and, in most cases, IP addresses

#### Safe Logging Practices

When logging security-relevant events, follow these practices:

1. **Token Prefixes**: When token identification is necessary, log only the first 8 characters followed by an ellipsis:

   ```python
   logger.info("Token validated", extra={
       "token_prefix": (token[:8] + "...") if len(token) > 8 else "***",
       "client_id": client_id
   })
   ```

2. **Request Context**: Log security events with context:

   ```python
   logger.warning("Authorization failed", extra={
       "client_id": client_id,
       "error": error_code  # Use error codes, not full messages
   })
   ```

3. **Security Events to Log**:
   - Failed authentication attempts
   - Token validation failures
   - Rate limit violations
   - Input validation failures
   - HTTPS redirect actions
   - Client registration events

4. **Use Structured Logging**: Include metadata as structured fields:

   ```python
   from datetime import datetime, timezone

   logger.info("Client registered", extra={
       "event": "client.registered",
       "client_id": client_id,
       "registration_method": "self_service",
       # datetime.utcnow() is deprecated since Python 3.12
       "timestamp": datetime.now(timezone.utc).isoformat()
   })
   ```

5. **Sanitize User Input**: Always sanitize user-provided data before logging:

   ```python
   def sanitize_for_logging(value: str, max_length: int = 100) -> str:
       """Sanitize user input for safe logging."""
       # Remove control characters
       value = "".join(ch for ch in value if ch.isprintable())
       # Truncate if too long
       if len(value) > max_length:
           value = value[:max_length] + "..."
       return value
   ```

#### Security Audit Logging

For security-critical operations, use a dedicated audit logger:

```python
audit_logger = logging.getLogger("security.audit")

# Log security-critical events
audit_logger.info("Token issued", extra={
    "event": "token.issued",
    "client_id": client_id,
    "scope": scope,
    "expires_in": expires_in
})
```

#### Testing Logging Security

Include tests that verify sensitive data doesn't leak into logs:

```python
import logging


def test_no_token_in_logs(caplog):
    """Verify tokens are not logged in full."""
    caplog.set_level(logging.DEBUG)  # Capture all levels, not only WARNING and above
    token = "sensitive_token_abc123xyz789"

    # Perform operation that logs token
    validate_token(token)

    # Check logs don't contain the full token
    for record in caplog.records:
        assert token not in record.getMessage()
        # A prefix may appear as a structured field, but never the full token
        assert token not in str(getattr(record, "token_prefix", ""))
```
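
The token-prefix guideline and the leak test above can be combined into a rough, self-contained sketch that runs without pytest. The names `token_prefix`, `ListHandler`, `log_token_validated`, and `record_contains` are hypothetical helpers introduced here for illustration only. They are not part of the project's code. The sketch assumes tokens are logged only through structured `extra` fields, so it checks those fields as well as the formatted message.

```python
import logging


def token_prefix(token: str, visible: int = 8) -> str:
    """Return a log-safe prefix of a token (hypothetical helper)."""
    if len(token) > visible:
        return token[:visible] + "..."
    return "***"


class ListHandler(logging.Handler):
    """Collect log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def log_token_validated(
    logger: logging.Logger, token: str, client_id: str
) -> None:
    """Log a validation event that exposes only the token prefix."""
    logger.info(
        "Token validated",
        extra={"token_prefix": token_prefix(token), "client_id": client_id},
    )


def record_contains(record: logging.LogRecord, needle: str) -> bool:
    """Search the formatted message and all string attributes.

    Fields passed via `extra` become attributes of the record, so
    scanning vars(record) covers structured fields as well.
    """
    if needle in record.getMessage():
        return True
    return any(
        isinstance(value, str) and needle in value
        for value in vars(record).values()
    )


if __name__ == "__main__":
    logger = logging.getLogger("indieauth.example")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)

    token = "sensitive_token_abc123xyz789"
    log_token_validated(logger, token, "https://app.example.com/")

    # The full token must not appear anywhere in the captured record
    assert not record_contains(handler.records[0], token)
```

Checking `vars(record)` rather than only `record.getMessage()` matters for structured logging: a token passed through `extra` never shows up in the message text, so a message-only assertion would pass even if the full token leaked into a structured field.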