# Python Coding Standard

## Overview

This document defines coding standards for the IndieAuth server implementation in Python. The primary goals are maintainability and clarity; prefer both over cleverness.

## Python Version

- **Target**: Python 3.10+ (for modern type hints and async support)
- Use only stable language features
- Avoid deprecated patterns

## Code Style

### Formatting
- Use **Black** for automatic code formatting (line length: 88)
- Use **isort** for import sorting
- No manual formatting - let tools handle it

### Linting
- Use **flake8** with the following configuration:
```ini
# .flake8
[flake8]
max-line-length = 88
extend-ignore = E203, W503
exclude = .git,__pycache__,docs,build,dist
```

- Use **mypy** for static type checking:
```ini
# mypy.ini
[mypy]
python_version = 3.10
warn_return_any = True
warn_unused_configs = True
disallow_untyped_defs = True
```

## Project Structure

```
indieauth/
├── __init__.py
├── main.py                 # Application entry point
├── config.py               # Configuration management
├── models/                 # Data models
│   ├── __init__.py
│   ├── client.py
│   ├── token.py
│   └── user.py
├── endpoints/              # HTTP endpoint handlers
│   ├── __init__.py
│   ├── authorization.py
│   ├── token.py
│   └── registration.py
├── services/               # Business logic
│   ├── __init__.py
│   ├── auth_service.py
│   ├── token_service.py
│   └── client_service.py
├── storage/                # Data persistence
│   ├── __init__.py
│   ├── base.py
│   └── sqlite.py
├── utils/                  # Utility functions
│   ├── __init__.py
│   ├── crypto.py
│   └── validation.py
└── exceptions.py           # Custom exceptions
```

## Naming Conventions

### General Rules
- Use descriptive names - clarity over brevity
- Avoid abbreviations except well-known ones (url, id, db)
- Use American English spelling

### Specific Conventions
- **Modules**: `lowercase_with_underscores.py`
- **Classes**: `PascalCase`
- **Functions/Methods**: `lowercase_with_underscores()`
- **Constants**: `UPPERCASE_WITH_UNDERSCORES`
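- **Private**: Prefix with a single underscore: `_private_method()`
- **Internal**: Prefix with a double underscore (`__internal_var`) only when an attribute must not clash with subclass attributes; Python name-mangles it to `_ClassName__internal_var`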

### Examples
```python
# Good
class ClientRegistration:
    MAX_REDIRECT_URIS = 10

    def validate_redirect_uri(self, uri: str) -> bool:
        pass

# Bad
class client_reg:  # Wrong case
    maxURIs = 10    # Wrong case, abbreviation

    def checkURI(self, u):  # Unclear naming, missing types
        pass
```

## Type Hints

Type hints are **mandatory** for all functions and methods:

```python
from typing import Dict, Optional, Union
from datetime import datetime

def generate_token(
    client_id: str,
    scope: Optional[str] = None,
    expires_in: int = 3600
) -> Dict[str, Union[str, int, datetime]]:
    """Generate an access token for the client."""
    pass
```

## Docstrings

Use Google-style docstrings for all public modules, classes, and functions:

```python
def exchange_code(
    code: str,
    client_id: str,
    code_verifier: Optional[str] = None
) -> Token:
    """
    Exchange authorization code for access token.

    Args:
        code: The authorization code received from auth endpoint
        client_id: The client identifier
        code_verifier: PKCE code verifier if PKCE was used

    Returns:
        Access token with associated metadata

    Raises:
        InvalidCodeError: If code is invalid or expired
        InvalidClientError: If client_id doesn't match code
        PKCERequiredError: If PKCE is required but not provided
    """
```

## Error Handling

### Custom Exceptions
Define specific exceptions in `exceptions.py`:

```python
class IndieAuthError(Exception):
    """Base exception for IndieAuth errors."""
    pass

class InvalidClientError(IndieAuthError):
    """Raised when client authentication fails."""
    pass

class InvalidTokenError(IndieAuthError):
    """Raised when token validation fails."""
    pass
```

### Error Handling Pattern
```python
# Good - Specific exception handling
try:
    token = validate_token(bearer_token)
except InvalidTokenError as e:
    logger.warning(f"Token validation failed: {e}")
    return error_response(401, "invalid_token")
except Exception:
    # logger.exception logs at ERROR level and includes the traceback
    logger.exception("Unexpected error during validation")
    return error_response(500, "internal_error")

# Bad - Catching all exceptions
try:
    token = validate_token(bearer_token)
except:  # Never use bare except
    return error_response(400, "error")
```

## Logging

Use the standard `logging` module:

```python
import logging

logger = logging.getLogger(__name__)

class TokenService:
    def create_token(self, client_id: str) -> str:
        logger.debug(f"Creating token for client: {client_id}")
        token = self._generate_token()
        logger.info(f"Token created for client: {client_id}")
        return token
```

### Logging Levels
- **DEBUG**: Detailed diagnostic information
- **INFO**: General informational messages
- **WARNING**: Unexpected or potentially harmful situations that the application can recover from
- **ERROR**: Error messages for failures
- **CRITICAL**: Critical problems that require immediate attention
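
The active level is usually chosen once at startup. The following is a minimal sketch, assuming a hypothetical `configure_logging` helper and a `debug` flag supplied by the application's configuration; neither name is prescribed by this standard.

```python
import logging


def configure_logging(debug: bool = False) -> None:
    """Configure the root logger once at application startup.

    `configure_logging` is a hypothetical helper used for illustration.
    """
    logging.basicConfig(
        level=logging.DEBUG if debug else logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # Replace handlers installed earlier (Python 3.8+)
    )


configure_logging(debug=False)
logger = logging.getLogger("indieauth.example")
logger.debug("Hidden because the root level is INFO")
logger.info("Emitted because INFO is enabled")
```

Module loggers created with `logging.getLogger(__name__)` inherit the root level, so individual modules do not need their own level settings.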

### Sensitive Data
Never log sensitive data:
```python
# Bad
logger.info(f"User logged in with password: {password}")
logger.debug(f"Generated token: {access_token}")

# Good
logger.info(f"User logged in: {user_id}")
logger.debug(f"Token generated for client: {client_id}")
```

## Configuration Management

Use environment variables for configuration:

```python
# config.py
import os

class Config:
    """Application configuration."""

    # Required settings (a missing variable raises KeyError at import time)
    SECRET_KEY: str = os.environ["INDIEAUTH_SECRET_KEY"]
    DATABASE_URL: str = os.environ["INDIEAUTH_DATABASE_URL"]

    # Optional settings with defaults
    TOKEN_EXPIRY: int = int(os.getenv("INDIEAUTH_TOKEN_EXPIRY", "3600"))
    RATE_LIMIT: int = int(os.getenv("INDIEAUTH_RATE_LIMIT", "100"))
    DEBUG: bool = os.getenv("INDIEAUTH_DEBUG", "false").lower() == "true"

    @classmethod
    def validate(cls) -> None:
        """Validate configuration on startup."""
        if not cls.SECRET_KEY:
            raise ValueError("INDIEAUTH_SECRET_KEY must be set")
        if len(cls.SECRET_KEY) < 32:
            raise ValueError("INDIEAUTH_SECRET_KEY must be at least 32 characters")
```
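
Because the class attributes above are evaluated when `config.py` is imported, a missing required variable surfaces as a `KeyError` before `validate()` can run. One possible alternative is sketched below. It assumes a hypothetical `Settings` dataclass and `load_settings` helper (neither is part of this standard) that read from any mapping, which also makes the configuration easy to test without touching the real environment.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical immutable settings object; fields mirror `Config`."""

    secret_key: str
    database_url: str
    token_expiry: int = 3600
    debug: bool = False


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build settings from a mapping such as `os.environ`, failing fast."""
    required = ("INDIEAUTH_SECRET_KEY", "INDIEAUTH_DATABASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    if len(env["INDIEAUTH_SECRET_KEY"]) < 32:
        raise ValueError("INDIEAUTH_SECRET_KEY must be at least 32 characters")
    return Settings(
        secret_key=env["INDIEAUTH_SECRET_KEY"],
        database_url=env["INDIEAUTH_DATABASE_URL"],
        token_expiry=int(env.get("INDIEAUTH_TOKEN_EXPIRY", "3600")),
        debug=env.get("INDIEAUTH_DEBUG", "false").lower() == "true",
    )
```

At startup this would be called as `load_settings(os.environ)`; in tests a plain dictionary can be passed instead.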

## Dependency Management

### Requirements Files
```
requirements.txt         # Production dependencies only
requirements-dev.txt     # Development dependencies (includes requirements.txt)
requirements-test.txt    # Test dependencies (includes requirements.txt)
```

### Dependency Principles
- Pin exact versions in requirements.txt
- Minimize dependencies - prefer standard library
- Audit dependencies for security vulnerabilities
- Document why each dependency is needed
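
The pinning rule can be checked automatically. The sketch below is one way to do it, assuming a hypothetical `find_unpinned` helper that handles only simple `name==version` lines; URL or VCS requirements and environment markers are out of scope for this illustration.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0"
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?==[^\s;]+")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned with `==`."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # Drop trailing comments
        if not line or line.startswith("-"):  # Skip blanks and options (-r)
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# The package names and versions below are illustrative only
sample = """\
# Production dependencies
examplepkg==1.2.3
otherpkg>=2.0
-r requirements.txt
thirdpkg[extra]==0.4.0
"""
result = find_unpinned(sample)
```

Here `result` contains only `otherpkg>=2.0`, the single line that uses a range instead of an exact version. A check like this could run in CI against `requirements.txt`.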

## Security Practices

### Input Validation
Always validate and sanitize input:
```python
from urllib.parse import urlparse

def validate_redirect_uri(uri: str) -> bool:
    """Validate that redirect URI is safe."""
    parsed = urlparse(uri)

    # Must be absolute URI
    if not parsed.scheme or not parsed.netloc:
        return False

    # Must be HTTPS in production (DEBUG comes from application configuration)
    if not DEBUG and parsed.scheme != "https":
        return False

    # Prevent redirects to blocked hosts; hostname excludes any port in netloc
    if parsed.hostname in BLACKLISTED_DOMAINS:
        return False

    return True
```

### Secrets Management
```python
import secrets

def generate_token() -> str:
    """Generate cryptographically secure token."""
    return secrets.token_urlsafe(32)

def constant_time_compare(a: str, b: str) -> bool:
    """Compare strings in constant time to prevent timing attacks."""
    return secrets.compare_digest(a, b)
```
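
One detail worth knowing: `secrets.compare_digest` accepts `str` arguments only when both are pure ASCII, and raises `TypeError` otherwise. Tokens from `secrets.token_urlsafe` are always ASCII, but user-supplied values may not be. The sketch below assumes a hypothetical `constant_time_compare_text` variant that encodes both values to bytes first.

```python
import secrets


def constant_time_compare_text(a: str, b: str) -> bool:
    """Compare arbitrary text in constant time via its UTF-8 bytes.

    Hypothetical variant of the helper above; encoding to bytes avoids the
    ASCII-only restriction that applies to `str` arguments.
    """
    return secrets.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


token = secrets.token_urlsafe(32)  # 32 random bytes -> 43 URL-safe characters
same = constant_time_compare_text(token, token)
different = constant_time_compare_text("caf\u00e9", "cafe")
```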

## Performance Considerations

### Async/Await
Use async for I/O operations when beneficial:
```python
async def verify_client(client_id: str) -> Optional[Client]:
    """Verify client exists and is valid."""
    client = await db.get_client(client_id)
    if client and not client.is_revoked:
        return client
    return None
```
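
Async pays off mainly when several I/O operations can overlap. The sketch below illustrates this with `asyncio.gather`; `fetch_client` is a hypothetical stand-in for a real database or HTTP lookup, and the client URLs are placeholders.

```python
import asyncio


async def fetch_client(client_id: str) -> dict[str, str]:
    """Stand-in for an awaitable database or HTTP lookup (hypothetical)."""
    await asyncio.sleep(0.01)  # Simulates I/O latency
    return {"client_id": client_id}


async def fetch_clients(client_ids: list[str]) -> list[dict[str, str]]:
    """Run the lookups concurrently; results keep the input order."""
    return await asyncio.gather(*(fetch_client(cid) for cid in client_ids))


clients = asyncio.run(
    fetch_clients(["https://a.example/", "https://b.example/"])
)
```

Both lookups wait at the same time, so the total time is close to that of a single lookup rather than the sum of the two. For purely CPU-bound work, async offers no such benefit.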

### Caching
Cache expensive operations appropriately:
```python
from functools import lru_cache

@lru_cache(maxsize=128)
def get_client_metadata(client_id: str) -> dict:
    """Fetch and cache client metadata."""
    # Expensive operation
    return fetch_client_metadata(client_id)
```
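
Two properties of `lru_cache` matter here: entries never expire on their own, and callers receive the same cached object on every hit, so a cached `dict` should not be mutated. The sketch below shows how to inspect and invalidate the cache; `get_client_name` is a hypothetical function used only for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def get_client_name(client_id: str) -> str:
    """Hypothetical expensive lookup, cached per `client_id`."""
    return f"Client at {client_id}"


get_client_name("https://app.example/")  # Miss: runs the function
get_client_name("https://app.example/")  # Hit: served from the cache
info = get_client_name.cache_info()

# Invalidate explicitly when the underlying data changes
get_client_name.cache_clear()
get_client_name("https://app.example/")  # Miss again after clearing
info_after_clear = get_client_name.cache_info()
```

If cached metadata must expire after a fixed time, a cache with time-based expiry is needed instead, since `lru_cache` evicts only by size.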

## Module Documentation

Each module should have a header docstring:
```python
"""
Authorization endpoint implementation.

This module handles the OAuth 2.0 authorization endpoint as specified
in the IndieAuth specification. It processes authorization requests,
validates client information, and generates authorization codes.
"""
```

## Comments

### When to Comment
- Complex algorithms or business logic
- Workarounds or non-obvious solutions
- TODO items with issue references
- Security-critical code sections

### Comment Style
```python
# Good comments explain WHY, not WHAT

# Bad - Explains what the code does
counter = counter + 1  # Increment counter

# Good - Explains why
counter = counter + 1  # Track attempts for rate limiting

# Security-critical sections need extra attention
# SECURITY: Validate redirect_uri to prevent open redirect attacks
# See: https://owasp.org/www-project-web-security-testing-guide/
if not validate_redirect_uri(redirect_uri):
    raise SecurityError("Invalid redirect URI")
```

## Code Organization Principles

1. **Single Responsibility**: Each module/class/function does one thing
2. **Dependency Injection**: Pass dependencies, don't hard-code them
3. **Composition over Inheritance**: Prefer composition for code reuse
4. **Fail Fast**: Validate input early and fail with clear errors
5. **Explicit over Implicit**: Clear interfaces over magic behavior
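
As an illustration of dependency injection and failing fast, here is a minimal sketch. The `TokenStore` protocol and `InMemoryTokenStore` class are hypothetical names chosen for this example, and the `TokenService` shown is a simplified stand-in rather than the project's actual service.

```python
import secrets
from typing import Protocol


class TokenStore(Protocol):
    """Hypothetical storage interface; any object with `save` satisfies it."""

    def save(self, client_id: str, token: str) -> None: ...


class InMemoryTokenStore:
    """Simple store used here for illustration and in tests."""

    def __init__(self) -> None:
        self.tokens: dict[str, str] = {}

    def save(self, client_id: str, token: str) -> None:
        self.tokens[client_id] = token


class TokenService:
    """Receives its store instead of constructing one itself."""

    def __init__(self, store: TokenStore) -> None:
        self._store = store

    def create_token(self, client_id: str) -> str:
        if not client_id:  # Fail fast with a clear error
            raise ValueError("client_id must not be empty")
        token = secrets.token_urlsafe(32)
        self._store.save(client_id, token)
        return token


store = InMemoryTokenStore()
service = TokenService(store)
token = service.create_token("https://app.example/")
```

Because the service depends only on the `TokenStore` interface, a SQLite-backed store can replace the in-memory one without changes to `TokenService`.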

## Logging Security

### Secure Logging Guidelines

#### Never Log Sensitive Data

The following must NEVER appear in logs:
- Full tokens (authorization codes, access tokens, refresh tokens)
- Passwords or secrets
- Private keys or certificates
- Personally identifiable information (PII) other than opaque user identifiers; this includes email addresses and, in most cases, IP addresses

#### Safe Logging Practices

When logging security-relevant events, follow these practices:

1. **Token Prefixes**: When token identification is necessary, log only the first 8 characters with ellipsis:
   ```python
   logger.info("Token validated", extra={
       "token_prefix": (token[:8] + "...") if len(token) > 8 else "***",
       "client_id": client_id
   })
   ```

2. **Request Context**: Log security events with context:
   ```python
   logger.warning("Authorization failed", extra={
       "client_id": client_id,
       "error": error_code  # Use error codes, not full messages
   })
   ```

3. **Security Events to Log**:
   - Failed authentication attempts
   - Token validation failures
   - Rate limit violations
   - Input validation failures
   - HTTPS redirect actions
   - Client registration events

4. **Use Structured Logging**: Include metadata as structured fields:
   ```python
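   from datetime import datetime, timezone

   logger.info("Client registered", extra={
       "event": "client.registered",
       "client_id": client_id,
       "registration_method": "self_service",
       # datetime.utcnow() is deprecated since Python 3.12
       "timestamp": datetime.now(timezone.utc).isoformat()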
   })
   ```

5. **Sanitize User Input**: Always sanitize user-provided data before logging:
   ```python
   def sanitize_for_logging(value: str, max_length: int = 100) -> str:
       """Sanitize user input for safe logging."""
       # Remove control characters
       value = "".join(ch for ch in value if ch.isprintable())
       # Truncate if too long
       if len(value) > max_length:
           value = value[:max_length] + "..."
       return value
   ```
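
The reason for stripping control characters is log forging: a value containing a newline can make one log entry look like two. The sketch below repeats the helper from item 5 so that it runs on its own; the malicious input is an invented example.

```python
def sanitize_for_logging(value: str, max_length: int = 100) -> str:
    """Same helper as above, repeated so this example is self-contained."""
    value = "".join(ch for ch in value if ch.isprintable())
    if len(value) > max_length:
        value = value[:max_length] + "..."
    return value


# A forged second log line hidden behind a newline
malicious = "https://app.example/\nINFO Token issued for admin"
cleaned = sanitize_for_logging(malicious)

# Overlong input is cut to max_length characters plus the ellipsis
long_value = sanitize_for_logging("a" * 150)
```

After sanitizing, `cleaned` is a single line, so the forged text can no longer pass as a separate log entry.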

#### Security Audit Logging

For security-critical operations, use a dedicated audit logger:

```python
audit_logger = logging.getLogger("security.audit")

# Log security-critical events
audit_logger.info("Token issued", extra={
    "event": "token.issued",
    "client_id": client_id,
    "scope": scope,
    "expires_in": expires_in
})
```
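
Fields passed through `extra` become attributes on the `LogRecord`, but the default formatter does not print them. A formatter that serializes them is therefore needed for the audit trail to be useful. The sketch below assumes a hypothetical `JsonExtraFormatter`, and writes to an in-memory stream in place of a real file or log shipper.

```python
import io
import json
import logging

# Attribute names present on every LogRecord; anything else came from `extra`
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))


class JsonExtraFormatter(logging.Formatter):
    """Hypothetical formatter that emits the message and `extra` as JSON."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        for key, value in vars(record).items():
            # "message" and "asctime" may be set by other formatters
            if key not in _STANDARD_ATTRS and key not in ("message", "asctime"):
                payload[key] = value
        return json.dumps(payload, default=str)


stream = io.StringIO()  # Stands in for a file or log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonExtraFormatter())

audit_logger = logging.getLogger("security.audit")
audit_logger.setLevel(logging.INFO)
audit_logger.addHandler(handler)
audit_logger.propagate = False  # Keep audit events out of the general log

audit_logger.info(
    "Token issued", extra={"event": "token.issued", "scope": "profile"}
)
entry = json.loads(stream.getvalue())
```

Setting `propagate = False` is a design choice for this sketch: it keeps audit events in their own destination, at the cost of not appearing in the general application log.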

#### Testing Logging Security

Include tests that verify sensitive data doesn't leak into logs:

```python
import logging


def test_no_token_in_logs(caplog):
    """Verify tokens are not logged in full."""
    token = "sensitive_token_abc123xyz789"

    # Perform an operation that logs information about the token
    with caplog.at_level(logging.DEBUG):
        validate_token(token)

    # Neither the message nor any `extra` field may contain the full token.
    # A prefix such as token[:8] is allowed and usually lives in `extra`.
    for record in caplog.records:
        assert token not in record.getMessage()
        assert all(token not in str(value) for value in vars(record).values())
```