Gondulf/docs/architecture/phase1-clarifications.md
Phil Skentelbery bebd47955f feat(core): implement Phase 1 foundation infrastructure
Implements Phase 1 Foundation with all core services:

Core Components:
- Configuration management with GONDULF_ environment variables
- Database layer with SQLAlchemy and migration system
- In-memory code storage with TTL support
- Email service with SMTP and TLS support (STARTTLS + implicit TLS)
- DNS service with TXT record verification
- Structured logging with Python standard logging
- FastAPI application with health check endpoint

Database Schema:
- authorization_codes table for OAuth 2.0 authorization codes
- domains table for domain verification
- migrations table for tracking schema versions
- Simple sequential migration system (001_initial_schema.sql)

Configuration:
- Environment-based configuration with validation
- .env.example template with all GONDULF_ variables
- Fail-fast validation on startup
- Sensible defaults for optional settings

Testing:
- 96 comprehensive tests (77 unit, 5 integration)
- 94.16% code coverage (exceeds 80% requirement)
- All tests passing
- Test coverage includes:
  - Configuration loading and validation
  - Database migrations and health checks
  - In-memory storage with expiration
  - Email service (STARTTLS, implicit TLS, authentication)
  - DNS service (TXT records, domain verification)
  - Health check endpoint integration

Documentation:
- Implementation report with test results
- Phase 1 clarifications document
- ADRs for key decisions (config, database, email, logging)

Technical Details:
- Python 3.10+ with type hints
- SQLite with configurable database URL
- System DNS with public DNS fallback
- Port-based TLS detection (465=SSL, 587=STARTTLS)
- Lazy configuration loading for testability

Exit Criteria Met:
✓ All foundation services implemented
✓ Application starts without errors
✓ Health check endpoint operational
✓ Database migrations working
✓ Test coverage exceeds 80%
✓ All tests passing

Ready for Architect review and Phase 2 development.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 12:21:42 -07:00


Phase 1 Implementation Clarifications

Date: 2024-11-20

This document gives specific answers to the Developer's clarification questions for the Phase 1 implementation.

1. Configuration Management - Environment Variables

Decision: YES - Use the GONDULF_ prefix for all environment variables.

Complete environment variable specification:

# Required - no defaults
GONDULF_SECRET_KEY=<generate-with-secrets.token_urlsafe(32)>

# Database
GONDULF_DATABASE_URL=sqlite:///./data/gondulf.db

# SMTP Configuration
GONDULF_SMTP_HOST=localhost
GONDULF_SMTP_PORT=587
GONDULF_SMTP_USERNAME=
GONDULF_SMTP_PASSWORD=
GONDULF_SMTP_FROM=noreply@example.com
GONDULF_SMTP_USE_TLS=true

# Token and Code Expiry (seconds)
GONDULF_TOKEN_EXPIRY=3600
GONDULF_CODE_EXPIRY=600

# Logging
GONDULF_LOG_LEVEL=INFO
GONDULF_DEBUG=false

Implementation Requirements:

  • Create .env.example with all variables documented
  • Use python-dotenv for loading (already in requirements.txt)
  • Validate GONDULF_SECRET_KEY exists on startup (fail fast if missing)
  • All other variables should have sensible defaults as shown above
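
The requirements above can be sketched as a small loader. This is a minimal illustration, not the project's actual module: the names Config, ConfigurationError, and load_config are hypothetical, it reads from a plain mapping instead of calling python-dotenv so it stays self-contained, and the accepted boolean spellings are an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


class ConfigurationError(Exception):
    """Raised when required configuration is missing or invalid (hypothetical name)."""


@dataclass(frozen=True)
class Config:
    secret_key: str
    database_url: str
    smtp_host: str
    smtp_port: int
    smtp_username: str
    smtp_password: str
    smtp_from: str
    smtp_use_tls: bool
    token_expiry: int
    code_expiry: int
    log_level: str
    debug: bool


def _as_bool(value: str) -> bool:
    # Assumption: these spellings count as true; everything else is false.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def _as_int(env: Mapping[str, str], name: str, default: int) -> int:
    raw = env.get(name, str(default))
    try:
        return int(raw)
    except ValueError as exc:
        raise ConfigurationError(f"{name} must be an integer, got {raw!r}") from exc


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a Config from GONDULF_ variables, failing fast on a missing secret key."""
    secret_key = env.get("GONDULF_SECRET_KEY", "")
    if not secret_key:
        raise ConfigurationError("GONDULF_SECRET_KEY is required")
    return Config(
        secret_key=secret_key,
        database_url=env.get("GONDULF_DATABASE_URL", "sqlite:///./data/gondulf.db"),
        smtp_host=env.get("GONDULF_SMTP_HOST", "localhost"),
        smtp_port=_as_int(env, "GONDULF_SMTP_PORT", 587),
        smtp_username=env.get("GONDULF_SMTP_USERNAME", ""),
        smtp_password=env.get("GONDULF_SMTP_PASSWORD", ""),
        smtp_from=env.get("GONDULF_SMTP_FROM", "noreply@example.com"),
        smtp_use_tls=_as_bool(env.get("GONDULF_SMTP_USE_TLS", "true")),
        token_expiry=_as_int(env, "GONDULF_TOKEN_EXPIRY", 3600),
        code_expiry=_as_int(env, "GONDULF_CODE_EXPIRY", 600),
        log_level=env.get("GONDULF_LOG_LEVEL", "INFO"),
        debug=_as_bool(env.get("GONDULF_DEBUG", "false")),
    )
```

Passing the environment as a parameter keeps the loader easy to test; in the application, python-dotenv would populate os.environ before load_config() is called.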

See Also: ADR 0004 - Configuration Management Strategy


2. Database Schema - Tables for Phase 1

Decision: Create exactly THREE tables in Phase 1.

Table 1: authorization_codes

CREATE TABLE authorization_codes (
    code TEXT PRIMARY KEY,
    client_id TEXT NOT NULL,
    redirect_uri TEXT NOT NULL,
    state TEXT,
    code_challenge TEXT,
    code_challenge_method TEXT,
    scope TEXT,
    me TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

Table 2: domains

CREATE TABLE domains (
    domain TEXT PRIMARY KEY,
    email TEXT NOT NULL,
    verification_code TEXT NOT NULL,
    verified BOOLEAN NOT NULL DEFAULT FALSE,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    verified_at TIMESTAMP
);

Table 3: migrations

CREATE TABLE migrations (
    version INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

Do NOT create:

  • Audit tables (use logging instead)
  • Token tables (Phase 2)
  • Client tables (Phase 3)

Implementation Requirements:

  • Create src/gondulf/database/migrations/ directory
  • Create 001_initial_schema.sql with above schema
  • Migration runner should track applied migrations in migrations table
  • Use simple sequential versioning: 001, 002, 003, etc.
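
One possible shape for the migration runner is sketched below. It is an assumption-laden illustration: it uses the standard sqlite3 module directly instead of SQLAlchemy, the function names are hypothetical, and the description column is derived from the file name. Because 001_initial_schema.sql creates the migrations table itself, the runner treats a missing table as "nothing applied yet".

```python
import re
import sqlite3
from pathlib import Path

# Matches names such as 001_initial_schema.sql
MIGRATION_NAME = re.compile(r"^(\d{3})_(.+)\.sql$")


def applied_versions(conn: sqlite3.Connection) -> set[int]:
    """Return the versions recorded in the migrations table (empty if it does not exist)."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'migrations'"
    ).fetchone()
    if not exists:
        return set()
    return {row[0] for row in conn.execute("SELECT version FROM migrations")}


def run_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> list[int]:
    """Apply pending NNN_*.sql files in ascending order and return the versions applied."""
    done = applied_versions(conn)
    newly_applied: list[int] = []
    for path in sorted(migrations_dir.glob("*.sql")):
        match = MIGRATION_NAME.match(path.name)
        if match is None:
            continue  # ignore files that do not follow the naming scheme
        version = int(match.group(1))
        if version in done:
            continue
        conn.executescript(path.read_text())
        conn.execute(
            "INSERT INTO migrations (version, description) VALUES (?, ?)",
            (version, match.group(2).replace("_", " ")),
        )
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

Running the function a second time against the same database applies nothing, which is the property the migrations table exists to provide.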

See Also: ADR 0005 - Phase 1 Database Schema


3. In-Memory Storage - Implementation Details

Decision: Option B - Standard dict with manual expiration check on access.

Rationale:

  • Simplest implementation
  • No background threads or complexity
  • Codes are short-lived (10 minutes), so memory cleanup isn't critical
  • Lazy deletion on access is sufficient

Implementation Specification:

import time


class CodeStore:
    """In-memory storage for domain verification codes with TTL."""

    def __init__(self, ttl_seconds: int = 600):
        self._store: dict[str, tuple[str, float]] = {}
        self._ttl = ttl_seconds

    def store(self, email: str, code: str) -> None:
        """Store verification code with expiry timestamp."""
        expiry = time.time() + self._ttl
        self._store[email] = (code, expiry)

    def verify(self, email: str, code: str) -> bool:
        """Verify code and remove from store."""
        if email not in self._store:
            return False

        stored_code, expiry = self._store[email]

        # Check expiration
        if time.time() > expiry:
            del self._store[email]
            return False

        # Check code match
        if code != stored_code:
            return False

        # Valid - remove from store
        del self._store[email]
        return True

Expiration cleanup: On read only. An expired code is deleted the next time verify() is called for that email; no background cleanup is needed.

Configuration: Use GONDULF_CODE_EXPIRY=600 (10 minutes default)
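
To exercise the expiration path without waiting ten minutes, the same lazy-expiry logic can take its clock as a parameter. The class below is a hypothetical variant for illustration only; the name ClockedCodeStore and the clock argument are assumptions, not part of the specification above.

```python
import time
from typing import Callable


class ClockedCodeStore:
    """Hypothetical CodeStore variant whose time source can be replaced in tests."""

    def __init__(self, ttl_seconds: int = 600, clock: Callable[[], float] = time.time):
        self._store: dict[str, tuple[str, float]] = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def store(self, email: str, code: str) -> None:
        """Store the code together with its expiry timestamp."""
        self._store[email] = (code, self._clock() + self._ttl)

    def verify(self, email: str, code: str) -> bool:
        """Return True once for a matching, unexpired code; expired entries are deleted."""
        entry = self._store.get(email)
        if entry is None:
            return False
        stored_code, expiry = entry
        if self._clock() > expiry:
            del self._store[email]  # lazy deletion on access
            return False
        if code != stored_code:
            return False  # a wrong guess leaves the stored code in place
        del self._store[email]  # codes are single-use
        return True


# Usage: advance a fake clock past the TTL and the code stops verifying.
now = [1000.0]
demo = ClockedCodeStore(ttl_seconds=600, clock=lambda: now[0])
demo.store("user@example.com", "123456")
now[0] += 601
expired_result = demo.verify("user@example.com", "123456")  # False
```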


4. Email Service - SMTP TLS/STARTTLS

Decision: Support both implicit TLS and STARTTLS, selected by port (Option B variant).

Configuration:

GONDULF_SMTP_HOST=smtp.gmail.com
GONDULF_SMTP_PORT=587              # or 465 for implicit TLS
GONDULF_SMTP_USERNAME=user@gmail.com
GONDULF_SMTP_PASSWORD=app-password
GONDULF_SMTP_FROM=noreply@example.com
GONDULF_SMTP_USE_TLS=true

Implementation Logic:

import smtplib

if smtp_port == 465:
    # Implicit TLS
    server = smtplib.SMTP_SSL(smtp_host, smtp_port)
elif smtp_port == 587 and smtp_use_tls:
    # STARTTLS
    server = smtplib.SMTP(smtp_host, smtp_port)
    server.starttls()
else:
    # Unencrypted (testing only)
    server = smtplib.SMTP(smtp_host, smtp_port)

if smtp_username and smtp_password:
    server.login(smtp_username, smtp_password)

Defaults: Port 587 with STARTTLS (most common)
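
The branching above can be separated into a pure decision function plus a thin connection helper, which makes the port rules testable without a mail server. This is a sketch under assumptions: the names select_smtp_mode and connect, the mode strings, and the 10-second timeout are illustrative and not taken from the specification.

```python
import smtplib


def select_smtp_mode(port: int, use_tls: bool) -> str:
    """Map port and TLS flag to a connection mode, mirroring the logic above."""
    if port == 465:
        return "implicit_tls"
    if port == 587 and use_tls:
        return "starttls"
    return "plain"  # unencrypted, testing only


def connect(
    host: str,
    port: int,
    use_tls: bool,
    username: str = "",
    password: str = "",
    timeout: float = 10.0,  # assumed value, not specified above
) -> smtplib.SMTP:
    """Open an SMTP connection using the mode chosen by select_smtp_mode."""
    mode = select_smtp_mode(port, use_tls)
    if mode == "implicit_tls":
        server: smtplib.SMTP = smtplib.SMTP_SSL(host, port, timeout=timeout)
    else:
        server = smtplib.SMTP(host, port, timeout=timeout)
        if mode == "starttls":
            server.starttls()
    if username and password:
        server.login(username, password)
    return server
```

Note that under these rules any port other than 465 or 587 results in an unencrypted connection even when GONDULF_SMTP_USE_TLS=true.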

See Also: ADR 0006 - Email SMTP Configuration


5. DNS Service - Resolver Configuration

Decision: Option C - Use system DNS with fallback to public DNS.

Rationale:

  • Respects system configuration (good citizenship)
  • Fallback to reliable public DNS if system fails
  • No configuration needed for most users
  • Works in containerized environments

Implementation Specification:

import dns.resolver

def create_resolver() -> dns.resolver.Resolver:
    """Create DNS resolver with system DNS and public fallbacks."""
    try:
        # Try system DNS first (populates resolver.nameservers from system config)
        resolver = dns.resolver.Resolver()
    except dns.resolver.NoResolverConfiguration:
        # dnspython raises this when the system configuration cannot be read
        # or lists no nameservers, so start from an unconfigured resolver
        resolver = dns.resolver.Resolver(configure=False)

    if not resolver.nameservers:
        # Fallback to public DNS if system DNS not available
        resolver.nameservers = ['8.8.8.8', '1.1.1.1']

    return resolver

No environment variable needed - keep it simple and use system defaults.

Timeout configuration: Use dnspython defaults (2 seconds per nameserver)


6. Logging Configuration - Log Levels and Format

Decision: Option B - Standard Python logging with structured fields.

Format:

%(asctime)s [%(levelname)s] %(name)s: %(message)s

Example output:

2024-11-20 10:30:45 [INFO] gondulf.domain: Domain verification requested domain=example.com email=user@example.com
2024-11-20 10:30:46 [INFO] gondulf.auth: Authorization code generated client_id=https://app.example.com me=https://example.com

Log Levels:

  • Development (GONDULF_DEBUG=true): DEBUG
  • Production (GONDULF_DEBUG=false): INFO
  • Configurable via GONDULF_LOG_LEVEL=DEBUG|INFO|WARNING|ERROR; when set, it takes precedence over the level implied by GONDULF_DEBUG

Implementation:

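import logging
import os

# GONDULF_DEBUG selects the default level; GONDULF_LOG_LEVEL overrides it
debug = os.getenv('GONDULF_DEBUG', 'false').lower() == 'true'
log_level = os.getenv('GONDULF_LOG_LEVEL', 'DEBUG' if debug else 'INFO').upper()

# Configure root logger
logging.basicConfig(
    level=log_level,
    format='%(asctime)s [%(levelname)s] %(name)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)

# Get logger for module
logger = logging.getLogger('gondulf.domain')

# Log with structured information (domain and email come from the request being handled)
logger.info("Domain verification requested domain=%s email=%s", domain, email)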

Output: stdout/stderr (let deployment environment handle log files)
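
The key=value suffix shown in the example output can be produced by a small helper so that every module formats its fields the same way. The helper name kv is hypothetical, and the sketch assumes field values contain no spaces that would need quoting.

```python
import logging


def kv(message: str, **fields: object) -> str:
    """Append key=value pairs to a message, in the order the fields were given."""
    pairs = " ".join(f"{key}={value}" for key, value in fields.items())
    return f"{message} {pairs}" if pairs else message


logger = logging.getLogger("gondulf.domain")
logger.info(kv("Domain verification requested", domain="example.com", email="user@example.com"))
```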

See Also: ADR 0007 - Logging Strategy for v1.0.0


7. Health Check Endpoint

Decision: Option B - Check database connectivity.

Rationale:

  • Must verify database is accessible (critical dependency)
  • Email and DNS are used on-demand, not required for health
  • Keep it simple - one critical check
  • Fast response time

Endpoint Specification:

GET /health

Response - Healthy:

{
  "status": "healthy",
  "database": "connected"
}

Status Code: 200

Response - Unhealthy:

{
  "status": "unhealthy",
  "database": "error",
  "error": "unable to connect to database"
}

Status Code: 503

Implementation:

  • Execute simple query: SELECT 1 against database
  • Timeout: 5 seconds
  • No authentication required for health check
  • Log failures at WARNING level
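
The check itself can be kept independent of the web framework by returning a status code and a response body. The sketch below makes several assumptions: it uses the standard sqlite3 module instead of SQLAlchemy, the function and logger names are hypothetical, and the 5-second timeout is left to the caller. A FastAPI handler for GET /health would return the body with the given status code.

```python
import logging
import sqlite3

logger = logging.getLogger("gondulf.health")  # hypothetical logger name


def check_health(conn: sqlite3.Connection) -> tuple[int, dict[str, str]]:
    """Run SELECT 1 and return (HTTP status code, JSON body) as specified above."""
    try:
        conn.execute("SELECT 1").fetchone()
    except sqlite3.Error as exc:
        # Failures are logged at WARNING; the response carries a generic message only
        logger.warning("Health check failed error=%s", exc)
        return 503, {
            "status": "unhealthy",
            "database": "error",
            "error": "unable to connect to database",
        }
    return 200, {"status": "healthy", "database": "connected"}
```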

8. Database File Location

Decision: Option C - Configurable via GONDULF_DATABASE_URL with smart defaults.

Configuration:

GONDULF_DATABASE_URL=sqlite:///./data/gondulf.db

Path Resolution:

  • Relative paths resolved from current working directory
  • Absolute paths used as-is
  • Default: ./data/gondulf.db (relative to cwd)

Data Directory Creation:

from pathlib import Path

def ensure_database_directory(database_url: str) -> None:
    """Create database directory if it doesn't exist."""
    if database_url.startswith('sqlite:///'):
        # Parse path from URL
        db_path = database_url.replace('sqlite:///', '', 1)
        db_file = Path(db_path)

        # Create parent directory if needed
        db_file.parent.mkdir(parents=True, exist_ok=True)

Call this on application startup before any database operations.

Deployment Examples:

Development:

GONDULF_DATABASE_URL=sqlite:///./data/gondulf.db

Production (Docker):

GONDULF_DATABASE_URL=sqlite:////data/gondulf.db

Production (systemd):

GONDULF_DATABASE_URL=sqlite:////var/lib/gondulf/gondulf.db
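
The difference between the three-slash and four-slash forms in these examples is easy to miss: everything after sqlite:/// is the file path, so a fourth slash makes the path absolute. A minimal sketch, using a hypothetical helper name and assuming POSIX-style paths:

```python
from pathlib import Path
from typing import Optional

SQLITE_PREFIX = "sqlite:///"


def sqlite_path(database_url: str) -> Optional[Path]:
    """Return the file path encoded in a sqlite:/// URL, or None for other schemes."""
    if not database_url.startswith(SQLITE_PREFIX):
        return None
    return Path(database_url[len(SQLITE_PREFIX):])


dev = sqlite_path("sqlite:///./data/gondulf.db")      # relative to the working directory
docker = sqlite_path("sqlite:////data/gondulf.db")    # absolute: /data/gondulf.db
```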

Summary

All 8 questions have been answered with specific implementation details. Key ADRs created:

  • ADR 0004: Configuration Management
  • ADR 0005: Phase 1 Database Schema
  • ADR 0006: Email SMTP Configuration
  • ADR 0007: Logging Strategy

The Developer now has complete, unambiguous specifications to proceed with Phase 1 implementation.