# System Architecture Overview - v0.1.0

**Version**: 0.1.0

**Date**: 2025-12-22

**Status**: Initial Design

## Introduction

This document describes the high-level architecture for Sneaky Klaus, a self-hosted Secret Santa organization application. The architecture prioritizes simplicity, ease of deployment, and minimal external dependencies while maintaining security and reliability.

## System Architecture

### Deployment Model

Sneaky Klaus is deployed as a **single Docker container** containing all application components:

```mermaid
graph TB
    subgraph "Docker Container"
        direction TB
        Flask[Flask Application]
        Gunicorn[Gunicorn WSGI Server]
        APScheduler[APScheduler Background Jobs]
        SQLite[(SQLite Database)]

        Gunicorn --> Flask
        Flask --> SQLite
        APScheduler --> Flask
        APScheduler --> SQLite
    end

    subgraph "External Services"
        Resend[Resend Email API]
    end

    subgraph "Clients"
        AdminBrowser[Admin Browser]
        ParticipantBrowser[Participant Browser]
    end

    AdminBrowser -->|HTTPS| Gunicorn
    ParticipantBrowser -->|HTTPS| Gunicorn
    Flask -->|HTTPS| Resend

    subgraph "Persistent Storage"
        DBVolume[Database Volume]
        SQLite -.->|Mounted| DBVolume
    end
```

### Component Responsibilities

| Component | Responsibility | Technology |
|-----------|----------------|------------|
| **Gunicorn** | HTTP request handling, worker process management | Gunicorn 21.x |
| **Flask Application** | Request routing, business logic, template rendering | Flask 3.x |
| **SQLite Database** | Data persistence, transactional storage | SQLite 3.40+ |
| **APScheduler** | Background job scheduling (reminders, data purging) | APScheduler 3.10+ |
| **Jinja2** | Server-side HTML template rendering | Jinja2 3.1+ |
| **Resend** | Transactional email delivery | Resend API |

### Application Architecture

The Flask application follows a layered architecture:

```mermaid
graph TB
    subgraph "Presentation Layer"
        Routes[Route Handlers]
        Templates[Jinja2 Templates]
        Forms[WTForms Validation]
    end

    subgraph "Business Logic Layer"
        Services[Service Layer]
        Auth[Authentication Service]
        Matching[Matching Algorithm]
        Notifications[Notification Service]
    end

    subgraph "Data Access Layer"
        Models[SQLAlchemy Models]
        Repositories[Repository Pattern]
    end

    subgraph "Infrastructure"
        Database[(SQLite)]
        EmailProvider[Resend Email]
        Scheduler[APScheduler]
    end

    Routes --> Services
    Routes --> Forms
    Routes --> Templates
    Services --> Models
    Services --> Auth
    Services --> Matching
    Services --> Notifications
    Models --> Repositories
    Repositories --> Database
    Notifications --> EmailProvider
    Scheduler --> Services
```

## Core Components

### Flask Application Structure

```
src/
├── app.py                  # Application factory
├── config.py               # Configuration management
├── models/                 # SQLAlchemy models
│   ├── admin.py
│   ├── exchange.py
│   ├── participant.py
│   ├── match.py
│   ├── session.py
│   └── auth_token.py
├── routes/                 # Route handlers (blueprints)
│   ├── admin.py
│   ├── participant.py
│   ├── exchange.py
│   └── auth.py
├── services/               # Business logic
│   ├── auth_service.py
│   ├── exchange_service.py
│   ├── matching_service.py
│   ├── notification_service.py
│   └── scheduler_service.py
├── templates/              # Jinja2 templates
│   ├── admin/
│   ├── participant/
│   ├── auth/
│   └── layouts/
├── static/                 # Static assets (CSS, minimal JS)
│   ├── css/
│   └── js/
└── utils/                  # Utility functions
    ├── email.py
    ├── security.py
    └── validators.py
```

### Database Layer

**ORM**: SQLAlchemy for database abstraction and model definition

**Migration**: Alembic for schema versioning and migrations

**Configuration**:
- WAL mode enabled for better concurrency
- Foreign keys enabled
- Connection pooling disabled (little benefit for a single-file, single-process database)
- Appropriate timeouts for locked database scenarios

### Background Job Scheduler

**APScheduler Configuration**:
- JobStore: SQLAlchemyJobStore (persists jobs across restarts)
- Executor: ThreadPoolExecutor (4 workers)
- Timezone-aware scheduling

**Scheduled Jobs**:
1. **Reminder Emails**: Cron jobs scheduled per exchange based on configured intervals
2. **Exchange Completion**: Daily check for exchanges past their exchange date
3. **Data Purging**: Daily check for completed exchanges past 30-day retention
4. **Session Cleanup**: Daily purge of expired sessions
5. **Token Cleanup**: Hourly purge of expired auth tokens

### Email Service

**Provider**: Resend API via official Python SDK

**Email Types**:
- Participant registration confirmation
- Magic link authentication
- Match notification (post-matching)
- Reminder emails (configurable schedule)
- Admin notifications (opt-in)
- Password reset

**Template Strategy**:
- HTML templates stored in `templates/emails/`
- Rendered using Jinja2 before sending
- Plain text alternatives for all emails
- Unsubscribe links where appropriate

## Configuration Management

### Environment Variables

| Variable | Purpose | Required | Default |
|----------|---------|----------|---------|
| `SECRET_KEY` | Flask session cookie signing | Yes | - |
| `DATABASE_URL` | SQLite database file path | No | `sqlite:///data/sneaky-klaus.db` |
| `RESEND_API_KEY` | Resend API authentication | Yes | - |
| `APP_URL` | Base URL for links in emails | Yes | - |
| `ADMIN_EMAIL` | Initial admin email (setup) | Setup only | - |
| `ADMIN_PASSWORD` | Initial admin password (setup) | Setup only | - |
| `LOG_LEVEL` | Logging verbosity | No | `INFO` |
| `TZ` | Container timezone | No | `UTC` |

### Configuration Files

**config.py**: Python-based configuration with environment variable overrides

**Environment-based configs**:
- `DevelopmentConfig`: Debug mode, verbose logging
- `ProductionConfig`: Security headers, minimal logging
- `TestConfig`: In-memory database, mocked email

## Deployment Architecture

### Docker Container

**Base Image**: `python:3.11-slim`

**Exposed Ports**:
- `8000`: HTTP (Gunicorn)

**Volumes**:
- `/app/data`: Database and uploaded files (if any)

**Health Check**:
- Endpoint: `/health`
- Interval: 30 seconds
- Timeout: 5 seconds

**Dockerfile Structure**:

```dockerfile
FROM python:3.11-slim

# Install curl (needed by HEALTHCHECK; the slim image does not ship it)
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Install uv
RUN pip install uv

# Copy application
WORKDIR /app
COPY . /app

# Install dependencies
RUN uv sync --frozen

# Create data directory
RUN mkdir -p /app/data

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=5s \
    CMD curl -f http://localhost:8000/health || exit 1

# Run application
CMD ["uv", "run", "gunicorn", "-c", "gunicorn.conf.py", "src.app:create_app()"]
```

### Reverse Proxy Configuration

**Recommended**: Deploy behind a reverse proxy (Nginx, Traefik, Caddy) for:
- HTTPS termination
- Rate limiting (additional layer beyond app-level)
- Static file caching
- Request buffering

**Example Nginx Config**:

```nginx
server {
    listen 443 ssl http2;
    server_name secretsanta.example.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://sneaky-klaus:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /static {
        proxy_pass http://sneaky-klaus:8000/static;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
}
```

## Security Architecture

### Authentication & Authorization

See [ADR-0002: Authentication Strategy](../../decisions/0002-authentication-strategy.md) for detailed authentication design.
**Summary**:
- Admin: Password-based authentication with bcrypt hashing
- Participant: Magic link authentication with time-limited tokens
- Server-side sessions with secure cookies
- Rate limiting on all authentication endpoints

### Security Headers

Flask-Talisman configured with:
- `Content-Security-Policy`: Restrict script sources
- `X-Frame-Options`: Prevent clickjacking
- `X-Content-Type-Options`: Prevent MIME sniffing
- `Strict-Transport-Security`: Enforce HTTPS
- `Referrer-Policy`: Control referrer information

### CSRF Protection

Flask-WTF provides CSRF tokens for all forms:
- Tokens embedded in forms automatically
- Tokens validated on POST/PUT/DELETE requests
- SameSite cookie attribute provides additional protection

### Input Validation

- WTForms for form validation
- SQLAlchemy parameterized queries prevent SQL injection
- Jinja2 auto-escaping prevents XSS
- Email validation for all email inputs

### Secrets Management

- All secrets stored in environment variables
- Never committed to version control
- Docker secrets or .env file for local development
- Secret rotation supported through environment updates

## Data Flow Examples

### Participant Registration Flow

```mermaid
sequenceDiagram
    participant P as Participant Browser
    participant F as Flask App
    participant DB as SQLite Database
    participant E as Resend Email

    P->>F: GET /exchange/{id}/register
    F->>DB: Query exchange details
    DB-->>F: Exchange data
    F-->>P: Registration form

    P->>F: POST /exchange/{id}/register (name, email, gift ideas)
    F->>F: Validate form
    F->>DB: Check email uniqueness in exchange
    F->>DB: Insert participant record
    DB-->>F: Participant created
    F->>F: Generate magic link token
    F->>DB: Store auth token
    F->>E: Send confirmation email with magic link
    E-->>F: Email accepted
    F-->>P: Registration success page
```

### Matching Flow

```mermaid
sequenceDiagram
    participant A as Admin Browser
    participant F as Flask App
    participant M as Matching Service
    participant DB as SQLite Database
    participant E as Resend Email

    A->>F: POST /admin/exchange/{id}/match
    F->>DB: Get all participants
    F->>DB: Get exclusion rules
    F->>M: Execute matching algorithm
    M->>M: Generate valid assignments
    M-->>F: Match assignments
    F->>DB: Store matches (transaction)
    DB-->>F: Matches saved
    F->>E: Send match notification to each participant
    E-->>F: Emails queued
    F->>DB: Update exchange state to "Matched"
    F-->>A: Matching complete
```

### Reminder Email Flow

```mermaid
sequenceDiagram
    participant S as APScheduler
    participant F as Flask App
    participant DB as SQLite Database
    participant E as Resend Email

    S->>F: Trigger reminder job
    F->>DB: Query exchanges needing reminders today
    DB-->>F: Exchange list

    loop For each exchange
        F->>DB: Get opted-in participants
        DB-->>F: Participant list

        loop For each participant
            F->>DB: Get participant's match
            F->>E: Send reminder email
            E-->>F: Email sent
        end
    end

    F->>DB: Log reminder job completion
```

## Performance Considerations

### Expected Load

- **Concurrent Users**: 10-50 typical, 100 maximum
- **Exchanges**: 10-100 per installation
- **Participants per Exchange**: 3-100 typical, 500 maximum
- **Database Size**: <100MB typical, <1GB maximum

### Scaling Strategy

**Vertical Scaling**: Increase container resources (CPU, memory) as needed

**Horizontal Scaling**: Not supported due to SQLite limitation. If horizontal scaling becomes necessary:
1. Migrate database to PostgreSQL
2. Externalize session storage (Redis)
3. Deploy multiple application instances behind load balancer

For the target use case (self-hosted Secret Santa), vertical scaling is sufficient.
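
At the sizes listed under Expected Load (at most 500 participants per exchange), the "Generate valid assignments" step of the Matching Flow can stay simple. The sketch below is one possible approach, not the project's actual matching service: it shuffles the receivers and retries until nobody is matched to themselves or to an excluded person. The function name `assign_matches`, the `exclusions` set of `(giver, receiver)` pairs, and the `max_attempts` cap are all assumptions made for illustration.

```python
import random


def assign_matches(participants, exclusions=frozenset(), max_attempts=1000, rng=None):
    """Return a {giver: receiver} mapping (hypothetical helper, not the real service).

    participants: list of unique participant identifiers
    exclusions:   set of (giver, receiver) pairs that must not be matched
    Raises ValueError if no valid assignment is found within max_attempts.
    """
    if len(participants) < 2:
        raise ValueError("need at least two participants")

    rng = rng or random.Random()
    givers = list(participants)

    for _ in range(max_attempts):
        # Shuffle a copy so every participant receives exactly one gift.
        receivers = givers[:]
        rng.shuffle(receivers)
        pairs = list(zip(givers, receivers))

        # Accept only if nobody gives to themselves or to an excluded person.
        if all(g != r and (g, r) not in exclusions for g, r in pairs):
            return dict(pairs)

    raise ValueError("no valid assignment found within max_attempts")


if __name__ == "__main__":
    people = ["ann", "bob", "cy", "dee"]
    # Assumed example rule: ann and bob must not draw each other.
    banned = {("ann", "bob"), ("bob", "ann")}
    for giver, receiver in assign_matches(people, banned).items():
        print(f"{giver} -> {receiver}")
```

Retrying random shuffles is quick when exclusions are sparse. For heavily constrained groups, a backtracking or graph-matching approach would be more reliable, at the cost of more code.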
### Caching Strategy

**Initial Version**: No caching layer (premature optimization)

**Future Optimization** (if needed):
- Flask-Caching for expensive queries (participant lists, exchange details)
- Redis for session storage (if horizontal scaling needed)
- Reverse proxy caching for static assets

### Database Optimization

- Indexes on frequently queried fields (email, exchange_id, token_hash)
- WAL mode for improved read concurrency
- VACUUM scheduled periodically (after data purges)
- Query optimization through SQLAlchemy query analysis

## Monitoring & Observability

### Logging

**Python logging module** with structured logging:
- **Log Levels**: DEBUG, INFO, WARNING, ERROR, CRITICAL
- **Log Format**: JSON for production, human-readable for development
- **Log Outputs**: stdout (captured by Docker)

**Logged Events**:
- Authentication attempts (success and failure)
- Exchange state transitions
- Matching operations (start, success, failure)
- Email send operations
- Background job execution
- Error exceptions with stack traces

### Metrics (Future)

Potential metrics to track:
- Request count and latency by endpoint
- Authentication success/failure rates
- Email delivery success rates
- Background job execution duration
- Database query performance

**Implementation**: Prometheus metrics endpoint (optional enhancement)

### Health Checks

**`/health` endpoint** returns:
- HTTP 200: Application healthy
- HTTP 503: Application unhealthy (database unreachable, critical failure)

**Checks**:
- Database connectivity
- Email service reachability (optional, cached)
- Scheduler running status

## Disaster Recovery

### Backup Strategy

**Database Backup**:
- SQLite file located at `/app/data/sneaky-klaus.db`
- Backup via volume snapshots or file copy
- Recommended frequency: Daily automatic backups
- Retention: 30 days

**Backup Methods**:
1. Volume snapshots (Docker volume backup)
2. `sqlite3 .backup` command (online backup)
3. File copy (requires application shutdown for consistency)

### Restore Procedure

1. Stop container
2. Replace database file with backup
3. Start container
4. Verify application health

### Data Export

**Future Enhancement**: Admin export functionality
- CSV export of participants and exchanges
- JSON export for full data portability

## Development Workflow

### Local Development

```bash
# Clone repository
git clone https://github.com/user/sneaky-klaus.git
cd sneaky-klaus

# Install dependencies
uv sync

# Set up environment variables
cp .env.example .env
# Edit .env with local values

# Run database migrations
uv run alembic upgrade head

# Run development server
uv run flask run
```

### Testing Strategy

**Test Levels**:
1. **Unit Tests**: Business logic, utilities (pytest)
2. **Integration Tests**: Database operations, email sending (pytest + fixtures)
3. **End-to-End Tests**: Full user flows (Playwright or Selenium)

**Test Coverage Target**: 80%+ for business logic

### CI/CD Pipeline

**Continuous Integration**:
- Run tests on every commit
- Lint code (ruff, mypy)
- Build Docker image
- Security scanning (bandit, safety)

**Continuous Deployment**:
- Tag releases (semantic versioning)
- Push Docker image to registry
- Update deployment documentation

## Future Architectural Considerations

### Potential Enhancements

1. **Multi-tenancy**: Support multiple isolated admin accounts (requires significant schema changes)
2. **PostgreSQL Support**: Optional PostgreSQL backend for larger deployments
3. **Horizontal Scaling**: Redis session storage, multi-instance deployment
4. **API**: REST API for programmatic access or mobile apps
5. **Webhooks**: Notify external systems of events
6. **Internationalization**: Multi-language support

### Migration Paths

If the application needs to scale beyond SQLite:

1. **Database Migration**: Alembic migrations can be adapted for PostgreSQL
2. **Session Storage**: Move to Redis for distributed sessions
3. **Job Queue**: Move to Celery + Redis for distributed background jobs
4. **File Storage**: Move to S3-compatible storage if file uploads are added

These migrations would be disruptive and are not planned for initial versions.

## Conclusion

This architecture prioritizes simplicity and ease of self-hosting while maintaining security, reliability, and maintainability. The single-container deployment model minimizes operational complexity, making Sneaky Klaus accessible to non-technical users who want to self-host their Secret Santa exchanges.

The design is deliberately conservative, avoiding premature optimization and complex infrastructure. Future enhancements can be added incrementally without requiring fundamental architectural changes.

## References

- [ADR-0001: Core Technology Stack](../../decisions/0001-core-technology-stack.md)
- [ADR-0002: Authentication Strategy](../../decisions/0002-authentication-strategy.md)
- [Flask Documentation](https://flask.palletsprojects.com/)
- [SQLAlchemy Documentation](https://docs.sqlalchemy.org/)
- [APScheduler Documentation](https://apscheduler.readthedocs.io/)