This is my personal website.
+ + +``` + +Or visible link: + +```html +Email me +``` + +## Rationale + +### Follows IndieWeb Standards + +**IndieWeb Alignment**: +- rel="me" is the standard way to assert identity in IndieWeb +- Users familiar with IndieAuth likely already have rel="me" configured +- Interoperability with other IndieWeb tools +- Well-documented pattern: https://indieweb.org/rel-me + +**Community Expectations**: +- IndieAuth ecosystem uses rel="me" extensively +- Users understand the pattern +- Existing tutorials and documentation available +- Aligns with decentralized identity principles + +### Security Benefits + +**Prevents Social Engineering**: +- User cannot claim arbitrary email addresses +- Email must be published on the user's own site +- Attacker cannot trick user into entering wrong email +- Self-attested identity (user declares on their domain) + +**Reduces Attack Surface**: +- No user input field for email (no typos, no XSS) +- No email enumeration via guessing +- Email discovery transparent and auditable +- User controls what email is published + +**Transparency**: +- User explicitly publishes email on their site +- Public declaration of email relationship +- User aware they're making email public +- No hidden or implicit email collection + +### Implementation Simplicity + +**Standard Libraries**: +- BeautifulSoup: Robust HTML parsing (handles malformed HTML) +- requests: HTTP client (widely used, well-tested) +- No custom protocols or complex parsing +- Python standard library for email validation + +**Error Handling**: +- Clear error messages with setup instructions +- Graceful degradation (site unavailable, etc.) 
+- Standard HTTP status codes +- No complex state management + +**Testing**: +- Easy to mock HTTP responses +- Straightforward unit tests +- BeautifulSoup handles edge cases (malformed HTML) +- No external service dependencies + +### User Experience + +**Self-Documenting**: +- User adds one HTML tag to their site +- Clear relationship between domain and email +- User understands what they're publishing +- No hidden configuration + +**Familiar Pattern**: +- Similar to verifying site ownership (Google Search Console, etc.) +- Adding meta tags is common web practice +- Many users already have rel="me" for other purposes +- Works with static sites (no backend required) + +**Setup Time**: +- ~1 minute to add link tag +- No waiting (unlike DNS propagation) +- Immediate verification possible +- Can be combined with other rel="me" links + +## Consequences + +### Positive Consequences + +1. **IndieWeb Standard Compliance**: + - Follows established rel="me" pattern + - Interoperability with IndieWeb tools + - Community-vetted approach + - Well-documented standard + +2. **Enhanced Security**: + - No user-provided email input (prevents social engineering) + - Email explicitly published by user + - Transparent and auditable + - Reduces phishing risk + +3. **Implementation Simplicity**: + - Standard libraries (BeautifulSoup, requests) + - No complex protocols + - Easy to test and maintain + - Handles malformed HTML gracefully + +4. **User Control**: + - User explicitly declares email on their site + - Can change email by updating HTML + - No hidden email collection + - User aware of public email + +5. **Flexibility**: + - Works with static sites (no backend needed) + - Can use any email provider + - Email can be at different domain (e.g., Gmail) + - Supports multiple rel="me" links + +### Negative Consequences + +1. 
**Public Email Requirement**: + - User must publish email publicly on their site + - Not suitable for users who want private email + - Email harvesters can discover address + - Spam risk (mitigated: users can use spam filters) + +2. **HTML Parsing Complexity**: + - Must handle various HTML formats + - Malformed HTML can cause issues (mitigated: BeautifulSoup) + - Case sensitivity considerations + - Multiple possible HTML structures + +3. **Website Dependency**: + - User's site must be available during authentication + - Site downtime blocks authentication + - No fallback if site unreachable + - Requires HTTPS (not all sites have valid certificates) + +4. **Discovery Failures**: + - User may not have rel="me" configured + - Link may be in wrong format + - Email may be invalid format + - Clear error messages required + +5. **Privacy Considerations**: + - Email addresses visible to anyone + - Cannot use email verification without public disclosure + - Users must accept public email + - May deter privacy-conscious users + +### Mitigation Strategies + +**For Public Email Concern**: +- Document clearly that email will be public +- Suggest using dedicated email for IndieAuth +- Recommend spam filtering +- Note: Email is user's choice (they publish it) + +**For HTML Parsing**: +```python +from bs4 import BeautifulSoup + +# BeautifulSoup handles malformed HTML gracefully +soup = BeautifulSoup(html_content, 'html.parser') + +# Case-insensitive attribute matching +me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me') + +# Multiple link formats supported +# +# Email +``` + +**For Website Dependency**: +- Clear error messages with retry option +- Suggest checking site availability +- Timeout limits (10 seconds) +- Log errors for debugging + +**For Discovery Failures**: +```markdown +Error: No rel="me" email link found + +Please add this to your homepage: + + +See: https://indieweb.org/rel-me for more information. 
+```
+
+## Implementation
+
+### Email Discovery Service
+
+```python
+import logging
+import re
+from typing import Optional
+
+import requests
+from bs4 import BeautifulSoup
+
+logger = logging.getLogger(__name__)
+
+
+class RelMeEmailDiscovery:
+    """
+    Discover email addresses from rel="me" links on user's homepage.
+    """
+
+    def discover_email(self, domain: str) -> Optional[str]:
+        """
+        Fetch domain homepage and discover email from rel="me" link.
+
+        Args:
+            domain: User's domain (e.g., "example.com")
+
+        Returns:
+            Email address or None if not found
+        """
+        url = f"https://{domain}"
+
+        try:
+            # Fetch homepage with safety limits.
+            # requests.get() accepts no max_redirects argument; the redirect
+            # limit is an attribute of requests.Session.
+            session = requests.Session()
+            session.max_redirects = 5
+            response = session.get(
+                url,
+                timeout=10,
+                allow_redirects=True,
+                verify=True  # Verify SSL certificate
+            )
+            response.raise_for_status()
+
+            # Parse HTML (handles malformed HTML)
+            soup = BeautifulSoup(response.content, 'html.parser')
+
+            # Find all rel="me" links
+            # Both <link> and <a> tags supported
+            me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')
+
+            # Look for mailto: links
+            for link in me_links:
+                href = link.get('href', '')
+                if href.startswith('mailto:'):
+                    # Strip only the leading scheme, not later occurrences
+                    email = href[len('mailto:'):].strip()
+
+                    # Validate email format
+                    if self._validate_email_format(email):
+                        logger.info(f"Discovered email via rel='me' for {domain}")
+                        return email
+
+            logger.warning(f"No rel='me' mailto: link found on {domain}")
+            return None
+
+        except requests.exceptions.SSLError as e:
+            logger.error(f"SSL verification failed for {domain}: {e}")
+            return None
+        except requests.exceptions.Timeout:
+            logger.error(f"Timeout fetching {domain}")
+            return None
+        except requests.exceptions.HTTPError as e:
+            logger.error(f"HTTP error fetching {domain}: {e}")
+            return None
+        except Exception as e:
+            logger.error(f"Failed to discover email for {domain}: {e}")
+            return None
+
+    def _validate_email_format(self, email: str) -> bool:
+        """
+        Validate email address format.
+ + Args: + email: Email address to validate + + Returns: + True if valid format, False otherwise + """ + # Basic RFC 5322 format check + email_regex = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' + if not re.match(email_regex, email): + return False + + # Length check (RFC 5321) + if len(email) > 254: + return False + + # Must have exactly one @ + if email.count('@') != 1: + return False + + return True +``` + +### Error Messages + +```python +# DNS TXT found, but no rel="me" link +error_message = """ +Domain verified via DNS, but no email found on your site. + +Please add this to your homepage: + + +This allows us to discover your email address automatically. + +Learn more: https://indieweb.org/rel-me +""" + +# Site unreachable +error_message = """ +Could not fetch your site at https://{domain} + +Please check: +- Site is accessible via HTTPS +- SSL certificate is valid +- No firewall blocking requests + +Try again once your site is accessible. +""" + +# Invalid email format in rel="me" +error_message = """ +Found rel="me" link, but email format is invalid: {email} + +Please check your rel="me" link uses valid email format: + +""" +``` + +## Alternatives Considered + +### Alternative 1: User-Provided Email Input + +**Pros**: +- Simpler implementation (no HTTP fetch, no parsing) +- Works even if site is down +- User can use private email (not public) +- Immediate (no HTTP round-trip) + +**Cons**: +- Social engineering risk (attacker tricks user into entering wrong email) +- Typo risk (user enters incorrect email) +- No self-attestation (email not on user's site) +- Not aligned with IndieWeb standards + +**Rejected**: Security risks outweigh simplicity benefits. rel="me" provides self-attestation and prevents social engineering. 
+ +--- + +### Alternative 2: DNS TXT Record for Email + +**Pros**: +- Stronger proof of domain control (DNS) +- No website dependency +- Machine-readable format +- Fast lookups (DNS cache) + +**Cons**: +- Requires DNS configuration (more complex than HTML) +- DNS propagation delays (can be hours) +- Not user-friendly for non-technical users +- Not standard IndieWeb practice + +**Rejected**: DNS configuration is more complex than adding HTML tag. rel="me" is more aligned with IndieWeb standards. + +--- + +### Alternative 3: WebFinger Protocol + +**Pros**: +- Standard protocol (RFC 7033) +- Machine-readable format (JSON) +- Supports multiple identities +- Well-defined spec + +**Cons**: +- Requires server-side endpoint (not for static sites) +- More complex implementation +- Not common in IndieWeb ecosystem +- Overkill for email discovery + +**Rejected**: Too complex for v1.0.0 MVP. Doesn't work with static sites. rel="me" is simpler and more aligned with IndieWeb. + +--- + +### Alternative 4: Well-Known URI + +**Pros**: +- Standard approach (`/.well-known/email`) +- Simple file-based implementation +- No HTML parsing required +- Fast lookups + +**Cons**: +- Not an established standard for email +- Requires server configuration +- Not aligned with IndieWeb practices +- Duplicate effort (rel="me" already exists) + +**Rejected**: Not standard practice. rel="me" is already established in IndieWeb ecosystem. 
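For comparison with the alternatives above, the chosen rel="me" lookup can be sketched in a few lines. This is a minimal illustration, not the Gondulf implementation: it swaps BeautifulSoup for the standard library's `html.parser.HTMLParser` so that it runs without third-party packages, and the names `RelMeMailtoParser` and `discover_email` are hypothetical. The regular expression is the simplified format check used elsewhere in this ADR.

```python
import re
from html.parser import HTMLParser
from typing import List, Optional

# Simplified email format check, as used elsewhere in this ADR
EMAIL_RE = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')


class RelMeMailtoParser(HTMLParser):
    """Collect mailto: targets of <link>/<a> elements whose rel includes "me"."""

    def __init__(self) -> None:
        super().__init__()
        self.candidates: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        if tag not in ("link", "a"):
            return
        attr = {name: (value or "") for name, value in attrs}
        rel_tokens = attr.get("rel", "").lower().split()
        href = attr.get("href", "").strip()
        if "me" in rel_tokens and href.lower().startswith("mailto:"):
            # Drop the scheme and any query string (?subject=...)
            address = href[len("mailto:"):].split("?", 1)[0].strip()
            self.candidates.append(address)


def discover_email(html: str) -> Optional[str]:
    """Return the first well-formed rel="me" mailto address, or None."""
    parser = RelMeMailtoParser()
    parser.feed(html)
    for candidate in parser.candidates:
        if len(candidate) <= 254 and EMAIL_RE.match(candidate):
            return candidate
    return None
```

Because `rel` is split into tokens, a value such as `rel="me noopener"` is also accepted, and non-mailto rel="me" links (for example profile URLs) are skipped.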
+ +## References + +- IndieWeb rel="me": https://indieweb.org/rel-me +- Example Implementation: https://thesatelliteoflove.com (Phil Skents' identity page) +- HTML Link Relations (W3C): https://www.w3.org/TR/html5/links.html#linkTypes +- BeautifulSoup Documentation: https://www.crummy.com/software/BeautifulSoup/ +- RFC 5322 (Email Format): https://datatracker.ietf.org/doc/html/rfc5322 +- RFC 5321 (SMTP): https://datatracker.ietf.org/doc/html/rfc5321 +- WebFinger (RFC 7033): https://datatracker.ietf.org/doc/html/rfc7033 (alternative considered) + +## Decision History + +- 2025-11-20: Proposed (Architect) +- 2025-11-20: Accepted (Architect) +- Related to ADR-005 (Two-Factor Domain Verification) diff --git a/docs/designs/phase-2-domain-verification.md b/docs/designs/phase-2-domain-verification.md new file mode 100644 index 0000000..71cd7f3 --- /dev/null +++ b/docs/designs/phase-2-domain-verification.md @@ -0,0 +1,2559 @@ +# Phase 2 Design: Domain Verification & Authorization Endpoint + +**Date**: 2025-11-20 +**Architect**: Claude (Architect Agent) +**Status**: Ready for Implementation +**Design Version**: 1.0 + +## Overview + +### What Phase 2 Builds + +Phase 2 implements the complete two-factor domain verification flow and the IndieAuth authorization endpoint, building on Phase 1's foundational services. + +**Core Functionality**: +1. HTML fetching service to retrieve user's homepage +2. rel="me" email discovery service to parse HTML for email links +3. Domain verification service to orchestrate two-factor verification (DNS TXT + Email) +4. HTTP endpoints for verification flow +5. Authorization endpoint to start IndieAuth authentication flow + +**Connection to IndieAuth Protocol**: Phase 2 implements steps 1-7 of the IndieAuth authorization flow (see `/docs/architecture/indieauth-protocol.md` lines 165-174), completing the domain verification and authorization code generation. 
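The ordering of the core steps listed above (DNS TXT check, homepage fetch, rel="me" discovery, code delivery) can be sketched as a short pipeline. This is an illustrative sketch only: `StartResult`, `start_two_factor_verification`, the callable parameters, and the error identifiers are hypothetical names chosen for the example, and the real services are injected as plain callables so that the flow can be exercised without network access.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StartResult:
    success: bool
    email: Optional[str] = None
    code: Optional[str] = None
    error: Optional[str] = None


def start_two_factor_verification(
    domain: str,
    has_dns_txt: Callable[[str], bool],
    fetch_html: Callable[[str], Optional[str]],
    discover_email: Callable[[str], Optional[str]],
    send_code: Callable[[str, str], bool],
) -> StartResult:
    """Run the checks in order and stop at the first failure."""
    # Factor 1: DNS control
    if not has_dns_txt(domain):
        return StartResult(False, error="dns_txt_missing")

    # Factor 2 (discovery): the email published on the user's own site
    html = fetch_html(domain)
    if html is None:
        return StartResult(False, error="site_unreachable")
    email = discover_email(html)
    if email is None:
        return StartResult(False, error="relme_missing")

    # Factor 2 (verification): 6-digit code from a CSPRNG, zero-padded
    code = f"{secrets.randbelow(10**6):06d}"
    if not send_code(email, code):
        return StartResult(False, email=email, error="email_send_failed")
    return StartResult(True, email=email, code=code)
```

Passing the services in as callables is only a convenience for the sketch; it keeps each failure path testable with simple stand-ins such as `lambda d: False`.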
+ +**Connection to Phase 1**: Phase 2 uses all Phase 1 services: +- Configuration (SMTP, DNS, database settings) +- Database (to store verified domains) +- In-memory storage (for authorization codes) +- Email service (to send verification codes) +- DNS service (to verify TXT records) +- Logging (structured logging throughout) + +### Authentication Security Model + +Per ADR-005 and ADR-008, Phase 2 implements two-factor domain verification: + +**Factor 1: DNS TXT Record** (proves DNS control) +- Required: `_gondulf.{domain}` TXT record = `verified` +- Verified via Phase 1 DNS service +- Consensus from multiple resolvers + +**Factor 2: Email Verification via rel="me"** (proves email control) +- Discover email from `` on user's site +- Send 6-digit code to discovered email +- User enters code to complete verification + +**Combined Security**: Attacker must compromise BOTH DNS and email to authenticate fraudulently. + +## Components + +### 1. HTML Fetching Service + +**File**: `src/gondulf/html_fetcher.py` + +**Purpose**: Fetch user's homepage over HTTPS to discover rel="me" links. + +**Public Interface**: + +```python +from typing import Optional +import requests + +class HTMLFetcherService: + """ + Fetch user's homepage over HTTPS with security safeguards. + """ + + def __init__( + self, + timeout: int = 10, + max_redirects: int = 5, + max_size: int = 5 * 1024 * 1024 # 5MB + ): + """ + Initialize HTML fetcher service. + + Args: + timeout: HTTP request timeout in seconds (default: 10) + max_redirects: Maximum redirects to follow (default: 5) + max_size: Maximum response size in bytes (default: 5MB) + """ + self.timeout = timeout + self.max_redirects = max_redirects + self.max_size = max_size + + def fetch_site(self, domain: str) -> Optional[str]: + """ + Fetch site HTML content over HTTPS. 
+
+        Args:
+            domain: Domain to fetch (e.g., "example.com")
+
+        Returns:
+            HTML content as string, or None if fetch fails
+
+        Raises:
+            No exceptions raised - all errors logged and None returned
+        """
+```
+
+**Implementation Details**:
+
+```python
+def fetch_site(self, domain: str) -> Optional[str]:
+    """Fetch site HTML content over HTTPS."""
+    url = f"https://{domain}"
+
+    try:
+        # requests.get() accepts no max_redirects argument; the redirect
+        # limit is an attribute of requests.Session.
+        session = requests.Session()
+        session.max_redirects = self.max_redirects
+
+        # Fetch with security limits. stream=True defers the body download
+        # so the size limit can be enforced while reading.
+        with session.get(
+            url,
+            timeout=self.timeout,
+            allow_redirects=True,
+            verify=True,  # SECURITY: Enforce SSL certificate verification
+            stream=True,
+            headers={
+                'User-Agent': 'Gondulf/1.0.0 IndieAuth (+https://github.com/yourusername/gondulf)'
+            }
+        ) as response:
+            response.raise_for_status()
+
+            # SECURITY: Check declared size to prevent memory exhaustion
+            content_length = int(response.headers.get('Content-Length', 0))
+            if content_length > self.max_size:
+                logger.warning(f"Response too large for {domain}: {content_length} bytes")
+                return None
+
+            # Check actual content size while reading (Content-Length may be absent)
+            body = b''
+            for chunk in response.iter_content(chunk_size=8192):
+                body += chunk
+                if len(body) > self.max_size:
+                    logger.warning(f"Response content too large for {domain}: over {self.max_size} bytes")
+                    return None
+
+            logger.info(f"Successfully fetched {domain}: {len(body)} bytes")
+            return body.decode(response.encoding or 'utf-8', errors='replace')
+
+    except requests.exceptions.SSLError as e:
+        logger.error(f"SSL verification failed for {domain}: {e}")
+        return None
+    except requests.exceptions.Timeout:
+        logger.error(f"Timeout fetching {domain} after {self.timeout}s")
+        return None
+    except requests.exceptions.TooManyRedirects:
+        logger.error(f"Too many redirects for {domain}")
+        return None
+    except requests.exceptions.HTTPError as e:
+        logger.error(f"HTTP error fetching {domain}: {e}")
+        return None
+    except Exception as e:
+        logger.error(f"Unexpected error fetching {domain}: {e}")
+        return None
+```
+
+**Dependencies**:
+- `requests` library (already in pyproject.toml)
+- Python standard library: typing
+- Phase 1 logging
configuration + +**Error Handling**: +- SSL verification failure: Log error, return None (security: reject invalid certificates) +- Timeout: Log error, return None (configurable timeout via __init__) +- HTTP errors (404, 500, etc.): Log error with status code, return None +- Size limit exceeded: Log warning, return None (prevent DoS) +- Too many redirects: Log error, return None (prevent redirect loops) +- Generic exceptions: Log error, return None (fail-safe) + +**Security Considerations**: +- HTTPS only (hardcoded in URL) +- SSL certificate verification enforced (verify=True, cannot be disabled) +- Response size limit (5MB default, configurable) +- Timeout to prevent hanging (10s default, configurable) +- Redirect limit (5 max, configurable) +- User-Agent header identifies Gondulf for server logs + +**Testing Requirements**: +- ✅ Successful HTTPS fetch returns HTML content +- ✅ SSL verification failure returns None +- ✅ Timeout returns None +- ✅ HTTP error codes (404, 500) return None +- ✅ Redirects followed (up to max_redirects) +- ✅ Too many redirects returns None +- ✅ Content-Length exceeds max_size returns None +- ✅ Actual content exceeds max_size returns None +- ✅ Custom User-Agent sent in request + +--- + +### 2. rel="me" Email Discovery Service + +**File**: `src/gondulf/relme.py` + +**Purpose**: Parse HTML to discover email addresses from rel="me" links following IndieWeb standards. + +**Public Interface**: + +```python +from typing import Optional +from bs4 import BeautifulSoup +import re + +class RelMeDiscoveryService: + """ + Discover email addresses from rel="me" links in HTML. + + Follows IndieWeb rel="me" standard: https://indieweb.org/rel-me + """ + + def discover_email(self, html_content: str) -> Optional[str]: + """ + Parse HTML and discover email from rel="me" link. 
+ + Args: + html_content: HTML content as string + + Returns: + Email address or None if not found + + Raises: + No exceptions raised - all errors logged and None returned + """ + + def validate_email_format(self, email: str) -> bool: + """ + Validate email address format (RFC 5322 simplified). + + Args: + email: Email address to validate + + Returns: + True if valid format, False otherwise + """ +``` + +**Implementation Details**: + +```python +def discover_email(self, html_content: str) -> Optional[str]: + """Parse HTML and discover email from rel='me' link.""" + try: + # Parse HTML (BeautifulSoup handles malformed HTML gracefully) + soup = BeautifulSoup(html_content, 'html.parser') + + # Find all rel="me" links - both and tags + # Case-insensitive matching via BeautifulSoup + me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me') + + # Look for mailto: links + for link in me_links: + href = link.get('href', '') + if href.startswith('mailto:'): + # Extract email from mailto: URL + email = href.replace('mailto:', '').strip() + + # Remove query parameters if present (e.g., mailto:user@example.com?subject=Hello) + if '?' 
in email: + email = email.split('?')[0] + + # Validate email format + if self.validate_email_format(email): + logger.info(f"Discovered email via rel='me': {email[:3]}***@{email.split('@')[1]}") + return email + else: + logger.warning(f"Found rel='me' mailto link with invalid email format: {email}") + + logger.warning("No rel='me' mailto: link found in HTML") + return None + + except Exception as e: + logger.error(f"Failed to parse HTML for rel='me' links: {e}") + return None + +def validate_email_format(self, email: str) -> bool: + """Validate email address format (RFC 5322 simplified).""" + # Basic format validation + email_regex = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' + + if not re.match(email_regex, email): + return False + + # Length check (RFC 5321 maximum) + if len(email) > 254: + return False + + # Must have exactly one @ + if email.count('@') != 1: + return False + + # Domain must have at least one dot + local, domain = email.split('@') + if '.' not in domain: + return False + + return True +``` + +**Dependencies**: +- `beautifulsoup4>=4.12.0` (NEW - add to pyproject.toml) +- `html.parser` (Python standard library, used by BeautifulSoup) +- `re` (Python standard library) +- Phase 1 logging configuration + +**Error Handling**: +- Malformed HTML: BeautifulSoup handles gracefully, continues parsing +- Missing rel="me" links: Log warning, return None +- Invalid email format in link: Log warning, skip link, continue searching +- Multiple rel="me" mailto links: Return first valid one +- Empty href attribute: Skip link, continue searching +- Exception during parsing: Log error, return None + +**Security Considerations**: +- No script execution: BeautifulSoup only extracts attributes, never executes JavaScript +- Email validation: Strict format checking prevents injection +- Link extraction only: No rendering or evaluation of HTML +- Partial masking in logs: Only log first 3 chars of email (privacy) + +**Testing Requirements**: +- ✅ Discovery from `` 
tag +- ✅ Discovery from `` tag +- ✅ Multiple rel="me" links: select first mailto +- ✅ Malformed HTML handled gracefully +- ✅ Missing rel="me" links returns None +- ✅ Invalid email format in link returns None (but logs warning) +- ✅ Empty href returns None +- ✅ Non-mailto rel="me" links ignored (e.g., https:// links) +- ✅ mailto with query parameters (e.g., ?subject=Hi) strips params +- ✅ Email validation: valid formats accepted +- ✅ Email validation: invalid formats rejected (no @, no domain, too long, etc.) + +--- + +### 3. Domain Verification Service + +**File**: `src/gondulf/domain_verification.py` + +**Purpose**: Orchestrate two-factor domain verification (DNS TXT + Email via rel="me"). + +**Public Interface**: + +```python +from typing import Tuple, Optional +from .dns import DNSService +from .html_fetcher import HTMLFetcherService +from .relme import RelMeDiscoveryService +from .email import EmailService +from .storage import CodeStorage +from .database.connection import DatabaseConnection +import secrets + +class DomainVerificationService: + """ + Two-factor domain verification service. + + Verifies domain ownership through: + 1. DNS TXT record verification (_gondulf.{domain} = verified) + 2. Email verification via rel="me" discovery + """ + + def __init__( + self, + dns_service: DNSService, + html_fetcher: HTMLFetcherService, + relme_discovery: RelMeDiscoveryService, + email_service: EmailService, + code_storage: CodeStorage, + database: DatabaseConnection, + code_ttl: int = 900 # 15 minutes + ): + """ + Initialize domain verification service. 
+ + Args: + dns_service: DNS service for TXT record verification + html_fetcher: HTML fetcher service + relme_discovery: rel="me" email discovery service + email_service: Email service for sending codes + code_storage: In-memory storage for verification codes + database: Database connection for storing verified domains + code_ttl: Verification code TTL in seconds (default: 900 = 15 min) + """ + + def start_verification(self, domain: str) -> Tuple[bool, Optional[str], Optional[str]]: + """ + Start domain verification process. + + Steps: + 1. Verify DNS TXT record exists + 2. Fetch user's homepage + 3. Discover email from rel="me" link + 4. Generate and send verification code + + Args: + domain: Domain to verify (e.g., "example.com") + + Returns: + Tuple of (success, discovered_email_masked, error_message) + - success: True if code sent, False if verification cannot start + - discovered_email_masked: Email with partial masking (e.g., "u***@example.com") + - error_message: Error description if success=False, None otherwise + """ + + def verify_code(self, email: str, submitted_code: str) -> Tuple[bool, Optional[str], Optional[str]]: + """ + Verify submitted code. + + Args: + email: Email address (discovered from rel="me") + submitted_code: 6-digit code entered by user + + Returns: + Tuple of (success, domain, error_message) + - success: True if code valid, False otherwise + - domain: User's verified domain if success=True + - error_message: Error description if success=False + """ + + def is_domain_verified(self, domain: str) -> bool: + """ + Check if domain is already verified (cached in database). 
+ + Args: + domain: Domain to check + + Returns: + True if domain previously verified, False otherwise + """ +``` + +**Implementation Details**: + +```python +def start_verification(self, domain: str) -> Tuple[bool, Optional[str], Optional[str]]: + """Start domain verification process.""" + logger.info(f"Starting domain verification: {domain}") + + # Step 1: Verify DNS TXT record (first factor) + logger.debug(f"Verifying DNS TXT record for {domain}") + dns_verified = self.dns_service.verify_txt_record(domain, "verified") + + if not dns_verified: + error = ( + f"DNS verification failed. TXT record not found for _gondulf.{domain}. " + f"Please add: Type=TXT, Name=_gondulf.{domain}, Value=verified" + ) + logger.warning(f"DNS verification failed: {domain}") + return False, None, error + + logger.info(f"DNS TXT record verified: {domain}") + + # Step 2: Fetch site homepage + logger.debug(f"Fetching homepage for {domain}") + html = self.html_fetcher.fetch_site(domain) + + if html is None: + error = ( + f"Could not fetch site at https://{domain}. " + f"Please ensure site is accessible via HTTPS with valid SSL certificate." + ) + logger.warning(f"Site fetch failed: {domain}") + return False, None, error + + logger.info(f"Successfully fetched homepage: {domain}") + + # Step 3: Discover email from rel="me" (second factor discovery) + logger.debug(f"Discovering email via rel='me' for {domain}") + email = self.relme_discovery.discover_email(html) + + if email is None: + error = ( + 'No rel="me" mailto: link found on homepage. ' + f'Please add to https://{domain}: ' + '' + ) + logger.warning(f"rel='me' discovery failed: {domain}") + return False, None, error + + logger.info(f"Email discovered via rel='me' for {domain}: {email[:3]}***") + + # Step 4: Check rate limiting + if self._is_rate_limited(domain): + error = ( + f"Rate limit exceeded for {domain}. " + f"Please wait before requesting another verification code." 
+        )
+        logger.warning(f"Rate limit exceeded: {domain}")
+        # Return the masked address, as the return contract promises
+        return False, self._mask_email(email), error
+
+    # Step 5: Generate verification code
+    code = self._generate_code()
+
+    # Step 6: Store code with metadata
+    self.code_storage.store(email, code, ttl=self.code_ttl)
+
+    # Store metadata for rate limiting and domain association
+    self._store_code_metadata(email, domain)
+
+    logger.debug(f"Verification code generated and stored for {email[:3]}***")
+
+    # Step 7: Send verification email (second factor verification)
+    logger.debug(f"Sending verification email to {email[:3]}***")
+    email_sent = self.email_service.send_verification_email(email, code)
+
+    # Mask email for display: u***@example.com
+    email_masked = self._mask_email(email)
+
+    if not email_sent:
+        # Clean up stored code if email fails
+        self.code_storage.delete(email)
+        error = (
+            f"Failed to send verification code to {email_masked}. "
+            f"Please check email address in rel='me' link and try again."
+        )
+        logger.error(f"Email send failed: {email[:3]}***")
+        return False, email_masked, error
+
+    logger.info(f"Verification code sent successfully to {email[:3]}***")
+
+    return True, email_masked, None
+
+def verify_code(self, email: str, submitted_code: str) -> Tuple[bool, Optional[str], Optional[str]]:
+    """Verify submitted code."""
+    logger.info(f"Verifying code for {email[:3]}***")
+
+    # Retrieve stored code
+    stored_code = self.code_storage.get(email)
+
+    if stored_code is None:
+        logger.warning(f"No verification code found for {email[:3]}***")
+        return False, None, "No verification code found. Please request a new code."
+
+    # Get code metadata
+    metadata = self._get_code_metadata(email)
+    if metadata is None:
+        logger.error(f"Code found but metadata missing for {email[:3]}***")
+        return False, None, "Verification error. Please request a new code."
+
+    domain = metadata['domain']
+    attempts = metadata.get('attempts', 0)
+
+    # Check attempt limit (prevent brute force)
+    if attempts >= 3:
+        logger.warning(f"Too many attempts for {email[:3]}***")
+        self.code_storage.delete(email)
+        self._delete_code_metadata(email)
+        return False, None, "Too many attempts. Please request a new code."
+
+    # Increment attempt counter
+    self._increment_attempts(email)
+
+    # Verify code using constant-time comparison (SECURITY: prevent timing attacks)
+    if not secrets.compare_digest(submitted_code, stored_code):
+        logger.warning(f"Invalid code submitted for {email[:3]}***")
+        return False, None, f"Invalid code. {3 - attempts - 1} attempts remaining."
+
+    # Code is valid - clean up and mark domain as verified
+    logger.info(f"Code verified successfully for {domain}")
+
+    self.code_storage.delete(email)
+    self._delete_code_metadata(email)
+
+    # Store verified domain in database
+    self._store_verified_domain(domain)
+
+    return True, domain, None
+
+def is_domain_verified(self, domain: str) -> bool:
+    """Check if domain already verified."""
+    with self.database.get_connection() as conn:
+        result = conn.execute(
+            "SELECT verified FROM domains WHERE domain = ?",
+            (domain,)
+        ).fetchone()
+
+    if result and result['verified']:
+        logger.debug(f"Domain already verified: {domain}")
+        return True
+
+    return False
+
+def _generate_code(self) -> str:
+    """Generate 6-digit verification code."""
+    return ''.join(secrets.choice('0123456789') for _ in range(6))
+
+def _mask_email(self, email: str) -> str:
+    """Mask email for display: u***@example.com"""
+    local, domain = email.split('@')
+    # Show only the first character of the local part; slicing avoids an
+    # IndexError if the local part is empty
+    return f"{local[:1]}***@{domain}"
+
+def _is_rate_limited(self, domain: str) -> bool:
+    """
+    Check if domain is rate limited.
+
+    Rate limit: Max 3 codes per domain per hour.
+ """ + # TODO: Implement rate limiting using code metadata + # For Phase 2, we'll implement simple in-memory tracking + # Future: Use Redis for distributed rate limiting + return False # Placeholder - implement in actual code + +def _store_code_metadata(self, email: str, domain: str) -> None: + """Store code metadata for rate limiting and domain association.""" + # TODO: Implement metadata storage + # Store: email -> {domain, created_at, attempts} + pass + +def _get_code_metadata(self, email: str) -> Optional[dict]: + """Retrieve code metadata.""" + # TODO: Implement metadata retrieval + # Return: {domain, created_at, attempts} + return {'domain': 'example.com', 'attempts': 0} # Placeholder + +def _delete_code_metadata(self, email: str) -> None: + """Delete code metadata.""" + # TODO: Implement metadata deletion + pass + +def _increment_attempts(self, email: str) -> None: + """Increment attempt counter for email.""" + # TODO: Implement attempt increment + pass + +def _store_verified_domain(self, domain: str) -> None: + """Store verified domain in database.""" + from datetime import datetime + + with self.database.get_connection() as conn: + conn.execute( + """ + INSERT OR REPLACE INTO domains (domain, verification_method, verified, verified_at, last_dns_check) + VALUES (?, ?, ?, ?, ?) 
+ """, + (domain, 'two_factor', True, datetime.utcnow(), datetime.utcnow()) + ) + conn.commit() + + logger.info(f"Domain verification stored in database: {domain}") +``` + +**Dependencies**: +- All Phase 1 services (DNS, Email, Storage, Database) +- HTML fetcher service (Phase 2) +- rel="me" discovery service (Phase 2) +- Python standard library: secrets, datetime + +**Error Handling**: +- DNS verification failure: Return error with setup instructions +- Site fetch failure: Return error with troubleshooting steps +- rel="me" discovery failure: Return error with HTML example +- Email send failure: Return error, clean up stored code +- Code not found: Return error, suggest requesting new code +- Code expired: Handled by CodeStorage TTL +- Too many attempts: Return error, invalidate code +- Invalid code: Return error with remaining attempts +- Rate limit exceeded: Return error, suggest waiting + +**Security Considerations**: +- Two-factor verification: Both DNS and email required +- Constant-time code comparison: Prevent timing attacks (secrets.compare_digest) +- Rate limiting: Max 3 codes per domain per hour (prevents abuse) +- Attempt limiting: Max 3 code submission attempts (prevents brute force) +- Single-use codes: Deleted after successful verification +- Email masking in logs: Only log partial email (privacy) +- No email storage: Email used only during verification, never persisted + +**Testing Requirements**: +- ✅ Full verification flow: DNS → rel="me" → email → code verification +- ✅ DNS verification failure blocks flow +- ✅ Site fetch failure blocks flow +- ✅ rel="me" discovery failure blocks flow +- ✅ Email send failure cleans up stored code +- ✅ Code verification success stores domain in database +- ✅ Code verification failure decrements remaining attempts +- ✅ Too many attempts invalidates code +- ✅ Invalid code returns error with attempts remaining +- ✅ Code expiration handled by storage layer +- ✅ Rate limiting prevents excessive code requests +- ✅ 
Already verified domain check works +- ✅ Email masking works correctly + +--- + +### 4. Domain Verification Endpoints + +**File**: `src/gondulf/routers/verification.py` + +**Purpose**: HTTP API endpoints for user interaction during verification flow. + +**Public Interface**: + +```python +from fastapi import APIRouter, HTTPException, Depends +from pydantic import BaseModel, Field +from typing import Optional + +router = APIRouter(prefix="/api/verify", tags=["verification"]) + +# Request/Response Models +class VerificationStartRequest(BaseModel): + """Request to start domain verification.""" + domain: str = Field( + ..., + min_length=3, + max_length=253, + description="Domain to verify (e.g., 'example.com')" + ) + +class VerificationStartResponse(BaseModel): + """Response from starting verification.""" + success: bool + email_masked: Optional[str] = Field(None, description="Partially masked email (e.g., 'u***@example.com')") + error: Optional[str] = Field(None, description="Error message if success=False") + +class VerificationCodeRequest(BaseModel): + """Request to verify code.""" + email: str = Field(..., description="Email address discovered from rel='me'") + code: str = Field(..., min_length=6, max_length=6, pattern="^[0-9]{6}$", description="6-digit verification code") + +class VerificationCodeResponse(BaseModel): + """Response from code verification.""" + success: bool + domain: Optional[str] = Field(None, description="Verified domain if success=True") + error: Optional[str] = Field(None, description="Error message if success=False") + +# Endpoints +@router.post("/start", response_model=VerificationStartResponse) +async def start_verification( + request: VerificationStartRequest, + domain_verification: DomainVerificationService = Depends(get_domain_verification_service) +) -> VerificationStartResponse: + """ + Start domain verification process. + + Steps: + 1. Verify DNS TXT record exists + 2. Discover email from rel="me" link + 3. 
Send verification code to discovered email + + Returns masked email on success, error message on failure. + """ + +@router.post("/code", response_model=VerificationCodeResponse) +async def verify_code( + request: VerificationCodeRequest, + domain_verification: DomainVerificationService = Depends(get_domain_verification_service) +) -> VerificationCodeResponse: + """ + Verify submitted code. + + Returns verified domain on success, error message on failure. + """ +``` + +**Implementation Details**: + +```python +@router.post("/start", response_model=VerificationStartResponse) +async def start_verification( + request: VerificationStartRequest, + domain_verification: DomainVerificationService = Depends(get_domain_verification_service) +) -> VerificationStartResponse: + """Start domain verification process.""" + logger.info(f"Verification start request: {request.domain}") + + # Normalize domain (lowercase, remove trailing slash) + domain = request.domain.lower().rstrip('/') + + # Remove protocol if present + if domain.startswith('http://') or domain.startswith('https://'): + domain = domain.split('://', 1)[1] + + # Remove path if present + if '/' in domain: + domain = domain.split('/')[0] + + # Validate domain format (basic validation) + if not domain or '.' not in domain: + logger.warning(f"Invalid domain format: {request.domain}") + return VerificationStartResponse( + success=False, + email_masked=None, + error="Invalid domain format. Please provide a valid domain (e.g., 'example.com')." 
+ ) + + # Start verification + success, email_masked, error = domain_verification.start_verification(domain) + + if not success: + logger.warning(f"Verification start failed for {domain}: {error}") + return VerificationStartResponse( + success=False, + email_masked=email_masked, + error=error + ) + + logger.info(f"Verification started successfully for {domain}") + return VerificationStartResponse( + success=True, + email_masked=email_masked, + error=None + ) + +@router.post("/code", response_model=VerificationCodeResponse) +async def verify_code( + request: VerificationCodeRequest, + domain_verification: DomainVerificationService = Depends(get_domain_verification_service) +) -> VerificationCodeResponse: + """Verify submitted code.""" + logger.info(f"Code verification request for email: {request.email[:3]}***") + + # Verify code + success, domain, error = domain_verification.verify_code(request.email, request.code) + + if not success: + logger.warning(f"Code verification failed for {request.email[:3]}***: {error}") + return VerificationCodeResponse( + success=False, + domain=None, + error=error + ) + + logger.info(f"Code verified successfully for domain: {domain}") + return VerificationCodeResponse( + success=True, + domain=domain, + error=None + ) +``` + +**Dependencies**: +- FastAPI router and dependency injection +- Pydantic models for request/response validation +- Domain verification service (injected via Depends) +- Phase 1 logging configuration + +**Error Handling**: +- Invalid domain format: Return 200 with success=False, descriptive error +- Pydantic validation errors: Automatic 422 response with validation details +- Service errors: Propagated via success=False in response +- All errors logged at WARNING level +- No 500 errors expected (all errors handled gracefully) + +**Security Considerations**: +- Input validation: Pydantic models enforce constraints +- Domain normalization: Prevent URL injection +- No authentication required: Public endpoints 
(verification is the authentication) +- Rate limiting: Handled by DomainVerificationService (not endpoint level) +- Email not validated at endpoint level: Service handles validation + +**Testing Requirements**: +- ✅ POST /api/verify/start with valid domain returns success +- ✅ POST /api/verify/start with invalid domain format returns error +- ✅ POST /api/verify/start with DNS failure returns error +- ✅ POST /api/verify/start with rel="me" failure returns error +- ✅ POST /api/verify/start with email send failure returns error +- ✅ POST /api/verify/code with valid code returns domain +- ✅ POST /api/verify/code with invalid code returns error +- ✅ POST /api/verify/code with expired code returns error +- ✅ POST /api/verify/code with missing code returns error +- ✅ POST /api/verify/code with too many attempts returns error +- ✅ Pydantic validation errors return 422 + +--- + +### 5. Authorization Endpoint + +**File**: `src/gondulf/routers/authorization.py` + +**Purpose**: Implement IndieAuth authorization endpoint (`/authorize`) per W3C spec. + +**Public Interface**: + +```python +from fastapi import APIRouter, Request, HTTPException, Depends +from fastapi.responses import RedirectResponse, HTMLResponse +from pydantic import BaseModel, HttpUrl, Field +from typing import Optional, Literal + +router = APIRouter(tags=["indieauth"]) + +# Request Models +class AuthorizeRequest(BaseModel): + """ + IndieAuth authorization request parameters. 
+ + Per W3C IndieAuth specification (Section 5.1): + https://www.w3.org/TR/indieauth/#authorization-request + """ + me: HttpUrl = Field(..., description="User's profile URL (domain identity)") + client_id: HttpUrl = Field(..., description="Client application URL") + redirect_uri: HttpUrl = Field(..., description="Where to redirect after authorization") + state: str = Field(..., min_length=1, max_length=512, description="CSRF protection token") + response_type: Literal["code"] = Field(..., description="Must be 'code' for authorization code flow") + scope: Optional[str] = Field(None, description="Requested scopes (ignored in v1.0.0)") + code_challenge: Optional[str] = Field(None, description="PKCE challenge (not supported in v1.0.0)") + code_challenge_method: Optional[str] = Field(None, description="PKCE method (not supported in v1.0.0)") + +# Endpoints +@router.get("/authorize") +async def authorize( + request: Request, + me: str, + client_id: str, + redirect_uri: str, + state: str, + response_type: str, + scope: Optional[str] = None, + code_challenge: Optional[str] = None, + code_challenge_method: Optional[str] = None, + domain_verification: DomainVerificationService = Depends(get_domain_verification_service) +) -> HTMLResponse: + """ + IndieAuth authorization endpoint. + + Per W3C IndieAuth specification: + https://www.w3.org/TR/indieauth/#authorization-request + + Flow: + 1. Validate all parameters + 2. Check if domain already verified (skip verification if cached) + 3. If not verified, initiate two-factor verification flow + 4. Display consent screen with client info + 5. On approval, generate authorization code + 6. Redirect to client with code + state + """ +``` + +**Implementation Details** (High-Level - Full implementation too long for this doc): + +```python +@router.get("/authorize") +async def authorize( + request: Request, + me: str, + client_id: str, + redirect_uri: str, + state: str, + response_type: str, + # ... 
other parameters +) -> HTMLResponse: + """IndieAuth authorization endpoint.""" + + # STEP 1: Validate response_type + if response_type != "code": + # Return error (redirect if possible) + return _error_response( + redirect_uri=redirect_uri, + state=state, + error="unsupported_response_type", + description="Only response_type=code is supported" + ) + + # STEP 2: Validate and normalize 'me' parameter + me_normalized = _validate_and_normalize_me(me) + if me_normalized is None: + return _error_response( + redirect_uri=redirect_uri, + state=state, + error="invalid_request", + description="Invalid 'me' parameter format" + ) + + # STEP 3: Validate client_id + client_valid = _validate_client_id(client_id) + if not client_valid: + return _error_response( + redirect_uri=redirect_uri, + state=state, + error="invalid_client", + description="Invalid client_id" + ) + + # STEP 4: Validate redirect_uri + redirect_valid = _validate_redirect_uri(redirect_uri, client_id) + if not redirect_valid: + # SECURITY: Cannot redirect to invalid URI - display error page + return _error_page("Invalid redirect_uri") + + # STEP 5: Check if domain already verified + domain = _extract_domain_from_me(me_normalized) + + if domain_verification.is_domain_verified(domain): + # Skip verification, go directly to consent + logger.info(f"Domain already verified: {domain}") + return await _show_consent_screen( + me=me_normalized, + client_id=client_id, + redirect_uri=redirect_uri, + state=state + ) + + # STEP 6: Domain not verified - start verification flow + logger.info(f"Starting verification for new domain: {domain}") + + success, email_masked, error = domain_verification.start_verification(domain) + + if not success: + # Verification failed - show error with instructions + return _verification_error_page(domain, error) + + # STEP 7: Show code entry form + return _code_entry_page( + domain=domain, + email_masked=email_masked, + me=me_normalized, + client_id=client_id, + redirect_uri=redirect_uri, + 
state=state + ) + +# Additional endpoints for verification flow +@router.post("/authorize/verify-code") +async def verify_code_and_consent( + request: Request, + email: str, + code: str, + me: str, + client_id: str, + redirect_uri: str, + state: str, + domain_verification: DomainVerificationService = Depends(get_domain_verification_service) +) -> HTMLResponse: + """ + Verify code and show consent screen. + + Called when user submits verification code during authorization flow. + """ + # Verify code + success, domain, error = domain_verification.verify_code(email, code) + + if not success: + # Code invalid - show error, allow retry + return _code_entry_page_with_error( + domain=_extract_domain_from_me(me), + email_masked=_mask_email(email), + error=error, + me=me, + client_id=client_id, + redirect_uri=redirect_uri, + state=state + ) + + # Code valid - show consent screen + return await _show_consent_screen( + me=me, + client_id=client_id, + redirect_uri=redirect_uri, + state=state + ) + +@router.post("/authorize/consent") +async def handle_consent( + request: Request, + action: Literal["approve", "deny"], + me: str, + client_id: str, + redirect_uri: str, + state: str, + code_storage: CodeStorage = Depends(get_code_storage) +) -> RedirectResponse: + """ + Handle user consent decision. + + Called when user approves or denies authorization. 
+ """ + if action == "deny": + # User denied - redirect with error (description is form-encoded so the URL contains no raw spaces) + return RedirectResponse( + url=f"{redirect_uri}?error=access_denied&error_description=User+denied+authorization&state={state}", + status_code=302 + ) + + # User approved - generate authorization code + auth_code = _generate_authorization_code() + + # Store code in memory with metadata + code_storage.store(auth_code, { + 'me': me, + 'client_id': client_id, + 'redirect_uri': redirect_uri, + 'state': state, + 'created_at': datetime.utcnow() + }, ttl=600) # 10 minutes + + logger.info(f"Authorization code generated for {me} / {client_id}") + + # Redirect to client with code + state + return RedirectResponse( + url=f"{redirect_uri}?code={auth_code}&state={state}", + status_code=302 + ) + +# Helper functions (implementations not shown for brevity) +def _validate_and_normalize_me(me: str) -> Optional[str]: + """Validate and normalize 'me' parameter per IndieAuth spec.""" + pass + +def _validate_client_id(client_id: str) -> bool: + """Validate client_id is a valid URL.""" + pass + +def _validate_redirect_uri(redirect_uri: str, client_id: str) -> bool: + """Validate redirect_uri against client_id.""" + pass + +def _extract_domain_from_me(me: str) -> str: + """Extract domain from 'me' URL.""" + pass + +async def _show_consent_screen(...) -> HTMLResponse: + """Render consent screen HTML.""" + pass + +def _code_entry_page(...) -> HTMLResponse: + """Render code entry page HTML.""" + pass + +def _error_response(...) 
-> RedirectResponse: + """Generate OAuth 2.0 error redirect.""" + pass + +def _generate_authorization_code() -> str: + """Generate cryptographically secure authorization code.""" + return secrets.token_urlsafe(32) # 256 bits +``` + +**Dependencies**: +- FastAPI router, Request, Response types +- Pydantic models for validation +- Domain verification service (Phase 2) +- Code storage (Phase 1) +- HTML templates (new - Jinja2) +- Python standard library: secrets, datetime + +**Error Handling**: +- Invalid response_type: Redirect with `unsupported_response_type` error +- Invalid me parameter: Redirect with `invalid_request` error +- Invalid client_id: Redirect with `invalid_client` error +- Invalid redirect_uri: Display error page (cannot redirect) +- DNS verification failure: Display error page with setup instructions +- rel="me" discovery failure: Display error page with HTML example +- Email send failure: Display error page with troubleshooting +- Code verification failure: Display code entry page with error, allow retry +- User denies consent: Redirect with `access_denied` error +- All errors follow OAuth 2.0 error response format + +**Security Considerations**: +- HTTPS only: Enforced by middleware (production) +- redirect_uri validation: Prevent open redirect attacks +- State parameter: Passed through, client validates (CSRF protection) +- Authorization code: Cryptographically secure (256 bits) +- Code single-use: Enforced by token endpoint (Phase 3) +- Code expiration: 10 minutes TTL +- Domain verification: Two-factor required before code generation +- No client secrets: All clients are public per IndieAuth spec + +**Testing Requirements**: +- ✅ GET /authorize with valid parameters shows verification or consent +- ✅ GET /authorize with invalid response_type returns error +- ✅ GET /authorize with invalid me parameter returns error +- ✅ GET /authorize with invalid client_id returns error +- ✅ GET /authorize with invalid redirect_uri shows error page +- ✅ GET 
/authorize with already verified domain skips to consent +- ✅ POST /authorize/verify-code with valid code shows consent +- ✅ POST /authorize/verify-code with invalid code shows error +- ✅ POST /authorize/consent with action=approve generates code and redirects +- ✅ POST /authorize/consent with action=deny redirects with access_denied +- ✅ Authorization code stored in memory with correct metadata +- ✅ Authorization code expires after 10 minutes +- ✅ State parameter passed through all steps + +--- + +## Data Flow + +### Complete Two-Factor Verification Flow + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ User / Client Application │ +└───────────────────────────────┬─────────────────────────────────┘ + │ + │ GET /authorize?me=example.com&... + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Authorization Endpoint │ +│ ┌──────────────────────────────────────────────────────────┐ │ +│ │ 1. Validate parameters (me, client_id, redirect_uri, │ │ +│ │ state, response_type) │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +│ ┌──────────────────────────▼───────────────────────────────┐ │ +│ │ 2. Check if domain already verified in database │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +│ ┌────────┴────────┐ │ +│ │ │ │ +│ │ Verified? │ │ +│ │ │ │ +│ ┌─────────┴─────No─────────┴─────────┐ │ +│ │ │ │ +│ │ YES │ NO │ +│ │ │ │ +│ ▼ ▼ │ +│ ┌──────────────────┐ ┌──────────────────────────┐ │ +│ │ Skip to Consent │ │ Start Verification Flow │ │ +│ │ (Step 9) │ │ (Step 3) │ │ +│ └──────────────────┘ └─────────┬────────────────┘ │ +│ │ │ +└───────────────────────────────────────────────┼──────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Domain Verification Service (Two-Factor) │ +│ ┌──────────────────────────────────────────────────────────┐ │ +│ │ 3. 
Verify DNS TXT Record (First Factor) │ │ +│ │ Query: _gondulf.example.com TXT │ │ +│ │ Expected: "verified" │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +│ ┌────────┴────────┐ │ +│ │ TXT found? │ │ +│ ┌─────────┴─────No─────────┴─────────┐ │ +│ │ YES │ NO │ +│ ▼ ▼ │ +│ ┌──────────────────┐ ┌──────────────────────────┐ │ +│ │ Continue to │ │ FAIL: Display error │ │ +│ │ Step 4 │ │ "Add DNS TXT record" │ │ +│ └─────────┬────────┘ └──────────────────────────┘ │ +│ │ │ +│ ┌─────────▼────────────────────────────────────────────────┐ │ +│ │ 4. Fetch User's Homepage via HTTPS │ │ +│ │ URL: https://example.com │ │ +│ │ Timeout: 10s, Max size: 5MB, Verify SSL │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +│ ┌────────┴────────┐ │ +│ │ Fetch success? │ │ +│ ┌─────────┴─────No─────────┴─────────┐ │ +│ │ YES │ NO │ +│ ▼ ▼ │ +│ ┌──────────────────┐ ┌──────────────────────────┐ │ +│ │ Continue to │ │ FAIL: Display error │ │ +│ │ Step 5 │ │ "Site unreachable" │ │ +│ └─────────┬────────┘ └──────────────────────────┘ │ +│ │ │ +│ ┌─────────▼────────────────────────────────────────────────┐ │ +│ │ 5. Discover Email via rel="me" (Second Factor Discovery)│ │ +│ │ Parse HTML for: │ │ +│ │ Extract and validate email format │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +│ ┌────────┴────────┐ │ +│ │ Email found? │ │ +│ ┌─────────┴─────No─────────┴─────────┐ │ +│ │ YES │ NO │ +│ ▼ ▼ │ +│ ┌──────────────────┐ ┌──────────────────────────┐ │ +│ │ Continue to │ │ FAIL: Display error │ │ +│ │ Step 6 │ │ "Add rel='me' link" │ │ +│ └─────────┬────────┘ └──────────────────────────┘ │ +│ │ │ +│ ┌─────────▼────────────────────────────────────────────────┐ │ +│ │ 6. 
Generate and Send Verification Code │ │ +│ │ (Second Factor Verification) │ │ +│ │ - Generate 6-digit code (cryptographically secure) │ │ +│ │ - Store code in memory (TTL: 15 minutes) │ │ +│ │ - Send code to discovered email via SMTP │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +└─────────────────────────────┼────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Display Code Entry Form │ +│ ┌──────────────────────────────────────────────────────────┐ │ +│ │ "Verification code sent to u***@example.com" │ │ +│ │ [Enter 6-digit code: ______] │ │ +│ │ [Submit] │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +└─────────────────────────────┼────────────────────────────────────┘ + │ + │ POST /authorize/verify-code + │ {email, code, me, client_id, ...} + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Domain Verification Service (Continued) │ +│ ┌──────────────────────────────────────────────────────────┐ │ +│ │ 7. Verify Submitted Code │ │ +│ │ - Retrieve stored code from memory │ │ +│ │ - Check expiration (15 min TTL) │ │ +│ │ - Check attempts (max 3) │ │ +│ │ - Constant-time compare submitted vs stored │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +│ ┌────────┴────────┐ │ +│ │ Code valid? │ │ +│ ┌─────────┴─────No─────────┴─────────┐ │ +│ │ YES │ NO │ +│ ▼ ▼ │ +│ ┌──────────────────┐ ┌──────────────────────────┐ │ +│ │ Store verified │ │ Show error, allow retry │ │ +│ │ domain in DB │ │ (if attempts remaining) │ │ +│ └─────────┬────────┘ └──────────────────────────┘ │ +│ │ │ +│ ┌─────────▼────────────────────────────────────────────────┐ │ +│ │ 8. 
Domain Verified (Two-Factor Complete) │ │ +│ │ - DNS TXT verified ✓ │ │ +│ │ - Email verified ✓ │ │ +│ │ - Store in database: verification_method='two_factor' │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +└─────────────────────────────┼────────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Display Consent Screen │ +│ ┌──────────────────────────────────────────────────────────┐ │ +│ │ "Sign in to [App Name] as example.com" │ │ +│ │ │ │ +│ │ Client: https://client.example.com │ │ +│ │ Redirect: https://client.example.com/callback │ │ +│ │ │ │ +│ │ [Approve] [Deny] │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +└─────────────────────────────┼────────────────────────────────────┘ + │ + │ POST /authorize/consent + │ {action: "approve", ...} + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Authorization Endpoint (Continued) │ +│ ┌──────────────────────────────────────────────────────────┐ │ +│ │ 9. Generate Authorization Code │ │ +│ │ - Generate cryptographically secure code (256 bits) │ │ +│ │ - Store in memory with metadata: │ │ +│ │ • me (user's domain) │ │ +│ │ • client_id │ │ +│ │ • redirect_uri │ │ +│ │ • state │ │ +│ │ • TTL: 10 minutes │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +│ │ │ +│ ┌──────────────────────────▼───────────────────────────────┐ │ +│ │ 10. 
Redirect to Client with Code │ │ +│ │ {redirect_uri}?code={code}&state={state} │ │ +│ └──────────────────────────┬───────────────────────────────┘ │ +└─────────────────────────────┼────────────────────────────────────┘ + │ + │ HTTP 302 Redirect + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Client Application │ +│ • Receives authorization code │ +│ • Validates state parameter (CSRF protection) │ +│ • Exchanges code for token (Phase 3: Token Endpoint) │ +└─────────────────────────────────────────────────────────────────┘ +``` + +### State Transitions + +**Domain Verification States**: +1. **Unverified**: Domain never seen before +2. **DNS Verified**: TXT record confirmed +3. **Email Discovered**: rel="me" link found +4. **Code Sent**: Verification code sent to email +5. **Fully Verified**: Code verified, stored in database +6. **Cached**: Domain verification cached (skip steps 1-5 on future auth) + +**Authorization Flow States**: +1. **Request Received**: Parameters validated +2. **Domain Check**: Checking if domain verified +3. **Verification In Progress**: User entering code +4. **Consent Pending**: User viewing consent screen +5. **Approved**: User approved, code generated +6. **Denied**: User denied, error redirect +7. 
**Complete**: Redirected to client with code + +### Error Paths + +**DNS Verification Failure**: +``` +/authorize → Validate params → Check DNS TXT → [NOT FOUND] + → Display error page with instructions + → User adds TXT record, clicks "Retry" + → Loop back to Check DNS TXT +``` + +**rel="me" Discovery Failure**: +``` +/authorize → DNS verified → Fetch site → Discover email → [NOT FOUND] + → Display error page with HTML example + → User adds rel="me" link, clicks "Retry" + → Loop back to Fetch site +``` + +**Email Send Failure**: +``` +/authorize → DNS + rel="me" OK → Send email → [SMTP ERROR] + → Display error page with troubleshooting + → User checks SMTP config, clicks "Retry" + → Loop back to Send email +``` + +**Invalid Code**: +``` +/authorize/verify-code → Verify code → [INVALID] + → Display code entry form with error + → "Invalid code. 2 attempts remaining." + → User enters code again + → Loop back to Verify code +``` + +**Rate Limit Exceeded**: +``` +/authorize → Start verification → Check rate limit → [EXCEEDED] + → Display error: "Too many attempts, wait 1 hour" + → User waits, tries again later +``` + +## API Endpoints + +### POST /api/verify/start + +**Purpose**: Start domain verification process. + +**Request**: +```json +{ + "domain": "example.com" +} +``` + +**Success Response** (200 OK): +```json +{ + "success": true, + "email_masked": "u***@example.com", + "error": null +} +``` + +**Error Response** (200 OK with success=false): +```json +{ + "success": false, + "email_masked": null, + "error": "DNS TXT record not found for _gondulf.example.com. 
Please add: Type=TXT, Name=_gondulf.example.com, Value=verified" +} +``` + +**Validation Errors** (422 Unprocessable Entity): +```json +{ + "detail": [ + { + "loc": ["body", "domain"], + "msg": "field required", + "type": "value_error.missing" + } + ] +} +``` + +**Rate Limiting**: +- Max 3 requests per domain per hour +- Enforced by DomainVerificationService + +**Authentication**: None required (public endpoint) + +--- + +### POST /api/verify/code + +**Purpose**: Verify submitted 6-digit code. + +**Request**: +```json +{ + "email": "user@example.com", + "code": "123456" +} +``` + +**Success Response** (200 OK): +```json +{ + "success": true, + "domain": "example.com", + "error": null +} +``` + +**Error Response** (200 OK with success=false): +```json +{ + "success": false, + "domain": null, + "error": "Invalid code. 2 attempts remaining." +} +``` + +**Validation Errors** (422 Unprocessable Entity): +```json +{ + "detail": [ + { + "loc": ["body", "code"], + "msg": "string does not match regex \"^[0-9]{6}$\"", + "type": "value_error.str.regex" + } + ] +} +``` + +**Rate Limiting**: +- Max 3 attempts per email per code +- Enforced by code verification logic + +**Authentication**: None required (code is the authentication) + +--- + +### GET /authorize + +**Purpose**: IndieAuth authorization endpoint. 
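
As a rough sketch of how a client might build a request to this endpoint, the query string can be assembled with `urllib.parse.urlencode` so that `state` and the URL-valued parameters are percent-encoded. The `build_authorize_url` helper and the `auth.example.com` host are hypothetical and not part of Gondulf; only the parameter names come from this document.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorize_url(endpoint: str, me: str, client_id: str,
                        redirect_uri: str, state: str) -> str:
    """Hypothetical client-side helper: build a GET /authorize URL."""
    params = {
        "me": me,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "response_type": "code",  # the only value this endpoint accepts
    }
    # urlencode percent-encodes every value, so URLs and state survive intact
    return f"{endpoint}?{urlencode(params)}"


url = build_authorize_url(
    "https://auth.example.com/authorize",  # assumed server location
    me="https://example.com",
    client_id="https://client.example.com",
    redirect_uri="https://client.example.com/callback",
    state="abc 123&x",  # deliberately awkward value to show the encoding
)
print(url)
```

Encoding matters most for `state`, which is an opaque client-chosen value and may contain characters such as `&` or spaces that would otherwise corrupt the query string.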
+ +**Query Parameters**: +- `me` (required): User's profile URL (e.g., "https://example.com") +- `client_id` (required): Client application URL +- `redirect_uri` (required): Where to redirect after authorization +- `state` (required): CSRF protection token +- `response_type` (required): Must be "code" +- `scope` (optional): Requested scopes (ignored in v1.0.0) +- `code_challenge` (optional): PKCE challenge (not supported in v1.0.0) +- `code_challenge_method` (optional): PKCE method (not supported in v1.0.0) + +**Success Response**: HTML page (verification form or consent screen) + +**Error Redirect** (302 Found): +``` +{redirect_uri}?error=invalid_request&error_description=Invalid+me+parameter&state={state} +``` + +**Error Codes** (OAuth 2.0 standard): +- `invalid_request`: Missing or invalid parameter +- `unauthorized_client`: Client not authorized +- `access_denied`: User denied authorization +- `unsupported_response_type`: response_type not "code" +- `server_error`: Internal server error + +**Error Page** (when redirect not possible): +```html + + +Invalid redirect_uri. Cannot redirect safely.
+ + +``` + +**Rate Limiting**: None at endpoint level (handled by verification service) + +**Authentication**: None initially (domain verification IS the authentication) + +--- + +### POST /authorize/verify-code + +**Purpose**: Verify code during authorization flow. + +**Form Data**: +- `email` (required): Email address from rel="me" +- `code` (required): 6-digit verification code +- `me` (required): User's profile URL +- `client_id` (required): Client application URL +- `redirect_uri` (required): Redirect URI +- `state` (required): State parameter + +**Success Response**: HTML page (consent screen) + +**Error Response**: HTML page (code entry form with error message) + +--- + +### POST /authorize/consent + +**Purpose**: Handle user consent decision. + +**Form Data**: +- `action` (required): "approve" or "deny" +- `me` (required): User's profile URL +- `client_id` (required): Client application URL +- `redirect_uri` (required): Redirect URI +- `state` (required): State parameter + +**Success Response (Approve)** (302 Found): +``` +{redirect_uri}?code={authorization_code}&state={state} +``` + +**Success Response (Deny)** (302 Found): +``` +{redirect_uri}?error=access_denied&error_description=User+denied+authorization&state={state} +``` + +## Data Models + +### Verified Domain (Database Table) + +**Table**: `domains` + +**Schema** (from Phase 1): +```sql +CREATE TABLE domains ( + domain TEXT PRIMARY KEY, + verification_method TEXT NOT NULL, -- 'two_factor' for v1.0.0 + verified BOOLEAN NOT NULL DEFAULT FALSE, + verified_at TIMESTAMP, + last_dns_check TIMESTAMP, + last_email_check TIMESTAMP +); +``` + +**Updated in Phase 2**: Change `verification_method` values from `'email'` / `'txt_record'` to `'two_factor'`. 
+ +**Migration**: `002_update_verification_method.sql`: +```sql +-- Update verification_method values to reflect two-factor requirement +UPDATE domains +SET verification_method = 'two_factor' +WHERE verification_method IN ('email', 'txt_record'); +``` + +**Indexes** (from Phase 1): +```sql +CREATE INDEX idx_domains_domain ON domains(domain); +CREATE INDEX idx_domains_verified ON domains(verified); +``` + +--- + +### Authorization Code (In-Memory) + +**Storage**: Phase 1 CodeStorage with metadata + +**Structure**: +```python +{ + "code": "abc123...", # 43-char base64url (32 bytes) + "me": "https://example.com", + "client_id": "https://client.example.com", + "redirect_uri": "https://client.example.com/callback", + "state": "client-provided-state", + "created_at": datetime, + "expires_at": datetime, # created_at + 10 minutes + "used": False # For Phase 3 token endpoint +} +``` + +**TTL**: 10 minutes (per W3C spec: "shortly after") + +**Storage Location**: Phase 1 CodeStorage service + +--- + +### Verification Code Metadata (In-Memory) + +**Storage**: Additional metadata alongside verification codes + +**Structure**: +```python +{ + "email": "user@example.com", + "domain": "example.com", + "attempts": 0, # Increment on each failed attempt + "created_at": datetime +} +``` + +**Purpose**: Track attempts and associate email with domain for rate limiting. + +**TTL**: Same as verification code (15 minutes) + +## Security Requirements + +### Input Validation + +**Domain Parameter**: +```python +def validate_domain(domain: str) -> Tuple[bool, Optional[str], Optional[str]]: + """ + Validate domain parameter. + + Returns: (is_valid, normalized_domain, error_message) + """ + # Remove protocol if present + if domain.startswith('http://') or domain.startswith('https://'): + domain = domain.split('://', 1)[1] + + # Remove path if present + if '/' in domain: + domain = domain.split('/')[0] + + # Lowercase + domain = domain.lower().strip() + + # Must contain at least one dot + if '.' 
not in domain: + return False, None, "Domain must contain at least one dot (e.g., example.com)" + + # Must not be empty + if not domain: + return False, None, "Domain cannot be empty" + + # Must not contain invalid characters + if any(c in domain for c in [' ', '@', ':', '?', '#']): + return False, None, "Domain contains invalid characters" + + # Length check + if len(domain) > 253: + return False, None, "Domain too long (max 253 characters)" + + return True, domain, None +``` + +**Email Parameter**: +```python +def validate_email(email: str) -> bool: + """ + Validate email format (RFC 5322 simplified). + + Used by rel="me" discovery service. + """ + import re + + email_regex = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' + + if not re.match(email_regex, email): + return False + + if len(email) > 254: # RFC 5321 maximum + return False + + if email.count('@') != 1: + return False + + local, domain = email.split('@') + if '.' not in domain: + return False + + return True +``` + +**URL Parameters** (me, client_id, redirect_uri): +```python +from typing import Optional, Tuple + +def validate_url(url: str, param_name: str) -> Tuple[bool, Optional[str]]: + """ + Validate URL parameter. 
+ + Returns: (is_valid, error_message) + """ + from urllib.parse import urlparse + + try: + parsed = urlparse(url) + except Exception: + return False, f"{param_name} must be a valid URL" + + # Must have scheme and netloc + if not parsed.scheme or not parsed.netloc: + return False, f"{param_name} must be a complete URL (e.g., https://example.com)" + + # Must be http or https + if parsed.scheme not in ['http', 'https']: + return False, f"{param_name} must use http or https" + + # No fragments for 'me' parameter + if param_name == "me" and parsed.fragment: + return False, "me parameter must not contain fragment" + + # No credentials + if parsed.username or parsed.password: + return False, f"{param_name} must not contain credentials" + + return True, None +``` + +--- + +### HTTPS Enforcement + +**Configuration**: +```python +# In production config +if not DEBUG: + # Enforce HTTPS + app.add_middleware(HTTPSRedirectMiddleware) + + # Reject HTTP redirect_uri (except localhost) + if redirect_uri.startswith('http://'): + parsed = urlparse(redirect_uri) + if parsed.hostname not in ['localhost', '127.0.0.1']: + return error_response("redirect_uri must use HTTPS in production") +``` + +**HTML Fetching**: +- HTTPS only (hardcoded `https://` in URL) +- SSL certificate verification enforced (`verify=True`, no option to disable) +- Reject sites with invalid certificates + +--- + +### HTML Parsing Security + +**BeautifulSoup Configuration**: +```python +# Use html.parser (Python standard library, safe for untrusted HTML) +soup = BeautifulSoup(html_content, 'html.parser') +``` + +**Why html.parser**: +- Part of Python standard library (no external C dependencies) +- Designed for untrusted HTML +- No script execution +- No external resource loading +- Handles malformed HTML gracefully + +**Size Limits**: +- Maximum response size: 5MB (configurable) +- Checked both in Content-Length header and actual content + +**Timeout**: +- HTTP request timeout: 10 seconds (configurable) +- 
Prevents hanging on slow sites + +--- + +### Protection Against Open Redirects + +**redirect_uri Validation**: +```python +def validate_redirect_uri(redirect_uri: str, client_id: str) -> Tuple[bool, Optional[str]]: + """ + Validate redirect_uri against client_id. + + Returns: (is_valid, message), where message is an error when is_valid is False + and an optional warning when is_valid is True + """ + from urllib.parse import urlparse + + redirect_parsed = urlparse(redirect_uri) + client_parsed = urlparse(client_id) + + # Must be HTTPS (except localhost) + if redirect_parsed.scheme != 'https': + if redirect_parsed.hostname not in ['localhost', '127.0.0.1']: + return False, "redirect_uri must use HTTPS" + + # Must have valid hostname + if not redirect_parsed.hostname: + return False, "redirect_uri must have valid hostname" + + # client_id must have a hostname too (urlparse returns None for malformed URLs) + if not client_parsed.hostname: + return False, "client_id must have valid hostname" + + redirect_domain = redirect_parsed.hostname.lower() + client_domain = client_parsed.hostname.lower() + + # Exact match: OK + if redirect_domain == client_domain: + return True, None + + # Subdomain of client: OK + if redirect_domain.endswith('.' + client_domain): + return True, None + + # Different domain: WARNING (display to user, but allow) + warning = ( + f"Warning: Redirect to different domain ({redirect_domain}) " + f"than client ({client_domain}). Ensure you trust this application." 
+ ) + return True, warning +``` + +**Display Warning to User**: +- If redirect_uri domain differs from client_id domain, show warning on consent screen +- User must explicitly approve redirect to different domain +- Prevents phishing via redirect URI manipulation + +--- + +### CSRF Protection + +**State Parameter**: +- Required in authorization request +- Stored with authorization code +- Passed through verification and consent steps +- Returned unchanged in redirect +- Client validates state matches original (client responsibility per OAuth 2.0) + +**Gondulf does NOT validate state** - This is intentional per OAuth 2.0: +- State is opaque to authorization server +- Client generates state, client validates state +- Gondulf only passes it through unchanged + +--- + +### Code Replay Prevention + +**Authorization Code**: +- Single-use enforcement (Phase 3 token endpoint marks as used) +- 10-minute expiration +- Bound to client_id, redirect_uri, me +- Stored in memory (Phase 1 CodeStorage) + +**Verification Code**: +- Single-use: Deleted after successful verification +- 15-minute expiration +- Max 3 attempts before invalidation +- Constant-time comparison (prevent timing attacks) + +## Testing Requirements + +### Unit Tests + +**HTML Fetcher Service** (9 tests): +- ✅ Successful HTTPS fetch returns content +- ✅ SSL verification failure returns None +- ✅ Timeout returns None +- ✅ HTTP error codes (404, 500) return None +- ✅ Redirects followed (up to max) +- ✅ Too many redirects returns None +- ✅ Content-Length exceeds limit returns None +- ✅ Actual content exceeds limit returns None +- ✅ Custom User-Agent sent + +**rel="me" Discovery Service** (12 tests): +- ✅ Discovery from `` tag +- ✅ Discovery from `` tag +- ✅ Multiple rel="me" links: first mailto selected +- ✅ Malformed HTML handled +- ✅ Missing rel="me" returns None +- ✅ Invalid email in link returns None +- ✅ Empty href returns None +- ✅ Non-mailto links ignored +- ✅ mailto with query params strips params +- ✅ 
Email validation: valid formats +- ✅ Email validation: invalid formats +- ✅ Exception during parsing returns None + +**Domain Verification Service** (15 tests): +- ✅ Full flow: DNS → rel="me" → email → code +- ✅ DNS failure blocks flow +- ✅ Site fetch failure blocks flow +- ✅ rel="me" failure blocks flow +- ✅ Email send failure cleans up code +- ✅ Code verification success stores domain +- ✅ Code verification failure decrements attempts +- ✅ Too many attempts invalidates code +- ✅ Invalid code returns error +- ✅ Code expiration handled +- ✅ Rate limiting works +- ✅ Already verified domain check +- ✅ Email masking correct +- ✅ Constant-time comparison used +- ✅ Metadata tracking works + +**Estimated Unit Test Count**: ~36 tests + +--- + +### Integration Tests + +**Verification Endpoints** (10 tests): +- ✅ POST /api/verify/start success case +- ✅ POST /api/verify/start with invalid domain +- ✅ POST /api/verify/start with DNS failure +- ✅ POST /api/verify/start with rel="me" failure +- ✅ POST /api/verify/start with email send failure +- ✅ POST /api/verify/code success case +- ✅ POST /api/verify/code with invalid code +- ✅ POST /api/verify/code with expired code +- ✅ POST /api/verify/code with missing code +- ✅ POST /api/verify/code with too many attempts + +**Authorization Endpoint** (15 tests): +- ✅ GET /authorize with valid params (already verified domain) +- ✅ GET /authorize with valid params (new domain) +- ✅ GET /authorize with invalid response_type +- ✅ GET /authorize with invalid me parameter +- ✅ GET /authorize with invalid client_id +- ✅ GET /authorize with invalid redirect_uri +- ✅ GET /authorize with missing state +- ✅ POST /authorize/verify-code with valid code +- ✅ POST /authorize/verify-code with invalid code +- ✅ POST /authorize/consent with action=approve +- ✅ POST /authorize/consent with action=deny +- ✅ Authorization code stored with metadata +- ✅ Authorization code expires after 10 min +- ✅ State parameter passed through +- ✅ redirect_uri domain 
mismatch shows warning + +**Estimated Integration Test Count**: ~25 tests + +--- + +### End-to-End Tests + +**Complete Flows** (5 tests): +- ✅ Full auth flow: /authorize → verify → consent → redirect with code +- ✅ Full auth flow with cached domain (skip verification) +- ✅ User denies consent → redirect with access_denied +- ✅ DNS verification failure → error page → retry → success +- ✅ Invalid code × 3 → error "too many attempts" + +**Estimated E2E Test Count**: ~5 tests + +--- + +### Security Tests + +**Input Validation** (8 tests): +- ✅ Malformed domain rejected +- ✅ Malformed email rejected (during validation) +- ✅ Malformed URL (me, client_id, redirect_uri) rejected +- ✅ URL with credentials rejected +- ✅ URL with fragment rejected (me parameter) +- ✅ Oversized HTML (>5MB) rejected +- ✅ Invalid email in rel="me" logged and skipped +- ✅ SQL injection attempts in domain parameter (should be parameterized) + +**Authentication Security** (5 tests): +- ✅ Expired code rejected +- ✅ Used code rejected (Phase 3) +- ✅ Invalid code rejected +- ✅ Brute force prevented (max 3 attempts) +- ✅ Constant-time comparison used (verify via timing analysis - difficult to test) + +**TLS/HTTPS** (4 tests): +- ✅ HTTP redirect_uri rejected in production +- ✅ Invalid SSL certificate rejected +- ✅ Site fetch over HTTPS only +- ✅ HTTP allowed for localhost only + +**Open Redirect** (3 tests): +- ✅ redirect_uri domain mismatch shows warning +- ✅ Invalid redirect_uri shows error page (no redirect) +- ✅ redirect_uri without hostname rejected + +**Estimated Security Test Count**: ~20 tests + +--- + +### Coverage Target + +**Phase 2 Overall**: 80%+ coverage (same as Phase 1) + +**Critical Code** (95%+ coverage): +- Domain verification service (orchestration logic) +- rel="me" discovery (email extraction) +- Authorization endpoint (parameter validation) +- Security functions (validation, constant-time comparison) + +**Total Estimated Test Count**: ~86 tests + +## Error Handling + +### DNS 
Verification Failure + +**Error Message**: +``` +DNS Verification Failed + +The DNS TXT record was not found for your domain. + +Please add the following TXT record to your DNS: + Type: TXT + Name: _gondulf.example.com + Value: verified + +DNS changes may take up to 24 hours to propagate. + +[Retry] +``` + +**HTTP Response**: 200 OK (HTML error page) + +**Logging**: WARNING level with domain + +--- + +### rel="me" Discovery Failure + +**Error Message**: +``` +Email Discovery Failed + +No rel="me" email link was found on your homepage. + +Please add the following to https://example.com: + + +This allows us to discover your email address automatically. + +Learn more: https://indieweb.org/rel-me + +[Retry] +``` + +**HTTP Response**: 200 OK (HTML error page) + +**Logging**: WARNING level with domain + +--- + +### Site Unreachable + +**Error Message**: +``` +Site Fetch Failed + +Could not fetch your site at https://example.com + +Please check: +• Site is accessible via HTTPS +• SSL certificate is valid +• No firewall blocking requests + +[Retry] +``` + +**HTTP Response**: 200 OK (HTML error page) + +**Logging**: ERROR level with domain and error details + +--- + +### Email Send Failure + +**Error Message**: +``` +Email Delivery Failed + +Failed to send verification code to u***@example.com + +Please check: +• Email address is correct in your rel="me" link +• Email server is accepting mail +• Check spam/junk folder + +[Retry] +``` + +**HTTP Response**: 200 OK (HTML error page) + +**Logging**: ERROR level with masked email + +--- + +### Invalid Code + +**Error Message**: +``` +Invalid code. 2 attempts remaining. +``` + +**HTTP Response**: 200 OK (code entry form with error) + +**Logging**: WARNING level with masked email + +--- + +### Too Many Attempts + +**Error Message**: +``` +Too Many Attempts + +You have exceeded the maximum number of attempts. + +Please request a new verification code. 
+ +[Request New Code] +``` + +**HTTP Response**: 200 OK (error page with retry link) + +**Logging**: WARNING level with masked email + +--- + +### Rate Limit Exceeded + +**Error Message**: +``` +Rate Limit Exceeded + +Too many verification requests for this domain. + +Please wait 1 hour before requesting another code. +``` + +**HTTP Response**: 200 OK (error page) + +**Logging**: WARNING level with domain + +--- + +### OAuth 2.0 Errors (Authorization Endpoint) + +**Error Redirect Format**: +``` +{redirect_uri}?error={error_code}&error_description={description}&state={state} +``` + +**Error Codes**: +- `invalid_request`: Missing or invalid parameter +- `unauthorized_client`: Client not authorized +- `access_denied`: User denied authorization +- `unsupported_response_type`: response_type not "code" +- `server_error`: Internal server error + +**Example**: +``` +https://client.example.com/callback?error=invalid_request&error_description=Missing+me+parameter&state=abc123 +``` + +**Logging**: WARNING or ERROR level depending on error type + +--- + +### Error Logging Standards + +**Log Levels**: +- **DEBUG**: Normal operations, detailed flow +- **INFO**: Successful operations (code sent, domain verified) +- **WARNING**: Expected errors (invalid code, DNS not found) +- **ERROR**: Unexpected errors (SMTP failure, site unreachable) +- **CRITICAL**: System failures (should not occur in Phase 2) + +**What to Log**: +- ✅ Domain (public information) +- ✅ Email (masked: first character of the local part only, e.g. u***@example.com) +- ✅ Error details (for debugging) +- ✅ Request IDs (for correlation) + +**What NOT to Log**: +- ❌ Full email addresses +- ❌ Verification codes +- ❌ Authorization codes +- ❌ User-Agent (GDPR) +- ❌ IP addresses (GDPR) + +## Dependencies + +### New Python Packages + +**Add to pyproject.toml**: +```toml +[project] +dependencies = [ + # ... 
existing dependencies from Phase 1 + "beautifulsoup4>=4.12.0", # HTML parsing for rel="me" discovery +] +``` + +**Why beautifulsoup4**: +- Robust HTML parsing (handles malformed HTML) +- Safe for untrusted content (no script execution) +- Standard in Python ecosystem +- Pure Python (no C dependencies with html.parser) + +### Phase 1 Dependencies Used + +- `requests` (HTTP fetching - already in pyproject.toml) +- `dnspython` (DNS queries - Phase 1) +- `smtplib` (Email sending - Python stdlib, used by Phase 1) +- `sqlalchemy` (Database - Phase 1) +- `fastapi` (Web framework - Phase 1) +- `pydantic` (Data validation - Phase 1) + +### Configuration Additions + +**Optional new environment variables**: +```bash +# HTML Fetching (optional - has defaults) +GONDULF_HTML_FETCH_TIMEOUT=10 # seconds +GONDULF_HTML_MAX_SIZE=5242880 # bytes (5MB) +GONDULF_HTML_MAX_REDIRECTS=5 + +# Rate Limiting (optional - has defaults) +GONDULF_VERIFICATION_RATE_LIMIT=3 # codes per domain per hour +``` + +**Add to .env.example**: +```bash +# HTML Fetching Configuration (optional) +GONDULF_HTML_FETCH_TIMEOUT=10 +GONDULF_HTML_MAX_SIZE=5242880 +GONDULF_HTML_MAX_REDIRECTS=5 + +# Rate Limiting (optional) +GONDULF_VERIFICATION_RATE_LIMIT=3 +``` + +## Implementation Notes + +### Suggested Implementation Order + +1. **HTML Fetcher Service** (0.5 days) + - Straightforward HTTP fetching + - Few dependencies + - Easy to test in isolation + +2. **rel="me" Discovery Service** (0.5 days) + - Pure parsing logic + - No external dependencies (besides HTML input) + - Easy to test with mock HTML + +3. **Domain Verification Service** (1 day) + - Orchestrates all services + - More complex logic + - Needs all previous services complete + +4. **Database Migration** (0.5 days) + - Simple UPDATE query + - Apply before verification endpoints + +5. **Verification Endpoints** (0.5 days) + - Thin API layer over service + - FastAPI makes this straightforward + +6. 
**Authorization Endpoint** (3-4 days) + - Most complex component + - HTML templates needed + - Multiple sub-endpoints + - Needs comprehensive testing + +7. **Integration Testing** (1 day) + - Test all components together + - End-to-end flow verification + +**Total**: ~7-8 days (matches estimate in phase-1-impact-assessment.md) + +--- + +### Risks and Mitigations + +**Risk 1: HTML Parsing Edge Cases** +- **Mitigation**: BeautifulSoup handles malformed HTML gracefully +- **Testing**: Include malformed HTML in test cases +- **Fallback**: Clear error messages guide users to fix HTML + +**Risk 2: Email Delivery Failures** +- **Mitigation**: Comprehensive SMTP error handling +- **Testing**: Mock SMTP failures in tests +- **Fallback**: Clear troubleshooting instructions in error messages + +**Risk 3: DNS TXT Record Setup Complexity** +- **Mitigation**: Clear setup instructions with examples +- **User Education**: Document common DNS providers +- **Support**: Provide example DNS configurations + +**Risk 4: Authorization Endpoint Complexity** +- **Mitigation**: Break into smaller sub-endpoints (verify-code, consent) +- **Testing**: Comprehensive integration tests +- **Design**: Keep state management simple (use forms, avoid complex sessions) + +**Risk 5: Rate Limiting Implementation** +- **Mitigation**: Start with simple in-memory tracking (Phase 2) +- **Future**: Migrate to Redis for distributed rate limiting (Phase 3+) +- **Placeholder**: Implement rate limit check, return False for now + +--- + +### Performance Considerations + +**HTML Fetching**: +- Timeout: 10 seconds (prevent hanging) +- Size limit: 5MB (prevent memory exhaustion) +- Concurrent requests: Not needed in Phase 2 (one request per auth flow) + +**Database Queries**: +- Index on domains.domain ensures fast lookups +- Simple SELECT queries (no joins in Phase 2) +- Consider adding index on domains.verified if needed + +**In-Memory Storage**: +- Verification codes: ~100 bytes each +- Authorization codes: ~200 
bytes each +- Expected load: 10s of users, <100 concurrent verifications +- Memory impact: Negligible (<10KB) + +**rel="me" Parsing**: +- BeautifulSoup is pure Python (not fastest, but sufficient) +- HTML size limited to 5MB (parse time <1 second) +- No performance issues expected for typical homepages + +--- + +### Future Extensibility + +**Redis Integration** (Phase 3+): +- Replace in-memory CodeStorage with Redis +- Enables distributed deployment (multiple Gondulf instances) +- No code changes needed (CodeStorage interface unchanged) + +**Client Metadata Caching** (Phase 3): +- Cache client_id fetch results +- Reduces HTTP requests during authorization +- Store in database or Redis + +**PKCE Support** (v1.1.0): +- Add code_challenge validation in authorization endpoint +- Add code_verifier validation in token endpoint (Phase 3) +- No breaking changes to v1.0.0 clients + +**Additional Authentication Methods** (v1.2.0+): +- GitHub/GitLab OAuth providers +- WebAuthn support +- All additive (user chooses method) + +## Acceptance Criteria + +Phase 2 is complete when ALL of the following criteria are met: + +### Functionality + +- [ ] HTML fetcher service fetches user homepages successfully +- [ ] rel="me" discovery service discovers email from HTML +- [ ] Domain verification service orchestrates two-factor verification +- [ ] DNS TXT verification required and working +- [ ] Email verification via rel="me" required and working +- [ ] Verification endpoints (/api/verify/start, /api/verify/code) working +- [ ] Authorization endpoint (/authorize) validates all parameters +- [ ] Authorization endpoint checks domain verification status +- [ ] Authorization endpoint shows verification form for unverified domains +- [ ] Authorization endpoint shows consent screen after verification +- [ ] Authorization code generated and stored on approval +- [ ] User can deny consent (redirects with access_denied) +- [ ] State parameter passed through all steps + +### Testing + +- [ ] All 
unit tests passing (estimated ~36 tests) +- [ ] All integration tests passing (estimated ~25 tests) +- [ ] All end-to-end tests passing (estimated ~5 tests) +- [ ] All security tests passing (estimated ~20 tests) +- [ ] Test coverage ≥80% overall +- [ ] Test coverage ≥95% for domain verification service +- [ ] Test coverage ≥95% for authorization endpoint +- [ ] No known bugs or failing tests + +### Security + +- [ ] HTTPS enforcement working (production) +- [ ] SSL certificate validation enforced (HTML fetching) +- [ ] HTML parsing secure (BeautifulSoup with html.parser) +- [ ] Input validation comprehensive (domain, email, URLs) +- [ ] Open redirect protection working (redirect_uri validation) +- [ ] Constant-time code comparison used +- [ ] Rate limiting implemented (basic in-memory) +- [ ] Attempt limiting working (max 3 per code) +- [ ] No PII in logs (email masked, no full addresses) +- [ ] Authorization codes single-use (marked for Phase 3) + +### Error Handling + +- [ ] DNS verification failure shows clear instructions +- [ ] rel="me" discovery failure shows HTML example +- [ ] Site unreachable shows troubleshooting steps +- [ ] Email send failure shows error with retry +- [ ] Invalid code shows attempts remaining +- [ ] Too many attempts invalidates code +- [ ] Rate limit exceeded shows wait time +- [ ] OAuth 2.0 errors formatted correctly +- [ ] All errors logged appropriately + +### Documentation + +- [ ] All new services have docstrings +- [ ] All public methods have type hints +- [ ] API endpoints documented (this design doc) +- [ ] Error messages user-friendly +- [ ] Setup instructions clear (DNS + rel="me") +- [ ] Database migration documented + +### Dependencies + +- [ ] beautifulsoup4 added to pyproject.toml +- [ ] No new system dependencies (all Python) +- [ ] Configuration updated (.env.example) + +### Database + +- [ ] Migration 002 applied successfully +- [ ] domains.verification_method updated to 'two_factor' +- [ ] No schema changes needed 
(existing schema works) + +### Integration + +- [ ] All Phase 1 services integrated successfully +- [ ] DNS service used for TXT verification +- [ ] Email service used for code sending +- [ ] Database service used for storing verified domains +- [ ] In-memory storage used for codes +- [ ] Logging used throughout + +### Performance + +- [ ] HTML fetching completes within 10 seconds +- [ ] rel="me" parsing completes within 1 second +- [ ] Full verification flow completes within 30 seconds +- [ ] Authorization endpoint responds within 2 seconds +- [ ] No memory leaks (codes expire and clean up) + +## Timeline Estimate + +**Phase 2 Implementation**: 7-9 days + +**Breakdown**: +- HTML Fetcher Service: 0.5 days +- rel="me" Discovery Service: 0.5 days +- Domain Verification Service: 1 day +- Database Migration: 0.5 days +- Verification Endpoints: 0.5 days +- Authorization Endpoint: 3-4 days +- Integration Testing: 1 day +- Documentation: 0.5 days (included in parallel) + +**Dependencies**: Phase 1 complete and approved + +**Risk Buffer**: +2 days (for unforeseen issues with HTML parsing or authorization flow complexity) + +## Sign-off + +**Design Status**: Complete and ready for implementation + +**Architect**: Claude (Architect Agent) +**Date**: 2025-11-20 + +**Next Steps**: +1. Developer reviews design document +2. Developer asks clarification questions if needed +3. Architect updates design based on feedback +4. Developer begins implementation following design +5. Developer creates implementation report upon completion +6. 
Architect reviews implementation report + +**Related Documents**: +- `/docs/architecture/overview.md` - System architecture +- `/docs/architecture/indieauth-protocol.md` - IndieAuth protocol implementation +- `/docs/architecture/security.md` - Security architecture +- `/docs/architecture/phase-1-impact-assessment.md` - Phase 2 requirements +- `/docs/decisions/ADR-005-email-based-authentication-v1-0-0.md` - Two-factor verification decision +- `/docs/decisions/ADR-008-rel-me-email-discovery.md` - rel="me" pattern decision +- `/docs/reports/2025-11-20-phase-1-foundation.md` - Phase 1 implementation +- `/docs/roadmap/v1.0.0.md` - Version plan + +--- + +**DESIGN READY: Phase 2 Domain Verification - Please review /docs/designs/phase-2-domain-verification.md** diff --git a/docs/designs/phase-2-implementation-guide.md b/docs/designs/phase-2-implementation-guide.md new file mode 100644 index 0000000..df5746e --- /dev/null +++ b/docs/designs/phase-2-implementation-guide.md @@ -0,0 +1,739 @@ +# Phase 2 Implementation Guide - Specific Details + +**Date**: 2024-11-20 +**Architect**: Claude (Architect Agent) +**Status**: Supplementary to Phase 2 Design +**Purpose**: Provide specific implementation details for Developer clarification questions + +This document supplements `/docs/designs/phase-2-domain-verification.md` with specific implementation decisions from ADR-0004. + +## 1. Rate Limiting Implementation + +### Approach +Implement actual in-memory rate limiting with timestamp tracking. 
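To make the sliding-window semantics concrete, here is a small worked sketch using explicit timestamps instead of the wall clock. The function name `within_limit` is hypothetical and exists only for illustration; it assumes the defaults of 3 attempts per 3600-second window and treats a timestamp as expired once it is no longer strictly newer than the cutoff.

```python
from typing import List


def within_limit(timestamps: List[int], now: int,
                 max_attempts: int = 3, window_seconds: int = 3600) -> bool:
    """Return True if another attempt is allowed at time `now`."""
    cutoff = now - window_seconds
    # Keep only attempts that are strictly newer than the cutoff.
    recent = [ts for ts in timestamps if ts > cutoff]
    return len(recent) < max_attempts


# Three attempts recorded within the first 20 minutes.
attempts = [0, 600, 1200]

at_30_min = within_limit(attempts, now=1800)   # all three still inside the window
at_59_min = within_limit(attempts, now=3599)   # cutoff is -1, attempt at 0 still counts
at_60_min = within_limit(attempts, now=3600)   # cutoff is 0, attempt at 0 has expired
```

Passing `now` explicitly keeps the check deterministic, which may also be a convenient shape for unit tests that should not depend on real time.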
+ +### Implementation Specifications + +**Service Structure**: +```python +# src/gondulf/rate_limiter.py +from typing import Dict, List +import time + +class RateLimiter: + """In-memory rate limiter for domain verification attempts.""" + + def __init__(self, max_attempts: int = 3, window_hours: int = 1): + """ + Args: + max_attempts: Maximum attempts per domain in time window (default: 3) + window_hours: Time window in hours (default: 1) + """ + self.max_attempts = max_attempts + self.window_seconds = window_hours * 3600 + self._attempts: Dict[str, List[int]] = {} # domain -> [timestamp1, timestamp2, ...] + + def check_rate_limit(self, domain: str) -> bool: + """ + Check if domain has exceeded rate limit. + + Args: + domain: Domain to check + + Returns: + True if within rate limit, False if exceeded + """ + # Clean old timestamps first + self._clean_old_attempts(domain) + + # Check current count + if domain not in self._attempts: + return True + + return len(self._attempts[domain]) < self.max_attempts + + def record_attempt(self, domain: str) -> None: + """Record a verification attempt for domain.""" + now = int(time.time()) + if domain not in self._attempts: + self._attempts[domain] = [] + self._attempts[domain].append(now) + + def _clean_old_attempts(self, domain: str) -> None: + """Remove timestamps older than window.""" + if domain not in self._attempts: + return + + now = int(time.time()) + cutoff = now - self.window_seconds + self._attempts[domain] = [ts for ts in self._attempts[domain] if ts > cutoff] + + # Remove domain entirely if no recent attempts + if not self._attempts[domain]: + del self._attempts[domain] +``` + +**Usage in Endpoints**: +```python +# In verification endpoint +rate_limiter = get_rate_limiter() +if not rate_limiter.check_rate_limit(domain): + return {"success": False, "error": "rate_limit_exceeded"} + +rate_limiter.record_attempt(domain) +# ... 
proceed with verification +``` + +**Consequences**: +- State lost on restart (acceptable trade-off for simplicity) +- No persistence needed +- Simple dictionary-based implementation + +## 2. Authorization Code Metadata Structure + +### Approach +Use Phase 1's `CodeStorage` service with complete metadata structure from the start. + +### Data Structure Specification + +**Authorization Code Metadata**: +```python +{ + "client_id": "https://client.example.com/", + "redirect_uri": "https://client.example.com/callback", + "state": "client_state_value", + "code_challenge": "base64url_encoded_challenge", + "code_challenge_method": "S256", + "scope": "profile email", + "me": "https://user.example.com/", + "created_at": 1700000000, # epoch integer + "expires_at": 1700000600, # epoch integer (created_at + 600) + "used": False # Include now, consume in Phase 3 +} +``` + +**Storage Implementation**: +```python +import time + +# Use Phase 1's CodeStorage +code_storage = get_code_storage() +authorization_code = generate_random_code() +# Read the clock once so expires_at is exactly created_at + 600 +now = int(time.time()) +metadata = { + "client_id": client_id, + "redirect_uri": redirect_uri, + "state": state, + "code_challenge": code_challenge, + "code_challenge_method": code_challenge_method, + "scope": scope, + "me": me, + "created_at": now, + "expires_at": now + 600, + "used": False +} +code_storage.store(f"authz:{authorization_code}", metadata, ttl=600) +``` + +**Rationale**: +- Epoch integers simpler than datetime objects +- Include `used` field now (Phase 3 will check/update it) +- Reuse existing `CodeStorage` infrastructure +- Key prefix `authz:` distinguishes from verification codes + +## 3. HTML Template Implementation + +### Approach +Use Jinja2 templates with separate template files. 
+ +### Directory Structure +``` +src/gondulf/templates/ +├── base.html # Shared layout +├── verify_email.html # Email verification form +├── verify_totp.html # TOTP verification form (future) +├── authorize.html # Authorization consent page +└── error.html # Generic error page +``` + +### Base Template +```html + + + + + + +A verification code has been sent to {{ masked_email }}
+Please enter the 6-digit code to complete verification:
+ +{% if error %} +{{ error }}
+{% endif %} + + +{% endblock %} +``` + +### FastAPI Integration +```python +from fastapi import FastAPI, Request +from fastapi.templating import Jinja2Templates + +templates = Jinja2Templates(directory="src/gondulf/templates") + +@app.get("/verify/email") +async def verify_email_page(request: Request, domain: str): + # discovered_email comes from the rel="me" discovery step (lookup not shown here) + masked = mask_email(discovered_email) + return templates.TemplateResponse("verify_email.html", { + "request": request, + "domain": domain, + "masked_email": masked + }) +``` + +**Dependencies**: +- Add to the `dependencies` list in `pyproject.toml`: `"jinja2>=3.1.0"` + +## 4. Database Migration Timing + +### Approach +Apply migration 002 immediately as part of Phase 2 setup. + +### Execution Order +1. Developer runs migration: `alembic upgrade head` +2. Migration 002 adds `two_factor` column with default value `false` +3. All Phase 2 code assumes column exists +4. New domains inserted with explicit `two_factor` value + +### Migration File (if not already created) +```python +# migrations/versions/002_add_two_factor_column.py +"""Add two_factor column to domains table + +Revision ID: 002 +Revises: 001 +Create Date: 2024-11-20 +""" +from alembic import op +import sqlalchemy as sa + +# Revision identifiers used by Alembic (must be module-level variables) +revision = '002' +down_revision = '001' +branch_labels = None +depends_on = None + +def upgrade(): + op.add_column('domains', + sa.Column('two_factor', sa.Boolean(), nullable=False, server_default='false') + ) + +def downgrade(): + op.drop_column('domains', 'two_factor') +``` + +**Rationale**: +- Keep database schema current with code expectations +- No conditional logic needed in Phase 2 code +- Clean separation: migration handles existing data, new code uses new schema + +## 5. Client Validation Helper Functions + +### Approach +Standalone utility functions in shared module. 
+ +### Module Structure +```python +# src/gondulf/utils/validation.py +"""Client validation and utility functions.""" +from urllib.parse import urlparse +import re + +def mask_email(email: str) -> str: + """ + Mask email for display: user@example.com -> u***@example.com + + Args: + email: Email address to mask + + Returns: + Masked email string + """ + if '@' not in email: + return email + + local, domain = email.split('@', 1) + if len(local) <= 1: + return email + + masked_local = local[0] + '***' + return f"{masked_local}@{domain}" + + +def normalize_client_id(client_id: str) -> str: + """ + Normalize client_id URL to canonical form. + + Rules: + - Ensure https:// scheme + - Remove default port (443) + - Preserve path + + Args: + client_id: Client ID URL + + Returns: + Normalized client_id + """ + parsed = urlparse(client_id) + + # Ensure https + if parsed.scheme != 'https': + raise ValueError("client_id must use https scheme") + + # Remove default HTTPS port + netloc = parsed.netloc + if netloc.endswith(':443'): + netloc = netloc[:-4] + + # Reconstruct + normalized = f"https://{netloc}{parsed.path}" + if parsed.query: + normalized += f"?{parsed.query}" + if parsed.fragment: + normalized += f"#{parsed.fragment}" + + return normalized + + +def validate_redirect_uri(redirect_uri: str, client_id: str) -> bool: + """ + Validate redirect_uri against client_id per IndieAuth spec. 
+ + Rules: + - Must use https scheme (except localhost) + - Must share same origin as client_id OR + - Must be subdomain of client_id domain + + Args: + redirect_uri: Redirect URI to validate + client_id: Client ID for comparison + + Returns: + True if valid, False otherwise + """ + try: + redirect_parsed = urlparse(redirect_uri) + client_parsed = urlparse(client_id) + + # Check scheme (allow http for localhost only) + if redirect_parsed.scheme != 'https': + if redirect_parsed.hostname not in ('localhost', '127.0.0.1'): + return False + + # Same origin check + if (redirect_parsed.scheme == client_parsed.scheme and + redirect_parsed.netloc == client_parsed.netloc): + return True + + # Subdomain check + redirect_host = redirect_parsed.hostname or '' + client_host = client_parsed.hostname or '' + + # Must end with .{client_host} + if redirect_host.endswith(f".{client_host}"): + return True + + return False + + except Exception: + return False +``` + +**Usage**: +```python +from gondulf.utils.validation import mask_email, validate_redirect_uri, normalize_client_id + +# In verification endpoint +masked = mask_email(discovered_email) + +# In authorization endpoint +normalized_client = normalize_client_id(client_id) +if not validate_redirect_uri(redirect_uri, normalized_client): + return error_response("invalid_redirect_uri") +``` + +## 6. Error Response Format Consistency + +### Approach +Use format appropriate to endpoint type. 
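For the redirect-style errors, one detail worth handling is a `redirect_uri` that already carries a query string: joining the error parameters with a bare `?` would then produce a malformed URL. The following is a minimal standard-library sketch of one way to avoid that; `build_error_redirect` is a hypothetical helper name, not an existing Gondulf function.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_error_redirect(redirect_uri: str, error: str,
                         description: str, state: Optional[str] = None) -> str:
    """Append OAuth 2.0 error parameters to redirect_uri, keeping any existing query."""
    parts = urlsplit(redirect_uri)
    # Preserve query parameters the client already registered on its redirect_uri.
    params = parse_qsl(parts.query, keep_blank_values=True)
    params.append(("error", error))
    params.append(("error_description", description))
    # Only echo state back when the client actually sent one.
    if state is not None:
        params.append(("state", state))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params), parts.fragment))


plain = build_error_redirect(
    "https://client.example.com/callback",
    "access_denied", "User denied authorization", state="abc123")

with_query = build_error_redirect(
    "https://client.example.com/callback?app=1",
    "invalid_request", "Missing me parameter")
```

This only applies once `client_id` and `redirect_uri` have been validated; before that point an error page is the safer response, since redirecting to an unverified URI would reintroduce the open-redirect problem.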
+ +### Format Rules by Endpoint Type + +**Verification Endpoints** (`/verify/email`, `/verify/totp`): +```python +# Always return 200 OK with JSON +return JSONResponse( + status_code=200, + content={"success": False, "error": "invalid_code"} +) +``` + +**Authorization Endpoint - Pre-Client Validation**: +```python +# Return HTML error page if client_id not yet validated +return templates.TemplateResponse("error.html", { + "request": request, + "error": "Missing required parameter: client_id", + "error_code": "invalid_request" +}, status_code=400) +``` + +**Authorization Endpoint - Post-Client Validation**: +```python +# Return OAuth redirect with error parameter +from urllib.parse import urlencode +error_params = { + "error": "invalid_request", + "error_description": "Missing code_challenge parameter", + "state": request.query_params.get("state", "") +} +redirect_url = f"{redirect_uri}?{urlencode(error_params)}" +return RedirectResponse(url=redirect_url, status_code=302) +``` + +**Token Endpoint** (Phase 3): +```python +# Always return JSON with appropriate status code +return JSONResponse( + status_code=400, + content={ + "error": "invalid_grant", + "error_description": "Authorization code has expired" + } +) +``` + +### Error Flow Decision Tree +``` +Is this a verification endpoint? + YES -> Return JSON (200 OK) with success:false + NO -> Continue + +Has client_id been validated yet? + NO -> Return HTML error page + YES -> Continue + +Is redirect_uri valid? + NO -> Return HTML error page (can't redirect safely) + YES -> Return OAuth redirect with error +``` + +## 7. Dependency Injection Pattern + +### Approach +Singleton services instantiated at startup in `dependencies.py`. 
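
A minimal, stdlib-only sketch of the singleton behaviour this pattern leans on. `FakeService` and `get_fake_service` are placeholder names used only for illustration. One caveat worth keeping in mind: `functools.lru_cache` builds the instance on the first call to the getter, not at import time, so "at startup" assumes that something calls the getter while the application starts.

```python
from functools import lru_cache


class FakeService:
    """Placeholder service that counts how many times it is constructed."""
    instances_created = 0

    def __init__(self) -> None:
        FakeService.instances_created += 1


@lru_cache()
def get_fake_service() -> FakeService:
    """Return the cached instance, creating it on the first call."""
    return FakeService()


first = get_fake_service()
second = get_fake_service()     # served from the cache, no new instance

get_fake_service.cache_clear()  # e.g. between tests, to drop the cached instance
third = get_fake_service()      # a fresh instance after the cache was cleared

print(first is second, third is first, FakeService.instances_created)
```

Because the cache is keyed on the call arguments, this only behaves as a singleton for getters that take no parameters.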
+ +### Implementation Structure + +**Dependencies Module**: +```python +# src/gondulf/dependencies.py +"""FastAPI dependency injection for services.""" +from functools import lru_cache +from gondulf.config import get_config +from gondulf.database import DatabaseService +from gondulf.code_storage import CodeStorage +from gondulf.email_service import EmailService +from gondulf.dns_service import DNSService +from gondulf.html_fetcher import HTMLFetcherService +from gondulf.relme_parser import RelMeParser +from gondulf.verification_service import DomainVerificationService +from gondulf.rate_limiter import RateLimiter + +# Configuration +@lru_cache() +def get_config_singleton(): + """Get singleton configuration instance.""" + return get_config() + +# Phase 1 Services +@lru_cache() +def get_database(): + """Get singleton database service.""" + config = get_config_singleton() + return DatabaseService(config.database_url) + +@lru_cache() +def get_code_storage(): + """Get singleton code storage service.""" + return CodeStorage() + +@lru_cache() +def get_email_service(): + """Get singleton email service.""" + config = get_config_singleton() + return EmailService( + smtp_host=config.smtp_host, + smtp_port=config.smtp_port, + smtp_username=config.smtp_username, + smtp_password=config.smtp_password, + from_address=config.smtp_from_address + ) + +@lru_cache() +def get_dns_service(): + """Get singleton DNS service.""" + config = get_config_singleton() + return DNSService(nameservers=config.dns_nameservers) + +# Phase 2 Services +@lru_cache() +def get_html_fetcher(): + """Get singleton HTML fetcher service.""" + return HTMLFetcherService() + +@lru_cache() +def get_relme_parser(): + """Get singleton rel=me parser service.""" + return RelMeParser() + +@lru_cache() +def get_rate_limiter(): + """Get singleton rate limiter service.""" + return RateLimiter(max_attempts=3, window_hours=1) + +@lru_cache() +def get_verification_service(): + """Get singleton domain verification service.""" 
+ return DomainVerificationService( + dns_service=get_dns_service(), + email_service=get_email_service(), + code_storage=get_code_storage(), + html_fetcher=get_html_fetcher(), + relme_parser=get_relme_parser() + ) +``` + +**Usage in Endpoints**: +```python +from fastapi import Depends +from gondulf.dependencies import get_verification_service, get_rate_limiter + +@app.post("/verify/email") +async def verify_email( + domain: str, + code: str, + verification_service: DomainVerificationService = Depends(get_verification_service), + rate_limiter: RateLimiter = Depends(get_rate_limiter) +): + # Use injected services + if not rate_limiter.check_rate_limit(domain): + return {"success": False, "error": "rate_limit_exceeded"} + + result = verification_service.verify_email_code(domain, code) + return {"success": result} +``` + +**Rationale**: +- `@lru_cache()` ensures single instance per function +- Services configured once at startup +- Consistent with Phase 1 pattern +- Simple to test (can override dependencies in tests) + +## 8. Test Organization for Authorization Endpoint + +### Approach +Separate test files per major endpoint with shared fixtures. 
+ +### File Structure +``` +tests/ +├── conftest.py # Shared fixtures and configuration +├── test_verification_endpoints.py # Email/TOTP verification tests +└── test_authorization_endpoint.py # Authorization flow tests +``` + +### Shared Fixtures Module +```python +# tests/conftest.py +import pytest +from fastapi.testclient import TestClient +from gondulf.main import app +from gondulf.dependencies import get_database, get_code_storage, get_rate_limiter + +@pytest.fixture +def client(): + """FastAPI test client.""" + return TestClient(app) + +@pytest.fixture +def mock_database(): + """Mock database service for testing.""" + # Create in-memory test database + from gondulf.database import DatabaseService + db = DatabaseService("sqlite:///:memory:") + db.initialize() + return db + +@pytest.fixture +def mock_code_storage(): + """Mock code storage for testing.""" + from gondulf.code_storage import CodeStorage + return CodeStorage() + +@pytest.fixture +def mock_rate_limiter(): + """Mock rate limiter with clean state.""" + from gondulf.rate_limiter import RateLimiter + return RateLimiter() + +@pytest.fixture +def verified_domain(mock_database): + """Fixture providing a pre-verified domain.""" + domain = "example.com" + mock_database.store_verified_domain( + domain=domain, + email="user@example.com", + two_factor=True + ) + return domain + +@pytest.fixture +def override_dependencies(mock_database, mock_code_storage, mock_rate_limiter): + """Override FastAPI dependencies with test mocks.""" + app.dependency_overrides[get_database] = lambda: mock_database + app.dependency_overrides[get_code_storage] = lambda: mock_code_storage + app.dependency_overrides[get_rate_limiter] = lambda: mock_rate_limiter + yield + app.dependency_overrides.clear() +``` + +### Verification Endpoints Tests +```python +# tests/test_verification_endpoints.py +import pytest + +class TestEmailVerification: + """Tests for /verify/email endpoint.""" + + def test_email_verification_success(self, client, 
override_dependencies): + """Test successful email verification.""" + # Test implementation + pass + + def test_email_verification_invalid_code(self, client, override_dependencies): + """Test email verification with invalid code.""" + pass + + def test_email_verification_rate_limit(self, client, override_dependencies): + """Test rate limiting on email verification.""" + pass + +class TestTOTPVerification: + """Tests for /verify/totp endpoint (future).""" + pass +``` + +### Authorization Endpoint Tests +```python +# tests/test_authorization_endpoint.py +import pytest +from urllib.parse import parse_qs, urlparse + +class TestAuthorizationEndpoint: + """Tests for /authorize endpoint.""" + + def test_authorize_missing_client_id(self, client, override_dependencies): + """Test authorization with missing client_id parameter.""" + response = client.get("/authorize") + assert response.status_code == 400 + assert "client_id" in response.text + + def test_authorize_invalid_redirect_uri(self, client, override_dependencies): + """Test authorization with mismatched redirect_uri.""" + params = { + "client_id": "https://client.example.com/", + "redirect_uri": "https://evil.com/callback", + "response_type": "code", + "state": "test_state" + } + response = client.get("/authorize", params=params) + assert response.status_code == 400 + + def test_authorize_success_flow(self, client, override_dependencies, verified_domain): + """Test complete successful authorization flow.""" + # Full flow test with verified domain + params = { + "client_id": "https://client.example.com/", + "redirect_uri": "https://client.example.com/callback", + "response_type": "code", + "state": "test_state", + "code_challenge": "test_challenge", + "code_challenge_method": "S256", + "me": f"https://{verified_domain}/" + } + # The httpx-based TestClient uses follow_redirects (allow_redirects is the older name) + response = client.get("/authorize", params=params, follow_redirects=False) + assert response.status_code == 302 + + # Verify redirect contains authorization code + redirect_url = 
response.headers["location"] + parsed = urlparse(redirect_url) + query_params = parse_qs(parsed.query) + assert "code" in query_params + assert query_params["state"][0] == "test_state" +``` + +### Test Organization Rules +1. **One test class per major functionality** (email verification, authorization flow) +2. **Test complete flows, not internal methods** (black box testing) +3. **Use shared fixtures** for common setup (verified domains, mock services) +4. **Test both success and error paths** +5. **Test security boundaries** (rate limiting, invalid inputs, unauthorized access) + +## Summary + +These implementation decisions provide the Developer with unambiguous direction for Phase 2 implementation. All decisions prioritize simplicity while maintaining security and specification compliance. + +**Key Principles Applied**: +- Real implementations over stubs (rate limiting, validation) +- Reuse existing infrastructure (CodeStorage, dependency pattern) +- Standard tools over custom solutions (Jinja2 templates) +- Simple data structures (epoch integers, dictionaries) +- Clear separation of concerns (utility functions, test organization) + +**Next Steps for Developer**: +1. Review this guide alongside Phase 2 design document +2. Implement in the order specified by Phase 2 design +3. Follow patterns and structures defined here +4. Ask clarification questions if any ambiguity remains before implementation + +All architectural decisions are now documented and ready for implementation. 
diff --git a/docs/reports/2025-11-20-phase-1-foundation.md b/docs/reports/2025-11-20-phase-1-foundation.md new file mode 100644 index 0000000..9dd2b88 --- /dev/null +++ b/docs/reports/2025-11-20-phase-1-foundation.md @@ -0,0 +1,328 @@ +# Implementation Report: Phase 1 Foundation + +**Date**: 2025-11-20 +**Developer**: Claude (Developer Agent) +**Design Reference**: /home/phil/Projects/Gondulf/docs/architecture/phase1-clarifications.md + +## Summary + +Phase 1 Foundation has been successfully implemented. All core services are operational: configuration management, database layer with migrations, in-memory code storage, email service with SMTP/TLS support, DNS service with TXT record verification, structured logging, and FastAPI application with health check endpoint. The implementation achieved 94.16% test coverage across 96 tests, exceeding the 80% minimum requirement. + +## What Was Implemented + +### Components Created + +1. **Configuration Module** (`src/gondulf/config.py`) + - Environment variable loading with GONDULF_ prefix + - Required SECRET_KEY validation (minimum 32 characters) + - Sensible defaults for all optional configuration + - Comprehensive validation on startup + +2. **Database Layer** (`src/gondulf/database/connection.py`) + - SQLAlchemy-based database connection management + - Simple sequential migration system + - Automatic directory creation for SQLite databases + - Health check capability + - Initial schema migration (001_initial_schema.sql) + +3. **Database Schema** (`src/gondulf/database/migrations/001_initial_schema.sql`) + - `authorization_codes` table for OAuth 2.0 authorization codes + - `domains` table for domain ownership verification records + - `migrations` table for tracking applied migrations + +4. **In-Memory Code Storage** (`src/gondulf/storage.py`) + - Simple dict-based storage with TTL + - Automatic expiration checking on access (lazy cleanup) + - Single-use verification codes + - Manual cleanup method available + +5. 
**Email Service** (`src/gondulf/email.py`) + - SMTP support with STARTTLS (port 587) and implicit TLS (port 465) + - Optional authentication + - Verification email templating + - Connection testing capability + +6. **DNS Service** (`src/gondulf/dns.py`) + - TXT record querying using dnspython + - System DNS with public DNS fallback (Google, Cloudflare) + - Domain existence checking + - TXT record verification + +7. **Logging Configuration** (`src/gondulf/logging_config.py`) + - Structured logging with Python's standard logging module + - Format: `%(asctime)s [%(levelname)s] %(name)s: %(message)s` + - Configurable log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL) + - Debug mode support + +8. **FastAPI Application** (`src/gondulf/main.py`) + - Application initialization and service setup + - Health check endpoint (GET /health) with database connectivity check + - Root endpoint (GET /) with service information + - Startup/shutdown event handlers + +9. **Configuration Template** (`.env.example`) + - Complete documentation of all GONDULF_ environment variables + - Sensible defaults + - Examples for different deployment scenarios + +### Key Implementation Details + +**Configuration Management**: +- Used python-dotenv for .env file loading +- Fail-fast approach: invalid configuration prevents application startup +- Lazy loading for tests, explicit loading for production + +**Database Layer**: +- Simple sequential migrations (001, 002, 003, etc.) 
+- Idempotent migration execution +- SQLite URL parsing for automatic directory creation +- Transaction-based migration execution + +**In-Memory Storage**: +- Chose dict with manual expiration (Option B from clarifications) +- TTL stored alongside code as (code, expiry_timestamp) tuple +- Expiration checked on every access operation +- No background cleanup threads (simplicity) + +**Email Service**: +- Port-based TLS determination: + - Port 465: SMTP_SSL (implicit TLS) + - Port 587 + USE_TLS: STARTTLS + - Other: Unencrypted (testing only) +- Standard library smtplib (no async needed for Phase 1) + +**DNS Service**: +- dnspython Resolver with system DNS by default +- Fallback to [8.8.8.8, 1.1.1.1] if system DNS unavailable +- Graceful handling of NXDOMAIN, Timeout, NoAnswer + +**Logging**: +- Standard Python logging module (no external dependencies) +- Structured information embedded in message strings +- Module-specific loggers with gondulf.* naming + +## How It Was Implemented + +### Approach + +Implemented components in dependency order: +1. Configuration first (foundation for everything) +2. Logging setup (needed for debugging subsequent components) +3. Database layer (core data persistence) +4. Storage, Email, DNS services (independent components) +5. FastAPI application (integrates all services) +6. Comprehensive testing suite + +### Deviations from Design + +**Deviation 1**: Configuration Loading Timing +- **Design**: Configuration loaded on module import +- **Implementation**: Configuration loaded lazily/explicitly +- **Reason**: Module-level import-time loading broke tests. Tests need to control environment variables before config loads. 
+- **Impact**: Production code must explicitly call `Config.load()` and `Config.validate()` at startup (added to main.py) + +**Deviation 2**: Email Service Implementation +- **Design**: Specified aiosmtplib dependency +- **Implementation**: Used standard library smtplib instead +- **Reason**: Phase 1 doesn't require async email sending. Blocking SMTP is simpler and sufficient for current needs. aiosmtplib can be added in Phase 2 if async becomes necessary. +- **Impact**: Email sending blocks briefly (typically <1 second), acceptable for Phase 1 usage patterns + +No other deviations from design. + +## Issues Encountered + +### Initial Test Failures +**Issue**: Configuration module loaded on import, causing all tests to fail with "GONDULF_SECRET_KEY required" error. + +**Solution**: Changed configuration to lazy loading. Tests control the environment; production code explicitly loads config at startup. + +**Impact**: Required minor refactor of config.py and main.py. Tests now work properly. + +### FastAPI TestClient Startup Events +**Issue**: TestClient wasn't triggering FastAPI startup events, causing integration tests to fail (database not initialized). + +**Solution**: Used context manager pattern (`with TestClient(app) as client:`), which properly triggers startup/shutdown events. + +**Impact**: Fixed 3 failing integration tests. All 96 tests now pass. + +### Unused aiosmtplib Dependency +**Issue**: aiosmtplib was added to pyproject.toml but never used in the implementation. + +**Solution**: Dropped aiosmtplib and used stdlib smtplib instead (see Deviation 2). + +**Impact**: Simpler implementation, one fewer dependency, sufficient for Phase 1.
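
For reference, the blocking `smtplib` approach combined with the port-based TLS rule from the implementation details might look roughly like the sketch below. It is not the code in `src/gondulf/email.py`: `choose_smtp_mode` and `send_message` are hypothetical names, and the rule is read literally (465 means implicit TLS, 587 with TLS enabled means STARTTLS, anything else is unencrypted).

```python
import smtplib
from email.message import EmailMessage
from typing import Optional


def choose_smtp_mode(port: int, use_tls: bool) -> str:
    """Map port and TLS flag to a connection mode, per the rule above."""
    if port == 465:
        return "implicit_tls"   # TLS from the first byte (SMTP_SSL)
    if port == 587 and use_tls:
        return "starttls"       # plain connect, then upgrade
    return "plain"              # unencrypted, testing only


def send_message(host: str, port: int, use_tls: bool, message: EmailMessage,
                 username: Optional[str] = None,
                 password: Optional[str] = None,
                 timeout: float = 10.0) -> None:
    """Send one message synchronously; blocks until the server responds."""
    mode = choose_smtp_mode(port, use_tls)
    smtp_class = smtplib.SMTP_SSL if mode == "implicit_tls" else smtplib.SMTP
    with smtp_class(host, port, timeout=timeout) as server:
        if mode == "starttls":
            server.starttls()
        # Authentication is optional: only log in when credentials are set.
        if username and password:
            server.login(username, password)
        server.send_message(message)
```

Splitting the mode decision from the network call keeps the rule testable without an SMTP server or mocks.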
+ +## Test Results + +### Test Execution + +``` +============================= test session starts ============================== +platform linux -- Python 3.11.14, pytest-9.0.1, pluggy-1.6.0 +collected 96 items + +tests/integration/test_health.py::TestHealthEndpoint::test_health_check_success PASSED +tests/integration/test_health.py::TestHealthEndpoint::test_health_check_response_format PASSED +tests/integration/test_health.py::TestHealthEndpoint::test_health_check_no_auth_required PASSED +tests/integration/test_health.py::TestHealthEndpoint::test_root_endpoint PASSED +tests/integration/test_health.py::TestHealthCheckUnhealthy::test_health_check_unhealthy_bad_database PASSED +tests/unit/test_config.py::TestConfigLoad (19 tests) PASSED +tests/unit/test_config.py::TestConfigValidate (5 tests) PASSED +tests/unit/test_database.py (18 tests) PASSED +tests/unit/test_dns.py (20 tests) PASSED +tests/unit/test_email.py (16 tests) PASSED +tests/unit/test_storage.py (19 tests) PASSED + +======================= 96 passed, 4 warnings in 11.06s ======================= +``` + +All 96 tests pass successfully. 
+ +### Test Coverage + +``` +Name Stmts Miss Cover Missing +------------------------------------------------------------------- +src/gondulf/__init__.py 1 0 100.00% +src/gondulf/config.py 50 0 100.00% +src/gondulf/database/__init__.py 0 0 100.00% +src/gondulf/database/connection.py 93 12 87.10% +src/gondulf/dns.py 72 0 100.00% +src/gondulf/email.py 70 2 97.14% +src/gondulf/logging_config.py 13 3 76.92% +src/gondulf/main.py 59 7 88.14% +src/gondulf/storage.py 53 0 100.00% +------------------------------------------------------------------- +TOTAL 411 24 94.16% +``` + +**Overall Coverage**: 94.16% (exceeds 80% requirement) + +**Coverage Analysis**: +- **100% coverage**: config.py, dns.py, storage.py (excellent) +- **97.14% coverage**: email.py (minor gap in error handling edge cases) +- **88.14% coverage**: main.py (uncovered: startup error paths) +- **87.10% coverage**: database/connection.py (uncovered: error handling paths) +- **76.92% coverage**: logging_config.py (uncovered: get_logger helper, acceptable) + +**Coverage Gaps**: +- Uncovered lines are primarily error handling edge cases and helper functions +- All primary code paths have test coverage +- Coverage gaps are acceptable for Phase 1 + +### Test Scenarios + +#### Unit Tests (77 tests) +**Configuration Module** (24 tests): +- Environment variable loading with valid/invalid values +- Default value application +- SECRET_KEY validation (length, presence) +- SMTP configuration (all ports, TLS modes) +- Token/code expiry configuration +- Log level validation +- Validation error cases (bad ports, negative expiries) + +**Database Layer** (18 tests): +- Database initialization with various URLs +- Directory creation (SQLite, absolute/relative paths) +- Engine creation and reuse +- Health checks (success and failure) +- Migration tracking and execution +- Idempotent migrations +- Schema correctness verification + +**In-Memory Storage** (19 tests): +- Code storage and verification +- Expiration handling +- 
Single-use enforcement +- Manual cleanup +- Multiple keys +- Code overwrites +- Custom TTL values + +**Email Service** (16 tests): +- STARTTLS and implicit TLS modes +- Authentication (with and without credentials) +- Error handling (SMTP errors, auth failures) +- Message content verification +- Connection testing + +**DNS Service** (20 tests): +- TXT record querying (single, multiple, multipart) +- TXT record verification +- Error handling (NXDOMAIN, timeout, DNS exceptions) +- Domain existence checking +- Resolver fallback configuration + +#### Integration Tests (5 tests) +**Health Check Endpoint**: +- Success response (200 OK, correct JSON) +- Response format verification +- No authentication required +- Unhealthy response (503 Service Unavailable) +- Root endpoint functionality + +### Test Results Analysis + +**All tests passing**: Yes ✓ +**Coverage acceptable**: Yes ✓ (94.16% > 80% requirement) +**Coverage gaps**: Minor, limited to error handling edge cases +**Known issues**: None - all functionality working as designed + +## Technical Debt Created + +### TD-001: FastAPI Deprecation Warnings +**Description**: FastAPI on_event decorators are deprecated in favor of lifespan context managers. +**Reason**: Using older on_event API for simplicity in Phase 1. +**Impact**: 4 deprecation warnings in test output. Application functions correctly. +**Suggested Resolution**: Migrate to lifespan context managers in Phase 2. See [FastAPI Lifespan Events](https://fastapi.tiangolo.com/advanced/events/). + +### TD-002: Limited Error Recovery in Database Migrations +**Description**: Migration failures are not rollback-safe. Partially applied migrations could leave database in inconsistent state. +**Reason**: Simple migration system prioritizes clarity over robustness for Phase 1. +**Impact**: Low risk with simple schema. Higher risk as schema complexity grows. +**Suggested Resolution**: Add transaction rollback on migration failure or migrate to Alembic in Phase 2. 
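
One possible shape for the rollback option in TD-002, sketched with the stdlib `sqlite3` module rather than the project's SQLAlchemy layer. `apply_migration` and the single-column `migrations` table are assumptions made for illustration and do not reflect the actual schema. The connection is opened with `isolation_level=None` so that the explicit `BEGIN`/`COMMIT`/`ROLLBACK` statements control the transaction, and statements are executed one at a time because `executescript()` would commit any open transaction first.

```python
import sqlite3
from typing import Sequence


def apply_migration(conn: sqlite3.Connection, name: str,
                    statements: Sequence[str]) -> None:
    """Apply all statements of one migration atomically, or none of them."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
        # Record the migration inside the same transaction as its changes.
        conn.execute("INSERT INTO migrations (name) VALUES (?)", (name,))
        conn.execute("COMMIT")
    except sqlite3.Error:
        # Some errors make SQLite roll back on its own; only roll back
        # if a transaction is still open.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE migrations (name TEXT PRIMARY KEY)")

try:
    apply_migration(conn, "002_broken", [
        "CREATE TABLE widgets (id INTEGER PRIMARY KEY)",
        "THIS IS NOT VALID SQL",  # fails, so the CREATE TABLE is undone too
    ])
except sqlite3.Error:
    pass

tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")}
recorded = conn.execute("SELECT COUNT(*) FROM migrations").fetchone()[0]
print(tables, recorded)
```

SQLite treats DDL as transactional, which is what lets the half-applied `CREATE TABLE` disappear on rollback; other databases may behave differently.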
+ +### TD-003: Missing Async Email Support +**Description**: Email service uses synchronous smtplib, blocking briefly during sends. +**Reason**: Sufficient for Phase 1 with infrequent email sending. +**Impact**: Minor latency on verification email endpoints (typically <1s). +**Suggested Resolution**: Migrate to aiosmtplib or use background task queue in Phase 2 when email volume increases. + +## Next Steps + +**Immediate** (Phase 1 Complete): +1. Architect review of this implementation report +2. Address any requested changes +3. Merge Phase 1 foundation to main branch + +**Phase 2 Prerequisites**: +1. Domain verification service (uses email + DNS services) +2. Domain verification UI endpoints +3. Authorization endpoint (uses domain verification) +4. Token endpoint (uses database) + +**Follow-up Tasks**: +1. Consider lifespan migration (TD-001) before Phase 2 +2. Monitor email sending performance (TD-003) +3. Document database backup/restore procedures + +## Sign-off + +**Implementation status**: Complete +**Ready for Architect review**: Yes +**Test coverage**: 94.16% (exceeds 80% requirement) +**Deviations from design**: 2 (documented above, both minor) +**All Phase 1 exit criteria met**: Yes + +**Exit Criteria Verification**: +- ✓ All foundation services have passing unit tests +- ✓ Application starts without errors +- ✓ Health check endpoint returns 200 +- ✓ Email can be sent successfully (tested with mocks) +- ✓ DNS queries resolve correctly (tested with mocks) +- ✓ Database migrations run successfully +- ✓ Configuration loads and validates correctly +- ✓ Test coverage exceeds 80% + +Phase 1 Foundation is complete and ready for the next development phase. 
diff --git a/docs/reports/2025-11-20-project-initialization.md b/docs/reports/2025-11-20-project-initialization.md new file mode 100644 index 0000000..9c023d2 --- /dev/null +++ b/docs/reports/2025-11-20-project-initialization.md @@ -0,0 +1,199 @@ +# Implementation Report: Project Initialization + +**Date**: 2025-11-20 +**Developer**: Developer Agent +**Design Reference**: /docs/standards/ (project standards suite) + +## Summary + +Successfully initialized the Gondulf IndieAuth server project structure with all foundational components, development environment, documentation structure, and project standards. The project is now ready for feature development with a complete development environment, git repository, dependency management, and comprehensive standards documentation. All setup verification tests passed successfully. + +## What Was Implemented + +### Directory Structure Created +- `/src/gondulf/` - Main application package with `__init__.py` +- `/tests/` - Test suite root with `__init__.py` +- `/docs/standards/` - Project standards documentation +- `/docs/architecture/` - Architecture documentation (directory ready) +- `/docs/designs/` - Feature design documents (directory ready) +- `/docs/decisions/` - Architecture Decision Records +- `/docs/roadmap/` - Version planning (directory ready) +- `/docs/reports/` - Implementation reports (directory ready) +- `/.claude/agents/` - Agent role definitions + +### Configuration Files Created +- **pyproject.toml**: Complete Python project configuration including: + - Project metadata (name, version 0.1.0-dev, description, license) + - Python 3.10+ requirement + - Production dependencies (FastAPI, SQLAlchemy, Pydantic, uvicorn) + - Development dependencies (Black, Ruff, mypy, isort, bandit) + - Test dependencies (pytest suite with coverage, async, mocking, factories) + - Tool configurations (Black, isort, mypy, pytest, coverage, Ruff) +- **.gitignore**: Comprehensive ignore patterns for Python, IDEs, databases, 
environment files +- **README.md**: Complete project documentation with installation, usage, development workflow +- **uv.lock**: Dependency lockfile for reproducible environments + +### Documentation Files Created +- **CLAUDE.md**: Main project coordination document +- **/docs/standards/README.md**: Standards directory overview +- **/docs/standards/versioning.md**: Semantic versioning v2.0.0 standard +- **/docs/standards/git.md**: Trunk-based development workflow +- **/docs/standards/testing.md**: Testing strategy with 80% minimum coverage requirement +- **/docs/standards/coding.md**: Python coding standards and conventions +- **/docs/standards/development-environment.md**: uv-based environment management workflow +- **/docs/decisions/ADR-001-python-framework-selection.md**: FastAPI framework decision +- **/docs/decisions/ADR-002-uv-environment-management.md**: uv tooling decision +- **/.claude/agents/architect.md**: Architect agent role definition +- **/.claude/agents/developer.md**: Developer agent role definition + +### Development Environment Setup +- Virtual environment created at `.venv/` using uv +- All production dependencies installed via uv +- All development and test dependencies installed via uv +- Package installed in editable mode for development + +### Version Control Setup +- Git repository initialized +- Two commits created: + 1. "chore: initialize gondulf project structure" + 2. "fix(config): update ruff configuration to modern format" +- Working tree clean with all files committed + +## How It Was Implemented + +### Approach +1. **Standards-First**: Created comprehensive standards documentation before any code +2. **ADR Documentation**: Documented key architectural decisions (FastAPI, uv) as ADRs +3. **Environment Configuration**: Set up uv-based development environment with direct execution model +4. **Dependency Management**: Configured pyproject.toml with appropriate dependency groups +5. 
**Tool Configuration**: Configured all development tools (linting, type checking, testing, formatting) +6. **Git Workflow**: Initialized repository following trunk-based development standard +7. **Documentation**: Created comprehensive README and development environment guide + +### Implementation Order +1. Created directory structure following project standards +2. Wrote all standards documentation in `/docs/standards/` +3. Created Architecture Decision Records for key technology choices +4. Created project configuration in `pyproject.toml` +5. Set up `.gitignore` with comprehensive ignore patterns +6. Initialized virtual environment with `uv venv` +7. Installed all dependencies using `uv pip install -e ".[dev,test]"` +8. Created comprehensive README.md +9. Initialized git repository and made initial commits +10. Created agent role definitions + +### Key Configuration Decisions +- **Python Version**: 3.10+ minimum (aligns with FastAPI requirements and modern type hints) +- **Line Length**: 88 characters (Black default, applied consistently across all tools) +- **Test Organization**: Markers for unit/integration/e2e tests +- **Coverage Target**: 80% minimum, 90% for new code +- **Async Support**: pytest-asyncio configured with auto mode +- **Type Checking**: Strict mypy configuration with untyped definitions disallowed + +### Deviations from Design +No deviations from standards. All implementation followed the documented standards in `/docs/standards/` exactly as specified. + +## Issues Encountered + +### Challenges +1. **Ruff Configuration Format**: Initial pyproject.toml used older Ruff configuration format + - **Resolution**: Updated to modern `[tool.ruff.lint]` format with `select` and `ignore` arrays + - **Impact**: Required a second git commit to fix the configuration + - **Status**: Resolved successfully + +### Unexpected Discoveries +1. 
**uv Lockfile**: uv automatically created a comprehensive lockfile (uv.lock, 262KB) ensuring reproducible installations + - This is a positive feature that enhances reproducibility + - No action needed, lockfile committed to repository + +No significant issues encountered. Setup process was straightforward. + +## Test Results + +### Setup Verification Tests + +```bash +# Package import test +$ uv run python -c "import gondulf; print('Package import successful')" +Package import successful + +# Pytest installation verification +$ uv run pytest --version +pytest 9.0.1 + +# Ruff installation verification +$ uv run ruff --version +ruff 0.14.5 + +# Mypy installation verification +$ uv run mypy --version +mypy 1.18.2 (compiled: yes) +``` + +### Verification Results +- Package successfully importable: PASS +- Test framework installed: PASS +- Linting tool installed: PASS +- Type checking tool installed: PASS +- Virtual environment functional: PASS +- Direct execution model working: PASS + +### Test Coverage +Not applicable for project initialization. No application code yet to test. + +### Functional Tests Performed +1. **Package Import**: Verified gondulf package can be imported in Python +2. **Tool Availability**: Verified all development tools are accessible via `uv run` +3. **Git Status**: Verified repository is clean with all files committed +4. **Environment Isolation**: Verified virtual environment is properly isolated + +All verification tests passed successfully. + +## Technical Debt Created + +No technical debt identified. The project initialization follows all standards and best practices. + +### Future Considerations (Not Debt) +1. **CI/CD Pipeline**: Not yet implemented (not required for initialization phase) + - Should be added when first features are implemented + - GitHub Actions workflow to run tests, linting, type checking + +2. 
**Pre-commit Hooks**: Not yet configured (optional enhancement) + - Could add pre-commit hooks for automatic linting/formatting + - Would ensure code quality before commits + +These are future enhancements, not technical debt from this implementation. + +## Next Steps + +### Immediate Next Steps +1. **Architect Review**: This implementation report requires Architect review and approval +2. **Architecture Phase**: Once approved, Architect should proceed with Phase 1 (Architecture & Standards): + - Review W3C IndieAuth specification + - Create `/docs/architecture/overview.md` + - Create `/docs/architecture/indieauth-protocol.md` + - Create `/docs/architecture/security.md` + - Create initial feature backlog in `/docs/roadmap/backlog.md` + - Create first version plan + +### Project State +- Project structure: COMPLETE +- Development environment: COMPLETE +- Standards documentation: COMPLETE +- Version control: COMPLETE +- Ready for architecture phase: YES + +### Dependencies +No blockers. The Architect can begin the architecture and planning phase immediately upon approval of this implementation report. + +## Sign-off + +**Implementation status**: Complete + +**Ready for Architect review**: Yes + +**Environment verification**: All tools functional and verified + +**Git status**: Clean working tree, all files committed + +**Standards compliance**: Full compliance with all project standards diff --git a/docs/roadmap/backlog.md b/docs/roadmap/backlog.md new file mode 100644 index 0000000..9386463 --- /dev/null +++ b/docs/roadmap/backlog.md @@ -0,0 +1,711 @@ +# Feature Backlog + +This document tracks all planned features for Gondulf, sized using t-shirt sizes based on estimated implementation effort. 
+ +**T-shirt sizes**: +- **XS (Extra Small)**: < 1 day of implementation +- **S (Small)**: 1-2 days of implementation +- **M (Medium)**: 3-5 days of implementation +- **L (Large)**: 1-2 weeks of implementation +- **XL (Extra Large)**: 2+ weeks (should be broken down) + +**Priority levels**: +- **P0**: Required for v1.0.0 (MVP blocker) +- **P1**: High priority for post-v1.0.0 +- **P2**: Medium priority, nice to have +- **P3**: Low priority, future consideration + +## v1.0.0 MVP Features (P0) + +These features are REQUIRED for the first production-ready release. + +### Core Infrastructure (M) +**What**: Basic FastAPI application structure, configuration management, error handling. + +**Includes**: +- FastAPI app initialization +- Environment-based configuration (Pydantic Settings) +- Logging setup (structured logging) +- Error handling middleware +- Security headers middleware +- Health check endpoint + +**Dependencies**: None + +**Acceptance Criteria**: +- Application starts successfully +- Configuration loads from environment +- Logging outputs structured JSON +- /health endpoint returns 200 OK +- Security headers present on all responses + +**Effort**: 3-5 days + +--- + +### Database Schema & Storage Layer (S) +**What**: SQLite schema definition and SQLAlchemy Core setup. + +**Includes**: +- SQLAlchemy Core connection setup +- Schema definition (tokens, domains tables) +- Migration approach (simple SQL files for v1.0.0) +- Connection pooling +- Database initialization script + +**Dependencies**: Core Infrastructure + +**Acceptance Criteria**: +- Database initializes on first run +- Tables created correctly +- SQLAlchemy Core queries work +- File permissions set correctly (600) + +**Effort**: 1-2 days + +--- + +### In-Memory Storage (XS) +**What**: TTL-based in-memory storage for authorization codes and email verification codes. 
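A minimal sketch of the TTL-based storage idea described here, under stated assumptions: the class name `TTLCodeStore`, its method names, and the injectable `clock` parameter are hypothetical and not taken from the Gondulf codebase. It keeps codes in a plain dict, purges expired entries on every access, and guards all operations with a lock.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCodeStore:
    """Hypothetical dict-backed store for short-lived codes (illustration only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, key: str, value: str) -> None:
        """Store a value that expires ttl_seconds from now."""
        with self._lock:
            self._purge_expired()
            self._entries[key] = (value, self._clock() + self._ttl)

    def pop(self, key: str) -> Optional[str]:
        """Return and remove the value (single use); None if missing or expired."""
        with self._lock:
            self._purge_expired()
            entry = self._entries.pop(key, None)
            return entry[0] if entry is not None else None

    def _purge_expired(self) -> None:
        # Caller must hold the lock. Removes every entry whose deadline has passed.
        now = self._clock()
        expired = [k for k, (_, deadline) in self._entries.items() if deadline <= now]
        for k in expired:
            del self._entries[k]
```

Purging on access keeps memory bounded by the number of live codes without a background thread. Because the interface is only `put` and `pop`, a Redis-backed implementation could plausibly replace it later without changing callers.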
+ +**Includes**: +- Python dict-based storage with expiration +- Automatic cleanup of expired entries +- Thread-safe operations (if needed) +- Storage interface abstraction (for future Redis migration) + +**Dependencies**: Core Infrastructure + +**Acceptance Criteria**: +- Codes expire after configured TTL +- Expired codes automatically removed +- Thread-safe operations +- Memory usage bounded + +**Effort**: < 1 day + +--- + +### Email Service (S) +**What**: SMTP-based email sending for verification codes. + +**Includes**: +- SMTP configuration (host, port, credentials) +- Email template rendering +- Verification code email generation +- Error handling (connection failures, send failures) +- TLS/STARTTLS support + +**Dependencies**: Core Infrastructure + +**Acceptance Criteria**: +- Emails sent successfully via configured SMTP +- Templates render correctly +- Errors logged appropriately +- TLS connection established + +**Effort**: 1-2 days + +--- + +### DNS Service (S) +**What**: DNS TXT record verification for domain ownership. + +**Includes**: +- DNS query implementation (using dnspython) +- TXT record validation logic +- Multi-resolver consensus (Google + Cloudflare) +- Timeout handling +- Result caching in database + +**Dependencies**: Database Schema + +**Acceptance Criteria**: +- TXT records verified correctly +- Multiple resolvers queried +- Timeouts handled gracefully +- Results cached in database + +**Effort**: 1-2 days + +--- + +### Domain Service (M) +**What**: Domain ownership validation and management. 
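Ownership checks and caching generally depend on comparing domains in one canonical form. The sketch below shows one way normalization might look; the function name `normalize_domain` and the exact rules (lowercasing, dropping port, path, and trailing dot) are assumptions for illustration, not the project's specified behaviour.

```python
from urllib.parse import urlsplit


def normalize_domain(me: str) -> str:
    """Reduce a user-supplied `me` value to a bare lowercase hostname (hypothetical helper)."""
    candidate = me.strip()
    # urlsplit only fills in the hostname when a scheme separator is present.
    if "://" not in candidate:
        candidate = "https://" + candidate
    host = urlsplit(candidate).hostname  # urllib lowercases the hostname
    if not host:
        raise ValueError(f"no hostname found in {me!r}")
    # A fully qualified name may carry a trailing dot; drop it for comparison.
    return host.rstrip(".")
```

With these rules, `https://Example.COM/` and `example.com.` both map to the same cache key, `example.com`.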
+ +**Includes**: +- Domain normalization +- TXT record verification flow +- Email verification flow (fallback) +- Domain ownership caching +- Periodic re-verification (background task) + +**Dependencies**: Email Service, DNS Service, Database Schema + +**Acceptance Criteria**: +- Both verification methods work +- TXT record preferred over email +- Verification results cached +- Re-verification scheduled correctly + +**Effort**: 3-5 days + +--- + +### Authorization Endpoint (M) +**What**: `/authorize` endpoint implementing IndieAuth authorization flow. + +**Includes**: +- Request parameter validation (me, client_id, redirect_uri, state, response_type) +- Client metadata fetching (h-app microformat parsing) +- URL validation (open redirect prevention) +- User consent form rendering +- Authorization code generation +- Redirect to client with code + state + +**Dependencies**: Domain Service, In-Memory Storage + +**Acceptance Criteria**: +- All parameters validated per spec +- Client metadata fetched and displayed +- User consent required +- Authorization codes generated securely +- Redirects work correctly +- Errors handled per OAuth 2.0 spec + +**Effort**: 3-5 days + +--- + +### Token Endpoint (S) +**What**: `/token` endpoint implementing token exchange. + +**Includes**: +- Request parameter validation (grant_type, code, client_id, redirect_uri, me) +- Authorization code verification +- Single-use code enforcement +- Access token generation +- Token storage (hashed) +- JSON response formatting + +**Dependencies**: Authorization Endpoint, Database Schema + +**Acceptance Criteria**: +- All parameters validated +- Codes verified correctly +- Single-use enforced (replay prevention) +- Tokens generated securely +- Tokens stored as hashes +- Response format per spec + +**Effort**: 1-2 days + +--- + +### Metadata Endpoint (XS) +**What**: `/.well-known/oauth-authorization-server` discovery endpoint. 
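A discovery document of this kind could be assembled roughly as follows. This is a sketch, not the project's implementation: the helper name `build_metadata` is hypothetical, the `/authorize` and `/token` paths come from the endpoint descriptions in this backlog, and the field names are the standard RFC 8414 metadata keys.

```python
import json


def build_metadata(base_url: str) -> dict:
    """Build a static RFC 8414 style metadata document (hypothetical helper)."""
    base = base_url.rstrip("/")
    return {
        "issuer": base_url,
        "authorization_endpoint": f"{base}/authorize",
        "token_endpoint": f"{base}/token",
        # v1.0.0 only plans the authorization code flow.
        "response_types_supported": ["code"],
        "grant_types_supported": ["authorization_code"],
    }


if __name__ == "__main__":
    print(json.dumps(build_metadata("https://auth.example.com"), indent=2))
```

Since the document depends only on the configured base URL, it can be built once at startup and served with a `Cache-Control` header.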
+ +**Includes**: +- Static JSON response +- Endpoint URLs +- Supported features list +- Caching headers + +**Dependencies**: Core Infrastructure + +**Acceptance Criteria**: +- Returns valid JSON per RFC 8414 +- Correct endpoint URLs +- Cache-Control headers set + +**Effort**: < 1 day + +--- + +### Email Verification UI (S) +**What**: Web forms for email verification flow. + +**Includes**: +- Email address input form +- Verification code input form +- Error message display +- Success/failure feedback +- Basic styling (minimal, functional) + +**Dependencies**: Email Service, Domain Service + +**Acceptance Criteria**: +- Forms render correctly +- Client-side validation +- Error messages clear +- Accessible (WCAG AA) + +**Effort**: 1-2 days + +--- + +### Authorization Consent UI (S) +**What**: User consent screen for authorization. + +**Includes**: +- Client information display (name, icon, URL) +- Domain identity display (me parameter) +- Approve/Deny buttons +- Security warnings (if redirect_uri differs) +- Basic styling (minimal, functional) + +**Dependencies**: Authorization Endpoint + +**Acceptance Criteria**: +- Client info displayed correctly +- User can approve/deny +- Security warnings shown when appropriate +- Accessible (WCAG AA) + +**Effort**: 1-2 days + +--- + +### Security Hardening (S) +**What**: Implementation of all v1.0.0 security requirements. + +**Includes**: +- HTTPS enforcement (production) +- Security headers (HSTS, CSP, etc.) +- Constant-time token comparison +- Input sanitization +- SQL injection prevention (parameterized queries) +- Logging security (no PII) + +**Dependencies**: All endpoints + +**Acceptance Criteria**: +- HTTPS enforced in production +- All security headers present +- No timing attack vulnerabilities +- No SQL injection vulnerabilities +- Logs contain no PII + +**Effort**: 1-2 days + +--- + +### Deployment Configuration (S) +**What**: Docker setup and deployment documentation. 
+ +**Includes**: +- Dockerfile (multi-stage build) +- docker-compose.yml (for testing) +- Environment variable documentation +- Backup script (SQLite file copy) +- Health check configuration + +**Dependencies**: All features + +**Acceptance Criteria**: +- Docker image builds successfully +- Container runs properly +- Environment variables documented +- Backup script works +- Health checks pass + +**Effort**: 1-2 days + +--- + +### Comprehensive Test Suite (L) +**What**: 80%+ code coverage with unit, integration, and e2e tests. + +**Includes**: +- Unit tests for all services +- Integration tests for endpoints +- End-to-end IndieAuth flow tests +- Security tests (timing attacks, injection, etc.) +- Compliance tests (W3C spec verification) + +**Dependencies**: All features + +**Acceptance Criteria**: +- 80%+ overall coverage +- 95%+ coverage for auth/token/security code +- All tests passing +- Fast execution (< 1 minute for unit tests) + +**Effort**: 1-2 weeks (parallel with development) + +--- + +## Post-v1.0.0 Features + +### PKCE Support (S) +**Priority**: P1 +**Dependencies**: Token Endpoint + +**What**: Implement Proof Key for Code Exchange (RFC 7636) for enhanced security. + +**Includes**: +- Accept `code_challenge` and `code_challenge_method` in /authorize +- Validate `code_verifier` in /token +- Support S256 challenge method +- Update metadata endpoint + +**Effort**: 1-2 days + +**Rationale**: Deferred from v1.0.0 per ADR-003 for MVP simplicity. Should be added in v1.1.0. + +--- + +### Token Revocation (S) +**Priority**: P1 +**Dependencies**: Token Endpoint + +**What**: `/token/revoke` endpoint per RFC 7009. + +**Includes**: +- Revocation endpoint implementation +- Mark tokens as revoked in database +- Return appropriate responses +- Update metadata endpoint + +**Effort**: 1-2 days + +--- + +### Token Refresh (M) +**Priority**: P1 +**Dependencies**: Token Endpoint + +**What**: Refresh token support for long-lived sessions. 
+ +**Includes**: +- Refresh token generation and storage +- `refresh_token` grant type support +- Rotation of refresh tokens (security best practice) +- Expiration management + +**Effort**: 3-5 days + +--- + +### Rate Limiting (M) +**Priority**: P1 +**Dependencies**: Core Infrastructure + +**What**: Request rate limiting to prevent abuse. + +**Includes**: +- Redis-based rate limiting +- Per-endpoint limits +- Per-IP and per-client_id limits +- Exponential backoff on failures +- Rate limit headers (X-RateLimit-*) + +**Effort**: 3-5 days + +**Note**: Requires Redis, breaking single-process assumption. + +--- + +### Admin Dashboard (L) +**Priority**: P2 +**Dependencies**: All v1.0.0 features + +**What**: Web-based admin interface for server management. + +**Includes**: +- Active tokens view +- Domain verification status +- Revoke tokens manually +- View audit logs +- Configuration management + +**Effort**: 1-2 weeks + +--- + +### Client Pre-Registration (M) +**Priority**: P2 +**Dependencies**: Authorization Endpoint + +**What**: Allow admin to pre-register known clients. + +**Includes**: +- Client registration UI (admin-only) +- Store registered clients in database +- Skip metadata fetching for registered clients +- Manage redirect URIs per client + +**Effort**: 3-5 days + +**Note**: Not required per spec, but useful for trusted clients. + +--- + +### Token Introspection (S) +**Priority**: P1 +**Dependencies**: Token Endpoint + +**What**: `/token/verify` endpoint for resource servers. + +**Includes**: +- Verify token validity +- Return token metadata (me, client_id, scope) +- Support Bearer authentication +- Rate limiting + +**Effort**: 1-2 days + +--- + +### Scope Support (Authorization) (L) +**Priority**: P1 +**Dependencies**: Token Endpoint, Token Introspection + +**What**: Full OAuth 2.0 scope-based authorization. 
+ +**Includes**: +- Scope validation and parsing +- Scope consent UI (checkboxes) +- Token scope storage and verification +- Scope-based access control +- Standard scopes (profile, email, create, update, delete) + +**Effort**: 1-2 weeks + +**Note**: Major feature that expands the server from authentication to authorization. + +--- + +### GitHub/GitLab Providers (M) +**Priority**: P2 +**Dependencies**: Domain Service + +**What**: Alternative authentication via GitHub/GitLab (like IndieLogin). + +**Includes**: +- OAuth 2.0 client for GitHub/GitLab +- Link GitHub/GitLab username to domain (via profile URL) +- Verify domain ownership via GitHub/GitLab profile +- Provider selection UI + +**Effort**: 3-5 days + +**Note**: Per user request, v1.0.0 is email-only. This is a future enhancement. + +--- + +### WebAuthn Support (L) +**Priority**: P2 +**Dependencies**: Domain Service + +**What**: Passwordless authentication via WebAuthn (FIDO2). + +**Includes**: +- WebAuthn registration flow +- WebAuthn authentication flow +- Credential storage +- Browser compatibility +- Fallback to email + +**Effort**: 1-2 weeks + +--- + +### PostgreSQL Support (S) +**Priority**: P2 +**Dependencies**: Database Schema + +**What**: Support PostgreSQL as an alternative to SQLite. + +**Includes**: +- Connection configuration +- Schema adaptation (minimal changes) +- Migration from SQLite +- Documentation + +**Effort**: 1-2 days + +**Note**: SQLAlchemy Core abstracts most dialect differences, so the required changes should be small. + +--- + +### Prometheus Metrics (S) +**Priority**: P2 +**Dependencies**: Core Infrastructure + +**What**: `/metrics` endpoint for Prometheus scraping. + +**Includes**: +- Request counters (by endpoint, status) +- Response time histograms +- Token generation rate +- Email send success rate +- Error rate by type + +**Effort**: 1-2 days + +--- + +### Internationalization (M) +**Priority**: P3 +**Dependencies**: UI components + +**What**: Multi-language support for user-facing pages.
+ +**Includes**: +- i18n framework (Babel) +- English (default) +- Extract translatable strings +- Translation workflow + +**Effort**: 3-5 days + +**Note**: Low priority, English-first acceptable for MVP. + +--- + +## Technical Debt + +Technical debt items are tracked here with a DEBT: prefix. Per project standards, each release must allocate at least 10% of effort to technical debt reduction. + +### DEBT: TD-001 - FastAPI Lifespan Migration (XS) +**Created**: 2025-11-20 (Phase 1 review) +**Priority**: P2 +**Component**: Core Infrastructure + +**Issue**: Using deprecated `@app.on_event()` decorators instead of lifespan context manager. + +**Impact**: +- Deprecation warnings in FastAPI 0.109+ +- Will break in future FastAPI version +- Not following current best practices + +**Current Mitigation**: Still works in current FastAPI version. + +**Effort to Fix**: < 1 day +- Replace `@app.on_event("startup")` with lifespan context manager +- Update database initialization to use lifespan +- Update tests if needed + +**Plan**: Address in v1.1.0 or during FastAPI upgrade. + +**References**: FastAPI lifespan documentation + +--- + +### DEBT: TD-002 - Database Migration Rollback Safety (S) +**Created**: 2025-11-20 (Phase 1 review) +**Priority**: P2 +**Component**: Database Layer + +**Issue**: No migration rollback capability. Migrations are one-way only. + +**Impact**: +- Cannot easily roll back schema changes +- Requires manual SQL to undo migrations +- Risk during production deployments + +**Current Mitigation**: Simple schema, manual SQL backups acceptable for v1.0.0. + +**Effort to Fix**: 1-2 days +- Integrate Alembic for migration management +- Create rollback scripts for existing migrations +- Update deployment documentation + +**Plan**: Address before v1.1.0 when schema changes become more frequent. 
+ +**References**: Alembic documentation + +--- + +### DEBT: TD-003 - Async Email Support (S) +**Created**: 2025-11-20 (Phase 1 review) +**Priority**: P2 +**Component**: Email Service + +**Issue**: Synchronous SMTP blocks the request thread during email sending. + +**Impact**: +- Email sending delays response to user (1-5 seconds) +- Thread blocked during SMTP operation +- Poor UX during slow email delivery + +**Current Mitigation**: Acceptable for low-volume v1.0.0. Timeout limits (10s) prevent long blocks. + +**Effort to Fix**: 1-2 days +- Implement background task queue (FastAPI BackgroundTasks or Celery) +- Make email sending non-blocking +- Update UX to show "Sending email..." message +- Add retry logic for failed sends + +**Plan**: Address in v1.1.0 when user volume increases or when UX feedback indicates an issue. + +**Alternative**: Use async SMTP library (aiosmtplib) + +--- + +### DEBT: TD-004 - Add Redis for Session Storage (M) +**Created**: 2025-11-20 (architectural decision) +**Priority**: P2 +**Component**: Storage Layer + +**Issue**: In-memory storage doesn't survive restarts. + +**Impact**: Authorization codes and email verification codes are lost on restart. + +**Current Mitigation**: Codes are short-lived (10-15 min), so the impact of a restart is minimal. + +**Effort to Fix**: 3-5 days (Redis integration, deployment changes) + +**Plan**: Address when scaling beyond a single process or when restarts become frequent. + +--- + +## Backlog Management + +### Adding New Features + +When adding features to the backlog: +1. Define clear scope and acceptance criteria +2. Assign t-shirt size +3. Assign priority (P0-P3) +4. Identify dependencies +5. Estimate effort in days +6. Add to the appropriate section + +### Prioritization Criteria + +Features are prioritized based on: +1. **MVP requirement**: Is it required for v1.0.0? +2. **Security impact**: Does it improve security? +3. **User value**: How much does it benefit users? +4. **Complexity**: Simpler features are prioritized when value is equal +5. 
**Dependencies**: Features blocking others prioritized + +### Technical Debt Policy + +- Minimum 10% effort per release allocated to technical debt +- Technical debt items must have: + - Creation date + - Issue description + - Current impact and mitigation + - Effort to fix + - Resolution plan +- Debt reviewed quarterly, re-prioritized based on impact + +## Version Planning + +See version-specific roadmap files: +- `/docs/roadmap/v1.0.0.md` - MVP features and plan +- `/docs/roadmap/v1.1.0.md` - First post-MVP release (future) +- `/docs/roadmap/v2.0.0.md` - Major feature release (future) + +## Estimation Accuracy + +After each feature implementation, review estimation accuracy: +- Compare actual effort vs. estimated +- Update t-shirt size if significantly different +- Document lessons learned +- Adjust future estimates accordingly + +Current estimation baseline: TBD (will be established after v1.0.0 completion) diff --git a/docs/roadmap/v1.0.0.md b/docs/roadmap/v1.0.0.md new file mode 100644 index 0000000..613f624 --- /dev/null +++ b/docs/roadmap/v1.0.0.md @@ -0,0 +1,593 @@ +# Version 1.0.0 Release Plan + +## Release Overview + +**Target Version**: 1.0.0 +**Release Type**: Initial MVP (Minimum Viable Product) +**Target Date**: TBD (6-8 weeks from project start) +**Status**: Planning + +## Release Goals + +### Primary Objective +Deliver a production-ready, W3C IndieAuth-compliant authentication server that: +1. Allows users to authenticate using their domain as their identity +2. Supports email-based domain ownership verification +3. Enables any compliant IndieAuth client to authenticate successfully +4. Operates securely in a Docker-based deployment +5. 
Supports 10s of users with room to scale + +### Success Criteria + +**Functional**: +- ✅ Complete IndieAuth authentication flow (authorization + token exchange) +- ✅ Email-based domain ownership verification +- ✅ DNS TXT record verification (preferred method) +- ✅ Secure token generation and storage +- ✅ Client metadata fetching (h-app microformat) + +**Quality**: +- ✅ 80%+ overall test coverage +- ✅ 95%+ coverage for authentication/token/security code +- ✅ All security best practices implemented +- ✅ Comprehensive documentation + +**Operational**: +- ✅ Docker deployment ready +- ✅ Simple SQLite backup strategy +- ✅ Health check endpoint +- ✅ Structured logging + +**Compliance**: +- ✅ W3C IndieAuth specification compliance +- ✅ OAuth 2.0 error responses +- ✅ Security headers and HTTPS enforcement + +## Feature Scope + +### Included Features (P0) + +All features listed below are REQUIRED for v1.0.0 release. + +| Feature | Size | Effort (days) | Dependencies | +|---------|------|---------------|--------------| +| Core Infrastructure | M | 3-5 | None | +| Database Schema & Storage Layer | S | 1-2 | Core Infrastructure | +| In-Memory Storage | XS | <1 | Core Infrastructure | +| Email Service | S | 1-2 | Core Infrastructure | +| DNS Service | S | 1-2 | Database Schema | +| Domain Service | M | 3-5 | Email, DNS, Database | +| Authorization Endpoint | M | 3-5 | Domain Service, In-Memory | +| Token Endpoint | S | 1-2 | Authorization Endpoint, Database | +| Metadata Endpoint | XS | <1 | Core Infrastructure | +| Email Verification UI | S | 1-2 | Email Service, Domain Service | +| Authorization Consent UI | S | 1-2 | Authorization Endpoint | +| Security Hardening | S | 1-2 | All endpoints | +| Deployment Configuration | S | 1-2 | All features | +| Comprehensive Test Suite | L | 10-14 | All features (parallel) | + +**Total Estimated Effort**: 32-44 days of development + testing + +### Explicitly Excluded Features + +These features are intentionally deferred to post-v1.0.0 
releases: + +**Excluded (for simplicity)**: +- ❌ PKCE support (planned for v1.1.0, see ADR-003) +- ❌ Token refresh (planned for v1.1.0) +- ❌ Token revocation (planned for v1.1.0) +- ❌ Scope-based authorization (planned for v1.2.0) +- ❌ Rate limiting (planned for v1.1.0) + +**Excluded (not needed for MVP)**: +- ❌ Admin dashboard (planned for v1.2.0) +- ❌ Client pre-registration (planned for v1.2.0) +- ❌ Alternative auth providers (GitHub/GitLab) (planned for v1.3.0) +- ❌ WebAuthn support (planned for v2.0.0) +- ❌ PostgreSQL support (planned for v1.2.0) +- ❌ Prometheus metrics (planned for v1.1.0) + +**Rationale**: Focus on core authentication functionality with minimal complexity. Additional features add value but increase risk and development time. MVP should prove the concept and gather user feedback. + +## Implementation Plan + +### Phase 1: Foundation (Week 1-2) + +**Goal**: Establish application foundation and core services. + +**Features**: +1. Core Infrastructure (M) - 3-5 days +2. Database Schema & Storage Layer (S) - 1-2 days +3. In-Memory Storage (XS) - <1 day +4. Email Service (S) - 1-2 days +5. DNS Service (S) - 1-2 days + +**Deliverables**: +- FastAPI application running +- Configuration management working +- SQLite database initialized +- Email sending functional +- DNS queries working +- Unit tests for all services (80%+ coverage) + +**Risks**: +- SMTP configuration issues (mitigation: test with real SMTP early) +- DNS query timeouts (mitigation: implement retries and fallback) + +**Exit Criteria**: +- All foundation services have passing unit tests +- Application starts without errors +- Health check endpoint returns 200 +- Email can be sent successfully +- DNS queries resolve correctly + +--- + +### Phase 2: Domain Verification (Week 2-3) + +**Goal**: Implement complete domain ownership verification flows. + +**Features**: +1. Domain Service (M) - 3-5 days +2. 
Email Verification UI (S) - 1-2 days + +**Deliverables**: +- TXT record verification working +- Email verification flow complete +- Domain ownership caching in database +- User-facing verification forms +- Integration tests for both verification methods + +**Risks**: +- Email delivery failures (mitigation: comprehensive error handling) +- DNS propagation delays (mitigation: cache results, allow retry) +- UI/UX complexity (mitigation: keep forms minimal) + +**Exit Criteria**: +- Both verification methods work end-to-end +- TXT record verification preferred when available +- Email fallback works when TXT record absent +- Verification results cached in database +- UI forms accessible and functional + +--- + +### Phase 3: IndieAuth Protocol (Week 3-5) + +**Goal**: Implement core IndieAuth endpoints (authorization and token). + +**Features**: +1. Authorization Endpoint (M) - 3-5 days +2. Token Endpoint (S) - 1-2 days +3. Metadata Endpoint (XS) - <1 day +4. Authorization Consent UI (S) - 1-2 days + +**Deliverables**: +- /authorize endpoint with full validation +- /token endpoint with code exchange +- /.well-known/oauth-authorization-server metadata +- Client metadata fetching (h-app) +- User consent screen +- OAuth 2.0 compliant error responses +- Integration tests for full auth flow + +**Risks**: +- Client metadata fetching failures (mitigation: timeouts, fallbacks) +- Open redirect vulnerabilities (mitigation: thorough URL validation) +- State parameter handling (mitigation: clear documentation, tests) + +**Exit Criteria**: +- Authorization flow completes successfully +- Tokens generated and validated +- Client metadata displayed correctly +- All parameter validation working +- Error responses compliant with OAuth 2.0 +- End-to-end tests pass + +--- + +### Phase 4: Security & Hardening (Week 5-6) + +**Goal**: Ensure all security requirements met and production-ready. + +**Features**: +1. Security Hardening (S) - 1-2 days +2. 
Security testing - 2-3 days + +**Deliverables**: +- HTTPS enforcement (production) +- Security headers on all responses +- Constant-time token comparison +- Input sanitization throughout +- SQL injection prevention verified +- No PII in logs +- Security test suite (timing attacks, injection, etc.) +- Security documentation review + +**Risks**: +- Undiscovered vulnerabilities (mitigation: comprehensive security testing) +- Performance impact of security measures (mitigation: benchmark) + +**Exit Criteria**: +- All security tests passing +- Security headers verified +- HTTPS enforced in production +- Timing attack tests pass +- SQL injection tests pass +- No sensitive data in logs +- External security review recommended (optional but encouraged) + +--- + +### Phase 5: Deployment & Testing (Week 6-8) + +**Goal**: Prepare for production deployment with comprehensive testing. + +**Features**: +1. Deployment Configuration (S) - 1-2 days +2. Comprehensive Test Suite (L) - ongoing +3. Documentation review and updates - 2-3 days +4. 
Integration testing with real clients - 2-3 days + +**Deliverables**: +- Dockerfile with multi-stage build +- docker-compose.yml for testing +- Backup script for SQLite +- Complete environment variable documentation +- 80%+ test coverage achieved +- All documentation reviewed and updated +- Tested with at least one real IndieAuth client +- Release notes prepared + +**Risks**: +- Docker build issues (mitigation: test early and often) +- Interoperability issues with clients (mitigation: test multiple clients) +- Documentation gaps (mitigation: external review) + +**Exit Criteria**: +- Docker image builds successfully +- Container runs in production-like environment +- All tests passing (unit, integration, e2e, security) +- Test coverage ≥80% overall, ≥95% for critical code +- Successfully authenticates with real IndieAuth client +- Documentation complete and accurate +- Release notes approved + +--- + +## Testing Strategy + +### Test Coverage Requirements + +**Overall**: 80% minimum coverage +**Critical Paths** (auth, token, security): 95% minimum coverage +**New Code**: 90% coverage required + +### Test Levels + +**Unit Tests** (70% of test suite): +- All services (Domain, Email, DNS, Auth, Token) +- All utility functions +- Input validation +- Error handling +- Fast execution (<1 minute total) + +**Integration Tests** (20% of test suite): +- Endpoint tests (FastAPI TestClient) +- Database operations +- Email sending (mocked SMTP) +- DNS queries (mocked resolver) +- Multi-component workflows + +**End-to-End Tests** (10% of test suite): +- Complete authentication flow +- Email verification flow +- TXT record verification flow +- Error scenarios +- OAuth 2.0 error responses + +**Security Tests**: +- Timing attack resistance (token verification) +- SQL injection prevention +- XSS prevention (HTML escaping) +- Open redirect prevention +- CSRF protection (state parameter) +- Input validation edge cases + +**Compliance Tests**: +- W3C IndieAuth specification adherence +- 
OAuth 2.0 error response format +- Required parameters validation +- Optional parameters handling + +### Test Execution + +**Local Development**: +```bash +# All tests +uv run pytest + +# With coverage +uv run pytest --cov=src/gondulf --cov-report=html --cov-report=term-missing + +# Specific test level +uv run pytest -m unit +uv run pytest -m integration +uv run pytest -m e2e +uv run pytest -m security +``` + +**CI/CD Pipeline**: +- Run on every commit to main +- Run on all pull requests +- Block merge if tests fail +- Block merge if coverage drops +- Generate coverage reports + +**Pre-release**: +- Full test suite execution +- Manual end-to-end testing +- Test with real IndieAuth clients +- Security scan (bandit, pip-audit) +- Performance baseline + +--- + +## Risk Assessment + +### High-Risk Areas + +**Email Delivery**: +- **Risk**: SMTP configuration issues or delivery failures +- **Impact**: Users cannot verify domain ownership +- **Mitigation**: + - Comprehensive error handling and logging + - Test with real SMTP early in development + - Provide clear error messages to users + - Support TXT record as primary verification method +- **Contingency**: Admin can manually verify domains if email fails + +**Security Vulnerabilities**: +- **Risk**: Security flaws in authentication/authorization logic +- **Impact**: Unauthorized access, data exposure +- **Mitigation**: + - Follow OAuth 2.0 security best practices + - Comprehensive security testing + - External security review (recommended) + - Conservative defaults +- **Contingency**: Rapid patch release if vulnerability found + +**Interoperability**: +- **Risk**: Incompatibility with IndieAuth clients +- **Impact**: Clients cannot authenticate +- **Mitigation**: + - Strict W3C spec compliance + - Test with multiple clients + - Reference implementation comparison +- **Contingency**: Fix and patch release + +### Medium-Risk Areas + +**Client Metadata Fetching**: +- **Risk**: Timeout or parse failures when fetching 
client_id +- **Impact**: Poor UX (generic client display) +- **Mitigation**: + - Aggressive timeouts (5 seconds) + - Fallback to domain name + - Cache successful fetches +- **Contingency**: Display warning, continue with basic info + +**DNS Resolution**: +- **Risk**: DNS query failures or timeouts +- **Impact**: TXT verification unavailable +- **Mitigation**: + - Multiple resolvers (Google + Cloudflare) + - Timeout handling + - Fallback to email verification +- **Contingency**: Email verification as alternative + +**Database Performance**: +- **Risk**: SQLite performance degrades with usage +- **Impact**: Slow response times +- **Mitigation**: + - Indexes on critical columns + - Periodic cleanup of expired tokens + - Benchmark under load +- **Contingency**: Migrate to PostgreSQL if needed (already supported by SQLAlchemy) + +### Low-Risk Areas + +**Deployment**: +- **Risk**: Docker issues or configuration errors +- **Impact**: Cannot deploy +- **Mitigation**: Test deployment early, document thoroughly + +**UI/UX**: +- **Risk**: Forms confusing or inaccessible +- **Impact**: User frustration +- **Mitigation**: Keep forms simple, test accessibility + +--- + +## Release Checklist + +### Pre-Release + +- [ ] All P0 features implemented +- [ ] All tests passing (unit, integration, e2e, security) +- [ ] Test coverage ≥80% overall, ≥95% critical paths +- [ ] Security scan completed (bandit, pip-audit) +- [ ] Documentation complete and reviewed +- [ ] Tested with real IndieAuth client(s) +- [ ] Docker image builds successfully +- [ ] Deployment tested in production-like environment +- [ ] Environment variables documented +- [ ] Backup/restore procedure tested +- [ ] Release notes drafted +- [ ] Version bumped to 1.0.0 in pyproject.toml + +### Security Review + +- [ ] HTTPS enforcement verified +- [ ] Security headers present +- [ ] No PII in logs +- [ ] Constant-time comparisons verified +- [ ] SQL injection tests pass +- [ ] Open redirect tests pass +- [ ] CSRF protection 
verified +- [ ] Timing attack tests pass +- [ ] Input validation comprehensive +- [ ] External security review recommended (optional) + +### Documentation Review + +- [ ] README.md accurate and complete +- [ ] /docs/architecture/ documents accurate +- [ ] /docs/standards/ documents followed +- [ ] Installation guide tested +- [ ] Configuration guide complete +- [ ] Deployment guide tested +- [ ] API documentation generated (OpenAPI) +- [ ] Troubleshooting guide created + +### Deployment Verification + +- [ ] Docker image tagged with v1.0.0 +- [ ] Docker image pushed to registry +- [ ] Test deployment successful +- [ ] Health check endpoint responds +- [ ] Logging working correctly +- [ ] Backup script functional +- [ ] Environment variables set correctly +- [ ] HTTPS certificate valid + +### Release Publication + +- [ ] Git tag created: v1.0.0 +- [ ] GitHub release created with notes +- [ ] Docker image published +- [ ] Documentation published +- [ ] Announcement prepared (optional) + +--- + +## Post-Release Activities + +### Monitoring (First Week) + +- Monitor logs for errors +- Track authentication success/failure rates +- Monitor email delivery success +- Monitor DNS query failures +- Monitor response times +- Collect user feedback + +### Support + +- Respond to bug reports within 24 hours +- Security issues: patch within 24-48 hours +- Feature requests: triage and add to backlog +- Documentation improvements: apply quickly + +### Retrospective (After 2 Weeks) + +- Review actual vs. 
estimated effort +- Document lessons learned +- Update estimation baseline +- Identify technical debt +- Plan v1.1.0 features + +--- + +## Version 1.1.0 Preview + +Tentative features for next release: + +**High Priority**: +- PKCE support (ADR-003 resolution) +- Token revocation endpoint +- Rate limiting (Redis-based) +- Token introspection endpoint + +**Medium Priority**: +- Token refresh +- Prometheus metrics +- Enhanced logging + +**Technical Debt**: +- Schema migrations (Alembic) +- Redis integration (if scaling needed) + +**Target**: 4-6 weeks after v1.0.0 release + +--- + +## Success Metrics + +### Release Success + +The v1.0.0 release is successful if: + +1. **Functional**: At least one real-world user successfully authenticates +2. **Quality**: No critical bugs reported in first week +3. **Security**: No security vulnerabilities reported in first month +4. **Operational**: Server runs stably for 1 week without restarts +5. **Compliance**: Successfully interoperates with ≥2 different IndieAuth clients + +### User Success + +Users are successful if: + +1. Can verify domain ownership (either method) in <5 minutes +2. Can complete authentication flow in <2 minutes +3. Understand what is happening at each step +4. Feel secure about the process +5. Experience no unexpected errors + +### Developer Success + +Development process is successful if: + +1. Actual effort within 20% of estimated effort +2. No major scope changes during development +3. Test coverage goals met +4. No cutting corners on security +5. 
Documentation kept up-to-date during development + +--- + +## Budget + +**Total Estimated Effort**: 32-44 days of development + 10-14 days of testing (parallel) + +**Breakdown**: +- Phase 1 (Foundation): 7-11 days +- Phase 2 (Domain Verification): 4-7 days +- Phase 3 (IndieAuth Protocol): 6-9 days +- Phase 4 (Security): 3-5 days +- Phase 5 (Deployment & Testing): 5-8 days +- Testing (parallel throughout): 10-14 days + +**Technical Debt Allocation**: 10% = 4-5 days +- Schema migration prep +- Redis integration groundwork +- Documentation improvements + +**Total Timeline**: 6-8 weeks (assuming 1 developer, ~5 days/week) + +--- + +## Approval + +This release plan requires review and approval by: + +- [x] Architect (design complete) +- [ ] Developer (feasibility confirmed) +- [ ] User (scope confirmed) + +Once approved, this plan becomes the binding contract for v1.0.0 development. + +**Approved by**: TBD +**Approval Date**: TBD +**Development Start Date**: TBD +**Target Release Date**: TBD diff --git a/pyproject.toml b/pyproject.toml index 5275f21..b2f47ba 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -26,6 +26,9 @@ dependencies = [ "pydantic>=2.0.0", "pydantic-settings>=2.0.0", "python-multipart>=0.0.6", + "python-dotenv>=1.0.0", + "dnspython>=2.4.0", + "aiosmtplib>=3.0.0", ] [project.optional-dependencies] diff --git a/src/gondulf/config.py b/src/gondulf/config.py new file mode 100644 index 0000000..22e8c37 --- /dev/null +++ b/src/gondulf/config.py @@ -0,0 +1,125 @@ +""" +Configuration management for Gondulf IndieAuth server. + +Loads configuration from environment variables with GONDULF_ prefix. +Validates required settings on startup and provides sensible defaults. 
+""" + +import os +from typing import Optional + +from dotenv import load_dotenv + +# Load environment variables from .env file if present +load_dotenv() + + +class ConfigurationError(Exception): + """Raised when configuration is invalid or missing required values.""" + + pass + + +class Config: + """Application configuration loaded from environment variables.""" + + # Required settings - no defaults + SECRET_KEY: str + + # Database + DATABASE_URL: str + + # SMTP Configuration + SMTP_HOST: str + SMTP_PORT: int + SMTP_USERNAME: Optional[str] + SMTP_PASSWORD: Optional[str] + SMTP_FROM: str + SMTP_USE_TLS: bool + + # Token and Code Expiry (seconds) + TOKEN_EXPIRY: int + CODE_EXPIRY: int + + # Logging + LOG_LEVEL: str + DEBUG: bool + + @classmethod + def load(cls) -> None: + """ + Load and validate configuration from environment variables. + + Raises: + ConfigurationError: If required settings are missing or invalid + """ + # Required - SECRET_KEY must exist and be sufficiently long + secret_key = os.getenv("GONDULF_SECRET_KEY") + if not secret_key: + raise ConfigurationError( + "GONDULF_SECRET_KEY is required. 
Generate with: " + "python -c \"import secrets; print(secrets.token_urlsafe(32))\"" + ) + if len(secret_key) < 32: + raise ConfigurationError( + "GONDULF_SECRET_KEY must be at least 32 characters for security" + ) + cls.SECRET_KEY = secret_key + + # Database - with sensible default + cls.DATABASE_URL = os.getenv( + "GONDULF_DATABASE_URL", "sqlite:///./data/gondulf.db" + ) + + # SMTP Configuration + cls.SMTP_HOST = os.getenv("GONDULF_SMTP_HOST", "localhost") + cls.SMTP_PORT = int(os.getenv("GONDULF_SMTP_PORT", "587")) + cls.SMTP_USERNAME = os.getenv("GONDULF_SMTP_USERNAME") or None + cls.SMTP_PASSWORD = os.getenv("GONDULF_SMTP_PASSWORD") or None + cls.SMTP_FROM = os.getenv("GONDULF_SMTP_FROM", "noreply@example.com") + cls.SMTP_USE_TLS = os.getenv("GONDULF_SMTP_USE_TLS", "true").lower() == "true" + + # Token and Code Expiry + cls.TOKEN_EXPIRY = int(os.getenv("GONDULF_TOKEN_EXPIRY", "3600")) + cls.CODE_EXPIRY = int(os.getenv("GONDULF_CODE_EXPIRY", "600")) + + # Logging + cls.DEBUG = os.getenv("GONDULF_DEBUG", "false").lower() == "true" + # If DEBUG is true, default LOG_LEVEL to DEBUG, otherwise INFO + default_log_level = "DEBUG" if cls.DEBUG else "INFO" + cls.LOG_LEVEL = os.getenv("GONDULF_LOG_LEVEL", default_log_level).upper() + + # Validate log level (sorted so the error message is deterministic) + valid_log_levels = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"} + if cls.LOG_LEVEL not in valid_log_levels: + raise ConfigurationError( + f"GONDULF_LOG_LEVEL must be one of: {', '.join(sorted(valid_log_levels))}" + ) + + @classmethod + def validate(cls) -> None: + """ + Validate configuration after loading. + + Performs additional validation beyond initial loading. 
+ """ + # Validate SMTP port is reasonable + if cls.SMTP_PORT < 1 or cls.SMTP_PORT > 65535: + raise ConfigurationError( + f"GONDULF_SMTP_PORT must be between 1 and 65535, got {cls.SMTP_PORT}" + ) + + # Validate expiry times are positive + if cls.TOKEN_EXPIRY <= 0: + raise ConfigurationError( + f"GONDULF_TOKEN_EXPIRY must be positive, got {cls.TOKEN_EXPIRY}" + ) + if cls.CODE_EXPIRY <= 0: + raise ConfigurationError( + f"GONDULF_CODE_EXPIRY must be positive, got {cls.CODE_EXPIRY}" + ) + + +# Configuration is loaded lazily or explicitly by the application +# Tests should call Config.load() explicitly in fixtures +# Production code should call Config.load() at startup diff --git a/src/gondulf/database/__init__.py b/src/gondulf/database/__init__.py new file mode 100644 index 0000000..62c6837 --- /dev/null +++ b/src/gondulf/database/__init__.py @@ -0,0 +1 @@ +"""Database module for Gondulf IndieAuth server.""" diff --git a/src/gondulf/database/connection.py b/src/gondulf/database/connection.py new file mode 100644 index 0000000..2fe09c2 --- /dev/null +++ b/src/gondulf/database/connection.py @@ -0,0 +1,226 @@ +""" +Database connection management and migrations for Gondulf. + +Provides database initialization, migration running, and health checks. +""" + +import logging +from pathlib import Path +from typing import Optional +from urllib.parse import urlparse + +from sqlalchemy import create_engine, text +from sqlalchemy.engine import Engine +from sqlalchemy.exc import SQLAlchemyError + +logger = logging.getLogger("gondulf.database") + + +class DatabaseError(Exception): + """Raised when database operations fail.""" + + pass + + +class Database: + """ + Database connection manager with migration support. + + Handles database initialization, migration execution, and health checks. + """ + + def __init__(self, database_url: str): + """ + Initialize database connection. 
+ + Args: + database_url: SQLAlchemy database URL (e.g., sqlite:///./data/gondulf.db) + """ + self.database_url = database_url + self._engine: Optional[Engine] = None + + def ensure_database_directory(self) -> None: + """ + Create database directory if it doesn't exist (for SQLite). + + Only applies to SQLite databases. Creates parent directory structure. + """ + if self.database_url.startswith("sqlite:///"): + # Parse path from URL + # sqlite:///./data/gondulf.db -> ./data/gondulf.db + # sqlite:////var/lib/gondulf/gondulf.db -> /var/lib/gondulf/gondulf.db + db_path_str = self.database_url.replace("sqlite:///", "", 1) + db_file = Path(db_path_str) + + # Create parent directory if needed + db_file.parent.mkdir(parents=True, exist_ok=True) + logger.info(f"Database directory ensured: {db_file.parent}") + + def get_engine(self) -> Engine: + """ + Get or create SQLAlchemy engine. + + Returns: + SQLAlchemy Engine instance + + Raises: + DatabaseError: If engine creation fails + """ + if self._engine is None: + try: + self._engine = create_engine( + self.database_url, + echo=False, # Don't log all SQL statements + pool_pre_ping=True, # Verify connections before using + ) + logger.debug(f"Created database engine for {self.database_url}") + except Exception as e: + raise DatabaseError(f"Failed to create database engine: {e}") from e + + return self._engine + + def check_health(self, timeout_seconds: int = 5) -> bool: + """ + Check if database is accessible and healthy. 
+ + Args: + timeout_seconds: Intended query timeout in seconds (not currently enforced) + + Returns: + True if database is healthy, False otherwise + """ + try: + engine = self.get_engine() + with engine.connect() as conn: + # Simple health check query + result = conn.execute(text("SELECT 1")) + result.fetchone() + logger.debug("Database health check passed") + return True + except Exception as e: + logger.warning(f"Database health check failed: {e}") + return False + + def get_applied_migrations(self) -> set[int]: + """ + Get set of applied migration versions. + + Returns: + Set of migration version numbers that have been applied + + Raises: + DatabaseError: If query fails + """ + try: + engine = self.get_engine() + with engine.connect() as conn: + # Query the migrations table; a failure means it does not exist yet + try: + result = conn.execute(text("SELECT version FROM migrations")) + versions = {row[0] for row in result} + logger.debug(f"Applied migrations: {versions}") + return versions + except SQLAlchemyError: + # Migrations table doesn't exist yet + logger.debug("Migrations table does not exist yet") + return set() + except Exception as e: + raise DatabaseError(f"Failed to query applied migrations: {e}") from e + + def run_migration(self, version: int, sql_file_path: Path) -> None: + """ + Run a single migration file. 
+ + Args: + version: Migration version number + sql_file_path: Path to SQL migration file + + Raises: + DatabaseError: If migration fails + """ + try: + logger.info(f"Running migration {version}: {sql_file_path.name}") + + # Read SQL file + sql_content = sql_file_path.read_text() + + # Execute migration in a transaction + engine = self.get_engine() + with engine.begin() as conn: + # Split by semicolons and execute each statement + # Note: This is simple splitting, doesn't handle semicolons in strings + statements = [s.strip() for s in sql_content.split(";") if s.strip()] + + for statement in statements: + if statement: + conn.execute(text(statement)) + + logger.info(f"Migration {version} completed successfully") + + except Exception as e: + raise DatabaseError(f"Migration {version} failed: {e}") from e + + def run_migrations(self) -> None: + """ + Run all pending database migrations. + + Discovers migration files in migrations/ directory and runs any that haven't + been applied yet. + + Raises: + DatabaseError: If migrations fail + """ + # Get migrations directory + migrations_dir = Path(__file__).parent / "migrations" + if not migrations_dir.exists(): + logger.warning(f"Migrations directory not found: {migrations_dir}") + return + + # Get applied migrations + applied = self.get_applied_migrations() + + # Find all migration files + migration_files = sorted(migrations_dir.glob("*.sql")) + + if not migration_files: + logger.info("No migration files found") + return + + # Run pending migrations in order + for migration_file in migration_files: + # Extract version number from filename (e.g., "001_initial_schema.sql" -> 1) + try: + version = int(migration_file.stem.split("_")[0]) + except (ValueError, IndexError): + logger.warning(f"Skipping invalid migration filename: {migration_file}") + continue + + if version not in applied: + self.run_migration(version, migration_file) + else: + logger.debug(f"Migration {version} already applied, skipping") + + logger.info("All 
migrations completed") + + def initialize(self) -> None: + """ + Initialize database: create directories and run migrations. + + This is the main entry point for setting up the database. + + Raises: + DatabaseError: If initialization fails + """ + logger.info("Initializing database") + + # Ensure database directory exists (for SQLite) + self.ensure_database_directory() + + # Run migrations + self.run_migrations() + + # Verify database is healthy + if not self.check_health(): + raise DatabaseError("Database health check failed after initialization") + + logger.info("Database initialization complete") diff --git a/src/gondulf/database/migrations/001_initial_schema.sql b/src/gondulf/database/migrations/001_initial_schema.sql new file mode 100644 index 0000000..fd28790 --- /dev/null +++ b/src/gondulf/database/migrations/001_initial_schema.sql @@ -0,0 +1,38 @@ +-- Migration 001: Initial schema for Gondulf v1.0.0 Phase 1 +-- Creates tables for authorization codes, domain verification, and migration tracking + +-- Authorization codes table +-- Stores temporary OAuth 2.0 authorization codes with PKCE support +CREATE TABLE authorization_codes ( + code TEXT PRIMARY KEY, + client_id TEXT NOT NULL, + redirect_uri TEXT NOT NULL, + state TEXT, + code_challenge TEXT, + code_challenge_method TEXT, + scope TEXT, + me TEXT NOT NULL, + created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP +); + +-- Domains table +-- Stores domain ownership verification records +CREATE TABLE domains ( + domain TEXT PRIMARY KEY, + email TEXT NOT NULL, + verification_code TEXT NOT NULL, + verified BOOLEAN NOT NULL DEFAULT FALSE, + created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + verified_at TIMESTAMP +); + +-- Migrations table +-- Tracks applied database migrations +CREATE TABLE migrations ( + version INTEGER PRIMARY KEY, + description TEXT NOT NULL, + applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP +); + +-- Record this migration +INSERT INTO migrations (version, description) VALUES 
(1, 'Initial schema - authorization_codes, domains, migrations tables'); diff --git a/src/gondulf/dns.py b/src/gondulf/dns.py new file mode 100644 index 0000000..03c9003 --- /dev/null +++ b/src/gondulf/dns.py @@ -0,0 +1,160 @@ +""" +DNS service for TXT record verification. + +Provides domain verification via DNS TXT records with system DNS resolver +and fallback to public DNS servers. +""" + +import logging +from typing import List, Optional + +import dns.resolver +from dns.exception import DNSException + +logger = logging.getLogger("gondulf.dns") + + +class DNSError(Exception): + """Raised when DNS queries fail.""" + + pass + + +class DNSService: + """ + DNS resolver service for TXT record verification. + + Uses system DNS with fallback to public DNS (Google and Cloudflare). + """ + + def __init__(self) -> None: + """Initialize DNS service with system resolver and public fallbacks.""" + self.resolver = self._create_resolver() + logger.debug("DNSService initialized with system resolver") + + def _create_resolver(self) -> dns.resolver.Resolver: + """ + Create DNS resolver with system DNS and public fallbacks. + + Returns: + Configured DNS resolver + """ + resolver = dns.resolver.Resolver() + + # System DNS is already configured by default + # If system DNS fails to load, use public DNS as fallback + if not resolver.nameservers: + logger.info("System DNS not available, using public DNS fallback") + resolver.nameservers = ["8.8.8.8", "1.1.1.1"] + else: + logger.debug(f"Using system DNS: {resolver.nameservers}") + + return resolver + + def get_txt_records(self, domain: str) -> List[str]: + """ + Query TXT records for a domain. 
+ + Args: + domain: Domain name to query + + Returns: + List of TXT record strings (decoded from bytes) + + Raises: + DNSError: If DNS query fails + """ + try: + logger.debug(f"Querying TXT records for domain={domain}") + answers = self.resolver.resolve(domain, "TXT") + + # Extract and decode TXT records + txt_records = [] + for rdata in answers: + # Each TXT record can have multiple strings, join them + txt_value = "".join([s.decode("utf-8") for s in rdata.strings]) + txt_records.append(txt_value) + + logger.info(f"Found {len(txt_records)} TXT record(s) for domain={domain}") + return txt_records + + except dns.resolver.NXDOMAIN: + logger.debug(f"Domain does not exist: {domain}") + raise DNSError(f"Domain does not exist: {domain}") + except dns.resolver.NoAnswer: + logger.debug(f"No TXT records found for domain={domain}") + return [] # No TXT records is not an error, return empty list + except dns.resolver.Timeout: + logger.warning(f"DNS query timeout for domain={domain}") + raise DNSError(f"DNS query timeout for domain: {domain}") + except DNSException as e: + logger.error(f"DNS query failed for domain={domain}: {e}") + raise DNSError(f"DNS query failed: {e}") from e + + def verify_txt_record(self, domain: str, expected_value: str) -> bool: + """ + Verify that domain has a TXT record with the expected value. 
+ + Args: + domain: Domain name to verify + expected_value: Expected TXT record value + + Returns: + True if expected value found in TXT records, False otherwise + """ + try: + txt_records = self.get_txt_records(domain) + + # Substring match: succeed if any TXT record contains the expected value + for record in txt_records: + if expected_value in record: + logger.info( + f"TXT record verification successful for domain={domain}" + ) + return True + + logger.debug( + f"TXT record verification failed: expected value not found " + f"for domain={domain}" + ) + return False + + except DNSError as e: + logger.warning(f"TXT record verification failed for domain={domain}: {e}") + return False + + def check_domain_exists(self, domain: str) -> bool: + """ + Check if a domain exists (has any DNS records). + + Args: + domain: Domain name to check + + Returns: + True if domain exists, False otherwise + """ + try: + # Try to resolve A or AAAA record + try: + self.resolver.resolve(domain, "A") + logger.debug(f"Domain exists (A record): {domain}") + return True + except dns.resolver.NoAnswer: + # Try AAAA if no A record + try: + self.resolver.resolve(domain, "AAAA") + logger.debug(f"Domain exists (AAAA record): {domain}") + return True + except dns.resolver.NoAnswer: + # No A or AAAA record, but NXDOMAIN was not raised, + # so the name exists with other record types + logger.debug(f"Domain exists (other records): {domain}") + return True + + except dns.resolver.NXDOMAIN: + logger.debug(f"Domain does not exist: {domain}") + return False + except DNSException as e: + logger.warning(f"DNS check failed for domain={domain}: {e}") + # Treat DNS errors as "unknown" - return False to be safe + return False diff --git a/src/gondulf/email.py b/src/gondulf/email.py new file mode 100644 index 0000000..db5cb39 --- /dev/null +++ b/src/gondulf/email.py @@ -0,0 +1,177 @@ +""" +Email service for sending verification codes via SMTP. + +Supports both STARTTLS (port 587) and implicit TLS (port 465) based on +configuration. 
Handles authentication and error cases. +""" + +import logging +import smtplib +from email.mime.multipart import MIMEMultipart +from email.mime.text import MIMEText +from typing import Optional + +logger = logging.getLogger("gondulf.email") + + +class EmailError(Exception): + """Raised when email sending fails.""" + + pass + + +class EmailService: + """ + SMTP email service for sending verification emails. + + Supports STARTTLS and implicit TLS configurations based on port number. + """ + + def __init__( + self, + smtp_host: str, + smtp_port: int, + smtp_from: str, + smtp_username: Optional[str] = None, + smtp_password: Optional[str] = None, + smtp_use_tls: bool = True, + ): + """ + Initialize email service. + + Args: + smtp_host: SMTP server hostname + smtp_port: SMTP server port (587 for STARTTLS, 465 for implicit TLS) + smtp_from: From address for sent emails + smtp_username: SMTP username for authentication (optional) + smtp_password: SMTP password for authentication (optional) + smtp_use_tls: Whether to use TLS (STARTTLS on port 587) + """ + self.smtp_host = smtp_host + self.smtp_port = smtp_port + self.smtp_from = smtp_from + self.smtp_username = smtp_username + self.smtp_password = smtp_password + self.smtp_use_tls = smtp_use_tls + + logger.debug( + f"EmailService initialized: host={smtp_host} port={smtp_port} " + f"tls={smtp_use_tls}" + ) + + def send_verification_code(self, to_email: str, code: str, domain: str) -> None: + """ + Send domain verification code via email. + + Args: + to_email: Recipient email address + code: Verification code to send + domain: Domain being verified + + Raises: + EmailError: If sending fails + """ + subject = f"Domain Verification Code for {domain}" + body = f""" +Hello, + +Your domain verification code for {domain} is: + + {code} + +This code will expire in 10 minutes. + +If you did not request this verification, please ignore this email. 
+ +--- +Gondulf IndieAuth Server +""" + + try: + self._send_email(to_email, subject, body) + logger.info(f"Verification code sent to {to_email} for domain={domain}") + except Exception as e: + logger.error(f"Failed to send verification email to {to_email}: {e}") + raise EmailError(f"Failed to send verification email: {e}") from e + + def _send_email(self, to_email: str, subject: str, body: str) -> None: + """ + Send email via SMTP. + + Handles STARTTLS vs implicit TLS based on port configuration. + + Args: + to_email: Recipient email address + subject: Email subject + body: Email body (plain text) + + Raises: + EmailError: If sending fails + """ + # Create message + msg = MIMEMultipart() + msg["From"] = self.smtp_from + msg["To"] = to_email + msg["Subject"] = subject + msg.attach(MIMEText(body, "plain")) + + try: + # Determine connection type based on port + if self.smtp_port == 465: + # Implicit TLS (SSL/TLS from start) + logger.debug("Using implicit TLS (SMTP_SSL)") + server = smtplib.SMTP_SSL(self.smtp_host, self.smtp_port, timeout=10) + elif self.smtp_port == 587 and self.smtp_use_tls: + # STARTTLS (upgrade plain connection to TLS) + logger.debug("Using STARTTLS") + server = smtplib.SMTP(self.smtp_host, self.smtp_port, timeout=10) + server.starttls() + else: + # Unencrypted (for testing only) + logger.warning("Using unencrypted SMTP connection") + server = smtplib.SMTP(self.smtp_host, self.smtp_port, timeout=10) + + # Authenticate if credentials provided + if self.smtp_username and self.smtp_password: + logger.debug(f"Authenticating as {self.smtp_username}") + server.login(self.smtp_username, self.smtp_password) + + # Send email + server.send_message(msg) + server.quit() + + logger.debug(f"Email sent successfully to {to_email}") + + except smtplib.SMTPAuthenticationError as e: + raise EmailError(f"SMTP authentication failed: {e}") from e + except smtplib.SMTPException as e: + raise EmailError(f"SMTP error: {e}") from e + except Exception as e: + raise 
EmailError(f"Failed to send email: {e}") from e + + def test_connection(self) -> bool: + """ + Test SMTP connection and authentication. + + Returns: + True if connection successful, False otherwise + """ + try: + if self.smtp_port == 465: + server = smtplib.SMTP_SSL(self.smtp_host, self.smtp_port, timeout=10) + elif self.smtp_port == 587 and self.smtp_use_tls: + server = smtplib.SMTP(self.smtp_host, self.smtp_port, timeout=10) + server.starttls() + else: + server = smtplib.SMTP(self.smtp_host, self.smtp_port, timeout=10) + + if self.smtp_username and self.smtp_password: + server.login(self.smtp_username, self.smtp_password) + + server.quit() + logger.info("SMTP connection test successful") + return True + + except Exception as e: + logger.warning(f"SMTP connection test failed: {e}") + return False diff --git a/src/gondulf/logging_config.py b/src/gondulf/logging_config.py new file mode 100644 index 0000000..ebbe98b --- /dev/null +++ b/src/gondulf/logging_config.py @@ -0,0 +1,57 @@ +""" +Logging configuration for Gondulf IndieAuth server. + +Provides structured logging with consistent format across all modules. +Uses Python's standard logging module with configurable levels. +""" + +import logging +import sys + + +def configure_logging(log_level: str = "INFO", debug: bool = False) -> None: + """ + Configure application logging. + + Sets up structured logging format and level for all Gondulf modules. + Logs to stdout/stderr for container-friendly output. 
+ + Args: + log_level: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) + debug: If True, overrides log_level to DEBUG + """ + # Determine effective log level + effective_level = "DEBUG" if debug else log_level + + # Configure root logger + logging.basicConfig( + level=effective_level, + format="%(asctime)s [%(levelname)s] %(name)s: %(message)s", + datefmt="%Y-%m-%d %H:%M:%S", + stream=sys.stdout, + force=True, # Override any existing configuration + ) + + # Set level for gondulf modules specifically + gondulf_logger = logging.getLogger("gondulf") + gondulf_logger.setLevel(effective_level) + + # Reduce noise from third-party libraries in production + if not debug: + logging.getLogger("urllib3").setLevel(logging.WARNING) + logging.getLogger("sqlalchemy").setLevel(logging.WARNING) + + logging.info(f"Logging configured: level={effective_level}") + + +def get_logger(name: str) -> logging.Logger: + """ + Get a logger instance for a module. + + Args: + name: Logger name (typically __name__ from calling module) + + Returns: + Configured logger instance + """ + return logging.getLogger(name) diff --git a/src/gondulf/main.py b/src/gondulf/main.py new file mode 100644 index 0000000..a60ebef --- /dev/null +++ b/src/gondulf/main.py @@ -0,0 +1,166 @@ +""" +Gondulf IndieAuth Server - Main application entry point. + +FastAPI application with health check endpoint and core service initialization. 
+""" + +import logging + +from fastapi import FastAPI +from fastapi.responses import JSONResponse + +from gondulf.config import Config +from gondulf.database.connection import Database +from gondulf.dns import DNSService +from gondulf.email import EmailService +from gondulf.logging_config import configure_logging +from gondulf.storage import CodeStore + +# Load configuration at application startup +Config.load() +Config.validate() + +# Configure logging +configure_logging(log_level=Config.LOG_LEVEL, debug=Config.DEBUG) +logger = logging.getLogger("gondulf.main") + +# Initialize FastAPI application +app = FastAPI( + title="Gondulf IndieAuth Server", + description="Self-hosted IndieAuth authentication server", + version="0.1.0-dev", +) + +# Initialize core services +database: Database = None +code_store: CodeStore = None +email_service: EmailService = None +dns_service: DNSService = None + + +@app.on_event("startup") +async def startup_event() -> None: + """ + Initialize application on startup. + + Initializes database, code storage, email service, and DNS service. 
+ """ + global database, code_store, email_service, dns_service + + logger.info("Starting Gondulf IndieAuth Server") + logger.info(f"Configuration: DATABASE_URL={Config.DATABASE_URL}") + logger.info(f"Configuration: SMTP_HOST={Config.SMTP_HOST}:{Config.SMTP_PORT}") + logger.info(f"Configuration: DEBUG={Config.DEBUG}") + + try: + # Initialize database + logger.info("Initializing database") + database = Database(Config.DATABASE_URL) + database.initialize() + logger.info("Database initialized successfully") + + # Initialize code store + logger.info("Initializing code store") + code_store = CodeStore(ttl_seconds=Config.CODE_EXPIRY) + logger.info(f"Code store initialized with TTL={Config.CODE_EXPIRY}s") + + # Initialize email service + logger.info("Initializing email service") + email_service = EmailService( + smtp_host=Config.SMTP_HOST, + smtp_port=Config.SMTP_PORT, + smtp_from=Config.SMTP_FROM, + smtp_username=Config.SMTP_USERNAME, + smtp_password=Config.SMTP_PASSWORD, + smtp_use_tls=Config.SMTP_USE_TLS, + ) + logger.info("Email service initialized") + + # Initialize DNS service + logger.info("Initializing DNS service") + dns_service = DNSService() + logger.info("DNS service initialized") + + logger.info("Gondulf startup complete") + + except Exception as e: + logger.critical(f"Failed to initialize application: {e}") + raise + + +@app.on_event("shutdown") +async def shutdown_event() -> None: + """Clean up resources on shutdown.""" + logger.info("Shutting down Gondulf IndieAuth Server") + + +@app.get("/health") +async def health_check() -> JSONResponse: + """ + Health check endpoint. + + Verifies that the application is running and database is accessible. + Does not require authentication. 
+ + Returns: + JSON response with health status: + - 200 OK: {"status": "healthy", "database": "connected"} + - 503 Service Unavailable: {"status": "unhealthy", "database": "error", "error": "..."} + """ + # Check database connectivity + if database is None: + logger.warning("Health check failed: database not initialized") + return JSONResponse( + status_code=503, + content={ + "status": "unhealthy", + "database": "error", + "error": "database not initialized", + }, + ) + + is_healthy = database.check_health(timeout_seconds=5) + + if is_healthy: + logger.debug("Health check passed") + return JSONResponse( + status_code=200, + content={"status": "healthy", "database": "connected"}, + ) + else: + logger.warning("Health check failed: unable to connect to database") + return JSONResponse( + status_code=503, + content={ + "status": "unhealthy", + "database": "error", + "error": "unable to connect to database", + }, + ) + + +@app.get("/") +async def root() -> dict: + """ + Root endpoint. + + Returns basic server information. + """ + return { + "service": "Gondulf IndieAuth Server", + "version": "0.1.0-dev", + "status": "operational", + } + + +# Entry point for uvicorn +if __name__ == "__main__": + import uvicorn + + uvicorn.run( + "gondulf.main:app", + host="0.0.0.0", + port=8000, + reload=Config.DEBUG, + log_level=Config.LOG_LEVEL.lower(), + ) diff --git a/src/gondulf/storage.py b/src/gondulf/storage.py new file mode 100644 index 0000000..75bd3f0 --- /dev/null +++ b/src/gondulf/storage.py @@ -0,0 +1,150 @@ +""" +In-memory storage for short-lived codes with TTL. + +Provides simple dict-based storage for email verification codes and authorization +codes with automatic expiration checking on access. +""" + +import logging +import time +from typing import Dict, Optional, Tuple + +logger = logging.getLogger("gondulf.storage") + + +class CodeStore: + """ + In-memory storage for domain verification codes with TTL. 
+ + Stores codes with expiration timestamps and automatically removes expired + codes on access. No background cleanup needed - cleanup happens lazily. + """ + + def __init__(self, ttl_seconds: int = 600): + """ + Initialize code store. + + Args: + ttl_seconds: Time-to-live for codes in seconds (default: 600 = 10 minutes) + """ + self._store: Dict[str, Tuple[str, float]] = {} + self._ttl = ttl_seconds + logger.debug(f"CodeStore initialized with TTL={ttl_seconds}s") + + def store(self, key: str, code: str) -> None: + """ + Store verification code with expiry timestamp. + + Args: + key: Storage key (typically email address or similar identifier) + code: Verification code to store + """ + expiry = time.time() + self._ttl + self._store[key] = (code, expiry) + logger.debug(f"Code stored for key={key} expires_in={self._ttl}s") + + def verify(self, key: str, code: str) -> bool: + """ + Verify code matches stored value and remove from store. + + Checks both expiration and code matching. If valid, removes the code + from storage (single-use). Expired codes are also removed. + + Args: + key: Storage key to verify + code: Code to verify + + Returns: + True if code matches and is not expired, False otherwise + """ + if key not in self._store: + logger.debug(f"Verification failed: key={key} not found") + return False + + stored_code, expiry = self._store[key] + + # Check expiration + if time.time() > expiry: + del self._store[key] + logger.debug(f"Verification failed: key={key} expired") + return False + + # Check code match + if code != stored_code: + logger.debug(f"Verification failed: key={key} code mismatch") + return False + + # Valid - remove from store (single use) + del self._store[key] + logger.info(f"Code verified successfully for key={key}") + return True + + def get(self, key: str) -> Optional[str]: + """ + Get code without removing it (for testing/debugging). + + Checks expiration and removes expired codes. 
+ + Args: + key: Storage key to retrieve + + Returns: + Code if exists and not expired, None otherwise + """ + if key not in self._store: + return None + + stored_code, expiry = self._store[key] + + # Check expiration + if time.time() > expiry: + del self._store[key] + return None + + return stored_code + + def delete(self, key: str) -> None: + """ + Explicitly delete a code from storage. + + Args: + key: Storage key to delete + """ + if key in self._store: + del self._store[key] + logger.debug(f"Code deleted for key={key}") + + def cleanup_expired(self) -> int: + """ + Manually cleanup all expired codes. + + This is optional - cleanup happens automatically on access. But can be + called periodically if needed to free memory. + + Returns: + Number of expired codes removed + """ + now = time.time() + expired_keys = [key for key, (_, expiry) in self._store.items() if now > expiry] + + for key in expired_keys: + del self._store[key] + + if expired_keys: + logger.debug(f"Cleaned up {len(expired_keys)} expired codes") + + return len(expired_keys) + + def size(self) -> int: + """ + Get number of codes currently in storage (including expired). + + Returns: + Number of codes in storage + """ + return len(self._store) + + def clear(self) -> None: + """Clear all codes from storage.""" + self._store.clear() + logger.debug("Code store cleared") diff --git a/tests/conftest.py b/tests/conftest.py new file mode 100644 index 0000000..712cde2 --- /dev/null +++ b/tests/conftest.py @@ -0,0 +1,20 @@ +""" +Pytest configuration and shared fixtures. +""" + +import pytest + + +@pytest.fixture(autouse=True) +def reset_config_before_test(monkeypatch): + """ + Reset configuration before each test. + + This prevents config from one test affecting another test. 
+ """ + # Clear all GONDULF_ environment variables + import os + + gondulf_vars = [key for key in os.environ.keys() if key.startswith("GONDULF_")] + for var in gondulf_vars: + monkeypatch.delenv(var, raising=False) diff --git a/tests/integration/__init__.py b/tests/integration/__init__.py new file mode 100644 index 0000000..c66cd71 --- /dev/null +++ b/tests/integration/__init__.py @@ -0,0 +1 @@ +"""Integration tests package.""" diff --git a/tests/integration/test_health.py b/tests/integration/test_health.py new file mode 100644 index 0000000..ee827e0 --- /dev/null +++ b/tests/integration/test_health.py @@ -0,0 +1,101 @@ +""" +Integration tests for health check endpoint. + +Tests the /health endpoint with actual FastAPI TestClient. +""" + +import tempfile +from pathlib import Path + +import pytest +from fastapi.testclient import TestClient + + +class TestHealthEndpoint: + """Integration tests for /health endpoint.""" + + @pytest.fixture + def test_app(self, monkeypatch): + """Create test FastAPI app with temporary database.""" + # Set up test environment + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + + # Set required environment variables + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.setenv("GONDULF_DATABASE_URL", f"sqlite:///{db_path}") + monkeypatch.setenv("GONDULF_DEBUG", "true") + + # Import app AFTER setting env vars + from gondulf.main import app + + yield app + + def test_health_check_success(self, test_app): + """Test health check returns 200 when database is healthy.""" + with TestClient(test_app) as client: + response = client.get("/health") + + assert response.status_code == 200 + data = response.json() + assert data["status"] == "healthy" + assert data["database"] == "connected" + + def test_health_check_response_format(self, test_app): + """Test health check response has correct format.""" + with TestClient(test_app) as client: + response = client.get("/health") + + assert response.status_code 
== 200 + data = response.json() + assert "status" in data + assert "database" in data + + def test_health_check_no_auth_required(self, test_app): + """Test health check endpoint doesn't require authentication.""" + with TestClient(test_app) as client: + # Should work without any authentication headers + response = client.get("/health") + + assert response.status_code == 200 + + def test_root_endpoint(self, test_app): + """Test root endpoint returns service information.""" + client = TestClient(test_app) + + response = client.get("/") + + assert response.status_code == 200 + data = response.json() + assert "service" in data + assert "version" in data + assert "Gondulf" in data["service"] + + +class TestHealthCheckUnhealthy: + """Tests for unhealthy database scenarios.""" + + def test_health_check_unhealthy_bad_database(self, monkeypatch): + """Test health check returns 503 when database inaccessible.""" + # Set up with non-existent database path + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.setenv( + "GONDULF_DATABASE_URL", "sqlite:////nonexistent/path/db.db" + ) + monkeypatch.setenv("GONDULF_DEBUG", "true") + + # Import app AFTER setting env vars; startup may fail, which is handled below + try: + from gondulf.main import app + + client = TestClient(app, raise_server_exceptions=False) + response = client.get("/health") + + # If startup succeeds but health check fails + assert response.status_code == 503 + data = response.json() + assert data["status"] == "unhealthy" + except AssertionError: + raise # Never mask a failed assertion as a startup failure + except Exception: + pass # Startup failure is also acceptable for this test diff --git a/tests/unit/__init__.py b/tests/unit/__init__.py new file mode 100644 index 0000000..ea3f8b9 --- /dev/null +++ b/tests/unit/__init__.py @@ -0,0 +1 @@ +"""Unit tests package.""" diff --git a/tests/unit/test_config.py b/tests/unit/test_config.py new file mode 100644 index 0000000..a96490a --- /dev/null +++ b/tests/unit/test_config.py @@ -0,0 +1,182 @@ +""" +Unit tests for
configuration module. + +Tests environment variable loading, validation, and error handling. +""" + +import os +import pytest + +from gondulf.config import Config, ConfigurationError + + +class TestConfigLoad: + """Tests for Config.load() method.""" + + def test_load_with_valid_secret_key(self, monkeypatch): + """Test configuration loads successfully with valid SECRET_KEY.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + Config.load() + assert Config.SECRET_KEY == "a" * 32 + + def test_load_missing_secret_key_raises_error(self, monkeypatch): + """Test that missing SECRET_KEY raises ConfigurationError.""" + monkeypatch.delenv("GONDULF_SECRET_KEY", raising=False) + with pytest.raises(ConfigurationError, match="GONDULF_SECRET_KEY is required"): + Config.load() + + def test_load_short_secret_key_raises_error(self, monkeypatch): + """Test that SECRET_KEY shorter than 32 chars raises error.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "short") + with pytest.raises(ConfigurationError, match="at least 32 characters"): + Config.load() + + def test_load_database_url_default(self, monkeypatch): + """Test DATABASE_URL defaults to sqlite:///./data/gondulf.db.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.delenv("GONDULF_DATABASE_URL", raising=False) + Config.load() + assert Config.DATABASE_URL == "sqlite:///./data/gondulf.db" + + def test_load_database_url_custom(self, monkeypatch): + """Test DATABASE_URL can be customized.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.setenv("GONDULF_DATABASE_URL", "sqlite:////tmp/test.db") + Config.load() + assert Config.DATABASE_URL == "sqlite:////tmp/test.db" + + def test_load_smtp_configuration_defaults(self, monkeypatch): + """Test SMTP configuration uses sensible defaults.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + for key in [ + "GONDULF_SMTP_HOST", + "GONDULF_SMTP_PORT", + "GONDULF_SMTP_USERNAME", + "GONDULF_SMTP_PASSWORD", + "GONDULF_SMTP_FROM", + 
"GONDULF_SMTP_USE_TLS", + ]: + monkeypatch.delenv(key, raising=False) + + Config.load() + + assert Config.SMTP_HOST == "localhost" + assert Config.SMTP_PORT == 587 + assert Config.SMTP_USERNAME is None + assert Config.SMTP_PASSWORD is None + assert Config.SMTP_FROM == "noreply@example.com" + assert Config.SMTP_USE_TLS is True + + def test_load_smtp_configuration_custom(self, monkeypatch): + """Test SMTP configuration can be customized.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.setenv("GONDULF_SMTP_HOST", "smtp.gmail.com") + monkeypatch.setenv("GONDULF_SMTP_PORT", "465") + monkeypatch.setenv("GONDULF_SMTP_USERNAME", "user@gmail.com") + monkeypatch.setenv("GONDULF_SMTP_PASSWORD", "password123") + monkeypatch.setenv("GONDULF_SMTP_FROM", "sender@example.com") + monkeypatch.setenv("GONDULF_SMTP_USE_TLS", "false") + + Config.load() + + assert Config.SMTP_HOST == "smtp.gmail.com" + assert Config.SMTP_PORT == 465 + assert Config.SMTP_USERNAME == "user@gmail.com" + assert Config.SMTP_PASSWORD == "password123" + assert Config.SMTP_FROM == "sender@example.com" + assert Config.SMTP_USE_TLS is False + + def test_load_token_expiry_default(self, monkeypatch): + """Test TOKEN_EXPIRY defaults to 3600 seconds.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.delenv("GONDULF_TOKEN_EXPIRY", raising=False) + Config.load() + assert Config.TOKEN_EXPIRY == 3600 + + def test_load_code_expiry_default(self, monkeypatch): + """Test CODE_EXPIRY defaults to 600 seconds.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.delenv("GONDULF_CODE_EXPIRY", raising=False) + Config.load() + assert Config.CODE_EXPIRY == 600 + + def test_load_token_expiry_custom(self, monkeypatch): + """Test TOKEN_EXPIRY can be customized.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.setenv("GONDULF_TOKEN_EXPIRY", "7200") + Config.load() + assert Config.TOKEN_EXPIRY == 7200 + + def test_load_log_level_default_production(self, 
monkeypatch): + """Test LOG_LEVEL defaults to INFO in production mode.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.delenv("GONDULF_LOG_LEVEL", raising=False) + monkeypatch.delenv("GONDULF_DEBUG", raising=False) + Config.load() + assert Config.LOG_LEVEL == "INFO" + assert Config.DEBUG is False + + def test_load_log_level_default_debug(self, monkeypatch): + """Test LOG_LEVEL defaults to DEBUG when DEBUG=true.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.delenv("GONDULF_LOG_LEVEL", raising=False) + monkeypatch.setenv("GONDULF_DEBUG", "true") + Config.load() + assert Config.LOG_LEVEL == "DEBUG" + assert Config.DEBUG is True + + def test_load_log_level_custom(self, monkeypatch): + """Test LOG_LEVEL can be customized.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.setenv("GONDULF_LOG_LEVEL", "WARNING") + Config.load() + assert Config.LOG_LEVEL == "WARNING" + + def test_load_invalid_log_level_raises_error(self, monkeypatch): + """Test invalid LOG_LEVEL raises ConfigurationError.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + monkeypatch.setenv("GONDULF_LOG_LEVEL", "INVALID") + with pytest.raises(ConfigurationError, match="must be one of"): + Config.load() + + +class TestConfigValidate: + """Tests for Config.validate() method.""" + + def test_validate_valid_configuration(self, monkeypatch): + """Test validation passes with valid configuration.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + Config.load() + Config.validate() # Should not raise + + def test_validate_smtp_port_too_low(self, monkeypatch): + """Test validation fails when SMTP_PORT < 1.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + Config.load() + Config.SMTP_PORT = 0 + with pytest.raises(ConfigurationError, match="must be between 1 and 65535"): + Config.validate() + + def test_validate_smtp_port_too_high(self, monkeypatch): + """Test validation fails when SMTP_PORT > 65535.""" + 
monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + Config.load() + Config.SMTP_PORT = 70000 + with pytest.raises(ConfigurationError, match="must be between 1 and 65535"): + Config.validate() + + def test_validate_token_expiry_negative(self, monkeypatch): + """Test validation fails when TOKEN_EXPIRY <= 0.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + Config.load() + Config.TOKEN_EXPIRY = -1 + with pytest.raises(ConfigurationError, match="must be positive"): + Config.validate() + + def test_validate_code_expiry_zero(self, monkeypatch): + """Test validation fails when CODE_EXPIRY <= 0.""" + monkeypatch.setenv("GONDULF_SECRET_KEY", "a" * 32) + Config.load() + Config.CODE_EXPIRY = 0 + with pytest.raises(ConfigurationError, match="must be positive"): + Config.validate() diff --git a/tests/unit/test_database.py b/tests/unit/test_database.py new file mode 100644 index 0000000..fd53d38 --- /dev/null +++ b/tests/unit/test_database.py @@ -0,0 +1,274 @@ +""" +Unit tests for database connection and migrations. + +Tests database initialization, migration running, and health checks. 
+""" + +import tempfile +from pathlib import Path + +import pytest +from sqlalchemy import text + +from gondulf.database.connection import Database, DatabaseError + + +class TestDatabaseInit: + """Tests for Database initialization.""" + + def test_init_with_valid_url(self): + """Test Database can be initialized with valid URL.""" + db = Database("sqlite:///:memory:") + assert db.database_url == "sqlite:///:memory:" + + def test_init_with_file_url(self): + """Test Database can be initialized with file URL.""" + db = Database("sqlite:///./test.db") + assert db.database_url == "sqlite:///./test.db" + + +class TestDatabaseDirectory: + """Tests for database directory creation.""" + + def test_ensure_directory_creates_parent(self): + """Test ensure_database_directory creates parent directories.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "subdir" / "nested" / "test.db" + db_url = f"sqlite:///{db_path}" + + db = Database(db_url) + db.ensure_database_directory() + + assert db_path.parent.exists() + + def test_ensure_directory_relative_path(self): + """Test ensure_database_directory works with relative paths.""" + with tempfile.TemporaryDirectory() as tmpdir: + # Change to temp dir temporarily to test relative paths + import os + + original_cwd = os.getcwd() + try: + os.chdir(tmpdir) + + db = Database("sqlite:///./data/test.db") + db.ensure_database_directory() + + assert Path("data").exists() + finally: + os.chdir(original_cwd) + + def test_ensure_directory_does_not_fail_if_exists(self): + """Test ensure_database_directory doesn't fail if directory exists.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db_url = f"sqlite:///{db_path}" + + db = Database(db_url) + db.ensure_database_directory() + # Call again - should not raise + db.ensure_database_directory() + + +class TestDatabaseEngine: + """Tests for database engine creation.""" + + def test_get_engine_creates_engine(self): + """Test get_engine 
creates SQLAlchemy engine.""" + db = Database("sqlite:///:memory:") + engine = db.get_engine() + + assert engine is not None + assert engine.url.drivername == "sqlite" + + def test_get_engine_returns_same_instance(self): + """Test get_engine returns same engine instance.""" + db = Database("sqlite:///:memory:") + engine1 = db.get_engine() + engine2 = db.get_engine() + + assert engine1 is engine2 + + def test_get_engine_with_invalid_url_raises_error(self): + """Test get_engine raises DatabaseError with invalid URL.""" + db = Database("invalid://bad_url") + + with pytest.raises(DatabaseError, match="Failed to create database engine"): + db.get_engine() + + +class TestDatabaseHealth: + """Tests for database health checks.""" + + def test_check_health_success(self): + """Test health check passes for healthy database.""" + db = Database("sqlite:///:memory:") + db.get_engine() # Initialize engine + + assert db.check_health() is True + + def test_check_health_failure(self): + """Test health check fails for inaccessible database.""" + db = Database("sqlite:////nonexistent/path/db.db") + + # Trying to check health on non-existent DB should fail gracefully + assert db.check_health() is False + + +class TestDatabaseMigrations: + """Tests for database migrations.""" + + def test_get_applied_migrations_empty(self): + """Test get_applied_migrations returns empty set for new database.""" + db = Database("sqlite:///:memory:") + db.get_engine() # Initialize engine + + migrations = db.get_applied_migrations() + + assert migrations == set() + + def test_get_applied_migrations_after_running(self): + """Test get_applied_migrations returns versions after running migrations.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db = Database(f"sqlite:///{db_path}") + + # Initialize will run migrations + db.initialize() + + migrations = db.get_applied_migrations() + + # Migration 001 should be applied + assert 1 in migrations + + def 
test_run_migrations_creates_tables(self): + """Test run_migrations creates expected tables.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db = Database(f"sqlite:///{db_path}") + + db.ensure_database_directory() + db.run_migrations() + + # Check that tables were created + engine = db.get_engine() + with engine.connect() as conn: + # Check migrations table + result = conn.execute(text("SELECT name FROM sqlite_master WHERE type='table'")) + tables = {row[0] for row in result} + + assert "migrations" in tables + assert "authorization_codes" in tables + assert "domains" in tables + + def test_run_migrations_idempotent(self): + """Test run_migrations can be run multiple times safely.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db = Database(f"sqlite:///{db_path}") + + db.ensure_database_directory() + db.run_migrations() + # Run again - should not raise or duplicate + db.run_migrations() + + engine = db.get_engine() + with engine.connect() as conn: + # Check migration was recorded only once + result = conn.execute(text("SELECT COUNT(*) FROM migrations")) + count = result.fetchone()[0] + assert count == 1 + + def test_initialize_full_setup(self): + """Test initialize performs full database setup.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db = Database(f"sqlite:///{db_path}") + + db.initialize() + + # Verify database is healthy + assert db.check_health() is True + + # Verify migrations ran + migrations = db.get_applied_migrations() + assert 1 in migrations + + # Verify tables exist + engine = db.get_engine() + with engine.connect() as conn: + result = conn.execute(text("SELECT name FROM sqlite_master WHERE type='table'")) + tables = {row[0] for row in result} + + assert "migrations" in tables + assert "authorization_codes" in tables + assert "domains" in tables + + +class TestMigrationSchemaCorrectness: + """Tests for correctness of 
migration schema.""" + + def test_authorization_codes_schema(self): + """Test authorization_codes table has correct columns.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db = Database(f"sqlite:///{db_path}") + db.initialize() + + engine = db.get_engine() + with engine.connect() as conn: + result = conn.execute(text("PRAGMA table_info(authorization_codes)")) + columns = {row[1] for row in result} # row[1] is column name + + expected_columns = { + "code", + "client_id", + "redirect_uri", + "state", + "code_challenge", + "code_challenge_method", + "scope", + "me", + "created_at", + } + + assert columns == expected_columns + + def test_domains_schema(self): + """Test domains table has correct columns.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db = Database(f"sqlite:///{db_path}") + db.initialize() + + engine = db.get_engine() + with engine.connect() as conn: + result = conn.execute(text("PRAGMA table_info(domains)")) + columns = {row[1] for row in result} + + expected_columns = { + "domain", + "email", + "verification_code", + "verified", + "created_at", + "verified_at", + } + + assert columns == expected_columns + + def test_migrations_schema(self): + """Test migrations table has correct columns.""" + with tempfile.TemporaryDirectory() as tmpdir: + db_path = Path(tmpdir) / "test.db" + db = Database(f"sqlite:///{db_path}") + db.initialize() + + engine = db.get_engine() + with engine.connect() as conn: + result = conn.execute(text("PRAGMA table_info(migrations)")) + columns = {row[1] for row in result} + + expected_columns = {"version", "description", "applied_at"} + + assert columns == expected_columns diff --git a/tests/unit/test_dns.py b/tests/unit/test_dns.py new file mode 100644 index 0000000..cf22e01 --- /dev/null +++ b/tests/unit/test_dns.py @@ -0,0 +1,293 @@ +""" +Unit tests for DNS service. + +Tests TXT record querying, domain verification, and error handling. 
+Uses mocking to avoid actual DNS queries. +""" + +from unittest.mock import MagicMock, patch + +import pytest +import dns.resolver +from dns.exception import DNSException + +from gondulf.dns import DNSError, DNSService + + +class TestDNSServiceInit: + """Tests for DNSService initialization.""" + + def test_init_creates_resolver(self): + """Test DNSService initializes with resolver.""" + service = DNSService() + + assert service.resolver is not None + assert isinstance(service.resolver, dns.resolver.Resolver) + + +class TestGetTxtRecords: + """Tests for get_txt_records method.""" + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_get_txt_records_success(self, mock_resolve): + """Test getting TXT records successfully.""" + # Mock TXT record response + mock_rdata = MagicMock() + mock_rdata.strings = [b"v=spf1 include:example.com ~all"] + mock_resolve.return_value = [mock_rdata] + + service = DNSService() + records = service.get_txt_records("example.com") + + assert len(records) == 1 + assert records[0] == "v=spf1 include:example.com ~all" + mock_resolve.assert_called_once_with("example.com", "TXT") + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_get_txt_records_multiple(self, mock_resolve): + """Test getting multiple TXT records.""" + # Mock multiple TXT records + mock_rdata1 = MagicMock() + mock_rdata1.strings = [b"record1"] + mock_rdata2 = MagicMock() + mock_rdata2.strings = [b"record2"] + mock_resolve.return_value = [mock_rdata1, mock_rdata2] + + service = DNSService() + records = service.get_txt_records("example.com") + + assert len(records) == 2 + assert "record1" in records + assert "record2" in records + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_get_txt_records_multipart(self, mock_resolve): + """Test getting TXT record with multiple strings (joined).""" + # Mock TXT record with multiple strings + mock_rdata = MagicMock() + mock_rdata.strings = [b"part1", b"part2", b"part3"] + mock_resolve.return_value 
= [mock_rdata] + + service = DNSService() + records = service.get_txt_records("example.com") + + assert len(records) == 1 + assert records[0] == "part1part2part3" + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_get_txt_records_no_answer(self, mock_resolve): + """Test getting TXT records when none exist returns empty list.""" + mock_resolve.side_effect = dns.resolver.NoAnswer() + + service = DNSService() + records = service.get_txt_records("example.com") + + assert records == [] + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_get_txt_records_nxdomain(self, mock_resolve): + """Test DNSError raised when domain doesn't exist.""" + mock_resolve.side_effect = dns.resolver.NXDOMAIN() + + service = DNSService() + + with pytest.raises(DNSError, match="Domain does not exist"): + service.get_txt_records("nonexistent.example") + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_get_txt_records_timeout(self, mock_resolve): + """Test DNSError raised on timeout.""" + mock_resolve.side_effect = dns.resolver.Timeout() + + service = DNSService() + + with pytest.raises(DNSError, match="timeout"): + service.get_txt_records("example.com") + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_get_txt_records_dns_exception(self, mock_resolve): + """Test DNSError raised on other DNS exceptions.""" + mock_resolve.side_effect = DNSException("DNS query failed") + + service = DNSService() + + with pytest.raises(DNSError, match="DNS query failed"): + service.get_txt_records("example.com") + + +class TestVerifyTxtRecord: + """Tests for verify_txt_record method.""" + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_verify_txt_record_success(self, mock_resolve): + """Test TXT record verification succeeds when value found.""" + mock_rdata = MagicMock() + mock_rdata.strings = [b"gondulf-verify=ABC123"] + mock_resolve.return_value = [mock_rdata] + + service = DNSService() + result = 
service.verify_txt_record("example.com", "gondulf-verify=ABC123") + + assert result is True + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_verify_txt_record_partial_match(self, mock_resolve): + """Test TXT record verification succeeds with partial match.""" + mock_rdata = MagicMock() + mock_rdata.strings = [b"some prefix gondulf-verify=ABC123 some suffix"] + mock_resolve.return_value = [mock_rdata] + + service = DNSService() + result = service.verify_txt_record("example.com", "gondulf-verify=ABC123") + + assert result is True + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_verify_txt_record_not_found(self, mock_resolve): + """Test TXT record verification fails when value not found.""" + mock_rdata = MagicMock() + mock_rdata.strings = [b"different-value"] + mock_resolve.return_value = [mock_rdata] + + service = DNSService() + result = service.verify_txt_record("example.com", "gondulf-verify=ABC123") + + assert result is False + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_verify_txt_record_no_txt_records(self, mock_resolve): + """Test TXT record verification fails when no TXT records exist.""" + mock_resolve.side_effect = dns.resolver.NoAnswer() + + service = DNSService() + result = service.verify_txt_record("example.com", "gondulf-verify=ABC123") + + assert result is False + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_verify_txt_record_nxdomain(self, mock_resolve): + """Test TXT record verification fails when domain doesn't exist.""" + mock_resolve.side_effect = dns.resolver.NXDOMAIN() + + service = DNSService() + result = service.verify_txt_record("nonexistent.example", "value") + + assert result is False + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_verify_txt_record_timeout(self, mock_resolve): + """Test TXT record verification fails on timeout.""" + mock_resolve.side_effect = dns.resolver.Timeout() + + service = DNSService() + result = 
service.verify_txt_record("example.com", "value") + + assert result is False + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_verify_txt_record_among_multiple(self, mock_resolve): + """Test TXT record verification finds value among multiple records.""" + mock_rdata1 = MagicMock() + mock_rdata1.strings = [b"unrelated-record"] + mock_rdata2 = MagicMock() + mock_rdata2.strings = [b"gondulf-verify=ABC123"] + mock_rdata3 = MagicMock() + mock_rdata3.strings = [b"another-record"] + mock_resolve.return_value = [mock_rdata1, mock_rdata2, mock_rdata3] + + service = DNSService() + result = service.verify_txt_record("example.com", "gondulf-verify=ABC123") + + assert result is True + + +class TestCheckDomainExists: + """Tests for check_domain_exists method.""" + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_check_domain_exists_a_record(self, mock_resolve): + """Test domain exists check succeeds with A record.""" + mock_resolve.return_value = [MagicMock()] + + service = DNSService() + result = service.check_domain_exists("example.com") + + assert result is True + mock_resolve.assert_called_with("example.com", "A") + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_check_domain_exists_aaaa_record(self, mock_resolve): + """Test domain exists check succeeds with AAAA record.""" + # First call (A record) fails, second call (AAAA) succeeds + mock_resolve.side_effect = [ + dns.resolver.NoAnswer(), + [MagicMock()], # AAAA record exists + ] + + service = DNSService() + result = service.check_domain_exists("example.com") + + assert result is True + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_check_domain_exists_no_records(self, mock_resolve): + """Test domain exists check succeeds even with no A/AAAA records.""" + # Both A and AAAA fail with NoAnswer (but not NXDOMAIN) + mock_resolve.side_effect = [ + dns.resolver.NoAnswer(), + dns.resolver.NoAnswer(), + ] + + service = DNSService() + result = 
service.check_domain_exists("example.com") + + # Domain exists even if no A/AAAA records (might have MX, TXT, etc.) + assert result is True + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_check_domain_not_exists_nxdomain(self, mock_resolve): + """Test domain exists check fails with NXDOMAIN.""" + mock_resolve.side_effect = dns.resolver.NXDOMAIN() + + service = DNSService() + result = service.check_domain_exists("nonexistent.example") + + assert result is False + + @patch("gondulf.dns.dns.resolver.Resolver.resolve") + def test_check_domain_exists_dns_error(self, mock_resolve): + """Test domain exists check returns False on DNS error.""" + mock_resolve.side_effect = DNSException("DNS failure") + + service = DNSService() + result = service.check_domain_exists("example.com") + + assert result is False + + +class TestResolverFallback: + """Tests for DNS resolver fallback configuration.""" + + @patch("gondulf.dns.dns.resolver.Resolver") + def test_resolver_uses_system_dns(self, mock_resolver_class): + """Test resolver uses system DNS when available.""" + mock_resolver = MagicMock() + mock_resolver.nameservers = ["192.168.1.1"] # System DNS + mock_resolver_class.return_value = mock_resolver + + service = DNSService() + + # System DNS should be used + assert service.resolver.nameservers == ["192.168.1.1"] + + @patch("gondulf.dns.dns.resolver.Resolver") + def test_resolver_fallback_to_public_dns(self, mock_resolver_class): + """Test resolver falls back to public DNS when system DNS unavailable.""" + mock_resolver = MagicMock() + mock_resolver.nameservers = [] # No system DNS + mock_resolver_class.return_value = mock_resolver + + service = DNSService() + + # Should fall back to public DNS + assert service.resolver.nameservers == ["8.8.8.8", "1.1.1.1"] diff --git a/tests/unit/test_email.py b/tests/unit/test_email.py new file mode 100644 index 0000000..6b6de4d --- /dev/null +++ b/tests/unit/test_email.py @@ -0,0 +1,304 @@ +""" +Unit tests for email service. 
+ +Tests email sending with SMTP, TLS configuration, and error handling. +Uses mocking to avoid actual SMTP connections. +""" + +from unittest.mock import MagicMock, patch + +import pytest +import smtplib + +from gondulf.email import EmailError, EmailService + + +class TestEmailServiceInit: + """Tests for EmailService initialization.""" + + def test_init_with_all_parameters(self): + """Test EmailService initializes with all parameters.""" + service = EmailService( + smtp_host="smtp.gmail.com", + smtp_port=587, + smtp_from="sender@example.com", + smtp_username="user@example.com", + smtp_password="password", + smtp_use_tls=True, + ) + + assert service.smtp_host == "smtp.gmail.com" + assert service.smtp_port == 587 + assert service.smtp_from == "sender@example.com" + assert service.smtp_username == "user@example.com" + assert service.smtp_password == "password" + assert service.smtp_use_tls is True + + def test_init_without_credentials(self): + """Test EmailService initializes without username/password.""" + service = EmailService( + smtp_host="localhost", + smtp_port=25, + smtp_from="sender@example.com", + ) + + assert service.smtp_username is None + assert service.smtp_password is None + + +class TestEmailServiceSendVerificationCode: + """Tests for send_verification_code method.""" + + @patch("gondulf.email.smtplib.SMTP") + def test_send_verification_code_success_starttls(self, mock_smtp): + """Test sending verification code with STARTTLS.""" + mock_server = MagicMock() + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="smtp.example.com", + smtp_port=587, + smtp_from="sender@example.com", + smtp_username="user", + smtp_password="pass", + smtp_use_tls=True, + ) + + service.send_verification_code("recipient@example.com", "123456", "example.com") + + # Verify SMTP was called correctly + mock_smtp.assert_called_once_with("smtp.example.com", 587, timeout=10) + mock_server.starttls.assert_called_once() + 
mock_server.login.assert_called_once_with("user", "pass") + mock_server.send_message.assert_called_once() + mock_server.quit.assert_called_once() + + @patch("gondulf.email.smtplib.SMTP_SSL") + def test_send_verification_code_success_implicit_tls(self, mock_smtp_ssl): + """Test sending verification code with implicit TLS (port 465).""" + mock_server = MagicMock() + mock_smtp_ssl.return_value = mock_server + + service = EmailService( + smtp_host="smtp.example.com", + smtp_port=465, + smtp_from="sender@example.com", + smtp_username="user", + smtp_password="pass", + ) + + service.send_verification_code("recipient@example.com", "123456", "example.com") + + # Verify SMTP_SSL was called + mock_smtp_ssl.assert_called_once_with("smtp.example.com", 465, timeout=10) + # starttls should NOT be called for implicit TLS + assert not mock_server.starttls.called + mock_server.login.assert_called_once() + mock_server.send_message.assert_called_once() + + @patch("gondulf.email.smtplib.SMTP") + def test_send_verification_code_without_auth(self, mock_smtp): + """Test sending without authentication.""" + mock_server = MagicMock() + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="localhost", + smtp_port=25, + smtp_from="sender@example.com", + smtp_use_tls=False, + ) + + service.send_verification_code("recipient@example.com", "123456", "example.com") + + # Verify login was not called + assert not mock_server.login.called + mock_server.send_message.assert_called_once() + + @patch("gondulf.email.smtplib.SMTP") + def test_send_verification_code_smtp_error(self, mock_smtp): + """Test EmailError raised on SMTP failure.""" + mock_server = MagicMock() + mock_server.send_message.side_effect = smtplib.SMTPException("SMTP error") + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="smtp.example.com", + smtp_port=587, + smtp_from="sender@example.com", + ) + + with pytest.raises(EmailError, match="SMTP error"): + service.send_verification_code( 
+ "recipient@example.com", "123456", "example.com" + ) + + @patch("gondulf.email.smtplib.SMTP") + def test_send_verification_code_auth_error(self, mock_smtp): + """Test EmailError raised on authentication failure.""" + mock_server = MagicMock() + mock_server.login.side_effect = smtplib.SMTPAuthenticationError( + 535, "Authentication failed" + ) + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="smtp.example.com", + smtp_port=587, + smtp_from="sender@example.com", + smtp_username="user", + smtp_password="wrong", + smtp_use_tls=True, + ) + + with pytest.raises(EmailError, match="authentication failed"): + service.send_verification_code( + "recipient@example.com", "123456", "example.com" + ) + + +class TestEmailServiceConnection: + """Tests for test_connection method.""" + + @patch("gondulf.email.smtplib.SMTP") + def test_connection_success_starttls(self, mock_smtp): + """Test connection test succeeds with STARTTLS.""" + mock_server = MagicMock() + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="smtp.example.com", + smtp_port=587, + smtp_from="sender@example.com", + smtp_username="user", + smtp_password="pass", + smtp_use_tls=True, + ) + + assert service.test_connection() is True + + mock_smtp.assert_called_once() + mock_server.starttls.assert_called_once() + mock_server.login.assert_called_once() + mock_server.quit.assert_called_once() + + @patch("gondulf.email.smtplib.SMTP_SSL") + def test_connection_success_implicit_tls(self, mock_smtp_ssl): + """Test connection test succeeds with implicit TLS.""" + mock_server = MagicMock() + mock_smtp_ssl.return_value = mock_server + + service = EmailService( + smtp_host="smtp.example.com", + smtp_port=465, + smtp_from="sender@example.com", + smtp_username="user", + smtp_password="pass", + ) + + assert service.test_connection() is True + + mock_smtp_ssl.assert_called_once() + mock_server.login.assert_called_once() + + @patch("gondulf.email.smtplib.SMTP") + def 
test_connection_failure(self, mock_smtp): + """Test connection test fails gracefully.""" + mock_smtp.side_effect = smtplib.SMTPException("Connection failed") + + service = EmailService( + smtp_host="smtp.example.com", + smtp_port=587, + smtp_from="sender@example.com", + ) + + assert service.test_connection() is False + + @patch("gondulf.email.smtplib.SMTP") + def test_connection_without_credentials(self, mock_smtp): + """Test connection test works without credentials.""" + mock_server = MagicMock() + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="localhost", + smtp_port=25, + smtp_from="sender@example.com", + smtp_use_tls=False, + ) + + assert service.test_connection() is True + + # Login should not be called without credentials + assert not mock_server.login.called + + +class TestEmailMessageContent: + """Tests for email message content.""" + + @patch("gondulf.email.smtplib.SMTP") + def test_message_contains_code(self, mock_smtp): + """Test email message contains the verification code.""" + mock_server = MagicMock() + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="localhost", + smtp_port=25, + smtp_from="sender@example.com", + ) + + service.send_verification_code("recipient@example.com", "ABC123", "example.com") + + # Get the message that was sent + call_args = mock_server.send_message.call_args + sent_message = call_args[0][0] + + # Verify message contains code + message_body = sent_message.as_string() + assert "ABC123" in message_body + + @patch("gondulf.email.smtplib.SMTP") + def test_message_contains_domain(self, mock_smtp): + """Test email message contains the domain being verified.""" + mock_server = MagicMock() + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="localhost", + smtp_port=25, + smtp_from="sender@example.com", + ) + + service.send_verification_code( + "recipient@example.com", "123456", "mydomain.com" + ) + + # Get the message that was sent + call_args = 
mock_server.send_message.call_args + sent_message = call_args[0][0] + + message_body = sent_message.as_string() + assert "mydomain.com" in message_body + + @patch("gondulf.email.smtplib.SMTP") + def test_message_has_correct_headers(self, mock_smtp): + """Test email message has correct From/To/Subject headers.""" + mock_server = MagicMock() + mock_smtp.return_value = mock_server + + service = EmailService( + smtp_host="localhost", + smtp_port=25, + smtp_from="noreply@gondulf.example", + ) + + service.send_verification_code("user@example.com", "123456", "example.com") + + # Get the message that was sent + call_args = mock_server.send_message.call_args + sent_message = call_args[0][0] + + assert sent_message["From"] == "noreply@gondulf.example" + assert sent_message["To"] == "user@example.com" + assert "example.com" in sent_message["Subject"] diff --git a/tests/unit/test_storage.py b/tests/unit/test_storage.py new file mode 100644 index 0000000..a10a870 --- /dev/null +++ b/tests/unit/test_storage.py @@ -0,0 +1,218 @@ +""" +Unit tests for in-memory code storage. + +Tests code storage, verification, expiration, and cleanup. 
+""" + +import time + +import pytest + +from gondulf.storage import CodeStore + + +class TestCodeStore: + """Tests for CodeStore class.""" + + def test_store_and_verify_success(self): + """Test storing and verifying a valid code.""" + store = CodeStore(ttl_seconds=60) + store.store("test@example.com", "123456") + + assert store.verify("test@example.com", "123456") is True + + def test_verify_wrong_code_fails(self): + """Test verification fails with wrong code.""" + store = CodeStore(ttl_seconds=60) + store.store("test@example.com", "123456") + + assert store.verify("test@example.com", "wrong") is False + + def test_verify_nonexistent_key_fails(self): + """Test verification fails for nonexistent key.""" + store = CodeStore(ttl_seconds=60) + + assert store.verify("nonexistent@example.com", "123456") is False + + def test_verify_removes_code_after_success(self): + """Test that successful verification removes code (single-use).""" + store = CodeStore(ttl_seconds=60) + store.store("test@example.com", "123456") + + # First verification succeeds + assert store.verify("test@example.com", "123456") is True + + # Second verification fails (code removed) + assert store.verify("test@example.com", "123456") is False + + def test_verify_expired_code_fails(self): + """Test verification fails for expired code.""" + store = CodeStore(ttl_seconds=1) + store.store("test@example.com", "123456") + + # Wait for expiration + time.sleep(1.1) + + assert store.verify("test@example.com", "123456") is False + + def test_verify_removes_expired_code(self): + """Test that expired codes are removed from storage.""" + store = CodeStore(ttl_seconds=1) + store.store("test@example.com", "123456") + + # Wait for expiration + time.sleep(1.1) + + # Verification fails and removes code + store.verify("test@example.com", "123456") + + # Code should be gone from storage + assert store.size() == 0 + + def test_get_valid_code(self): + """Test getting a valid code without removing it.""" + store = 
CodeStore(ttl_seconds=60) + store.store("test@example.com", "123456") + + assert store.get("test@example.com") == "123456" + # Code should still be in storage + assert store.get("test@example.com") == "123456" + + def test_get_nonexistent_code(self): + """Test getting nonexistent code returns None.""" + store = CodeStore(ttl_seconds=60) + + assert store.get("nonexistent@example.com") is None + + def test_get_expired_code(self): + """Test getting expired code returns None.""" + store = CodeStore(ttl_seconds=1) + store.store("test@example.com", "123456") + + # Wait for expiration + time.sleep(1.1) + + assert store.get("test@example.com") is None + + def test_delete_code(self): + """Test explicitly deleting a code.""" + store = CodeStore(ttl_seconds=60) + store.store("test@example.com", "123456") + + store.delete("test@example.com") + + assert store.get("test@example.com") is None + + def test_delete_nonexistent_code(self): + """Test deleting nonexistent code doesn't raise error.""" + store = CodeStore(ttl_seconds=60) + + # Should not raise + store.delete("nonexistent@example.com") + + def test_cleanup_expired_codes(self): + """Test manual cleanup of expired codes.""" + store = CodeStore(ttl_seconds=1) + + # Store multiple codes + store.store("test1@example.com", "code1") + store.store("test2@example.com", "code2") + store.store("test3@example.com", "code3") + + assert store.size() == 3 + + # Wait for expiration + time.sleep(1.1) + + # Cleanup should remove all expired codes + removed = store.cleanup_expired() + + assert removed == 3 + assert store.size() == 0 + + def test_cleanup_expired_partial(self): + """Test cleanup removes only expired codes, not valid ones.""" + store = CodeStore(ttl_seconds=2) + + # Store first code + store.store("test1@example.com", "code1") + + # Wait 1 second + time.sleep(1) + + # Store second code (will expire later) + store.store("test2@example.com", "code2") + + # Wait for first code to expire + time.sleep(1.1) + + # Cleanup should 
remove only first code + removed = store.cleanup_expired() + + assert removed == 1 + assert store.size() == 1 + assert store.get("test2@example.com") == "code2" + + def test_size(self): + """Test size() returns correct count.""" + store = CodeStore(ttl_seconds=60) + + assert store.size() == 0 + + store.store("test1@example.com", "code1") + assert store.size() == 1 + + store.store("test2@example.com", "code2") + assert store.size() == 2 + + store.delete("test1@example.com") + assert store.size() == 1 + + def test_clear(self): + """Test clear() removes all codes.""" + store = CodeStore(ttl_seconds=60) + + store.store("test1@example.com", "code1") + store.store("test2@example.com", "code2") + store.store("test3@example.com", "code3") + + assert store.size() == 3 + + store.clear() + + assert store.size() == 0 + + def test_custom_ttl(self): + """Test custom TTL is respected.""" + store = CodeStore(ttl_seconds=2) + store.store("test@example.com", "123456") + + # Code valid after 1 second + time.sleep(1) + assert store.get("test@example.com") == "123456" + + # Code expired after 2+ seconds + time.sleep(1.1) + assert store.get("test@example.com") is None + + def test_multiple_keys(self): + """Test storing multiple different keys.""" + store = CodeStore(ttl_seconds=60) + + store.store("test1@example.com", "code1") + store.store("test2@example.com", "code2") + store.store("test3@example.com", "code3") + + assert store.verify("test1@example.com", "code1") is True + assert store.verify("test2@example.com", "code2") is True + assert store.verify("test3@example.com", "code3") is True + + def test_overwrite_existing_code(self): + """Test storing new code with same key overwrites old code.""" + store = CodeStore(ttl_seconds=60) + + store.store("test@example.com", "old_code") + store.store("test@example.com", "new_code") + + assert store.verify("test@example.com", "old_code") is False + assert store.verify("test@example.com", "new_code") is True diff --git a/uv.lock b/uv.lock index 
bf7e93f..8d66500 100644 --- a/uv.lock +++ b/uv.lock @@ -2,6 +2,15 @@ version = 1 revision = 3 requires-python = ">=3.10" +[[package]] +name = "aiosmtplib" +version = "5.0.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/a2/15/c2dc93a58d716bce64b53918d3cf667d86c96a56a9f3a239a9f104643637/aiosmtplib-5.0.0.tar.gz", hash = "sha256:514ac11c31cb767c764077eb3c2eb2ae48df6f63f1e847aeb36119c4fc42b52d", size = 61057, upload-time = "2025-10-19T19:12:31.426Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/99/42/b997c306dc54e6ac62a251787f6b5ec730797eea08e0336d8f0d7b899d5f/aiosmtplib-5.0.0-py3-none-any.whl", hash = "sha256:95eb0f81189780845363ab0627e7f130bca2d0060d46cd3eeb459f066eb7df32", size = 27048, upload-time = "2025-10-19T19:12:30.124Z" }, +] + [[package]] name = "annotated-doc" version = "0.0.4" @@ -232,6 +241,15 @@ toml = [ { name = "tomli", marker = "python_full_version <= '3.11'" }, ] +[[package]] +name = "dnspython" +version = "2.8.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/8c/8b/57666417c0f90f08bcafa776861060426765fdb422eb10212086fb811d26/dnspython-2.8.0.tar.gz", hash = "sha256:181d3c6996452cb1189c4046c61599b84a5a86e099562ffde77d26984ff26d0f", size = 368251, upload-time = "2025-09-07T18:58:00.022Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/ba/5a/18ad964b0086c6e62e2e7500f7edc89e3faa45033c71c1893d34eed2b2de/dnspython-2.8.0-py3-none-any.whl", hash = "sha256:01d9bbc4a2d76bf0db7c1f729812ded6d912bd318d3b1cf81d30c0f845dbf3af", size = 331094, upload-time = "2025-09-07T18:57:58.071Z" }, +] + [[package]] name = "exceptiongroup" version = "1.3.0" @@ -314,9 +332,12 @@ name = "gondulf" version = "0.1.0.dev0" source = { editable = "." 
} dependencies = [ + { name = "aiosmtplib" }, + { name = "dnspython" }, { name = "fastapi" }, { name = "pydantic" }, { name = "pydantic-settings" }, + { name = "python-dotenv" }, { name = "python-multipart" }, { name = "sqlalchemy" }, { name = "uvicorn", extra = ["standard"] }, @@ -343,8 +364,10 @@ test = [ [package.metadata] requires-dist = [ + { name = "aiosmtplib", specifier = ">=3.0.0" }, { name = "bandit", marker = "extra == 'dev'", specifier = ">=1.7.0" }, { name = "black", marker = "extra == 'dev'", specifier = ">=23.0.0" }, + { name = "dnspython", specifier = ">=2.4.0" }, { name = "factory-boy", marker = "extra == 'test'", specifier = ">=3.2.0" }, { name = "fastapi", specifier = ">=0.104.0" }, { name = "flake8", marker = "extra == 'dev'", specifier = ">=6.0.0" }, @@ -358,6 +381,7 @@ requires-dist = [ { name = "pytest-asyncio", marker = "extra == 'test'", specifier = ">=0.20.0" }, { name = "pytest-cov", marker = "extra == 'test'", specifier = ">=4.0.0" }, { name = "pytest-mock", marker = "extra == 'test'", specifier = ">=3.10.0" }, + { name = "python-dotenv", specifier = ">=1.0.0" }, { name = "python-multipart", specifier = ">=0.0.6" }, { name = "ruff", marker = "extra == 'dev'", specifier = ">=0.1.0" }, { name = "sqlalchemy", specifier = ">=2.0.0" },