# Phase 1 Impact Assessment: Authentication Flow Change

**Date**: 2025-11-20
**Architect**: Claude (Architect Agent)
**Related ADRs**: ADR-005 (updated), ADR-008 (new)
**Related Report**: /docs/reports/2025-11-20-phase-1-foundation.md

## Summary

The authentication design has been updated to require BOTH DNS TXT verification AND email verification via rel="me" discovery. This change impacts Phase 1 implementation and defines new requirements for Phase 2.

## Authentication Flow Change

### Original Design (ADR-005 v1)

- **Primary**: Email verification (user provides email)
- **Optional**: DNS TXT verification (fast-path to skip email)
- **Flow**: DNS check → if not found, request email → send code → verify code

### Updated Design (ADR-005 v2 + ADR-008)

- **Required Factor 1**: DNS TXT verification (`_gondulf.{domain}` = `verified`)
- **Required Factor 2**: Email verification via rel="me" discovery
- **Flow**: DNS check → rel="me" discovery → send code to discovered email → verify code

### Key Differences

| Aspect | Original | Updated |
|--------|----------|---------|
| DNS TXT | Optional (fast-path) | Required (first factor) |
| Email Discovery | User input | rel="me" link parsing |
| Email Verification | Optional (fallback) | Required (second factor) |
| Security Model | Single-factor | Two-factor |
| Attack Resistance | Moderate | High (requires DNS + email control) |
| Setup Complexity | Lower (email only works) | Higher (both required) |

## Phase 1 Implementation Impact

### What Phase 1 Implemented

Phase 1 successfully implemented:

- ✅ Configuration management (GONDULF_* environment variables)
- ✅ Database layer with migrations (SQLite, SQLAlchemy Core)
- ✅ In-memory code storage (TTL-based expiration)
- ✅ Email service (SMTP with STARTTLS support)
- ✅ DNS service (TXT record querying with fallback resolvers)
- ✅ Structured logging
- ✅ FastAPI application with health check endpoint
- ✅ 94.16% test coverage (96 tests passing)

### Does Phase 1 Need Changes?
**Answer: NO. Phase 1 implementation remains valid.**

#### Analysis

**Email Service** (`src/gondulf/email.py`):

- Current: Generic email sending service
- Change Impact: **None**
- Reason: Email service sends codes to any email address. Whether email is user-provided or rel="me"-discovered doesn't affect this service.
- Status: **No changes needed**

**DNS Service** (`src/gondulf/dns.py`):

- Current: TXT record verification with fallback resolvers
- Change Impact: **None**
- Reason: DNS service already implements TXT record verification as designed. Changing from "optional" to "required" is a business logic change, not a DNS service change.
- Status: **No changes needed**

**In-Memory Storage** (`src/gondulf/storage.py`):

- Current: TTL-based code storage
- Change Impact: **None**
- Reason: Storage mechanism is independent of how email is discovered or whether DNS is optional/required.
- Status: **No changes needed**

**Database Schema** (`001_initial_schema.sql`):

- Current: `domains` table with `domain`, `verification_method`, `verified_at`
- Change Impact: **Minor update needed in Phase 2**
- Reason: Schema already supports storing verification method. Will need to update from `'txt_record'` or `'email'` to `'two_factor'` when storing records.
- Status: **Schema structure OK, values will change in Phase 2**

**Configuration** (`src/gondulf/config.py`):

- Current: SMTP configuration, DNS configuration, timeouts
- Change Impact: **None immediately, optional addition in Phase 2**
- Reason: Current configuration supports both email and DNS. May want to add timeout for HTML fetching in Phase 2.
- Status: **No changes needed now**

### Phase 1 Status: APPROVED

Phase 1 implementation remains valid and does NOT require any revisions due to the authentication flow change. All Phase 1 components are foundational services that work regardless of how they're orchestrated in the authentication flow.
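
To make the "business logic change, not a service change" point concrete, here is a minimal sketch of the two flows sharing one unchanged DNS check. All names here (`original_flow`, `updated_flow`, `dns_ok`, `dns_missing`, `lookup`) are hypothetical stand-ins for illustration; they are not the Phase 1 service APIs, and the returned strings are placeholders rather than real responses.

```python
from typing import Callable, Optional

# Hypothetical stand-ins for Phase 1 services; the real service APIs may differ.
DnsCheck = Callable[[str], bool]
EmailLookup = Callable[[str], Optional[str]]


def original_flow(domain: str, check_dns: DnsCheck, ask_user_email: EmailLookup) -> str:
    """ADR-005 v1: DNS is an optional fast path, email is the fallback."""
    if check_dns(domain):
        return "verified via DNS fast path"
    email = ask_user_email(domain)
    return f"code sent to {email}" if email else "failed: no email provided"


def updated_flow(domain: str, check_dns: DnsCheck, discover_email: EmailLookup) -> str:
    """ADR-005 v2 + ADR-008: DNS and email are both required, in that order."""
    if not check_dns(domain):
        return "failed: DNS TXT record missing"
    email = discover_email(domain)
    return f"code sent to {email}" if email else "failed: no rel=me email"


# Illustrative stubs: the same DNS callables are passed to both flows unchanged.
def dns_ok(domain: str) -> bool:
    return True


def dns_missing(domain: str) -> bool:
    return False


def lookup(domain: str) -> Optional[str]:
    return f"me@{domain}"


print(original_flow("example.com", dns_missing, lookup))
print(updated_flow("example.com", dns_missing, lookup))
```

Only the orchestrating function differs between the two designs; the DNS callable is passed in as-is, which is why the assessment treats the optional-to-required change as orchestration work for Phase 2.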
## Phase 2 Requirements: New Implementation Needs

Phase 2 must now implement the updated authentication flow. Here's what needs to be built:

### 1. HTML Fetching Service (NEW)

**Purpose**: Fetch user's homepage to discover rel="me" links

**Implementation**:

```python
# src/gondulf/html_fetcher.py
import logging
from typing import Optional

import requests

logger = logging.getLogger(__name__)


class HTMLFetcherService:
    """
    Fetch user's homepage over HTTPS.
    """

    def __init__(self, timeout: int = 10):
        self.timeout = timeout
        self.max_redirects = 5
        self.max_size = 5 * 1024 * 1024  # 5MB

    def fetch_site(self, domain: str) -> Optional[str]:
        """
        Fetch site HTML content.

        Args:
            domain: Domain to fetch (e.g., "example.com")

        Returns:
            HTML content as string, or None if fetch fails
        """
        url = f"https://{domain}"

        try:
            # requests.get() has no max_redirects argument;
            # the redirect limit is configured on a Session instead.
            with requests.Session() as session:
                session.max_redirects = self.max_redirects
                response = session.get(
                    url,
                    timeout=self.timeout,
                    allow_redirects=True,
                    verify=True  # Enforce SSL verification
                )
            response.raise_for_status()

            # Check content size (note: the body is already downloaded here)
            if len(response.content) > self.max_size:
                raise ValueError(f"Response too large: {len(response.content)} bytes")

            return response.text

        except requests.exceptions.SSLError as e:
            logger.error(f"SSL verification failed for {domain}: {e}")
            return None
        except requests.exceptions.Timeout:
            logger.error(f"Timeout fetching {domain}")
            return None
        except Exception as e:
            # Also covers TooManyRedirects, HTTP errors, and the size check above
            logger.error(f"Failed to fetch {domain}: {e}")
            return None
```

**Dependencies**:

- `requests` library (already in pyproject.toml)
- Timeout configuration (add to Config if needed)

**Tests Required**:

- Successful HTTPS fetch
- SSL verification failure
- Timeout handling
- HTTP error codes (404, 500, etc.)
- Redirect following
- Size limit enforcement

---

### 2. rel="me" Email Discovery Service (NEW)

**Purpose**: Parse HTML to discover email from rel="me" links

**Implementation**:

```python
# src/gondulf/relme.py
import logging
import re
from typing import Optional

from bs4 import BeautifulSoup

logger = logging.getLogger(__name__)


class RelMeDiscoveryService:
    """
    Discover email addresses from rel="me" links in HTML.
    """

    def discover_email(self, html_content: str) -> Optional[str]:
        """
        Parse HTML and discover email from rel="me" link.

        Args:
            html_content: HTML content as string

        Returns:
            Email address or None if not found
        """
        try:
            # Parse HTML (BeautifulSoup handles malformed HTML)
            soup = BeautifulSoup(html_content, 'html.parser')

            # Find all rel="me" links (<link> and <a> tags)
            me_links = soup.find_all('link', rel='me') + soup.find_all('a', rel='me')

            # Look for mailto: links
            for link in me_links:
                href = link.get('href', '')
                if href.startswith('mailto:'):
                    # Strip only the scheme prefix and drop any ?subject=... query part
                    email = href[len('mailto:'):].split('?', 1)[0].strip()

                    # Validate email format
                    if self._validate_email_format(email):
                        logger.info(f"Discovered email via rel='me': {email[:3]}***")
                        return email

            logger.warning("No rel='me' mailto: link found in HTML")
            return None

        except Exception as e:
            logger.error(f"Failed to parse HTML: {e}")
            return None

    def _validate_email_format(self, email: str) -> bool:
        """Validate email address format (RFC 5322 simplified)."""
        email_regex = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'

        if not re.match(email_regex, email):
            return False

        if len(email) > 254:  # RFC 5321 maximum
            return False

        if email.count('@') != 1:
            return False

        return True
```

**Dependencies**:

- `beautifulsoup4` library (add to pyproject.toml)
- `html.parser` (Python standard library)

**Tests Required**:

- Discovery from `<link>` tags
- Discovery from `<a>` tags
- Multiple rel="me" links (select first mailto)
- Malformed HTML handling
- Missing rel="me" links
- Invalid email format in link
- Edge cases (empty href, non-mailto links, etc.)

---

### 3. Domain Verification Service (UPDATED)

**Purpose**: Orchestrate two-factor verification (DNS + Email)

**Implementation**:

```python
# src/gondulf/domain_verification.py
import secrets
from typing import Tuple, Optional

from .dns import DNSService
from .html_fetcher import HTMLFetcherService
from .relme import RelMeDiscoveryService
from .email import EmailService
from .storage import CodeStorage


class DomainVerificationService:
    """
    Two-factor domain verification service.

    Verifies domain ownership through:
    1. DNS TXT record verification
    2. Email verification via rel="me" discovery
    """

    def __init__(
        self,
        dns_service: DNSService,
        html_fetcher: HTMLFetcherService,
        relme_discovery: RelMeDiscoveryService,
        email_service: EmailService,
        code_storage: CodeStorage
    ):
        self.dns = dns_service
        self.html_fetcher = html_fetcher
        self.relme = relme_discovery
        self.email = email_service
        self.code_storage = code_storage

    def start_verification(self, domain: str) -> Tuple[bool, Optional[str], Optional[str]]:
        """
        Start domain verification process.

        Returns: (success, discovered_email, error_message)

        Failures are reported through error_message; no exception is raised.
        """
        # Step 1: Verify DNS TXT record
        dns_verified = self.dns.verify_txt_record(domain, "verified")
        if not dns_verified:
            error = f"DNS TXT record not found for {domain}. Please add: _gondulf.{domain} TXT verified"
            return False, None, error

        # Step 2: Fetch site and discover email
        html = self.html_fetcher.fetch_site(domain)
        if html is None:
            error = f"Could not fetch site at https://{domain}. Please ensure site is accessible via HTTPS."
            return False, None, error

        # Step 3: Discover email from rel="me"
        email = self.relme.discover_email(html)
        if email is None:
            error = 'No rel="me" mailto: link found. Please add: <link rel="me" href="mailto:you@example.com">'
            return False, None, error

        # Step 4: Generate and send verification code
        code = self._generate_code()
        # Store the domain alongside the code so verify_code() can return it
        # (assumes CodeStorage accepts a tuple value)
        self.code_storage.store(email, (code, domain), ttl=900)  # 15 minutes

        email_sent = self.email.send_verification_email(email, code)
        if not email_sent:
            error = f"Failed to send verification email to {email}. Please try again."
            return False, email, error

        # Success: code sent to discovered email
        return True, email, None

    def verify_code(self, email: str, submitted_code: str) -> Tuple[bool, str]:
        """
        Verify submitted code.

        Returns: (success, domain_or_error_message)
        """
        stored_data = self.code_storage.get(email)
        if stored_data is None:
            return False, "No verification code found. Please restart verification."

        code, domain = stored_data

        # Verify code (constant-time comparison)
        if not secrets.compare_digest(submitted_code, code):
            return False, "Invalid code. Please try again."

        # Success: mark code as used
        self.code_storage.delete(email)

        return True, domain

    def _generate_code(self) -> str:
        """Generate 6-digit verification code."""
        return ''.join(secrets.choice('0123456789') for _ in range(6))
```

**Dependencies**:

- All Phase 1 services (DNS, Email, Storage)
- New HTML fetcher service
- New rel="me" discovery service

**Tests Required**:

- Full verification flow (DNS → rel="me" → email → code)
- DNS verification failure
- Site fetch failure
- rel="me" discovery failure
- Email send failure
- Code verification success/failure
- Multiple attempts tracking
- Code expiration

---

### 4. Domain Verification UI Endpoints (NEW)

**Purpose**: HTTP endpoints for user interaction

**Implementation**:

```python
# src/gondulf/routers/verification.py
from typing import Optional

from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

router = APIRouter(prefix="/verify", tags=["verification"])

# domain_verification_service is assumed to be initialized at application
# startup (see main.py) and made available to this module.


class VerificationStartRequest(BaseModel):
    domain: str


class VerificationStartResponse(BaseModel):
    success: bool
    email_masked: Optional[str]  # e.g., "u***@example.com"
    error: Optional[str]


class VerificationCodeRequest(BaseModel):
    email: str
    code: str


class VerificationCodeResponse(BaseModel):
    success: bool
    domain: Optional[str]
    error: Optional[str]


@router.post("/start", response_model=VerificationStartResponse)
async def start_verification(request: VerificationStartRequest):
    """
    Start domain verification process.

    Steps:
    1. Verify DNS TXT record
    2. Discover email from rel="me"
    3. Send verification code to email
    """
    success, email, error = domain_verification_service.start_verification(request.domain)

    if not success:
        return VerificationStartResponse(success=False, email_masked=None, error=error)

    # Mask email for display: u***@example.com
    masked_email = f"{email[0]}***@{email.split('@')[1]}"

    return VerificationStartResponse(
        success=True,
        email_masked=masked_email,
        error=None
    )


@router.post("/code", response_model=VerificationCodeResponse)
async def verify_code(request: VerificationCodeRequest):
    """
    Verify submitted code.

    Returns domain if code is valid.
    """
    success, result = domain_verification_service.verify_code(request.email, request.code)

    if not success:
        return VerificationCodeResponse(success=False, domain=None, error=result)

    return VerificationCodeResponse(success=True, domain=result, error=None)
```

**Dependencies**:

- FastAPI router
- Pydantic models
- Domain verification service

**Tests Required**:

- POST /verify/start success case
- POST /verify/start with DNS failure
- POST /verify/start with rel="me" failure
- POST /verify/start with email send failure
- POST /verify/code success case
- POST /verify/code with invalid code
- POST /verify/code with expired code
- POST /verify/code with missing code

---

### 5. Authorization Endpoint Integration (UPDATED)

**Changes to Authorization Flow**:

**Before** (original design):

```
1. User enters domain (me parameter)
2. Display form: "Enter your email at {domain}"
3. User enters email manually
4. Send code, user enters code
5. Display consent screen
```

**After** (updated design):

```
1. User enters domain (me parameter)
2. Server performs two-factor verification:
   a. Verify DNS TXT record
   b. Discover email from rel="me"
   c. Send code to discovered email
3. Display code entry form (show discovered email masked)
4. User enters code
5. Display consent screen
```

**Implementation Changes**:

- Call `DomainVerificationService.start_verification()` instead of requesting email from user
- Update UI to show "Sending code to u***@example.com" instead of email input form
- Handle new error cases (DNS not found, rel="me" not found, site unreachable)

---

## Phase 2 Feature Breakdown

### New Dependencies to Add

**pyproject.toml additions**:

```toml
[project]
dependencies = [
    # ... existing dependencies
    "beautifulsoup4>=4.12.0",  # HTML parsing for rel="me" discovery
]
```

### New Source Files

1. `src/gondulf/html_fetcher.py` - HTML fetching service
2. `src/gondulf/relme.py` - rel="me" email discovery service
3. `src/gondulf/domain_verification.py` - Two-factor verification orchestration
4. `src/gondulf/routers/verification.py` - Verification endpoints (if implemented separately from authorization)

### Updated Files

1. `src/gondulf/main.py` - Register new routers, initialize new services
2. `src/gondulf/config.py` - Optional: add HTML fetch timeout config
3. Database migration (002_update_verification_method.sql) - Change `domains.verification_method` values

### New Test Files

1. `tests/unit/test_html_fetcher.py` - HTML fetching tests
2. `tests/unit/test_relme.py` - rel="me" discovery tests
3. `tests/unit/test_domain_verification.py` - Verification service tests
4. `tests/integration/test_verification_endpoints.py` - Verification endpoint tests

### Estimated Effort

**New Components**:

- HTML Fetcher Service: 0.5 days
- rel="me" Discovery Service: 0.5 days
- Domain Verification Service: 1 day
- Verification Endpoints: 0.5 days
- Tests (all new components): 1 day

**Total New Work**: ~3.5 days

**Authorization Endpoint** (already planned):

- Original estimate: 3-5 days
- Updated estimate: 3-5 days (same - just uses DomainVerificationService)

## Database Schema Updates

### Migration: 002_update_verification_method.sql

```sql
-- Update verification_method values from single-factor to two-factor
-- This is a data migration, not schema change

UPDATE domains
SET verification_method = 'two_factor'
WHERE verification_method IN ('txt_record', 'email');

-- No schema changes needed - 'verification_method' column already exists
```

**When to Apply**: Phase 2, before authorization endpoint implementation

## Error Message Updates

### DNS TXT Not Found

```
DNS Verification Failed

Please add this TXT record to your domain's DNS:

Type: TXT
Name: _gondulf.example.com
Value: verified

DNS changes may take up to 24 hours to propagate.

Need help? See: https://docs.gondulf.example.com/setup/dns
```

### rel="me" Not Found

```
Email Discovery Failed

Could not find a rel="me" email link on your homepage.

Please add this to your homepage (https://example.com):

<link rel="me" href="mailto:you@example.com">

This declares your email address for IndieAuth verification.

Learn more: https://indieweb.org/rel-me
```

### Site Unreachable

```
Site Fetch Failed

Could not fetch your site at https://example.com

Please check:
- Site is accessible via HTTPS
- SSL certificate is valid
- No firewall blocking requests

Try again once your site is accessible.
```

### Email Send Failure

```
Email Delivery Failed

Failed to send verification code to u***@example.com

Please check:
- Email address is correct in your rel="me" link
- Email server is accepting mail
- Check spam/junk folder

Try again, or contact support if the issue persists.
```

## Documentation Updates Needed

### User Documentation (Phase 2)

1. **Setup Guide**: `/docs/user/setup.md`
   - Step 1: Add DNS TXT record
   - Step 2: Add rel="me" link to homepage
   - Step 3: Test verification

2. **Troubleshooting**: `/docs/user/troubleshooting.md`
   - DNS verification failures
   - rel="me" discovery issues
   - Email delivery problems

3. **Examples**: `/docs/user/examples.md`
   - Example HTML with rel="me" link
   - Example DNS configuration (various providers)

### Developer Documentation (Phase 2)

1. **API Reference**: `/docs/api/verification.md`
   - POST /verify/start endpoint
   - POST /verify/code endpoint
   - Error codes and responses

2. **Architecture**: `/docs/architecture/domain-verification.md`
   - Two-factor verification flow diagram
   - Service interaction diagram
   - Error handling flowchart

## Security Considerations for Phase 2

### New Attack Surfaces

1. **HTML Parsing**:
   - Risk: Malicious HTML exploiting parser
   - Mitigation: Parse with BeautifulSoup's `html.parser` backend, which tolerates malformed markup and never executes page scripts; enforce the 5MB size limit before parsing
   - Test: Fuzzing with malformed HTML

2. **HTTPS Fetching**:
   - Risk: SSL verification bypass
   - Mitigation: Enforce `verify=True` in requests
   - Test: Attempt to fetch site with invalid certificate (must fail)

3. **rel="me" Spoofing**:
   - Risk: Attacker adds rel="me" to compromised site
   - Mitigation: Two-factor requirement (also need DNS control)
   - Test: Verify DNS check happens BEFORE rel="me" discovery

### Security Testing Required

1. **Input Validation**:
   - Malformed domain names
   - Oversized HTML responses (>5MB)
   - Invalid email formats in rel="me" links

2. **TLS Enforcement**:
   - Verify HTTPS-only fetching
   - Verify SSL certificate validation
   - Reject sites with invalid certificates

3. **Rate Limiting** (future):
   - Prevent bulk rel="me" discovery
   - Limit verification attempts per domain

## Configuration Updates

### Optional New Config

```python
# src/gondulf/config.py

class Config:
    # ... existing config

    # HTML Fetching (optional, has sensible defaults)
    HTML_FETCH_TIMEOUT: int = 10  # seconds
    HTML_MAX_SIZE: int = 5 * 1024 * 1024  # 5MB
    HTML_MAX_REDIRECTS: int = 5
```

### Environment Variables

```bash
# .env.example additions (optional)

# HTML Fetching Configuration (optional - has defaults)
GONDULF_HTML_FETCH_TIMEOUT=10    # Timeout for fetching user's site (seconds)
GONDULF_HTML_MAX_SIZE=5242880    # Maximum HTML size (bytes, default 5MB)
GONDULF_HTML_MAX_REDIRECTS=5     # Maximum redirects to follow
```

## Testing Strategy for Phase 2

### Unit Tests

**HTML Fetcher**:

- Mock successful HTTPS response
- Mock SSL verification failure
- Mock timeout
- Mock HTTP errors (404, 500, etc.)
- Mock size limit exceeded
- Mock redirect following

**rel="me" Discovery**:

- Parse `<link rel="me">` tags
- Parse `<a rel="me">` tags
- Handle malformed HTML
- Handle missing rel="me" links
- Handle invalid email in link
- Handle multiple rel="me" links (select first)

**Domain Verification Service**:

- Full two-factor flow success
- DNS verification failure
- Site fetch failure
- rel="me" discovery failure
- Email send failure
- Code verification success/failure

### Integration Tests

**Verification Endpoints**:

- POST /verify/start with valid domain (mock services)
- POST /verify/start with DNS failure
- POST /verify/start with rel="me" failure
- POST /verify/code with valid code
- POST /verify/code with invalid code

### End-to-End Tests (Future)

- Complete verification flow with real HTML
- Authorization flow integration
- Token issuance after successful verification

## Acceptance Criteria for Phase 2

Phase 2 will be considered complete when:

1. ✅ HTML fetcher service implemented and tested
2. ✅ rel="me" discovery service implemented and tested
3. ✅ Domain verification service orchestrates two-factor verification
4. ✅ Verification endpoints return correct responses for all cases
5. ✅ Error messages are clear and actionable
6. ✅ All new tests passing (unit + integration)
7. ✅ Test coverage remains >80% overall
8. ✅ Security testing complete (HTML parsing, TLS enforcement)
9. ✅ Documentation updated (user setup guide, API reference)
10. ✅ Database migration applied successfully

## Timeline Estimate

**Phase 2 Components**:

- HTML Fetcher: 0.5 days
- rel="me" Discovery: 0.5 days
- Domain Verification Service: 1 day
- Verification Endpoints: 0.5 days
- Testing: 1 day
- Documentation: 0.5 days

**Total New Work**: ~4 days

**Authorization Endpoint** (already planned):

- Original estimate: 3-5 days
- Updated estimate: 3-5 days (uses DomainVerificationService)

**Phase 2 Total**: ~7-9 days (vs. original estimate of 3-5 days)

**Impact**: +4 days of work due to authentication flow change

## Recommendation

**Phase 1**: APPROVED as-is. No changes needed.

**Phase 2**: Proceed with implementation of:

1. HTML fetching service
2. rel="me" discovery service
3. Domain verification service (two-factor orchestration)
4. Verification endpoints
5. Updated authorization endpoint to use domain verification service

The additional work (HTML fetching + rel="me" discovery) adds ~4 days to Phase 2, bringing total Phase 2 estimate to 7-9 days instead of original 3-5 days.

## Sign-off

**Assessment Status**: Complete
**Phase 1 Impact**: None - Phase 1 approved as-is
**Phase 2 Impact**: Additional 4 days of work for new services
**Risk Level**: Low - All new work is well-scoped and testable
**Ready to Proceed**: Yes

---

**Assessment completed**: 2025-11-20
**Architect**: Claude (Architect Agent)