# Phase 5b: Integration and End-to-End Tests Design

## Purpose

Phase 5b enhances the test suite to achieve comprehensive coverage through integration and end-to-end testing. While the current test suite has 86.93% coverage with 327 tests, critical gaps remain in verifying complete authentication flows and component interactions. This phase ensures the IndieAuth server operates correctly as a complete system, not just as a collection of individual components.

### Goals

1. Verify all components work together correctly (integration tests)
2. Validate complete IndieAuth authentication flows (E2E tests)
3. Test real-world scenarios and error conditions
4. Achieve 90%+ overall coverage with 95%+ on critical paths
5. Ensure test reliability and maintainability

## Specification References

### W3C IndieAuth Requirements

- Section 5.2: Authorization Endpoint - complete flow validation
- Section 5.3: Token Endpoint - code exchange validation
- Section 5.4: Token Verification - end-to-end verification
- Section 6: Client Information Discovery - metadata integration
- Section 7: Security Considerations - comprehensive security testing

### OAuth 2.0 RFC 6749

- Section 4.1: Authorization Code Grant - full flow testing
- Section 10: Security Considerations - threat mitigation verification

## Design Overview

The testing expansion follows a three-layer approach:

1. **Integration Layer**: Tests component interactions within the system
2. **End-to-End Layer**: Tests complete user flows from start to finish
3. **Scenario Layer**: Tests real-world usage patterns and edge cases

### Test Organization Structure

```
tests/
├── integration/                  # Component interaction tests
│   ├── api/                      # API endpoint integration
│   │   ├── test_auth_token_flow.py
│   │   ├── test_metadata_integration.py
│   │   └── test_verification_flow.py
│   ├── services/                 # Service layer integration
│   │   ├── test_domain_email_integration.py
│   │   ├── test_token_storage_integration.py
│   │   └── test_client_metadata_integration.py
│   └── middleware/               # Middleware chain tests
│       ├── test_security_chain.py
│       └── test_https_headers_integration.py
├── e2e/                          # End-to-end flow tests
│   ├── test_complete_auth_flow.py
│   ├── test_domain_verification_flow.py
│   ├── test_error_scenarios.py
│   └── test_client_interactions.py
└── fixtures/                     # Shared test fixtures
    ├── domains.py                # Domain test data
    ├── clients.py                # Client configurations
    ├── tokens.py                 # Token fixtures
    └── mocks.py                  # External service mocks
```

## Component Details

### 1. Integration Test Suite Expansion

#### 1.1 API Endpoint Integration Tests

**File**: `tests/integration/api/test_auth_token_flow.py`

Tests the complete interaction between authorization and token endpoints:

```python
class TestAuthTokenFlow:
    """Test authorization and token endpoint integration."""

    async def test_successful_auth_to_token_flow(self, test_client, mock_domain):
        """Test complete flow from authorization to token generation."""
        # 1. Start authorization request
        auth_response = await test_client.get("/authorize", params={
            "response_type": "code",
            "client_id": "https://app.example.com",
            "redirect_uri": "https://app.example.com/callback",
            "state": "random_state",
            "code_challenge": "challenge",
            "code_challenge_method": "S256",
            "me": mock_domain.url
        })

        # 2. Verify domain ownership (mocked as verified)

        # 3. User consents
        consent_response = await test_client.post("/consent", data={
            "auth_request_id": auth_response.json()["request_id"],
            "consent": "approve"
        })

        # 4. Extract authorization code from redirect
        location = consent_response.headers["location"]
        code = extract_code_from_redirect(location)

        # 5. Exchange code for token
        token_response = await test_client.post("/token", data={
            "grant_type": "authorization_code",
            "code": code,
            "client_id": "https://app.example.com",
            "redirect_uri": "https://app.example.com/callback",
            "code_verifier": "verifier"
        })

        # Assertions
        assert token_response.status_code == 200
        assert "access_token" in token_response.json()
        assert "me" in token_response.json()

    async def test_code_replay_prevention(self, test_client, valid_auth_code):
        """Test that authorization codes cannot be reused."""
        # First exchange should succeed
        # Second exchange should fail with 400 Bad Request

    async def test_code_expiration(self, test_client, freezer):
        """Test that expired codes are rejected."""
        # Generate code
        # Advance time beyond expiration
        # Attempt exchange should fail
```
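The flow test above leans on a helper that is not defined elsewhere in this phase, `extract_code_from_redirect`, and on literal `"challenge"`/`"verifier"` placeholders that only pass if S256 validation is mocked. A minimal sketch of both pieces, assuming a shared `tests/helpers.py` module (the location and the `make_pkce_pair` name are illustrative):

```python
# tests/helpers.py (assumed location for shared test helpers)
import base64
import hashlib
import secrets
from urllib.parse import parse_qs, urlparse


def extract_code_from_redirect(location: str) -> str:
    """Pull the authorization code out of a redirect Location header."""
    query = parse_qs(urlparse(location).query)
    return query["code"][0]


def make_pkce_pair() -> tuple[str, str]:
    """Generate a (code_verifier, code_challenge) pair per RFC 7636 S256."""
    verifier = secrets.token_urlsafe(32)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

With `make_pkce_pair()`, the flow test can send a real challenge in step 1 and the matching verifier in step 5, exercising actual S256 validation instead of a mock.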
**File**: `tests/integration/api/test_metadata_integration.py`

Tests client metadata fetching and caching:

```python
class TestMetadataIntegration:
    """Test client metadata discovery integration."""

    async def test_happ_metadata_fetch_and_display(self, test_client, mock_http):
        """Test h-app metadata fetching and authorization page display."""
        # Mock client_id URL to return h-app microformat
        mock_http.get("https://app.example.com", text="""
            <div class="h-app">
                <img src="https://app.example.com/logo.png" class="u-logo" alt="logo">
                <a href="https://app.example.com" class="u-url p-name">Example App</a>
            </div>
        """)

        # Request authorization
        response = await test_client.get("/authorize", params={
            "client_id": "https://app.example.com",
            # ... other params
        })

        # Verify metadata appears in consent page
        assert "Example App" in response.text
        assert "logo.png" in response.text

    async def test_metadata_caching(self, test_client, mock_http, db_session):
        """Test that client metadata is cached after first fetch."""
        # First request fetches from HTTP
        # Second request uses cache
        # Verify only one HTTP call made

    async def test_metadata_fallback(self, test_client, mock_http):
        """Test fallback when client has no h-app metadata."""
        # Mock client_id URL with no h-app
        # Verify domain name used as fallback
```

#### 1.2 Service Layer Integration Tests

**File**: `tests/integration/services/test_domain_email_integration.py`

Tests domain verification service integration:

```python
from gondulf.models import DomainVerification  # assumed import path


class TestDomainEmailIntegration:
    """Test domain verification with email service integration."""

    async def test_dns_then_email_fallback(self, domain_service, dns_service, email_service):
        """Test DNS check fails, falls back to email verification."""
        # Mock DNS to return no TXT records
        dns_service.mock_empty_response()

        # Request verification
        result = await domain_service.initiate_verification("user.example.com")

        # Should send email
        assert email_service.send_called
        assert result.method == "email"

    async def test_verification_result_storage(self, domain_service, db_session):
        """Test verification results are properly stored."""
        # Verify domain
        await domain_service.verify_domain("user.example.com", method="dns")

        # Check database
        stored = db_session.query(DomainVerification).filter_by(
            domain="user.example.com"
        ).first()
        assert stored.verified is True
        assert stored.method == "dns"
```

**File**: `tests/integration/services/test_token_storage_integration.py`

Tests token service with storage integration:

```python
from datetime import datetime, timedelta

from freezegun import freeze_time


class TestTokenStorageIntegration:
    """Test token service with database storage."""

    async def test_token_lifecycle(self, token_service, storage_service):
        """Test complete token lifecycle: create, store, retrieve, expire."""
        # Create token
        token = await token_service.create_access_token(
            client_id="https://app.example.com",
            me="https://user.example.com"
        )

        # Verify stored
        stored = await storage_service.get_token(token.value)
        assert stored is not None

        # Verify retrieval
        retrieved = await token_service.validate_token(token.value)
        assert retrieved.client_id == "https://app.example.com"

        # Test expiration
        with freeze_time(datetime.now() + timedelta(hours=2)):
            expired = await token_service.validate_token(token.value)
            assert expired is None

    async def test_concurrent_token_operations(self, token_service):
        """Test thread-safety of token operations."""
        # Create multiple tokens concurrently
        # Verify no collisions or race conditions
```
#### 1.3 Middleware Chain Tests

**File**: `tests/integration/middleware/test_security_chain.py`

Tests security middleware integration:

```python
class TestSecurityMiddlewareChain:
    """Test security middleware working together."""

    async def test_complete_security_chain(self, test_client):
        """Test all security middleware in sequence."""
        # Make HTTPS request
        response = await test_client.get(
            "https://server.example.com/authorize",
            headers={"X-Forwarded-Proto": "https"}
        )

        # Verify all security headers present
        assert response.headers["X-Frame-Options"] == "DENY"
        assert response.headers["X-Content-Type-Options"] == "nosniff"
        assert "Content-Security-Policy" in response.headers
        assert response.headers["Strict-Transport-Security"]

    async def test_http_redirect_with_headers(self, test_client):
        """Test HTTP->HTTPS redirect includes security headers."""
        response = await test_client.get(
            "http://server.example.com/authorize",
            follow_redirects=False
        )

        assert response.status_code == 307
        assert response.headers["Location"].startswith("https://")
        assert response.headers["X-Frame-Options"] == "DENY"
```

### 2. End-to-End Authentication Flow Tests

**File**: `tests/e2e/test_complete_auth_flow.py`

Complete IndieAuth flow testing:

```python
class TestCompleteAuthFlow:
    """Test complete IndieAuth authentication flows."""

    async def test_first_time_user_flow(self, browser, test_server):
        """Test complete flow for new user."""
        # 1. Client initiates authorization
        await browser.goto(f"{test_server}/authorize?client_id=...")

        # 2. User enters domain
        await browser.fill("#domain", "user.example.com")
        await browser.click("#verify")

        # 3. Domain verification (DNS)
        await browser.wait_for_selector(".verification-success")

        # 4. User reviews client info
        assert await browser.text_content(".client-name") == "Test App"

        # 5. User consents
        await browser.click("#approve")

        # 6. Redirect with code
        assert "code=" in browser.url

        # 7. Client exchanges code for token
        token_response = await exchange_code(extract_code(browser.url))
        assert token_response["me"] == "https://user.example.com"

    async def test_returning_user_flow(self, browser, test_server, existing_domain):
        """Test flow for user with verified domain."""
        # Should skip verification step
        # Should recognize returning user

    async def test_multiple_redirect_uris(self, browser, test_server):
        """Test client with multiple registered redirect URIs."""
        # Verify correct URI validation
        # Test selection if multiple valid
```
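The `browser` fixture these E2E tests depend on is not specified in this phase; one plausible shape is an async Playwright page, sketched below (Playwright is an assumption - any driver exposing `goto`/`fill`/`click`/`wait_for_selector` would work):

```python
# tests/e2e/conftest.py (assumed location)
import pytest_asyncio
from playwright.async_api import async_playwright


@pytest_asyncio.fixture
async def browser():
    """Yield a Playwright page object for the E2E tests to drive."""
    async with async_playwright() as p:
        chromium = await p.chromium.launch()
        page = await chromium.new_page()
        yield page
        await chromium.close()
```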
**File**: `tests/e2e/test_domain_verification_flow.py`

Domain verification E2E tests:

```python
class TestDomainVerificationE2E:
    """Test complete domain verification flows."""

    async def test_dns_verification_flow(self, browser, test_server, mock_dns):
        """Test DNS TXT record verification flow."""
        # Setup mock DNS
        mock_dns.add_txt_record(
            "user.example.com",
            "indieauth=https://server.example.com"
        )

        # Start verification
        await browser.goto(f"{test_server}/verify")
        await browser.fill("#domain", "user.example.com")
        await browser.click("#verify-dns")

        # Should auto-detect and verify
        await browser.wait_for_selector(".verified", timeout=5000)
        assert await browser.text_content(".method") == "DNS TXT Record"

    async def test_email_verification_flow(self, browser, test_server, mock_smtp):
        """Test email-based verification flow."""
        # Start verification
        await browser.goto(f"{test_server}/verify")
        await browser.fill("#domain", "user.example.com")
        await browser.click("#verify-email")

        # Check email sent
        assert mock_smtp.messages_sent == 1
        verification_link = extract_link(mock_smtp.last_message)

        # Click verification link
        await browser.goto(verification_link)

        # Enter code from email
        code = extract_code(mock_smtp.last_message)
        await browser.fill("#code", code)
        await browser.click("#confirm")

        # Should be verified
        assert await browser.text_content(".status") == "Verified"

    async def test_both_methods_available(self, browser, test_server):
        """Test when both DNS and email verification available."""
        # Should prefer DNS
        # Should allow manual email selection
```

**File**: `tests/e2e/test_error_scenarios.py`

Error scenario E2E tests:

```python
from datetime import datetime, timedelta


class TestErrorScenariosE2E:
    """Test error handling in complete flows."""

    async def test_invalid_client_id(self, test_client):
        """Test flow with invalid client_id."""
        response = await test_client.get("/authorize", params={
            "client_id": "not-a-url",
            "redirect_uri": "https://app.example.com/callback"
        })

        assert response.status_code == 400
        assert response.json()["error"] == "invalid_request"

    async def test_expired_authorization_code(self, test_client, freezer):
        """Test token exchange with expired code."""
        # Generate code
        code = await generate_auth_code()

        # Advance time past expiration
        freezer.move_to(datetime.now() + timedelta(minutes=15))

        # Attempt exchange
        response = await test_client.post("/token", data={
            "code": code,
            "grant_type": "authorization_code"
        })

        assert response.status_code == 400
        assert response.json()["error"] == "invalid_grant"

    async def test_mismatched_redirect_uri(self, test_client):
        """Test token request with different redirect_uri."""
        # Authorization with one redirect_uri
        # Token request with different redirect_uri
        # Should fail

    async def test_network_timeout_handling(self, test_client, slow_http):
        """Test handling of slow client_id fetches."""
        slow_http.add_delay("https://slow-app.example.com", delay=10)

        # Should timeout and use fallback
        response = await test_client.get("/authorize", params={
            "client_id": "https://slow-app.example.com"
        })

        # Should still work but without metadata
        assert response.status_code == 200
        assert "slow-app.example.com" in response.text  # Fallback to domain
```

### 3. Test Data and Fixtures

**File**: `tests/fixtures/domains.py`

Domain test fixtures:

```python
from datetime import datetime

import pytest

from gondulf.models import DomainVerification  # assumed import path


@pytest.fixture
def verified_domain(db_session):
    """Create pre-verified domain."""
    domain = DomainVerification(
        domain="user.example.com",
        verified=True,
        method="dns",
        verified_at=datetime.utcnow()
    )
    db_session.add(domain)
    db_session.commit()
    return domain


@pytest.fixture
def pending_domain(db_session):
    """Create domain pending verification."""
    domain = DomainVerification(
        domain="pending.example.com",
        verified=False,
        verification_code="123456",
        created_at=datetime.utcnow()
    )
    db_session.add(domain)
    db_session.commit()
    return domain


@pytest.fixture
def multiple_domains(db_session):
    """Create multiple test domains."""
    domains = [
        DomainVerification(domain=f"user{i}.example.com", verified=True)
        for i in range(5)
    ]
    db_session.add_all(domains)
    db_session.commit()
    return domains
```

**File**: `tests/fixtures/clients.py`

Client configuration fixtures:

```python
import pytest


@pytest.fixture
def simple_client():
    """Basic IndieAuth client configuration."""
    return {
        "client_id": "https://app.example.com",
        "redirect_uri": "https://app.example.com/callback",
        "client_name": "Example App",
        "client_uri": "https://app.example.com",
        "logo_uri": "https://app.example.com/logo.png"
    }


@pytest.fixture
def client_with_metadata(mock_http):
    """Client with h-app microformat metadata."""
    mock_http.get("https://rich-app.example.com", text="""
        <div class="h-app">
            <img src="https://rich-app.example.com/logo.png" class="u-logo" alt="logo">
            <a href="https://rich-app.example.com" class="u-url p-name">Rich Application</a>
            <a href="https://rich-app.example.com" rel="home">Home</a>
        </div>
    """)
    return {
        "client_id": "https://rich-app.example.com",
        "redirect_uri": "https://rich-app.example.com/auth/callback"
    }


@pytest.fixture
def malicious_client():
    """Client with potentially malicious configuration."""
    return {
        "client_id": "https://evil.example.com",
        "redirect_uri": "https://evil.example.com/steal",
        "state": "<script>alert('xss')</script>"
    }
```

**File**: `tests/fixtures/mocks.py`

External service mocks:

```python
import pytest
from dns.resolver import NXDOMAIN
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.orm import sessionmaker

from gondulf.database import Base  # assumed import path for the model base


class MockAnswer(list):
    """Minimal stand-in for dnspython's Answer: a list of TXT record strings."""


@pytest.fixture
def mock_dns(monkeypatch):
    """Mock DNS resolver."""
    class MockDNS:
        def __init__(self):
            self.txt_records = {}

        def add_txt_record(self, domain, value):
            self.txt_records[domain] = [value]

        def resolve(self, domain, rdtype):
            if rdtype == "TXT" and domain in self.txt_records:
                return MockAnswer(self.txt_records[domain])
            raise NXDOMAIN()

    mock = MockDNS()
    monkeypatch.setattr("dns.resolver.Resolver", lambda *args, **kwargs: mock)
    return mock


@pytest.fixture
def mock_smtp(monkeypatch):
    """Mock SMTP server."""
    class MockSMTP:
        def __init__(self):
            self.messages_sent = 0
            self.last_message = None

        def send_message(self, msg):
            self.messages_sent += 1
            self.last_message = msg

    mock = MockSMTP()
    monkeypatch.setattr("smtplib.SMTP_SSL", lambda *args, **kwargs: mock)
    return mock


@pytest.fixture
def mock_http(responses):
    """Mock HTTP responses using the responses library."""
    return responses


@pytest.fixture
async def test_database():
    """Provide clean test database."""
    # Create in-memory SQLite database
    engine = create_async_engine("sqlite+aiosqlite:///:memory:")
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

    async_session = sessionmaker(engine, class_=AsyncSession)
    async with async_session() as session:
        yield session

    await engine.dispose()
```
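The E2E email-verification test above calls `extract_link` and `extract_code` on the captured message without defining them. A minimal sketch of both helpers, assuming the mocked messages are `email.message.EmailMessage` objects and the verification code is six digits (as in the `pending_domain` fixture):

```python
# tests/helpers.py (assumed location) - email parsing helpers for E2E tests
import re
from email.message import EmailMessage


def extract_link(message: EmailMessage) -> str:
    """Return the first https URL found in the message body."""
    match = re.search(r"https://\S+", message.get_content())
    assert match, "verification email contains no link"
    return match.group(0)


def extract_code(message: EmailMessage) -> str:
    """Return the first six-digit verification code in the message body."""
    match = re.search(r"\b\d{6}\b", message.get_content())
    assert match, "verification email contains no code"
    return match.group(0)
```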
""") return { "client_id": "https://rich-app.example.com", "redirect_uri": "https://rich-app.example.com/auth/callback" } @pytest.fixture def malicious_client(): """Client with potentially malicious configuration.""" return { "client_id": "https://evil.example.com", "redirect_uri": "https://evil.example.com/steal", "state": "" } ``` **File**: `tests/fixtures/mocks.py` External service mocks: ```python @pytest.fixture def mock_dns(monkeypatch): """Mock DNS resolver.""" class MockDNS: def __init__(self): self.txt_records = {} def add_txt_record(self, domain, value): self.txt_records[domain] = [value] def resolve(self, domain, rdtype): if rdtype == "TXT" and domain in self.txt_records: return MockAnswer(self.txt_records[domain]) raise NXDOMAIN() mock = MockDNS() monkeypatch.setattr("dns.resolver.Resolver", lambda: mock) return mock @pytest.fixture def mock_smtp(monkeypatch): """Mock SMTP server.""" class MockSMTP: def __init__(self): self.messages_sent = 0 self.last_message = None def send_message(self, msg): self.messages_sent += 1 self.last_message = msg mock = MockSMTP() monkeypatch.setattr("smtplib.SMTP_SSL", lambda *args: mock) return mock @pytest.fixture def mock_http(responses): """Mock HTTP responses using responses library.""" return responses @pytest.fixture async def test_database(): """Provide clean test database.""" # Create in-memory SQLite database engine = create_async_engine("sqlite+aiosqlite:///:memory:") async with engine.begin() as conn: await conn.run_sync(Base.metadata.create_all) async_session = sessionmaker(engine, class_=AsyncSession) async with async_session() as session: yield session await engine.dispose() ``` ### 4. Coverage Enhancement Strategy #### 4.1 Target Coverage by Module ```python # Coverage targets in pyproject.toml [tool.coverage.report] fail_under = 90 precision = 2 exclude_lines = [ "pragma: no cover", "def __repr__", "raise AssertionError", "raise NotImplementedError", "if __name__ == .__main__.:", "if TYPE_CHECKING:" ] [tool.coverage.run] source = ["src/gondulf"] omit = [ "*/tests/*", "*/migrations/*", "*/__main__.py" ] # Per-module thresholds [tool.coverage.module] "gondulf.routers.authorization" = 95 "gondulf.routers.token" = 95 "gondulf.services.token_service" = 95 "gondulf.services.domain_verification" = 90 "gondulf.security" = 95 "gondulf.models" = 85 ``` #### 4.2 Gap Analysis and Remediation Current gaps (from coverage report): - `routers/verification.py`: 48% - Needs complete flow testing - `routers/token.py`: 88% - Missing error scenarios - `services/token_service.py`: 92% - Missing edge cases - `services/happ_parser.py`: 97% - Missing malformed HTML cases Remediation tests: ```python # tests/integration/api/test_verification_gap.py class TestVerificationEndpointGaps: """Fill coverage gaps in verification endpoint.""" async def test_verify_dns_preference(self): """Test DNS verification preference over email.""" async def test_verify_email_fallback(self): """Test email fallback when DNS unavailable.""" async def test_verify_both_methods_fail(self): """Test handling when both verification methods fail.""" # tests/unit/test_token_service_gaps.py class TestTokenServiceGaps: """Fill coverage gaps in token service.""" def test_token_cleanup_expired(self): """Test cleanup of expired tokens.""" def test_token_collision_handling(self): """Test handling of token ID collisions.""" ``` ### 5. 
### 5. Test Execution Framework

#### 5.1 Parallel Test Execution

```ini
# pytest.ini configuration
[pytest]
minversion = 7.0
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*

# Parallel execution (pytest-xdist)
addopts = -n auto --dist loadscope --maxfail=5 --strict-markers

# Test markers
markers =
    unit: Unit tests (fast, isolated)
    integration: Integration tests (component interaction)
    e2e: End-to-end tests (complete flows)
    security: Security-specific tests
    slow: Tests that take >1 second
    requires_network: Tests requiring network access
```

#### 5.2 Test Organization

```python
# conftest.py - Shared configuration
import pytest
from sqlalchemy import text


# Auto-use fixtures for all tests
@pytest.fixture(autouse=True)
async def reset_database(test_database):
    """Reset database state between tests."""
    await test_database.execute(text("DELETE FROM tokens"))
    await test_database.execute(text("DELETE FROM auth_codes"))
    await test_database.execute(text("DELETE FROM domain_verifications"))
    await test_database.commit()


@pytest.fixture(autouse=True)
def reset_rate_limiter(rate_limiter):
    """Clear rate limiter between tests."""
    rate_limiter.reset()


# Shared test utilities
class TestBase:
    """Base class for test organization."""

    @staticmethod
    def generate_auth_request(**kwargs):
        """Generate valid authorization request."""
        defaults = {
            "response_type": "code",
            "client_id": "https://app.example.com",
            "redirect_uri": "https://app.example.com/callback",
            "state": "random_state",
            "code_challenge": "challenge",
            "code_challenge_method": "S256"
        }
        defaults.update(kwargs)
        return defaults
```

### 6. Performance Benchmarks

#### 6.1 Response Time Tests

```python
# tests/performance/test_response_times.py
class TestResponseTimes:
    """Ensure response times meet requirements."""

    @pytest.mark.benchmark
    def test_authorization_endpoint_performance(self, test_client, benchmark):
        """Authorization endpoint must respond in <200ms."""
        # A synchronous test client is assumed here; pytest-benchmark
        # calls make_request repeatedly and records timings on the fixture.
        def make_request():
            return test_client.get("/authorize", params={
                "response_type": "code",
                "client_id": "https://app.example.com"
            })

        benchmark(make_request)
        assert benchmark.stats.stats.mean < 0.2  # mean runtime, 200ms

    @pytest.mark.benchmark
    def test_token_endpoint_performance(self, test_client, benchmark):
        """Token endpoint must respond in <100ms."""
        def exchange_token():
            return test_client.post("/token", data={
                "grant_type": "authorization_code",
                "code": "test_code"
            })

        benchmark(exchange_token)
        assert benchmark.stats.stats.mean < 0.1  # mean runtime, 100ms
```

## Testing Strategy

### Test Reliability

1. **Isolation**: Each test runs in isolation with clean state
2. **Determinism**: No random failures; use fixed seeds and frozen time (see the sketch below)
3. **Speed**: Unit tests <1ms, integration <100ms, E2E <1s
4. **Independence**: Tests can run in any order without dependencies
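To make the determinism point concrete: any test that depends on the clock pins it rather than sleeping. A minimal sketch using freezegun, with a 10-minute code lifetime as an assumed example value:

```python
from datetime import datetime, timedelta

from freezegun import freeze_time


def test_code_expiry_window_is_deterministic():
    """Frozen time removes wall-clock flakiness from expiry checks."""
    with freeze_time("2025-01-01 12:00:00") as frozen:
        issued_at = datetime.now()

        # Jump exactly past the (assumed) 10-minute authorization code lifetime
        frozen.move_to(issued_at + timedelta(minutes=10, seconds=1))

        assert datetime.now() - issued_at > timedelta(minutes=10)
```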
### Test Maintenance

1. **DRY Principle**: Shared fixtures and utilities
2. **Clear Names**: Test names describe what is being tested
3. **Documentation**: Each test includes a docstring explaining its purpose
4. **Refactoring**: Regular cleanup of redundant or obsolete tests

### Continuous Integration

```yaml
# .github/workflows/test.yml
name: Test Suite

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12"]
        test-type: [unit, integration, e2e, security]

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          pip install uv
          uv sync --dev

      - name: Run ${{ matrix.test-type }} tests
        run: |
          uv run pytest tests/${{ matrix.test-type }} \
            --cov=src/gondulf \
            --cov-report=xml \
            --cov-report=term-missing

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          flags: ${{ matrix.test-type }}

      - name: Check coverage threshold
        run: |
          uv run python -m coverage report --fail-under=90
```

## Security Considerations

### Test Data Security

1. **No Production Data**: Never use real user data in tests
2. **Mock Secrets**: Generate test keys/tokens dynamically
3. **Secure Fixtures**: Don't commit sensitive test data

### Security Test Coverage

Required security tests (see the sketch below for one example):

- SQL injection attempts on all endpoints
- XSS attempts in all user inputs
- CSRF token validation
- Open redirect prevention
- Timing attack resistance
- Rate limiting enforcement
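As one instance from this list, the open-redirect check; a minimal sketch, assuming the authorization endpoint answers a `redirect_uri` on a foreign origin with an error page rather than a redirect:

```python
import pytest


@pytest.mark.security
async def test_open_redirect_rejected(test_client):
    """A redirect_uri on a foreign origin must never produce a redirect."""
    response = await test_client.get("/authorize", params={
        "response_type": "code",
        "client_id": "https://app.example.com",
        # Attacker-controlled target on a different origin
        "redirect_uri": "https://evil.example.com/steal",
        "state": "state123",
    })

    # The server must render an error, not bounce the user to the attacker
    assert response.status_code == 400
    assert "location" not in response.headers
```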
## Acceptance Criteria

### Coverage Requirements

- [ ] Overall test coverage ≥ 90%
- [ ] Critical path coverage ≥ 95% (auth, token, security)
- [ ] All endpoints have integration tests
- [ ] Complete E2E flow tests for all user journeys

### Test Quality Requirements

- [ ] All tests pass consistently (no flaky tests)
- [ ] Test execution time < 30 seconds for full suite
- [ ] Unit tests execute in < 5 seconds
- [ ] Tests run successfully in CI/CD pipeline

### Documentation Requirements

- [ ] All test files have module docstrings
- [ ] Complex tests have explanatory comments
- [ ] Test fixtures are documented
- [ ] Coverage gaps are identified and tracked

### Integration Requirements

- [ ] Tests verify component interactions
- [ ] Database operations are tested
- [ ] External service mocks are comprehensive
- [ ] Middleware chain is tested

### E2E Requirements

- [ ] Complete authentication flow tested
- [ ] Domain verification flows tested
- [ ] Error scenarios comprehensively tested
- [ ] Real-world usage patterns covered

## Implementation Priority

### Phase 1: Integration Tests (2-3 days)

1. API endpoint integration tests
2. Service layer integration tests
3. Middleware chain tests
4. Database integration tests

### Phase 2: E2E Tests (2-3 days)

1. Complete authentication flow
2. Domain verification flows
3. Error scenario testing
4. Client interaction tests

### Phase 3: Gap Remediation (1-2 days)

1. Analyze coverage report
2. Write targeted tests for gaps
3. Refactor existing tests
4. Update test documentation

### Phase 4: Performance & Security (1 day)

1. Performance benchmarks
2. Security test suite
3. Load testing scenarios
4. Chaos testing (optional)

## Success Metrics

The test suite expansion is successful when:

1. Coverage targets are achieved (90%+ overall, 95%+ on critical paths)
2. All integration tests pass consistently
3. E2E tests validate complete user journeys
4. No critical bugs are found in tested code paths
5. Test execution remains fast and reliable
6. New features can be added safely under test protection

## Technical Debt Considerations

### Current Debt

- Missing verification endpoint tests (48% coverage)
- Incomplete error scenario coverage
- No performance benchmarks
- Limited security test coverage

### Debt Prevention

- Maintain test coverage thresholds
- Require tests for all new features
- Regular test refactoring
- Performance regression detection

## Notes

This comprehensive test expansion ensures the IndieAuth server operates correctly as a complete system. The focus on integration and E2E testing validates that individual components work together properly and that users can successfully complete authentication flows. The structured approach, with clear organization, shared fixtures, and targeted gap remediation, provides confidence in the implementation's correctness and security.