feat(test): add Phase 5b integration and E2E tests
Add comprehensive integration and end-to-end test suites:

- Integration tests for API flows (authorization, token, verification)
- Integration tests for middleware chain and security headers
- Integration tests for domain verification services
- E2E tests for complete authentication flows
- E2E tests for error scenarios and edge cases
- Shared test fixtures and utilities in conftest.py
- Rename Dockerfile to Containerfile for Podman compatibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
docs/designs/phase-5b-clarifications.md (new file, 255 lines)
@@ -0,0 +1,255 @@
# Phase 5b Implementation Clarifications

This document provides clear answers to the Developer's implementation questions for Phase 5b.

## Questions and Answers

### 1. E2E Browser Automation

**Question**: Should we use Playwright/Selenium for browser automation, or TestClient-based flow simulation?

**Decision**: Use TestClient-based flow simulation.

**Rationale**:
- Simpler and more maintainable - no browser drivers to manage
- Faster execution - no browser startup overhead
- Better CI/CD compatibility - no headless browser configuration
- Sufficient for protocol compliance testing - we're testing OAuth flows, not UI rendering
- Aligns with existing test patterns in the codebase

**Implementation Guidance**:
```python
# Use FastAPI TestClient with session persistence
from fastapi.testclient import TestClient

def test_full_authorization_flow():
    client = TestClient(app)
    # Simulate full OAuth flow through TestClient
    # Parse HTML responses where needed for form submission
```
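
Where a step returns an HTML form (for example the consent page), the TestClient response body can be parsed with the standard library instead of a browser. A minimal sketch, assuming the consent page uses plain `<input>` fields; the helper name is illustrative and not part of the codebase:

```python
# Minimal sketch: collect <input name="..." value="..."> pairs from an HTML
# response so a TestClient-driven flow can re-submit the form.
from html.parser import HTMLParser


class FormInputCollector(HTMLParser):
    """Collect input field names and values from an HTML form response."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attrs = dict(attrs)
            if "name" in attrs:
                self.fields[attrs["name"]] = attrs.get("value", "")


def extract_form_fields(html: str) -> dict:
    """Parse an HTML response and return its form input fields."""
    collector = FormInputCollector()
    collector.feed(html)
    return collector.fields
```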

### 2. Database Fixtures

**Question**: Design shows async SQLAlchemy but codebase uses sync. Should tests use existing sync patterns?

**Decision**: Use existing sync patterns.

**Rationale**:
- Consistency with current codebase (Database class uses sync SQLAlchemy)
- No need to introduce async complexity for testing
- Simpler fixture management

**Implementation Guidance**:
```python
# Keep using sync patterns as in existing database/connection.py
@pytest.fixture
def test_db():
    """Create test database with sync SQLAlchemy."""
    db = Database("sqlite:///:memory:")
    db.initialize()
    yield db
    # cleanup
```

### 3. Parallel Test Execution

**Question**: Should pytest-xdist be added for parallel test execution?

**Decision**: No, not for Phase 5b.

**Rationale**:
- Current test suite is small enough for sequential execution
- Avoids complexity of test isolation for parallel runs
- Can be added later if test execution time becomes a problem
- KISS principle - don't add infrastructure we don't need yet

**Implementation Guidance**:
- Run tests sequentially with standard pytest
- Document in test README that parallel execution can be considered for future optimization

### 4. Performance Benchmarks

**Question**: Should pytest-benchmark be added? How to handle potentially flaky CI tests?

**Decision**: No benchmarking in Phase 5b.

**Rationale**:
- Performance testing is not in Phase 5b scope
- Focus on functional correctness and security first
- Performance optimization is premature at this stage
- Can be added in a dedicated performance phase if needed

**Implementation Guidance**:
- Skip any performance-related tests for now
- Focus on correctness and security tests only

### 5. Coverage Thresholds

**Question**: Per-module thresholds aren't natively supported by coverage.py. What approach?

**Decision**: Use global threshold of 80% for Phase 5b.

**Rationale**:
- Simple to implement and verify
- coverage.py supports this natively with `fail_under`
- Per-module thresholds add unnecessary complexity
- 80% is a reasonable target for this phase

**Implementation Guidance**:
```ini
# In pyproject.toml
[tool.coverage.report]
fail_under = 80
```

### 6. Consent Flow Testing

**Question**: Design shows `/consent` with JSON but implementation is `/authorize/consent` with HTML forms. Which to follow?

**Decision**: Follow the actual implementation: `/authorize/consent` with HTML forms.

**Rationale**:
- Test the system as it actually works
- The design document was conceptual; implementation is authoritative
- HTML form testing is more realistic for IndieAuth flows

**Implementation Guidance**:
```python
def test_consent_form_submission():
    # POST to /authorize/consent with form data
    response = client.post(
        "/authorize/consent",
        data={
            "client_id": "...",
            "redirect_uri": "...",
            # ... other form fields
        }
    )
```

### 7. Fixtures Directory

**Question**: Create new `tests/fixtures/` or keep existing `conftest.py` pattern?

**Decision**: Keep existing `conftest.py` pattern.

**Rationale**:
- Consistency with current test structure
- pytest naturally discovers fixtures in conftest.py
- No need to introduce new patterns
- Can organize fixtures within conftest.py with clear sections

**Implementation Guidance**:
```python
# In tests/conftest.py, add new fixtures with clear sections:

# === Database Fixtures ===
@pytest.fixture
def test_database():
    """Test database fixture."""
    pass

# === Client Fixtures ===
@pytest.fixture
def registered_client():
    """Pre-registered client fixture."""
    pass

# === Authorization Fixtures ===
@pytest.fixture
def valid_auth_code():
    """Valid authorization code fixture."""
    pass
```

### 8. CI/CD Workflow

**Question**: Is GitHub Actions workflow in scope for Phase 5b?

**Decision**: No, CI/CD is out of scope for Phase 5b.

**Rationale**:
- Phase 5b focuses on test implementation, not deployment infrastructure
- CI/CD should be a separate phase with its own design
- Keeps Phase 5b scope manageable

**Implementation Guidance**:
- Focus only on making tests runnable via `pytest`
- Document test execution commands in tests/README.md
- CI/CD integration can come later

### 9. DNS Mocking

**Question**: Global patching vs dependency injection override (existing pattern)?

**Decision**: Use dependency injection override pattern (existing in codebase).

**Rationale**:
- Consistency with existing patterns (see get_database, get_verification_service)
- More explicit and controllable
- Easier to reason about in tests
- Avoids global state issues

**Implementation Guidance**:
```python
# Use FastAPI dependency override pattern
def test_with_mocked_dns():
    def mock_dns_service():
        service = Mock()
        service.resolve_txt.return_value = ["expected", "values"]
        return service

    app.dependency_overrides[get_dns_service] = mock_dns_service
    # run test
    app.dependency_overrides.clear()
```

### 10. HTTP Mocking

**Question**: Use `responses` library (for requests) or `respx` (for httpx)?

**Decision**: Neither - use unittest.mock for urllib.

**Rationale**:
- The codebase uses urllib.request (see HTMLFetcherService), not requests or httpx
- httpx is only in test dependencies, not used in production code
- Existing tests already mock urllib successfully
- No need to add new mocking libraries

**Implementation Guidance**:
```python
# Follow existing pattern from test_html_fetcher.py
@patch('gondulf.services.html_fetcher.urllib.request.urlopen')
def test_http_fetch(mock_urlopen):
    mock_response = MagicMock()
    mock_response.read.return_value = b"<html>...</html>"
    mock_urlopen.return_value = mock_response
    # test the fetch
```

## Summary of Decisions

1. **E2E Testing**: TestClient-based simulation (no browser automation)
2. **Database**: Sync SQLAlchemy (match existing patterns)
3. **Parallel Tests**: No (keep it simple)
4. **Benchmarks**: No (out of scope)
5. **Coverage**: Global 80% threshold
6. **Consent Endpoint**: `/authorize/consent` with HTML forms (match implementation)
7. **Fixtures**: Keep conftest.py pattern
8. **CI/CD**: Out of scope
9. **DNS Mocking**: Dependency injection pattern
10. **HTTP Mocking**: unittest.mock for urllib

## Implementation Priority

Focus on these test categories in order:
1. Integration tests for complete OAuth flows
2. Security tests for timing attacks and injection
3. Error handling tests
4. Edge case coverage

## Key Principle

**Simplicity and Consistency**: Every decision above favors simplicity and consistency with existing patterns over introducing new complexity. The goal is comprehensive testing that works with what we have, not a perfect test infrastructure.

CLARIFICATIONS PROVIDED: Phase 5b - Developer may proceed
docs/designs/phase-5b-integration-e2e-tests.md (new file, 924 lines)
@@ -0,0 +1,924 @@
# Phase 5b: Integration and End-to-End Tests Design

## Purpose

Phase 5b enhances the test suite to achieve comprehensive coverage through integration and end-to-end testing. While the current test suite has 86.93% coverage with 327 tests, critical gaps remain in verifying complete authentication flows and component interactions. This phase ensures the IndieAuth server operates correctly as a complete system, not just as individual components.

### Goals
1. Verify all components work together correctly (integration tests)
2. Validate complete IndieAuth authentication flows (E2E tests)
3. Test real-world scenarios and error conditions
4. Achieve 90%+ overall coverage with 95%+ on critical paths
5. Ensure test reliability and maintainability

## Specification References

### W3C IndieAuth Requirements
- Section 5.2: Authorization Endpoint - complete flow validation
- Section 5.3: Token Endpoint - code exchange validation
- Section 5.4: Token Verification - end-to-end verification
- Section 6: Client Information Discovery - metadata integration
- Section 7: Security Considerations - comprehensive security testing

### OAuth 2.0 RFC 6749
- Section 4.1: Authorization Code Grant - full flow testing
- Section 10: Security Considerations - threat mitigation verification

## Design Overview

The testing expansion follows a three-layer approach:

1. **Integration Layer**: Tests component interactions within the system
2. **End-to-End Layer**: Tests complete user flows from start to finish
3. **Scenario Layer**: Tests real-world usage patterns and edge cases

### Test Organization Structure
```
tests/
├── integration/              # Component interaction tests
│   ├── api/                  # API endpoint integration
│   │   ├── test_auth_token_flow.py
│   │   ├── test_metadata_integration.py
│   │   └── test_verification_flow.py
│   ├── services/             # Service layer integration
│   │   ├── test_domain_email_integration.py
│   │   ├── test_token_storage_integration.py
│   │   └── test_client_metadata_integration.py
│   └── middleware/           # Middleware chain tests
│       ├── test_security_chain.py
│       └── test_https_headers_integration.py
│
├── e2e/                      # End-to-end flow tests
│   ├── test_complete_auth_flow.py
│   ├── test_domain_verification_flow.py
│   ├── test_error_scenarios.py
│   └── test_client_interactions.py
│
└── fixtures/                 # Shared test fixtures
    ├── domains.py            # Domain test data
    ├── clients.py            # Client configurations
    ├── tokens.py             # Token fixtures
    └── mocks.py              # External service mocks
```

## Component Details

### 1. Integration Test Suite Expansion

#### 1.1 API Endpoint Integration Tests

**File**: `tests/integration/api/test_auth_token_flow.py`

Tests the complete interaction between authorization and token endpoints:

```python
class TestAuthTokenFlow:
    """Test authorization and token endpoint integration."""

    async def test_successful_auth_to_token_flow(self, test_client, mock_domain):
        """Test complete flow from authorization to token generation."""
        # 1. Start authorization request
        auth_response = await test_client.get("/authorize", params={
            "response_type": "code",
            "client_id": "https://app.example.com",
            "redirect_uri": "https://app.example.com/callback",
            "state": "random_state",
            "code_challenge": "challenge",
            "code_challenge_method": "S256",
            "me": mock_domain.url
        })

        # 2. Verify domain ownership (mocked as verified)
        # 3. User consents
        consent_response = await test_client.post("/consent", data={
            "auth_request_id": auth_response.json()["request_id"],
            "consent": "approve"
        })

        # 4. Extract authorization code from redirect
        location = consent_response.headers["location"]
        code = extract_code_from_redirect(location)

        # 5. Exchange code for token
        token_response = await test_client.post("/token", data={
            "grant_type": "authorization_code",
            "code": code,
            "client_id": "https://app.example.com",
            "redirect_uri": "https://app.example.com/callback",
            "code_verifier": "verifier"
        })

        # Assertions
        assert token_response.status_code == 200
        assert "access_token" in token_response.json()
        assert "me" in token_response.json()

    async def test_code_replay_prevention(self, test_client, valid_auth_code):
        """Test that authorization codes cannot be reused."""
        # First exchange should succeed
        # Second exchange should fail with 400 Bad Request

    async def test_code_expiration(self, test_client, freezer):
        """Test that expired codes are rejected."""
        # Generate code
        # Advance time beyond expiration
        # Attempt exchange should fail
```
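
The helper `extract_code_from_redirect` used above is not defined in this design; a minimal sketch of what it could look like, using only the standard library:

```python
# Hypothetical helper assumed by the tests above: pull the authorization code
# out of the redirect Location header.
from urllib.parse import parse_qs, urlparse


def extract_code_from_redirect(location: str) -> str:
    """Return the `code` query parameter from a redirect URL."""
    query = parse_qs(urlparse(location).query)
    return query["code"][0]
```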

**File**: `tests/integration/api/test_metadata_integration.py`

Tests client metadata fetching and caching:

```python
class TestMetadataIntegration:
    """Test client metadata discovery integration."""

    async def test_happ_metadata_fetch_and_display(self, test_client, mock_http):
        """Test h-app metadata fetching and authorization page display."""
        # Mock client_id URL to return h-app microformat
        mock_http.get("https://app.example.com", text="""
            <div class="h-app">
                <h1 class="p-name">Example App</h1>
                <img class="u-logo" src="/logo.png" />
            </div>
        """)

        # Request authorization
        response = await test_client.get("/authorize", params={
            "client_id": "https://app.example.com",
            # ... other params
        })

        # Verify metadata appears in consent page
        assert "Example App" in response.text
        assert "logo.png" in response.text

    async def test_metadata_caching(self, test_client, mock_http, db_session):
        """Test that client metadata is cached after first fetch."""
        # First request fetches from HTTP
        # Second request uses cache
        # Verify only one HTTP call made

    async def test_metadata_fallback(self, test_client, mock_http):
        """Test fallback when client has no h-app metadata."""
        # Mock client_id URL with no h-app
        # Verify domain name used as fallback
```

#### 1.2 Service Layer Integration Tests

**File**: `tests/integration/services/test_domain_email_integration.py`

Tests domain verification service integration:

```python
class TestDomainEmailIntegration:
    """Test domain verification with email service integration."""

    async def test_dns_then_email_fallback(self, domain_service, dns_service, email_service):
        """Test DNS check fails, falls back to email verification."""
        # Mock DNS to return no TXT records
        dns_service.mock_empty_response()

        # Request verification
        result = await domain_service.initiate_verification("user.example.com")

        # Should send email
        assert email_service.send_called
        assert result.method == "email"

    async def test_verification_result_storage(self, domain_service, db_session):
        """Test verification results are properly stored."""
        # Verify domain
        await domain_service.verify_domain("user.example.com", method="dns")

        # Check database
        stored = db_session.query(DomainVerification).filter_by(
            domain="user.example.com"
        ).first()
        assert stored.verified is True
        assert stored.method == "dns"
```

**File**: `tests/integration/services/test_token_storage_integration.py`

Tests token service with storage integration:

```python
class TestTokenStorageIntegration:
    """Test token service with database storage."""

    async def test_token_lifecycle(self, token_service, storage_service):
        """Test complete token lifecycle: create, store, retrieve, expire."""
        # Create token
        token = await token_service.create_access_token(
            client_id="https://app.example.com",
            me="https://user.example.com"
        )

        # Verify stored
        stored = await storage_service.get_token(token.value)
        assert stored is not None

        # Verify retrieval
        retrieved = await token_service.validate_token(token.value)
        assert retrieved.client_id == "https://app.example.com"

        # Test expiration
        with freeze_time(datetime.now() + timedelta(hours=2)):
            expired = await token_service.validate_token(token.value)
            assert expired is None

    async def test_concurrent_token_operations(self, token_service):
        """Test thread-safety of token operations."""
        # Create multiple tokens concurrently
        # Verify no collisions or race conditions
```

#### 1.3 Middleware Chain Tests

**File**: `tests/integration/middleware/test_security_chain.py`

Tests security middleware integration:

```python
class TestSecurityMiddlewareChain:
    """Test security middleware working together."""

    async def test_complete_security_chain(self, test_client):
        """Test all security middleware in sequence."""
        # Make HTTPS request
        response = await test_client.get(
            "https://server.example.com/authorize",
            headers={"X-Forwarded-Proto": "https"}
        )

        # Verify all security headers present
        assert response.headers["X-Frame-Options"] == "DENY"
        assert response.headers["X-Content-Type-Options"] == "nosniff"
        assert "Content-Security-Policy" in response.headers
        assert response.headers["Strict-Transport-Security"]

    async def test_http_redirect_with_headers(self, test_client):
        """Test HTTP->HTTPS redirect includes security headers."""
        response = await test_client.get(
            "http://server.example.com/authorize",
            follow_redirects=False
        )

        assert response.status_code == 307
        assert response.headers["Location"].startswith("https://")
        assert response.headers["X-Frame-Options"] == "DENY"
```

### 2. End-to-End Authentication Flow Tests

**File**: `tests/e2e/test_complete_auth_flow.py`

Complete IndieAuth flow testing:

```python
class TestCompleteAuthFlow:
    """Test complete IndieAuth authentication flows."""

    async def test_first_time_user_flow(self, browser, test_server):
        """Test complete flow for new user."""
        # 1. Client initiates authorization
        await browser.goto(f"{test_server}/authorize?client_id=...")

        # 2. User enters domain
        await browser.fill("#domain", "user.example.com")
        await browser.click("#verify")

        # 3. Domain verification (DNS)
        await browser.wait_for_selector(".verification-success")

        # 4. User reviews client info
        assert await browser.text_content(".client-name") == "Test App"

        # 5. User consents
        await browser.click("#approve")

        # 6. Redirect with code
        assert "code=" in browser.url

        # 7. Client exchanges code for token
        token_response = await exchange_code(extract_code(browser.url))
        assert token_response["me"] == "https://user.example.com"

    async def test_returning_user_flow(self, browser, test_server, existing_domain):
        """Test flow for user with verified domain."""
        # Should skip verification step
        # Should recognize returning user

    async def test_multiple_redirect_uris(self, browser, test_server):
        """Test client with multiple registered redirect URIs."""
        # Verify correct URI validation
        # Test selection if multiple valid
```

**File**: `tests/e2e/test_domain_verification_flow.py`

Domain verification E2E tests:

```python
class TestDomainVerificationE2E:
    """Test complete domain verification flows."""

    async def test_dns_verification_flow(self, browser, test_server, mock_dns):
        """Test DNS TXT record verification flow."""
        # Setup mock DNS
        mock_dns.add_txt_record(
            "user.example.com",
            "indieauth=https://server.example.com"
        )

        # Start verification
        await browser.goto(f"{test_server}/verify")
        await browser.fill("#domain", "user.example.com")
        await browser.click("#verify-dns")

        # Should auto-detect and verify
        await browser.wait_for_selector(".verified", timeout=5000)
        assert await browser.text_content(".method") == "DNS TXT Record"

    async def test_email_verification_flow(self, browser, test_server, mock_smtp):
        """Test email-based verification flow."""
        # Start verification
        await browser.goto(f"{test_server}/verify")
        await browser.fill("#domain", "user.example.com")
        await browser.click("#verify-email")

        # Check email sent
        assert mock_smtp.messages_sent == 1
        verification_link = extract_link(mock_smtp.last_message)

        # Click verification link
        await browser.goto(verification_link)

        # Enter code from email
        code = extract_code(mock_smtp.last_message)
        await browser.fill("#code", code)
        await browser.click("#confirm")

        # Should be verified
        assert await browser.text_content(".status") == "Verified"

    async def test_both_methods_available(self, browser, test_server):
        """Test when both DNS and email verification available."""
        # Should prefer DNS
        # Should allow manual email selection
```

**File**: `tests/e2e/test_error_scenarios.py`

Error scenario E2E tests:

```python
class TestErrorScenariosE2E:
    """Test error handling in complete flows."""

    async def test_invalid_client_id(self, test_client):
        """Test flow with invalid client_id."""
        response = await test_client.get("/authorize", params={
            "client_id": "not-a-url",
            "redirect_uri": "https://app.example.com/callback"
        })

        assert response.status_code == 400
        assert response.json()["error"] == "invalid_request"

    async def test_expired_authorization_code(self, test_client, freezer):
        """Test token exchange with expired code."""
        # Generate code
        code = await generate_auth_code()

        # Advance time past expiration
        freezer.move_to(datetime.now() + timedelta(minutes=15))

        # Attempt exchange
        response = await test_client.post("/token", data={
            "code": code,
            "grant_type": "authorization_code"
        })

        assert response.status_code == 400
        assert response.json()["error"] == "invalid_grant"

    async def test_mismatched_redirect_uri(self, test_client):
        """Test token request with different redirect_uri."""
        # Authorization with one redirect_uri
        # Token request with different redirect_uri
        # Should fail

    async def test_network_timeout_handling(self, test_client, slow_http):
        """Test handling of slow client_id fetches."""
        slow_http.add_delay("https://slow-app.example.com", delay=10)

        # Should timeout and use fallback
        response = await test_client.get("/authorize", params={
            "client_id": "https://slow-app.example.com"
        })

        # Should still work but without metadata
        assert response.status_code == 200
        assert "slow-app.example.com" in response.text  # Fallback to domain
```

### 3. Test Data and Fixtures

**File**: `tests/fixtures/domains.py`

Domain test fixtures:

```python
@pytest.fixture
def verified_domain(db_session):
    """Create pre-verified domain."""
    domain = DomainVerification(
        domain="user.example.com",
        verified=True,
        method="dns",
        verified_at=datetime.utcnow()
    )
    db_session.add(domain)
    db_session.commit()
    return domain

@pytest.fixture
def pending_domain(db_session):
    """Create domain pending verification."""
    domain = DomainVerification(
        domain="pending.example.com",
        verified=False,
        verification_code="123456",
        created_at=datetime.utcnow()
    )
    db_session.add(domain)
    db_session.commit()
    return domain

@pytest.fixture
def multiple_domains(db_session):
    """Create multiple test domains."""
    domains = [
        DomainVerification(domain=f"user{i}.example.com", verified=True)
        for i in range(5)
    ]
    db_session.add_all(domains)
    db_session.commit()
    return domains
```

**File**: `tests/fixtures/clients.py`

Client configuration fixtures:

```python
@pytest.fixture
def simple_client():
    """Basic IndieAuth client configuration."""
    return {
        "client_id": "https://app.example.com",
        "redirect_uri": "https://app.example.com/callback",
        "client_name": "Example App",
        "client_uri": "https://app.example.com",
        "logo_uri": "https://app.example.com/logo.png"
    }

@pytest.fixture
def client_with_metadata(mock_http):
    """Client with h-app microformat metadata."""
    mock_http.get("https://rich-app.example.com", text="""
        <html>
            <body>
                <div class="h-app">
                    <h1 class="p-name">Rich Application</h1>
                    <img class="u-logo" src="/assets/logo.png" alt="Logo">
                    <a class="u-url" href="/">Home</a>
                </div>
            </body>
        </html>
    """)

    return {
        "client_id": "https://rich-app.example.com",
        "redirect_uri": "https://rich-app.example.com/auth/callback"
    }

@pytest.fixture
def malicious_client():
    """Client with potentially malicious configuration."""
    return {
        "client_id": "https://evil.example.com",
        "redirect_uri": "https://evil.example.com/steal",
        "state": "<script>alert('xss')</script>"
    }
```

**File**: `tests/fixtures/mocks.py`

External service mocks:

```python
@pytest.fixture
def mock_dns(monkeypatch):
    """Mock DNS resolver."""
    class MockDNS:
        def __init__(self):
            self.txt_records = {}

        def add_txt_record(self, domain, value):
            self.txt_records[domain] = [value]

        def resolve(self, domain, rdtype):
            if rdtype == "TXT" and domain in self.txt_records:
                return MockAnswer(self.txt_records[domain])
            raise NXDOMAIN()

    mock = MockDNS()
    monkeypatch.setattr("dns.resolver.Resolver", lambda: mock)
    return mock

@pytest.fixture
def mock_smtp(monkeypatch):
    """Mock SMTP server."""
    class MockSMTP:
        def __init__(self):
            self.messages_sent = 0
            self.last_message = None

        def send_message(self, msg):
            self.messages_sent += 1
            self.last_message = msg

    mock = MockSMTP()
    monkeypatch.setattr("smtplib.SMTP_SSL", lambda *args: mock)
    return mock

@pytest.fixture
def mock_http(responses):
    """Mock HTTP responses using responses library."""
    return responses

@pytest.fixture
async def test_database():
    """Provide clean test database."""
    # Create in-memory SQLite database
    engine = create_async_engine("sqlite+aiosqlite:///:memory:")

    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

    async_session = sessionmaker(engine, class_=AsyncSession)

    async with async_session() as session:
        yield session

    await engine.dispose()
```

### 4. Coverage Enhancement Strategy

#### 4.1 Target Coverage by Module

```toml
# Coverage targets in pyproject.toml
[tool.coverage.report]
fail_under = 90
precision = 2
exclude_lines = [
    "pragma: no cover",
    "def __repr__",
    "raise AssertionError",
    "raise NotImplementedError",
    "if __name__ == .__main__.:",
    "if TYPE_CHECKING:"
]

[tool.coverage.run]
source = ["src/gondulf"]
omit = [
    "*/tests/*",
    "*/migrations/*",
    "*/__main__.py"
]

# Per-module thresholds
[tool.coverage.module]
"gondulf.routers.authorization" = 95
"gondulf.routers.token" = 95
"gondulf.services.token_service" = 95
"gondulf.services.domain_verification" = 90
"gondulf.security" = 95
"gondulf.models" = 85
```
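
Note that `[tool.coverage.module]` is not a section coverage.py understands natively (see the Phase 5b clarifications), so per-module thresholds would need a small helper around coverage's JSON report. A hedged sketch, assuming `coverage json` has been run; the file paths and the script name are illustrative:

```python
# Sketch: enforce per-module coverage thresholds from coverage.py's JSON report.
# Assumes `coverage json` has produced coverage.json in the working directory.
import json
import sys

THRESHOLDS = {
    "src/gondulf/routers/authorization.py": 95,
    "src/gondulf/routers/token.py": 95,
    "src/gondulf/services/token_service.py": 95,
}

def main() -> int:
    with open("coverage.json") as fh:
        report = json.load(fh)
    failures = []
    for path, minimum in THRESHOLDS.items():
        summary = report["files"].get(path, {}).get("summary", {})
        percent = summary.get("percent_covered", 0.0)
        if percent < minimum:
            failures.append(f"{path}: {percent:.1f}% < {minimum}%")
    for failure in failures:
        print(failure)
    return 1 if failures else 0

if __name__ == "__main__":
    sys.exit(main())
```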

#### 4.2 Gap Analysis and Remediation

Current gaps (from coverage report):
- `routers/verification.py`: 48% - Needs complete flow testing
- `routers/token.py`: 88% - Missing error scenarios
- `services/token_service.py`: 92% - Missing edge cases
- `services/happ_parser.py`: 97% - Missing malformed HTML cases

Remediation tests:

```python
# tests/integration/api/test_verification_gap.py
class TestVerificationEndpointGaps:
    """Fill coverage gaps in verification endpoint."""

    async def test_verify_dns_preference(self):
        """Test DNS verification preference over email."""

    async def test_verify_email_fallback(self):
        """Test email fallback when DNS unavailable."""

    async def test_verify_both_methods_fail(self):
        """Test handling when both verification methods fail."""

# tests/unit/test_token_service_gaps.py
class TestTokenServiceGaps:
    """Fill coverage gaps in token service."""

    def test_token_cleanup_expired(self):
        """Test cleanup of expired tokens."""

    def test_token_collision_handling(self):
        """Test handling of token ID collisions."""
```

### 5. Test Execution Framework

#### 5.1 Parallel Test Execution

```ini
# pytest.ini configuration
[pytest]
minversion = 7.0
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*

# Parallel execution
addopts =
    -n auto
    --dist loadscope
    --maxfail 5
    --strict-markers

# Test markers
markers =
    unit: Unit tests (fast, isolated)
    integration: Integration tests (component interaction)
    e2e: End-to-end tests (complete flows)
    security: Security-specific tests
    slow: Tests that take >1 second
    requires_network: Tests requiring network access
```

#### 5.2 Test Organization

```python
# conftest.py - Shared configuration
import pytest
from typing import AsyncGenerator

# Auto-use fixtures for all tests
@pytest.fixture(autouse=True)
async def reset_database(test_database):
    """Reset database state between tests."""
    await test_database.execute("DELETE FROM tokens")
    await test_database.execute("DELETE FROM auth_codes")
    await test_database.execute("DELETE FROM domain_verifications")
    await test_database.commit()

@pytest.fixture(autouse=True)
def reset_rate_limiter(rate_limiter):
    """Clear rate limiter between tests."""
    rate_limiter.reset()

# Shared test utilities
class TestBase:
    """Base class for test organization."""

    @staticmethod
    def generate_auth_request(**kwargs):
        """Generate valid authorization request."""
        defaults = {
            "response_type": "code",
            "client_id": "https://app.example.com",
            "redirect_uri": "https://app.example.com/callback",
            "state": "random_state",
            "code_challenge": "challenge",
            "code_challenge_method": "S256"
        }
        defaults.update(kwargs)
        return defaults
```

### 6. Performance Benchmarks

#### 6.1 Response Time Tests

```python
# tests/performance/test_response_times.py
class TestResponseTimes:
    """Ensure response times meet requirements."""

    @pytest.mark.benchmark
    async def test_authorization_endpoint_performance(self, test_client, benchmark):
        """Authorization endpoint must respond in <200ms."""

        def make_request():
            return test_client.get("/authorize", params={
                "response_type": "code",
                "client_id": "https://app.example.com"
            })

        result = benchmark(make_request)
        assert result.response_time < 0.2  # 200ms

    @pytest.mark.benchmark
    async def test_token_endpoint_performance(self, test_client, benchmark):
        """Token endpoint must respond in <100ms."""

        def exchange_token():
            return test_client.post("/token", data={
                "grant_type": "authorization_code",
                "code": "test_code"
            })

        result = benchmark(exchange_token)
        assert result.response_time < 0.1  # 100ms
```

## Testing Strategy

### Test Reliability

1. **Isolation**: Each test runs in isolation with clean state
2. **Determinism**: No random failures, use fixed seeds and frozen time (see the sketch after this list)
3. **Speed**: Unit tests <1ms, integration <100ms, E2E <1s
4. **Independence**: Tests can run in any order without dependencies
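
As an illustration of the determinism point, a minimal sketch of pinning the clock and the random seed in a test; freezegun is already used elsewhere in this design, and the seed value is arbitrary:

```python
# Sketch: deterministic test setup - frozen clock and fixed random seed.
import random
from datetime import datetime

from freezegun import freeze_time


@freeze_time("2024-01-01 12:00:00")
def test_expiry_check_is_deterministic():
    random.seed(1234)                    # fixed seed for any randomized test data
    assert datetime.now().year == 2024   # clock is pinned for the whole test
```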

### Test Maintenance

1. **DRY Principle**: Shared fixtures and utilities
2. **Clear Names**: Test names describe what is being tested
3. **Documentation**: Each test includes docstring explaining purpose
4. **Refactoring**: Regular cleanup of redundant or obsolete tests

### Continuous Integration

```yaml
# .github/workflows/test.yml
name: Test Suite

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        python-version: [3.11, 3.12]
        test-type: [unit, integration, e2e, security]

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          pip install uv
          uv sync --dev

      - name: Run ${{ matrix.test-type }} tests
        run: |
          uv run pytest tests/${{ matrix.test-type }} \
            --cov=src/gondulf \
            --cov-report=xml \
            --cov-report=term-missing

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          flags: ${{ matrix.test-type }}

      - name: Check coverage threshold
        run: |
          uv run python -m coverage report --fail-under=90
```

## Security Considerations

### Test Data Security

1. **No Production Data**: Never use real user data in tests
2. **Mock Secrets**: Generate test keys/tokens dynamically (see the sketch below)
3. **Secure Fixtures**: Don't commit sensitive test data
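
One way to satisfy the "mock secrets" point is to generate throwaway values per test run rather than committing fixed keys; a minimal sketch using the standard library (the fixture name is illustrative):

```python
# Sketch: generate a fresh, throwaway secret for each test instead of
# committing fixed keys or tokens to the repository.
import secrets

import pytest


@pytest.fixture
def test_secret_key() -> str:
    """Return a random secret unique to this test invocation."""
    return secrets.token_urlsafe(32)
```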

### Security Test Coverage

Required security tests (an example sketch follows the list):
- SQL injection attempts on all endpoints
- XSS attempts in all user inputs
- CSRF token validation
- Open redirect prevention
- Timing attack resistance
- Rate limiting enforcement
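
For example, an open-redirect check could look roughly like the sketch below; it is TestClient-based, and the endpoint, parameter names, and error behaviour are assumed from the flows described earlier rather than confirmed against the implementation:

```python
# Sketch: open redirect prevention - a redirect_uri pointing at a foreign host
# must not be accepted by the authorization endpoint.
def test_open_redirect_rejected(test_client):
    response = test_client.get("/authorize", params={
        "response_type": "code",
        "client_id": "https://app.example.com",
        "redirect_uri": "https://attacker.example.org/steal",  # not on the client_id host
        "state": "random_state",
    })
    # The server must refuse rather than redirect to the attacker-controlled URI.
    assert response.status_code == 400
    assert "attacker.example.org" not in response.headers.get("location", "")
```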

## Acceptance Criteria

### Coverage Requirements
- [ ] Overall test coverage ≥ 90%
- [ ] Critical path coverage ≥ 95% (auth, token, security)
- [ ] All endpoints have integration tests
- [ ] Complete E2E flow tests for all user journeys

### Test Quality Requirements
- [ ] All tests pass consistently (no flaky tests)
- [ ] Test execution time < 30 seconds for full suite
- [ ] Unit tests execute in < 5 seconds
- [ ] Tests run successfully in CI/CD pipeline

### Documentation Requirements
- [ ] All test files have module docstrings
- [ ] Complex tests have explanatory comments
- [ ] Test fixtures are documented
- [ ] Coverage gaps are identified and tracked

### Integration Requirements
- [ ] Tests verify component interactions
- [ ] Database operations are tested
- [ ] External service mocks are comprehensive
- [ ] Middleware chain is tested

### E2E Requirements
- [ ] Complete authentication flow tested
- [ ] Domain verification flows tested
- [ ] Error scenarios comprehensively tested
- [ ] Real-world usage patterns covered

## Implementation Priority

### Phase 1: Integration Tests (2-3 days)
1. API endpoint integration tests
2. Service layer integration tests
3. Middleware chain tests
4. Database integration tests

### Phase 2: E2E Tests (2-3 days)
1. Complete authentication flow
2. Domain verification flows
3. Error scenario testing
4. Client interaction tests

### Phase 3: Gap Remediation (1-2 days)
1. Analyze coverage report
2. Write targeted tests for gaps
3. Refactor existing tests
4. Update test documentation

### Phase 4: Performance & Security (1 day)
1. Performance benchmarks
2. Security test suite
3. Load testing scenarios
4. Chaos testing (optional)

## Success Metrics

The test suite expansion is successful when:
1. Coverage targets are achieved (90%+ overall, 95%+ critical)
2. All integration tests pass consistently
3. E2E tests validate complete user journeys
4. No critical bugs found in tested code paths
5. Test execution remains fast and reliable
6. New features can be safely added with test protection

## Technical Debt Considerations

### Current Debt
- Missing verification endpoint tests (48% coverage)
- Incomplete error scenario coverage
- No performance benchmarks
- Limited security test coverage

### Debt Prevention
- Maintain test coverage thresholds
- Require tests for all new features
- Regular test refactoring
- Performance regression detection

## Notes

This comprehensive test expansion ensures the IndieAuth server operates correctly as a complete system. The focus on integration and E2E testing validates that individual components work together properly and that users can successfully complete authentication flows. The structured approach with clear organization, shared fixtures, and targeted gap remediation provides confidence in the implementation's correctness and security.