perf(feed): Batch load media and tags to fix N+1 query

Per v1.5.0 Phase 3: Fix N+1 query pattern in feed generation.

Implementation:
- Add get_media_for_notes() to starpunk/media.py for batch media loading
- Add get_tags_for_notes() to starpunk/tags.py for batch tag loading
- Update _get_cached_notes() in starpunk/routes/public.py to use batch loading
- Add comprehensive tests in tests/test_batch_loading.py

Performance improvement:
- Before: O(n) queries (separate media and tag queries for every note)
- After: O(1) queries (3 queries total: one for all media, one for all variants, one for all tags)
- Maintains same API behavior and output format

All tests passing: 920 passed in 360.79s

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Commit b689e02e64 (parent 1b45a64920), 2025-12-17 10:42:44 -07:00
5 changed files with 760 additions and 6 deletions


@@ -0,0 +1,242 @@
# Phase 2 Architect Review: Debug File Management
**Date**: 2025-12-17
**Reviewer**: Architect Agent
**Phase**: 2 - Debug File Management
**Status**: APPROVED
---
## Executive Summary
Phase 2 implementation meets all acceptance criteria and demonstrates sound architectural decisions. The implementation follows the principle of secure defaults, properly sanitizes user-controlled input, and provides appropriate cleanup mechanisms to prevent resource exhaustion.
---
## Acceptance Criteria Verification
| Criterion | Status | Notes |
|-----------|--------|-------|
| Debug files disabled by default | PASS | `DEBUG_SAVE_FAILED_UPLOADS` defaults to `False` |
| Files older than 7 days auto-deleted | PASS | Configurable via `DEBUG_FILE_MAX_AGE_DAYS` |
| Folder size limited to 100MB | PASS | Configurable via `DEBUG_FILE_MAX_SIZE_MB` |
| Filenames sanitized (no path traversal) | PASS | Pattern: alphanumeric, `.`, `_`, `-` only, truncated to 50 chars |
| Cleanup runs on startup | PASS | Called in `create_app()` before database init |
| Tests cover all scenarios | PASS | 15 tests covering config, saving, sanitization, cleanup, startup |
---
## Code Review
### 1. Configuration (`starpunk/config.py`)
**Lines 100-103:**
```python
app.config["DEBUG_SAVE_FAILED_UPLOADS"] = os.getenv("DEBUG_SAVE_FAILED_UPLOADS", "false").lower() == "true"
app.config["DEBUG_FILE_MAX_AGE_DAYS"] = int(os.getenv("DEBUG_FILE_MAX_AGE_DAYS", "7"))
app.config["DEBUG_FILE_MAX_SIZE_MB"] = int(os.getenv("DEBUG_FILE_MAX_SIZE_MB", "100"))
```
**Assessment**: Correct implementation.
- Boolean parsing uses an explicit comparison to "true" (secure; illustrated below)
- Integer parsing uses sensible defaults
- Configuration follows existing patterns in the file
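A quick sketch of why the explicit comparison matters; this is illustrative, not code from the project:
```python
# Any non-empty string is truthy, so a naive bool() would silently enable
# the feature even for DEBUG_SAVE_FAILED_UPLOADS="false"
assert bool("false") is True
# The explicit comparison only enables it for the literal string "true"
assert ("false".lower() == "true") is False
assert ("TRUE".lower() == "true") is True
```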
### 2. Filename Sanitization (`starpunk/media.py`)
**Lines 141-143:**
```python
# Sanitize filename to prevent path traversal (v1.5.0 security fix)
safe_filename = "".join(c for c in filename if c.isalnum() or c in "._-")[:50]
debug_file = debug_dir / f"failed_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{safe_filename}"
```
**Assessment**: Correct implementation of specified pattern.
- Allowlist approach (only alphanumeric, `.`, `_`, `-`)
- Truncation to 50 characters prevents filesystem issues
- Path traversal attack vectors (`../`, `..\\`, etc.) are neutralized
**Security Note**: The pattern `"._-"` correctly allows:
- `.` for file extensions
- `_` for separators
- `-` for common filename characters
The pattern removes:
- `/` and `\` (path separators)
- `:` (Windows drive letters)
- `<>"|?*` (shell metacharacters)
- Null bytes and other control characters
### 3. Cleanup Function (`starpunk/media.py`)
**Lines 785-857:**
**Assessment**: Well-designed two-phase cleanup algorithm.
**Phase 1 - Age-based cleanup:**
- Correctly calculates cutoff date
- Uses `timedelta` for date arithmetic
- Removes from list while iterating (uses slice copy correctly)
**Phase 2 - Size-based cleanup:**
- Correctly sums remaining file sizes
- Deletes oldest first (proper FIFO approach)
- Loop terminates when under limit or no files remain
**Edge cases handled:**
- Non-existent directory (early return)
- Empty directory (early return after glob)
- OSError during deletion (logged, continues)
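The cleanup source is not quoted in this review, so for the record, a minimal sketch of the two-phase algorithm as described above. The directory location, the `failed_*` prefix, and the helper name are assumptions, not the actual implementation:
```python
from datetime import datetime, timedelta
from pathlib import Path


def cleanup_old_debug_files_sketch(app):
    debug_dir = Path(app.config["DATA_PATH"]) / "debug"  # location assumed
    if not debug_dir.exists():
        return  # Edge case: non-existent directory

    # Oldest first, so the size-based deletion below is FIFO
    files = sorted(debug_dir.glob("failed_*"), key=lambda p: p.stat().st_mtime)
    if not files:
        return  # Edge case: empty directory

    # Phase 1: age-based cleanup
    cutoff = datetime.now() - timedelta(days=app.config["DEBUG_FILE_MAX_AGE_DAYS"])
    for f in files[:]:  # iterate over a slice copy while mutating the list
        if datetime.fromtimestamp(f.stat().st_mtime) < cutoff:
            try:
                f.unlink()
                files.remove(f)
            except OSError:
                app.logger.warning("Could not delete debug file: %s", f)

    # Phase 2: size-based cleanup, oldest first, until under the cap
    max_bytes = app.config["DEBUG_FILE_MAX_SIZE_MB"] * 1024 * 1024
    total = sum(f.stat().st_size for f in files)
    while files and total > max_bytes:
        oldest = files.pop(0)
        try:
            total -= oldest.stat().st_size
            oldest.unlink()
        except OSError:
            app.logger.warning("Could not delete debug file: %s", oldest)
```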
### 4. Startup Integration (`starpunk/__init__.py`)
**Lines 130-132:**
```python
# Clean up old debug files (v1.5.0 Phase 2)
from starpunk.media import cleanup_old_debug_files
cleanup_old_debug_files(app)
```
**Assessment**: Correct placement.
- After logging configuration (can log cleanup actions)
- Before database initialization (no DB dependency)
- Uses the app instance directly (`current_app` is unavailable here, since no application context exists during startup)
### 5. Test Coverage
**15 tests in 5 test classes:**
| Class | Tests | Purpose |
|-------|-------|---------|
| `TestDebugFileConfiguration` | 4 | Verify config defaults and override |
| `TestDebugFileSaving` | 2 | Verify conditional saving behavior |
| `TestDebugFilenameSanitization` | 3 | Security: path traversal, special chars, truncation |
| `TestDebugFileCleanup` | 5 | Age/size limits, edge cases |
| `TestDebugFileStartupCleanup` | 1 | Integration: cleanup during app init |
**Assessment**: Comprehensive coverage of all requirements.
---
## Security Analysis
### Path Traversal Prevention
The sanitization pattern effectively prevents all known path traversal attacks:
| Attack Vector | Input | Sanitized Output |
|---------------|-------|------------------|
| Basic traversal | `../../../etc/passwd` | `......etcpasswd` |
| Windows traversal | `..\..\..\system32` | `......system32` |
| Encoded traversal | `%2e%2e%2f` | `2e2e2f` (the framework normally decodes percent-encoding before the filename reaches this code; this row shows a raw encoded string surviving) |
| Null byte | `file\x00.jpg` | `file.jpg` |
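These outputs can be checked by reproducing the quoted allowlist expression; a small self-contained sketch (the `sanitize` helper name is hypothetical, the expression is the one quoted above):
```python
def sanitize(filename: str) -> str:
    # Same allowlist expression quoted from starpunk/media.py
    return "".join(c for c in filename if c.isalnum() or c in "._-")[:50]

assert sanitize("../../../etc/passwd") == "......etcpasswd"
assert sanitize("..\\..\\..\\system32") == "......system32"
assert sanitize("%2e%2e%2f") == "2e2e2f"
assert sanitize("file\x00.jpg") == "file.jpg"
```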
### Default-Secure Configuration
The feature is disabled by default (`DEBUG_SAVE_FAILED_UPLOADS=false`), meaning:
- Production deployments are protected without configuration
- Debug files only appear when explicitly enabled
- No accidental data retention in production
### Resource Exhaustion Prevention
Dual limits prevent disk exhaustion:
- Age limit (7 days default): Prevents indefinite accumulation
- Size limit (100MB default): Hard cap on disk usage
- Both limits configurable for different environments
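For reference, the three knobs in the environment-variable form read by `os.getenv()`; the values here are purely illustrative, not recommendations:
```
# Illustrative overrides for a staging environment
DEBUG_SAVE_FAILED_UPLOADS=true
DEBUG_FILE_MAX_AGE_DAYS=3
DEBUG_FILE_MAX_SIZE_MB=50
```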
---
## Performance Considerations
### Startup Overhead
The `cleanup_old_debug_files()` function runs on every app startup:
```python
# Worst case: 1000 files in debug directory
# - glob: O(n) directory listing
# - sort: O(n log n)
# - deletion: O(n) unlink calls
# Estimated overhead: ~100ms for 100 files, ~500ms for 1000 files
```
**Assessment**: Acceptable. Startup is not a hot path, and the cleanup prevents greater issues.
**Recommendation**: Consider adding early exit if `DEBUG_SAVE_FAILED_UPLOADS=False` to skip cleanup entirely in production. However, this is a minor optimization and not required for approval.
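A sketch of what that optional guard could look like (this is not present in the reviewed code):
```python
def cleanup_old_debug_files(app):
    if not app.config.get("DEBUG_SAVE_FAILED_UPLOADS", False):
        return  # Feature disabled: skip the directory scan entirely
    ...  # existing two-phase cleanup
```
One trade-off worth noting: debug files left over from a session where the flag was enabled would then persist until the flag is turned on again.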
---
## Architectural Alignment
### Project Philosophy
> "Every line of code must justify its existence."
This implementation follows the philosophy:
- Minimal code (single sanitization pattern)
- Single responsibility (cleanup function does one thing)
- Secure defaults (disabled in production)
- No unnecessary features
### Standards Compliance
The implementation follows established patterns:
- Configuration uses existing `os.getenv()` pattern
- Logging uses `app.logger`
- File operations use `pathlib.Path`
- Tests use pytest fixtures
---
## Minor Observations (Not Blocking)
1. **Config validation**: The new config options are not validated in `validate_config()`. Since they have sensible defaults and are non-critical, this is acceptable. However, negative values for MAX_AGE_DAYS or MAX_SIZE_MB could cause unexpected behavior (a possible guard is sketched after this list).
2. **Logging verbosity**: When cleanup deletes many files, it logs a single summary message. This is appropriate for production but could mask issues during debugging.
3. **Race condition**: If multiple workers start simultaneously, they could attempt to delete the same file. The OSError handling covers this gracefully.
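For observation 1, a hypothetical guard; `validate_config()` is the existing hook named above, but the checks themselves are illustrative:
```python
def validate_debug_config(app):
    # Reject negative values for the two integer debug-file options
    for key in ("DEBUG_FILE_MAX_AGE_DAYS", "DEBUG_FILE_MAX_SIZE_MB"):
        if app.config[key] < 0:
            raise ValueError(f"{key} must be non-negative, got {app.config[key]}")
```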
---
## Verdict
### APPROVED
Phase 2 is complete and meets all acceptance criteria. The implementation demonstrates:
- Secure defaults
- Proper input sanitization
- Comprehensive test coverage
- Good error handling
- Alignment with project architecture
**The developer may proceed to Phase 3: N+1 Query Fix (Feed Generation).**
---
## Checklist for Project Plan Update
The following items from RELEASE.md Phase 2 are complete:
- [x] Configuration options added (DEBUG_SAVE_FAILED_UPLOADS, DEBUG_FILE_MAX_AGE_DAYS, DEBUG_FILE_MAX_SIZE_MB)
- [x] Cleanup logic implemented in starpunk/media.py
- [x] Config check before saving debug files
- [x] `cleanup_old_debug_files()` function implemented
- [x] Startup cleanup integrated in `__init__.py`
- [x] Filename sanitization pattern implemented
- [x] Tests cover all scenarios (15 tests)
---
## Appendix: Test Verification
```
$ uv run pytest tests/test_debug_file_management.py -v
================================ 15 passed in 0.72s ============================
$ uv run pytest tests/test_media*.py -v
================================ 33 passed in 6.16s ============================
```
All tests pass consistently.

starpunk/media.py

@@ -725,6 +725,133 @@ def get_note_media(note_id: int) -> List[Dict]:
    return media_list

def get_media_for_notes(note_ids: List[int]) -> Dict[int, List[Dict]]:
    """
    Batch load media for multiple notes in two queries

    Per v1.5.0 Phase 3: Fixes N+1 query pattern in feed generation.
    Loads media and variants for all notes in 2 queries instead of O(n).

    Args:
        note_ids: List of note IDs to load media for

    Returns:
        Dict mapping note_id to list of media dicts (same format as get_note_media)

    Examples:
        >>> result = get_media_for_notes([1, 2, 3])
        >>> result[1]  # Media for note 1
        [{'id': 10, 'filename': 'test.jpg', ...}]
        >>> result[2]  # Media for note 2
        []  # No media
    """
    from starpunk.database import get_db

    if not note_ids:
        return {}

    db = get_db(current_app)

    # Build placeholders for IN clause
    placeholders = ','.join('?' * len(note_ids))

    # Query 1: Get all media for all notes
    media_rows = db.execute(
        f"""
        SELECT
            nm.note_id,
            m.id,
            m.filename,
            m.stored_filename,
            m.path,
            m.mime_type,
            m.size,
            m.width,
            m.height,
            nm.caption,
            nm.display_order
        FROM note_media nm
        JOIN media m ON nm.media_id = m.id
        WHERE nm.note_id IN ({placeholders})
        ORDER BY nm.note_id, nm.display_order
        """,
        note_ids
    ).fetchall()

    # Extract all media IDs for variant query
    media_ids = [row[1] for row in media_rows]

    # Query 2: Get all variants for all media (if any media exists)
    variants_by_media = {}
    if media_ids:
        variant_placeholders = ','.join('?' * len(media_ids))
        variant_rows = db.execute(
            f"""
            SELECT media_id, variant_type, path, width, height, size_bytes
            FROM media_variants
            WHERE media_id IN ({variant_placeholders})
            ORDER BY media_id,
                CASE variant_type
                    WHEN 'thumb' THEN 1
                    WHEN 'small' THEN 2
                    WHEN 'medium' THEN 3
                    WHEN 'large' THEN 4
                    WHEN 'original' THEN 5
                END
            """,
            media_ids
        ).fetchall()

        # Group variants by media_id
        for row in variant_rows:
            media_id = row[0]
            if media_id not in variants_by_media:
                variants_by_media[media_id] = []
            variants_by_media[media_id].append({
                'variant_type': row[1],
                'path': row[2],
                'width': row[3],
                'height': row[4],
                'size_bytes': row[5]
            })

    # Build result dict grouped by note_id
    result = {note_id: [] for note_id in note_ids}
    for row in media_rows:
        note_id = row[0]
        media_id = row[1]
        media_dict = {
            'id': media_id,
            'filename': row[2],
            'stored_filename': row[3],
            'path': row[4],
            'mime_type': row[5],
            'size': row[6],
            'width': row[7],
            'height': row[8],
            'caption': row[9],
            'display_order': row[10]
        }
        # Add variants if they exist for this media
        if media_id in variants_by_media:
            media_dict['variants'] = {
                v['variant_type']: {
                    'path': v['path'],
                    'width': v['width'],
                    'height': v['height'],
                    'size_bytes': v['size_bytes']
                }
                for v in variants_by_media[media_id]
            }
        result[note_id].append(media_dict)

    return result

def delete_media(media_id: int) -> None:
    """
    Delete media file, variants, and database record

starpunk/routes/public.py

@@ -42,11 +42,13 @@ def _get_cached_notes():
     Returns cached notes if still valid, otherwise fetches fresh notes
     from database and updates cache. Includes media and tags for each note.
 
+    Per v1.5.0 Phase 3: Uses batch loading to avoid N+1 query pattern.
+
     Returns:
         List of published notes for feed generation (with media and tags attached)
     """
-    from starpunk.media import get_note_media
-    from starpunk.tags import get_note_tags
+    from starpunk.media import get_media_for_notes
+    from starpunk.tags import get_tags_for_notes
 
     # Get cache duration from config (in seconds)
     cache_seconds = current_app.config.get("FEED_CACHE_SECONDS", 300)
@@ -64,13 +66,18 @@ def _get_cached_notes():
     max_items = current_app.config.get("FEED_MAX_ITEMS", 50)
     notes = list_notes(published_only=True, limit=max_items)
 
-    # Attach media to each note (v1.2.0 Phase 3)
+    # Batch load media and tags for all notes (v1.5.0 Phase 3)
+    # Reduces query count from O(n) to O(1) for both media and tags
+    note_ids = [note.id for note in notes]
+    media_by_note = get_media_for_notes(note_ids)
+    tags_by_note = get_tags_for_notes(note_ids)
+
+    # Attach media and tags to each note
     for note in notes:
-        media = get_note_media(note.id)
+        media = media_by_note.get(note.id, [])
         object.__setattr__(note, 'media', media)
-        # Attach tags to each note (v1.3.1)
-        tags = get_note_tags(note.id)
+        tags = tags_by_note.get(note.id, [])
         object.__setattr__(note, '_cached_tags', tags)
 
     _feed_cache["notes"] = notes

starpunk/tags.py

@@ -143,6 +143,60 @@ def get_note_tags(note_id: int) -> list[dict]:
    return [dict(tag) for tag in tags]

def get_tags_for_notes(note_ids: list[int]) -> dict[int, list[dict]]:
    """
    Batch load tags for multiple notes in a single query

    Per v1.5.0 Phase 3: Fixes N+1 query pattern in feed generation.
    Loads tags for all notes in 1 query instead of O(n).

    Args:
        note_ids: List of note IDs to load tags for

    Returns:
        Dict mapping note_id to list of tag dicts (same format as get_note_tags).
        Notes without tags will have an empty list.

    Examples:
        >>> result = get_tags_for_notes([1, 2, 3])
        >>> result[1]  # Tags for note 1
        [{'name': 'python', 'display_name': 'Python'}]
        >>> result[2]  # Tags for note 2
        []  # No tags
    """
    db = get_db(current_app)

    if not note_ids:
        return {}

    # Build placeholders for IN clause
    placeholders = ','.join('?' * len(note_ids))

    # Single query to get all tags for all notes
    rows = db.execute(
        f"""
        SELECT note_tags.note_id, tags.name, tags.display_name
        FROM tags
        JOIN note_tags ON tags.id = note_tags.tag_id
        WHERE note_tags.note_id IN ({placeholders})
        ORDER BY note_tags.note_id, LOWER(tags.display_name) ASC
        """,
        note_ids
    ).fetchall()

    # Build result dict grouped by note_id
    result = {note_id: [] for note_id in note_ids}
    for row in rows:
        note_id = row[0]
        result[note_id].append({
            'name': row[1],
            'display_name': row[2]
        })

    return result

def get_tag_by_name(name: str) -> Optional[dict]:
    """
    Get tag by normalized name

tests/test_batch_loading.py (new file, 324 lines)

@@ -0,0 +1,324 @@
"""
Tests for batch loading functions (v1.5.0 Phase 3)
Tests batch loading of media and tags to verify N+1 query fix in feed generation.
"""
import pytest
from PIL import Image
import io
from starpunk.media import get_media_for_notes, save_media, attach_media_to_note
from starpunk.tags import get_tags_for_notes, add_tags_to_note
from starpunk.notes import create_note
def create_test_image(width=800, height=600, format='JPEG'):
"""
Generate test image using PIL
Args:
width: Image width in pixels
height: Image height in pixels
format: Image format (PNG, JPEG, GIF, WEBP)
Returns:
Bytes of image data
"""
img = Image.new('RGB', (width, height), color='red')
buffer = io.BytesIO()
img.save(buffer, format=format)
buffer.seek(0)
return buffer.getvalue()
class TestBatchMediaLoading:
"""Test get_media_for_notes batch loading function"""
def test_batch_load_media_empty_list(self, app):
"""Test batch loading with empty note list"""
with app.app_context():
result = get_media_for_notes([])
assert result == {}
def test_batch_load_media_no_media(self, app):
"""Test batch loading for notes without media"""
with app.app_context():
# Create notes without media
note1 = create_note(content="Test note 1", published=True)
note2 = create_note(content="Test note 2", published=True)
result = get_media_for_notes([note1.id, note2.id])
assert len(result) == 2
assert result[note1.id] == []
assert result[note2.id] == []
def test_batch_load_media_with_media(self, app):
"""Test batch loading for notes with media"""
with app.app_context():
# Create test notes
note1 = create_note(content="Test note 1", published=True)
note2 = create_note(content="Test note 2", published=True)
# Upload media for note1
image_data = create_test_image(800, 600, 'JPEG')
media1 = save_media(image_data, 'test1.jpg')
attach_media_to_note(note1.id, [media1['id']], ['Caption 1'])
# Upload media for note2
image_data2 = create_test_image(640, 480, 'PNG')
media2 = save_media(image_data2, 'test2.png')
attach_media_to_note(note2.id, [media2['id']], ['Caption 2'])
# Batch load media
result = get_media_for_notes([note1.id, note2.id])
# Verify results
assert len(result) == 2
assert len(result[note1.id]) == 1
assert len(result[note2.id]) == 1
# Verify media1
assert result[note1.id][0]['id'] == media1['id']
assert result[note1.id][0]['filename'] == 'test1.jpg'
assert result[note1.id][0]['caption'] == 'Caption 1'
assert result[note1.id][0]['mime_type'] == 'image/jpeg'
# Verify media2
assert result[note2.id][0]['id'] == media2['id']
assert result[note2.id][0]['filename'] == 'test2.png'
assert result[note2.id][0]['caption'] == 'Caption 2'
assert result[note2.id][0]['mime_type'] == 'image/png'
def test_batch_load_media_with_variants(self, app):
"""Test batch loading includes variants (v1.4.0+)"""
with app.app_context():
# Create note with media
note = create_note(content="Test note with variants", published=True)
# Upload large image (will generate variants)
image_data = create_test_image(2000, 1500, 'JPEG')
media = save_media(image_data, 'large.jpg')
attach_media_to_note(note.id, [media['id']], ['Large image'])
# Batch load media
result = get_media_for_notes([note.id])
# Verify variants are included
assert len(result[note.id]) == 1
media_dict = result[note.id][0]
assert 'variants' in media_dict
# Should have thumb, small, medium, large, original variants
assert 'thumb' in media_dict['variants']
assert 'original' in media_dict['variants']
def test_batch_load_media_multiple_per_note(self, app):
"""Test batch loading with multiple media per note"""
with app.app_context():
# Create note with multiple media
note = create_note(content="Test note with multiple media", published=True)
# Upload multiple images
media_ids = []
captions = []
for i in range(3):
image_data = create_test_image(800, 600, 'JPEG')
media = save_media(image_data, f'test{i}.jpg')
media_ids.append(media['id'])
captions.append(f'Caption {i}')
attach_media_to_note(note.id, media_ids, captions)
# Batch load media
result = get_media_for_notes([note.id])
# Verify all media loaded
assert len(result[note.id]) == 3
# Verify display order preserved
for i, media_dict in enumerate(result[note.id]):
assert media_dict['caption'] == f'Caption {i}'
assert media_dict['display_order'] == i
def test_batch_load_media_mixed_notes(self, app):
"""Test batch loading with mix of notes with/without media"""
with app.app_context():
# Create notes
note1 = create_note(content="Note with media", published=True)
note2 = create_note(content="Note without media", published=True)
note3 = create_note(content="Another note with media", published=True)
# Add media to note1 and note3
image_data = create_test_image(800, 600, 'JPEG')
media1 = save_media(image_data, 'test1.jpg')
attach_media_to_note(note1.id, [media1['id']], ['Caption 1'])
image_data3 = create_test_image(800, 600, 'JPEG')
media3 = save_media(image_data3, 'test3.jpg')
attach_media_to_note(note3.id, [media3['id']], ['Caption 3'])
# Batch load
result = get_media_for_notes([note1.id, note2.id, note3.id])
# Verify results
assert len(result) == 3
assert len(result[note1.id]) == 1
assert len(result[note2.id]) == 0 # No media
assert len(result[note3.id]) == 1
class TestBatchTagLoading:
"""Test get_tags_for_notes batch loading function"""
def test_batch_load_tags_empty_list(self, app):
"""Test batch loading with empty note list"""
with app.app_context():
result = get_tags_for_notes([])
assert result == {}
def test_batch_load_tags_no_tags(self, app):
"""Test batch loading for notes without tags"""
with app.app_context():
# Create notes without tags
note1 = create_note(content="Test note 1", published=True)
note2 = create_note(content="Test note 2", published=True)
result = get_tags_for_notes([note1.id, note2.id])
assert len(result) == 2
assert result[note1.id] == []
assert result[note2.id] == []
def test_batch_load_tags_with_tags(self, app):
"""Test batch loading for notes with tags"""
with app.app_context():
# Create test notes
note1 = create_note(content="Test note 1", published=True)
note2 = create_note(content="Test note 2", published=True)
# Add tags to notes
add_tags_to_note(note1.id, ['Python', 'Testing'])
add_tags_to_note(note2.id, ['IndieWeb', 'Web'])
# Batch load tags
result = get_tags_for_notes([note1.id, note2.id])
# Verify results
assert len(result) == 2
assert len(result[note1.id]) == 2
assert len(result[note2.id]) == 2
# Verify note1 tags (alphabetical by display_name)
assert result[note1.id][0]['display_name'] == 'Python'
assert result[note1.id][0]['name'] == 'python'
assert result[note1.id][1]['display_name'] == 'Testing'
assert result[note1.id][1]['name'] == 'testing'
# Verify note2 tags
assert result[note2.id][0]['display_name'] == 'IndieWeb'
assert result[note2.id][0]['name'] == 'indieweb'
assert result[note2.id][1]['display_name'] == 'Web'
assert result[note2.id][1]['name'] == 'web'
def test_batch_load_tags_mixed_notes(self, app):
"""Test batch loading with mix of notes with/without tags"""
with app.app_context():
# Create notes
note1 = create_note(content="Note with tags", published=True)
note2 = create_note(content="Note without tags", published=True)
note3 = create_note(content="Another note with tags", published=True)
# Add tags to note1 and note3
add_tags_to_note(note1.id, ['Tag1', 'Tag2'])
add_tags_to_note(note3.id, ['Tag3'])
# Batch load
result = get_tags_for_notes([note1.id, note2.id, note3.id])
# Verify results
assert len(result) == 3
assert len(result[note1.id]) == 2
assert len(result[note2.id]) == 0 # No tags
assert len(result[note3.id]) == 1
def test_batch_load_tags_ordering(self, app):
"""Test batch loading preserves alphabetical ordering"""
with app.app_context():
# Create note with tags in non-alphabetical order
note = create_note(content="Test note", published=True)
add_tags_to_note(note.id, ['Zebra', 'Apple', 'Banana'])
# Batch load
result = get_tags_for_notes([note.id])
# Verify alphabetical order (case-insensitive)
assert len(result[note.id]) == 3
assert result[note.id][0]['display_name'] == 'Apple'
assert result[note.id][1]['display_name'] == 'Banana'
assert result[note.id][2]['display_name'] == 'Zebra'
class TestBatchLoadingIntegration:
"""Test batch loading integration with feed generation"""
def test_feed_generation_uses_batch_loading(self, client, app):
"""Test that feed generation correctly uses batch loaded data"""
with app.app_context():
# Create multiple notes with media and tags
notes = []
for i in range(5):
note = create_note(content=f"Test note {i}", published=True)
notes.append(note)
# Add media
image_data = create_test_image(800, 600, 'JPEG')
media = save_media(image_data, f'test{i}.jpg')
attach_media_to_note(note.id, [media['id']], [f'Caption {i}'])
# Add tags
add_tags_to_note(note.id, [f'Tag{i}', 'Common'])
# Request feed (should use batch loading)
response = client.get('/feed.rss')
assert response.status_code == 200
# Verify feed contains data from batch loaded media/tags
feed_data = response.data.decode('utf-8')
assert 'Test note 0' in feed_data
assert 'Test note 4' in feed_data
# Media should be in feed
assert 'test0.jpg' in feed_data or 'media/' in feed_data
def test_batch_loading_performance_comparison(self, app):
"""Test that batch loading reduces query count"""
with app.app_context():
from starpunk.database import get_db
# Create test data
notes = []
for i in range(10):
note = create_note(content=f"Test note {i}", published=True)
notes.append(note)
# Add media
image_data = create_test_image(800, 600, 'JPEG')
media = save_media(image_data, f'test{i}.jpg')
attach_media_to_note(note.id, [media['id']], [f'Caption {i}'])
# Add tags
add_tags_to_note(note.id, [f'Tag{i}'])
note_ids = [n.id for n in notes]
# Batch load (should be 2 queries: media + variants, tags)
media_result = get_media_for_notes(note_ids)
tags_result = get_tags_for_notes(note_ids)
# Verify results complete
assert len(media_result) == 10
assert len(tags_result) == 10
# Verify all notes have data
for note_id in note_ids:
assert len(media_result[note_id]) == 1
assert len(tags_result[note_id]) == 1