Skip to content

fix(conversations): eliminate N+1 photo queries from list endpoints#5271

Open
beastoin wants to merge 13 commits intomainfrom
fix/conversations-list-n-plus-1-4904
Open

fix(conversations): eliminate N+1 photo queries from list endpoints#5271
beastoin wants to merge 13 commits intomainfrom
fix/conversations-list-n-plus-1-4904

Conversation

@beastoin
Copy link
Collaborator

@beastoin beastoin commented Mar 2, 2026

Summary

  • Adds get_conversations_lite() — a lightweight DB function that skips the @with_photos decorator (eliminates N serial Firestore subcollection queries) and @prepare_for_read decorator (eliminates deepcopy + decrypt/decompress of transcript data)
  • Switches 4 conversation list endpoints to use it: GET /v1/conversations, GET /v1/mcp/conversations, MCP SSE list_conversations, and GET /v1/dev/user/conversations (when include_transcript=False)
  • Returns transcript_segments=[] and photos=[] in list responses — full data available via detail endpoint GET /v1/conversations/{id}
  • Preserves full hydration for limit=1 calls (in-progress conversation recovery in capture_provider.dart:1388 needs transcriptSegments)

Root cause investigation (prod evidence)

Verified via gcloud logging read on Cloud Run backend service:

  • 76 requests >5s on GET /v1/conversations in last 7 days
  • Worst latencies: 87.81s, 80.87s, 60.65s, 56.24s (all limit=100 offset=0)
  • 504 gateway timeouts at Cloud Run 30s boundary (multiple occurrences)
  • All slow requests are at offset=0 — the N+1 photo queries are the dominant bottleneck, not pagination

Before (limit=100): 1 main Firestore query + 100 serial photo subcollection queries + 100x deepcopy + decrypt/decompress = 101 Firestore round-trips
After (limit=100): 1 main Firestore query only = 1 Firestore round-trip

Code-level trace confirmed:

  • helpers.py:240@with_photos iterates list, calls get_conversation_photos() per item (serial)
  • conversations.py:99_prepare_conversation_for_read() does copy.deepcopy() on every conversation
  • Mobile app list view (conversation_list_item.dart) does NOT display transcript_segments

Review cycle fixes

  • R1: capture_provider.dart:1388 calls GET /v1/conversations?limit=1&statuses=in_progress and accesses transcriptSegments directly. Fixed by using get_conversations_lite() only when limit > 1; limit <= 1 calls use the original get_conversations() with full hydration (already fast — only 1-2 Firestore round-trips).

Tests added (12 new, all passing)

  • test_conversations_lite_db.py — imports REAL get_conversations_lite() via importlib; tests field stripping, all 7 filter parameters, include_discarded=True skip
  • test_conversations_router_branching.py — patches REAL routers.conversations.get_conversations; tests limit=1 full hydration, limit=2/100 lite, empty statuses default, locked field stripping
  • test_mcp_and_developer_conversations_lite.py — patches REAL routers.mcp and routers.developer; tests MCP always uses lite, developer conditional on include_transcript, speaker enrichment only on include_transcript=True
  • All registered in test.sh

Known edge case

Discarded conversations use getTranscript() for list title display (conversation_list_item.dart:289). With transcript_segments=[], discarded cards show blank title text. This only affects users who enable "show discarded" — the app's default call uses include_discarded=false. Full transcript remains available via the detail endpoint.

Test plan

  • backend/test.sh — 20 pass (8 original + 12 new), 5 pre-existing failures (unrelated test_process_conversation_usage_context.py)
  • Codex reviewer approved (R2): limit>1 branching, in-progress recovery preserved
  • Codex tester approved (R3): all 12 tests exercise real production functions
  • Verify on dev: conversation list loads fast, detail endpoint returns full data
  • Verify mobile app: list shows titles/overviews correctly, tapping opens full conversation with transcript

Closes #4904

by AI for @beastoin

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 2, 2026

Greptile Summary

This PR introduces get_conversations_lite() to eliminate the N+1 photo query problem and expensive transcript decryption/decompression on list endpoints. The optimization reduces Firestore queries from 101 to just 1 for a limit=100 request, addressing production timeouts (87s max, multiple 504s).

Key changes:

  • New get_conversations_lite() function skips @with_photos and @prepare_for_read decorators
  • Returns empty arrays for transcript_segments and photos in list responses
  • Switches 4 list endpoints to use the lite function
  • Detail endpoints (GET /v1/conversations/{id}) still return full data
  • Developer endpoint conditionally uses lite version when include_transcript=False

Implementation quality:

  • Query building logic correctly mirrors the original function with all filters (categories, folder_id, starred, date ranges)
  • Properly removes transcript_segments_compressed flag to avoid confusion
  • All endpoint parameter passing is correct
  • No security issues - encryption properly skipped since sensitive data isn't returned

Known tradeoff:
Discarded conversations may show blank titles in mobile list view when include_discarded=true (only affects users who enable "show discarded"). This is an acceptable tradeoff for the massive performance improvement, and full data remains available via detail endpoints.

Confidence Score: 5/5

  • This PR is safe to merge - it's a well-designed performance optimization with no breaking changes
  • Score reflects clean implementation of a critical performance fix that addresses production issues (87s latencies, 504 timeouts). The code correctly handles all query parameters, maintains API compatibility, preserves full data access via detail endpoints, and passes tests. The only tradeoff (discarded conversation titles) is documented and acceptable for the 100x query reduction.
  • No files require special attention - all changes follow established patterns and are correctly implemented

Important Files Changed

Filename Overview
backend/database/conversations.py Added get_conversations_lite() - skips decorators to eliminate N+1 queries, returns empty transcript_segments and photos arrays
backend/routers/conversations.py Switched main list endpoint to use get_conversations_lite() for better performance
backend/routers/developer.py Conditionally uses get_conversations_lite() when include_transcript=False, optimizes the default list view
backend/routers/mcp.py Switched MCP list endpoint to use get_conversations_lite(), detail endpoint still provides full data
backend/routers/mcp_sse.py Switched SSE list_conversations tool to use get_conversations_lite() for faster responses

Sequence Diagram

sequenceDiagram
    participant Client
    participant API as List Endpoint
    participant DB as conversations.py
    participant FS as Firestore

    Note over Client,FS: Before (N+1 queries for limit=3)
    Client->>API: GET /v1/conversations?limit=3
    API->>DB: get_conversations()
    DB->>FS: Query conversations collection
    FS-->>DB: 3 conversations (encrypted)
    Note over DB: @prepare_for_read decorator<br/>deepcopy + decrypt + decompress<br/>for each conversation
    Note over DB: @with_photos decorator<br/>fetches photos serially
    DB->>FS: Query photos subcollection (conv 1)
    FS-->>DB: photos for conv 1
    DB->>FS: Query photos subcollection (conv 2)
    FS-->>DB: photos for conv 2
    DB->>FS: Query photos subcollection (conv 3)
    FS-->>DB: photos for conv 3
    DB-->>API: 3 conversations with full data
    API-->>Client: Full conversations (4 Firestore queries)

    Note over Client,FS: After (single query)
    Client->>API: GET /v1/conversations?limit=3
    API->>DB: get_conversations_lite()
    DB->>FS: Query conversations collection
    FS-->>DB: 3 conversations (raw)
    Note over DB: No decorators<br/>Set transcript_segments=[]<br/>Set photos=[]
    DB-->>API: 3 conversations (list data only)
    API-->>Client: List view data (1 Firestore query)
Loading

Last reviewed commit: 1cdf95e

@beastoin
Copy link
Collaborator Author

beastoin commented Mar 2, 2026

PR ready for review

All checkpoints complete (CP0–CP8):

  • Codex reviewer: approved R2 (limit>1 branching preserves in-progress recovery)
  • Codex tester: approved R3 (12 tests exercise real production functions via importlib + unittest.mock.patch)
  • Tests: 20 pass, 5 pre-existing failures (unrelated)

Impact

  • Before: GET /v1/conversations?limit=100 → 101 Firestore round-trips (87s worst case, 504 timeouts)
  • After: Same request → 1 Firestore round-trip
  • limit=1 calls (in-progress recovery) still get full hydration

Awaiting dev deploy verification and external review.

by AI for @beastoin

@beastoin beastoin force-pushed the fix/conversations-list-n-plus-1-4904 branch from e696018 to 947924e Compare March 9, 2026 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GET /v1/conversations extreme latency (30-90s) from N+1 queries and full hydration

1 participant