fix(conversations): eliminate N+1 photo queries from list endpoints by beastoin · Pull Request #5271 · BasedHardware/omi

beastoin · 2026-03-02T01:58:58Z

Summary

Adds get_conversations_lite() — a lightweight DB function that skips the @with_photos decorator (eliminates N serial Firestore subcollection queries) and @prepare_for_read decorator (eliminates deepcopy + decrypt/decompress of transcript data)
Switches 4 conversation list endpoints to use it: GET /v1/conversations, GET /v1/mcp/conversations, MCP SSE list_conversations, and GET /v1/dev/user/conversations (when include_transcript=False)
Returns transcript_segments=[] and photos=[] in list responses — full data available via detail endpoint GET /v1/conversations/{id}
Preserves full hydration for limit=1 calls (in-progress conversation recovery in capture_provider.dart:1388 needs transcriptSegments)

Root cause investigation (prod evidence)

Verified via gcloud logging read on Cloud Run backend service:

76 requests >5s on GET /v1/conversations in last 7 days
Worst latencies: 87.81s, 80.87s, 60.65s, 56.24s (all limit=100 offset=0)
504 gateway timeouts at Cloud Run 30s boundary (multiple occurrences)
All slow requests are at offset=0 — the N+1 photo queries are the dominant bottleneck, not pagination

Before (limit=100): 1 main Firestore query + 100 serial photo subcollection queries + 100x deepcopy + decrypt/decompress = 101 Firestore round-trips
After (limit=100): 1 main Firestore query only = 1 Firestore round-trip

Code-level trace confirmed:

helpers.py:240 — @with_photos iterates list, calls get_conversation_photos() per item (serial)
conversations.py:99 — _prepare_conversation_for_read() does copy.deepcopy() on every conversation
Mobile app list view (conversation_list_item.dart) does NOT display transcript_segments

Review cycle fixes

R1: capture_provider.dart:1388 calls GET /v1/conversations?limit=1&statuses=in_progress and accesses transcriptSegments directly. Fixed by using get_conversations_lite() only when limit > 1; limit <= 1 calls use the original get_conversations() with full hydration (already fast — only 1-2 Firestore round-trips).

Tests added (12 new, all passing)

test_conversations_lite_db.py — imports REAL get_conversations_lite() via importlib; tests field stripping, all 7 filter parameters, include_discarded=True skip
test_conversations_router_branching.py — patches REAL routers.conversations.get_conversations; tests limit=1 full hydration, limit=2/100 lite, empty statuses default, locked field stripping
test_mcp_and_developer_conversations_lite.py — patches REAL routers.mcp and routers.developer; tests MCP always uses lite, developer conditional on include_transcript, speaker enrichment only on include_transcript=True
All registered in test.sh

Known edge case

Discarded conversations use getTranscript() for list title display (conversation_list_item.dart:289). With transcript_segments=[], discarded cards show blank title text. This only affects users who enable "show discarded" — the app's default call uses include_discarded=false. Full transcript remains available via the detail endpoint.

Test plan

backend/test.sh — 20 pass (8 original + 12 new), 5 pre-existing failures (unrelated test_process_conversation_usage_context.py)
Codex reviewer approved (R2): limit>1 branching, in-progress recovery preserved
Codex tester approved (R3): all 12 tests exercise real production functions
Verify on dev: conversation list loads fast, detail endpoint returns full data
Verify mobile app: list shows titles/overviews correctly, tapping opens full conversation with transcript

Closes #4904

by AI for @beastoin

greptile-apps · 2026-03-02T02:05:01Z

Greptile Summary

This PR introduces get_conversations_lite() to eliminate the N+1 photo query problem and expensive transcript decryption/decompression on list endpoints. The optimization reduces Firestore queries from 101 to just 1 for a limit=100 request, addressing production timeouts (87s max, multiple 504s).

Key changes:

New get_conversations_lite() function skips @with_photos and @prepare_for_read decorators
Returns empty arrays for transcript_segments and photos in list responses
Switches 4 list endpoints to use the lite function
Detail endpoints (GET /v1/conversations/{id}) still return full data
Developer endpoint conditionally uses lite version when include_transcript=False

Implementation quality:

Query building logic correctly mirrors the original function with all filters (categories, folder_id, starred, date ranges)
Properly removes transcript_segments_compressed flag to avoid confusion
All endpoint parameter passing is correct
No security issues - encryption properly skipped since sensitive data isn't returned

Known tradeoff:
Discarded conversations may show blank titles in mobile list view when include_discarded=true (only affects users who enable "show discarded"). This is an acceptable tradeoff for the massive performance improvement, and full data remains available via detail endpoints.

Confidence Score: 5/5

This PR is safe to merge - it's a well-designed performance optimization with no breaking changes
Score reflects clean implementation of a critical performance fix that addresses production issues (87s latencies, 504 timeouts). The code correctly handles all query parameters, maintains API compatibility, preserves full data access via detail endpoints, and passes tests. The only tradeoff (discarded conversation titles) is documented and acceptable for the 100x query reduction.
No files require special attention - all changes follow established patterns and are correctly implemented

Important Files Changed

Filename	Overview
backend/database/conversations.py	Added `get_conversations_lite()` - skips decorators to eliminate N+1 queries, returns empty transcript_segments and photos arrays
backend/routers/conversations.py	Switched main list endpoint to use `get_conversations_lite()` for better performance
backend/routers/developer.py	Conditionally uses `get_conversations_lite()` when `include_transcript=False`, optimizes the default list view
backend/routers/mcp.py	Switched MCP list endpoint to use `get_conversations_lite()`, detail endpoint still provides full data
backend/routers/mcp_sse.py	Switched SSE list_conversations tool to use `get_conversations_lite()` for faster responses

Sequence Diagram

sequenceDiagram
    participant Client
    participant API as List Endpoint
    participant DB as conversations.py
    participant FS as Firestore

    Note over Client,FS: Before (N+1 queries for limit=3)
    Client->>API: GET /v1/conversations?limit=3
    API->>DB: get_conversations()
    DB->>FS: Query conversations collection
    FS-->>DB: 3 conversations (encrypted)
    Note over DB: @prepare_for_read decorator<br/>deepcopy + decrypt + decompress<br/>for each conversation
    Note over DB: @with_photos decorator<br/>fetches photos serially
    DB->>FS: Query photos subcollection (conv 1)
    FS-->>DB: photos for conv 1
    DB->>FS: Query photos subcollection (conv 2)
    FS-->>DB: photos for conv 2
    DB->>FS: Query photos subcollection (conv 3)
    FS-->>DB: photos for conv 3
    DB-->>API: 3 conversations with full data
    API-->>Client: Full conversations (4 Firestore queries)

    Note over Client,FS: After (single query)
    Client->>API: GET /v1/conversations?limit=3
    API->>DB: get_conversations_lite()
    DB->>FS: Query conversations collection
    FS-->>DB: 3 conversations (raw)
    Note over DB: No decorators<br/>Set transcript_segments=[]<br/>Set photos=[]
    DB-->>API: 3 conversations (list data only)
    API-->>Client: List view data (1 Firestore query)

_{Last reviewed commit: 1cdf95e}

beastoin · 2026-03-02T02:29:15Z

PR ready for review

All checkpoints complete (CP0–CP8):

Codex reviewer: approved R2 (limit>1 branching preserves in-progress recovery)
Codex tester: approved R3 (12 tests exercise real production functions via importlib + unittest.mock.patch)
Tests: 20 pass, 5 pre-existing failures (unrelated)

Impact

Before: GET /v1/conversations?limit=100 → 101 Firestore round-trips (87s worst case, 504 timeouts)
After: Same request → 1 Firestore round-trip
limit=1 calls (in-progress recovery) still get full hydration

Awaiting dev deploy verification and external review.

by AI for @beastoin

… decryption

…hen transcript not requested

…overy capture_provider.dart calls GET /v1/conversations?limit=1&statuses=in_progress and needs transcript_segments for the capture UI. Only apply the lite optimization (skip N+1 photo queries and transcript decryption) for multi-item list fetches (limit > 1).

… filters

…ions router

…rtlib

…function

…functions

beastoin force-pushed the fix/conversations-list-n-plus-1-4904 branch from 5d304de to e696018 Compare March 7, 2026 03:48

beastoin mentioned this pull request Mar 7, 2026

Bug: /v1/conversations returns 500 — response exceeds Cloud Run 32MB size limit #5424

Closed

beastoin added 13 commits March 9, 2026 10:08

Add get_conversations_lite() to skip N+1 photo queries and transcript…

ba1ce9e

… decryption

Switch GET /v1/conversations to use get_conversations_lite()

9547e56

Switch MCP conversations list to use get_conversations_lite()

1487524

Switch MCP SSE conversations list to use get_conversations_lite()

a7afce9

Switch developer conversations list to use get_conversations_lite() w…

6676724

…hen transcript not requested

test: add unit tests for get_conversations_lite() field stripping and…

2a22174

… filters

test: add unit tests for limit-based lite/full branching in conversat…

7a5fce3

…ions router

test: add unit tests for MCP and developer lite/full routing

b554789

test: register new conversation lite tests in test.sh

b21923f

test: rewrite DB tests to import REAL get_conversations_lite via impo…

7188b3c

…rtlib

test: rewrite router tests to use unittest.mock.patch on REAL router …

4d91fb4

…function

test: rewrite MCP/developer tests to use unittest.mock.patch on REAL …

947924e

…functions

beastoin force-pushed the fix/conversations-list-n-plus-1-4904 branch from e696018 to 947924e Compare March 9, 2026 09:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(conversations): eliminate N+1 photo queries from list endpoints#5271

fix(conversations): eliminate N+1 photo queries from list endpoints#5271
beastoin wants to merge 13 commits intomainfrom
fix/conversations-list-n-plus-1-4904

beastoin commented Mar 2, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 2, 2026

Uh oh!

beastoin commented Mar 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

beastoin commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause investigation (prod evidence)

Review cycle fixes

Tests added (12 new, all passing)

Known edge case

Test plan

Uh oh!

greptile-apps bot commented Mar 2, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

beastoin commented Mar 2, 2026

PR ready for review

Impact

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

beastoin commented Mar 2, 2026 •

edited

Loading