
perf: batch statistics queries and cache meeting sub-queries#192

Merged
kouloumos merged 2 commits into main from christos/perf
Feb 25, 2026

Conversation

@christosporios
Collaborator

@christosporios christosporios commented Feb 7, 2026

Summary

  • Batch statistics: Replace N×2 per-subject DB queries with 2-3 batched queries via getBatchStatisticsForSubjects()
  • Cache sub-queries: Wrap city, people, parties, subjects, taskStatus, and statistics in unstable_cache with proper tag-based invalidation
  • Split highlights: Fetch user-specific highlights separately (per-request) from cacheable core data
  • Cache hit/miss logging: createCache now logs HIT/MISS per sub-query for visibility
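
As a rough illustration of the batching idea (not the actual getBatchStatisticsForSubjects() implementation), the sketch below fetches rows for all subjects in one call and groups them in memory, instead of issuing one call per subject. The row shape, the in-memory table, and the helper names are assumptions made for this example only.

```typescript
// Hypothetical row shape; the real schema is not shown in this PR.
interface UtteranceRow {
  subjectId: string;
  speakerId: string;
  seconds: number;
}

// Stand-in for a database table, plus a counter for round trips.
const table: UtteranceRow[] = [
  { subjectId: "s1", speakerId: "p1", seconds: 30 },
  { subjectId: "s1", speakerId: "p2", seconds: 12 },
  { subjectId: "s2", speakerId: "p1", seconds: 45 },
];
let queryCount = 0;

// Simulates a single "WHERE subjectId IN (...)" query.
function findUtterances(subjectIds: string[]): UtteranceRow[] {
  queryCount += 1;
  return table.filter((row) => subjectIds.indexOf(row.subjectId) !== -1);
}

// One query for every subject, then a single grouping pass in memory.
function batchSpeakingSeconds(subjectIds: string[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const id of subjectIds) {
    totals.set(id, 0); // subjects without utterances still get an entry
  }
  for (const row of findUtterances(subjectIds)) {
    totals.set(row.subjectId, (totals.get(row.subjectId) ?? 0) + row.seconds);
  }
  return totals;
}

const totals = batchSpeakingSeconds(["s1", "s2", "s3"]);
```

The number of round trips stays constant as the subject list grows, which is where the saving over the per-subject approach comes from.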

Manual testing plan

  • Load meeting page — verify data renders correctly, check logs for MISS on first load
  • Reload — verify HIT logs for cached queries, faster load
  • Navigate to subject page — should not trigger full re-fetch
  • Test with admin on unreleased meeting — verify it still loads
  • Trigger task completion — verify cache invalidation and fresh data on next load

🤖 Generated with Claude Code


Note

Medium Risk
Touches core meeting data assembly and statistics calculation plus caching behavior; mistakes could cause stale/incorrect meeting page data or performance regressions despite being largely read-path changes.

Overview
Meeting data fetching is refactored to improve performance by splitting getMeetingData into cacheable getMeetingDataCore and per-request composition of user-specific highlights in getMeetingDataCached.

Core meeting sub-queries (city, people, parties, subjects, task status, and subject statistics) are wrapped in unstable_cache with tag-based invalidation, and createCache now logs HIT/MISS/ERR timing per key for visibility. Subject statistics are optimized via a new getBatchStatisticsForSubjects() path that replaces per-subject DB querying with batched queries (with legacy fallback).

Written by Cursor Bugbot for commit 2e0919c. This will update automatically on new commits.

Greptile Summary

Refactored meeting data fetching to improve performance by splitting getMeetingData into cacheable getMeetingDataCore and per-request composition with user-specific highlights. Core sub-queries (city, people, parties, subjects, task status, statistics) are wrapped in unstable_cache with tag-based invalidation. Statistics optimized via getBatchStatisticsForSubjects() replacing N×2 per-subject queries with 2-3 batched queries, supporting both new utterance-based and legacy systems.

Key changes:

  • getMeetingDataCore returns MeetingDataCore (without highlights) with individually cached sub-queries
  • getMeetingDataCached composes core data with fresh per-user highlights
  • getBatchStatisticsForSubjects batches all utterances and speaker segments into 2-3 queries
  • createCache now logs HIT/MISS/ERR with timing per key
  • API route /api/cities/[cityId]/meetings/[meetingId] now returns MeetingDataCore (highlights were never used by consumers)
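
The split between cacheable core data and per-user highlights could look roughly like the sketch below. Only the type names MeetingDataCore and MeetingData come from this PR; the fields, the loader functions, and composeMeetingData are hypothetical stand-ins.

```typescript
// Field names are illustrative; only the type names come from the PR.
interface MeetingDataCore {
  meetingId: string;
  subjects: string[];
}
interface Highlight {
  id: string;
  userId: string;
}
type MeetingData = MeetingDataCore & { highlights: Highlight[] };

// Stand-in for the cacheable core query (same result for every user).
async function loadCore(meetingId: string): Promise<MeetingDataCore> {
  return { meetingId, subjects: ["budget", "zoning"] };
}

// Stand-in for the per-user query, which has to run on every request.
async function loadHighlights(meetingId: string, userId: string): Promise<Highlight[]> {
  return [{ id: `${meetingId}-h1`, userId }];
}

// Fetch both in parallel, then merge them into the full shape.
async function composeMeetingData(meetingId: string, userId: string): Promise<MeetingData> {
  const [core, highlights] = await Promise.all([
    loadCore(meetingId),
    loadHighlights(meetingId, userId),
  ]);
  return { ...core, highlights };
}
```

Keeping highlights out of the core object means the cached value never contains user-specific data, so it can be shared across requests.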

Potential concerns:

  • Statistics cache uses Object.fromEntries(map) - Map doesn't JSON-serialize, so conversion is necessary but adds overhead
  • Meeting and transcript queries remain uncached (auth dependencies and 2MB size limit)
  • Cache invalidation uses broad city:${cityId}:meetings tag which invalidates all meeting sub-queries together
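
The Map serialization concern can be reproduced in isolation. The sketch below uses a made-up statistics shape (speakingSeconds is an assumption) to show why a plain-object conversion is needed before a value goes through a JSON-based cache, and how a Map can be rebuilt afterwards.

```typescript
// Illustrative statistics shape; the real one is not shown in this PR.
interface SubjectStats {
  speakingSeconds: number;
}

const stats = new Map<string, SubjectStats>();
stats.set("subject-1", { speakingSeconds: 42 });

// A Map has no enumerable own properties, so JSON.stringify drops its entries.
const naive = JSON.stringify(stats);

// Converting to a plain object first preserves the data.
const plain: Record<string, SubjectStats> = Object.fromEntries(stats);
const serialized = JSON.stringify(plain);

// After reading from the cache, the Map can be rebuilt if callers need one.
const parsed = JSON.parse(serialized) as Record<string, SubjectStats>;
const restored = new Map<string, SubjectStats>(Object.entries(parsed));
```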

Confidence Score: 4/5

  • Safe to merge with thorough testing of cache behavior and statistics accuracy
  • Well-structured refactoring with clear separation of cacheable and per-request data. The batching logic is sound and maintains backward compatibility with legacy system. Main risks are around cache invalidation timing and potential performance regressions if caching doesn't work as expected, but the changes are largely on the read path with proper fallbacks
  • Focus testing on src/lib/getMeetingData.ts and src/lib/statistics.ts to verify cache hit rates and statistics accuracy across both new and old subject systems

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/lib/cache/index.ts | Added HIT/MISS logging with timing for cache visibility |
| src/lib/getMeetingData.ts | Split into MeetingDataCore and cacheable sub-queries with highlights fetched separately; statistics now batched |
| src/lib/statistics.ts | Added getBatchStatisticsForSubjects() to replace N×2 queries with 2-3 batched queries with dual-system support |
| src/app/api/cities/[cityId]/meetings/[meetingId]/route.ts | Changed to use getMeetingDataCore instead of getMeetingData (no highlights) |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Request getMeetingDataCached] --> B[React cache dedup check]
    B --> C[getMeetingDataCore + getHighlights in parallel]
    
    C --> D[Meeting + Transcript uncached]
    C --> E[Cached: city with geometry]
    C --> F[Cached: people]
    C --> G[Cached: parties]
    C --> H[Cached: subjects]
    C --> I[Cached: taskStatus]
    
    H --> K[Cached: subjectStatistics]
    K --> L[getBatchStatisticsForSubjects]
    
    L --> M[Query 1: All utterances batch]
    L --> N[Query 2: Old system fallback batch]
    L --> O[Query 3: Speaker segments batch]
    
    O --> P[Compute stats per subject]
    P --> Q[Map to Object for cache]
    
    Q --> R[Merge with subjects]
    D --> R
    E --> R
    F --> R
    G --> R
    I --> R
    
    R --> S[MeetingDataCore ready]
    C --> T[Fresh highlights per user]
    S --> U[Combine into MeetingData]
    T --> U
    
    style L fill:#90EE90
    style M fill:#90EE90
    style N fill:#90EE90
    style O fill:#90EE90
    style E fill:#87CEEB
    style F fill:#87CEEB
    style G fill:#87CEEB
    style H fill:#87CEEB
    style I fill:#87CEEB
    style K fill:#87CEEB
    style D fill:#FFB6C1
    style T fill:#FFB6C1

Last reviewed commit: 715690e

@greptile-apps

greptile-apps bot commented Feb 7, 2026

Greptile Overview

Greptile Summary

This PR implements significant performance improvements through query batching and granular caching of meeting data sub-queries.

Key Changes:

  • Batch statistics: Replaced N×2 per-subject database queries with 2-3 batched queries via new getBatchStatisticsForSubjects() function
  • Sub-query caching: Wrapped city, people, parties, subjects, taskStatus, and statistics in unstable_cache with tag-based invalidation
  • Separated concerns: Split getMeetingData() into getMeetingDataCore() (cacheable) and kept user-specific highlights fetched per-request
  • Cache observability: Added HIT/MISS logging to createCache() helper

Architecture:

  • Meeting and transcript queries remain uncached (auth checks + 2MB size limit)
  • Each sub-query is individually cached with appropriate cache tags
  • Statistics batching handles both new system (utterance-based) and legacy system (SubjectSpeakerSegment) fallback
  • Cache invalidation via revalidateTag() on task completion ensures fresh data
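
To make the tag-based invalidation behaviour concrete, here is a small in-memory simulation. It is not Next.js's unstable_cache or revalidateTag; TaggedCache is a hypothetical class that only mimics the idea that every entry carrying a tag is dropped when that tag is revalidated.

```typescript
// Hypothetical in-memory cache that mimics tag-based invalidation.
class TaggedCache {
  private entries = new Map<string, { value: unknown; tags: string[] }>();

  // Return the cached value, or load and store it under the given tags.
  get<T>(key: string, tags: string[], load: () => T): T {
    const existing = this.entries.get(key);
    if (existing !== undefined) {
      return existing.value as T;
    }
    const value = load();
    this.entries.set(key, { value, tags });
    return value;
  }

  // Drop every entry that carries the tag.
  revalidateTag(tag: string): void {
    this.entries.forEach((entry, key) => {
      if (entry.tags.indexOf(tag) !== -1) {
        this.entries.delete(key);
      }
    });
  }
}

const cache = new TaggedCache();
const meetingsTag = "city:athens:meetings"; // follows the city:${cityId}:meetings pattern
let loads = 0;
const loadSubjects = (): string[] => {
  loads += 1;
  return ["budget", "zoning"];
};

cache.get("subjects:m1", [meetingsTag], loadSubjects); // first call loads
cache.get("subjects:m1", [meetingsTag], loadSubjects); // second call is served from cache
const loadsBeforeInvalidation = loads;

cache.revalidateTag(meetingsTag); // e.g. after a task completes
cache.get("subjects:m1", [meetingsTag], loadSubjects); // loads again
const loadsAfterInvalidation = loads;
```

Because all sub-queries would share one broad tag in this sketch, a single revalidation clears them together, which matches the trade-off noted in the review.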

Issues Found:

  • Race condition in src/lib/cache/index.ts with shared wasMiss flag causing incorrect HIT/MISS logs under concurrent load
  • API endpoint now returns data without highlights field, potentially breaking existing API consumers
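
One way to avoid the shared-flag problem is to keep the miss indicator local to each invocation. The sketch below is an assumption-laden illustration, not the project's createCache: createLoggedCache is a hypothetical in-memory helper that stands in for unstable_cache purely to show the logging pattern.

```typescript
type CacheStatus = "HIT" | "MISS";

// Hypothetical in-memory cache used only to illustrate per-call HIT/MISS tracking.
function createLoggedCache<T>(load: (key: string) => Promise<T>) {
  const store = new Map<string, Promise<T>>();

  return async (key: string): Promise<{ value: T; status: CacheStatus }> => {
    // Declared inside the call, so overlapping calls cannot overwrite it.
    let wasMiss = false;
    const started = Date.now();

    let pending = store.get(key);
    if (pending === undefined) {
      wasMiss = true;
      pending = load(key);
      store.set(key, pending);
    }

    const value = await pending;
    const status: CacheStatus = wasMiss ? "MISS" : "HIT";
    console.log(`[cache] ${status} ${key} (${Date.now() - started}ms)`);
    return { value, status };
  };
}

// Example loader; the city shape is made up for this sketch.
const getCity = createLoggedCache(async (id: string) => ({ id, name: "Example City" }));
```

If wasMiss lived outside the returned function, a second call starting while the first is still awaiting could reset it, and the first call would then log HIT for work it actually performed.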

Confidence Score: 3/5

  • This PR has solid performance improvements but contains a race condition in cache logging and a breaking API change
  • Score reflects excellent query optimization work (batching statistics, granular caching) but critical issues prevent higher confidence: (1) race condition in createCache() will cause incorrect HIT/MISS logging under concurrent requests, (2) API endpoint removing highlights field could break external clients, (3) complex fallback logic in statistics batching increases maintenance burden
  • src/lib/cache/index.ts requires fixing the wasMiss race condition, and src/app/api/cities/[cityId]/meetings/[meetingId]/route.ts needs verification that removing highlights won't break API consumers

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/lib/cache/index.ts | Added HIT/MISS logging but introduced race condition with shared wasMiss flag across concurrent requests |
| src/lib/statistics.ts | Added getBatchStatisticsForSubjects() to replace N×2 queries with 2-3 batched queries, properly handles both new and legacy systems |
| src/lib/getMeetingData.ts | Refactored to getMeetingDataCore() with individual sub-query caching, separated user-specific highlights from cacheable data |
| src/lib/cache/queries.ts | Updated to compose core data and highlights separately, maintains React cache() wrapper for request-level deduplication |
| src/app/api/cities/[cityId]/meetings/[meetingId]/route.ts | Changed GET endpoint to return getMeetingDataCore() without highlights, potentially breaking API contract for external clients |

Sequence Diagram

sequenceDiagram
    participant Client
    participant API as API Route
    participant Cache as getMeetingDataCached
    participant Core as getMeetingDataCore
    participant DB as Database
    participant UC as unstable_cache

    Client->>API: GET /api/cities/{cityId}/meetings/{meetingId}
    API->>Core: getMeetingDataCore(cityId, meetingId)
    
    Note over Core: Fetch meeting & transcript<br/>(NOT cached - auth + size)
    Core->>DB: getCouncilMeeting()
    DB-->>Core: meeting
    Core->>DB: getTranscript()
    DB-->>Core: transcript
    
    Note over Core: Cached sub-queries via createCache()
    Core->>UC: city cache (unstable_cache)
    alt Cache HIT
        UC-->>Core: cached city
    else Cache MISS
        UC->>DB: getCity()
        DB-->>UC: city
        UC-->>Core: city (cached)
    end
    
    Core->>UC: people cache
    UC-->>Core: people (HIT or MISS)
    
    Core->>UC: parties cache
    UC-->>Core: parties (HIT or MISS)
    
    Core->>UC: subjects cache
    UC-->>Core: subjects (HIT or MISS)
    
    Core->>UC: taskStatus cache
    UC-->>Core: taskStatus (HIT or MISS)
    
    Note over Core: Batch statistics query
    Core->>UC: statistics cache (all subjects)
    alt Cache HIT
        UC-->>Core: cached statistics
    else Cache MISS
        UC->>DB: getBatchStatisticsForSubjects()<br/>(2-3 queries instead of N×2)
        DB-->>UC: statistics map
        UC-->>Core: statistics (cached)
    end
    
    Core-->>API: MeetingDataCore
    API-->>Client: JSON (without highlights)
    
    Note over Cache,DB: Alternative: getMeetingDataCached path
    Client->>Cache: getMeetingDataCached()
    Cache->>Core: getMeetingDataCore()
    Core-->>Cache: core data
    Cache->>DB: getHighlightsForMeeting()<br/>(user-specific, NOT cached)
    DB-->>Cache: highlights
    Cache-->>Client: MeetingData (with highlights)

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 4 comments

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Member

@kouloumos kouloumos left a comment

Solid perf improvement. The batch statistics approach is the right call -- N×2 queries down to 2-3 is a big win for meetings with many subjects. I've left a few comments inline and added tests for getBatchStatisticsForSubjects covering the main paths (new system, old system fallback, empty subjects): 15b3f68 -- feel free to cherry-pick.

@kouloumos
Member

Hey! Main has moved forward quite a bit and your branch was no longer in sync. To get things back into a clean and reviewable state, I cherry-picked your commits on top of the current main and force-pushed the result to this PR branch.

To be safe, I've also kept a backup of your original branch (before the cherry-pick) here: pr-192-backup-sync

I'll finish reviewing this shortly!

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 3 comments

@greptile-apps greptile-apps bot left a comment

7 files reviewed, 1 comment

@kouloumos
Member

I force-pushed to address review comments (1, 2, 3, 4, 5):

  • Pass meetingDate through getBatchStatisticsForSubjects → getStatisticsForTranscript for correct party affiliation on both new and old system paths
  • Replace as never[] casts with as (TopicLabel & { topic: Topic })[] for clarity
  • Add timing to cache HIT logs (was only on MISS)
  • Update fetchCompleteMeetingData return type to MeetingDataCore in BulkExportActions and ExpandableMeetingRow

christosporios and others added 2 commits February 25, 2026 15:02
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consolidates all meeting data logic in one file, eliminating the
circular dependency between queries.ts and getMeetingData.ts.

Reuses getPeopleForCityCached and getPartiesForCityCached from
queries.ts instead of duplicating inline createCache calls.
@kouloumos
Member

Force-pushed with an additional refactor commit that consolidates meeting data logic:

  • Moved getMeetingDataCached and getSubjectFromMeetingCached from cache/queries.ts into getMeetingData.ts — all meeting data assembly now lives in one file
  • Replaced inline createCache calls for people/parties with existing getPeopleForCityCached/getPartiesForCityCached from queries.ts, eliminating duplication
  • Updated consumer imports (layout.tsx, subjects/page.tsx) to import from @/lib/getMeetingData directly

All review comments have been addressed. This is ready to merge.

@github-actions

github-actions bot commented Feb 25, 2026

🚀 Preview deployment ready!

Preview URL: https://pr-192.preview.opencouncil.gr
Commit: 715690e
Database: Shared staging

The preview will be automatically updated when you push new commits.
It will be destroyed when this PR is closed or merged.


This preview uses the staging database - any changes will affect other previews.

@kouloumos kouloumos merged commit 133e266 into main Feb 25, 2026
3 checks passed
@github-actions

🧹 Preview deployment cleaned up

The preview instance for this PR has been destroyed.

Removed:

  • Preview service (port 3192)
  • Caddy reverse proxy configuration
  • All preview-specific resources

Automatic cleanup on PR close
