
Feat/final synthesis compaction#1273

Open
Aayushjshah wants to merge 2 commits into main from feat/finalSynthesisCompaction

Conversation


@Aayushjshah commented Mar 9, 2026

Description

Testing

Additional Notes

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced AI model token limit configuration with model-specific optimizations
    • Intelligent final answer synthesis with adaptive single or sectional modes based on complexity
    • Improved image selection and inclusion in final answers
    • Advanced token budget management for optimized answer generation and streaming
  • Chores

    • Code refactoring to improve maintainability and modularity


coderabbitai bot commented Mar 9, 2026

📝 Walkthrough


This PR introduces a centralized Final Answer Synthesis module that orchestrates end-to-end answer generation with token budgeting, section planning, fragment mapping, and streaming output. The message-agents flow is refactored to delegate synthesis to this new module. Model-specific input token limits are added to the configuration system.

Changes

  • Model Configuration (server/ai/modelConfig.ts, server/shared/types.ts):
    Added a DEFAULT_MAX_INPUT_TOKENS constant, a MODEL_MAX_INPUT_TOKEN_OVERRIDES mapping, a getModelMaxInputTokens() function, and an optional maxInputTokens property on the ModelConfiguration interface for per-model token constraints.
  • Final Answer Synthesis (server/api/chat/final-answer-synthesis.ts):
    New comprehensive module implementing end-to-end synthesis orchestration with token budgeting, mode selection (single/sectional), planner and mapper pipelines, fragment-to-section mapping, image handling, streaming output, and error recovery with fallbacks.
  • Message Agents Integration (server/api/chat/message-agents.ts):
    Refactored to delegate final synthesis to the new module; removed in-file synthesis logic, updated imports to use executeFinalSynthesis and related helpers, simplified the streaming path, and updated result handling to use synthesisResult fields.
  • Testing & Infrastructure (server/tests/finalAnswerSynthesis.test.ts, server/logger/index.ts):
    Added a 263-line test suite covering fragment preview generation, payload construction, mode switching, and budget-aware fragment selection; removed an unused pino levels import.
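The per-model limit lookup described in the Model Configuration entry above can be sketched as follows. This is a hypothetical reconstruction: the names mirror the PR (DEFAULT_MAX_INPUT_TOKENS, MODEL_MAX_INPUT_TOKEN_OVERRIDES, getModelMaxInputTokens), but the model IDs and numeric limits here are illustrative placeholders, not the repository's actual values.

```typescript
// Sketch of a per-model input-token limit lookup. The entries in the
// override map below are invented for illustration only; the real mapping
// lives in server/ai/modelConfig.ts.

const DEFAULT_MAX_INPUT_TOKENS = 100_000;

const MODEL_MAX_INPUT_TOKEN_OVERRIDES: Record<string, number> = {
  "claude-3-5-sonnet": 180_000, // hypothetical value
  "gpt-4o-mini": 120_000,       // hypothetical value
};

function getModelMaxInputTokens(modelId: string): number {
  // Prefer a model-specific override; fall back to the shared default.
  return MODEL_MAX_INPUT_TOKEN_OVERRIDES[modelId] ?? DEFAULT_MAX_INPUT_TOKENS;
}
```

The override-map-plus-default pattern keeps new model entries a one-line change while unknown models still get a safe baseline.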

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant MessageAgent as Message<br/>Agent
    participant FinalSynthesis as Final Answer<br/>Synthesis
    participant Planner
    participant Mapper
    participant Model
    
    Client->>MessageAgent: Request final answer
    MessageAgent->>FinalSynthesis: executeFinalSynthesis(context)
    
    FinalSynthesis->>FinalSynthesis: Estimate input tokens<br/>and select mode
    
    alt Sectional Mode
        FinalSynthesis->>Planner: planSections(fragments,<br/>budget, clarifications)
        Planner-->>FinalSynthesis: section plan
        FinalSynthesis->>FinalSynthesis: buildFragmentPreviews
        FinalSynthesis->>Mapper: mapFragmentsToSections(fragments,<br/>plan)
        Mapper-->>FinalSynthesis: fragment-to-section<br/>assignments
        FinalSynthesis->>FinalSynthesis: selectImagesForFragmentIds<br/>and selectMappedEntries
        loop For each section
            FinalSynthesis->>Model: synthesizeSection(context,<br/>section payload)
            Model-->>FinalSynthesis: section answer
        end
    else Single Mode
        FinalSynthesis->>FinalSynthesis: selectImagesForFinalSynthesis
        FinalSynthesis->>Model: synthesizeSingleAnswer(context,<br/>full payload)
        Model-->>FinalSynthesis: answer
    end
    
    FinalSynthesis->>FinalSynthesis: Stream answer chunks
    FinalSynthesis-->>MessageAgent: FinalSynthesisExecutionResult
    MessageAgent-->>Client: Final answer
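The "Estimate input tokens and select mode" step in the diagram can be sketched roughly as below. The 4-characters-per-token heuristic and the exact shape of decideSynthesisMode are assumptions; the real implementation in final-answer-synthesis.ts also accounts for images and prompt overhead.

```typescript
// Hedged sketch of the mode decision: fall back to sectional synthesis
// when the estimated fragment tokens exceed the safe input budget.

type SynthesisMode = "single" | "sectional";

function estimateTextTokens(text: string): number {
  // Common rough heuristic: ~4 characters per token.
  return Math.ceil(text.length / 4);
}

function decideSynthesisMode(
  fragments: string[],
  safeInputBudget: number,
): SynthesisMode {
  const estimated = fragments.reduce(
    (sum, fragment) => sum + estimateTextTokens(fragment),
    0,
  );
  return estimated <= safeInputBudget ? "single" : "sectional";
}
```

With this shape, single mode stays the cheap default and sectional mode only kicks in when the evidence set would overflow the model's context.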

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

server

Suggested reviewers

  • zereraz
  • shivamashtikar
  • junaid-shirur
  • devesh-juspay
  • kalpadhwaryu

Poem

🐰 A synthesis takes flight,
With tokens budgeted just right,
Sections planned and mapped with care,
Images chosen from the pair,
Streaming answers through the night! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Title Check: ❓ Inconclusive. The title 'Feat/final synthesis compaction' is only partially related to the changeset: it refers to the final synthesis work, but 'compaction' is vague and doesn't convey the main architectural shift of extracting final synthesis logic into a dedicated module. Consider a more specific title such as 'Extract final synthesis logic into dedicated module' or 'Refactor final answer synthesis into separate module'.
✅ Passed checks (1 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.



@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the AI agent's ability to generate final answers, particularly when dealing with extensive evidence. By introducing a flexible synthesis pipeline that can dynamically switch between single and sectional processing modes, the system can now efficiently handle large context windows, improve answer quality, and reduce the risk of context length errors. The changes also centralize and streamline the underlying logic, making the system more scalable and easier to maintain.

Highlights

  • Introduced Sectional Final Answer Synthesis: Implemented a new, robust final answer synthesis pipeline that can operate in either a 'single' mode for smaller contexts or a 'sectional' mode for larger evidence sets. This allows the AI agent to break down complex answers into manageable sections, process them in parallel, and then reassemble them, significantly improving scalability and context handling.
  • Dynamic Model Input Token Configuration: Added dynamic configuration for Large Language Model (LLM) maximum input tokens, allowing the system to adapt its synthesis strategy based on the specific model's context window capabilities. This includes a new utility function to retrieve the maximum input tokens for any given model.
  • Refactored Final Synthesis Logic: Centralized the final answer synthesis logic into a dedicated module (final-answer-synthesis.ts), decoupling it from the main message-agents.ts file. This improves modularity, readability, and maintainability by removing duplicated prompt building and image selection logic.
  • Enhanced Context Management for Synthesis: Developed sophisticated mechanisms for managing context during synthesis, including fragment preview generation, intelligent fragment batching within token budgets, and image selection strategies that prioritize recent and user-attached images while respecting model limits.
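The "intelligent fragment batching within token budgets" mentioned in the last highlight might look like this minimal sketch. batchFragmentsWithinBudget is a hypothetical stand-in for the PR's buildFragmentBatchesWithinBudget; the token estimator and the rule that an oversized fragment still gets its own batch are assumptions, not the PR's exact behavior.

```typescript
// Greedy batching sketch: pack fragments into batches until the next
// fragment would exceed the token budget, then start a new batch.

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // rough heuristic, an assumption
}

function batchFragmentsWithinBudget(
  fragments: string[],
  budget: number,
): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let used = 0;
  for (const fragment of fragments) {
    const cost = estimateTokens(fragment);
    // Flush only when the batch is non-empty, so a single oversized
    // fragment still lands in a batch of its own rather than being dropped.
    if (current.length > 0 && used + cost > budget) {
      batches.push(current);
      current = [];
      used = 0;
    }
    current.push(fragment);
    used += cost;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```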


Changelog
  • server/ai/modelConfig.ts
    • Added DEFAULT_MAX_INPUT_TOKENS constant to define a baseline token limit.
    • Introduced MODEL_MAX_INPUT_TOKEN_OVERRIDES to specify custom maximum input token limits for various LLM models.
    • Implemented logic to apply these overrides to the MODEL_CONFIGURATIONS.
    • Exported getModelMaxInputTokens function to retrieve the appropriate maximum input token count for a given model.
  • server/api/chat/final-answer-synthesis.ts
    • Added new file containing the core logic for final answer synthesis.
    • Defined types for FinalSynthesisExecutionResult, SynthesisModeSelection, FinalSection, SectionAnswerResult, FragmentPreviewRecord, FragmentAssignmentBatch, SelectedImagesResult, and SectionMappingEnvelope.
    • Implemented utility functions for text normalization, truncation, and token estimation (estimateTextTokens, estimateImageTokens, estimatePromptTokens).
    • Included error classification for planner fallbacks (classifyPlannerFallbackError).
    • Developed functions for managing stop signals and streaming answer chunks (buildStoppedFinalSynthesisResult, logFinalSynthesisStop, cancelConverseIterator, streamFinalAnswerChunk).
    • Created estimateSafeInputBudget to calculate available token budget for input.
    • Provided functions to format plan and clarifications for prompts (formatPlanForPrompt, formatClarificationsForPrompt).
    • Built shared context for final answer generation (buildSharedFinalAnswerContext).
    • Defined base system prompts for final and sectional answers (buildBaseFinalAnswerSystemPrompt).
    • Implemented buildFinalSynthesisPayload for single-shot synthesis.
    • Added functions for sectional planning and mapping (formatSectionPlanOverview, buildPlannerSystemPrompt, buildMapperSystemPrompt, formatSectionFragments).
    • Developed logic for fragment preview records and budget-aware text inclusion (findTimestamp, buildFragmentPreviewRecord, formatPreviewRecord, buildPreviewOmissionSummary, buildPreviewTextWithinBudget).
    • Implemented buildFragmentBatchesWithinBudget for efficient fragment processing.
    • Included functions for normalizing and merging section plans and assignments (normalizeSectionPlan, normalizeSectionAssignments, mergeSectionAssignments).
    • Created runWithConcurrency for parallel processing of items.
    • Defined buildDefaultSectionPlan and buildDefaultAssignments for fallback scenarios.
    • Implemented image selection logic (createSelectedImagesResult, selectImagesForFinalSynthesis, selectImagesForFragmentIds).
    • Added selectMappedEntriesWithinBudget to trim fragments based on token limits.
    • Developed buildSectionAnswerPayload for individual section synthesis.
    • Implemented decideSynthesisMode to choose between single and sectional synthesis.
    • Provided asynchronous functions for planning sections (planSections) and mapping fragments (mapFragmentsToSections).
    • Implemented the core synthesis functions for single answers (synthesizeSingleAnswer) and individual sections (synthesizeSection).
    • Added assembleSectionAnswers to combine results from sectional synthesis.
    • Implemented synthesizeSectionalAnswer to orchestrate the sectional synthesis process.
    • Exported executeFinalSynthesis as the main entry point for final answer generation.
    • Exposed internal functions for testing via __finalAnswerSynthesisInternals.
  • server/api/chat/message-agents.ts
    • Imported executeFinalSynthesis, formatClarificationsForPrompt, and formatPlanForPrompt from the new final-answer-synthesis module.
    • Removed buildAgentSystemPromptContextBlock, formatFragmentsWithMetadata imports from message-agents-metadata as they are now handled by the new module.
    • Exported buildFinalSynthesisPayload from final-answer-synthesis.
    • Removed the local formatPlanForPrompt function, now using the imported version.
    • Removed the local formatClarificationsForPrompt function, now using the imported version.
    • Removed the local buildFinalSynthesisPayload function, now using the imported version.
    • Removed the local selectImagesForFinalSynthesis function, now using the imported version.
    • Refactored createFinalSynthesisTool to call the new executeFinalSynthesis function, simplifying its internal logic.
    • Updated reasoning emission messages to reflect the new sectional synthesis mode.
  • server/logger/index.ts
    • Removed unused import of levels from the pino library.
  • server/shared/types.ts
    • Added an optional maxInputTokens property to the ModelConfiguration interface.
  • server/tests/finalAnswerSynthesis.test.ts
    • Added new test file for final-answer-synthesis module.
    • Included tests for buildFragmentPreviewRecord to ensure deterministic fragment previews.
    • Tested buildSectionAnswerPayload to verify shared context and section-specific instructions.
    • Verified decideSynthesisMode correctly switches to sectional mode when the input budget is exceeded.
    • Added tests for selectMappedEntriesWithinBudget to confirm proper handling of oversized fragments.
    • Included a test for synthesizeSection to ensure sections are omitted when no fragments fit the budget.
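The runWithConcurrency helper listed in the changelog above suggests a standard worker-pool pattern for synthesizing sections in parallel under a concurrency cap. The signature below is assumed from the description, not copied from the PR.

```typescript
// Worker-pool sketch: run at most `limit` workers at once while
// preserving result order by index.

async function runWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function runner(): Promise<void> {
    // `next++` is safe here: JS is single-threaded, so the claim of an
    // index happens synchronously before the await yields control.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }
  const runners = Array.from(
    { length: Math.min(limit, items.length) },
    () => runner(),
  );
  await Promise.all(runners);
  return results;
}
```

Capping concurrency this way lets sectional synthesis fan out model calls without flooding the provider with simultaneous requests.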


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new, sophisticated final answer synthesis mechanism, primarily implemented in a new file server/api/chat/final-answer-synthesis.ts. This module intelligently decides between a single-shot or a multi-sectional approach for generating answers, especially when dealing with large amounts of context and images, to stay within model token limits. It includes logic for planning sections, mapping fragments to these sections, and synthesizing each section concurrently. To support this, server/ai/modelConfig.ts was updated to store and retrieve maximum input token limits for various models, and server/shared/types.ts was modified to include maxInputTokens in the ModelConfiguration interface. The existing server/api/chat/message-agents.ts file was refactored to delegate the entire final synthesis process to the new module, simplifying its code. Additionally, a new test file server/tests/finalAnswerSynthesis.test.ts was added to validate the new synthesis logic, covering fragment preview generation, sectional payload construction, mode selection, and fragment budget management. A minor indentation issue was noted in the selectMappedEntriesWithinBudget function call within the new synthesis file.

Note: Security Review did not run due to the size of the PR.

Comment on lines +1308 to +1312
selectMappedEntriesWithinBudget(
orderedEntries,
baseTokens,
safeInputBudget,
)
Severity: medium

There's a minor indentation issue here that affects readability. The arguments to selectMappedEntriesWithinBudget are indented incorrectly.

Suggested change (indentation only):
selectMappedEntriesWithinBudget(
  orderedEntries,
  baseTokens,
  safeInputBudget,
)


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/api/chat/message-agents.ts (1)

3092-3118: ⚠️ Potential issue | 🟠 Major

Reset the full final-synthesis state on every non-success exit.

review.lockedByFinalSynthesis is set before the streaming-channel guard and never cleared on the early return or either catch path. After a recoverable synthesis failure, the run will skip all later reviews/replanning, and the leftover streamedText / ackReceived can leak partial output into a retry or delegated fallback.

Proposed fix
     async execute(_args, context) {
       const mutableContext = mutableAgentContext(context)
-      if (!mutableContext.review.lockedByFinalSynthesis) {
-        mutableContext.review.lockedByFinalSynthesis = true
-        mutableContext.review.lockedAtTurn =
-          mutableContext.turnCount ?? MIN_TURN_NUMBER
-        loggerWithChild({ email: context.user.email }).info(
-          {
-            chatId: context.chat.externalId,
-            turn: mutableContext.review.lockedAtTurn,
-          },
-          "[MessageAgents][FinalSynthesis] Review lock activated after synthesis tool call."
-        )
-      }
+      const resetFinalSynthesisState = () => {
+        mutableContext.finalSynthesis.requested = false
+        mutableContext.finalSynthesis.completed = false
+        mutableContext.finalSynthesis.suppressAssistantStreaming = false
+        mutableContext.finalSynthesis.streamedText = ""
+        mutableContext.finalSynthesis.ackReceived = false
+        mutableContext.review.lockedByFinalSynthesis = false
+        mutableContext.review.lockedAtTurn = null
+      }
+
       if (
         mutableContext.finalSynthesis.requested &&
         mutableContext.finalSynthesis.completed
       ) {
         return ToolResponse.error(
@@
       if (!mutableContext.runtime?.streamAnswerText) {
         return ToolResponse.error(
           "EXECUTION_FAILED",
           "Streaming channel unavailable. Cannot deliver final answer."
         )
       }
+
+      if (!mutableContext.review.lockedByFinalSynthesis) {
+        mutableContext.review.lockedByFinalSynthesis = true
+        mutableContext.review.lockedAtTurn =
+          mutableContext.turnCount ?? MIN_TURN_NUMBER
+        loggerWithChild({ email: context.user.email }).info(
+          {
+            chatId: context.chat.externalId,
+            turn: mutableContext.review.lockedAtTurn,
+          },
+          "[MessageAgents][FinalSynthesis] Review lock activated after synthesis tool call."
+        )
+      }
@@
       } catch (error) {
+        resetFinalSynthesisState()
         if (isMessageAgentStopError(error)) {
-          context.finalSynthesis.suppressAssistantStreaming = false
-          context.finalSynthesis.requested = false
-          context.finalSynthesis.completed = false
           throw error
         }
-
-        context.finalSynthesis.suppressAssistantStreaming = false
-        context.finalSynthesis.requested = false
-        context.finalSynthesis.completed = false
         loggerWithChild({ email: context.user.email }).error(
           { err: error instanceof Error ? error.message : String(error) },
           "Final synthesis tool failed."
         )

Also applies to: 3160-3179

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/api/chat/message-agents.ts` around lines 3092 - 3118, The
final-synthesis state (e.g., mutableContext.review.lockedByFinalSynthesis,
mutableContext.review.lockedAtTurn and all fields under
mutableContext.finalSynthesis such as streamedText and ackReceived) is set
before the streaming-channel guard but not cleared on early returns or failure
paths; update the control flow so that any non-success exit (the early return
when runtime?.streamAnswerText is false, the "Final synthesis already completed"
path, and all catch/failure branches around the synthesis call) resets the
entire final-synthesis state to its initial/empty values and clears the review
lock (unset lockedByFinalSynthesis and lockedAtTurn) to avoid leaking partial
output into retries or later stages, ensuring you touch the same symbols
(mutableContext.review.lockedByFinalSynthesis,
mutableContext.review.lockedAtTurn, mutableContext.finalSynthesis.streamedText,
mutableContext.finalSynthesis.ackReceived, and
mutableContext.finalSynthesis.requested/completed).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0bc99031-2b58-42ec-bc34-f430af9bf1ad

📥 Commits

Reviewing files that changed from the base of the PR and between f16ea5e and 9199f39.

📒 Files selected for processing (6)
  • server/ai/modelConfig.ts
  • server/api/chat/final-answer-synthesis.ts
  • server/api/chat/message-agents.ts
  • server/logger/index.ts
  • server/shared/types.ts
  • server/tests/finalAnswerSynthesis.test.ts
