fix: add guards against possible memory overflow in find and aggregate tools MCP-21 #536

Conversation
Pull Request Test Coverage Report for Build 17863910702 (Coveralls)
Pull Request Overview
This PR adds memory overflow protection to MongoDB find and aggregate tools by implementing configurable limits on document retrieval and memory usage.
Key changes:
- Introduces `maxDocumentsPerQuery` (default: 50) and `maxBytesPerQuery` (default: 1MB) configuration parameters
- Replaces `toArray()` with custom cursor iteration that monitors memory consumption
- Adds fallback mechanisms for count operations with timeout protection
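The second bullet describes the general shape of a byte-aware cursor iterator. A minimal sketch, assuming the memory budget is checked against each document's serialized size (the names `collectUpToBytes` and `Doc` are hypothetical, not the PR's actual API; the real implementation may size documents differently, e.g. via BSON):

```typescript
// Illustrative sketch only: stop consuming a cursor once the accumulated
// serialized size of the documents would exceed a byte budget.
type Doc = Record<string, unknown>;

async function collectUpToBytes(
  cursor: AsyncIterable<Doc>,
  maxBytes: number,
  maxDocs: number
): Promise<{ documents: Doc[]; cappedBy?: "bytes" | "documents" }> {
  const documents: Doc[] = [];
  let usedBytes = 0;
  for await (const doc of cursor) {
    // Approximate the in-memory cost by the document's JSON byte length.
    const size = Buffer.byteLength(JSON.stringify(doc), "utf8");
    if (usedBytes + size > maxBytes) {
      // Cut off iteration before the budget would be exceeded.
      return { documents, cappedBy: "bytes" };
    }
    documents.push(doc);
    usedBytes += size;
    if (documents.length >= maxDocs) {
      return { documents, cappedBy: "documents" };
    }
  }
  // Cursor exhausted within both limits.
  return { documents };
}
```

Any async iterable (including a MongoDB driver cursor, which implements the async iterator protocol) can be passed as `cursor`.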
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/common/config.ts | Adds new configuration parameters for document and byte limits |
| src/helpers/constants.ts | Defines timeout constants for count operations |
| src/helpers/operationWithFallback.ts | Utility for operations with fallback values |
| src/helpers/iterateCursor.ts | Core cursor iteration logic with memory monitoring |
| src/tools/mongodb/read/find.ts | Updates find tool to use new memory-safe cursor iteration |
| src/tools/mongodb/read/aggregate.ts | Updates aggregate tool to use new memory-safe cursor iteration |
| tests/unit/helpers/*.test.ts | Unit tests for new helper functions |
| tests/integration/tools/mongodb/read/*.test.ts | Integration tests for memory limits and updated response messages |
| tests/integration/indexCheck.test.ts | Updates test expectations for new response format |
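The `operationWithFallback.ts` entry in the table suggests a small utility: run an async operation and, if it rejects or exceeds a timeout, resolve to a fallback value instead (for example, an unknown count when `countDocuments` times out). A sketch under assumed semantics, not the PR's actual code:

```typescript
// Illustrative sketch: race an operation against a timeout, and swallow
// failures by resolving to a caller-supplied fallback value.
async function operationWithFallback<T, F>(
  operation: () => Promise<T>,
  fallback: F,
  timeoutMs: number
): Promise<T | F> {
  const timeout = new Promise<F>((resolve) =>
    setTimeout(() => resolve(fallback), timeoutMs)
  );
  try {
    // Whichever settles first wins; a rejection falls through to the catch.
    return await Promise.race([operation(), timeout]);
  } catch {
    return fallback;
  }
}
```

A count operation would then be wrapped as `operationWithFallback(() => collection.countDocuments(), undefined, timeoutMs)`, so a slow count degrades to "unknown" rather than blocking the tool response.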
Targets the find and aggregate tools and does the following to avoid the possibility of memory overflow:
1. Adds a configurable limit to restrict the number of documents fetched per query / aggregation.
2. Adds an iterator that keeps track of the bytes consumed in memory by the retrieved documents and cuts off the iteration when there is a possibility of overflow. The cutoff is based on the configured maxBytesPerQuery parameter, which defaults to 1MB.
Co-authored-by: Copilot <[email protected]>
This commit removes the default limit from the find tool schema because we now have a configurable maximum on the number of documents allowed to be sent per query.
1. Implements disabling maxDocumentsPerQuery and maxBytesPerQuery. 2. Uses correct numbers for the bytes calculation.
Force-pushed from 97ad191 to 9d9b9f8
Going to let Kevin and Gagik review the actual code, but do the limits sound reasonable? If we're preventing OOM situations, 1 MB sounds incredibly conservative. Can we instead base it off the available memory of the machine, e.g. using
The memory limit is applied per tool response, so letting one tool call take up almost the entire available heap seems too lenient, particularly given that we now support multiple clients connecting to the same server. I still think a 16MB default through configuration and 1MB on the tool interface (configurable by the LLM and capped to the configured limit) seems pretty reasonable.
Yes, the memory limit is per tool call, and 1MB of results is a big chunk of results, and even more when some models are pretty aggressively using projections to reduce the context they use when reading the results. I think the question here would be: why would we need more memory?
If users want more accurate results, they should instruct the agent to delegate some of the work to an aggregation pipeline, as in any other application.
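As an illustration of that suggestion (the collection fields here are invented for the example), a pipeline like the following computes per-category totals server-side instead of shipping every matching document back through the tool response:

```json
[
  { "$match": { "status": "active" } },
  { "$group": { "_id": "$category", "total": { "$sum": "$amount" } } },
  { "$sort": { "total": -1 } },
  { "$limit": 10 }
]
```

The result is at most ten small summary documents, which fits comfortably inside even the conservative 1MB default discussed above.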
This commit implements the PR feedback about being generous on the config defaults and applying recommended restrictions on the tool parameters for capping the memory usage.
📊 Accuracy Test Results (summary and baseline comparison tables elided; full HTML report generated on: 9/19/2025, 4:06:32 PM)
Pull Request Overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Proposed changes
This PR targets the find and aggregate tools and does the following to hopefully guard against the possibility of memory overflow:
- Replaces Cursor.toArray with a dedicated cursor iterator that keeps track of the bytes consumed in memory by the retrieved documents and cuts off the cursor iteration when there is a possibility of overflow. The cutoff is based on the configured maxBytesPerQuery parameter, which defaults to 16MB.
- The tools that use the cursor iterator, such as find and aggregate, also expose responseBytesLimit as one of their parameters. It defaults to 1MB, allowing LLMs to invoke the tool with a bigger limit if needed, capped by the configured maxBytesPerQuery.

Some key decisions that were taken on a call and not documented anywhere:
- Remove .toArray and replace it with a cursor restricted to a configured limit.

Note for reviewers
Please review with whitespaces turned off.
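The capping behaviour described above (a 1MB per-call default bounded by a 16MB configured ceiling) can be sketched as follows; the function name and constants are illustrative, not the PR's actual identifiers:

```typescript
// Illustrative sketch: a per-call byte limit requested by the LLM is
// honored only up to the server-wide configured ceiling.
const DEFAULT_RESPONSE_BYTES_LIMIT = 1 * 1024 * 1024; // 1MB tool-parameter default
const MAX_BYTES_PER_QUERY = 16 * 1024 * 1024; // 16MB configured ceiling

function effectiveByteLimit(requested?: number): number {
  const wanted = requested ?? DEFAULT_RESPONSE_BYTES_LIMIT;
  // Never exceed the configured maximum, regardless of what was requested.
  return Math.min(wanted, MAX_BYTES_PER_QUERY);
}
```

This keeps the tool interface flexible for the model while the operator's configuration remains the hard upper bound.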
Checklist