fix: add guards against possible memory overflow in find and aggregate tools MCP-21 #536

himanshusinghs · 2025-09-09T14:52:01Z

Proposed changes

This PR targets the find and aggregate tools and does the following to guard against possible memory overflow:

Adds a configurable limit on the maximum number of documents fetched per query / aggregation. It defaults to 100 and can be configured through CLI args / env variables. The limit applies in addition to the limit passed to the find tool (default 10), and the smaller of the two always wins.
Replaces Cursor.toArray with a dedicated cursor iterator that tracks the bytes consumed in memory by the retrieved documents and stops iterating when an overflow becomes possible. The overflow threshold is based on the configured maxBytesPerQuery parameter, which defaults to 16MB. Tools that use the cursor iterator, such as find and aggregate, also expose a responseBytesLimit parameter that defaults to 1MB, so LLMs can invoke the tool with a bigger limit if needed, capped by the configured maxBytesPerQuery.
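The two guards described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual helper: the names collectWithLimits, CollectResult and approximateSize are hypothetical, the cursor is modelled as any AsyncIterable, and document size is approximated by JSON string length, whereas a real implementation would more likely measure BSON size.

```typescript
// Hypothetical result shape; names are illustrative, not taken from the PR.
interface CollectResult<T> {
  documents: T[];
  cappedBy: "maxBytes" | "maxDocuments" | undefined;
}

// Assumption for this sketch: approximate a document's size by the length of
// its JSON string. A real helper would more likely use the BSON size.
function approximateSize(doc: unknown): number {
  return JSON.stringify(doc).length;
}

// Iterate the cursor one document at a time instead of calling toArray(),
// stopping as soon as either the document cap or the byte budget is reached.
async function collectWithLimits<T>(
  cursor: AsyncIterable<T>,
  maxDocuments: number,
  maxBytes: number
): Promise<CollectResult<T>> {
  const documents: T[] = [];
  let bytesUsed = 0;

  for await (const doc of cursor) {
    if (documents.length >= maxDocuments) {
      return { documents, cappedBy: "maxDocuments" };
    }
    const size = approximateSize(doc);
    if (bytesUsed + size > maxBytes) {
      return { documents, cappedBy: "maxBytes" };
    }
    documents.push(doc);
    bytesUsed += size;
  }
  return { documents, cappedBy: undefined };
}

// Stand-in for a database cursor: five small documents of 25 characters each.
async function* fakeCursor(): AsyncGenerator<{ _id: number; title: string }> {
  for (let i = 0; i < 5; i++) {
    yield { _id: i, title: "movie" };
  }
}
```

With the stand-in cursor, a 60-byte budget keeps the first two documents and reports cappedBy as "maxBytes", while a document cap of 3 with a generous byte budget keeps three documents and reports "maxDocuments". Returning from inside the for await loop also lets the iterator clean up, which for a real cursor would be the point where it gets closed.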

Some key decisions that were taken on a call and not documented anywhere:

We will use a hard cap to limit the number of documents retrieved by the query and the aggregation.
We will avoid using .toArray and replace that with a cursor restricted to a configured limit.
We will introduce a configurable max bytes per query, keep its default relatively low (1MB), and restrict the cursor iteration to stay within this limit.

Note for reviewers

Please review with whitespace changes hidden.

Checklist

I have signed the MongoDB CLA

coveralls · 2025-09-09T16:32:06Z

Pull Request Test Coverage Report for Build 17863910702

Details

275 of 289 (95.16%) changed or added relevant lines in 7 files are covered.
2 unchanged lines in 1 file lost coverage.
Overall coverage increased (+0.5%) to 82.391%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/helpers/collectCursorUntilMaxBytes.ts	61	63	96.83%
src/tools/mongodb/read/aggregate.ts	101	107	94.39%
src/tools/mongodb/read/find.ts	90	96	93.75%

Files with Coverage Reduction	New Missed Lines	%
src/tools/mongodb/read/find.ts	2	92.26%

Totals
Change from base Build 17860042369:	0.5%
Covered Lines:	5238
Relevant Lines:	6247

💛 - Coveralls

Copilot

Pull Request Overview

This PR adds memory overflow protection to MongoDB find and aggregate tools by implementing configurable limits on document retrieval and memory usage.

Key changes:

Introduces maxDocumentsPerQuery (default: 50) and maxBytesPerQuery (default: 1MB) configuration parameters
Replaces toArray() with custom cursor iteration that monitors memory consumption
Adds fallback mechanisms for count operations with timeout protection

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

File	Description
src/common/config.ts	Adds new configuration parameters for document and byte limits
src/helpers/constants.ts	Defines timeout constants for count operations
src/helpers/operationWithFallback.ts	Utility for operations with fallback values
src/helpers/iterateCursor.ts	Core cursor iteration logic with memory monitoring
src/tools/mongodb/read/find.ts	Updates find tool to use new memory-safe cursor iteration
src/tools/mongodb/read/aggregate.ts	Updates aggregate tool to use new memory-safe cursor iteration
tests/unit/helpers/*.test.ts	Unit tests for new helper functions
tests/integration/tools/mongodb/read/*.test.ts	Integration tests for memory limits and updated response messages
tests/integration/indexCheck.test.ts	Updates test expectations for new response format
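
The operationWithFallback row above describes a utility for operations with fallback values. A minimal sketch of that idea follows, assuming a signature (an async operation plus a fallback value) that this page does not confirm; countWithTimeout is a hypothetical stand-in for a count that can fail or time out.

```typescript
// Assumed signature: run the operation and return the fallback if it rejects.
async function operationWithFallback<T, F>(
  operation: () => Promise<T>,
  fallback: F
): Promise<T | F> {
  try {
    return await operation();
  } catch {
    // Any failure (for example a server-side timeout) yields the fallback.
    return fallback;
  }
}

// Hypothetical stand-in for a count operation that may exceed its time limit.
async function countWithTimeout(shouldFail: boolean): Promise<number> {
  if (shouldFail) {
    throw new Error("operation exceeded time limit");
  }
  return 1234;
}

// The tool response can then report either a real count or a placeholder.
async function describeTotal(shouldFail: boolean): Promise<string> {
  const total = await operationWithFallback(
    () => countWithTimeout(shouldFail),
    "unknown"
  );
  return `total documents: ${total}`;
}
```

The design choice here is that a failed or slow count degrades the response message rather than failing the whole tool call.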

Targets find and aggregate tool and does the following to avoid the memory overflow possibility: 1. Adds a configurable limit to restrict the number of documents fetched per query / aggregation. 2. Adds an iterator that keeps track of bytes consumed in memory by the retrieved documents and cuts off the iteration when there is a possibility of overflow. The overflow is based on configured maxBytesPerQuery parameter which defaults to 1MB.

nirinchev · 2025-09-18T11:40:55Z

Going to let Kevin and Gagik review the actual code, but do the limits sound reasonable? If we're preventing OOM situations, 1 MB sounds incredibly conservative - can we instead base it off the available memory of the machine - e.g. using v8.getHeapStatistics?

himanshusinghs · 2025-09-18T11:46:01Z

The memory limit is applied per tool response, so letting one tool call take almost the entire available heap seems too lenient, particularly given that we now support multiple clients connecting to the same server.

I still think a 16MB default through configuration and 1MB on the tool interface (configurable by the LLM and capped to the configured limit) seems pretty reasonable.
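
The interaction between the configured cap and the value requested on the tool interface can be sketched as below. This is an illustration under stated assumptions: resolveLimit is a hypothetical name, and treating a non-positive value as "limit disabled" is a guess at how disabling is expressed, not something this thread confirms.

```typescript
const ONE_MB = 1024 * 1024;

// Pick the smaller of the configured cap and the requested value.
// Assumption: a non-positive or missing value means "no limit from this side".
function resolveLimit(
  configuredMax: number,
  requested?: number
): number | undefined {
  const candidates = [configuredMax, requested].filter(
    (value): value is number => typeof value === "number" && value > 0
  );
  return candidates.length > 0 ? Math.min(...candidates) : undefined;
}

// Tool asks for its 1MB default, configuration allows 16MB: 1MB applies.
const defaultCase = resolveLimit(16 * ONE_MB, ONE_MB);

// Tool asks for 64MB, configuration allows 16MB: capped to 16MB.
const cappedCase = resolveLimit(16 * ONE_MB, 64 * ONE_MB);

// The same rule covers documents: configured max 100, find limit 10.
const documentCase = resolveLimit(100, 10);
```

Under these assumptions defaultCase is 1MB, cappedCase is 16MB and documentCase is 10, so a single tool call can never exceed what the server operator configured.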

kmruiz · 2025-09-18T17:36:29Z

Yes, the memory limit is per tool call, and 1MB is a big chunk of results, even more so when some models use projections pretty aggressively to reduce the context they consume when reading the results. I think the question here would be: why would we need more memory?

In the HTTP transport, the server will likely run multiple tools for multiple users at the same time, and we don't want noisy neighbors to overload the system.
1MB can hold a lot of data. For example, in the sample_mflix.movies collection from our sample data, the bsonsize of a single movie (with descriptions and summaries) is 1.8KB, which means we can fit around 550 movies in a single query result.
Most agents still have context windows of 128k to 256k tokens, and 1MB of JSON is already a big hit on the agent context, which makes that much content pretty useless because of the loss of accuracy.

If users want more accurate results, they should instruct the agent to delegate some of the work to an aggregation pipeline, as in any other application.
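
The capacity estimate in the comment above is a simple division, sketched here for reference. The function name is hypothetical, and the 1.8KB figure is the one quoted for a sample_mflix.movies document; binary units (1KB = 1024 bytes) are an assumption.

```typescript
const ONE_KB = 1024;
const ONE_MB = 1024 * ONE_KB;

// How many average-sized documents fit in a given byte budget.
function documentsThatFit(
  budgetBytes: number,
  averageDocumentBytes: number
): number {
  return Math.floor(budgetBytes / averageDocumentBytes);
}

// 1MB budget with 1.8KB documents: 568 documents with binary units,
// the same ballpark as the "around 550" quoted above.
const perOneMb = documentsThatFit(ONE_MB, 1.8 * ONE_KB);

// The 16MB configured cap scales the same estimate to 9102 documents.
const perSixteenMb = documentsThatFit(16 * ONE_MB, 1.8 * ONE_KB);
```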

github-actions · 2025-09-19T16:06:34Z

📊 Accuracy Test Results

📈 Summary

Metric	Value
Commit SHA	`32e61661029ff721d10ec52287c72f342ac0e220`
Run ID	`e2b5e3eb-e87a-4309-b34c-1a9b0469abdc`
Status	done
Total Prompts Evaluated	61
Models Tested	1
Average Accuracy	88.9%
Responses with 0% Accuracy	6
Responses with 75% Accuracy	3
Responses with 100% Accuracy	52

📊 Baseline Comparison

Metric	Value
Baseline Commit	`c10955af6e78cc03aa30f3817e248bf130e053d3`
Baseline Run ID	`1d631660-589e-428e-b8a9-3d13f6f35615`
Baseline Run Status	`done`
Responses Improved	0
Responses Regressed	2

📎 Download Full HTML Report - Look for the accuracy-test-summary artifact for detailed results.

Report generated on: 9/19/2025, 4:06:32 PM

Copilot

Pull Request Overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

himanshusinghs marked this pull request as ready for review September 10, 2025 09:35

Copilot AI review requested due to automatic review settings September 10, 2025 09:35

himanshusinghs requested a review from a team as a code owner September 10, 2025 09:35

Copilot AI reviewed Sep 10, 2025

tests/unit/helpers/iterateCursor.test.ts (outdated, resolved)

src/helpers/iterateCursor.ts (outdated, resolved)

src/tools/mongodb/read/find.ts (outdated, resolved)

himanshusinghs added the accuracy-tests label Sep 10, 2025

blva reviewed Sep 10, 2025

src/common/config.ts (outdated, resolved)

src/tools/mongodb/read/aggregate.ts (outdated, resolved)

src/tools/mongodb/read/find.ts (outdated, resolved)

src/tools/mongodb/read/find.ts (resolved)

gagik reviewed Sep 10, 2025

src/common/config.ts (outdated, resolved)

gagik reviewed Sep 10, 2025

src/helpers/iterateCursor.ts (outdated, resolved)

himanshusinghs and others added 9 commits September 10, 2025 19:04

chore: fix existing tests

250299b

chore: tests for the new behavior

f1be251

chore: add missing constants files

eff03a8

Apply suggestion from @Copilot

7d670e8

Co-authored-by: Copilot <[email protected]>

chore: minor typo

8e8c3aa

fix: removes default limit from find tool schema

6bd5638

This commit removes default limit from the find tool schema because now we have a configurable max limit of the documents allowed to be sent per query.

chore: add an accuracy test for find tool

937908b

chore: PR feedback

9d9b9f8

1. implements disabling maxDocumentsPerQuery and maxBytesPerQuery 2. use correct numbers for bytes calculation

himanshusinghs force-pushed the fix/MCP-21-avoid-memory-overflow branch from 97ad191 to 9d9b9f8 Compare September 10, 2025 17:05

himanshusinghs added 3 commits September 10, 2025 20:32

chore: abort cursor iteration on request timeouts

13d8408

chore: use correct arg in agg tool

f09b4f4

chore: export tool matchers

7354562

himanshusinghs added accuracy-tests and removed accuracy-tests labels Sep 10, 2025

chore: accuracy test fixes

819ed01

himanshusinghs added accuracy-tests and removed accuracy-tests labels Sep 10, 2025

himanshusinghs requested review from blva and gagik September 10, 2025 20:59

gagik reviewed Sep 11, 2025

src/tools/mongodb/read/find.ts (outdated, resolved)

Merge remote-tracking branch 'origin/main' into fix/MCP-21-avoid-memo…

21f1d3e

…ry-overflow

kmruiz approved these changes Sep 18, 2025

src/helpers/iterateCursor.ts (outdated, resolved)

src/tools/mongodb/read/find.ts (outdated, resolved)

himanshusinghs added 7 commits September 19, 2025 17:20

chore: PR feedback about generous config defaults

25e0367

This commit implements the PR feedback about being generous on the config defaults and applying recommended restrictions on the tool parameters for capping the memory usage.

Merge remote-tracking branch 'origin/main' into fix/MCP-21-avoid-memo…

67d3ea8

…ry-overflow

chore: fix tests after merge

8601c05

chore: account for cursor close errors

955b7d8

chore: remove unnecessary call

bca4bbe

chore: revert export changes

811474e

chore: remove eager prediction of overflow

e3a87b3

himanshusinghs added accuracy-tests and removed accuracy-tests labels Sep 19, 2025

himanshusinghs requested review from gagik, kmruiz and Copilot September 19, 2025 16:11

Copilot AI reviewed Sep 19, 2025

src/tools/mongodb/read/find.ts (resolved)

src/tools/mongodb/read/aggregate.ts (resolved)

src/helpers/collectCursorUntilMaxBytes.ts (resolved)

chore: initialise cursor variables

e1c95bd

kmruiz approved these changes Sep 19, 2025

himanshusinghs merged commit 01cbfe7 into main Sep 22, 2025
18 checks passed

himanshusinghs deleted the fix/MCP-21-avoid-memory-overflow branch September 22, 2025 09:08

himanshusinghs mentioned this pull request Sep 22, 2025

chore: update the readme with new config variables #579

Merged

1 task
