feat: adds an export tool and exported-data resource MCP-16 #424

Merged
merged 50 commits into from
Aug 11, 2025

Conversation

himanshusinghs
Collaborator

@himanshusinghs himanshusinghs commented Aug 5, 2025

Proposed changes

This PR introduces:

  1. a new export tool that allows users to export MongoDB collection data or query results in JSON format
  2. a templated resource called exported-data:// to access the data exports created by the export tool

The JSON export tool provides two formats for data export:

  1. relaxed: Exports plain JSON, possibly losing some BSON type information. Useful when the data will be consumed by tools other than MongoDB, such as BI tools, or simply discussed in a conversation with an LLM.
  2. canonical: Exports JSON that preserves BSON types. Intended for data that might be imported back into MongoDB.
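
To make the difference between the two formats concrete, here is a hand-rolled sketch that covers only finite numbers, dates, arrays and plain objects, following the shapes described by MongoDB Extended JSON v2. It is not the serializer used by this PR; the name toExtendedJson is hypothetical and exists only for illustration.

```typescript
type Mode = "relaxed" | "canonical";

// Hypothetical helper, for illustration only: converts a small subset of
// values to their Extended JSON v2 shape. Non-finite numbers, dates outside
// 1970-9999, and integers beyond the int32 range are out of scope.
function toExtendedJson(value: unknown, mode: Mode): unknown {
  if (typeof value === "number") {
    if (mode === "relaxed") return value; // plain JSON number, BSON type is lost
    if (!Number.isInteger(value)) return { $numberDouble: String(value) };
    if (value >= -2147483648 && value <= 2147483647) return { $numberInt: String(value) };
    throw new Error("integers beyond int32 are not covered by this sketch");
  }
  if (value instanceof Date) {
    if (mode === "canonical") return { $date: { $numberLong: String(value.getTime()) } };
    // Relaxed dates are ISO-8601 strings; drop ".000" when there are no milliseconds.
    const iso = value.toISOString();
    return { $date: value.getUTCMilliseconds() === 0 ? iso.slice(0, -5) + "Z" : iso };
  }
  if (Array.isArray(value)) return value.map((item) => toExtendedJson(item, mode));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    Object.keys(value).forEach((key) => {
      out[key] = toExtendedJson((value as Record<string, unknown>)[key], mode);
    });
    return out;
  }
  return value; // strings, booleans and null pass through unchanged
}

const doc = { qty: 42, price: 1.5 };
const relaxedJson = JSON.stringify(toExtendedJson(doc, "relaxed"));
// {"qty":42,"price":1.5}
const canonicalJson = JSON.stringify(toExtendedJson(doc, "canonical"));
// {"qty":{"$numberInt":"42"},"price":{"$numberDouble":"1.5"}}
```

The relaxed output reads naturally in BI tools and LLM conversations, while the canonical output keeps enough type information to round-trip the values back into MongoDB.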

The JSON export tool allows exporting:

  1. an entire namespace
  2. query results for a filter (possibly with limit, sort, projection)

The export tool result contains:

  1. a text description of how to access the exported data via its resource URI
  2. a resource_link content part for clients capable of following a resource_link
  3. a local file path, included only when the MCP server runs over the stdio transport, to make it easier for editor clients to open the exported file.
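
As a rough sketch of the three parts listed above (not the tool's actual code), the result could be assembled as follows. The function name buildExportResult, the transport parameter, the message wording and the mimeType value are assumptions made for illustration.

```typescript
// Content part shapes, reduced to the fields this sketch needs.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "resource_link"; name: string; uri: string; mimeType: string };

// Hypothetical helper: builds the content parts of an export tool result.
function buildExportResult(
  exportName: string,
  localPath: string,
  transport: "stdio" | "http",
): ContentPart[] {
  const uri = "exported-data://" + exportName;
  const content: ContentPart[] = [
    // 1. Text telling the model or user where the export can be read from.
    { type: "text", text: "Export \"" + exportName + "\" can be accessed at " + uri },
    // 2. A resource_link for clients that know how to follow one.
    { type: "resource_link", name: exportName, uri, mimeType: "application/json" },
  ];
  // 3. The local path only makes sense when client and server share a machine,
  //    which this sketch assumes is the case for the stdio transport.
  if (transport === "stdio") {
    content.push({ type: "text", text: "Local file path: " + localPath });
  }
  return content;
}

const stdioResult = buildExportResult("movies.json", "/tmp/exports/movies.json", "stdio");
const httpResult = buildExportResult("movies.json", "/tmp/exports/movies.json", "http");
```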

Exported data temporarily resides on the host machine running the mongodb-mcp-server, at the configured path (set with the --exportPath flag or the MDB_MCP_EXPORT_PATH environment variable), until it is cleaned up automatically by either:

  • the autocleaner, which runs at a configured interval in milliseconds (set with the --exportCleanupIntervalMs flag or the MDB_MCP_EXPORT_CLEANUP_INTERVAL_MS environment variable)
  • the client disconnecting

The expiry of exported data is controlled by the --exportTimeoutMs flag or the MDB_MCP_EXPORT_TIMEOUT_MS environment variable.
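
The interaction between the two settings can be sketched like this: the cleanup interval decides how often a sweep runs, and the timeout decides which exports a sweep removes. The ExportRecord shape and the findExpired function are hypothetical, and the comparison operator is an assumption; only the two setting names come from this PR.

```typescript
// Hypothetical bookkeeping record for one export; createdAt is epoch milliseconds.
interface ExportRecord {
  name: string;
  createdAt: number;
}

// Returns the names of exports whose age has reached exportTimeoutMs.
// A sweep like this would run once every exportCleanupIntervalMs.
function findExpired(exports: ExportRecord[], now: number, exportTimeoutMs: number): string[] {
  return exports
    .filter((record) => now - record.createdAt >= exportTimeoutMs)
    .map((record) => record.name);
}

const records: ExportRecord[] = [
  { name: "old.json", createdAt: 0 },
  { name: "fresh.json", createdAt: 250_000 },
];
// With a 5 minute timeout, only the export created at t=0 has expired at t=300000.
const expiredNames = findExpired(records, 300_000, 300_000);
```

Under this model an export can outlive its timeout by up to one cleanup interval, because it is only removed by the first sweep that runs after it expires.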

The exported data can be accessed via the resource URI template exported-data://{exportName}. Because exports are written to non-guessable file names, a completion (autocomplete) API is provided to help find the matching export names.
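
A minimal sketch of what template completion amounts to, assuming simple prefix matching: given the names of the exports in the session and what the user has typed so far, return the candidates. Both helper names and the example file names are made up for illustration.

```typescript
const EXPORT_URI_PREFIX = "exported-data://";

// Hypothetical completion helper: prefix-matches the typed text against known exports.
function completeExportName(availableNames: string[], typed: string): string[] {
  return availableNames.filter((name) => name.startsWith(typed)).sort();
}

// Hypothetical parser: extracts {exportName} from a concrete exported-data:// URI.
function exportNameFromUri(uri: string): string | undefined {
  if (!uri.startsWith(EXPORT_URI_PREFIX) || uri.length === EXPORT_URI_PREFIX.length) {
    return undefined;
  }
  return uri.slice(EXPORT_URI_PREFIX.length);
}

// Made-up export names; real names are generated by the server.
const knownExports = ["shop.orders.a1b2c3.json", "mflix.movies.d4e5f6.json", "mflix.users.0718fa.json"];
const suggestions = completeExportName(knownExports, "mflix.");
// ["mflix.movies.d4e5f6.json", "mflix.users.0718fa.json"]
```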

Checklist

  • JSON export tool should allow exporting BI friendly data
  • JSON export tool should allow exporting data with BSON types
  • JSON export tool should allow exporting entire namespace
  • JSON export tool should allow exporting query results
  • Exported data should be available for limited time and automatically cleaned up during a session and on session disconnect
  • Exported data should be made available via resources for stable and re-usable reference
  • Exported data should optionally be provided via local path when MCP server is connected to editor clients
  • Export tool covered by test cases
  • Exported resource covered by test cases
  • Export cleanup and generation should be covered by test cases

@himanshusinghs himanshusinghs force-pushed the feat/MCP-16-json-export-tool branch from 61e7064 to e359184 Compare August 5, 2025 13:46
@himanshusinghs himanshusinghs marked this pull request as ready for review August 5, 2025 17:47
@Copilot Copilot AI review requested due to automatic review settings August 5, 2025 17:47
@himanshusinghs himanshusinghs requested a review from a team as a code owner August 5, 2025 17:47
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive JSON export functionality for MongoDB collections, allowing users to export data in both relaxed and canonical JSON formats, with automatic cleanup and resource access via MCP resource templates.

Key changes:

  • Added a new export tool that supports filtering, sorting, limiting, and projecting MongoDB collection data
  • Implemented an exported-data:// resource template for stable access to exported files
  • Created automatic cleanup mechanisms with configurable timeouts and intervals

Reviewed Changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| src/common/sessionExportsManager.ts | Core export management with file operations, cleanup, and EJSON formatting |
| src/tools/mongodb/read/export.ts | Export tool implementation with MongoDB query support |
| src/resources/common/exportedData.ts | Resource template for accessing exported data with autocomplete |
| tests/unit/common/sessionExportsManager.test.ts | Comprehensive unit tests for export manager functionality |
| tests/integration/tools/mongodb/read/export.test.ts | Integration tests for export tool with various query scenarios |
| tests/integration/resources/exportedData.test.ts | Resource template integration tests |
Comments suppressed due to low confidence (1)

src/common/sessionExportsManager.ts:131

  • The closing bracket is written directly to the output stream here, but the Transform stream's final callback on line 184 also pushes a closing bracket. This will create invalid JSON with duplicate closing brackets.
                    outputStream.write("]\n");
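
The underlying rule is that exactly one place should own the closing bracket. A minimal sketch of that idea, using a plain class whose transform/flush methods mirror the hooks of a Node.js Transform stream (the class itself is hypothetical and is not the code in this PR):

```typescript
// Frames a sequence of documents as a JSON array, one chunk per document.
// flush() is the only place that emits the closing bracket.
class JsonArrayFramer {
  private wroteAny = false;

  // Called once per document; emits "[" before the first one and "," before the rest.
  transform(doc: unknown): string {
    const prefix = this.wroteAny ? ",\n" : "[";
    this.wroteAny = true;
    return prefix + JSON.stringify(doc);
  }

  // Called once at end of input; also handles the empty-input case.
  flush(): string {
    return this.wroteAny ? "]\n" : "[]\n";
  }
}

const framer = new JsonArrayFramer();
const chunks = [{ a: 1 }, { a: 2 }].map((doc) => framer.transform(doc));
chunks.push(framer.flush());
const framed = chunks.join("");
// framed parses as a two-element array; nothing else writes "]".
```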

Contributor

github-actions bot commented Aug 6, 2025

📊 Accuracy Test Results

📈 Summary

| Metric | Value |
| --- | --- |
| Commit SHA | d683919a23b1f507815d4908e14ee123f6dd90ea |
| Run ID | 1519ed72-a0df-4e44-bff0-68b22a013d81 |
| Status | done |
| Total Prompts Evaluated | 55 |
| Models Tested | 1 |
| Average Accuracy | 98.6% |
| Responses with 0% Accuracy | 0 |
| Responses with 75% Accuracy | 3 |
| Responses with 100% Accuracy | 52 |

📊 Baseline Comparison

| Metric | Value |
| --- | --- |
| Baseline Commit | a269053ba0d973ac79ecbeaf0177220cb7d06b06 |
| Baseline Run ID | f05769f8-177e-4150-b1a4-390757a692ea |
| Baseline Run Status | done |
| Responses Improved | 0 |
| Responses Regressed | 0 |

📎 Download Full HTML Report - Look for the accuracy-test-summary artifact for detailed results.

Report generated on: 8/6/2025, 11:28:49 AM

@himanshusinghs himanshusinghs force-pushed the feat/MCP-16-json-export-tool branch from 6109cea to e89cf23 Compare August 6, 2025 11:43
@coveralls
Collaborator

coveralls commented Aug 6, 2025

Pull Request Test Coverage Report for Build 16876636630

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 592 of 659 (89.83%) changed or added relevant lines in 13 files are covered.
  • 18 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.6%) to 81.999%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| src/common/config.ts | 12 | 13 | 92.31% |
| src/tools/mongodb/read/export.ts | 77 | 78 | 98.72% |
| src/server.ts | 24 | 31 | 77.42% |
| src/common/exportsManager.ts | 351 | 378 | 92.86% |
| src/resources/common/exportedData.ts | 62 | 93 | 66.67% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| src/common/session.ts | 6 | 85.71% |
| src/resources/resource.ts | 6 | 77.27% |
| src/tools/mongodb/mongodbTool.ts | 6 | 85.8% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 16826971263: | 0.6% |
| Covered Lines: | 4072 |
| Relevant Lines: | 4919 |

💛 - Coveralls

Collaborator

@nirinchev nirinchev left a comment

Haven't reviewed it thoroughly, but leaving some comments for things that should be reworked. Happy to grab some time later today or tomorrow to go through the high-level architecture and offer some suggestions there.

@himanshusinghs himanshusinghs force-pushed the feat/MCP-16-json-export-tool branch from e89cf23 to e8845bc Compare August 6, 2025 14:40
export class ExportedData {
    private readonly name = "exported-data";
    private readonly description = "Data files exported in the current session.";
    private readonly uri = "exported-data://{exportName}";
Collaborator

@gagik gagik Aug 7, 2025

maybe mongodb-exported-data? The namespace is too generic otherwise (similar argument can be made about our tool names in general)

Collaborator Author

Yeah, this makes sense. I would personally like us to do it at least for all the resources, if not for everything. I will ask whether anyone objects to me making this a blanket change.

Collaborator

My understanding is the resource uris are already scoped to the MCP server, so it's not like they could conflict between different servers.

Collaborator Author

I think the scoping is controlled by clients, and each might implement it differently. It should be fine if we make our resources/tools more specific.

Collaborator

It is controlled by clients, but they need to be able to associate resources with servers - e.g. even if there's a conflict between schemes, there's no way that we could serve exported-data://name if it comes from a different MCP server.

@himanshusinghs himanshusinghs force-pushed the feat/MCP-16-json-export-tool branch from b76b8cb to a3f9e8e Compare August 7, 2025 08:45
Collaborator

@nirinchev nirinchev left a comment

More comments from me - haven't gone through the entire thing yet, but made it through the exports manager and will continue after a few meetings.

@himanshusinghs himanshusinghs force-pushed the feat/MCP-16-json-export-tool branch from 5e5698b to dc457ae Compare August 7, 2025 17:03
@himanshusinghs himanshusinghs force-pushed the feat/MCP-16-json-export-tool branch 2 times, most recently from 781ee47 to 6752099 Compare August 8, 2025 10:11
@himanshusinghs himanshusinghs force-pushed the feat/MCP-16-json-export-tool branch from 6752099 to d3d81d6 Compare August 8, 2025 10:11
1. outputStream.write moved to within the Transform.flush
2. won't send resource updated notification on export-expired event or
   it might trigger client to fetch expired exports.
3. added ObjectId to the file names to make them unique
Collaborator

@nirinchev nirinchev left a comment

Some more comments from me - it's almost ready

Comment on lines 312 to 317
static init(sessionId: string, config: ExportsManagerConfig, logger: LoggerBase): ExportsManager {
    const exportsDirectoryPath = path.join(config.exportsPath, sessionId);
    const exportsManager = new ExportsManager(exportsDirectoryPath, config, logger);
    exportsManager.init();
    return exportsManager;
}
Collaborator

Don't have super strong feelings here, but I feel this pattern is not super idiomatic. I guess you went that route to avoid starting the cleanup in the ctor, but an alternative approach would be to move setting up the interval to the first time an export is requested. That would give us two benefits: first, we'd be able to use the ctor like normal, and second, we wouldn't be running the cleanup logic if the user never requested an export. So it would look something like:

constructor(...) {
  // set the class-level variables as usual
}

createJSONExport(...) {
  if (!this.exportsCleanupInterval) {
    this.exportsCleanupInterval = setInterval(...);
  }
}

We can also get rid of the wasInitialized field since it's pretty much just returning whether exportsCleanupInterval is undefined or not.

Collaborator Author

While the suggestion looks really good, the only reason I wouldn't wanna do that is because we'll be triggering an unrelated side-effect in createJSONExport.

As per my understanding, offloading complex initialization to dedicated methods is pretty common. And I offloaded it to a static method only because otherwise it would have been a pain to do this in every test:

const exportsManager = new ExportsManager(...)
exportsManager.init();

Collaborator

I wouldn't say the side-effect is unrelated - createJSONExport is the entrypoint for creating files on disk. So, initializing the cleanup loop only once we have files to clean up does make sense to me.

Collaborator Author

Even then, I don't think createJSONExport should concern itself with starting the cleanup loop as well; that's not what it's meant for.

Collaborator

As a counterargument, what's the point of having a cleanup loop if you know there's nothing to clean up? Not a hill I'm willing to die on, so happy to keep it as-is, but an alternative design would be to use setTimeout instead of a cleanup loop - that is, every time we create a JSON export, we set up a cleanup callback to delete it after x minutes. In that case, it would very much be createJSONExport that sets that up, right? While the cleanup mechanism is different, I would say the notion of triggering/scheduling/setting up a garbage collection procedure as a side effect of creating "garbage" would not be surprising to me.
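
For reference, the setTimeout-based alternative described above could look roughly like the sketch below. This is not what the PR implements; the class name, the injectable scheduler and the onExpire callback are all hypothetical, and the scheduler is injectable only so the behaviour can be exercised without waiting on real timers.

```typescript
type Cancel = () => void;
type Scheduler = (callback: () => void, delayMs: number) => Cancel;

// Default scheduler backed by the real setTimeout/clearTimeout.
const timeoutScheduler: Scheduler = (callback, delayMs) => {
  const handle = setTimeout(callback, delayMs);
  return () => clearTimeout(handle);
};

// Hypothetical manager: each export schedules its own deletion when it is created,
// so no timer exists until there is something to clean up.
class TimeoutBasedExports {
  private readonly pending = new Map<string, Cancel>();

  constructor(
    private readonly exportTimeoutMs: number,
    private readonly onExpire: (exportName: string) => void,
    private readonly schedule: Scheduler = timeoutScheduler,
  ) {}

  createExport(exportName: string): void {
    // ...writing the export file would happen here...
    const cancel = this.schedule(() => {
      this.pending.delete(exportName);
      this.onExpire(exportName); // e.g. delete the file from disk
    }, this.exportTimeoutMs);
    this.pending.set(exportName, cancel);
  }

  // On session close, cancel outstanding timers so nothing fires afterwards.
  close(): void {
    this.pending.forEach((cancel) => cancel());
    this.pending.clear();
  }

  get activeExports(): string[] {
    return Array.from(this.pending.keys());
  }
}
```

The trade-off against the interval loop is one timer per export instead of one timer per session.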

{
    type: "resource_link",
    name: exportName,
    uri: exportURI,
Collaborator

I'm not sure how this behaves, but as things stand, we'll be providing a uri that's unavailable in the response. Does this confuse models at all or are they correctly handling it?

Collaborator Author

In my tests, I did not see any "confusing" responses from the models. They correctly interpreted the response and the "in-progress" part of it. Yes, the response_uri is available, which means users (through the client) can decide to access it, but even that should be fine because we always respond with a "still in progress" notification.

Collaborator

Perhaps I'm misunderstanding the logic here, but looking at ExportsManager.availableExports, it seems like we're filtering out the in-progress exports, which means that the exportURI we supply here would not resolve to a valid resource until the export is completed.

Collaborator Author

@himanshusinghs himanshusinghs Aug 8, 2025

The filtering is only for the autocompletion of export names. Autocomplete is triggered when someone chooses the templated URI and then starts typing the name of the export. But if someone requests the resource directly by providing the exportName / exportURI, then the readResourceCallback is triggered, and it returns a "Resource is still being generated" error content.
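
The distinction can be sketched as two separate code paths over the same bookkeeping: completion only offers finished exports, while a direct read of an in-progress export returns an error message instead of a "not found". The types and function names below are hypothetical; only the error text is taken from this thread.

```typescript
type ExportStatus = "in-progress" | "ready";

interface ExportEntry {
  name: string;
  status: ExportStatus;
}

// Completion path: in-progress exports are hidden from suggestions.
function completableNames(entries: ExportEntry[]): string[] {
  return entries.filter((entry) => entry.status === "ready").map((entry) => entry.name);
}

// Read path: the export is looked up by name regardless of status.
function readExport(entries: ExportEntry[], exportName: string): { isError: boolean; text: string } {
  const entry = entries.find((candidate) => candidate.name === exportName);
  if (!entry) {
    return { isError: true, text: "No export named \"" + exportName + "\" in this session" };
  }
  if (entry.status === "in-progress") {
    return { isError: true, text: "Resource is still being generated" };
  }
  // A real implementation would return the file contents here.
  return { isError: false, text: "<contents of " + entry.name + ">" };
}

const sessionExports: ExportEntry[] = [
  { name: "done.json", status: "ready" },
  { name: "running.json", status: "in-progress" },
];
```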

Collaborator

@nirinchev nirinchev left a comment

I think for the most part, but we may need to rework the last commit slightly to either replace the dependency or reimplement it in another way.

@kmruiz kmruiz force-pushed the feat/MCP-16-json-export-tool branch from 802bd8e to 78a9044 Compare August 11, 2025 10:19
RWLocks are not necessary here because sessions are single user and
we don't want the agent to wait until a resource is available, as it
can take forever depending on the data set.
@kmruiz kmruiz force-pushed the feat/MCP-16-json-export-tool branch from 78a9044 to 1039f4e Compare August 11, 2025 10:21
@kmruiz kmruiz enabled auto-merge (squash) August 11, 2025 10:39
@kmruiz kmruiz merged commit 974fa36 into main Aug 11, 2025
17 checks passed
@kmruiz kmruiz deleted the feat/MCP-16-json-export-tool branch August 11, 2025 10:50