1️⃣ Dynamic MCP Tool Filtering using Embeddings & 2️⃣ Confident tool calls with BAML #2893
Replies: 17 comments 8 replies
-
Here is what I propose for the MCP specification update to facilitate dynamic tool discovery. Please support the idea here.
-
Another elegant solution would be to allow MCP Clients to utilize the
-
Here is a paper that may be relevant.
-
I think this is relevant: it addresses LLMs' limitations in generating sophisticated long-form outputs. Survey analysis of over 1,400 research papers:
https://github.com/Meirtz/Awesome-Context-Engineering
-
I believe the MCP Client should be able to do RAG over MCP tools to improve quality and scalability.
-
Related: #5963
-
Context rot affects LLM performance: longer input does not guarantee consistent results. 🔍 Chroma researchers tested 18 LLMs on simple tasks.
Research results: https://github.com/chroma-core/context-rot
-
TypeScript implementation of MCP tool semantic search.
-
Reranker with graphs.
-
Suggested embedding model: Codestral-Embed from Mistral, with mxbai-rerank-v2 as a reranker, to improve performance for code, MCP, and tool retrieval.
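For illustration, a minimal sketch of what such a two-stage retrieval pipeline could look like. The `embed()` and `rerank()` callables, the tool dict shape, and the cutoff values are assumptions standing in for whichever embedding model and reranker (e.g. Codestral-Embed and mxbai-rerank-v2) the client is configured with:

```python
import numpy as np

def two_stage_tool_retrieval(query: str, tools: list[dict], embed, rerank,
                             k_candidates: int = 50, k_final: int = 20) -> list[dict]:
    """Stage 1: cheap cosine-similarity search over tool descriptions.
    Stage 2: rerank the shortlist with a stronger (cross-encoder style) model."""
    query_vec = np.asarray(embed(query))
    scored = []
    for tool in tools:
        doc = f'{tool["name"]}: {tool.get("description", "")}'
        doc_vec = np.asarray(embed(doc))
        cosine = float(np.dot(query_vec, doc_vec) /
                       (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)))
        scored.append((cosine, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    candidates = [tool for _, tool in scored[:k_candidates]]

    # The reranker scores raw (query, description) text pairs, not vectors.
    candidates.sort(
        key=lambda t: rerank(query, f'{t["name"]}: {t.get("description", "")}'),
        reverse=True,
    )
    return candidates[:k_final]
```

The cheap vector stage keeps latency low even over large tool catalogs, while the reranker only has to score a short candidate list.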
-
Suggested embedded (in-progress) vector DBs: https://github.com/tursodatabase/turso
-
A few recent papers support the idea of RAG and semantic search for MCP tools. Both papers conducted experiments demonstrating that RAG techniques decrease the number of consumed tokens while increasing the task-completeness score with Vector Search + Reranker. In short, we improve quality while making it cheaper.
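As a rough back-of-the-envelope illustration of the token argument (not taken from either paper), here is a sketch comparing the prompt cost of injecting every tool schema versus only a filtered subset, counted with tiktoken; the tool-gathering and filtering helpers in the comments are hypothetical:

```python
import json
import tiktoken  # used here purely for counting tokens

def tool_prompt_tokens(tools: list[dict], encoding_name: str = "cl100k_base") -> int:
    """Approximate token cost of serializing tool schemas into the prompt."""
    enc = tiktoken.get_encoding(encoding_name)
    return len(enc.encode(json.dumps(tools)))

# Hypothetical comparison: every registered tool vs. an embedding-filtered top 20.
# all_tools = gather_tools_from_all_mcp_servers()               # e.g. 100+ schemas
# relevant  = filter_tools_by_similarity(user_request, all_tools, limit=20)
# print(tool_prompt_tokens(all_tools), "vs", tool_prompt_tokens(relevant))
```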
-
I would appreciate it if you voted for this improvement idea to standardize the MCP Specification with Delegated Advanced Tool Search in my feature comment here.
-
@qdrddr how is this any different? Could you get Dosu to gather the most popular routing options and see what the feature differences are? #6289 (comment)
-
MCP-use implemented a tool semantic search mechanism.
-
MCP-Agent with EmbeddingRouter for semantic tool search and filtering.
-
MCP-Universe, a benchmark of Large Language Models on real-world MCP servers. Key finding: “Token count increases rapidly with interaction steps, often leading to context overflow and degraded performance in multi-step tasks requiring extensive reasoning.” This shows how important MCP tool pre-filtering (before the LLM) is, and, as I was saying, it especially manifests in multi-step tasks.
-
Hi team 👋
First, thank you for the amazing work you're doing, Roo-Code! I've been diving into this ecosystem and wanted to propose a feature that I think would benefit a lot of developers working with large toolchains across multiple MCP servers.
🤖 The Problem
In the LLM-based Cline environment, we've seen that the reliability of tool usage drops significantly once the number of available tools exceeds ~20. This is a known issue: LLMs struggle to reason effectively when presented with too many tool options.
Now consider a real-world setup: several MCP servers enabled at once, each exposing multiple tools, quickly adding up to well over 20 tools in total.
Currently, users are forced to manually disable MCP servers just to stay under the tool limit. It’s not scalable, and it limits the utility of what MCP can offer.
🎯 The Opportunity with Embeddings
Embedding-based filtering could be the ideal first half of the solution to this scaling problem. Imagine dynamically filtering tools using semantic similarity between the user's request and each tool's name and description.
By narrowing the tool list down to the most relevant 20 (or a configurable limit), you:
✅ Keep the toolset within the LLM comfort zone
✅ Increase the accuracy of tool selection and calls
✅ Avoid manual management of tool/server lists
For example, if a user enters:
"Commit all my modified files and log their last modified timestamps before pushing to GitHub."
—Then only 2-3 tools from 2-3 MCP servers should be selected dynamically. There's no need to expose the full universe of 30+ tools.
This would allow MCP to scale elegantly even as more servers and tools come online. Ideally, this would support dynamic registration/deregistration of MCP servers and on-the-fly tool filtering, as in the sketch below.
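A rough sketch of how an MCP client could apply this filtering step before building the prompt, assuming a hypothetical `embed_text()` callable and a simple in-memory scan; a real implementation would plug into the client's existing MCP server connections and an embedding provider of choice:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_relevant_tools(user_request: str,
                          servers: dict[str, list[dict]],
                          embed_text,               # hypothetical embedding callable
                          max_tools: int = 20) -> list[dict]:
    """Gather tools from every connected MCP server, then keep only the
    max_tools most semantically similar to the user's request."""
    request_vec = np.asarray(embed_text(user_request))
    ranked: list[tuple[float, dict]] = []
    for server_name, tools in servers.items():
        for tool in tools:
            doc = f'{server_name}/{tool["name"]}: {tool.get("description", "")}'
            score = cosine(request_vec, np.asarray(embed_text(doc)))
            ranked.append((score, {**tool, "server": server_name}))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in ranked[:max_tools]]

# Example: only this subset gets injected into the LLM's tool list, e.g.
# select_relevant_tools("Commit my modified files and push to GitHub", servers, embed_text)
```

Tool embeddings could be computed once per server and cached, then invalidated when a server registers or deregisters, so only the user request needs embedding at query time.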
2️⃣ Execute tool calls more reliably with BAML
The second important part of the solution is calling tools confidently. BAML (similar in spirit to Pydantic) lets us reliably populate all the required fields and parameters for the tool being called, so the call itself is well-formed before it is executed.
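The exact BAML syntax is in the referenced notebook; as a stand-in for the idea, here is a minimal Pydantic sketch in which the model's proposed arguments are parsed into a typed schema and the tool only runs once every required parameter validates. The `GitCommitArgs` class and its fields are illustrative, not a real MCP tool definition:

```python
from pydantic import BaseModel, ValidationError

class GitCommitArgs(BaseModel):
    """Typed schema for a hypothetical 'git_commit' MCP tool."""
    files: list[str]
    message: str
    push: bool = False

def run_tool_call(raw_arguments: dict) -> str:
    """Validate LLM-proposed arguments before executing anything.
    BAML moves this schema into .baml definitions and has the generated
    client handle parsing and retries, but the principle is the same."""
    try:
        args = GitCommitArgs(**raw_arguments)
    except ValidationError as err:
        # Return the structured error so the model can correct itself
        # instead of executing a malformed call.
        return f"Invalid tool arguments, please retry: {err}"
    return f"Committing {len(args.files)} file(s): {args.message!r} (push={args.push})"
```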
💡 Proposed Enhancements:
📚 References
BAML with MCP tools – example notebook
Large-scale classification with BAML
hello.py – example with Embeddings filtering
pick_best_category.baml
Would love to hear your thoughts on this! It could make the toolchain smarter, lighter, and way more user-friendly.
Thanks so much 🙏
Damein