feat: add mcp classification server doc and example embedding based mcp classification server #417

rootfs · 2025-10-13T22:06:28Z

What type of PR is this?

Follow up #368 with example embedding based mcp classification server and documentation of how to extend the in-tree classification with mcp based approach.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Release Notes: Yes/No

…cp classification server Signed-off-by: Huamin Chen <[email protected]>

netlify · 2025-10-13T22:06:33Z

✅ Deploy Preview for vllm-semantic-router ready!

Name	Link
🔨 Latest commit	`c4cdf30`
🔍 Latest deploy log	https://app.netlify.com/projects/vllm-semantic-router/deploys/68ed98efa0f92000080b1184
😎 Deploy Preview	https://deploy-preview-417--vllm-semantic-router.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copilot

Pull Request Overview

This PR implements MCP classification server support for semantic router by adding comprehensive documentation and a production-ready embedding-based classifier example. This enables externalized, dynamic classification logic that can be updated without router restarts.

Key changes:

Complete MCP classification protocol specification with HTTP/stdio transport modes
Overview documentation explaining architecture, benefits, and implementation options
Production-ready embedding classifier using Qwen3-Embedding-0.6B model with FAISS vector search
Training dataset with 95 labeled examples across 5 categories

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
`website/docs/tutorials/mcp-classification/protocol.md`	Comprehensive protocol specification for MCP classification servers
`website/docs/tutorials/mcp-classification/overview.md`	Architecture overview and implementation guidance for MCP classification
`examples/mcp-classifier-server/training_data.csv`	Training dataset with categorized examples for embedding classifier
`examples/mcp-classifier-server/server_embedding.py`	Production embedding-based MCP classifier implementation
`examples/mcp-classifier-server/requirements_embedding.txt`	Dependencies for embedding-based classifier
`examples/mcp-classifier-server/README.md`	Updated documentation comparing regex and embedding implementations

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-13T22:07:33Z

website/docs/tutorials/mcp-classification/protocol.md

+}
+```
+
+#### Response


[nitpick] The JSON structure is embedded as a string within JSON, making it difficult to parse and validate. Consider showing the parsed structure directly or using a cleaner format for better readability.

Copilot · 2025-10-13T22:07:33Z

examples/mcp-classifier-server/server_embedding.py

+    return classifier
+
+
+# Initialize MCP server
+app = Server("embedding-classifier")


[nitpick] The global classifier variable and initialization pattern could lead to issues in multi-threaded environments. Consider using a thread-safe singleton pattern or dependency injection instead of global state.

Copilot · 2025-10-13T22:07:34Z

examples/mcp-classifier-server/server_embedding.py

+            model_name, trust_remote_code=True
+        )
+        self.model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
+        self.device = torch.device("cpu")


The device is hardcoded to CPU, which will significantly impact performance if GPU is available. Consider auto-detecting GPU availability with fallback to CPU.

Suggested change

self.device = torch.device("cpu")

self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Copilot · 2025-10-13T22:07:34Z

examples/mcp-classifier-server/server_embedding.py

+        import asyncio
+
+        while True:
+            await asyncio.sleep(3600)


The asyncio import inside the function creates unnecessary overhead. Move the import to the top of the file with other imports.

Copilot · 2025-10-13T22:07:34Z

examples/mcp-classifier-server/server_embedding.py

+            logger.error(f"Error handling request: {e}", exc_info=True)
+            error = {
+                "jsonrpc": "2.0",
+                "id": request.get("id") if isinstance(request, dict) else None,


The variable request shadows the aiohttp request object. This should reference the JSON data instead of the HTTP request object.

github-actions · 2025-10-13T22:07:40Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `Root Directory`

Owners: @rootfs, @Xunzhuo
Files changed:

examples/mcp-classifier-server/requirements_embedding.txt
examples/mcp-classifier-server/server_embedding.py
examples/mcp-classifier-server/training_data.csv
examples/mcp-classifier-server/README.md

📁 `website`

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

website/docs/tutorials/mcp-classification/overview.md
website/docs/tutorials/mcp-classification/protocol.md

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Signed-off-by: Huamin Chen <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-14T00:34:18Z

website/docs/tutorials/mcp-classification/protocol.md

+## Initialization Sequence
+
+When the semantic router starts:
+


[nitpick] The mermaid diagram may not render correctly if the website doesn't support mermaid syntax. Consider adding a note about mermaid support or providing an alternative representation.

Suggested change

> **Note:** The following diagram uses [Mermaid](https://mermaid-js.github.io/mermaid/) syntax. It will only render if your documentation viewer supports Mermaid diagrams.

>

> **Text alternative:**

> - Router sends `POST /mcp/initialize` to MCP Server

> - MCP Server responds with server capabilities

> - Router sends `POST /mcp/tools/list` to MCP Server

> - MCP Server responds with available tools

> - Router calls `classify_text → list_categories` on MCP Server

> - MCP Server responds with categories and prompts

> - Router is ready

> - Router calls `classify_text → classify_text` on MCP Server

> - MCP Server responds with classification result

Copilot · 2025-10-14T00:34:19Z

examples/mcp-classifier-server/server_embedding.py

+# Initialize classifier globally
+# Note: This is safe for aiohttp as it uses a single-threaded event loop.
+# For multi-process deployments, each process gets its own instance.
+classifier = None
+classifier_device = "auto"  # Default device setting


Global state with lazy initialization can cause race conditions if multiple threads access get_classifier() simultaneously. Consider using threading.Lock or asyncio.Lock for thread-safe initialization.

Copilot · 2025-10-14T00:34:19Z

examples/mcp-classifier-server/server_embedding.py

+            error = {
+                "jsonrpc": "2.0",
+                "id": (
+                    data.get("id")
+                    if "data" in locals() and isinstance(data, dict)
+                    else None
+                ),


The error handling logic checks for 'data' in locals() which may not work as expected in exception handlers where 'data' might not be defined. This could cause a NameError. Consider using a try-except block around data access instead.

Suggested change

error = {

"jsonrpc": "2.0",

"id": (

data.get("id")

if "data" in locals() and isinstance(data, dict)

else None

),

try:

request_id = data.get("id") if isinstance(data, dict) else None

except Exception:

request_id = None

error = {

"jsonrpc": "2.0",

"id": request_id,

Copilot · 2025-10-14T00:34:19Z

examples/mcp-classifier-server/server_embedding.py

+    try:
+        while True:
+            await asyncio.sleep(3600)
+    except KeyboardInterrupt:
+        logger.info("Shutting down server...")


[nitpick] The infinite loop with 1-hour sleep intervals is unnecessarily resource-intensive. Consider using asyncio.Event or a more efficient waiting mechanism for graceful shutdown.

Suggested change

try:

while True:

await asyncio.sleep(3600)

except KeyboardInterrupt:

logger.info("Shutting down server...")

shutdown_event = asyncio.Event()

try:

await shutdown_event.wait()

except KeyboardInterrupt:

logger.info("Shutting down server...")

shutdown_event.set()

Copilot · 2025-10-14T00:34:19Z

examples/mcp-classifier-server/training_data.csv

+general,How do I improve my sleep quality?,Health and wellness
+general,What are some creative hobbies to try?,Arts and crafts
+general,How can I be more environmentally friendly?,Sustainability and lifestyle
+


The CSV file has an empty trailing line which could cause parsing issues in some CSV readers. Consider removing the empty line at the end of the file.

Suggested change

feat: add mcp classification server doc and example embedding based m…

0338893

…cp classification server Signed-off-by: Huamin Chen <[email protected]>

rootfs requested a review from Xunzhuo as a code owner October 13, 2025 22:06

rootfs requested review from Copilot and wangchen615 October 13, 2025 22:06

Copilot AI reviewed Oct 13, 2025

View reviewed changes

github-actions bot assigned rootfs and Xunzhuo Oct 13, 2025

rootfs added 6 commits October 13, 2025 22:08

fix broken link

5a6a4b0

Signed-off-by: Huamin Chen <[email protected]>

fix doc escape error

1c0f22d

Signed-off-by: Huamin Chen <[email protected]>

review feedback

3623f2c

Signed-off-by: Huamin Chen <[email protected]>

fix doc

44afaf0

Signed-off-by: Huamin Chen <[email protected]>

fix doc

af671f1

Signed-off-by: Huamin Chen <[email protected]>

fix doc

c4cdf30

Signed-off-by: Huamin Chen <[email protected]>

rootfs requested a review from Copilot October 14, 2025 00:33

Copilot AI reviewed Oct 14, 2025

View reviewed changes

Xunzhuo approved these changes Oct 14, 2025

View reviewed changes

rootfs merged commit 7d3a9f8 into vllm-project:main Oct 14, 2025
9 checks passed

This was referenced Oct 14, 2025

[Prompt Classification] Implement Out-of-Tree Regex Scanning Provider #370

Closed

[Prompt Classification] Implement Out-of-Tree Embedding Similarity Matching #369

Closed

	self.device = torch.device("cpu")
	self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

+> **Note:** The following diagram uses [Mermaid](https://mermaid-js.github.io/mermaid/) syntax. It will only render if your documentation viewer supports Mermaid diagrams.
+>
+> **Text alternative:**
+> - Router sends `POST /mcp/initialize` to MCP Server
+> - MCP Server responds with server capabilities
+> - Router sends `POST /mcp/tools/list` to MCP Server
+> - MCP Server responds with available tools
+> - Router calls `classify_text → list_categories` on MCP Server
+> - MCP Server responds with categories and prompts
+> - Router is ready
+> - Router calls `classify_text → classify_text` on MCP Server
+> - MCP Server responds with classification result

feat: add mcp classification server doc and example embedding based mcp classification server #417

feat: add mcp classification server doc and example embedding based mcp classification server #417

Uh oh!

Conversation

rootfs commented Oct 13, 2025

Uh oh!

netlify bot commented Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for vllm-semantic-router ready!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

👥 vLLM Semantic Team Notification

📁 Root Directory

📁 website

🎉 Thanks for your contributions!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

netlify bot commented Oct 13, 2025 •

edited

Loading

github-actions bot commented Oct 13, 2025 •

edited

Loading

📁 `Root Directory`

📁 `website`