@Xunzhuo Xunzhuo commented Oct 8, 2025

What type of PR is this?

docs: add NVIDIA Dynamo integration proposal

netlify bot commented Oct 8, 2025

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 5fabc12 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/68e68ca6794cde0008672b95 |
| 😎 Deploy Preview | https://deploy-preview-373--vllm-semantic-router.netlify.app |

github-actions bot commented Oct 8, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 website

Owners: @Xunzhuo
Files changed:

  • website/docs/proposals/nvidia-dynamo-integration.md
  • website/sidebars.ts

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@Xunzhuo Xunzhuo force-pushed the docs/nvidia-dynamo-integration-proposal branch 2 times, most recently from 0364191 to 3761959 Compare October 8, 2025 15:58
@Xunzhuo Xunzhuo force-pushed the docs/nvidia-dynamo-integration-proposal branch from be24cd5 to 5fabc12 Compare October 8, 2025 16:09
dynamics, competitive landscape, and stakeholder interests in your recommendations.
```

#### 2.2.2 Fusion Routing Strategy
Collaborator

Does this integration depend on the prompt classification improvement, or can it be continued by that work later?

Member Author

Nope, I think it is not a blocker here.


Semantic Router implements a **multi-signal fusion routing** approach that combines three complementary routing methods (as detailed in the [Prompt Classification Routing proposal](./prompt-classification-routing.md)):

**1. Keyword-Based Routing (Fast Path)**
Collaborator

This looks like a potential candidate for sub-tasks in the integration.
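
For anyone picking this up as a sub-task, the keyword fast path named in the reviewed line could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposal's implementation: the rule table `KEYWORD_RULES`, the category names, and the function `keyword_route` are all hypothetical, since the actual rules are not shown in this thread. Returning `None` on a miss is one way to let the slower routing signals take over.

```python
from typing import Optional

# Hypothetical keyword rules. The proposal's real rules and its
# category names are not visible in this thread.
KEYWORD_RULES = {
    "math": ("integral", "derivative", "equation"),
    "code": ("python", "stack trace", "compile"),
}


def keyword_route(prompt: str) -> Optional[str]:
    """Return a category on the first keyword hit, or None to defer
    to the other (slower) routing signals."""
    text = prompt.lower()
    for category, keywords in KEYWORD_RULES.items():
        if any(keyword in text for keyword in keywords):
            return category
    return None
```

Plain substring matching keeps the fast path cheap, at the cost of false positives on words that merely contain a keyword; a real rule set would probably need word-boundary handling.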

| Dimension | Semantic Router Alone | Dynamo Router Alone | **Integrated System** |
|-----------|----------------------|---------------------|----------------------|
| **Model Selection** | ✅ Semantic accuracy (14 categories) | ❌ No model awareness | ✅ Best model for task |
| **Worker Selection** | ❌ No worker awareness | ✅ KV cache optimization | ✅ Optimal worker for model |
Collaborator

I think #227 can help with both model and worker selection.

| Dimension | Semantic Router Alone | Dynamo Router Alone | **Integrated System** |
|-----------|----------------------|---------------------|----------------------|
| **Model Selection** | ✅ Semantic accuracy (14 categories) | ❌ No model awareness | ✅ Best model for task |
| **Worker Selection** | ❌ No worker awareness | ✅ KV cache optimization | ✅ Optimal worker for model |
| **Prompt Engineering** | ✅ Domain-aware system prompts | ❌ No prompt optimization | ✅ Optimized CoT & MoE matching |
Collaborator

The system prompt injection could potentially impact the prefix cache; we should also monitor that.
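
To make the concern concrete, here is a small sketch of why an injected system prompt matters for prefix reuse. It assumes that prefix caching can only reuse the leading tokens two requests have in common, and it uses whitespace splitting as a stand-in for a real tokenizer. The helper names (`build_prompt`, `shared_prefix_len`) and the example system prompts are hypothetical and are not taken from the proposal.

```python
from typing import List, Sequence


def build_prompt(system_prompt: str, user_prompt: str) -> List[str]:
    """Prepend the system prompt and split on whitespace.
    Whitespace tokens are a stand-in for real tokenizer output."""
    return (system_prompt + " " + user_prompt).split()


def shared_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the leading tokens two prompts have in common, i.e. the
    portion a prefix cache could reuse under the assumption above."""
    count = 0
    for token_a, token_b in zip(a, b):
        if token_a != token_b:
            break
        count += 1
    return count


# Hypothetical domain-specific system prompts.
MATH_SYSTEM = "Math expert: reason step by step."
CODE_SYSTEM = "Coding expert: write tested code."

same_domain = shared_prefix_len(
    build_prompt(MATH_SYSTEM, "Explain recursion"),
    build_prompt(MATH_SYSTEM, "Explain induction"),
)
cross_domain = shared_prefix_len(
    build_prompt(MATH_SYSTEM, "Explain recursion"),
    build_prompt(CODE_SYSTEM, "Explain recursion"),
)
```

In this toy setup, requests that receive the same injected system prompt share its whole length as a reusable prefix, while the same question routed under two different system prompts shares nothing. A metric along these lines, tracked per category, is one possible way to monitor the effect.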

Collaborator

@rootfs rootfs left a comment

LGTM, left some ideas for GitHub issues.

@rootfs rootfs merged commit ee7ca36 into main Oct 8, 2025
9 checks passed
2 participants