zerofishnoodles
Contributor

What type of PR is this?
docs: Add integration proposal for PS and SR

What this PR does / why we need it:
Adds an integration proposal for vLLM Production Stack (PS) and vLLM Semantic Router (SR).

Which issue(s) this PR fixes:

Related to #295

Release Notes: No

github-actions bot commented Oct 13, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 website

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

  • website/docs/proposals/production-stack-integration.md
  • website/sidebars.ts

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

netlify bot commented Oct 13, 2025

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 812758d |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/68edec026e13930008630810 |
| 😎 Deploy Preview | https://deploy-preview-418--vllm-semantic-router.netlify.app |

**Tasks:**

1. **Distributed Tracing (OpenTelemetry):**
Collaborator

OTel tracing is already enabled in vsr, and an e2e tracing demo is in #329. If we can get tracing across app -> envoy -> vsr -> PS -> vllm, that'll be great.

Member

this is cool
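
For readers following this thread, here is a minimal sketch of what tracing across the hop chain named above (app -> envoy -> vsr -> PS -> vllm) relies on: every hop forwards the same trace id and mints its own span id, using the W3C Trace Context `traceparent` header format. This is an illustration only, not how vsr or PS implement tracing; a real deployment would use the OpenTelemetry SDK's propagators. The function names `new_traceparent`, `child_traceparent`, and `propagate` are hypothetical.

```python
import re
import secrets

# W3C Trace Context header: version-traceid-spanid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent() -> str:
    """Start a new trace at the first hop (hypothetical helper)."""
    trace_id = secrets.token_hex(16)  # 16 bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 bytes -> 16 hex characters
    return f"00-{trace_id}-{span_id}-01"  # flags 01 = sampled


def child_traceparent(parent: str) -> str:
    """Keep the incoming trace id and mint a new span id for this hop."""
    match = TRACEPARENT_RE.match(parent)
    if match is None:
        # Missing or malformed header: start a fresh trace instead of failing.
        return new_traceparent()
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def propagate(hops):
    """Return (hop, traceparent) pairs as the header travels down the chain."""
    header = new_traceparent()
    chain = [(hops[0], header)]
    for hop in hops[1:]:
        header = child_traceparent(header)
        chain.append((hop, header))
    return chain


if __name__ == "__main__":
    # Hop names are taken from the review comment above.
    for hop, header in propagate(["app", "envoy", "vsr", "PS", "vllm"]):
        print(f"{hop:>5}: {header}")
```

If any hop drops the header, the downstream spans land in a separate trace, which is why the full chain has to participate for an end-to-end view.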

rootfs previously approved these changes Oct 14, 2025
Signed-off-by: Rui Zhang <[email protected]>
Contributor

Copilot AI left a comment

Pull Request Overview

This proposal document outlines the integration strategy between vLLM Semantic Router and vLLM Production Stack to create a unified inference system with semantic intelligence and infrastructure optimization capabilities.

  • Introduces a comprehensive integration framework combining semantic routing capabilities with production-scale infrastructure
  • Details a four-layer system architecture with gateway, semantic intelligence, infrastructure optimization, and execution layers
  • Proposes a phased implementation plan with foundation, observability, and production hardening stages


#### Semantic Router – Request Intelligence Layer

* Understands the user’s intent via multi‑signal classification, combining keyword matching, embedding similarity and classification.
Copilot AI Oct 14, 2025

The phrase 'embedding similarity and classification' is redundant since classification is mentioned twice. Consider revising to 'combining keyword matching, embedding similarity, and LLM-based classification' or similar to clarify the different classification methods.

Suggested change
* Understands the user’s intent via multi‑signal classification, combining keyword matching, embedding similarity and classification.
* Understands the user’s intent via multi‑signal classification, combining keyword matching, embedding similarity, and LLM-based classification.
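
To make the three signals in the suggested wording concrete, here is a minimal sketch of combining keyword matching, embedding similarity, and a classifier score into one per-category score. It is an assumption-laden illustration, not the Semantic Router's implementation: the `Category` type, the weights, the toy 2-D embeddings, and the `classifier_probs` input (a stand-in for whatever model produces class probabilities) are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """Hypothetical category definition used only for this sketch."""
    name: str
    keywords: frozenset
    centroid: tuple  # toy embedding representing the category


def keyword_signal(text: str, keywords: frozenset) -> float:
    """1.0 if any category keyword appears as a token in the text, else 0.0."""
    tokens = set(text.lower().split())
    return 1.0 if tokens & keywords else 0.0


def embedding_signal(query_vec: tuple, centroid: tuple) -> float:
    """Cosine similarity between the query embedding and the category centroid."""
    dot = sum(q * c for q, c in zip(query_vec, centroid))
    norm = math.sqrt(sum(q * q for q in query_vec)) * math.sqrt(sum(c * c for c in centroid))
    return dot / norm if norm else 0.0


def classify(text, query_vec, classifier_probs, categories, weights=(0.2, 0.3, 0.5)):
    """Weighted sum of the three signals; the weights are arbitrary assumptions."""
    w_kw, w_emb, w_clf = weights
    scores = {
        cat.name: (
            w_kw * keyword_signal(text, cat.keywords)
            + w_emb * embedding_signal(query_vec, cat.centroid)
            + w_clf * classifier_probs.get(cat.name, 0.0)
        )
        for cat in categories
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    categories = [
        Category("math", frozenset({"integral", "derivative"}), (1.0, 0.0)),
        Category("code", frozenset({"python", "function"}), (0.0, 1.0)),
    ]
    best, scores = classify(
        "compute the integral of x squared",
        query_vec=(0.9, 0.1),                         # toy query embedding
        classifier_probs={"math": 0.8, "code": 0.2},  # stand-in model output
        categories=categories,
    )
    print(best, scores)
```

A weighted sum is only one way to fuse the signals; a rule cascade or a learned combiner would fit the same description.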

Member

Xunzhuo commented Oct 14, 2025

@zerofishnoodles can you add this doc to the sidebar?

Signed-off-by: Rui Zhang <[email protected]>
Signed-off-by: Rui Zhang <[email protected]>
Member

Xunzhuo left a comment

🚀🚀🚀

Xunzhuo merged commit 8165a9c into vllm-project:main on Oct 14, 2025
8 checks passed
samzong pushed a commit to samzong/semantic-router that referenced this pull request Oct 14, 2025
* doc: add integration proposal for PS and SR

Signed-off-by: Rui Zhang <[email protected]>

* change codespell

Signed-off-by: Rui Zhang <[email protected]>

* Add sidebar

Signed-off-by: Rui Zhang <[email protected]>

* modify

Signed-off-by: Rui Zhang <[email protected]>

---------

Signed-off-by: Rui Zhang <[email protected]>
Signed-off-by: samzong <[email protected]>
3 participants