feat: support /v1/models in direct response #283
Conversation
@Xunzhuo can you review this solution too?
This needs additional xDS config generation, which is not standard for InferencePool implementations, so we use a direct response instead. Otherwise, every Gateway implementation would need to inject one more cluster to support our models API.
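To illustrate the direct-response approach described above, here is a minimal sketch of building an OpenAI-compatible `/v1/models` response body that a gateway could return directly, without routing to an upstream cluster. The function and type names (`modelsResponseBody`, `Model`, `ModelList`) and the `owned_by` value are hypothetical, not taken from this PR's actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Model mirrors one entry in the OpenAI /v1/models list format.
type Model struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	Created int64  `json:"created"`
	OwnedBy string `json:"owned_by"`
}

// ModelList is the top-level /v1/models response shape.
type ModelList struct {
	Object string  `json:"object"`
	Data   []Model `json:"data"`
}

// modelsResponseBody builds the JSON body the gateway can serve as a
// direct response to GET /v1/models, listing the configured model names.
func modelsResponseBody(names []string, created int64) (string, error) {
	list := ModelList{Object: "list", Data: []Model{}}
	for _, n := range names {
		list.Data = append(list.Data, Model{
			ID:      n,
			Object:  "model",
			Created: created,
			OwnedBy: "vllm-semantic-router", // hypothetical owner label
		})
	}
	b, err := json.Marshal(list)
	return string(b), err
}

func main() {
	body, err := modelsResponseBody([]string{"auto"}, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Serving this body directly (for example, via an ext_proc immediate response or an Envoy `direct_response` route) avoids requiring each Gateway implementation to generate an extra xDS cluster just for the models endpoint.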
@Xunzhuo sounds good. Can you run pre-commit? It is ready to go.
Sounds good to me |
Tested; moving this forward.
Signed-off-by: bitliu <[email protected]> Signed-off-by: liuhy <[email protected]>

What type of PR is this?
feat: support /v1/models in direct response