feat: Add comprehensive monitoring metrics for batch classification API #58
Signed-off-by: OneZero-Y <[email protected]>
@OneZero-Y nicely done! For UX, would it be OK to hardcode a default for the batch size ranges instead of exposing them in the config?
batch_size_ranges:
- {min: 1, max: 1, label: "1"}
- {min: 2, max: 5, label: "2-5"}
- {min: 6, max: 10, label: "6-10"}
- {min: 11, max: 20, label: "11-20"}
- {min: 21, max: 50, label: "21-50"}
- {min: 51, max: -1, label: "50+"}
@rootfs Thank you for your recognition.
…coded defaults Signed-off-by: OneZero-Y <[email protected]>
Let's hardcode it first. You can add a comment there explaining this.
The instructions are in configuration.md. The batch_size_ranges configuration has been removed from config.yaml.
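For readers following the thread, here is a minimal sketch of what hardcoded batch size ranges could look like, written in Python for brevity. It is not the code from this PR: the names `BatchSizeRange`, `DEFAULT_BATCH_SIZE_RANGES`, and `batch_size_label` are hypothetical, and the `"unknown"` fallback is an assumption. The ranges and labels are copied from the comment above, with `max: -1` treated as "no upper bound".

```python
from typing import NamedTuple


class BatchSizeRange(NamedTuple):
    """One bucket for grouping batch sizes under a metric label (hypothetical type)."""
    min: int
    max: int  # -1 means no upper bound
    label: str


# Defaults copied from the review comment; labels are kept exactly as written there.
DEFAULT_BATCH_SIZE_RANGES = (
    BatchSizeRange(1, 1, "1"),
    BatchSizeRange(2, 5, "2-5"),
    BatchSizeRange(6, 10, "6-10"),
    BatchSizeRange(11, 20, "11-20"),
    BatchSizeRange(21, 50, "21-50"),
    BatchSizeRange(51, -1, "50+"),
)


def batch_size_label(size: int) -> str:
    """Return the label of the first range containing `size`.

    Falls back to "unknown" (an assumed convention) when no range matches,
    for example for a size of 0 or a negative size.
    """
    for r in DEFAULT_BATCH_SIZE_RANGES:
        if size >= r.min and (r.max == -1 or size <= r.max):
            return r.label
    return "unknown"
```

Keeping the ranges in one constant means the bucket boundaries are documented in a single place, which is where the explanatory comment requested above would naturally go.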
What type of PR is this?
feature
What this PR does / why we need it:
This PR adds comprehensive Prometheus monitoring metrics for the batch classification API to provide detailed performance monitoring and error tracking capabilities.
Key Features:
Configuration Example:
Metrics Query Examples: