feat: add Grafana+Prometheus in k8s #294
Signed-off-by: JaredforReal <[email protected]>
✅ Deploy Preview for vllm-semantic-router ready!
More network troubleshooting will be updated in |
@@ -0,0 +1,203 @@
# Semantic Router Observability on Kubernetes

This guide adds a production-ready Prometheus + Grafana stack to the existing Semantic Router Kubernetes deployment. It includes manifests for collectors, dashboards, data sources, RBAC, and ingress so you can monitor routing performance in any cluster.
Can we move this to the tutorials section of the website, https://vllm-semantic-router.com/docs/tutorials/observability/ ?
Sure, scheduled.
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
@@ -0,0 +1,652 @@
apiVersion: v1
Is this the same dashboard as in https://github.com/vllm-project/semantic-router/tree/main/config/grafana? Is there a way we can consolidate them in a follow-up PR?
We could consolidate them. It would add some complexity (e.g. conditional processing with Kustomize overlays), but overall it would make maintenance easier.
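As a rough illustration of the consolidation idea, Kustomize's `configMapGenerator` can build the dashboard ConfigMap from a single JSON file instead of embedding the JSON in a hand-written manifest. This is only a sketch: the ConfigMap name, the namespace, and the assumption that the dashboard JSON sits next to `kustomization.yaml` are hypothetical and not taken from this PR.

```yaml
# kustomization.yaml (hypothetical sketch; names and layout are assumptions)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Assumed namespace for the observability stack.
namespace: observability

configMapGenerator:
  # Generates a ConfigMap whose single key is the file name and whose
  # value is the dashboard JSON, so the JSON has one source of truth.
  - name: grafana-dashboards
    files:
      - llm-router-dashboard.json

generatorOptions:
  # Keep a stable ConfigMap name so a Deployment can reference it directly.
  disableNameSuffixHash: true
```

By default Kustomize refuses to load files from outside the kustomization root, so sharing one dashboard file between two directories would need either a common base that both overlays reference or a relaxed load restrictor. That is part of the added complexity mentioned above.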
@@ -0,0 +1,30 @@
apiVersion: v1
same issue as above
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 472
is 472 a magic number?
As seen in lines 20 and 21 in /prometheus/deployment.yaml, Prometheus runs as UID/GID 65534 (nobody), while Grafana runs as UID/GID 472 (a dedicated Grafana user) to avoid running processes as root and to enhance container security, particularly when mounting volumes. Maybe I can add a short comment here.
got it, thanks! Adding a comment is helpful.
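A commented version of the Grafana pod security context might look like the sketch below. `runAsNonRoot` and `runAsUser: 472` come from the diff under review; `runAsGroup` and `fsGroup` are assumptions added here to illustrate the volume-mounting point and may not match the final manifest.

```yaml
# Fragment of a Deployment pod template (sketch, not the merged manifest)
spec:
  securityContext:
    # Refuse to start the container if the image would run as root.
    runAsNonRoot: true
    # 472 is the dedicated Grafana user, not an arbitrary value.
    # Prometheus uses 65534 (nobody) in its own deployment.
    runAsUser: 472
    # Assumption: match the group to the same dedicated Grafana ID.
    runAsGroup: 472
    # Assumption: make mounted volumes group-owned by 472 so that
    # Grafana can write to its data directory.
    fsGroup: 472
```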
@fcanogab can you take a look? Thanks
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
LGTM. Thanks.
* feat: add Grafana+Prometheus in k8s
  Signed-off-by: JaredforReal <[email protected]>
* Update docs of observability k8s part
  Signed-off-by: JaredforReal <[email protected]>
* get rid of redundant part in doc
  Signed-off-by: JaredforReal <[email protected]>
* add comments of 472 and 65534
  Signed-off-by: JaredforReal <[email protected]>
* add network tips of k8s
  Signed-off-by: JaredforReal <[email protected]>
* update uid in dashboard
  Signed-off-by: JaredforReal <[email protected]>

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Signed-off-by: liuhy <[email protected]>

What type of PR is this?
feat: add Grafana+Prometheus in k8s
What this PR does / why we need it:
- Add deployment guide in `deploy/kubernetes/observability/README.md`.
- Add config and manifest of Grafana in `deploy/kubernetes/observability/grafana/`
- Add config and manifest of Prometheus in `deploy/kubernetes/observability/prometheus/`
- Add deployment config in `deploy/kubernetes/observability/kustomization.yaml`
- Add optional HTTPS ingress examples in `deploy/kubernetes/observability/ingress.yaml`
- Update docs in `website/docs/tutorial/observability/observability.md`
- Update the random UID to "prometheus" in `deploy/kubernetes/llm-router-dashboard.json`

Which issue(s) this PR fixes:
Fixes #279
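The change from a random UID to "prometheus" matters because dashboard panels refer to their data source by UID. A fixed UID lets a provisioned data source and the dashboard JSON agree without manual re-linking. A minimal Grafana data source provisioning file along these lines might look like the sketch below; the data source name and the in-cluster URL are assumptions, not values taken from this PR.

```yaml
# Grafana data source provisioning file (sketch; name and URL are assumptions)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    # Must match the data source UID referenced in llm-router-dashboard.json.
    uid: prometheus
    # Grafana's backend proxies queries to Prometheus.
    access: proxy
    # Assumed in-cluster Service name and port for Prometheus.
    url: http://prometheus:9090
    isDefault: true
```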