JaredforReal (Collaborator) commented Sep 30, 2025

What type of PR is this?
feat: add Grafana+Prometheus in k8s

What this PR does / why we need it:
Add deployment guide in deploy/kubernetes/observability/README.md.
Add config and manifest of Grafana in deploy/kubernetes/observability/grafana/
Add config and manifest of Prometheus in deploy/kubernetes/observability/prometheus/
Add deployment config in deploy/kubernetes/observability/kustomization.yaml
Add optional HTTPS ingress examples in deploy/kubernetes/observability/ingress.yaml
Update docs in website/docs/tutorials/observability/observability.md
Replace the random data source UID with "prometheus" in deploy/llm-router-dashboard.json
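
For context on the last item: a Grafana dashboard panel refers to its data source by UID, so a fixed UID lets the provisioned data source and the dashboard JSON agree in any cluster. A minimal sketch of a data source provisioning file follows. The Service name `prometheus` and port 9090 are assumptions for illustration and are not taken from this PR's manifests.

```yaml
# Grafana data source provisioning (sketch, not the merged file).
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    # Fixed UID that the dashboard JSON can reference instead of a random one.
    uid: prometheus
    access: proxy
    # Assumed in-cluster address; adjust to the actual Service name and port.
    url: http://prometheus:9090
    isDefault: true
```

With this in place, a panel's `datasource` entry with `"uid": "prometheus"` would resolve to the provisioned data source.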

Which issue(s) this PR fixes:
Fixes #279

netlify bot commented Sep 30, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 790c919
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68dcac79d26e0300081ded17
😎 Deploy Preview https://deploy-preview-294--vllm-semantic-router.netlify.app

github-actions bot commented Sep 30, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/kubernetes/observability/README.md
  • deploy/kubernetes/observability/grafana/configmap-dashboard.yaml
  • deploy/kubernetes/observability/grafana/configmap-provisioning.yaml
  • deploy/kubernetes/observability/grafana/deployment.yaml
  • deploy/kubernetes/observability/grafana/pvc.yaml
  • deploy/kubernetes/observability/grafana/secret.yaml
  • deploy/kubernetes/observability/grafana/service.yaml
  • deploy/kubernetes/observability/ingress.yaml
  • deploy/kubernetes/observability/kustomization.yaml
  • deploy/kubernetes/observability/prometheus/configmap.yaml
  • deploy/kubernetes/observability/prometheus/deployment.yaml
  • deploy/kubernetes/observability/prometheus/pvc.yaml
  • deploy/kubernetes/observability/prometheus/rbac.yaml
  • deploy/kubernetes/observability/prometheus/service.yaml
  • deploy/llm-router-dashboard.json

📁 website

Owners: @Xunzhuo
Files changed:

  • website/docs/troubleshooting/network-tips.md
  • website/docs/tutorials/observability/observability.md

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

JaredforReal (Collaborator, Author) commented Sep 30, 2025

More network troubleshooting tips will be added to docs/troubleshooting/network-tips.md.

@@ -0,0 +1,203 @@
# Semantic Router Observability on Kubernetes

This guide adds a production-ready Prometheus + Grafana stack to the existing Semantic Router Kubernetes deployment. It includes manifests for collectors, dashboards, data sources, RBAC, and ingress so you can monitor routing performance in any cluster.
Member

Can we move this to the tutorials on the website, https://vllm-semantic-router.com/docs/tutorials/observability/?

JaredforReal (Collaborator, Author) commented Sep 30, 2025

Sure, scheduled.

@JaredforReal JaredforReal marked this pull request as ready for review September 30, 2025 11:31
JaredforReal (Collaborator, Author) commented

  1. Should I keep the README.md in deploy/kubernetes/observability/, given that I have already merged its content into docs/tutorials/observability/observability.md?
  2. Should I split docs/tutorials/observability/observability.md into docker-compose.md and kubernetes.md?

@Xunzhuo WDYT?

@@ -0,0 +1,652 @@
apiVersion: v1
Collaborator

Is this the same dashboard as in https://github.com/vllm-project/semantic-router/tree/main/config/grafana? Is there a way we can consolidate them in a follow-up PR?

Collaborator Author

We could consolidate them. It would add some complexity (e.g. conditional processing with Kustomize overlays), but it would make maintenance easier overall.

@@ -0,0 +1,30 @@
apiVersion: v1
Collaborator

Same issue as above.

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 472
Collaborator

Is 472 a magic number?

Collaborator Author

As lines 20 and 21 of /prometheus/deployment.yaml show, Prometheus runs as UID/GID 65534 (nobody), while Grafana runs as UID/GID 472 (a dedicated Grafana user). Both settings keep the processes from running as root, which improves container security, particularly when volumes are mounted. I can add a short comment here.

Collaborator

Got it, thanks! Adding a comment would be helpful.
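
The comment discussed in this thread could look roughly like the sketch below. The `runAsNonRoot` and `runAsUser` values come from the diff excerpt above; the `fsGroup` line and the wording of the comments are assumptions for illustration, not the merged manifest.

```yaml
# Pod-level security context for Grafana (sketch).
spec:
  securityContext:
    # Refuse to start the container if it would run as root.
    runAsNonRoot: true
    # 472 is the dedicated Grafana user mentioned in this thread, not an arbitrary value.
    runAsUser: 472
    # Assumption: group ownership applied to mounted volumes so Grafana can write to its PVC.
    fsGroup: 472
```

The Prometheus Deployment would follow the same pattern with 65534 (nobody), as described in the reply above.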

rootfs previously approved these changes Sep 30, 2025
rootfs (Collaborator) commented Sep 30, 2025

@fcanogab can you take a look? Thanks

Signed-off-by: JaredforReal <[email protected]>
Signed-off-by: JaredforReal <[email protected]>
JaredforReal (Collaborator, Author) commented

> More network troubleshooting tips will be added to docs/troubleshooting/network-tips.md.

done! @Xunzhuo @rootfs it's ready for review now. Thanks!

fcanogab (Contributor) commented Oct 1, 2025

> @fcanogab can you take a look? Thanks

LGTM. Thanks.

@rootfs rootfs merged commit dff45b3 into vllm-project:main Oct 1, 2025
9 checks passed
@JaredforReal JaredforReal deleted the obser branch October 2, 2025 01:12
Aias00 pushed a commit to Aias00/semantic-router that referenced this pull request Oct 4, 2025
* feat: add Grafana+Prometheus in k8s

Signed-off-by: JaredforReal <[email protected]>

* Update docs of observability k8s part

Signed-off-by: JaredforReal <[email protected]>

* get rid of redundant part in doc

Signed-off-by: JaredforReal <[email protected]>

* add comments of 472 and 65534

Signed-off-by: JaredforReal <[email protected]>

* add network tips of k8s

Signed-off-by: JaredforReal <[email protected]>

* update uid in dashboard

Signed-off-by: JaredforReal <[email protected]>

---------

Signed-off-by: JaredforReal <[email protected]>
Co-authored-by: Huamin Chen <[email protected]>
Signed-off-by: liuhy <[email protected]>

Development

Successfully merging this pull request may close these issues.

Add Kubernetes Grafana Dashboard Deployment Guide

4 participants