deploy/kubernetes/observability/README.md: 8 additions & 2 deletions
@@ -10,6 +10,9 @@ This guide adds a production-ready Prometheus + Grafana stack to the existing Se
 |--------------|---------|-----------|
 | Prometheus | Scrapes Semantic Router metrics and stores them with persistent retention | `prometheus/` (`rbac.yaml`, `configmap.yaml`, `deployment.yaml`, `pvc.yaml`, `service.yaml`) |
 | Grafana | Visualizes metrics using the bundled LLM Router dashboard and a pre-configured Prometheus datasource | `grafana/` (`secret.yaml`, `configmap-*.yaml`, `deployment.yaml`, `pvc.yaml`, `service.yaml`) |
+| Dashboard | Unified UI that links Router, Prometheus, and embeds Grafana; reads Router config | `dashboard/` (`configmap.yaml`, `deployment.yaml`, `service.yaml`) |
+| Open WebUI | Playground UI for interacting with the router via a Manifold Pipeline | `openwebui/` (`deployment.yaml`, `service.yaml`) |
+| Pipelines | Executes the `vllm_semantic_router_pipe.py` manifold for Open WebUI | `pipelines/deployment.yaml` (includes a ConfigMap with the pipeline code) |
 | Ingress (optional) | Exposes the UIs outside the cluster | `ingress.yaml` |
 | Dashboard provisioning | Automatically loads `deploy/llm-router-dashboard.json` into Grafana | `grafana/configmap-dashboard.yaml` |
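
The manifest paths in the table can be checked against a local checkout before anything is applied. The sketch below is not part of the repository: the function name `check_manifests` is hypothetical, and it assumes the paths in the table are relative to `deploy/kubernetes/observability/` (the directory that holds this README). Only files the table names explicitly are checked, so the `configmap-*.yaml` glob is represented by `grafana/configmap-dashboard.yaml` alone.

```shell
# Hypothetical helper: report which manifests named in the component table
# are missing under a base directory. The default base path is an assumption.
check_manifests() {
  local base="${1:-deploy/kubernetes/observability}"
  local missing=0
  local f
  for f in \
    prometheus/rbac.yaml prometheus/configmap.yaml prometheus/deployment.yaml \
    prometheus/pvc.yaml prometheus/service.yaml \
    grafana/secret.yaml grafana/deployment.yaml grafana/pvc.yaml \
    grafana/service.yaml grafana/configmap-dashboard.yaml \
    dashboard/configmap.yaml dashboard/deployment.yaml dashboard/service.yaml \
    openwebui/deployment.yaml openwebui/service.yaml \
    pipelines/deployment.yaml \
    ingress.yaml
  do
    # Print each path that is not a regular file and count it.
    if [ ! -f "$base/$f" ]; then
      echo "missing: $f"
      missing=$((missing + 1))
    fi
  done
  echo "missing manifests: $missing"
  # Succeed only when every listed manifest is present.
  [ "$missing" -eq 0 ]
}
```

Calling `check_manifests` from the repository root uses the assumed default path; passing a directory as the first argument checks that location instead.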
@@ -110,7 +113,7 @@ Verify pods:
 kubectl get pods -n vllm-semantic-router-system
 ```
 
-You should see `prometheus-...` and `grafana-...` pods in `Running` state.
+You should see `prometheus-...`, `grafana-...`, and `semantic-router-dashboard-...` pods in `Running` state.
 
 ### 5.3. Integration with the core deployment
 
@@ -133,9 +136,11 @@ You should see `prometheus-...` and `grafana-...` pods in `Running` state.
 - **Ingress (production)** – Customize `ingress.yaml` with real domains, TLS secrets, and your ingress class before applying. Replace `*.example.com` and configure HTTPS certificates via cert-manager or your provider.
 
@@ -145,6 +150,7 @@ You should see `prometheus-...` and `grafana-...` pods in `Running` state.
 2. Query `rate(llm_model_completion_tokens_total[5m])` – should return data after traffic.
 3. Open Grafana, log in with the admin credentials, and confirm the **LLM Router Metrics** dashboard exists under the *Semantic Router* folder.
 4. Generate traffic to Semantic Router (classification or routing requests). Key panels should start populating:
+5. Playground: open Open WebUI (port-forward or ingress), select the `vllm-semantic-router/auto` model (from the Manifold pipeline), and send prompts. The Dashboard Monitoring page should reflect traffic, and the pipeline will display VSR decision headers inline.
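
The checks above assume the observability pods are already up. A small helper can confirm that from `kubectl get pods` output instead of reading the listing by eye. This is a sketch only: `pods_running` is a hypothetical name, and it assumes the default `kubectl get pods` column order (name, ready, status, ...), with headers suppressed.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and check that, for each name prefix given as an argument, at least one
# pod whose name starts with that prefix has STATUS "Running".
pods_running() {
  local listing prefix status=0
  listing="$(cat)"
  for prefix in "$@"; do
    # Column 1 is the pod name, column 3 is the status.
    if printf '%s\n' "$listing" |
      awk -v p="$prefix" '$3 == "Running" && index($1, p) == 1 { found = 1 } END { exit !found }'
    then
      echo "ok: $prefix"
    else
      echo "not running: $prefix"
      status=1
    fi
  done
  return "$status"
}
```

With cluster access, it could be fed like this: `kubectl get pods -n vllm-semantic-router-system --no-headers | pods_running prometheus- grafana- semantic-router-dashboard-`. A non-zero exit status means at least one prefix had no pod in `Running` state.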