Commit d50a5b9

align deploy/k8s README
Signed-off-by: JaredforReal <[email protected]>
1 parent 8675c7b commit d50a5b9

1 file changed: 53 additions, 15 deletions

deploy/kubernetes/README.md

Lines changed: 53 additions & 15 deletions
@@ -1,6 +1,6 @@
 # Semantic Router Kubernetes Deployment

-This directory contains Kubernetes manifests for deploying the Semantic Router using Kustomize.
+Kustomize manifests for deploying the Semantic Router and its observability stack (Prometheus, Grafana, Dashboard, optional Open WebUI + Pipelines) on Kubernetes.

 ## Architecture

@@ -12,8 +12,9 @@ The deployment consists of:
 - **Init Container**: Downloads/copies model files to persistent volume
 - **Main Container**: Runs the semantic router service
 - **Services**:
-- Main service exposing gRPC port (50051), Classification API (8080), and metrics port (9190)
-- Separate metrics service for monitoring
+- Main service exposing gRPC (50051), Classification API (8080), and metrics (9190)
+- Separate metrics service for monitoring (`semantic-router-metrics`)
+- Observability services (Grafana, Prometheus, Dashboard, optional Open WebUI)

 ## Ports

@@ -23,17 +24,40 @@ The deployment consists of:

 ## Quick Start

-### Standard Kubernetes Deployment
+### Deploy Core (Router)

 ```bash
 kubectl apply -k deploy/kubernetes/

 # Check deployment status
-kubectl get pods -l app=semantic-router -n semantic-router
-kubectl get services -l app=semantic-router -n semantic-router
+kubectl get pods -l app=semantic-router -n vllm-semantic-router-system
+kubectl get services -l app=semantic-router -n vllm-semantic-router-system

 # View logs
-kubectl logs -l app=semantic-router -n semantic-router -f
+kubectl logs -l app=semantic-router -n vllm-semantic-router-system -f
+
+### Add Observability (Prometheus + Grafana + Dashboard + Playground)
+
+```bash
+kubectl apply -k deploy/kubernetes/observability/
+```
+
+Port-forward to UIs (local dev):
+
+```bash
+kubectl port-forward -n vllm-semantic-router-system svc/prometheus 9090:9090
+kubectl port-forward -n vllm-semantic-router-system svc/grafana 3000:3000
+kubectl port-forward -n vllm-semantic-router-system svc/semantic-router-dashboard 8700:80
+kubectl port-forward -n vllm-semantic-router-system svc/openwebui 3001:8080
+```
+
+Then open:
+
+- Prometheus → http://localhost:9090
+- Grafana → http://localhost:3000
+- Dashboard → http://localhost:8700
+- Open WebUI (Playground) → http://localhost:3001
+
 ```

 ### Kind (Kubernetes in Docker) Deployment
@@ -86,20 +110,20 @@ kubectl wait --for=condition=Ready nodes --all --timeout=300s
 kubectl apply -k deploy/kubernetes/

 # Wait for deployment to be ready
-kubectl wait --for=condition=Available deployment/semantic-router -n semantic-router --timeout=600s
+kubectl wait --for=condition=Available deployment/semantic-router -n vllm-semantic-router-system --timeout=600s
 ```

 **Step 3: Check deployment status**

 ```bash
 # Check pods
-kubectl get pods -n semantic-router -o wide
+kubectl get pods -n vllm-semantic-router-system -o wide

 # Check services
-kubectl get services -n semantic-router
+kubectl get services -n vllm-semantic-router-system

 # View logs
-kubectl logs -l app=semantic-router -n semantic-router -f
+kubectl logs -l app=semantic-router -n vllm-semantic-router-system -f
 ```

 #### Resource Requirements for Kind
@@ -131,19 +155,30 @@ make port-forward-grpc

 # Access metrics
 make port-forward-metrics
+
+# Access Dashboard / Grafana / Open WebUI
+kubectl port-forward -n vllm-semantic-router-system svc/semantic-router-dashboard 8700:80
+kubectl port-forward -n vllm-semantic-router-system svc/grafana 3000:3000
+kubectl port-forward -n vllm-semantic-router-system svc/openwebui 3001:8080
 ```

 Or using kubectl directly:

 ```bash
 # Access Classification API (HTTP REST)
-kubectl port-forward -n semantic-router svc/semantic-router 8080:8080
+kubectl port-forward -n vllm-semantic-router-system svc/semantic-router 8080:8080

 # Access gRPC API
-kubectl port-forward -n semantic-router svc/semantic-router 50051:50051
+kubectl port-forward -n vllm-semantic-router-system svc/semantic-router 50051:50051

 # Access metrics
-kubectl port-forward -n semantic-router svc/semantic-router-metrics 9190:9190
+kubectl port-forward -n vllm-semantic-router-system svc/semantic-router-metrics 9190:9190
+
+# Access Prometheus/Grafana/Dashboard/Open WebUI
+kubectl port-forward -n vllm-semantic-router-system svc/prometheus 9090:9090
+kubectl port-forward -n vllm-semantic-router-system svc/grafana 3000:3000
+kubectl port-forward -n vllm-semantic-router-system svc/semantic-router-dashboard 8700:80
+kubectl port-forward -n vllm-semantic-router-system svc/openwebui 3001:8080
 ```

 #### Testing the Deployment
@@ -313,7 +348,10 @@ Edit the `resources` section in `deployment.yaml` accordingly.
 - `namespace.yaml` - Dedicated namespace for the application
 - `config.yaml` - Application configuration
 - `tools_db.json` - Tools database for semantic routing
-- `kustomization.yaml` - Kustomize configuration for easy deployment
+- `kustomization.yaml` - Kustomize configuration for core deployment
+- `observability/` - Prometheus, Grafana, Dashboard, optional Open WebUI + Pipelines (with its own `kustomization.yaml`)
+
+For detailed observability setup and screenshots, see `deploy/kubernetes/observability/README.md`.

 ### Development Tools

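The four observability port-forward commands this commit adds to the README differ only in service name and port mapping, so they can be generated from one small table. The following is a minimal sketch, not part of the commit: the namespace, service names, and ports are copied from the diff above, while the `NAMESPACE` and `FORWARDS` variables and the `print_port_forwards` function are hypothetical names introduced for illustration. It only prints the commands (a dry run) and does not contact a cluster.

```shell
#!/bin/sh
# Sketch: print the observability port-forward commands from this commit.
# Namespace, service names, and ports come from the diff; the variable and
# function names below are hypothetical.
NAMESPACE="${NAMESPACE:-vllm-semantic-router-system}"

# One entry per line: "service local_port:remote_port".
FORWARDS="prometheus 9090:9090
grafana 3000:3000
semantic-router-dashboard 8700:80
openwebui 3001:8080"

# Print one kubectl command per service instead of running it.
print_port_forwards() {
  printf '%s\n' "$FORWARDS" | while read -r svc ports; do
    printf 'kubectl port-forward -n %s svc/%s %s\n' "$NAMESPACE" "$svc" "$ports"
  done
}

print_port_forwards
```

Because `kubectl port-forward` stays in the foreground until interrupted, each printed command would typically be run in its own terminal after reviewing the output.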