@@ -7,20 +7,23 @@ This directory contains Kubernetes manifests for deploying the Semantic Router u
77The deployment consists of:
88
99- **ConfigMap**: Contains `config.yaml` and `tools_db.json` configuration files
10- - **PersistentVolumeClaim**: 10Gi storage for model files
11- - **Deployment**:
10+ - **PersistentVolumeClaim**: 10Gi storage for model files
11+ - **Deployment**:
1212 - **Init Container**: Downloads/copies model files to persistent volume
1313 - **Main Container**: Runs the semantic router service
14- - **Services**:
15- - Main service exposing gRPC port (50051) and metrics port (9190)
14+ - **Services**:
15+ - Main service exposing gRPC port (50051), Classification API (8080), and metrics port (9190)
1616 - Separate metrics service for monitoring
1717
1818## Ports
1919
2020- **50051**: gRPC API (vLLM Semantic Router ExtProc)
21+ - **8080**: Classification API (HTTP REST API)
2122- **9190**: Prometheus metrics
2223
23- ## Deployment
24+ ## Quick Start
25+
26+ ### Standard Kubernetes Deployment
2427
2528``` bash
2629kubectl apply -k deploy/kubernetes/
@@ -32,3 +35,288 @@ kubectl get services -l app=semantic-router -n semantic-router
3235# View logs
3336kubectl logs -l app=semantic-router -n semantic-router -f
3437```
38+
39+ ### Kind (Kubernetes in Docker) Deployment
40+
41+ For local development and testing, you can deploy to a kind cluster with optimized resource settings.
42+
43+ #### Prerequisites
44+
45+ - [Docker](https://docs.docker.com/get-docker/) installed and running
46+ - [kind](https://kind.sigs.k8s.io/docs/user/quick-start/#installation) installed
47+ - [kubectl](https://kubernetes.io/docs/tasks/tools/) installed
48+
49+ #### Automated Deployment
50+
51+ Use the provided make targets for a complete automated setup:
52+
53+ ``` bash
54+ # Complete setup: create cluster and deploy
55+ make setup
56+
57+ # Or step by step:
58+ make create-cluster
59+ make deploy
60+ ```
61+
62+ The setup process will:
63+
64+ 1. Create a kind cluster with optimized configuration
65+ 2. Deploy the semantic router with appropriate resource limits
66+ 3. Wait for the deployment to be ready
67+ 4. Show deployment status and access instructions
68+
69+ #### Manual Kind Deployment
70+
71+ If you prefer manual deployment:
72+
73+ **Step 1: Create kind cluster with custom configuration**
74+
75+ ``` bash
76+ # Create cluster with optimized resource settings
77+ kind create cluster --name semantic-router-cluster --config tools/kind/kind-config.yaml
78+
79+ # Verify cluster is ready
80+ kubectl wait --for=condition=Ready nodes --all --timeout=300s
81+ ```
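Re-running `kind create cluster` with a name that is already in use fails, so a script that wraps Step 1 may want to check first. The following is a minimal sketch, not part of the repository; the function names `kind_cluster_exists` and `ensure_cluster` are hypothetical, while `kind get clusters` and `kind create cluster` are the standard kind commands.

```bash
#!/usr/bin/env bash
# kind_cluster_exists NAME: succeed if `kind get clusters` lists NAME exactly.
kind_cluster_exists() {
  kind get clusters 2>/dev/null | grep -qx -- "$1"
}

# ensure_cluster NAME CONFIG: create the cluster only when it is missing.
# (Hypothetical helper for illustration.)
ensure_cluster() {
  local name=$1 config=$2
  if kind_cluster_exists "$name"; then
    echo "cluster ${name} already exists, skipping creation"
  else
    kind create cluster --name "$name" --config "$config"
  fi
}
```

With these functions sourced, `ensure_cluster semantic-router-cluster tools/kind/kind-config.yaml` would perform Step 1 only when needed.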
82+
83+ **Step 2: Deploy the application**
84+
85+ ``` bash
86+ kubectl apply -k deploy/kubernetes/
87+
88+ # Wait for deployment to be ready
89+ kubectl wait --for=condition=Available deployment/semantic-router -n semantic-router --timeout=600s
90+ ```
91+
92+ **Step 3: Check deployment status**
93+
94+ ``` bash
95+ # Check pods
96+ kubectl get pods -n semantic-router -o wide
97+
98+ # Check services
99+ kubectl get services -n semantic-router
100+
101+ # View logs
102+ kubectl logs -l app=semantic-router -n semantic-router -f
103+ ```
104+
105+ #### Resource Requirements for Kind
106+
107+ The deployment is optimized for kind clusters with the following resource allocation:
108+
109+ - **Init Container**: requests 512Mi memory, 250m CPU (limits: 1Gi memory, 500m CPU)
110+ - **Main Container**: requests 3Gi memory, 1 CPU (limits: 6Gi memory, 2 CPU)
111+ - **Total Cluster**: Recommended minimum 8GB RAM, 4 CPU cores
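
The two containers do not add up when the pod is scheduled: an init container finishes before the main container starts, so Kubernetes uses the larger of the two requests. The sketch below works through that arithmetic, assuming the first figure on each line above is the request (the README does not label it) and that the pod has exactly these two containers.

```bash
#!/usr/bin/env bash
# Effective pod request = max(largest init container, sum of app containers).
# Values from the list above, in MiB and millicores (assumed to be requests).
init_mem_mi=512;  init_cpu_m=250
main_mem_mi=3072; main_cpu_m=1000

# max A B: print the larger of two integers.
max() { if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi; }

pod_mem_mi=$(max "$init_mem_mi" "$main_mem_mi")
pod_cpu_m=$(max "$init_cpu_m" "$main_cpu_m")

echo "effective request: ${pod_mem_mi}Mi memory, ${pod_cpu_m}m CPU"
# prints: effective request: 3072Mi memory, 1000m CPU
```

The worker node therefore needs about 3Gi of allocatable memory and 1 CPU for scheduling, and the 6Gi memory limit fits within the recommended 8GB of RAM.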
112+
113+ #### Kind Cluster Configuration
114+
115+ The `tools/kind/kind-config.yaml` file provides:
116+
117+ - Control plane node with system resource reservations
118+ - Worker node for application workloads
119+ - Optimized kubelet settings for resource management
120+
121+ #### Accessing Services in Kind
122+
123+ Using make commands (recommended):
124+
125+ ``` bash
126+ # Access Classification API (HTTP REST)
127+ make port-forward-api
128+
129+ # Access gRPC API
130+ make port-forward-grpc
131+
132+ # Access metrics
133+ make port-forward-metrics
134+ ```
135+
136+ Or using kubectl directly:
137+
138+ ``` bash
139+ # Access Classification API (HTTP REST)
140+ kubectl port-forward -n semantic-router svc/semantic-router 8080:8080
141+
142+ # Access gRPC API
143+ kubectl port-forward -n semantic-router svc/semantic-router 50051:50051
144+
145+ # Access metrics
146+ kubectl port-forward -n semantic-router svc/semantic-router-metrics 9190:9190
147+ ```
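
Once the port-forwards are running, it can help to confirm that the local ports actually accept connections before testing the API. This is a minimal sketch using bash's built-in `/dev/tcp` redirection and coreutils `timeout`; the function names are hypothetical and it checks TCP reachability only, not whether the service behind the port is healthy.

```bash
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection opens within 2 seconds.
port_open() {
  local host=$1 port=$2
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# check_forwards: report the state of the three ports listed in this README.
check_forwards() {
  local port
  for port in 8080 50051 9190; do
    if port_open 127.0.0.1 "$port"; then
      echo "port ${port}: reachable"
    else
      echo "port ${port}: not reachable"
    fi
  done
}
```

Calling `check_forwards` in a second terminal prints one line per port.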
148+
149+ #### Testing the Deployment
150+
151+ Use the provided make targets:
152+
153+ ``` bash
154+ # Test overall deployment
155+ make test-deployment
156+
157+ # Test Classification API specifically
158+ make test-api
159+
160+ # Check deployment status
161+ make status
162+
163+ # View logs
164+ make logs
165+ ```
166+
167+ The make targets provide comprehensive testing including:
168+
169+ - Pod readiness checks
170+ - Service availability verification
171+ - PVC status validation
172+ - API health checks
173+ - Basic functionality testing
174+
175+ #### Cleanup
176+
177+ Using make commands (recommended):
178+
179+ ``` bash
180+ # Complete cleanup: undeploy and delete cluster
181+ make cleanup
182+
183+ # Or step by step:
184+ make undeploy
185+ make delete-cluster
186+ ```
187+
188+ Or using kubectl/kind directly:
189+
190+ ``` bash
191+ # Remove deployment
192+ kubectl delete -k deploy/kubernetes/
193+
194+ # Delete the kind cluster
195+ kind delete cluster --name semantic-router-cluster
196+ ```
197+
198+ ## Make Commands Reference
199+
200+ The project provides comprehensive make targets for managing kind clusters and deployments:
201+
202+ ### Cluster Management
203+
204+ ``` bash
205+ make create-cluster # Create kind cluster with optimized configuration
206+ make delete-cluster # Delete kind cluster
207+ make cluster-info # Show cluster information and resource usage
208+ ```
209+
210+ ### Deployment Management
211+
212+ ``` bash
213+ make deploy # Deploy semantic-router to the cluster
214+ make undeploy # Remove semantic-router from the cluster
215+ make load-image # Load Docker image into kind cluster
216+ make status # Show deployment status
217+ ```
218+
219+ ### Testing and Monitoring
220+
221+ ``` bash
222+ make test-deployment # Test the deployment
223+ make test-api # Test the Classification API
224+ make logs # Show application logs
225+ ```
226+
227+ ### Port Forwarding
228+
229+ ``` bash
230+ make port-forward-api # Port forward Classification API (8080)
231+ make port-forward-grpc # Port forward gRPC API (50051)
232+ make port-forward-metrics # Port forward metrics (9190)
233+ ```
234+
235+ ### Combined Operations
236+
237+ ``` bash
238+ make setup # Complete setup (create-cluster + deploy)
239+ make cleanup # Complete cleanup (undeploy + delete-cluster)
240+ ```
241+
242+ ### Configuration Variables
243+
244+ You can customize the deployment using environment variables:
245+
246+ ``` bash
247+ # Custom cluster name
248+ KIND_CLUSTER_NAME=my-cluster make create-cluster
249+
250+ # Custom kind config file
251+ KIND_CONFIG_FILE=my-config.yaml make create-cluster
252+
253+ # Custom namespace
254+ KUBE_NAMESPACE=my-namespace make deploy
255+
256+ # Custom Docker image
257+ DOCKER_IMAGE=my-registry/semantic-router:latest make load-image
258+ ```
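
A variable written in front of a command, as in the examples above, applies to that one invocation only. A wrapper script can honor the same variables with shell parameter expansion. The sketch below is illustrative: the fallback values are assumptions inferred from the commands in this README, not read from `tools/make/kube.mk`, and `resolve_kube_settings` is a hypothetical name.

```bash
#!/usr/bin/env bash
# resolve_kube_settings: use the caller's environment when set, else a fallback.
# The fallbacks are assumptions based on names used elsewhere in this README.
resolve_kube_settings() {
  KIND_CLUSTER_NAME="${KIND_CLUSTER_NAME:-semantic-router-cluster}"
  KIND_CONFIG_FILE="${KIND_CONFIG_FILE:-tools/kind/kind-config.yaml}"
  KUBE_NAMESPACE="${KUBE_NAMESPACE:-semantic-router}"
  echo "cluster=${KIND_CLUSTER_NAME} config=${KIND_CONFIG_FILE} namespace=${KUBE_NAMESPACE}"
}
```

For example, `KUBE_NAMESPACE=my-namespace resolve_kube_settings` overrides only the namespace and leaves the other two at their fallbacks.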
259+
260+ ### Help
261+
262+ ``` bash
263+ make help-kube # Show all available Kubernetes targets
264+ ```
265+
266+ ## Troubleshooting
267+
268+ ### Common Issues
269+
270+ **Pod stuck in Pending state:**
271+
272+ ``` bash
273+ # Check node resources
274+ kubectl describe nodes
275+
276+ # Check pod events
277+ kubectl describe pod -n semantic-router -l app=semantic-router
278+ ```
279+
280+ **Init container fails:**
281+
282+ ``` bash
283+ # Check init container logs
284+ kubectl logs -n semantic-router -l app=semantic-router -c model-downloader
285+ ```
286+
287+ **Out of memory errors:**
288+
289+ ``` bash
290+ # Check resource usage (requires metrics-server, which kind does not install by default)
291+ kubectl top pods -n semantic-router
292+
293+ # Adjust resource limits in deployment.yaml if needed
294+ ```
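
To confirm that a restart was caused by the kernel OOM killer rather than an application crash, the last termination reason recorded in the pod status can be inspected. This is a minimal sketch with hypothetical function names; it reads `containerStatuses` only, so init containers (reported under `initContainerStatuses`) are not covered.

```bash
#!/usr/bin/env bash
# last_termination_reasons NAMESPACE SELECTOR:
# print "<container>\t<reason>" for each container's previous termination.
last_termination_reasons() {
  kubectl get pods -n "$1" -l "$2" \
    -o jsonpath='{range .items[*].status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}

# was_oom_killed NAMESPACE SELECTOR:
# succeed if any container's previous termination reason is OOMKilled.
was_oom_killed() {
  last_termination_reasons "$1" "$2" | grep -q 'OOMKilled'
}
```

For this deployment the call would be `was_oom_killed semantic-router app=semantic-router`. A container that has never restarted shows an empty reason.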
295+
296+ ### Resource Optimization
297+
298+ For different environments, you can adjust resource requirements:
299+
300+ - **Development**: 2Gi memory, 0.5 CPU
301+ - **Testing**: 4Gi memory, 1 CPU
302+ - **Production**: 8Gi+ memory, 2+ CPU
303+
304+ Edit the `resources` section in `deployment.yaml` accordingly.
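
For a quick experiment without editing the manifest, `kubectl set resources` can patch a running deployment. The sketch below maps the three profiles to values for its `--requests` flag. It assumes the figures above are requests and reads the "+" in the production row as a minimum; `requests_for_profile` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# requests_for_profile PROFILE: print a value for `kubectl set resources --requests`.
# Assumption: the README figures are requests; production uses the stated minimum.
requests_for_profile() {
  case "$1" in
    development) echo "cpu=500m,memory=2Gi" ;;
    testing)     echo "cpu=1,memory=4Gi" ;;
    production)  echo "cpu=2,memory=8Gi" ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Example (requires a cluster; not executed here):
#   kubectl set resources deployment semantic-router -n semantic-router \
#     --requests="$(requests_for_profile development)"
```

A change made this way is overwritten the next time `kubectl apply -k deploy/kubernetes/` runs, so lasting changes still belong in `deployment.yaml`.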
305+
306+ ## Files Overview
307+
308+ ### Kubernetes Manifests (`deploy/kubernetes/`)
309+
310+ - `deployment.yaml` - Main application deployment with optimized resource settings
311+ - `service.yaml` - Services for gRPC, HTTP API, and metrics
312+ - `pvc.yaml` - Persistent volume claim for model storage
313+ - `namespace.yaml` - Dedicated namespace for the application
314+ - `config.yaml` - Application configuration
315+ - `tools_db.json` - Tools database for semantic routing
316+ - `kustomization.yaml` - Kustomize configuration for easy deployment
317+
318+ ### Development Tools
319+
320+ - `tools/kind/kind-config.yaml` - Kind cluster configuration for local development
321+ - `tools/make/kube.mk` - Make targets for Kubernetes operations
322+ - `Makefile` - Root makefile including all make targets