Commit f5d40a2

add Service to expose vLLM deployment and update doc (#201)
Signed-off-by: googs1025 <[email protected]>
1 parent 699452c commit f5d40a2

2 files changed: 36 additions, 0 deletions

README.md

Lines changed: 20 additions & 0 deletions
@@ -225,4 +225,24 @@ Update the `deployment.yaml` file to use the dev tag.
 To verify the deployment is available, run:
 ```bash
 kubectl get deployment vllm-llama3-8b-instruct
+kubectl get service vllm-llama3-8b-instruct-svc
+```
+
+Use `kubectl port-forward` to expose the service on your local machine:
+
+```bash
+kubectl port-forward svc/vllm-llama3-8b-instruct-svc 8000:8000
+```
+
+Test the API with curl
+
+```bash
+curl -X POST http://localhost:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "meta-llama/Llama-3.1-8B-Instruct",
+    "messages": [
+      {"role": "user", "content": "Hello!"}
+    ]
+  }'
 ```
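
Review note: the curl call added to the README can be wrapped in a small helper so the request body is built in one place. The following is only a sketch, not part of this commit. The function names `json_escape`, `build_chat_request`, and `send_chat` and the `BASE_URL` and `MODEL` variables are hypothetical; the endpoint path, port, and model name are copied from the README hunk above.

```shell
# Hypothetical helpers; defaults mirror the README example in this commit.
BASE_URL="${BASE_URL:-http://localhost:8000}"
MODEL="${MODEL:-meta-llama/Llama-3.1-8B-Instruct}"

# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print the JSON request body for a single user message.
# $1 = model name, $2 = prompt text.
build_chat_request() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# POST the body to the chat-completions endpoint.
# --fail makes curl exit non-zero on an HTTP error status.
send_chat() {
  curl --silent --show-error --fail \
    -X POST "${BASE_URL}/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_chat_request "$MODEL" "$1")"
}
```

With the port-forward from the README running in another terminal, `send_chat 'Hello!'` should print the raw JSON response from the server, assuming the deployment serves the model name used above.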

manifests/deployment.yaml

Lines changed: 16 additions & 0 deletions
@@ -40,3 +40,19 @@ spec:
         - containerPort: 8000
           name: http
           protocol: TCP
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: vllm-llama3-8b-instruct-svc
+  labels:
+    app: vllm-llama3-8b-instruct
+spec:
+  selector:
+    app: vllm-llama3-8b-instruct
+  ports:
+    - protocol: TCP
+      port: 8000
+      targetPort: 8000
+      name: http
+  type: ClusterIP
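
Review note: a ClusterIP Service only routes traffic when its `selector` matches the labels on ready pods, so it can be worth confirming that the new Service actually has endpoints before port-forwarding. The following is a sketch under assumptions, not part of this commit. The function names are hypothetical, and it assumes the cluster still serves the core `Endpoints` resource (newer Kubernetes releases prefer EndpointSlice and may print a deprecation warning).

```shell
# Print the pod IPs currently backing a Service.
# The output is empty when the selector matches no ready pods.
# $1 = Service name.
service_endpoint_ips() {
  kubectl get endpoints "$1" -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Succeed only if the Service has at least one ready endpoint.
service_has_endpoints() {
  [ -n "$(service_endpoint_ips "$1")" ]
}
```

For example, `service_has_endpoints vllm-llama3-8b-instruct-svc && echo ready` would print `ready` once a pod labeled `app: vllm-llama3-8b-instruct` is running and ready.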
