
Commit 52c92e2

update guides docs
1 parent c90106c commit 52c92e2

File tree: 1 file changed (+7, -2)

site-src/guides/index.md

Lines changed: 7 additions & 2 deletions
@@ -278,7 +278,7 @@ A cluster with:
 helm install vllm-llama3-8b-instruct \
 --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
 --set provider.name=$GATEWAY_PROVIDER \
---version v0.3.0 \
+--version v0.5.1 \
 oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
 ```
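
For anyone who already installed the chart at v0.3.0, the same flags should carry over to an in-place upgrade. The sketch below is an illustration, not part of the commit; it assumes the release name and `--set` values from the install command in the hunk above and relies on standard `helm upgrade` behavior for OCI-hosted charts.

```bash
# Sketch only: upgrade an existing release to the chart version introduced
# by this commit, reusing the flags from the install command above.
helm upgrade vllm-llama3-8b-instruct \
  --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
  --set provider.name=$GATEWAY_PROVIDER \
  --version v0.5.1 \
  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```
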
@@ -297,12 +297,17 @@ A cluster with:
 
 Wait until the gateway is ready.
 
+Depending on the type of model server you have deployed, you must update the model field in the request body accordingly:
+- vLLM Simulator Model Server: `food-review-1`
+- CPU-Based Model Server: `food-review-0` or `food-review-1`
+- GPU-Based Model Server: TODO
+
 ```bash
 IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')
 PORT=80
 
 curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
-"model": "food-review",
+"model": "food-review-1",
 "prompt": "Write as if you were a critic: San Francisco",
 "max_tokens": 100,
 "temperature": 0

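The guide text in the hunk above tells the reader to wait until the gateway is ready. One way to script that wait, shown here only as a sketch and not part of the commit, is to block on the standard Gateway API `Programmed` condition; this assumes the Gateway is named `inference-gateway` in the current namespace, matching the `kubectl get gateway/inference-gateway` call in the diff.

```bash
# Sketch only: wait for the Gateway to report the Gateway API "Programmed"
# condition before sending the curl request from the guide.
kubectl wait gateway/inference-gateway --for=condition=Programmed --timeout=120s
```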