1 file changed: +7 −2

@@ -278,7 +278,7 @@ A cluster with:
 helm install vllm-llama3-8b-instruct \
   --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
   --set provider.name=$GATEWAY_PROVIDER \
-  --version v0.3.0 \
+  --version v0.5.1 \
   oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
 ```

@@ -297,12 +297,17 @@ A cluster with:

 Wait until the gateway is ready.

+Depending on the type of model server you deployed, update the `model` field in the request body accordingly:
+- vLLM Simulator Model Server: `food-review-1`
+- CPU-Based Model Server: `food-review-0` or `food-review-1`
+- GPU-Based Model Server: TODO
+
 ```bash
 IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')
 PORT=80

 curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
-  "model": "food-review",
+  "model": "food-review-1",
   "prompt": "Write as if you were a critic: San Francisco",
   "max_tokens": 100,
   "temperature": 0