18 changes: 12 additions & 6 deletions docs/ovms-model-deploy-guide.md
@@ -194,16 +194,22 @@ echo "Access Token: $TOKEN"

```bash
# Test chat completions endpoint
For Inferencing with Qwen3-4B-int4-ov:
curl -k ${BASE_URL}/qwen3-4b-ovms/v3/chat/completions -X POST -d '{"messages": [{"role": "system","content": "You are helpful assistant"},{"role": "user","content": "what is photosynthesis"}],"model": "qwen3-4b","max_tokens": 32,"temperature": 0.4}' -H 'Content-Type: application/json' -sS -H "Authorization: Bearer $TOKEN"

For Inferencing with Mistral-7B-Instruct-v0.3-int4-cw-ov:
curl -k ${BASE_URL}/mistral-7b-ovms/v3/chat/completions -X POST -d '{"messages": [{"role": "system","content": "You are helpful assistant"},{"role": "user","content": "what is photosynthesis"}],"model": "mistral-7b","max_tokens": 32,"temperature": 0.4}' -H 'Content-Type: application/json' -sS -H "Authorization: Bearer $TOKEN"
# To run inference against any deployed model, first look up its APISIX route:
kubectl get apisixroute -A
```
![APISIX route list](pictures/apisix-route.png)
```bash
export MODEL_APISIX_ROUTE="qwen3-4b-ovms"
export MODEL_ID=OpenVINO/Qwen3-4B-int4-ov

For Inferencing with meta-llama/Llama-3.2-3B-Instruct:
curl -k ${BASE_URL}/llama-3.2-3b-instruct/v3/chat/completions -X POST -d '{"messages": [{"role": "system","content": "You are helpful assistant"},{"role": "user","content": "what is api"}],"model": "llama-3.2-3b-instruct","max_tokens": 32,"temperature": 0.4}' -H 'Content-Type: application/json' -sS -H "Authorization: Bearer $TOKEN"
curl -k "${BASE_URL}/${MODEL_APISIX_ROUTE}/v3/chat/completions" -X POST \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer $TOKEN" \
-d '{"messages": [{"role": "system","content": "You are a helpful assistant"},{"role": "user","content": "what is api"}],"model": "'"$MODEL_ID"'","max_tokens": 32,"temperature": 0.4}'

```
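The inline `'"$MODEL_ID"'` quoting in the request body is easy to get wrong when editing by hand. As a minimal sketch, the body could instead be assembled by a small helper. `build_chat_payload` is a hypothetical name, not part of the guide, and it assumes the prompt and model ID contain no double quotes or backslashes, since it does no JSON escaping.

```bash
# Hypothetical helper (not part of the guide): build the chat-completions
# request body for a given model ID and user prompt.
# Assumption: neither argument contains double quotes or backslashes,
# because no JSON escaping is performed here.
build_chat_payload() {
  local model_id="$1"
  local prompt="$2"
  # The two %s placeholders are filled with the prompt and the model ID, in that order.
  printf '{"messages": [{"role": "system","content": "You are a helpful assistant"},{"role": "user","content": "%s"}],"model": "%s","max_tokens": 32,"temperature": 0.4}' "$prompt" "$model_id"
}
```

With this in place, the `-d` argument of the `curl` command could be written as `-d "$(build_chat_payload "$MODEL_ID" "what is api")"`.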
**NOTE:** Set `MODEL_APISIX_ROUTE` and `MODEL_ID` to the values for the model you want to test before running the `curl` command above.

---
## Undeployment

Binary file added docs/pictures/apisix-route.png