diff --git a/docs/ovms-model-deploy-guide.md b/docs/ovms-model-deploy-guide.md
index a298430..a8fc63c 100644
--- a/docs/ovms-model-deploy-guide.md
+++ b/docs/ovms-model-deploy-guide.md
@@ -194,16 +194,22 @@ echo "Access Token: $TOKEN"
 
 ```bash
 # Test chat completions endpoint
-For Inferencing with Qwen3-4B-int4-ov:
-curl -k ${BASE_URL}/qwen3-4b-ovms/v3/chat/completions -X POST -d '{"messages": [{"role": "system","content": "You are helpful assistant"},{"role": "user","content": "what is photosynthesis"}],"model": "qwen3-4b","max_tokens": 32,"temperature": 0.4}' -H 'Content-Type: application/json' -sS -H "Authorization: Bearer $TOKEN"
 
-For Inferencing with Mistral-7B-Instruct-v0.3-int4-cw-ov:
-curl -k ${BASE_URL}/mistral-7b-ovms/v3/chat/completions -X POST -d '{"messages": [{"role": "system","content": "You are helpful assistant"},{"role": "user","content": "what is photosynthesis"}],"model": "mistral-7b","max_tokens": 32,"temperature": 0.4}' -H 'Content-Type: application/json' -sS -H "Authorization: Bearer $TOKEN"
+# For Inferencing with any deployed models, use below command to get model route
+kubectl get apisixroute -A
+```
+![alt text](pictures/apisix-route.png)
+```
+export MODEL_APISIX_ROUTE="qwen3-4b-ovms"
+export MODEL_ID=OpenVINO/Qwen3-4B-int4-ov
 
-For Inferencing with meta-llama/Llama-3.2-3B-Instruct:
-curl -k ${BASE_URL}/llama-3.2-3b-instruct/v3/chat/completions -X POST -d '{"messages": [{"role": "system","content": "You are helpful assistant"},{"role": "user","content": "what is api"}],"model": "llama-3.2-3b-instruct","max_tokens": 32,"temperature": 0.4}' -H 'Content-Type: application/json' -sS -H "Authorization: Bearer $TOKEN"
+curl -k ${BASE_URL}/${MODEL_APISIX_ROUTE}/v3/chat/completions -X POST \
+  -H 'Content-Type: application/json' \
+  -H "Authorization: Bearer $TOKEN" \
+  -d '{"messages": [{"role": "system","content": "You are helpful assistant"},{"role": "user","content": "what is api"}],"model": "'"$MODEL_ID"'","max_tokens": 32,"temperature": 0.4}'
 ```
 
+**NOTE:** export respective MODEL_APISIX_ROUTE and MODEL_ID to test the model endpoints
 ---
 
 ## Undeployment
diff --git a/docs/pictures/apisix-route.png b/docs/pictures/apisix-route.png
new file mode 100644
index 0000000..dc70748
Binary files /dev/null and b/docs/pictures/apisix-route.png differ
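
Since the new command depends on `MODEL_APISIX_ROUTE` and `MODEL_ID` being exported, a small wrapper could fail fast when either is missing. The following is only a sketch under stated assumptions: the function names `chat_url`, `chat_payload`, and `test_chat_endpoint` are hypothetical and not part of the guide, the system message is dropped for brevity, and the model ID is assumed to contain no double quotes or backslashes.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the guide.

# Build the chat completions URL from a base URL and an APISIX route name.
chat_url() {
  local base_url="${1%/}"   # drop one trailing slash, if present
  local route="$2"
  printf '%s/%s/v3/chat/completions' "$base_url" "$route"
}

# Build the JSON request body for a model ID.
# Assumes the model ID contains no double quotes or backslashes.
chat_payload() {
  local model_id="$1"
  printf '{"messages": [{"role": "user","content": "what is api"}],"model": "%s","max_tokens": 32,"temperature": 0.4}' "$model_id"
}

# Send the request. Aborts with a message if a required variable is unset.
test_chat_endpoint() {
  : "${BASE_URL:?set BASE_URL first}" "${TOKEN:?set TOKEN first}"
  : "${MODEL_APISIX_ROUTE:?set MODEL_APISIX_ROUTE first}" "${MODEL_ID:?set MODEL_ID first}"
  curl -k -sS -X POST "$(chat_url "$BASE_URL" "$MODEL_APISIX_ROUTE")" \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $TOKEN" \
    -d "$(chat_payload "$MODEL_ID")"
}
```

After exporting the route and model ID as in the diff, calling `test_chat_endpoint` would send the same request as the new `curl` command. Quoting the URL and payload also avoids word splitting if a value ever contains spaces.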