Commit 8960b96

doc: fix wrong model path in vllm_backend readme (#84)
1 parent: cc88f97

File tree

1 file changed (+4, -3 lines)


README.md

Lines changed: 4 additions & 3 deletions
````diff
@@ -144,8 +144,9 @@ Once you have the model repository set up, it is time to launch the Triton serve
 We will use the [pre-built Triton container with vLLM backend](#option-1-use-the-pre-built-docker-container) from
 [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver) in this example.
 
+Run the following command inside the `vllm_backend` directory:
 ```
-docker run --gpus all -it --net=host --rm -p 8001:8001 --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 -v ${PWD}:/work -w /work nvcr.io/nvidia/tritonserver:<xx.yy>-vllm-python-py3 tritonserver --model-repository ./model_repository
+docker run --gpus all -it --net=host --rm -p 8001:8001 --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 -v ${PWD}:/work -w /work nvcr.io/nvidia/tritonserver:<xx.yy>-vllm-python-py3 tritonserver --model-repository ./samples/model_repository
 ```
 
 Replace \<xx.yy\> with the version of Triton that you want to use.
@@ -171,10 +172,10 @@ with the
 you can quickly run your first inference request with the
 [generate endpoint](https://github.com/triton-inference-server/server/blob/main/docs/protocol/extension_generate.md).
 
-Try out the command below.
+Try out the command below from another terminal:
 
 ```
-$ curl -X POST localhost:8000/v2/models/vllm_model/generate -d '{"text_input": "What is Triton Inference Server?", "parameters": {"stream": false, "temperature": 0}}'
+curl -X POST localhost:8000/v2/models/vllm_model/generate -d '{"text_input": "What is Triton Inference Server?", "parameters": {"stream": false, "temperature": 0}}'
 ```
 
 Upon success, you should see a response from the server like this one:
````
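The curl command in the second hunk posts a JSON body to Triton's generate endpoint. As a minimal sketch of the same request built from Python (the model name `vllm_model` and port 8000 come from the diff; the actual send is commented out since it assumes a server launched with the docker command above):

```python
import json

# Payload matching the curl example in the diff: a text_input prompt plus
# sampling parameters (streaming disabled, deterministic temperature).
payload = {
    "text_input": "What is Triton Inference Server?",
    "parameters": {"stream": False, "temperature": 0},
}

# Triton's generate endpoint for the model named "vllm_model".
url = "http://localhost:8000/v2/models/vllm_model/generate"

body = json.dumps(payload)
print(body)

# To actually send the request (requires a running Triton server):
#   import urllib.request
#   req = urllib.request.Request(url, data=body.encode(), method="POST")
#   print(urllib.request.urlopen(req).read().decode())
```

Note that the JSON body uses lowercase `false`, while the Python dict uses `False`; `json.dumps` handles the conversion.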

0 commit comments
