
Commit 17bc7ab

doc: Update model repository documentation (#807)
Signed-off-by: Aurelien Chartier <[email protected]>
1 parent af354e4 commit 17bc7ab


README.md (6 additions & 2 deletions)
````diff
@@ -162,8 +162,9 @@ more details on the parameters.
 Next, create the
 [model repository](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_repository.md)
 that will be used by the Triton server. The models can be found in the
-[all_models](./tensorrt_llm/triton_backend/all_models) folder. The folder contains two groups of models:
-- [`gpt`](./tensorrt_llm/triton_backend/all_models/gpt): Using TensorRT-LLM pure Python runtime.
+[all_models](./tensorrt_llm/triton_backend/all_models) folder. The folder contains six groups of models:
+- [`disaggregated_serving`](./tensorrt_llm/triton_backend/all_models/disaggregated_serving): Using the C++ TensorRT-LLM backend to run disaggregated serving.
+- [`gpt`](./tensorrt_llm/triton_backend/all_models/gpt): Using TensorRT-LLM pure Python runtime. This model is deprecated and will be removed in a future release.
 - [`inflight_batcher_llm`](./tensorrt_llm/triton_backend/all_models/inflight_batcher_llm/)`: Using the C++
   TensorRT-LLM backend with the executor API, which includes the latest features
   including inflight batching.
@@ -193,6 +194,9 @@ please see the [model config](./docs/model_config.md#tensorrt_llm_bls-model) sec
   mkdir /triton_model_repo
   cp -r /app/all_models/inflight_batcher_llm/* /triton_model_repo/
   ```
+- [`llmapi`](./tensorrt_llm/triton_backend/all_models/llmapi/): Using TensorRT-LLM LLM API with pytorch backend.
+- [`multimodal`](./tensorrt_llm/triton_backend/all_models/multimodal/): Using TensorRT-LLM python runtime for multimodal models. See [`multimodal.md`](./docs/multimodal.md) for more details.
+- [`whisper`](./tensorrt_llm/triton_backend/all_models/whisper/): Using TensorRT-LLM python runtime for Whisper. See [`whisper.md`](./docs/whisper.md) for more details.

 #### Modify the Model Configuration
 Use the script to fill in the parameters in the model configuration files. For
````
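For readers following the updated list, a minimal sketch of populating the model repository with one of the newly documented groups is shown below. The `/app/all_models/llmapi` path is an assumption made by analogy with the `/app/all_models/inflight_batcher_llm` example already in the README; adjust it to wherever the `all_models` folder lives in your container or checkout.

```bash
# Create the Triton model repository (same first step as in the existing example).
mkdir -p /triton_model_repo

# Copy the llmapi model group instead of inflight_batcher_llm.
# NOTE: /app/all_models/llmapi is assumed by analogy with the documented
# /app/all_models/inflight_batcher_llm path; adjust to your layout.
cp -r /app/all_models/llmapi/* /triton_model_repo/
```

The "Modify the Model Configuration" step that follows in the README then applies to the copied configuration files in the same way as for the existing groups.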
