Commit 7bf7360

Update README.md
Add tensorrt llm hint
1 parent 395a780 commit 7bf7360

File tree

1 file changed: +2, -0 lines

README.md

Lines changed: 2 additions & 0 deletions
@@ -1522,6 +1522,8 @@ to load the model after the server has been started. The model loading API is
 currently not supported during the `auto_complete_config` and `finalize`
 functions.
 
+The model loading API applies to repository-managed backends.
+TensorRT-LLM models must be launched via the TensorRT-LLM launcher and cannot be instantiated via pb_utils.load_model(files=...).
 ## Using BLS with Stateful Models
 
 [Stateful models](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/architecture.md#stateful-models)
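
For context on the hunk above, here is a minimal sketch (not from this commit) of how the model loading API is typically used from a Python backend BLS model. It assumes the server runs in explicit model control mode and that a repository-managed model named "onnx_model" with tensors "INPUT0"/"OUTPUT0" exists; all of these names are illustrative. Loading happens in `initialize`, since the API is not supported in `auto_complete_config` or `finalize`, and per the added lines it does not cover TensorRT-LLM models.

```python
# Minimal sketch (illustrative, not from this commit): using the Python
# backend model loading API from a BLS model. Assumes explicit model
# control mode; "onnx_model", "INPUT0", and "OUTPUT0" are placeholder names.
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Model loading is allowed here, after the server has started.
        # It is not supported inside `auto_complete_config` or `finalize`,
        # and it does not apply to TensorRT-LLM models (see the hunk above).
        self.model_name = "onnx_model"
        if not pb_utils.is_model_ready(model_name=self.model_name):
            pb_utils.load_model(model_name=self.model_name)

    def execute(self, requests):
        responses = []
        for request in requests:
            # Forward the request's input to the loaded model via a BLS call.
            infer_request = pb_utils.InferenceRequest(
                model_name=self.model_name,
                requested_output_names=["OUTPUT0"],
                inputs=[pb_utils.get_input_tensor_by_name(request, "INPUT0")],
            )
            infer_response = infer_request.exec()
            if infer_response.has_error():
                raise pb_utils.TritonModelException(
                    infer_response.error().message())
            output = pb_utils.get_output_tensor_by_name(
                infer_response, "OUTPUT0")
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[output]))
        return responses
```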
