Description
System Info
TensorRT-LLM Version: 1.2.0rc2
How would you like to use TensorRT-LLM
Hello TensorRT-LLM team,
I have a question about the recommended workflow for building TensorRT engines intended for use with the trtllm-serve tool and its --backend tensorrt option.
I've noticed there are (at least) two different commands that can build a TensorRT engine from a model:
The dedicated build command: trtllm-build
The build phase of the benchmarking tool: trtllm-bench --build ...
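For concreteness, the two build paths I'm comparing look roughly like this (the model name, paths, and most flags below are placeholders I've filled in for illustration; they are not taken from any specific documentation, so please correct me if the invocations are off):

```shell
# Sketch of the two build paths; paths and flag values are placeholders.

# 1) Dedicated build command: build an engine from a converted checkpoint.
trtllm-build \
    --checkpoint_dir ./tllm_checkpoint \
    --output_dir ./engine_dir \
    --max_batch_size 8

# 2) Build via the benchmarking tool, which writes an engine
#    into its own workspace as part of the benchmark flow.
trtllm-bench --model meta-llama/Llama-3.1-8B build \
    --dataset ./synthetic_dataset.json

# Then serve the resulting engine with the TensorRT backend.
trtllm-serve ./engine_dir --backend tensorrt
```

My question is essentially whether the engines produced by (1) and (2) are interchangeable for the serving step shown last.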
My Core Questions:
What are the functional and performance differences between an engine generated by trtllm-build and one generated by trtllm-bench --build for the same base model and architecture?
For the specific use case of serving a model using trtllm-serve --backend tensorrt, which build tool is the recommended or officially supported method to generate the engine?
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.