Description
System Info
TensorRT-LLM Version: 1.2.0rc2
How would you like to use TensorRT-LLM
Hello TensorRT-LLM team,
I have a question about the recommended workflow for building TensorRT engines intended for use with the trtllm-serve tool and its --backend tensorrt option.
I've noticed there are (at least) two different commands that can build a TensorRT engine from a model:
The dedicated build command: trtllm-build
The build phase of the benchmarking tool: trtllm-bench --build ...
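For concreteness, the two build paths I'm comparing look roughly like this (the model name, paths, and most flags below are placeholders I've filled in for illustration; they are not taken from any specific documentation, so please correct me if the invocations are off):

```shell
# Sketch of the two build paths; paths and flag values are placeholders.

# 1) Dedicated build command: build an engine from a converted checkpoint.
trtllm-build \
    --checkpoint_dir ./tllm_checkpoint \
    --output_dir ./engine_dir \
    --max_batch_size 8

# 2) Build via the benchmarking tool, which writes an engine
#    into its own workspace as part of the benchmark flow.
trtllm-bench --model meta-llama/Llama-3.1-8B build \
    --dataset ./synthetic_dataset.json

# Then serve the resulting engine with the TensorRT backend.
trtllm-serve ./engine_dir --backend tensorrt
```

My question is essentially whether the engines produced by (1) and (2) are interchangeable for the serving step shown last.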
My Core Questions:
What are the functional and performance differences between an engine generated by trtllm-build and one generated by trtllm-bench --build for the same base model and architecture?
For the specific use case of serving a model using trtllm-serve --backend tensorrt, which build tool is the recommended or officially supported method to generate the engine?
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.