Commit d4f6819

QiJune authored and mikeiovine committed

[TRTLLM-9092][doc] link to modelopt checkpoints in quick start guide (#9571)

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>

1 parent 0406949 commit d4f6819

File tree

1 file changed: +2 −0 lines changed


docs/source/quick-start-guide.md

Lines changed: 2 additions, 0 deletions

````diff
@@ -31,6 +31,8 @@ Ensure your GPU supports FP8 quantization before running the following:
 trtllm-serve "nvidia/Qwen3-8B-FP8"
 ```
 
+For more options, browse the full [collection of generative models](https://huggingface.co/collections/nvidia/inference-optimized-checkpoints-with-model-optimizer) that have been quantized and optimized for inference with the TensorRT Model Optimizer.
+
 ```{note}
 If you are running `trtllm-serve` inside a Docker container, you have two options for sending API requests:
 1. Expose a port (e.g., 8000) to allow external access to the server from outside the container.
````
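For context on the note being edited here: once the container's port is exposed, the server can be queried from the host. A minimal sketch, assuming the server was started with its port published (e.g. `docker run -p 8000:8000 ...`) and that `trtllm-serve` exposes an OpenAI-compatible chat completions endpoint; the endpoint path and host are illustrative assumptions, not part of this commit:

```shell
# Hypothetical usage sketch: send a chat request to a trtllm-serve
# instance whose port 8000 has been published from the container.
# Assumes an OpenAI-compatible /v1/chat/completions endpoint.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "nvidia/Qwen3-8B-FP8",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
      }'
```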

Comments (0)