Commit d4f6819

QiJune authored and mikeiovine committed

[TRTLLM-9092][doc] link to modelopt checkpoints in quick start guide (#9571)

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>

1 parent 0406949 commit d4f6819

File tree

1 file changed: +2 −0 lines changed


docs/source/quick-start-guide.md

Lines changed: 2 additions, 0 deletions

````diff
@@ -31,6 +31,8 @@ Ensure your GPU supports FP8 quantization before running the following:
 trtllm-serve "nvidia/Qwen3-8B-FP8"
 ```
 
+For more options, browse the full [collection of generative models](https://huggingface.co/collections/nvidia/inference-optimized-checkpoints-with-model-optimizer) that have been quantized and optimized for inference with the TensorRT Model Optimizer.
+
 ```{note}
 If you are running `trtllm-serve` inside a Docker container, you have two options for sending API requests:
 1. Expose a port (e.g., 8000) to allow external access to the server from outside the container.
````
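For context on the note being edited here: once the container's port is exposed, the server can be queried from the host. A minimal sketch, assuming the server was started with its port published (e.g. `docker run -p 8000:8000 ...`) and that `trtllm-serve` exposes an OpenAI-compatible chat completions endpoint; the endpoint path and host are illustrative assumptions, not part of this commit:

```shell
# Hypothetical usage sketch: send a chat request to a trtllm-serve
# instance whose port 8000 has been published from the container.
# Assumes an OpenAI-compatible /v1/chat/completions endpoint.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "nvidia/Qwen3-8B-FP8",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
      }'
```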

Comments (0)