Commit 31b7dcb

sugunav14 and realAsma committed: Update examples/llm_qat/README.md

Co-authored-by: realAsma <[email protected]>
Signed-off-by: sugunav14 <[email protected]>

1 parent 02280d8 commit 31b7dcb
1 file changed: +1 −1 lines changed

1 file changed

+1
-1
lines changed

examples/llm_qat/README.md
Lines changed: 1 addition & 1 deletion

````diff
@@ -303,7 +303,7 @@ See more details on running LLM evaluation benchmarks [here](../llm_eval/README.

 The final model after QAT/QAD is similar in architecture to the PTQ model; the QAT model simply has updated weights compared to the PTQ model. It can be deployed to TensorRT-LLM (TRTLLM)/TensorRT/vLLM/SGLang just like a regular **ModelOpt** PTQ model if the quantization format is supported for deployment.

-To run QAT model with vLLM/TRTLLM, run:
+To export a TRTLLM/vLLM/SGLang compatible checkpoint for the model after QAT (or QAD), run:

 ```sh
 python export.py --pyt_ckpt_path llama3-qat --export_path llama3-qat-deploy
````
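For context, a minimal end-to-end sketch of the workflow this change documents. The export command comes from the diff above; the `vllm serve` step is an assumption for illustration (it is not part of this commit, and the exact flags needed for a ModelOpt-quantized checkpoint may vary by vLLM version):

```shell
# Export the QAT checkpoint to a deployment-ready checkpoint
# (export.py and its paths are taken from the README diff above)
python export.py --pyt_ckpt_path llama3-qat --export_path llama3-qat-deploy

# Serve the exported checkpoint with vLLM
# (assumed invocation; consult the vLLM docs for quantization-specific options)
vllm serve llama3-qat-deploy
```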
