
Commit fcddf20

yiliu30 and hmellor authored
Update _posts/2025-12-03-intel-autoround-llmc.md
Co-authored-by: Harry Mellor <[email protected]>
Signed-off-by: Yi Liu <[email protected]>
1 parent c64f548 commit fcddf20

File tree

1 file changed (+1, −1 lines changed)


_posts/2025-12-03-intel-autoround-llmc.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ For more details, please refer to the paper [AutoRound (EMNLP 2024)](https://acl
 
 ## Integration Overview
 
-We completed the first stage of integration by introducing the new `AutoRoundModifier` into LLM Compressor, enabling production of `wNa16` (e.g., W4A16) compressed models that seamlessly load in vLLM, as implemented in [PR #1994](https://github.com/vllm-project/llm-compressor/pull/1994). With a straightforward configuration—just specify your model and calibration data—you can quickly generate high‑quality low‑bit checkpoints. This initial stage supports quantizing a range of dense LLMs, including the **Llama** and **Qwen** model families, and demonstrates robust compatibility for practical deployment.
+We completed the first stage of integration by introducing the new `AutoRoundModifier` into LLM Compressor, enabling production of `W{n}A16` (e.g., W4A16) compressed models that seamlessly load in vLLM, as implemented in [PR #1994](https://github.com/vllm-project/llm-compressor/pull/1994). With a straightforward configuration—just specify your model and calibration data—you can quickly generate high‑quality low‑bit checkpoints. This initial stage supports quantizing a range of dense LLMs, including the **Llama** and **Qwen** model families, and demonstrates robust compatibility for practical deployment.
 
 ## Try It Now (Quickstart)
 
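The `W{n}A16` configuration described in the changed paragraph amounts to a one-shot quantization recipe: 4-bit (or other n-bit) integer weights with 16-bit activations. The sketch below is a hypothetical recipe fragment; the `AutoRoundModifier` field names are assumptions modeled on LLM Compressor's existing quantization-modifier recipes, not the exact schema merged in PR #1994.

```yaml
# Hypothetical LLM Compressor recipe sketch (assumed schema, W4A16).
# Field names follow the pattern of existing quantization modifiers;
# consult PR #1994 for the actual AutoRoundModifier options.
quant_stage:
  quant_modifiers:
    AutoRoundModifier:
      ignore: ["lm_head"]        # keep the output head in full precision
      config_groups:
        group_0:
          targets: ["Linear"]    # quantize linear layers
          weights:
            num_bits: 4          # the "4" in W4A16; activations stay 16-bit
            type: int
            symmetric: true
            strategy: group
            group_size: 128      # per-group weight scales
```

In this style of recipe, the model and calibration data are supplied separately to the one-shot entry point, so the recipe itself only describes how the weights are compressed.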
