Skip to content

Commit dff6185

Browse files
docs: fix typo in 'quantization-aware training' (#39904)
1 parent c7844c7 commit dff6185

File tree

1 file changed

+2
-2
lines changed

1 file changed

+2
-2
lines changed

docs/source/en/quantization/fp_quant.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
1616

1717
# FP-Quant
1818

19-
[FP-Quant](https://github.com/IST-DASLab/FP-Quant) is a family of quantization algorithms tailored for the Blackwell generation of Nvidia GPUs. The goal is to allow for efficient post-training quantization (PTQ) and quantization-aware trainin (QAT) of LLMs in the [MXFP4 and NVFP4 data-types](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf).
19+
[FP-Quant](https://github.com/IST-DASLab/FP-Quant) is a family of quantization algorithms tailored for the Blackwell generation of Nvidia GPUs. The goal is to allow for efficient post-training quantization (PTQ) and quantization-aware training (QAT) of LLMs in the [MXFP4 and NVFP4 data-types](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf).
2020

2121
Currently, only PTQ with MXFP4 is supported. Models can either be quantized on the fly with `quantization_config=FPQuantConfig()`:
2222

@@ -63,4 +63,4 @@ model.forward = torch.compile(model.forward, mode="max-autotune", fullgraph=True
6363

6464
FP-Quant currently performs best for very large batch size processing.
6565

66-
See [QuTLASS README](https://github.com/IST-DASLab/qutlass/blob/main/README.md) for speedups.
66+
See [QuTLASS README](https://github.com/IST-DASLab/qutlass/blob/main/README.md) for speedups.

0 commit comments

Comments
 (0)