
Commit 4f34f14

DN6 and stevhliu authored
Update docs/source/en/quantization/gguf.md
Co-authored-by: Steven Liu <[email protected]>
1 parent 82606cb commit 4f34f14

File tree

1 file changed (+1 −1 lines changed)


docs/source/en/quantization/gguf.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -23,7 +23,7 @@ Before starting please install gguf in your environment
 pip install -U gguf
 ```
 
-Since GGUF is a single file format, we will be using `from_single_file` to load the model and pass in the `GGUFQuantizationConfig` when loading the model.
+Since GGUF is a single file format, use [`~FromSingleFileMixin.from_single_file`] to load the model and pass in the [`GGUFQuantizationConfig`] when loading the model.
 
 When using GGUF checkpoints, the quantized weights remain in a low memory `dtype`, typically `torch.uint8`, and are dynamically dequantized and cast to the configured `compute_dtype` when running a forward pass through each module in the model. The `GGUFQuantizationConfig` allows you to set the `compute_dtype` for the forward pass of each module. The functions used for dynamic dequantization are based on the great work done by [city96](https://github.com/city96/ComfyUI-GGUF).
 
````
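For reference, the loading pattern that the updated line documents looks roughly like this. This is a minimal sketch, not part of the commit: the FLUX model class, the city96 GGUF checkpoint URL, and the `compute_dtype` choice are illustrative assumptions.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Illustrative single-file GGUF checkpoint (assumption: any compatible GGUF
# checkpoint URL or local path works here).
ckpt_path = "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q2_K.gguf"

# Load the quantized weights from one GGUF file. The weights stay in their
# low-memory GGUF dtypes and are dequantized to `compute_dtype` on the fly
# during each module's forward pass.
transformer = FluxTransformer2DModel.from_single_file(
    ckpt_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# The quantized transformer can then be dropped into a pipeline as usual.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
```

Because dequantization happens per module during the forward pass, memory usage stays close to the quantized footprint while compute runs in the configured `compute_dtype`.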

0 commit comments
