
Commit d6e3de0

committed
finish
1 parent 3ed52be commit d6e3de0

File tree

1 file changed: +3 −8 lines changed


docs/source/en/quantization/bitsandbytes.md

Lines changed: 3 additions & 8 deletions
````diff
@@ -418,15 +418,10 @@ transformer_4bit.dequantize()
 
 ## torch.compile
 
+Speed up inference with `torch.compile`. Make sure you have the latest `bitsandbytes` installed and we also recommend installing [PyTorch nightly](https://pytorch.org/get-started/locally/).
+
 <hfoptions id="bnb">
 <hfoption id="8-bit">
-
-Speed up inference with `torch.compile`. Make sure you have
-the latest `bitsandbytes` installed and we also recommend installing [PyTorch nightly](https://pytorch.org/get-started/locally/).
-
-```py
-pip install -U bitsandbytes
-```
 ```py
 torch._dynamo.config.capture_dynamic_output_shape_ops = True
 
@@ -456,7 +451,7 @@ transformer_4bit.compile(fullgraph=True)
 </hfoption>
 </hfoptions>
 
-On a RTX 4090 with compilation, 4-bit Flux generation completed in 25.809 seconds versus 32.570 seconds without.
+On an RTX 4090 with compilation, 4-bit Flux generation completed in 25.809 seconds versus 32.570 seconds without.
 
 Check out the [benchmarking script](https://gist.github.com/sayakpaul/0db9d8eeeb3d2a0e5ed7cf0d9ca19b7d) for more details.
````
