
Commit c00402f

Fixed a bug in absmax float conversion.
1 parent 6747525 commit c00402f

File tree

2 files changed: +2 −1 lines changed


CHANGELOG.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -268,6 +268,7 @@ Features:
 Bug fixes:
 - Fixed a bug where the default type of absmax was undefined which leads to errors if the default type is different than torch.float32. # 553
 - Fixed a missing scipy dependency in requirements.txt. #544
+- Fixed a bug, where a view operation could cause an error in 8-bit layers.

 Documentation:
 - Improved documentation for GPUs that do not support 8-bit matmul. #529
```

bitsandbytes/functional.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -685,10 +685,10 @@ def dequantize_blockwise(

     absmax, code, blocksize, nested, dtype, offset, state2 = quant_state

-    if absmax.dtype != torch.float32: absmax = absmax.float()
     if nested:
         absmax = dequantize_blockwise(absmax, state2)
         absmax += offset
+    if absmax.dtype != torch.float32: absmax = absmax.float()

     if out is None:
         out = torch.empty(A.shape, dtype=dtype, device=A.device)
```
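The reordering matters because, when `nested` is true, `absmax` holds 8-bit quantization codes rather than float magnitudes; casting to float32 before the nested dequantization corrupts the codes. A minimal sketch of the ordering issue, using numpy in place of torch and a hypothetical `dequantize_codes` helper (not the bitsandbytes API):

```python
import numpy as np

def dequantize_codes(codes: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    # Map uint8 quantization codes back to the float values they encode.
    # Hypothetical stand-in for the nested dequantize_blockwise step.
    return codebook[codes]

codebook = np.linspace(-1.0, 1.0, 256, dtype=np.float16)  # 8-bit code -> value
codes = np.array([0, 128, 255], dtype=np.uint8)           # nested-quantized absmax

# Buggy order: casting `codes` to float32 first yields 0.0, 128.0, 255.0 --
# meaningless magnitudes that are no longer valid codebook indices.
# Fixed order (as in this commit): dequantize the nested state first,
# then upcast the result to float32.
absmax = dequantize_codes(codes, codebook)
absmax = absmax.astype(np.float32)
```

With this ordering, `absmax` recovers the encoded endpoints (−1.0 and 1.0 here) in full float32 precision.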
