
Commit c0079d4

Isotr0py authored and fhl2000 committed
[Bugfix] Fix bnb 8bit model weights loading (vllm-project#19917)
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: fhl <[email protected]>
1 parent: cfbaaff · commit: c0079d4

File tree

1 file changed (+2, −2 lines)


vllm/model_executor/model_loader/bitsandbytes_loader.py

Lines changed: 2 additions & 2 deletions
@@ -577,10 +577,10 @@ def dequantize_dq(quant_states: dict) -> None:
     thereby avoiding this computational overhead during inference. This comes
     at the cost of increased memory usage.
     """
-    from bitsandbytes.functional import dequantize_blockwise
+    from bitsandbytes.functional import QuantState, dequantize_blockwise
     for _, quant_state in quant_states.items():
         # Copied from: https://github.com/bitsandbytes-foundation/bitsandbytes/blob/0.45.3/bitsandbytes/functional.py#L1352-#L1356
-        if quant_state.nested:
+        if isinstance(quant_state, QuantState) and quant_state.nested:
             absmax = dequantize_blockwise(quant_state.absmax,
                                           quant_state.state2)
             absmax += quant_state.offset
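
For context, below is a minimal standalone sketch (not the vLLM loader itself) of the failure mode this patch guards against: when loading bitsandbytes 8-bit checkpoints, a value in the quant_states dict may be a plain tensor rather than a bitsandbytes.functional.QuantState, so the old unconditional quant_state.nested access could raise AttributeError. The helper name, the in-place absmax write-back, and the example dict key are illustrative assumptions, not vLLM's actual code.

import torch
from bitsandbytes.functional import QuantState, dequantize_blockwise

def dequantize_dq_sketch(quant_states: dict) -> None:
    # Unpack doubly-quantized ("nested") absmax values in place.
    # Entries that are not QuantState objects (e.g. plain 8-bit scale
    # tensors) are skipped instead of crashing on `.nested`.
    for _, quant_state in quant_states.items():
        if isinstance(quant_state, QuantState) and quant_state.nested:
            # Recover the first-level absmax from its blockwise-quantized form.
            absmax = dequantize_blockwise(quant_state.absmax,
                                          quant_state.state2)
            absmax += quant_state.offset
            # Assumed write-back, mirroring the "avoid overhead at
            # inference" intent described in the docstring above.
            quant_state.absmax = absmax
            quant_state.nested = False

# An 8-bit entry that is a bare tensor no longer raises AttributeError:
states = {"model.layers.0.mlp.weight": torch.ones(4)}  # hypothetical key
dequantize_dq_sketch(states)  # safely skips the non-QuantState entry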
