Commit acc1a6e

Authored by Jun-Howie, JunHowie, and Isotr0py
Fix the bug related to loading GPTQ INT3 weights. (#23328)
Signed-off-by: JunHowie <[email protected]>
Co-authored-by: JunHowie <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
1 parent 8c742a6 commit acc1a6e

1 file changed: +2 -1 lines changed

vllm/model_executor/layers/quantization/utils/gptq_utils.py

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: Apache-2.0
 # SPDX-FileCopyrightText: Copyright contributors to the vLLM project
 from copy import deepcopy
+from fractions import Fraction
 from typing import Optional, Union
 
 import regex as re
@@ -29,7 +30,7 @@ def override_config(config: QuantizationConfig, prefix: str):
     if isinstance(desc_act, bool):
         config.desc_act = desc_act
 
-    config.pack_factor = 32 // config.weight_bits  # packed into int32
+    config.pack_factor = Fraction(32, config.weight_bits)  # packed into int32
     if config.get_name() == "gptq_marlin":
         is_sym = get_dynamic_override(config, prefix, "sym", config.is_sym)
         if isinstance(is_sym, bool):
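
The effect of the one-line change is easiest to see with numbers. The following is a minimal sketch, not vLLM code: the helper `packed_rows` and the dimension 4096 are hypothetical, chosen only to illustrate why floor division is likely the source of the INT3 loading problem. With 3-bit weights, 32 is not a multiple of the bit width, so `32 // 3` rounds the pack factor down to 10, while the exact ratio is 32/3.

```python
from fractions import Fraction


def packed_rows(in_features: int, weight_bits: int, exact: bool) -> int:
    """Hypothetical helper: int32 rows needed to pack `in_features` weights.

    exact=True mirrors the new code (Fraction pack factor),
    exact=False mirrors the old code (floor-divided pack factor).
    """
    if exact:
        pack_factor = Fraction(32, weight_bits)  # e.g. Fraction(32, 3) for INT3
        rows = in_features / pack_factor  # int / Fraction yields a Fraction
        # Assumption of this sketch: the dimension packs into whole int32 words.
        assert rows.denominator == 1, "dimension does not pack evenly"
        return int(rows)
    pack_factor = 32 // weight_bits  # 10 for INT3, which loses 2 bits per word
    return in_features // pack_factor


if __name__ == "__main__":
    for bits in (3, 4, 8):
        old = packed_rows(4096, bits, exact=False)
        new = packed_rows(4096, bits, exact=True)
        print(f"bits={bits}: floor-divided={old}, Fraction={new}")
```

For bit widths that divide 32 evenly (4 and 8 here) both variants agree, which is consistent with the bug surfacing only for INT3. For 3 bits the two variants give different packed sizes, so a shape computed with the floor-divided factor would not match a checkpoint packed at the exact ratio.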
