Commit e9fab4f

fix bug of deepseek group_size setting (NVIDIA#3860)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
Parent: e6c14ca

File tree: 1 file changed (+3, −0)


tensorrt_llm/_torch/model_config.py

Lines changed: 3 additions & 0 deletions

```diff
@@ -122,6 +122,9 @@ def from_pretrained(cls,
                     'group_size', None)
                 mixed_quant_configs[layer] = config
             layer_quant_config = mixed_quant_configs
+        elif quant_config.quant_algo == QuantAlgo.FP8_BLOCK_SCALES:
+            if quant_config.group_size is None:
+                quant_config.group_size = 128

         if kwargs.get(
                 'moe_backend'
```
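The defaulting logic the patch adds can be sketched in isolation. The snippet below is a simplified stand-in, not the real TensorRT-LLM code: `QuantAlgo` and `QuantConfig` here are minimal mock-ups that only mirror the two attributes the diff touches, and `apply_group_size_default` is a hypothetical helper wrapping the new `elif` branch.

```python
# Minimal sketch of the fix: when a checkpoint uses FP8 block-scale
# quantization but omits group_size, fall back to 128 (the block width
# used by DeepSeek-style FP8 block-scale weights). QuantAlgo and
# QuantConfig are simplified stand-ins, not the TensorRT-LLM classes.
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class QuantAlgo(Enum):
    FP8_BLOCK_SCALES = "FP8_BLOCK_SCALES"
    W4A16 = "W4A16"


@dataclass
class QuantConfig:
    quant_algo: QuantAlgo
    group_size: Optional[int] = None


def apply_group_size_default(quant_config: QuantConfig) -> QuantConfig:
    # Mirrors the new elif branch: only FP8_BLOCK_SCALES configs with an
    # unset group_size are touched; explicit values are left alone.
    if quant_config.quant_algo == QuantAlgo.FP8_BLOCK_SCALES:
        if quant_config.group_size is None:
            quant_config.group_size = 128
    return quant_config


cfg = apply_group_size_default(QuantConfig(QuantAlgo.FP8_BLOCK_SCALES))
print(cfg.group_size)  # → 128
```

Note that the patch only fills in a missing value; a checkpoint that already specifies `group_size` keeps its own setting, and other quantization algorithms are unaffected.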
