Open
Labels: bug
Describe the bug
When I tried to quantize the GLM MoE model GLM-4.5 using examples/llm_ptq/hf_ptq.py, I hit the following error:
```
Traceback (most recent call last):
  File "/TensorRT-Model-Optimizer/examples/llm_ptq/hf_ptq.py", line 772, in <module>
    main(args)
  File "/TensorRT-Model-Optimizer/examples/llm_ptq/hf_ptq.py", line 625, in main
    export_hf_checkpoint(
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 545, in export_hf_checkpoint
    raise e
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 511, in export_hf_checkpoint
    post_state_dict, hf_quant_config = _export_hf_checkpoint(model, dtype)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 461, in _export_hf_checkpoint
    _export_quantized_weight(sub_module, dtype)
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 266, in _export_quantized_weight
    quantizer_attrs.weight_scale, get_weight_scaling_factor(sub_module, weight_name)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/quant_utils.py", line 275, in get_weight_scaling_factor
    NVFP4QTensor.get_weights_scaling_factor_2_from_quantizer(weight_quantizer).to(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/TensorRT-Model-Optimizer/modelopt/torch/quantization/qtensor/nvfp4_tensor.py", line 59, in get_weights_scaling_factor_2_from_quantizer
    assert hasattr(weight_quantizer, "_amax"), "Weight quantizer does not have attribute amax"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Weight quantizer does not have attribute amax
```
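The failing check at the bottom of the trace is a plain `hasattr` assertion: the NVFP4 export path expects every weight quantizer to carry a calibrated `_amax` (the absolute-max statistic recorded during calibration) before it can derive scaling factors. A minimal stand-alone sketch of that failure mode, using a hypothetical stub class rather than the real ModelOpt quantizer:

```python
class StubWeightQuantizer:
    """Hypothetical stand-in for a quantizer that was never calibrated,
    so it never recorded an `_amax` statistic."""


def get_weights_scaling_factor_2(weight_quantizer):
    # Mirrors the assertion seen at nvfp4_tensor.py line 59 in the trace.
    assert hasattr(weight_quantizer, "_amax"), (
        "Weight quantizer does not have attribute amax"
    )
    return weight_quantizer._amax  # scale derivation elided; illustration only


try:
    get_weights_scaling_factor_2(StubWeightQuantizer())
except AssertionError as e:
    print(e)  # → Weight quantizer does not have attribute amax
```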
Steps/Code to reproduce bug
```shell
MODEL_PATH=xxx
SAVE_PATH=xxx
QFORMAT=nvfp4
CALIB_SIZE=64
CALIB_BATCH_SIZE=1
EXPORT_FORMAT=hf
python hf_ptq.py \
    --pyt_ckpt_path=$MODEL_PATH \
    --export_path=$SAVE_PATH \
    --qformat=$QFORMAT \
    --calib_size=$CALIB_SIZE \
    --batch_size=$CALIB_BATCH_SIZE \
    --export_fmt=$EXPORT_FORMAT \
    --kv_cache_qformat=none \
    --trust_remote_code
```
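As a debugging aid, one could inspect the calibrated model right before export to see which quantizers are missing `_amax`. This is a sketch, not ModelOpt API: `named_modules()` is the standard PyTorch module traversal, and the `weight_quantizer` name suffix is an assumption taken from the attribute shown in the traceback. The helper works on any object exposing a `named_modules()` iterable:

```python
def find_uncalibrated_weight_quantizers(model):
    """Return names of submodules that look like weight quantizers but
    carry no `_amax` statistic (i.e. were never calibrated)."""
    return [
        name
        for name, module in model.named_modules()
        if name.endswith("weight_quantizer") and not hasattr(module, "_amax")
    ]
```

For a MoE model, a plausible cause of uncalibrated quantizers is that with a small calibration set (here CALIB_SIZE=64) some experts never receive routed tokens, so their weight quantizers never record an amax; if the helper above reports only expert layers, that would point in this direction.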
System information
- ModelOpt 0.37.0.dev30