Bug when quantizing GLM-MoE model #328

@mxjmtxrm

Description

Describe the bug

When I tried to quantize the GLM-4.5 MoE model with examples/llm_ptq/hf_ptq.py, I hit the following error:

Traceback (most recent call last):
  File "/TensorRT-Model-Optimizer/examples/llm_ptq/hf_ptq.py", line 772, in <module>
    main(args)
  File "/TensorRT-Model-Optimizer/examples/llm_ptq/hf_ptq.py", line 625, in main
    export_hf_checkpoint(
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 545, in export_hf_checkpoint
    raise e
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 511, in export_hf_checkpoint
    post_state_dict, hf_quant_config = _export_hf_checkpoint(model, dtype)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 461, in _export_hf_checkpoint
    _export_quantized_weight(sub_module, dtype)
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/unified_export_hf.py", line 266, in _export_quantized_weight
    quantizer_attrs.weight_scale, get_weight_scaling_factor(sub_module, weight_name)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/TensorRT-Model-Optimizer/modelopt/torch/export/quant_utils.py", line 275, in get_weight_scaling_factor
    NVFP4QTensor.get_weights_scaling_factor_2_from_quantizer(weight_quantizer).to(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/TensorRT-Model-Optimizer/modelopt/torch/quantization/qtensor/nvfp4_tensor.py", line 59, in get_weights_scaling_factor_2_from_quantizer
    assert hasattr(weight_quantizer, "_amax"), "Weight quantizer does not have attribute amax"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Weight quantizer does not have attribute amax
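
The assertion fires because the weight quantizer never recorded an _amax during calibration. For an MoE model, a plausible cause (my assumption, not confirmed) is that some GLM-4.5 experts receive no tokens from the small calibration set, so their weight quantizers are never calibrated. A minimal diagnostic sketch, assuming ModelOpt's standard TensorQuantizer class and its is_enabled property; find_uncalibrated is a hypothetical helper name:

from modelopt.torch.quantization.nn import TensorQuantizer

def find_uncalibrated(model):
    # List enabled weight quantizers with no recorded amax. Input quantizers
    # are skipped because nvfp4 quantizes activations dynamically (no amax).
    return [
        name
        for name, module in model.named_modules()
        if isinstance(module, TensorQuantizer)
        and name.endswith("weight_quantizer")
        and module.is_enabled
        and getattr(module, "_amax", None) is None
    ]

# e.g. call in hf_ptq.py after mtq.quantize(...) and before export:
# print(find_uncalibrated(model))

If this prints expert layer names, increasing --calib_size so every expert is routed at least once might avoid the assertion.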

Steps/Code to reproduce bug

MODEL_PATH=xxx
SAVE_PATH=xxx
QFORMAT=nvfp4
CALIB_SIZE=64
CALIB_BATCH_SIZE=1
EXPORT_FORMAT=hf
python hf_ptq.py \
    --pyt_ckpt_path=$MODEL_PATH \
    --export_path=$SAVE_PATH \
    --qformat=$QFORMAT \
    --calib_size=$CALIB_SIZE \
    --batch_size=$CALIB_BATCH_SIZE \
    --export_fmt=$EXPORT_FORMAT \
    --kv_cache_qformat=none \
    --trust_remote_code
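
A possible workaround sketch, not a confirmed fix: disable the uncalibrated weight quantizers before export_hf_checkpoint so the affected layers are exported unquantized. TensorQuantizer.disable() is the standard ModelOpt toggle; disable_uncalibrated is a hypothetical helper, and whether a checkpoint with a few unquantized experts is usable downstream is an assumption:

import torch
from modelopt.torch.quantization.nn import TensorQuantizer

def disable_uncalibrated(model: torch.nn.Module) -> list[str]:
    # Hypothetical workaround: turn off weight quantizers that have no
    # recorded amax so export does not hit the assertion; those layers
    # stay in the original precision.
    disabled = []
    for name, module in model.named_modules():
        if (
            isinstance(module, TensorQuantizer)
            and name.endswith("weight_quantizer")
            and getattr(module, "_amax", None) is None
        ):
            module.disable()
            disabled.append(name)
    return disabled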

System information

  • ModelOpt 0.37.0.dev30
