Description
Does TensorRT-Model-Optimizer support GPT-OSS-120b W4A4 Quantization?
I am running on a B200 with modelopt==0.35.
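The invocation was roughly the following (paths are placeholders, and nvfp4 is assumed here as the W4A4 qformat on Blackwell, so the exact flags may differ slightly):

python examples/quantization/quantize.py \
    --model_dir /path/to/gpt-oss-120b \
    --dtype bfloat16 \
    --qformat nvfp4 \
    --output_dir /path/to/gpt-oss-120b-nvfp4 \
    --calib_size 512

It fails with the following error: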
/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py:548: UserWarning: Cannot export model to the model_config. The modelopt-optimized model state_dict can be saved with torch.save for further inspection.
warn(
Traceback (most recent call last):
File "/workspace/TensorRT-LLM/examples/quantization/quantize.py", line 160, in
quantize_and_export(
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/quantization/quantize_by_modelopt.py", line 817, in quantize_and_export
export_tensorrt_llm_checkpoint(
File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py", line 552, in export_tensorrt_llm_checkpoint
raise e
File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py", line 486, in export_tensorrt_llm_checkpoint
for (
File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py", line 288, in torch_to_tensorrt_llm_checkpoint
layer_config = build_decoder_config(
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/layer_utils.py", line 1628, in build_decoder_config
config.mlp = build_mlp_config(
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/layer_utils.py", line 868, in build_mlp_config
assert config.proj is not None and config.fc is not None, "proj or fc can not be found"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: proj or fc can not be found
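The assertion comes from the TensorRT-LLM export path expecting a dense MLP with proj/fc style submodules. As a quick structural check, here is a minimal sketch (assumptions: the Hugging Face model id openai/gpt-oss-120b and a LLaMA-style model.model.layers[i].mlp attribute path) that prints the MLP submodule names without materializing any weights:

from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

# Build the model skeleton on the meta device so no weights are downloaded or loaded.
config = AutoConfig.from_pretrained("openai/gpt-oss-120b")  # assumed HF model id
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

# List the submodule names of the first decoder layer's MLP; build_mlp_config
# raises the assertion above when none of the proj/fc names it expects appear here.
mlp = model.model.layers[0].mlp  # assumed LLaMA-style attribute path
for name, module in mlp.named_modules():
    print(name or "(root)", type(module).__name__)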