Is GPT-OSS-120b W4A4 quantization supported? #335

@ryan-yide

Description

Does TensorRT-Model-Optimizer support W4A4 quantization for GPT-OSS-120b?
I am running on a B200 with Model Optimizer 0.35, and it fails with the following error:
/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py:548: UserWarning: Cannot export model to the model_config. The modelopt-optimized model state_dict can be saved with torch.save for further inspection.
  warn(
Traceback (most recent call last):
  File "/workspace/TensorRT-LLM/examples/quantization/quantize.py", line 160, in <module>
    quantize_and_export(
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/quantization/quantize_by_modelopt.py", line 817, in quantize_and_export
    export_tensorrt_llm_checkpoint(
  File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py", line 552, in export_tensorrt_llm_checkpoint
    raise e
  File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py", line 486, in export_tensorrt_llm_checkpoint
    for (
  File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/model_config_export.py", line 288, in torch_to_tensorrt_llm_checkpoint
    layer_config = build_decoder_config(
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/layer_utils.py", line 1628, in build_decoder_config
    config.mlp = build_mlp_config(
                 ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/modelopt/torch/export/layer_utils.py", line 868, in build_mlp_config
    assert config.proj is not None and config.fc is not None, "proj or fc can not be found"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: proj or fc can not be found
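
For reference, the call path in the traceback corresponds roughly to the sketch below. This is my reconstruction, not the exact code quantize.py executes; the model id openai/gpt-oss-120b, the choice of NVFP4_DEFAULT_CFG as the W4A4 recipe, decoder_type="gpt", and the export directory are assumptions on my part.

```python
# Hypothetical minimal reproduction of the path shown in the traceback above.
# Model id, quantization config, and decoder_type are assumptions, not the
# exact arguments quantize.py passes internally.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

import modelopt.torch.quantization as mtq
from modelopt.torch.export import export_tensorrt_llm_checkpoint

model_id = "openai/gpt-oss-120b"  # assumed HF model id
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def calib_loop(m):
    # Tiny stand-in calibration pass; the real script runs a calibration dataset.
    inputs = tokenizer("Hello, world!", return_tensors="pt").to(m.device)
    with torch.no_grad():
        m(**inputs)

# Assuming NVFP4 (4-bit weights and 4-bit activations) is the "W4A4" recipe.
model = mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop=calib_loop)

# This export step is where "proj or fc can not be found" is raised.
export_tensorrt_llm_checkpoint(
    model,
    decoder_type="gpt",  # unsure which decoder_type GPT-OSS should map to
    dtype=torch.bfloat16,
    export_dir="/tmp/gpt-oss-120b-w4a4-ckpt",
)
```

The assertion is raised inside build_mlp_config, which suggests the exporter does not find the expected MLP projection layers for this model.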
