Conversation

@notsyncing

What does this PR do?

Hello, I'm trying to export a Qwen2.5 GPTQ model loaded with gptqmodel to OpenVINO, and it errors:


INFO  ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.                                                       
INFO  ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.                                                                               
`low_cpu_mem_usage` was None, now default to True since model is quantized.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
INFO   Kernel: Auto-selection: adding candidate `IPEXQuantLinear`                                                                                   
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
Traceback (most recent call last):
  File "/var/home/sfc/Projects/azarrot-py312/.venv/bin/azarrot", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/src/azarrot/server.py", line 347, in main
    server = create_server()
             ^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/src/azarrot/server.py", line 279, in create_server
    model_manager = ModelManager(config, backends)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/src/azarrot/models/model_manager.py", line 57, in __init__
    self.refresh_models()
  File "/var/home/sfc/Projects/azarrot-py312/src/azarrot/models/model_manager.py", line 204, in refresh_models
    model_info = backend.load_model(model)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/src/azarrot/backends/transformers_based_backend.py", line 157, in load_model
    transformers_model: Any = model_class.from_pretrained(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/optimum/intel/openvino/modeling_base.py", line 485, in from_pretrained
    return super().from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/optimum/modeling_base.py", line 438, in from_pretrained
    return from_pretrained_method(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/optimum/intel/openvino/modeling_decoder.py", line 319, in _from_transformers
    main_export(
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/optimum/exporters/openvino/__main__.py", line 386, in main_export
    model = TasksManager.get_model_from_task(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/optimum/exporters/tasks.py", line 2283, in get_model_from_task
    model = model_class.from_pretrained(model_name_or_path, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 262, in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4400, in from_pretrained
    hf_quantizer.postprocess_model(model, config=config)
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/transformers/quantizers/base.py", line 207, in postprocess_model
    return self._process_model_after_weight_loading(model, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/transformers/quantizers/quantizer_gptq.py", line 111, in _process_model_after_weight_loading
    model = self.optimum_quantizer.post_init_model(model)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/optimum/exporters/openvino/__main__.py", line 345, in post_init_model
    from auto_gptq import exllama_set_max_input_length
ModuleNotFoundError: No module named 'auto_gptq'
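
For context, a minimal sketch of the kind of call that hits this path (the model id and loading code here are illustrative, not the exact azarrot code):

```python
# Hypothetical repro: exporting a GPTQ checkpoint to OpenVINO on the fly.
from optimum.intel import OVModelForCausalLM

# export=True routes through main_export -> post_init_model, where the
# unconditional `from auto_gptq import exllama_set_max_input_length`
# fails if only gptqmodel is installed.
model = OVModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4",  # illustrative GPTQ model id
    export=True,
)
```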

I found that gptqmodel provides the same exllama_set_max_input_length function as auto_gptq, so I added an import that tries gptqmodel first, and it works.
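
In code terms, the fix amounts to a fallback import along these lines (a minimal sketch of the idea, not the exact diff):

```python
# Prefer gptqmodel, which provides the same function; fall back to
# auto_gptq for environments that still have it installed.
try:
    from gptqmodel import exllama_set_max_input_length
except ImportError:
    from auto_gptq import exllama_set_max_input_length
```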

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@IlyasMoutawwakil (Member) left a comment

Thanks for the catch! auto_gptq is being deprecated, so we should probably also warn the user about that and advise using gptqmodel instead.
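
A possible shape for that warning, sketched on top of the fallback above (hypothetical wording, not part of this PR):

```python
import logging

logger = logging.getLogger(__name__)

try:
    from gptqmodel import exllama_set_max_input_length
except ImportError:
    logger.warning(
        "auto_gptq is deprecated and will no longer be supported; "
        "please install gptqmodel instead."
    )
    from auto_gptq import exllama_set_max_input_length
```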

@notsyncing force-pushed the fix-gptqmodel-exporting branch from 8e1aab1 to 5d29d0f on May 6, 2025 at 13:39
@notsyncing force-pushed the fix-gptqmodel-exporting branch from 5d29d0f to d642dde on May 9, 2025 at 14:26
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

