Cannot convert Gemma-3-4B QAT to ONNX #67

@kinchahoy

System Info

onnx                     1.17.0
onnxruntime              1.21.1
onnxslim                 0.1.50
optimum                  1.24.0
transformers             4.52.0.dev0
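
For anyone comparing environments, a version list like the one above can be collected from inside the virtual environment with the standard library. This is a minimal sketch; `installed_versions` is a hypothetical helper name, not part of optimum or transformers.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    names = ["onnx", "onnxruntime", "onnxslim", "optimum", "transformers"]
    for name, ver in installed_versions(names).items():
        print(f"{name:<24} {ver if ver is not None else 'not installed'}")
```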

Who can help?

optimum-cli export onnx does not support Gemma 3

❯ optimum-cli export onnx --model gemma-3-4b-it-qat-q4_0-unquantized gemma-3-4b/
Traceback (most recent call last):
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/bin/optimum-cli", line 4, in <module>
    from optimum.commands.optimum_cli import main
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/commands/__init__.py", line 17, in <module>
    from .export import ExportCommand, ONNXExportCommand, TFLiteExportCommand
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/commands/export/__init__.py", line 16, in <module>
    from .base import ExportCommand
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/commands/export/base.py", line 18, in <module>
    from .onnx import ONNXExportCommand
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/commands/export/onnx.py", line 23, in <module>
    from ...exporters import TasksManager
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/exporters/__init__.py", line 16, in <module>
    from .tasks import TasksManager # noqa
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/exporters/tasks.py", line 173, in <module>
    class TasksManager:
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/exporters/tasks.py", line 348, in TasksManager
    "t5-encoder": supported_tasks_mapping(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/exporters/tasks.py", line 117, in supported_tasks_mapping
    importlib.import_module(f"optimum.exporters.{backend}.model_configs"), config_cls_name
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/X/.local/share/uv/python/cpython-3.12.5-linux-x86_64-gnu/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/exporters/onnx/model_configs.py", line 72, in <module>
    from .base import ConfigBehavior, OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/exporters/onnx/base.py", line 55, in <module>
    from .model_patcher import ModelPatcher, Seq2SeqModelPatcher
  File "/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/optimum/exporters/onnx/model_patcher.py", line 36, in <module>
    from transformers.models.clip.modeling_clip import CLIPAttention, CLIPSdpaAttention
ImportError: cannot import name 'CLIPSdpaAttention' from 'transformers.models.clip.modeling_clip' (/home/X/ml-projects/convert-models/to-onnx/.venv/lib/python3.12/site-packages/transformers/models/clip/modeling_clip.py). Did you mean: 'CLIPAttention'?
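
The ImportError above is raised while optimum itself is being imported, before the Gemma checkpoint is read. That suggests a mismatch between the installed optimum 1.24.0 and transformers 4.52.0.dev0 rather than something specific to Gemma 3, although this is an inference from the traceback and not a confirmed diagnosis. A minimal sketch for checking it in the same virtual environment follows; `symbol_available` is a hypothetical helper, not part of either library.

```python
import importlib


def symbol_available(module_name: str, symbol: str) -> bool:
    """Return True if module_name imports cleanly and exposes symbol."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its parent packages) is not importable here.
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    # Module path and class names are taken from the traceback.
    module_name = "transformers.models.clip.modeling_clip"
    for symbol in ("CLIPAttention", "CLIPSdpaAttention"):
        print(symbol, symbol_available(module_name, symbol))
```

If `CLIPAttention` is reported as available and `CLIPSdpaAttention` is not, the installed transformers build no longer exports the class that this optimum release imports at line 36 of `model_patcher.py`, so the CLI would fail the same way for any model.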

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

optimum-cli export onnx --model gemma-3-4b-it-qat-q4_0-unquantized gemma-3-4b/

Expected behavior

Gemma-3-4b-it QAT is probably the strongest current <10B-parameter quantized model and should be supported for ONNX export.
