New Features
- Model Optimizer for Windows now supports the NvTensorRtRtx execution provider.
New Features
- New LLMs such as DeepSeek are now supported with ONNX INT4 AWQ quantization on Windows. Refer to the Windows Support Matrix for details about supported features and models.
- Model Optimizer for Windows now supports ONNX INT8 and FP8 (W8A8) quantization of SAM2 and Whisper models. See the example scripts to get started with quantizing these models.
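In a W8A8 scheme, both weights and activations are stored as 8-bit values with a floating-point scale. As a conceptual illustration only (not Model Optimizer's actual API or calibration), a minimal symmetric per-tensor INT8 quantizer in NumPy might look like:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: q = round(x / scale)."""
    # map the largest magnitude to the positive INT8 limit (127)
    scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the quantized tensor."""
    return q.astype(np.float32) * scale

x = np.array([-1.0, -0.5, 0.0, 0.25, 1.0], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# quantization error is bounded by half a quantization step
assert np.max(np.abs(x - x_hat)) <= scale / 2 + 1e-6
```

Real toolchains additionally calibrate activation ranges on sample data and insert QuantizeLinear/DequantizeLinear nodes into the ONNX graph; this sketch only shows the core rounding-and-scaling step.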
New Features
- This is the first official release of Model Optimizer for Windows.
- ONNX INT4 Quantization: :meth:`modelopt.onnx.quantization.quantize_int4 <modelopt.onnx.quantization.int4.quantize>` supports INT4 quantization of ONNX models for DirectML and TensorRT* deployment. See :ref:`Support_Matrix` for details about supported features and models.
- LLM Quantization with Olive: Enabled LLM quantization through Olive, streamlining model optimization workflows. Refer to the example.
- DirectML Deployment Guide: Added a DirectML (DML) deployment guide. See :ref:`DirectML_Deployment`.
- MMLU Benchmark for Accuracy Evaluations: Introduced MMLU benchmarking for accuracy evaluation of ONNX models on DirectML (DML).
- Published quantized ONNX models collection: Published quantized ONNX models in the NVIDIA collections on Hugging Face.
* This version includes experimental features such as TensorRT deployment of ONNX INT4 models, PyTorch quantization, and sparsity. These are currently unverified on Windows.
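INT4 quantization typically assigns one scale per small block of weights rather than one per tensor, which preserves accuracy at 4 bits. The sketch below is a self-contained NumPy illustration of that blockwise idea, not the :meth:`quantize_int4` implementation; the real API also applies AWQ calibration, and the `block_size` value here is an illustrative assumption:

```python
import numpy as np

def quantize_int4_blockwise(w: np.ndarray, block_size: int = 32):
    """Blockwise symmetric INT4 quantization: each block shares one scale."""
    assert w.size % block_size == 0
    blocks = w.reshape(-1, block_size)
    # per-block scale maps the largest magnitude to the INT4 positive limit (7)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(blocks / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover approximate float weights from blocks and per-block scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(128).astype(np.float32)
q, scales = quantize_int4_blockwise(w)
w_hat = dequantize(q, scales)
# all quantized values fit in a signed 4-bit range
assert q.min() >= -8 and q.max() <= 7
```

Smaller blocks give finer-grained scales (better accuracy, more overhead); AWQ-style methods additionally rescale salient weight channels using activation statistics before rounding.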