@ajrasane (Contributor) commented Dec 5, 2025

What does this PR do?

Type of change: New Feature

Overview:

  • Enable ONNX export for auto quantized models
  • Update documentation and changelog

Usage

python torch_quant_to_onnx.py --quantize_mode=auto \
	--onnx_save_path=./vit_base_patch16_224.nvfp4_fp8.onnx \
	--calibration_data_size 64 \
	--auto_quantization_formats NVFP4_AWQ_LITE_CFG FP8_DEFAULT_CFG \
	--batch_size 128
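
Note that `--auto_quantization_formats` takes several space-separated values. The parser inside `torch_quant_to_onnx.py` is not shown in this PR, so the following is only a hypothetical sketch of how flags shaped like the ones above could be parsed with `argparse`. The types and the use of `nargs="+"` are assumptions, not the script's actual definitions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: mirrors only the flag names from the usage command.
    # Types and nargs are assumptions, not taken from torch_quant_to_onnx.py.
    parser = argparse.ArgumentParser(description="Sketch of the CLI flags shown above")
    parser.add_argument("--quantize_mode", type=str)
    parser.add_argument("--onnx_save_path", type=str)
    parser.add_argument("--calibration_data_size", type=int)
    # nargs="+" collects one or more format names into a list.
    parser.add_argument("--auto_quantization_formats", nargs="+")
    parser.add_argument("--batch_size", type=int)
    return parser


# Same arguments as the usage command, passed as an explicit list.
args = build_parser().parse_args([
    "--quantize_mode=auto",
    "--onnx_save_path=./vit_base_patch16_224.nvfp4_fp8.onnx",
    "--calibration_data_size", "64",
    "--auto_quantization_formats", "NVFP4_AWQ_LITE_CFG", "FP8_DEFAULT_CFG",
    "--batch_size", "128",
])
print(args.auto_quantization_formats)
```

Both the `--flag=value` and `--flag value` spellings used in the command parse the same way under `argparse`.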

Testing

python evaluate.py --onnx_path=vit_base_patch16_224.nvfp4_fp8.onnx \
	--model_name=vit_base_patch16_224 \
	--results_path=./results.txt \
	--batch_size 128
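
The evaluation command reports top-1 and top-5 accuracy. The internals of `evaluate.py` are not part of this description, so as a rough sketch, this is how top-k accuracy is commonly computed from a batch of logits. The function name `topk_accuracy` and the toy data are hypothetical.

```python
import numpy as np


def topk_accuracy(logits: np.ndarray, labels: np.ndarray, k: int) -> float:
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    # Indices of the k largest logits per row; order inside the top-k is irrelevant.
    topk = np.argpartition(logits, -k, axis=1)[:, -k:]
    # A sample counts as a hit if its label appears anywhere in its top-k set.
    hits = (topk == labels[:, None]).any(axis=1)
    return float(100.0 * hits.mean())


# Toy batch: 4 samples, 3 classes (illustrative values only).
logits = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])
labels = np.array([1, 1, 2, 2])

top1 = topk_accuracy(logits, labels, k=1)
top2 = topk_accuracy(logits, labels, k=2)
print(f"top1={top1:.1f}% top2={top2:.1f}%")  # top1=50.0% top2=75.0%
```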

Accuracy results

The top1 accuracy of the model is 84.15%
The top5 accuracy of the model is 97.396%

Reference accuracy for fp16

The top1 accuracy of the model is 85.102%
The top5 accuracy of the model is 97.526%
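
For a quick comparison, the gap between the two runs can be computed directly from the four numbers reported above. This is plain arithmetic on those values; the dictionary names are just for illustration.

```python
# Accuracy values copied from the results above, in percent.
quantized = {"top1": 84.15, "top5": 97.396}
fp16 = {"top1": 85.102, "top5": 97.526}

# Drop relative to the fp16 reference, in percentage points.
drop = {metric: round(fp16[metric] - quantized[metric], 3) for metric in fp16}

for metric, value in drop.items():
    print(f"{metric} drop: {value:.3f} points")
# top1 drop: 0.952 points
# top5 drop: 0.130 points
```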

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: No
  • Did you add or update any necessary documentation?: Yes
  • Did you update Changelog?: Yes

@ajrasane ajrasane requested review from a team as code owners December 5, 2025 22:38
codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 74.50%. Comparing base (3ef9e39) to head (23e00a2).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| modelopt/torch/_deploy/utils/torch_onnx.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #656      +/-   ##
==========================================
- Coverage   74.57%   74.50%   -0.07%     
==========================================
  Files         183      183              
  Lines       18451    18400      -51     
==========================================
- Hits        13759    13709      -50     
+ Misses       4692     4691       -1     
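
The percentages in the diff follow from the Hits and Lines rows. A quick check, assuming coverage is simply hits divided by total lines (the usual line-coverage definition; Codecov's exact rounding is not stated here):

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Line coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Values copied from the coverage diff: base is main, head is PR #656.
base = coverage_pct(hits=13759, lines=18451)
head = coverage_pct(hits=13709, lines=18400)
delta = head - base

# Sanity check: hits plus misses should equal total lines on both sides.
assert 13759 + 4692 == 18451
assert 13709 + 4691 == 18400

print(f"base={base:.3f}% head={head:.3f}% delta={delta:+.3f}")
```

The computed values agree with the reported 74.57%, 74.50%, and -0.07% to within rounding.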

