
Conversation

ajrasane
Contributor

@ajrasane ajrasane commented Sep 30, 2025

What does this PR do?

Type of change: documentation

Overview:

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Did you add or update any necessary documentation?: Yes

Summary by CodeRabbit

  • Documentation
    • Clarified TensorRT limitations in the ONNX PTQ example README.
    • Noted FP8 convolution restrictions (kernel sizes, channel multiples) and lack of NVFP4 convolution kernels.
  • Stated that NVFP4 export is effectively limited to GEMM-heavy Transformer-style models (e.g., ViT); convolution-heavy CNNs may fail with nvfp4/int4_awq (a rough heuristic for spotting such models is sketched below).
    • Documentation-only guidance; no behavioral changes.
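A rough way to gauge whether a model falls on the convolution-heavy side of this distinction is to count Conv versus MatMul/Gemm nodes in the exported ONNX graph. The sketch below is illustrative only and not part of this PR; the node types inspected and the 0.2 threshold are assumptions, not constraints enforced by TensorRT or Model Optimizer.

```python
# Illustrative heuristic only: count Conv vs. GEMM-style nodes in an ONNX graph
# to gauge whether a model is convolution-heavy (a poor fit for nvfp4) or
# GEMM-heavy (Transformer-style, e.g. ViT). The 0.2 threshold is an assumption.
import onnx


def is_convolution_heavy(onnx_path: str, conv_ratio_threshold: float = 0.2) -> bool:
    model = onnx.load(onnx_path)
    conv_ops = {"Conv", "ConvTranspose"}
    gemm_ops = {"Gemm", "MatMul"}
    conv_count = sum(1 for node in model.graph.node if node.op_type in conv_ops)
    gemm_count = sum(1 for node in model.graph.node if node.op_type in gemm_ops)
    total = conv_count + gemm_count
    return total > 0 and conv_count / total > conv_ratio_threshold


# Example usage; "model.onnx" is a placeholder path.
if is_convolution_heavy("model.onnx"):
    print("Model looks convolution-heavy; nvfp4/int4_awq export may fail.")
```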

@ajrasane ajrasane requested a review from cjluo-nv September 30, 2025 22:22
@ajrasane ajrasane self-assigned this Sep 30, 2025
@ajrasane ajrasane requested a review from a team as a code owner September 30, 2025 22:22
@ajrasane ajrasane requested a review from gcunhase September 30, 2025 22:22

coderabbitai bot commented Sep 30, 2025

Walkthrough

Added documentation notes to examples/onnx_ptq/README.md clarifying TensorRT FP8 convolution limitations (kernel sizes and channel multiples) and that ONNX PTQ examples are intended for Transformer-style models (e.g., ViT), not convolution-heavy CNNs. No code or control-flow changes.

Changes

  • Docs: ONNX PTQ README notes (examples/onnx_ptq/README.md): Added two informational note blocks describing TensorRT FP8 convolution restrictions (allowed kernel sizes and channel multiples, absence of NVFP4 conv kernels) and clarifying that NVFP4 export is effectively limited to GEMM-heavy Transformer-style models (e.g., ViT); placed after the Torch quantize-to-ONNX example and after the MXFP8/INT4/NVFP4 section (see the sketch below).
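To make the kernel-size and channel-multiple restrictions concrete, here is a minimal sketch that scans Conv nodes in an ONNX graph and flags any that fall outside an allow-list. The allowed kernel sizes and the channel multiple below are placeholders for illustration, not the actual TensorRT FP8 constraints; consult the TensorRT documentation for the real values.

```python
# Illustrative check only: flag Conv nodes whose kernel size or channel counts
# fall outside an assumed allow-list. ALLOWED_KERNELS and CHANNEL_MULTIPLE are
# placeholders, not the real TensorRT FP8 convolution constraints.
import onnx

ALLOWED_KERNELS = {(1, 1), (3, 3)}  # placeholder values
CHANNEL_MULTIPLE = 16               # placeholder value


def flag_unsupported_convs(onnx_path: str) -> list[str]:
    model = onnx.load(onnx_path)
    weights = {init.name: init for init in model.graph.initializer}
    flagged = []
    for node in model.graph.node:
        if node.op_type != "Conv" or len(node.input) < 2:
            continue
        kernel = next(
            (tuple(attr.ints) for attr in node.attribute if attr.name == "kernel_shape"),
            None,
        )
        weight = weights.get(node.input[1])
        if weight is None:
            continue  # weight is not a graph initializer (e.g., produced by another node)
        out_ch, in_ch = weight.dims[0], weight.dims[1]  # weight layout: (M, C/group, kH, kW)
        if kernel not in ALLOWED_KERNELS or out_ch % CHANNEL_MULTIPLE or in_ch % CHANNEL_MULTIPLE:
            flagged.append(node.name or node.output[0])
    return flagged


print(flag_unsupported_convs("model.onnx"))  # "model.onnx" is a placeholder path
```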

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

A rabbit nudges lines of text,
“ViTs hop well, CNNs perplexed.”
FP8 kernels mind their size,
Channels counted, nothing lies.
Notes now snug in README's light—🥕

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check ✅ Passed: The title concisely and accurately describes the change, a README update for the torch_quant_to_onnx.py example, without extraneous detail.
  • Linked Issues Check ✅ Passed: The pull request fulfills issue #362 by adding a README note detailing the current limitations of NVFP4 quantization on convolutional CNN models: only GEMM-heavy Transformer models are supported, and common vision CNNs will fail under quantize_mode=nvfp4.
  • Out of Scope Changes Check ✅ Passed: All modifications are confined to the README for the torch_quant_to_onnx.py example and directly document the NVFP4 limitations; no unrelated code or files are altered.
  • Docstring Coverage ✅ Passed: No functions found in the changes; docstring coverage check skipped.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13c4939 and 608e636.

📒 Files selected for processing (1)
  • examples/onnx_ptq/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/onnx_ptq/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: linux
  • GitHub Check: code-quality
  • GitHub Check: build-docs


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17439e6 and 13c4939.

📒 Files selected for processing (1)
  • examples/onnx_ptq/README.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: linux
  • GitHub Check: build-docs
  • GitHub Check: code-quality


codecov bot commented Sep 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.79%. Comparing base (17439e6) to head (608e636).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #395   +/-   ##
=======================================
  Coverage   73.79%   73.79%           
=======================================
  Files         171      171           
  Lines       17583    17583           
=======================================
  Hits        12975    12975           
  Misses       4608     4608           

☔ View full report in Codecov by Sentry.

@ajrasane ajrasane force-pushed the ajrasane/readme_update branch from 13c4939 to 608e636 Compare October 1, 2025 07:20
From examples/onnx_ptq/README.md:

```
--onnx_save_path=<path to save the exported ONNX model>
```

> *Note: TensorRT has limited support for Convolution layers with certain precision formats. FP8 Convolution layers remain restricted to specific kernel sizes and channel multiples, and there are no NVFP4 convolution kernels today—NVFP4 export is effectively limited to GEMM-heavy Transformer-style models (e.g., ViT). Convolution-centric CNNs such as ResNet, ConvNeXt, or MobileNet will fail when exported with `quantize_mode=nvfp4|int4_awq`.*
Contributor

Will they fail to export or just export in FP16?
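One way to answer this empirically is sketched below: inspect the exported graph (when export succeeds at all) and check whether each Conv input is fed by a DequantizeLinear node or was left unquantized. This is a simplification that only looks for the standard ONNX Q/DQ op and ignores any vendor-specific custom quantization nodes the exporter may emit, so treat it as a rough diagnostic rather than a definitive answer.

```python
# Illustrative post-export check: report which Conv nodes have quantized
# (DequantizeLinear-fed) inputs and which do not. Only the standard ONNX
# DequantizeLinear op is considered; custom quantization ops are ignored.
import onnx


def conv_quantization_report(onnx_path: str) -> dict[str, bool]:
    model = onnx.load(onnx_path)
    producers = {out: node for node in model.graph.node for out in node.output}
    report = {}
    for node in model.graph.node:
        if node.op_type != "Conv":
            continue
        quantized = any(
            inp in producers and producers[inp].op_type == "DequantizeLinear"
            for inp in node.input
        )
        report[node.name or node.output[0]] = quantized
    return report


# Example: distinguish "export failed" from "exported, but Conv stayed in high precision".
# "quantized_model.onnx" is a placeholder path.
for conv, quantized in conv_quantization_report("quantized_model.onnx").items():
    print(f"{conv}: {'quantized' if quantized else 'not quantized'}")
```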

Successfully merging this pull request may close these issues.

ONNX PTQ with NVFP4 fails on Vision CNN models