
Conversation

@i-riyad i-riyad (Contributor) commented Aug 29, 2025

What does this PR do?

Type of change: Bug fix

Overview: Training-mode removal from an ONNX node was incomplete; this PR also prunes the node's now-unused extra training outputs.

Usage

from modelopt.onnx.utils import remove_node_training_mode

# Strips the training_mode attribute from matching nodes and prunes their
# now-unused extra outputs.
result_model = remove_node_training_mode(model, "BatchNormalization")
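
A fuller end-to-end sketch (the file paths and the save step here are illustrative, not part of this PR):

import onnx

from modelopt.onnx.utils import remove_node_training_mode

model = onnx.load("model.onnx")  # hypothetical input path
result_model = remove_node_training_mode(model, "BatchNormalization")
onnx.save(result_model, "model_inference.onnx")  # hypothetical output path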

Testing

python -m pytest tests/unit/torch/deploy/utils/test_torch_onnx_utils.py::test_remove_node_training_mode_attribute -v
python -m pytest tests/unit/torch/deploy/utils/test_torch_onnx_utils.py::test_remove_node_extra_training_outputs -v

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

Additional Information

@i-riyad i-riyad requested review from a team as code owners August 29, 2025 19:05
@i-riyad i-riyad requested review from cjluo-nv and ajrasane August 29, 2025 19:05

copy-pr-bot bot commented Aug 29, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@i-riyad i-riyad force-pushed the rislam/training-mode-remove branch from 0a3fd29 to d0f071c on August 29, 2025 20:54
break

# If node has extra training outputs, keep only the first
if len(node.output) > 1:
Contributor

Is there a way we can check that the output we remove is actually the training output?

Contributor Author

Actually, the number of training outputs is operator-specific, and removing them individually is a bit convoluted. I have instead implemented the same functionality by removing all of a node's unused outputs when training_mode is on. Please review.
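
For reference, a minimal sketch of that approach (illustrative only, not the PR's code; the helper name and the keep-first-output rule are assumptions):

import onnx

def _prune_unused_outputs(graph: onnx.GraphProto, node: onnx.NodeProto) -> None:
    # An output is "used" if another node consumes it or it is a graph output.
    used = {inp for n in graph.node for inp in n.input}
    used |= {out.name for out in graph.output}
    # Keep the first output unconditionally: every ONNX op needs at least one.
    kept = [o for i, o in enumerate(node.output) if i == 0 or o in used]
    del node.output[:]
    node.output.extend(kept)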

return make_tensor(name, onnx.TensorProto.FLOAT, shape, data.flatten())


def _make_batchnorm_model(bn_node, extra_value_infos=None):
Contributor

Could you add a comment to show how this model will look after creation?
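
For illustration (an assumed shape, since the helper's body is not shown in this thread), the created model would look roughly like:

  X ──► BatchNormalization(scale, B, input_mean, input_var) ──► Y

with any extra_value_infos presumably attached as additional value_info entries on the graph.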

assert "training_mode" not in attr_names


def test_remove_node_extra_training_outputs():
Contributor

Could you also add a test to check that, if a node has multiple outputs that are not training outputs, they are not removed?

@i-riyad i-riyad (Contributor, Author) commented Aug 30, 2025

Modified the existing test to cover this.
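
For readers following this thread, a self-contained sketch of the pruning behavior under test (assuming the signature from the Usage section above; the PR's actual test differs):

import onnx
from onnx import TensorProto, helper

from modelopt.onnx.utils import remove_node_training_mode

# BatchNormalization in training mode emits extra outputs (running_mean,
# running_var); nothing consumes them here, so they should be pruned.
bn = helper.make_node(
    "BatchNormalization",
    ["x", "scale", "bias", "mean", "var"],
    ["y", "running_mean", "running_var"],
    training_mode=1,
)
graph = helper.make_graph(
    [bn],
    "bn_graph",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 3, 2, 2])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 3, 2, 2])],
    initializer=[
        helper.make_tensor(name, TensorProto.FLOAT, [3], [1.0, 1.0, 1.0])
        for name in ("scale", "bias", "mean", "var")
    ],
)
model = helper.make_model(graph)

result = remove_node_training_mode(model, "BatchNormalization")
node = result.graph.node[0]
assert "training_mode" not in {a.name for a in node.attribute}
assert list(node.output) == ["y"]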


codecov bot commented Aug 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.97%. Comparing base (dba0b37) to head (ff5f6df).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #277      +/-   ##
==========================================
+ Coverage   73.83%   73.97%   +0.14%     
==========================================
  Files         173      172       -1     
  Lines       17402    17351      -51     
==========================================
- Hits        12849    12836      -13     
+ Misses       4553     4515      -38     


@i-riyad i-riyad force-pushed the rislam/training-mode-remove branch 4 times, most recently from 1b85154 to 626d99d on August 30, 2025 21:30
@i-riyad i-riyad requested a review from ajrasane August 30, 2025 21:32
Signed-off-by: Riyad Islam <[email protected]>
Signed-off-by: Riyad Islam <[email protected]>
@i-riyad i-riyad force-pushed the rislam/training-mode-remove branch from 626d99d to 062224a on August 30, 2025 21:37
@i-riyad i-riyad self-assigned this Aug 30, 2025
@kevalmorabia97 kevalmorabia97 removed the request for review from cjluo-nv September 3, 2025 05:39
@i-riyad i-riyad merged commit d0372f4 into main Sep 4, 2025
19 checks passed
@i-riyad i-riyad deleted the rislam/training-mode-remove branch September 4, 2025 18:27
@i-riyad i-riyad mentioned this pull request Sep 4, 2025