
Conversation

@StrycekSimon (Collaborator) commented on Jul 17, 2025

Summary

With this change, the NeutronConverter can quantize the model's input and output tensors (i.e., the Input and Output placeholder nodes). A pass is also added that subsequently removes the Q/DQ nodes around the placeholders, making the model fully quantized.
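
To illustrate the idea (a hedged sketch, not the PR's actual pass; the function name and the exact op targets are assumptions), removing the Q/DQ pair around an input placeholder of a torch.fx graph could look like the following, with outputs handled analogously:

import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  (registers quantized_decomposed ops)
from torch.fx import GraphModule

Q = torch.ops.quantized_decomposed.quantize_per_tensor.default
DQ = torch.ops.quantized_decomposed.dequantize_per_tensor.default

def remove_input_qdq(gm: GraphModule) -> GraphModule:
    # Rewire `placeholder -> Q -> DQ -> consumers` to `placeholder ->
    # consumers`, so the model takes already-quantized input.
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target is not Q:
            continue
        src = node.args[0]
        if not (hasattr(src, "op") and src.op == "placeholder"):
            continue
        for dq in [u for u in node.users if u.target is DQ]:
            dq.replace_all_uses_with(src)
            gm.graph.erase_node(dq)
        if not node.users:
            gm.graph.erase_node(node)
    gm.graph.eliminate_dead_code()
    gm.recompile()
    return gm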

Test plan

Unit tests were updated to cover the newly introduced changes.

cc @digantdesai @JakeStevens @robert-kalmar @skywall

@pytorch-bot bot commented on Jul 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12586

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit dc8e7ea with merge base 4197fc1:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Jul 17, 2025
@StrycekSimon force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 190f46a to d8c7ee9 on July 17, 2025 13:31
@StrycekSimon (Collaborator, Author) commented:

@pytorchbot label "module: nxp" "release notes: nxp"

@pytorch-bot bot added the "module: nxp" and "release notes: nxp" labels on Jul 17, 2025
@StrycekSimon marked this pull request as ready for review on July 17, 2025 13:37
@StrycekSimon changed the title from "Input placeholder quantization and pad op improvement" to "NXP backend: Input placeholder quantization and pad op improvement" on Jul 21, 2025
@robert-kalmar marked this pull request as draft on July 22, 2025 12:01
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from d8c7ee9 to ab6fb9f on July 23, 2025 13:18
@robert-kalmar marked this pull request as ready for review on July 23, 2025 13:40
@robert-kalmar changed the title from "NXP backend: Input placeholder quantization and pad op improvement" to "NXP backend: Add model input and output quantization" on Jul 23, 2025
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from ab6fb9f to 392b3b2 on July 23, 2025 21:45
@digantdesai (Contributor) left a comment


Thanks!

super().__init__()
self._edge_program_manager = edge_program_manager

def _get_quantizable_input_indices(self):
Contributor commented:

Feel free to improve quantize_io_pass and move these utils there if you think they can be useful elsewhere.
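
As a rough illustration of what such a helper computes (a sketch with assumed names, not the PR's code), an input index counts as quantizable when its placeholder feeds a quantize op:

import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  (registers quantized_decomposed ops)
from torch.export import ExportedProgram

def get_quantizable_input_indices(ep: ExportedProgram) -> list[int]:
    # Hypothetical sketch; real code would narrow the placeholders down
    # to user inputs via ep.graph_signature instead of taking them all.
    q = torch.ops.quantized_decomposed.quantize_per_tensor.default
    placeholders = [n for n in ep.graph.nodes if n.op == "placeholder"]
    return [
        i
        for i, node in enumerate(placeholders)
        if any(user.target is q for user in node.users)
    ]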

pytest.param((2, 4), tuple(range(4)), id="2D, padding N, H"),
pytest.param((2, 4, 6), tuple(range(2)), id="3D, padding H"),
pytest.param((2, 4, 6), tuple(range(4)), id="3D, padding C, H"),
pytest.param((2, 4, 6), list(range(6)), id="3D, padding N, C, H"),
Contributor commented:

Curious why remove these tests?

@skywall (Collaborator) replied on Jul 25, 2025:

These tests are no longer relevant, because ConstantPad nodes with these params will not be delegated. It is related to this restriction: https://github.com/pytorch/executorch/pull/12586/files#diff-e01d426046aa644b4e18ffa510b42e50e1b18b8f5407bcfb0d210f701d95b16aR53

We are still able to convert them into the intermediate model representation, but the Neutron conversion will fail.
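
To make the restriction concrete, here is a hypothetical check in the spirit of the linked diff (the actual condition there may differ, e.g. in how ranks and tensor formats are handled): only pads on a 4D input that stay within the two spatial dimensions would be delegated.

def is_delegatable_pad(input_rank: int, paddings: list[int]) -> bool:
    # Hypothetical sketch: `paddings` holds (before, after) pairs starting
    # from the last dimension, following torch.nn.functional.pad. Inputs
    # that are not 4D (NCHW), or pads reaching the channel/batch dims,
    # are left to the reference implementation instead of the delegate.
    if input_rank != 4:
        return False
    return len(paddings) // 2 <= 2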

config=ExecutorchBackendConfig(extract_delegate_segments=False)
)

exported_program_to_dot(exec_prog.exported_program(), "conv_relu.dot")
Contributor commented:

Is this needed for the test?

Collaborator replied:

Removed, thanks.



def test_remove_io_quant_ops_pass__conv_relu():
model = Conv2dReLUModule()
Contributor commented:

Since you are calculating indices, do you want to test a model which has more than one input and output?

Collaborator replied:

A test for a multi-input/output model was added.
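
For reference, a minimal multi-input/multi-output module of the kind such a test can use (an illustrative sketch, not the PR's exact test model):

import torch

class MultiIOModule(torch.nn.Module):
    # Two inputs and two outputs, so both the input- and output-index
    # bookkeeping of the pass gets exercised.
    def forward(self, x: torch.Tensor, y: torch.Tensor):
        return x + y, x - y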

@@ -0,0 +1,50 @@
# Copyright 2024 NXP
Contributor commented:

Are these just full-model tests? I'm not sure what you mean here by integration.

Collaborator replied:

Yes, an integration test in our terms refers to a "full model / real-world model" test.

@digantdesai (Contributor) commented:

Make sure the nxp CI is green before you merge.

@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 392b3b2 to 08e134b on July 24, 2025 14:19
pyproject.toml Outdated
# Some kernel libraries need their .yaml files.
"*.yaml",
# Add trained models from backends/nxp/experimental
"backends/nxp/experimental/*.pth",
Collaborator commented:

Wrong path: the pretrained model is at examples/nxp/experimental/cifar_net/cifar_net.pth.

@@ -0,0 +1 @@
../../../../examples/nxp/experimental/
Collaborator commented:

@mergennachin, @digantdesai please note this change: do you agree with installing the examples/nxp/experimental/cifar_net model as part of the ExecuTorch wheel? Tests in this PR use it.
Based on https://github.com/pytorch/executorch/tree/main/src and:

TODO(mnachin T180504136): Do not put examples/models into core pip packages. Refactor out the necessary utils or core models files into a separate package.

from https://github.com/pytorch/executorch/blob/main/src/README.md, this area will be undergoing changes.

Contributor replied:

Perhaps not, it's ~350 KiB. An alternative would be to download it from somewhere?

Collaborator replied:

Size concern: the trained weights (.pth) are not needed ==> will revert the change in pyproject.toml https://github.com/pytorch/executorch/pull/12586/files/08e134b5315aea6611df5246cf76bb8cdc46a67d#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711

Is the presence of examples/nxp/experimental/cifar_net in the executorch installer, containing only the model definition cifar_net.py (https://github.com/pytorch/executorch/blob/main/examples/nxp/experimental/cifar_net/cifar_net.py), OK?

Contributor replied:

Yeah, I do see examples/xnnpack in there, so it should be OK.

@JakeStevens (Contributor) commented:

Still waiting on the revert to the toml; it still contains the .pth AFAICT.

@skywall force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 08e134b to 706e5bc on July 31, 2025 14:44
pyproject.toml Outdated
# Some kernel libraries need their .yaml files.
"*.yaml",
# Add trained models from backends/nxp/experimental
"examples/nxp/experimental/*.pth",
Collaborator commented:

This is to be reverted completely. In this thread https://github.com/pytorch/executorch/pull/12586/files#r2228867897 we agreed not to install the trained weights due to size, just cifar_net.py with the model definition. For the purpose of the test, the trained weights are not needed.

Collaborator replied:

Removed.

from executorch.examples.models import MODEL_NAME_TO_MODEL
from executorch.examples.models.model_factory import EagerModelFactory

from executorch.examples.nxp.experimental.cifar_net.cifar_net import (
Collaborator commented:

As the cifar_net weights won't be present in the executorch package installation (https://github.com/pytorch/executorch/pull/12586/files#r2228867897), revert back to using a relative import, so that the trained weights are loaded correctly when using aot_neutron_compile.py.

Collaborator replied:

Reverted.
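
One way to keep weight loading robust regardless of the working directory is to resolve the .pth path relative to the module file; a hedged sketch (the helper name and loading details are assumptions, not the PR's code):

import os

import torch

def load_cifar_net_weights(model: torch.nn.Module) -> torch.nn.Module:
    # Resolve cifar_net.pth next to this module file, not the CWD, so the
    # weights load correctly however aot_neutron_compile.py is invoked.
    weights = os.path.join(os.path.dirname(__file__), "cifar_net.pth")
    model.load_state_dict(torch.load(weights, map_location="cpu"))
    return model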

@skywall force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 706e5bc to d0ff3b4 on August 1, 2025 11:35
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from d0ff3b4 to 0f8ece1 on August 1, 2025 11:51
skywall added 2 commits on August 4, 2025 14:08:

CifarNet requires input quantization for full INT8 model quantization. This test verifies that the input node is quantized.
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 0f8ece1 to dc8e7ea on August 4, 2025 12:09
@robert-kalmar merged commit cf2f170 into pytorch:main on Aug 4, 2025
101 of 103 checks passed
@robert-kalmar deleted the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch on August 4, 2025 14:20
agrima1304 pushed a commit to agrima1304/executorch that referenced this pull request on Aug 26, 2025, with the PR summary above as the commit message.

Co-authored-by: Lukas Sztefek <[email protected]>