
Conversation

@StrycekSimon (Collaborator) commented on Jul 17, 2025

Summary

With this change, the NeutronConverter can quantize the model's input and output tensors (i.e., the Input and Output placeholder nodes). A pass is also added that subsequently removes the Q/DQ nodes around the placeholders, making the model fully quantized.
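
To illustrate the idea (a hedged sketch, not the PR's actual pass; the function name and the exact op targets are assumptions), removing the Q/DQ pair around an input placeholder of a torch.fx graph could look like the following, with outputs handled analogously:

import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  (registers quantized_decomposed ops)
from torch.fx import GraphModule

Q = torch.ops.quantized_decomposed.quantize_per_tensor.default
DQ = torch.ops.quantized_decomposed.dequantize_per_tensor.default

def remove_input_qdq(gm: GraphModule) -> GraphModule:
    # Rewire `placeholder -> Q -> DQ -> consumers` to `placeholder ->
    # consumers`, so the model takes already-quantized input.
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target is not Q:
            continue
        src = node.args[0]
        if not (hasattr(src, "op") and src.op == "placeholder"):
            continue
        for dq in [u for u in node.users if u.target is DQ]:
            dq.replace_all_uses_with(src)
            gm.graph.erase_node(dq)
        if not node.users:
            gm.graph.erase_node(node)
    gm.graph.eliminate_dead_code()
    gm.recompile()
    return gm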

Test plan

Unit tests were updated to cover the newly introduced changes.

cc @digantdesai @JakeStevens @robert-kalmar @skywall

@pytorch-bot bot commented on Jul 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12586

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit dc8e7ea with merge base 4197fc1:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Jul 17, 2025
@StrycekSimon force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 190f46a to d8c7ee9 on July 17, 2025 13:31
@StrycekSimon (Collaborator, Author) commented:

@pytorchbot label "module: nxp" "release notes: nxp"

@pytorch-bot bot added the "module: nxp" and "release notes: nxp" labels on Jul 17, 2025
@StrycekSimon marked this pull request as ready for review on July 17, 2025 13:37
@StrycekSimon changed the title from "Input placeholder quantization and pad op improvement" to "NXP backend: Input placeholder quantization and pad op improvement" on Jul 21, 2025
@robert-kalmar marked this pull request as draft on July 22, 2025 12:01
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from d8c7ee9 to ab6fb9f on July 23, 2025 13:18
@robert-kalmar marked this pull request as ready for review on July 23, 2025 13:40
@robert-kalmar changed the title from "NXP backend: Input placeholder quantization and pad op improvement" to "NXP backend: Add model input and output quantization" on Jul 23, 2025
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from ab6fb9f to 392b3b2 on July 23, 2025 21:45
@digantdesai (Contributor) left a comment


Thanks!

super().__init__()
self._edge_program_manager = edge_program_manager

def _get_quantizable_input_indices(self):
Contributor commented:

Feel free to improve quantize_io_pass and move these utils there if you think they can be useful elsewhere.
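
As a rough illustration of what such a helper computes (a sketch with assumed names, not the PR's code), an input index counts as quantizable when its placeholder feeds a quantize op:

import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  (registers quantized_decomposed ops)
from torch.export import ExportedProgram

def get_quantizable_input_indices(ep: ExportedProgram) -> list[int]:
    # Hypothetical sketch; real code would narrow the placeholders down
    # to user inputs via ep.graph_signature instead of taking them all.
    q = torch.ops.quantized_decomposed.quantize_per_tensor.default
    placeholders = [n for n in ep.graph.nodes if n.op == "placeholder"]
    return [
        i
        for i, node in enumerate(placeholders)
        if any(user.target is q for user in node.users)
    ]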

pytest.param((2, 4), tuple(range(4)), id="2D, padding N, H"),
pytest.param((2, 4, 6), tuple(range(2)), id="3D, padding H"),
pytest.param((2, 4, 6), tuple(range(4)), id="3D, padding C, H"),
pytest.param((2, 4, 6), list(range(6)), id="3D, padding N, C, H"),
Contributor commented:

Curious why remove these tests?

@skywall (Collaborator) replied on Jul 25, 2025:

These tests are no longer relevant, because ConstantPad nodes with these params will not be delegated. It is related to this restriction: https://github.com/pytorch/executorch/pull/12586/files#diff-e01d426046aa644b4e18ffa510b42e50e1b18b8f5407bcfb0d210f701d95b16aR53

We are still able to convert them into the intermediate model representation, but the Neutron conversion will fail.
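
To make the restriction concrete, here is a hypothetical check in the spirit of the linked diff (the actual condition there may differ, e.g. in how ranks and tensor formats are handled): only pads on a 4D input that stay within the two spatial dimensions would be delegated.

def is_delegatable_pad(input_rank: int, paddings: list[int]) -> bool:
    # Hypothetical sketch: `paddings` holds (before, after) pairs starting
    # from the last dimension, following torch.nn.functional.pad. Inputs
    # that are not 4D (NCHW), or pads reaching the channel/batch dims,
    # are left to the reference implementation instead of the delegate.
    if input_rank != 4:
        return False
    return len(paddings) // 2 <= 2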

config=ExecutorchBackendConfig(extract_delegate_segments=False)
)

exported_program_to_dot(exec_prog.exported_program(), "conv_relu.dot")
Contributor commented:

Is this needed for the test?

Collaborator replied:

Removed, thanks.



def test_remove_io_quant_ops_pass__conv_relu():
model = Conv2dReLUModule()
Contributor commented:

Since you are calculating indices, do you want to test a model which has more than one input and output?

Collaborator replied:

A test for a multi-input/output model was added.
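
For reference, a minimal multi-input/multi-output module of the kind such a test can use (an illustrative sketch, not the PR's exact test model):

import torch

class MultiIOModule(torch.nn.Module):
    # Two inputs and two outputs, so both the input- and output-index
    # bookkeeping of the pass gets exercised.
    def forward(self, x: torch.Tensor, y: torch.Tensor):
        return x + y, x - y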

@@ -0,0 +1,50 @@
# Copyright 2024 NXP
Contributor commented:

Are these just full-model tests? I'm not sure what you mean here by integration.

Collaborator replied:

Yes, an integration test in our terms refers to a "full model / real-world model" test.

@digantdesai (Contributor) commented:

Make sure the nxp CI is green before you merge.

@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 392b3b2 to 08e134b on July 24, 2025 14:19
pyproject.toml Outdated
# Some kernel libraries need their .yaml files.
"*.yaml",
# Add trained models from backends/nxp/experimental
"backends/nxp/experimental/*.pth",
Collaborator commented:

Wrong path: the pretrained model is at examples/nxp/experimental/cifar_net/cifar_net.pth.

@@ -0,0 +1 @@
../../../../examples/nxp/experimental/
Collaborator commented:

@mergennachin, @digantdesai please note this change: do you agree with installing the examples/nxp/experimental/cifar_net model as part of the ExecuTorch wheel? Tests in this PR use it.
Based on https://github.com/pytorch/executorch/tree/main/src and:

TODO(mnachin T180504136): Do not put examples/models into core pip packages. Refactor out the necessary utils or core models files into a separate package.

from https://github.com/pytorch/executorch/blob/main/src/README.md, this area will be undergoing changes.

Contributor replied:

Perhaps not, it's ~350 KiB. An alternative would be to download it from somewhere?

Collaborator replied:

Size concern: the trained weights (.pth) are not needed ==> will revert the change in pyproject.toml https://github.com/pytorch/executorch/pull/12586/files/08e134b5315aea6611df5246cf76bb8cdc46a67d#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711

Is the presence of examples/nxp/experimental/cifar_net in the executorch installer, containing only the model definition cifar_net.py (https://github.com/pytorch/executorch/blob/main/examples/nxp/experimental/cifar_net/cifar_net.py), OK?

Contributor replied:

Yeah, I do see examples/xnnpack in there, so it should be OK.

@JakeStevens (Contributor) commented:

Still waiting on the revert to the toml; it still contains the .pth AFAICT.

@skywall force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 08e134b to 706e5bc on July 31, 2025 14:44
pyproject.toml Outdated
# Some kernel libraries need their .yaml files.
"*.yaml",
# Add trained models from backends/nxp/experimental
"examples/nxp/experimental/*.pth",
Collaborator commented:

This is to be reverted completely. In this thread https://github.com/pytorch/executorch/pull/12586/files#r2228867897 we agreed not to install the trained weights due to size, just cifar_net.py with the model definition. For the purpose of the test, the trained weights are not needed.

Collaborator replied:

Removed.

from executorch.examples.models import MODEL_NAME_TO_MODEL
from executorch.examples.models.model_factory import EagerModelFactory

from executorch.examples.nxp.experimental.cifar_net.cifar_net import (
Collaborator commented:

As the cifar_net weights won't be present in the executorch package installation (https://github.com/pytorch/executorch/pull/12586/files#r2228867897), revert back to using a relative import, so that the trained weights are loaded correctly when using aot_neutron_compile.py.

Collaborator replied:

Reverted.
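
One way to keep weight loading robust regardless of the working directory is to resolve the .pth path relative to the module file; a hedged sketch (the helper name and loading details are assumptions, not the PR's code):

import os

import torch

def load_cifar_net_weights(model: torch.nn.Module) -> torch.nn.Module:
    # Resolve cifar_net.pth next to this module file, not the CWD, so the
    # weights load correctly however aot_neutron_compile.py is invoked.
    weights = os.path.join(os.path.dirname(__file__), "cifar_net.pth")
    model.load_state_dict(torch.load(weights, map_location="cpu"))
    return model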

@skywall force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 706e5bc to d0ff3b4 on August 1, 2025 11:35
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from d0ff3b4 to 0f8ece1 on August 1, 2025 11:51
skywall added 2 commits on August 4, 2025 14:08:

CifarNet requires input quantization for full INT8 model quantization. This test verifies that the input node is quantized.
@robert-kalmar force-pushed the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch from 0f8ece1 to dc8e7ea on August 4, 2025 12:09
@robert-kalmar merged commit cf2f170 into pytorch:main on Aug 4, 2025
101 of 103 checks passed
@robert-kalmar deleted the upstream/main-nxp/EIEX-329-upstream-input-placeholder-quantization-and-pad-op-improvement branch on August 4, 2025 14:20
agrima1304 pushed a commit to agrima1304/executorch that referenced this pull request on Aug 26, 2025, with the PR summary above as the commit message.

Co-authored-by: Lukas Sztefek <[email protected]>