NXP Backend: Add pass to remove unnecessary Quantize/Dequantize nodes. #15148
… node format. The pass `RemoveGetItemPass` replaces a `max_pool2d_with_indices` node with a `max_pool2d` node, which doesn't require a GetItem afterward. The new operator must, however, preserve the original node format, so a modified copy of the pass was created in `backends/nxp/_passes`. The new directory was created because the pass doesn't follow the `NeutronEdgePass` interface.
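The replacement described above can be illustrated on a toy graph. This is a minimal sketch, not the code of `RemoveGetItemPass`: the `Node` class, the `meta` dictionary, the `"node_format"` and `"index"` keys, and the function name are all hypothetical stand-ins, and the sketch assumes only output 0 of the pooling node is used. It shows the one point that matters here, namely that the new `max_pool2d` node copies the metadata of the node it replaces.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy graph node (hypothetical, not a torch.fx node)."""
    name: str
    target: str
    inputs: list = field(default_factory=list)  # names of producer nodes
    meta: dict = field(default_factory=dict)    # e.g. {"node_format": ...}


def replace_max_pool_with_indices(nodes):
    """Fuse max_pool2d_with_indices + getitem(0) into a single max_pool2d."""
    by_name = {n.name: n for n in nodes}
    replaced = {}  # old pooling node name -> new node
    rewire = {}    # getitem node name -> new node name

    for n in nodes:
        if n.target != "getitem" or n.meta.get("index") != 0 or not n.inputs:
            continue
        src = by_name.get(n.inputs[0])
        if src is None or src.target != "max_pool2d_with_indices":
            continue
        # Copy the metadata so the (assumed) format annotation survives.
        new = Node(src.name + "_no_indices", "max_pool2d",
                   list(src.inputs), dict(src.meta))
        replaced[src.name] = new
        rewire[n.name] = new.name

    result = []
    for n in nodes:
        if n.name in rewire:
            continue  # the getitem node is no longer needed
        result.append(replaced.get(n.name, n))
    for n in result:
        n.inputs = [rewire.get(i, i) for i in n.inputs]
    return result


graph = [
    Node("x", "placeholder"),
    Node("pool", "max_pool2d_with_indices", ["x"],
         {"node_format": "channels_first"}),
    Node("item", "getitem", ["pool"], {"index": 0}),
    Node("act", "relu", ["item"]),
]
converted = replace_max_pool_with_indices(graph)
```

In this toy graph, `converted` contains three nodes; the pooling node has the target `max_pool2d`, still carries its `"node_format"` entry, and feeds `act` directly.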
Before, the format inference was done during conversion to NeutronIR (after partitioning), so the partitioner didn't yet know the formats. Now, the partitioner has the format data, which can be used to accurately select nodes for delegation.
…der`, to delegate format related transpositions of input/output tensors.
…utputs to Neutron when possible. Due to the different tensor formats used by ExecuTorch and Neutron, the inputs/outputs often have to be transposed. This used to be done exclusively by the runtime. Now, the transpositions are done by Neutron when possible.
Before, this change was "hidden", and it was only done to some inputs/outputs.
…ducing unsupported permutations.
Summary
This PR adds an edge dialect pre-processing pass to remove some Q/Dq nodes. This enables some non-delegated nodes (which run on the CPU) to run in float32. This improves accuracy and inference speed (by eliminating the need to artificially quantize and de-quantize activations).
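The idea can be sketched on a toy graph. The following is a minimal illustration under stated assumptions, not the pass added in this PR: the `Node` class, the `delegated` flag, and the removal criterion (a Quantize followed by a Dequantize is dropped only when both its producer and all of its consumers are non-delegated) are hypothetical simplifications chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy graph node (hypothetical, not a torch.fx node)."""
    name: str
    op: str                                     # "quantize", "dequantize", or a compute op
    inputs: list = field(default_factory=list)  # names of producer nodes
    delegated: bool = False                     # True if the node runs on the accelerator


QDQ = ("quantize", "dequantize")


def remove_redundant_qdq(nodes):
    """Drop quantize -> dequantize pairs that sit between two CPU nodes."""
    by_name = {n.name: n for n in nodes}
    users = {n.name: [] for n in nodes}
    for n in nodes:
        for i in n.inputs:
            users[i].append(n)

    removed, rewire = set(), {}
    for dq in nodes:
        if dq.op != "dequantize" or not dq.inputs:
            continue
        q = by_name[dq.inputs[0]]
        if q.op != "quantize" or len(users[q.name]) != 1 or not q.inputs:
            continue
        producer = by_name[q.inputs[0]]
        if producer.op in QDQ or producer.delegated:
            continue  # the quantized tensor is needed by delegated code
        if any(u.delegated for u in users[dq.name]):
            continue
        removed |= {q.name, dq.name}
        rewire[dq.name] = producer.name  # consumers read the float tensor directly

    result = [n for n in nodes if n.name not in removed]
    for n in result:
        n.inputs = [rewire.get(i, i) for i in n.inputs]
    return result


graph = [
    Node("x", "input"),
    Node("conv", "conv2d", ["x"], delegated=True),
    Node("q1", "quantize", ["conv"]),
    Node("dq1", "dequantize", ["q1"]),
    Node("relu", "relu", ["dq1"]),
    Node("q2", "quantize", ["relu"]),
    Node("dq2", "dequantize", ["q2"]),
    Node("sig", "sigmoid", ["dq2"]),
]
optimized = remove_redundant_qdq(graph)
```

Here the pair `q1`/`dq1` is kept because its producer is delegated, while `q2`/`dq2` is removed so that `sig` consumes the float32 output of `relu` directly.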
Test plan
Unit tests provided.