NXP backend: Add model input and output quantization #12586
Conversation
Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12586
Note: links to docs will display an error until the docs builds have completed.
❌ 2 new failures as of commit dc8e7ea with merge base 4197fc1.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed 190f46a to d8c7ee9
@pytorchbot label "module: nxp" "release notes: nxp"
Force-pushed d8c7ee9 to ab6fb9f
Force-pushed ab6fb9f to 392b3b2
digantdesai left a comment:
Thanks!
```python
super().__init__()
self._edge_program_manager = edge_program_manager

def _get_quantizable_input_indices(self):
```
Feel free to improve quantize_io_pass and move these utils there if you think they can be useful elsewhere.
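For context, the core of such a helper is small. Below is a hedged, toy sketch of what it might compute; the `Node` class and the `quantize_per_tensor` target name are stand-ins invented for illustration, not the actual ExecuTorch/NXP API, which operates on a real torch.fx graph via the edge program manager:

```python
# Hypothetical sketch: find which graph inputs feed directly into a
# quantize op. Node is a stand-in for a real torch.fx node.
from dataclasses import dataclass, field


@dataclass
class Node:
    op: str                              # "placeholder", "call_function", ...
    target: str = ""                     # e.g. "quantize_per_tensor"
    users: list = field(default_factory=list)


def get_quantizable_input_indices(graph_nodes):
    """Return indices of placeholder inputs whose every user is a quantize op."""
    indices = []
    input_idx = 0
    for node in graph_nodes:
        if node.op != "placeholder":
            continue
        if node.users and all(u.target == "quantize_per_tensor" for u in node.users):
            indices.append(input_idx)
        input_idx += 1
    return indices


# Toy graph: input 0 feeds a quantize op, input 1 feeds a plain op.
q = Node("call_function", "quantize_per_tensor")
add = Node("call_function", "add")
inp0 = Node("placeholder", users=[q])
inp1 = Node("placeholder", users=[add])
print(get_quantizable_input_indices([inp0, inp1, q, add]))  # [0]
```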
```python
pytest.param((2, 4), tuple(range(4)), id="2D, padding N, H"),
pytest.param((2, 4, 6), tuple(range(2)), id="3D, padding H"),
pytest.param((2, 4, 6), tuple(range(4)), id="3D, padding C, H"),
pytest.param((2, 4, 6), list(range(6)), id="3D, padding N, C, H"),
```
Curious why remove these tests?
These tests are no longer relevant, because ConstantPad nodes with these parameters will not be delegated. It is related to this restriction: https://github.com/pytorch/executorch/pull/12586/files#diff-e01d426046aa644b4e18ffa510b42e50e1b18b8f5407bcfb0d210f701d95b16aR53
We are still able to convert them into the intermediate model representation, but the Neutron conversion will fail.
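For readers following along, here is a toy illustration of this general kind of delegation restriction. The NCHW-style dimension convention and the rule itself are assumptions for the sketch, not the actual Neutron check linked above; with torch-style pad specs, a pad tuple of length 2*k pads the last k dimensions, so one can detect pads that reach past the spatial dimensions:

```python
# Hedged sketch, not the actual NXP/Neutron rule: with torch-style pad
# specs, a pad tuple of length 2*k pads the last k dims of the tensor.
# Assume a layout where the first two dims are batch and channel.
def pad_reaches_batch_or_channel(tensor_rank: int, pad: tuple) -> bool:
    padded_dims = len(pad) // 2             # number of trailing dims padded
    spatial_dims = max(tensor_rank - 2, 0)  # dims after batch and channel
    return padded_dims > spatial_dims


print(pad_reaches_batch_or_channel(3, (0, 1)))        # False: pads last dim only
print(pad_reaches_batch_or_channel(3, (0, 1, 0, 1)))  # True: reaches the channel dim
print(pad_reaches_batch_or_channel(4, (0, 1, 0, 1)))  # False: pads H and W only
```

A partitioner could use such a predicate to skip delegation for the rejected configurations while still converting the node into the intermediate representation.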
```python
    config=ExecutorchBackendConfig(extract_delegate_segments=False)
)

exported_program_to_dot(exec_prog.exported_program(), "conv_relu.dot")
```
is this needed for the test?
Removed, thanks.
```python
def test_remove_io_quant_ops_pass__conv_relu():
    model = Conv2dReLUModule()
```
Since you are calculating indices, do you want to test a model which has >1 inputs and outputs?
Test for multi input/output model added.
```python
@@ -0,0 +1,50 @@
# Copyright 2024 NXP
```
Is this just a full model test? Not sure what you mean here by integration?
Yes, an integration test in our terminology refers to a "full model / real-world model" test.
Make sure the nxp CI is green before you merge.
Force-pushed 392b3b2 to 08e134b
pyproject.toml (Outdated)
```python
# Some kernel libraries need their .yaml files.
"*.yaml",
# Add trained models from backends/nxp/experimental
"backends/nxp/experimental/*.pth",
```
Wrong path: the pretrained model is at examples/nxp/experimental/cifar_net/cifar_net.pth
```python
@@ -0,0 +1 @@
../../../../examples/nxp/experimental/
```
(no newline at end of file)
@mergennachin, @digantdesai please note this change: do you agree with installing the examples/nxp/experimental/cifar_net model as part of the ExecuTorch wheel? Tests in this PR use it.
Based on this https://github.com/pytorch/executorch/tree/main/src and:
TODO(mnachin T180504136): Do not put examples/models into core pip packages. Refactor out the necessary utils or core models files into a separate package.
from https://github.com/pytorch/executorch/blob/main/src/README.md , this area will be undergoing changes.
Perhaps not; it's ~350 KiB. An alternative would be to download it from somewhere?
Size concern: the trained weights (.pth) are not needed, so I will revert the change in pyproject.toml https://github.com/pytorch/executorch/pull/12586/files/08e134b5315aea6611df5246cf76bb8cdc46a67d#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711
Is the presence of examples/nxp/experimental/cifar_net in the ExecuTorch installer, containing only the model definition cifar_net.py https://github.com/pytorch/executorch/blob/main/examples/nxp/experimental/cifar_net/cifar_net.py , OK?
Yeah, I do see examples/xnnpack in there, so it should be OK.
Still waiting on the revert to the toml; it still contains the .pth afaict.
Force-pushed 08e134b to 706e5bc
pyproject.toml (Outdated)
```python
# Some kernel libraries need their .yaml files.
"*.yaml",
# Add trained models from backends/nxp/experimental
"examples/nxp/experimental/*.pth",
```
This is to be reverted completely. In this thread https://github.com/pytorch/executorch/pull/12586/files#r2228867897 we agreed not to install the trained weights due to size, just cifar_net.py with the model definition. For the purpose of the test, the trained weights are not needed.
Removed.
examples/nxp/aot_neutron_compile.py (Outdated)
```python
from executorch.examples.models import MODEL_NAME_TO_MODEL
from executorch.examples.models.model_factory import EagerModelFactory

from executorch.examples.nxp.experimental.cifar_net.cifar_net import (
```
As the cifar_net weights won't be present in the ExecuTorch package installation (https://github.com/pytorch/executorch/pull/12586/files#r2228867897), revert back to the relative import so that the trained weights are loaded correctly when using aot_neutron_compile.py.
Reverted.
Force-pushed 706e5bc to d0ff3b4
Force-pushed d0ff3b4 to 0f8ece1
CifarNet requires input quantization for full INT8 model quantization. This test verifies that the input node is quantized.
Force-pushed 0f8ece1 to dc8e7ea
Summary
With this change the NeutronConverter can quantize the model's input and output tensors (i.e. the input and output placeholder nodes). A pass is also added to subsequently remove the Q/DQ nodes for the placeholders, making the model fully quantized.
Test plan
Unit tests were updated with respect to the newly introduced changes.
Co-authored-by: Lukas Sztefek <[email protected]>
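As a toy illustration of the idea behind such a pass (the op names and the flat op list are invented for the sketch; the real pass rewrites an ExecuTorch edge program graph): dropping a leading quantize and a trailing dequantize leaves the model consuming and producing quantized tensors directly:

```python
# Toy sketch: drop a leading quantize and trailing dequantize from a
# linear sequence of op names, so the "model" takes and returns
# quantized tensors directly. The real pass rewrites a graph, not a list.
def remove_io_quant_ops(ops: list) -> list:
    ops = list(ops)
    if ops and ops[0] == "quantize_per_tensor":
        ops = ops[1:]       # input arrives already quantized
    if ops and ops[-1] == "dequantize_per_tensor":
        ops = ops[:-1]      # output stays quantized
    return ops


graph = ["quantize_per_tensor", "conv2d", "relu", "dequantize_per_tensor"]
print(remove_io_quant_ops(graph))  # ['conv2d', 'relu']
```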
cc @digantdesai @JakeStevens @robert-kalmar @skywall