NXP Backend: Update documentation to the new scheme #15219
Open
robert-kalmar wants to merge 1 commit into pytorch:main from nxp-upstream:update-nxp-doc-new-template
+255 −82
This file was deleted.
@@ -0,0 +1,71 @@
# NXP eIQ Neutron Backend

This manual page introduces the NXP eIQ Neutron backend.
NXP offers accelerated machine learning model inference on edge devices.
To learn more about NXP's machine learning acceleration platform, please refer to [the official NXP website](https://www.nxp.com/applications/technologies/ai-and-machine-learning:MACHINE-LEARNING).

<div class="admonition tip">
For the up-to-date status of running ExecuTorch on the Neutron backend, please visit the <a href="https://github.com/pytorch/executorch/blob/main/backends/nxp/README.md">manual page</a>.
</div>

## Features

ExecuTorch v1.0 supports running machine learning models on selected NXP chips (for now only the i.MXRT700).
The currently supported machine learning models include:
- Convolution-based neural networks
- Full support for MobileNetV2 and CifarNet

## Target Requirements

- Hardware with NXP's [i.MXRT700](https://www.nxp.com/products/i.MX-RT700) chip or an evaluation board such as the MIMXRT700-EVK.

## Development Requirements

- [MCUXpresso IDE](https://www.nxp.com/design/design-center/software/development-software/mcuxpresso-software-and-tools-/mcuxpresso-integrated-development-environment-ide:MCUXpresso-IDE) or [MCUXpresso Visual Studio Code extension](https://www.nxp.com/design/design-center/software/development-software/mcuxpresso-software-and-tools-/mcuxpresso-for-visual-studio-code:MCUXPRESSO-VSC)
- [MCUXpresso SDK 25.06](https://mcuxpresso.nxp.com/mcuxsdk/25.06.00/html/index.html)
- eIQ Neutron Converter for MCUXpresso SDK 25.06, which you can download from the eIQ PyPI repository:

```commandline
$ pip install --index-url https://eiq.nxp.com/repository neutron_converter_SDK_25_06
```

Instead of installing the requirements manually, you can use the setup script (it installs everything except the MCUXpresso IDE and SDK):
```commandline
$ ./examples/nxp/setup.sh
```

## Using NXP eIQ Backend

To test converting a neural network model for inference on the NXP eIQ Neutron backend, you can use our example script:

```shell
# cd to the root of the executorch repository
./examples/nxp/aot_neutron_compile.sh [model (cifar10 or mobilenetv2)]
```

For a quick overview of how to convert a custom PyTorch model, take a look at our [example Python script](https://github.com/pytorch/executorch/tree/release/1.0/examples/nxp/aot_neutron_compile.py).

## Runtime Integration

To learn how to run the converted model on NXP hardware, use one of the ExecuTorch runtime example projects from the MCUXpresso IDE example projects list.
For a more fine-grained tutorial, visit [this manual page](https://mcuxpresso.nxp.com/mcuxsdk/latest/html/middleware/eiq/executorch/docs/nxp/topics/example_applications.html).

## Reference

**→{doc}`nxp-partitioner` — Partitioner options.**

**→{doc}`nxp-quantization` — Supported quantization schemes.**

**→{doc}`tutorials/nxp-tutorials` — Tutorials.**

```{toctree}
:maxdepth: 2
:hidden:
:caption: NXP Backend
nxp-partitioner
nxp-quantization
tutorials/nxp-tutorials
```
@@ -0,0 +1,43 @@
===============
Partitioner API
===============

The Neutron partitioner API allows you to configure how the model is delegated to Neutron. Passing a ``NeutronPartitioner`` instance with no additional parameters will run as much of the model as possible on the Neutron backend. This is the most common use case.

It has the following arguments:

* `compile_spec` - a list of key-value pairs defining the compilation.
* `custom_delegation_options` - custom options for specifying node delegation.

--------------------
Compile Spec Options
--------------------
To generate the compile spec for the Neutron backend, you can use the `generate_neutron_compile_spec` function or the `NeutronCompileSpecBuilder().neutron_compile_spec()` method directly.
The following fields can be set (a usage sketch follows the list):

* `config` - the NXP platform defining the Neutron NPU configuration, e.g. "imxrt700".
* `neutron_converter_flavor` - the flavor of the neutron-converter module to use. The neutron-converter module named 'neutron_converter_SDK_25_06' has flavor 'SDK_25_06'. Set the flavor to match the MCUXpresso SDK version you will use.
* `extra_flags` - extra flags for the Neutron compiler.
* `operators_not_to_delegate` - a list of operators that will not be delegated.
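
A minimal sketch of building the compile spec and the partitioner is shown below. It reuses the values from this documentation (the ``imxrt700`` config and the ``SDK_25_06`` converter flavor); adjust them to your target platform and MCUXpresso SDK version.

.. code-block:: python

   from executorch.backends.nxp.neutron_partitioner import NeutronPartitioner
   from executorch.backends.nxp.nxp_backend import generate_neutron_compile_spec

   # Build the compile spec for the i.MXRT700 Neutron NPU configuration.
   compile_spec = generate_neutron_compile_spec(
       "imxrt700",                            # `config`: target NPU configuration
       neutron_converter_flavor="SDK_25_06",  # match your MCUXpresso SDK version
       operators_not_to_delegate=None,        # optionally keep selected operators on the CPU
   )

   # The partitioner is then passed to the lowering step,
   # e.g. to_edge_transform_and_lower(..., partitioner=[partitioner]).
   partitioner = NeutronPartitioner(compile_spec=compile_spec)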

-------------------------
Custom Delegation Options
-------------------------
By default the Neutron backend is defensive, which means it does not delegate operators whose delegation cannot be decided statically during partitioning. As the model author you typically have insight into the model, so you can allow opportunistic delegation for some cases. For the list of options, see
`CustomDelegationOptions <https://github.com/pytorch/executorch/blob/release/1.0/backends/nxp/backend/custom_delegation_options.py#L11>`_.
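
A rough sketch of passing custom delegation options to the partitioner follows. It only constructs the options object with its defaults (assuming the defaults preserve the defensive behavior) and hands it over; the concrete fields you can override are defined in ``CustomDelegationOptions`` itself, so consult that file rather than this sketch for the available knobs.

.. code-block:: python

   from executorch.backends.nxp.backend.custom_delegation_options import (
       CustomDelegationOptions,
   )
   from executorch.backends.nxp.neutron_partitioner import NeutronPartitioner

   # The defaults keep the defensive behavior; override individual fields
   # (see custom_delegation_options.py) to allow opportunistic delegation.
   delegation_options = CustomDelegationOptions()

   partitioner = NeutronPartitioner(
       compile_spec=compile_spec,  # built as in the previous sketch
       custom_delegation_options=delegation_options,
   )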

================
Operator Support
================

Operators are the building blocks of the ML model. See `IRs <https://docs.pytorch.org/docs/stable/torch.compiler_ir.html>`_ for more information on the PyTorch operator set.

This section lists the Edge operators supported by the Neutron backend.

For the detailed constraints of each operator, see the conditions in the ``is_supported_*`` functions in the `Node converters <https://github.com/pytorch/executorch/blob/release/1.0/backends/nxp/neutron_partitioner.py#L192>`_.

.. csv-table:: Operator Support
   :file: op-support.csv
   :header-rows: 1
   :widths: 20 15 30 30
   :align: center
@@ -0,0 +1,84 @@
# NXP eIQ Neutron Quantization

The eIQ Neutron NPU requires the delegated operators to be quantized. To quantize a PyTorch model for the Neutron backend, use the `NeutronQuantizer` from `backends/nxp/quantizer/neutron_quantizer.py`.
The `NeutronQuantizer` is configured to quantize the model with the quantization scheme supported by the eIQ Neutron NPU.

### Supported Quantization Schemes

The Neutron delegate supports the following quantization scheme:

- Static quantization with 8-bit symmetric weights and 8-bit asymmetric activations (via the PT2E quantization flow), per-tensor granularity.

The following operators are supported at this moment:

- `aten.abs.default`
- `aten.adaptive_avg_pool2d.default`
- `aten.addmm.default`
- `aten.add.Tensor`
- `aten.avg_pool2d.default`
- `aten.cat.default`
- `aten.conv1d.default`
- `aten.conv2d.default`
- `aten.dropout.default`
- `aten.flatten.using_ints`
- `aten.hardtanh.default`
- `aten.hardtanh_.default`
- `aten.linear.default`
- `aten.max_pool2d.default`
- `aten.mean.dim`
- `aten.pad.default`
- `aten.permute.default`
- `aten.relu.default` and `aten.relu_.default`
- `aten.reshape.default`
- `aten.view.default`
- `aten.softmax.int`
- `aten.tanh.default`, `aten.tanh_.default`
- `aten.sigmoid.default`
### Static 8-bit Quantization Using the PT2E Flow

To perform 8-bit quantization with the PT2E flow, perform the following steps prior to exporting the model to Edge:

1) Create an instance of the `NeutronQuantizer` class.
2) Use `torch.export.export` to export the model to ATen Dialect.
3) Call `prepare_pt2e` with the instance of the `NeutronQuantizer` to annotate the model with observers for quantization.
4) As static quantization is required, run the prepared model with representative samples to calibrate the quantized tensor activation ranges.
5) Call `convert_pt2e` to quantize the model.
6) Export and lower the model using the standard flow.

The output of `convert_pt2e` is a PyTorch model which can be exported and lowered using the normal flow. As it is a regular PyTorch model, it can also be used to evaluate the accuracy of the quantized model using standard PyTorch techniques.

```python
import torch
import torchvision.models as models
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights

from executorch.backends.nxp.quantizer.neutron_quantizer import NeutronQuantizer
from executorch.backends.nxp.neutron_partitioner import NeutronPartitioner
from executorch.backends.nxp.nxp_backend import generate_neutron_compile_spec
from executorch.exir import to_edge_transform_and_lower
from torchao.quantization.pt2e.quantize_pt2e import convert_pt2e, prepare_pt2e

model = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224),)

quantizer = NeutronQuantizer()  # (1)

training_ep = torch.export.export(model, sample_inputs).module()  # (2)
prepared_model = prepare_pt2e(training_ep, quantizer)  # (3)

for cal_sample in [torch.randn(1, 3, 224, 224)]:  # Replace with representative model inputs
    prepared_model(cal_sample)  # (4) Calibrate

quantized_model = convert_pt2e(prepared_model)  # (5)

compile_spec = generate_neutron_compile_spec(
    "imxrt700",
    operators_not_to_delegate=None,
    neutron_converter_flavor="SDK_25_06",
)

et_program = to_edge_transform_and_lower(  # (6)
    torch.export.export(quantized_model, sample_inputs),
    partitioner=[NeutronPartitioner(compile_spec=compile_spec)],
).to_executorch()
```
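
Because `quantized_model` is still a regular PyTorch module, you can run a quick sanity check of the quantization error before lowering. The snippet below is only a sketch that reuses the random `sample_inputs` from above; a real evaluation should use your validation data and metrics.

```python
# Compare float and quantized outputs on one sample (illustrative only).
with torch.no_grad():
    float_out = model(*sample_inputs)
    quant_out = quantized_model(*sample_inputs)

max_abs_err = (float_out - quant_out).abs().max().item()
print(f"Max absolute difference between float and quantized outputs: {max_abs_err:.4f}")
```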

See [PyTorch 2 Export Post Training Quantization](https://docs.pytorch.org/ao/main/tutorials_source/pt2e_quant_ptq.html) for more information.
@@ -0,0 +1,19 @@
Operator,Compute DType,Quantization,Constraints
aten.abs.default,int8,static int8,
aten._adaptive_avg_pool2d.default,int8,static int8,"ceil_mode=False, count_include_pad=False, divisor_override=False"
aten.addmm.default,int8,static int8,2D tensor only
aten.add.Tensor,int8,static int8,"alpha = 1, input tensors of same rank"
aten.avg_pool2d.default,int8,static int8,"ceil_mode=False, count_include_pad=False, divisor_override=False"
aten.cat.default,int8,static int8,"input_channels % 8 = 0, output_channels % 8 = 0"
aten.clone.default,int8,static int8,
aten.constant_pad_nd.default,int8,static int8,"H or W padding only"
aten.convolution.default,int8,static int8,"1D or 2D convolution, constant weights, groups=1 or groups=channels_count (depthwise)"
aten.hardtanh.default,int8,static int8,"supported ranges: <0,6>, <-1,1>, <0,1>, <0,inf>"
aten.max_pool2d.default,int8,static int8,"dilation=1, ceil_mode=False"
aten.max_pool2d_with_indices.default,int8,static int8,"dilation=1, ceil_mode=False"
aten.mean.dim,int8,static int8,"4D tensor only, dims = [-1,-2] or [-2,-1]"
aten.mm.default,int8,static int8,2D tensor only
aten.relu.default,int8,static int8,
aten.tanh.default,int8,static int8,
aten.view_copy.default,int8,static int8,
aten.sigmoid.default,int8,static int8,
@@ -0,0 +1,25 @@
# Preparing a Model for NXP eIQ Neutron Backend

This guide demonstrates the use of the ExecuTorch AoT flow to convert a PyTorch model to the ExecuTorch
format and delegate the model computation to the eIQ Neutron NPU using the eIQ Neutron Backend.

## Step 1: Environment Setup

This tutorial is intended to be run on Linux and uses Conda or Virtual Env for Python environment management. For full setup details and system requirements, see [Getting Started with ExecuTorch](/getting-started).

Create a Conda environment and install the ExecuTorch Python package.
```bash
conda create -y --name executorch python=3.12
conda activate executorch
conda install executorch
```

Run the setup.sh script to install the neutron-converter:
```commandline
$ ./examples/nxp/setup.sh
```

## Step 2: Model Preparation and Running the Model on Target

See the example `aot_neutron_compile.py` and its [README](https://github.com/pytorch/executorch/blob/release/1.0/examples/nxp/README.md) file.
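
For example, assuming you run it from the root of the executorch repository, converting one of the bundled example models can look like this (see the README for the full set of arguments):

```commandline
$ ./examples/nxp/aot_neutron_compile.sh mobilenetv2
```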
@@ -0,0 +1,10 @@
# NXP Tutorials

**→{doc}`nxp-basic-tutorial` — Lower and run a model on the NXP eIQ Neutron backend.**

```{toctree}
:hidden:
:maxdepth: 1
nxp-basic-tutorial
```
@@ -1 +1 @@
-```{include} backends-nxp.md
+```{include} backends/nxp/nxp-overview.md