Commit f27ffc1

Update
[ghstack-poisoned]
2 parents 600cf8a + 365d4c1 commit f27ffc1

75 files changed

+1751
-507
lines changed

.ci/scripts/setup-linux.sh

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ set -exu
 source "$(dirname "${BASH_SOURCE[0]}")/utils.sh"

 read -r BUILD_TOOL BUILD_MODE EDITABLE < <(parse_args "$@")
+echo "Build tool: $BUILD_TOOL, Mode: $BUILD_MODE"

 # As Linux job is running inside a Docker container, all of its dependencies
 # have already been installed, so we use PyTorch build from source here instead

.github/workflows/android-release-artifacts.yml

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@ jobs:
 fi

 FLAVOR="${{ inputs.flavor }}"
-if [[ "$FLAVOR" == "vulkan+xnnpack" ]]; then
+if [[ "$FLAVOR" == "vulkan+xnnpack" || -z "$FLAVOR" ]]; then
 export EXECUTORCH_BUILD_VULKAN=ON
 fi
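
Reading the changed condition: Vulkan is now enabled when the flavor is exactly `vulkan+xnnpack` or when the flavor input is empty. The following is a minimal Python sketch of that predicate only. `vulkan_enabled` is a hypothetical helper name introduced for illustration; the workflow itself performs this check in bash.

```python
def vulkan_enabled(flavor: str) -> bool:
    """Mirror of the bash test: [[ "$FLAVOR" == "vulkan+xnnpack" || -z "$FLAVOR" ]]."""
    # `-z "$FLAVOR"` is true when the value is the empty string.
    return flavor == "vulkan+xnnpack" or flavor == ""


if __name__ == "__main__":
    for flavor in ("vulkan+xnnpack", "", "xnnpack"):
        print(repr(flavor), vulkan_enabled(flavor))
```

Before this commit only the first of the three inputs above would have set `EXECUTORCH_BUILD_VULKAN=ON`; after it, the empty input does as well.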

backends/arm/README.md

Lines changed: 71 additions & 81 deletions
@@ -1,47 +1,74 @@
-# ExecuTorch Arm/TOSA Delegate
+# ExecuTorch Arm&reg; Delegate for TOSA devices

 This subtree contains the Arm(R) Delegate implementation for ExecuTorch.

 This delegate is structured to, over time, support a number of different Arm devices
 through an AoT flow which targets multiple Arm IP using the TOSA standard.

-The expected flow is:
-* torch.nn.module -> TOSA -> command_stream for fully AoT flows e.g. embedded.
-* torch.nn.module -> TOSA for flows supporting a JiT compilation step.
-
-Current backend support is being developed for TOSA to Ethos(TM)-U55/65/85 via the
-ethos-u-vela compilation stack. which follows the fully AoT flow.
-
-## Layout
+For more information on TOSA see https://www.mlplatform.org/tosa/tosa_spec.html
+
+**The expected flows are:**
+* torch.nn.module -> TOSA for development and validation of model export
+* torch.nn.module -> TOSA/VGF for flows supporting a JiT compilation step.
+* torch.nn.module -> TOSA -> command_stream for fully AoT flows e.g. embedded.
+
+**Current device support is for:**
+* TOSA to Ethos&trade;-U55/65/85 via the ethos-u-vela compilation stack.
+* This is cross-compiled to the appropriate target CPU
+* There is a separate arm_executor_runner for bare-metal platforms
+* TOSA to VGF via the model-converter for devices supporting the ML SDK for Vulkan&reg;
+* The VGF graph represents TOSA directly in a SPIR-V&trade; standardized form.
+* As the VGF delegate runs on Vulkan, it's required to be built with the Vulkan delegate also present.
+
+**Currently supported development platforms are:**
+* For ahead of time tooling
+* Linux aarch64
+* Linux x86_64
+* macOS with Apple silicon
+* Bare metal builds for the Ethos-U target and Cortex-M targets
+* Full testing is available in tree for the Corstone&trade; FVPs
+* This is a reference implementation for porting to silicon targets
+* Linux target support for VGF capable targets
+* This flow re-uses the common executor_runner
+
+## Layout of key components

 Export:
-- `ethosu_backend.py` - Main entrypoint for the EthosUBackend. For more information see the section on
-[Arm Backend Architecture](#arm-backend-architecture). For examples of use see `executorch/examples/arm`.
-- `tosa_mapping.py` - utilities for mapping edge dialect to TOSA
-- `tosa_quant_utils.py` - utilities for mapping quantization information to TOSA encoding
+* `tosa_backend.py` - The TOSA conversion flow all other backends rely on.
+* `ethosu/backend.py` - Main entrypoint for the EthosUBackend.
+* `vgf_backend.py` - Main entrypoint for VgfBackend.
+* For more information see the section on [Arm Backend Architecture](#arm-backend-architecture).
+* `scripts` - For the core scripts which prepare AoT dependencies such as backend compilers.

-Operators:
-- `node_visitor.py` - Base class for edge operator lowering
-- `op_*.py` - Edge operator lowering/serialization to TOSA
+Passes (which prepare the partitioned graphs for TOSA conversion):
+* `_passes/arm_pass_manager.py` - Pass manager. Will decide which passes need to be applied depending on the compile_spec.
+* `_passes/*_pass.py` - Compiler passes derived from ExportPass

-Passes:
-- `arm_pass_manager.py` - Pass manager. Will decide which passes need to be applied depending on the compile_spec.
-- `*_pass.py` - Compiler passes derived from ExportPass
+Operators (which handle mapping of operators to TOSA):
+* `operators/node_visitor.py` - Base class for edge operator lowering
+* `operators/op_*.py` - Edge operator lowering/serialization to TOSA

 Quantization:
-- `arm_quantizer.py` - Quantizers for Arm backend. Contains the EthosUQuantizer which inherits from the TOSAQuantizer
-- `arm_quantizer_utils.py` - Utilities for quantization
+* `quantizer/arm_quantizer.py` - Quantizers for Arm backend.
+* Contains the EthosUQuantizer which inherits from the TOSAQuantizer
+* Contains the VgfQuantizer which inherits from the TOSAQuantizer
+* `arm_quantizer_utils.py` - Utilities for quantization

 Runtime:
-- `runtime/ArmEthosUBackend.cpp` - The Arm backend implementation of the ExecuTorch runtime backend (BackendInterface) for Ethos-U
+- `runtime/ArmEthosUBackend.cpp` - The Arm delegate for Ethos-U targets
+- `runtime/VGFBackend.cpp` - The Arm delegate for VGF capable targets
+- `CMakeLists.txt` - the build configuration for both targets

 Other:
-- `third-party/` - Dependencies on other code - in particular the TOSA serialization_lib for compiling to TOSA and the ethos-u-core-driver for the bare-metal backend supporting Ethos-U
+- `third-party/` - Dependencies for runtime builds
 - `test/` - Unit test and test support functions

+
 ## Testing

-After a setup you can run unit tests with the test_arm_baremetal.sh script.
+The tests and related support scripts will test TOSA, Ethos-U and VGF behaviour based on the installed tools. It is expected that the relevant environment preparation has been performed as outlined in ./examples/arm/README.md.
+
+After setup you can run unit tests with the test_arm_baremetal.sh script.

 To run the pytests suite run

@@ -62,6 +89,7 @@ backends/arm/test/test_arm_baremetal.sh test_full_ethosu_fvp
 ```

 ## Unit tests
+
 This is the structure of the test directory

 ```
@@ -112,89 +140,51 @@ Please note that installing model test dependencies is a standalone process. Whe
 List of models with specific dependencies:
 - Stable Diffusion: [diffusers](https://github.com/huggingface/diffusers/tree/main)

-## Passes
-
-With the default passes in the Arm Ethos-U backend, assuming the model lowers fully to the
-Ethos-U, the exported program is composed of a Quantize node, Ethos-U custom delegate
-and a Dequantize node. In some circumstances, you may want to feed quantized input to the Neural
-Network straight away, e.g. if you have a camera sensor outputting (u)int8 data and keep all the
-arithmetic of the application in the int8 domain. For these cases, you can apply the
-`exir/passes/quantize_io_pass.py`. See the unit test in `executorch/backends/arm/
-test/passes/test_ioquantization_pass.py`for an example how to feed quantized inputs and
-obtain quantized outputs.
-
-
-### Code coverage
-
-To get code coverage:
-
-```
-coverage run --source=<SRC> --rcfile=backends/arm/test/.coveragerc -m pytest \
---config-file=/dev/null backends/arm/test/
-```
-
-All files in `SRC` and its child directories will be analysed for code coverage,
-unless explicitly exluded in the .coveragerc file. If using venv this might be
-under `env/lib/python<VERSION_NUMBER>/site-packages/executorch/`. To get the
-absolute path, run:
-
-```
-python -c "import executorch; print(executorch.__path__)"
-```
-
-This contains a list of paths where the source directory is located. Pick the
-one that is located in `env/lib`. If that does not work try the others. Add
-`backends/arm` to the path in `--source` to only get code coverage for the Arm
-backend.
-
-### A note on unit tests

-There are currently 3 ways we unit test our code.
-1. TOSA main inference. These tests are using non-quantized data and ops. Edge IR representation of the module is lowered to a TOSA flatbuffer, which is tested for numerical correcteness using the ```tosa_reference_model``` tool.
-2. TOSA base inference. Same as above, but data and ops are quantized.
-3. Ethos-U55. These tests use quantized data and ops (aka TOSA base inference). Edge IR is lowered to a TOSA flatbuffer, which is fed into the Vela compiler. Theses tests are functional tests and do not test numerical correctness, since that should be guaranteed by TOSA.
+There are currently a number of ways we unit test our code:
+1. TOSA FP. These tests are using non-quantized data and ops. Edge IR representation of the module is lowered to a TOSA flatbuffer, which is tested for numerical correctness using the ```tosa_reference_model``` tool.
+2. TOSA INT. Same as above, but data and ops are integer and represent a quantized domain.
+3. Ethos-U. These tests use quantized data and ops (aka TOSA base inference). Edge IR is lowered to a TOSA flatbuffer, which is fed into the Vela compiler. These tests are functional tests and do not test numerical correctness, since that should be guaranteed by TOSA.
+4. VGF. These tests enable both FP and INT testing for the VGF/SPIR-V representation of TOSA.

-In order to distinguise between the different tests, the following suffixes have been added to the respective test case.
-* ```_MI``` for main inference
-* ```_BI``` for base inference
-* ```_U55_BI``` for base inference on U55
+In order to distinguish between general and more targeted tests, you will find suffixes with FP, INT, U55, VGF, etc.

 ## Help & Improvements
 If you have problems or questions, or have suggestions for ways to make
 implementation and testing better, please reach out to the Arm team developing this delegate, or
-create an issue on [github](https://www.github.com/pytorch/executorch/issues).
+create an issue on [github](https://www.github.com/pytorch/executorch/issues) and add the "Partner: Arm" label.

 # Arm Backend Architecture

 The broad principle with the Arm backend implementation for ExecuTorch is to support multiple Arm devices and device configurations through a largely homogeneous flow with maximal sharing of class logic.
-The EthosUBackend is currently the one user facing API that target the Ethos-U55 and Ethos-U85 hardware IP. It is using the TOSABackend under the hood to share code and functionality, but also to separate testing possibilities to the TOSA flow itself.
+The EthosUBackend and VgfBackend are the user facing targets available for the Ethos-U55 and Ethos-U85 hardware IP, and for VGF targets. Both use the TOSABackend under the hood to share compiler passes and legalisation, along with other code and functionality, and also to enable separate testing of the TOSA flow itself.

 In practice for compilation, this means that the flow goes via [Arm TOSA](https://www.mlplatform.org/tosa/tosa_spec.html) to produce a common IR and quantization behaviour compatible with our various IP, and typically, device-specific backends to further lower to a device specific binary which can happen ahead of time (within the Python development flow) or at runtime (during a JIT compilation stage).

-In practice for the runtime, this means we will share common runtime backend functionality, with the aim for features like debugging to be available through common tooling.
-

 ## Arm Backend Status and Maturity

-The Arm EthosU Backend should be considered a prototype quality at this point, likely subject to significant change and improvement, and with a limited coverage of functionality. We are actively developing this codebase.
+The Arm EthosU Backend should be considered of reasonable quality at this point, supporting a large number of operators and major networks.
+The Arm VGF Backend should be considered of Alpha quality, likely subject to significant change and improvement, and with a limited coverage of functionality.
+We are actively developing the codebase for both targets.

 ## Current flows

-The EthosUBackend has a two stage process,
-- Compile to TOSA to rationalise the graph into known hardware support profiles. Currently this is to v1.0 TOSA INT with specific concern to a subset which gives support on Ethos-U55 and Ethos-U85, the target of the initial prototype efforts. This calls into the TOSABackend.
-- Lower via the ethos-u-vela compilation flow which takes TOSA v1.0 as an input and produces a low level commandstream for the hardware which is then passed via the delegate to the ethos-u-core-driver for direct execution.
+The Arm backends have a two-stage process:
+1. Compile to TOSA by applying FX passes and legalizing the graph into supported TOSA profiles. Currently this targets TOSA v1.0 INT/FP, via calls into the TOSABackend.
+1. Lower via the target compilation flow which takes TOSA v1.0 as an input and produces a lower level format for the hardware
+* For Ethos-U this is a hardware command stream that can be executed directly on hardware
+* For VGF this is a SPIR-V representation of TOSA to enable JiT compilation on the target platform

-The EthosUPartitioner is currenly used to ensure the operations converted are Ethos-U compatible, but will be extended to offer spec-correct TOSA Base inference and TOSA Main Inference generation in future.
+All targets provide a partitioner to enable the standard partially delegated flow offered by ExecuTorch.

-There is also a generic TOSABackend with accompanying TOSAPartitioner and TOSAQuantizer, which are used by the EthosUBackend and friends. The Arm TOSA Backend can be used by it's own to verify the lowering to the TOSA representation of the model (refer to the unit tests in backends/arm/test which uses the TOSA backend in the test suites).
+There is also a generic TOSABackend with accompanying TOSAPartitioner and TOSAQuantizer; these can be used directly to verify the lowering to the TOSA representation of the model (refer to the unit tests in backends/arm/test, which use the TOSA backend in the test suites).

 ### Controlling compilation

 It is possible to control the compilation flow to aid in development and debug of both networks and the code itself.

-Configuration of the EthosUBackend export flow is controlled by CompileSpec information (essentially used as compilation flags) to determine which of these outputs is produced. In particular this allows for use of the tosa_reference_model to run intermediate output to check for correctness and quantization accuracy without a full loop via hardware implemntation.
-
-As this is in active development see the EthosUBackend for accurate information on [compilation flags](https://github.com/pytorch/executorch/blob/29f6dc9353e90951ed3fae3c57ae416de0520067/backends/arm/arm_backend.py#L319-L324)
+Configuration of the export flow is controlled by CompileSpec information (essentially used as compilation flags) to determine which of these outputs is produced. In particular this allows for compilation flags, capturing intermediate forms during lowering, and use of the tosa_reference_model to run intermediate output to check for correctness and quantization accuracy without a full loop via hardware implementation.

 ## Model specific and optional passes
 The current TOSA version does not support int64. However, int64 is commonly used in many models. In order to lower the operators with int64 inputs and/or outputs to TOSA, a few passes have been developed to handle the int64-related issues. The main idea behind these passes is to replace the uses of int64 with int32 where feasible.
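
The last README paragraph says the int64 passes replace int64 with int32 "where feasible". One plausible reading of "feasible" is that every value fits in the int32 range, so narrowing loses nothing. The sketch below illustrates only that range check in plain Python. `fits_in_int32` and `choose_dtype` are hypothetical names, and this is not how the ExecuTorch passes are implemented (the diff does not show them).

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_in_int32(values):
    """Return True if every integer in `values` is representable as int32."""
    return all(INT32_MIN <= v <= INT32_MAX for v in values)


def choose_dtype(values):
    """Pick 'int32' when narrowing is lossless, otherwise keep 'int64'."""
    return "int32" if fits_in_int32(values) else "int64"


if __name__ == "__main__":
    print(choose_dtype([0, 1, -5]))   # small values can be narrowed
    print(choose_dtype([2**31]))      # one past INT32_MAX cannot
```

A real pass would also have to consider values produced at runtime, not just constants, which is presumably why the README hedges with "where feasible".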

backends/arm/_passes/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
 from .decompose_acosh_pass import DecomposeAcoshPass # noqa
 from .decompose_adaptive_avg_pool2d_pass import DecomposeAdaptiveAvgPool2dPass # noqa
 from .decompose_addmm_pass import DecomposeAddmmPass # noqa
-from .decompose_asin_pass import DecomposeAsinPass # noqa
+from .decompose_asin_and_acos_pass import DecomposeAsinAndAcosPass # noqa
 from .decompose_asinh_pass import DecomposeAsinhPass # noqa
 from .decompose_atan_pass import DecomposeAtanPass # noqa
 from .decompose_atanh_pass import DecomposeAtanhPass # noqa
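
The rename from `DecomposeAsinPass` to `DecomposeAsinAndAcosPass` suggests the two operators are now handled by one pass. A possible motivation is the identity acos(x) = π/2 − asin(x), which lets one function be derived from the other. The sketch below only checks that identity numerically with Python's `math` module. `acos_via_asin` is a hypothetical helper, and the actual decomposition used by the pass is not shown in this diff.

```python
import math


def acos_via_asin(x: float) -> float:
    """Compute acos(x) from asin(x) using acos(x) = pi/2 - asin(x), for x in [-1, 1]."""
    return math.pi / 2 - math.asin(x)


if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        # The two columns should agree to floating-point precision.
        print(x, acos_via_asin(x), math.acos(x))
```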

backends/arm/_passes/arm_pass_manager.py

Lines changed: 2 additions & 2 deletions
@@ -30,8 +30,8 @@
 DecomposeAcoshPass,
 DecomposeAdaptiveAvgPool2dPass,
 DecomposeAddmmPass,
+DecomposeAsinAndAcosPass,
 DecomposeAsinhPass,
-DecomposeAsinPass,
 DecomposeAtanhPass,
 DecomposeAtanPass,
 DecomposeAvgPool2d,
@@ -171,9 +171,9 @@ def _tosa_FP_pipeline(self, exported_program: ExportedProgram) -> GraphModule:
 self.add_pass(DecomposeMaskedFill())
 self.add_pass(DecomposeRoundPass())
 self.add_pass(DecomposeAcoshPass())
-self.add_pass(DecomposeAsinPass())
 self.add_pass(DecomposeAsinhPass())
 self.add_pass(DecomposeCoshPass())
+self.add_pass(DecomposeAsinAndAcosPass())
 self.add_pass(DecomposeSqrtPass())
 self.add_pass(DecomposeAtanPass())
 self.add_pass(DecomposeAtanhPass())
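
The second hunk shows the FP pipeline as an ordered sequence of `add_pass` calls, with the asin/acos decomposition now registered after `DecomposeCoshPass`. Why registration order can matter is sketched below with a toy model in which a "graph" is just a list of op names. `ToyPassManager` and both toy passes are hypothetical stand-ins, not the ExecuTorch classes or their behaviour.

```python
from typing import Callable, List

# A "graph" here is just a list of op names; a pass maps one list to another.
Pass = Callable[[List[str]], List[str]]


class ToyPassManager:
    """Hypothetical stand-in for a pass manager: runs passes in registration order."""

    def __init__(self) -> None:
        self._passes: List[Pass] = []

    def add_pass(self, p: Pass) -> None:
        self._passes.append(p)

    def run(self, ops: List[str]) -> List[str]:
        for p in self._passes:
            ops = p(ops)
        return ops


def decompose_acos(ops: List[str]) -> List[str]:
    # Toy rewrite: acos(x) becomes asin followed by a subtraction from pi/2.
    out: List[str] = []
    for op in ops:
        out.extend(["asin", "sub"] if op == "acos" else [op])
    return out


def decompose_asin(ops: List[str]) -> List[str]:
    # Toy rewrite: asin becomes a lower-level "asin_approx" primitive.
    return ["asin_approx" if op == "asin" else op for op in ops]


def run_in_order(passes: List[Pass], ops: List[str]) -> List[str]:
    """Register `passes` in the given order and run them over `ops`."""
    pm = ToyPassManager()
    for p in passes:
        pm.add_pass(p)
    return pm.run(ops)
```

In this toy setup, running `decompose_acos` before `decompose_asin` fully lowers an `acos`, while the reverse order leaves an un-lowered `asin` behind. Handling both operators in a single pass, as the renamed class appears to do, avoids depending on that ordering.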
