Commit 1defb11

testing and review fixes

Signed-off-by: Rob Elliott <[email protected]>
1 parent 5de8f8a

File tree: 2 files changed (+78, -22 lines)


backends/arm/README.md

Lines changed: 9 additions & 9 deletions
```diff
@@ -1,4 +1,4 @@
-# ExecuTorch Arm/TOSA Delegate
+# ExecuTorch Arm&reg; Delegate for TOSA devices
 
 This subtree contains the Arm(R) Delegate implementation for ExecuTorch.
 
```
```diff
@@ -7,26 +7,26 @@ through an AoT flow which targets multiple Arm IP using the TOSA standard.
 
 For more information on TOSA see https://www.mlplatform.org/tosa/tosa_spec.html
 
-The expected flows are:
+**The expected flows are:**
 * torch.nn.module -> TOSA for development and validation of model export
 * torch.nn.module -> TOSA/VGF for flows supporting a JiT compilation step.
 * torch.nn.module -> TOSA -> command_stream for fully AoT flows e.g. embedded.
 
-Currently device support is for:
-* TOSA to Ethos(TM)-U55/65/85 via the ethos-u-vela compilation stack.
+**Currently device support is for:**
+* TOSA to Ethos&trade;-U55/65/85 via the ethos-u-vela compilation stack.
   * This is cross-compiled to the appropriate target CPU
   * There is a separate arm_executor_runner for bare-metal platforms
-* TOSA to VGF via the model-converter for devices supporting the ML SDK for Vulkan(R)
-  * The VGF graph represents TOSA directly in a SPIR-V(TM) standardized form.
+* TOSA to VGF via the model-converter for devices supporting the ML SDK for Vulkan&reg;
+  * The VGF graph represents TOSA directly in a SPIR-V&trade; standardized form.
   * As the VGF delegate runs on Vulkan, it's required to be built with the Vulkan delegate also present.
 
-Currently supported development platforms are:
+**Currently supported development platforms are:**
 * For ahead of time tooling
   * Linux aarch64
   * Linux x86_64
   * macOS with Apple silicon
 * Bare metal builds for the Ethos-U and Cortex-M targets
-  * Full testing is available in tree for the Corstone(TM) FVPs
+  * Full testing is available in tree for the Corstone&trade; FVPs
   * This is a reference implementation for porting to silicon targets
 * Linux target support for VGF capable targets
   * This flow re-uses the common executor_runner
```
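The device-support list above pairs each target with the tool that consumes the TOSA output. As an illustrative summary only (this mapping and the `compiler_for` helper are invented here for orientation; they are not an ExecuTorch API):

```python
# Which compilation stack consumes the TOSA output for each device flow
# described in the README above. Illustrative sketch only.
TOSA_FLOWS = {
    "ethos-u55": "ethos-u-vela",   # fully AoT: TOSA -> command stream, bare metal
    "ethos-u65": "ethos-u-vela",
    "ethos-u85": "ethos-u-vela",
    "vgf": "model-converter",      # ML SDK for Vulkan: TOSA in SPIR-V form
}

def compiler_for(target: str) -> str:
    """Return the compilation stack used for a given device target."""
    return TOSA_FLOWS[target]

print(compiler_for("ethos-u85"))  # ethos-u-vela
print(compiler_for("vgf"))        # model-converter
```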
```diff
@@ -66,7 +66,7 @@ Other:
 
 ## Testing
 
-The unit tests and related support scripts will test TOSA, Ethos-U and VGF behaviour based on the installed tools. It is expected that the relevant environment preperation has been performed as outlined in the guide available here https://docs.pytorch.org/executorch/main/tutorial-arm.html
+The tests and related support scripts exercise TOSA, Ethos-U and VGF behaviour based on the installed tools. It is expected that the relevant environment preparation has been performed as outlined in the guide available at https://docs.pytorch.org/executorch/main/tutorial-arm.html
 
 After setup you can run unit tests with the test_arm_baremetal.sh script.
 
```
docs/source/tutorial-arm.md

Lines changed: 69 additions & 13 deletions
```diff
@@ -1,4 +1,4 @@
-# Arm(R) Backend Tutorial
+# Arm&reg; Backend Tutorial
 
 <!----This will show a grid card on the page----->
 ::::{grid} 2
```
````diff
@@ -26,10 +26,10 @@ You may encounter some rough edges and features which may be documented or plann
 
 ```{tip}
 If you are already familiar with this delegate, you may want to jump directly to the examples:
-* [https://github.com/pytorch/executorch/tree/main/examples/arm](https://github.com/pytorch/executorch/tree/main/examples/arm)
-* [https://github.com/pytorch/executorch/blob/main/examples/arm/ethos_u_minimal_example.ipynb](Compilation for Ethos-U)
-* [https://github.com/pytorch/executorch/blob/main/examples/arm/vgf_minimal_example.ipynb](Compilation for VGF/ML-SDK)
-* [https://github.com/pytorch/executorch/blob/main/examples/arm/aot_arm_compiler.py](A commandline compiler for example models)
+* [Examples in the ExecuTorch repository](https://github.com/pytorch/executorch/tree/main/examples/arm)
+* [Compilation for Ethos-U](https://github.com/pytorch/executorch/blob/main/examples/arm/ethos_u_minimal_example.ipynb)
+* [Compilation for VGF/ML-SDK](https://github.com/pytorch/executorch/blob/main/examples/arm/vgf_minimal_example.ipynb)
+* [A commandline compiler for example models](https://github.com/pytorch/executorch/blob/main/examples/arm/aot_arm_compiler.py)
 ```
 
 ## Prerequisites
````
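The hunk above repairs links that were written backwards, as `[url](text)` instead of `[text](url)`. The same repair can be mechanized; a minimal sketch (the `fix_reversed_links` helper and its heuristic are invented here, not part of any ExecuTorch tooling):

```python
import re

# Matches a markdown inline link: [text](target).
LINK = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def fix_reversed_links(line: str) -> str:
    """Swap text/target when the visible text looks like a URL but the target does not."""
    def swap(match):
        text, target = match.group(1), match.group(2)
        if text.startswith("http") and not target.startswith("http"):
            return f"[{target}]({text})"
        return match.group(0)
    return LINK.sub(swap, line)

print(fix_reversed_links("* [https://example.com/a.ipynb](Compilation for Ethos-U)"))
# * [Compilation for Ethos-U](https://example.com/a.ipynb)
```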
````diff
@@ -69,7 +69,6 @@ For VGF run:
 ```
 It is possible to install both sets of dependencies if you omit the disable options.
 
-Upon successful execution, you can directly go to [the next step](#convert-the-pytorch-model-to-the-pte-file).
 
 ### Notes:
 
````
````diff
@@ -203,27 +202,50 @@ graph_module_edge.exported_program = to_backend(
 
 Similar to the non-delegate flow, the same script will serve as a helper utility to generate the `.pte` file. Notice the `--delegate` option to enable the `to_backend` call.
 
+For Ethos targets:
 ```bash
 python3 -m examples.arm.aot_arm_compiler --model_name="add" --delegate
+# This targets the default of ethos-u55-128, see --help for further targets
 # should produce ./add_arm_delegate_ethos-u55-128.pte
 ```
 
-### Delegated Quantized Workflow
-Generating the `.pte` file can be done using the aot_arm_compiler:
+For basic post-training quantization:
 ```bash
 python3 -m examples.arm.aot_arm_compiler --model_name="mv2" --delegate --quantize
+# This targets the default of ethos-u55-128, see --help for further targets
 # should produce ./mv2_arm_delegate_ethos-u55-128.pte
 ```
 
+
+For VGF targets:
+```bash
+python3 -m examples.arm.aot_arm_compiler --model_name="add" --target=vgf --delegate
+# should produce ./add_arm_delegate_vgf.pte
+```
+
+For basic post-training quantization:
+```bash
+python3 -m examples.arm.aot_arm_compiler --model_name="mv2" --target=vgf --delegate --quantize
+# should produce ./mv2_arm_delegate_vgf.pte
+```
+
+To capture intermediates such as VGF for lower level integration, invoke with the "-i" option:
+```bash
+python3 -m examples.arm.aot_arm_compiler --model_name="mv2" --target=vgf --delegate --quantize -i ./mv2_output
+# should produce ./mv2_arm_delegate_vgf.pte and intermediates in ./mv2_output/
+```
+
 <br />
 
-At the end of this, you should have three different `.pte` files.
+At the end of this, you should have a number of different `.pte` files:
 
-- The first one contains the [SoftmaxModule](#softmaxmodule), without any backend delegates.
-- The second one contains the [AddModule](#addmodule), with Arm Ethos-U backend delegate enabled.
-- The third one contains the [quantized MV2Model](#mv2module), with the Arm Ethos-U backend delegate enabled as well.
+- the SoftmaxModule, without any backend delegates.
+- the AddModule, targeting the Arm Ethos-U backend.
+- the quantized MV2Model, targeting the Arm Ethos-U backend.
+- the AddModule, targeting the VGF backend.
+- the quantized MV2Model, targeting the VGF backend.
 
-Now let's try to run these `.pte` files on a Corstone-300 and Corstone-320 platforms in a bare-metal environment.
+Now let's try to run these `.pte` files on a target.
 
 ## Getting a Bare-Metal Executable
 

````
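The output filenames in the compile examples above all follow one pattern: `<model>_arm_delegate_<target>.pte`. A minimal sketch of that convention (the `delegate_pte_name` helper is invented here for illustration; it is not part of aot_arm_compiler):

```python
# Reconstructs the output .pte filename reported by the aot_arm_compiler
# examples above. Grounded only in the filenames shown in this tutorial.
def delegate_pte_name(model_name: str, target: str = "ethos-u55-128") -> str:
    """Mirror the '<model>_arm_delegate_<target>.pte' pattern from the examples."""
    return f"{model_name}_arm_delegate_{target}.pte"

print(delegate_pte_name("add"))                # add_arm_delegate_ethos-u55-128.pte
print(delegate_pte_name("mv2", target="vgf"))  # mv2_arm_delegate_vgf.pte
```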
````diff
@@ -391,6 +413,40 @@ I [executorch:arm_executor_runner.cpp:179]
 The `run.sh` script provides various options to select a particular FVP target, use desired models, select portable kernels and can be explored using the `--help` argument
 ```
 
+## Running on the VGF backend with the standard executor_runner for Linux
+
+Follow the typical [Building ExecuTorch with CMake](using-executorch-building-from-source.md) flow to build the Linux target, ensuring that the VGF delegate is enabled:
+
+```bash
+-DEXECUTORCH_BUILD_VGF=ON
+```
+
+A full example build line is:
+```bash
+cmake \
+  -DCMAKE_INSTALL_PREFIX=cmake-out \
+  -DCMAKE_BUILD_TYPE=Release \
+  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
+  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
+  -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
+  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
+  -DEXECUTORCH_BUILD_XNNPACK=OFF \
+  -DEXECUTORCH_BUILD_VULKAN=ON \
+  -DEXECUTORCH_BUILD_VGF=ON \
+  -DEXECUTORCH_ENABLE_LOGGING=ON \
+  -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
+  -DPYTHON_EXECUTABLE=python \
+  -Bcmake-out .
+cmake --build cmake-out -j25 --target install --config Release
+```
+
+You can then invoke the executor_runner on the host machine, which will use the VGF delegate, and requires the Vulkan layer drivers we installed with setup.sh.
+
+```bash
+./cmake-out/executor_runner -model_path add_arm_delegate_vgf.pte
+```
+
 ## Takeaways
 In this tutorial you have learnt how to use the ExecuTorch software to both export a standard model from PyTorch and to run it on the compact and fully functional ExecuTorch runtime, enabling a smooth path for offloading models from PyTorch to Arm based platforms.
 
````
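One constraint from this commit worth checking mechanically: the VGF delegate runs on Vulkan, so a configure line with `-DEXECUTORCH_BUILD_VGF=ON` should also carry `-DEXECUTORCH_BUILD_VULKAN=ON`, as the example build line above does. A small illustrative check of that pairing (the `cmake_flags` and `vgf_config_ok` helpers are invented for this sketch; they are not part of ExecuTorch's build system):

```python
def cmake_flags(args):
    """Parse -D<NAME>=<VALUE> CMake cache arguments into a dict."""
    flags = {}
    for arg in args:
        if arg.startswith("-D") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            flags[name] = value
    return flags

def vgf_config_ok(args):
    """The VGF delegate runs on Vulkan, so VGF=ON requires VULKAN=ON as well."""
    f = cmake_flags(args)
    if f.get("EXECUTORCH_BUILD_VGF") != "ON":
        return True  # VGF not requested; no constraint to enforce
    return f.get("EXECUTORCH_BUILD_VULKAN") == "ON"

print(vgf_config_ok(["-DEXECUTORCH_BUILD_VGF=ON", "-DEXECUTORCH_BUILD_VULKAN=ON"]))  # True
print(vgf_config_ok(["-DEXECUTORCH_BUILD_VGF=ON"]))  # False
```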