Merge pull request #2367 from madeline-underwood/training_inference

jasonrandrews · web-flow · commit 1cdc34f26322 · 2025-09-30T12:55:43.000-05:00
Training inference_JA to sign off
diff --git a/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/_index.md b/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/_index.md
@@ -1,32 +1,30 @@
 ---
-title: Edge AI with PyTorch & ExecuTorch - Tiny Rock-Paper-Scissors on Arm
+title: "Edge AI on Arm: PyTorch and ExecuTorch rock-paper-scissors"
 
 minutes_to_complete: 60
 
-who_is_this_for: This learning path is for machine learning developers interested in deploying TinyML models on Arm-based edge devices. You will learn how to train and deploy a machine learning model for the classic game "Rock-Paper-Scissors" on edge devices. You'll use PyTorch and ExecuTorch, frameworks designed for efficient on-device inference, to build and run a small-scale computer vision model.
-
+who_is_this_for: This is an introductory topic for machine learning developers who want to deploy TinyML models on Arm-based edge devices using PyTorch and ExecuTorch.
 
 learning_objectives:
-    - Train a small Convolutional Neural Network (CNN) for image classification using PyTorch.
-    - Understand how to use synthetic data generation for training a model when real-world data is limited.
-    - Optimize and convert a PyTorch model into an ExecuTorch program (.pte) for Arm-based devices.
-    - Run the trained model on a local machine to play an interactive mini-game, demonstrating model inference.
-
+  - Train a small Convolutional Neural Network (CNN) for image classification using PyTorch
+  - Use synthetic data generation for training a model when real data is limited
+  - Convert and optimize a PyTorch model to an ExecuTorch program (`.pte`) for Arm-based devices
+  - Run the trained model locally as an interactive mini-game to demonstrate inference
 
 prerequisites:
-   - A basic understanding of machine learning concepts.
-   - Familiarity with Python and the PyTorch library.
-   - Having completed [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm).
-   - An x86 Linux host machine or VM running Ubuntu 22.04 or higher.
+  - Basic understanding of machine learning concepts
+  - Familiarity with Python and the PyTorch library
+  - Completion of the Learning Path [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm/)
+  - An x86 Linux host machine or VM running Ubuntu 22.04 or later
 
 author: Dominica Abena O. Amanfo
 
 ### Tags
 skilllevels: Introductory
 subjects: ML
 armips:
-    - Cortex-M
-    - Ethos-U
+  - Cortex-M
+  - Ethos-U
 tools_software_languages:
     - tinyML
     - Computer Vision
@@ -36,23 +34,21 @@ tools_software_languages:
     - ExecuTorch
 
 operatingsystems:
-    - Linux
+  - Linux
 
 further_reading:
-    - resource:
-        title: Run Llama 3 on a Raspberry Pi 5 using ExecuTorch
-        link: /learning-paths/embedded-and-microcontrollers/rpi-llama3
-        type: website
-    - resource:
-        title: ExecuTorch Examples
-        link: https://github.com/pytorch/executorch/blob/main/examples/README.md
-        type: website
-
-
+  - resource:
+      title: Run Llama 3 on a Raspberry Pi 5 using ExecuTorch
+      link: /learning-paths/embedded-and-microcontrollers/rpi-llama3
+      type: website
+  - resource:
+      title: ExecuTorch examples
+      link: https://github.com/pytorch/executorch/blob/main/examples/README.md
+      type: website
 
 ### FIXED, DO NOT MODIFY
 # ================================================================================
 weight: 1                       # _index.md always has weight of 1 to order correctly
 layout: "learningpathall"       # All files under learning paths have this same wrapper
 learning_path_main_page: "yes"  # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
----
+---
diff --git a/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/env-setup-1.md b/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/env-setup-1.md
@@ -1,46 +1,45 @@
 ---
-title: Environment Setup
+title: Set up your environment 
 weight: 2
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
 
-## Overview
-This learning path (LP) is a direct follow-up to the [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) learning path. While the previous one introduced you to the core concepts and the toolchain, this one puts that knowledge into practice with a fun, real-world example. You will move from the simple [Feedforward Neural Network](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm/4-build-model) in the previous LP, to a more practical computer vision task: A tiny Rock-Paper-Scissors game, to demonstrate how these tools can be used to solve a tangible problem and run efficiently on Arm-based edge devices.
+## Set up your environment for tiny rock-paper-scissors on Arm
+
+This Learning Path is a direct follow-up to [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm). While the previous Learning Path introduced the core concepts and toolchain, this one puts that knowledge into practice with a small, real-world example. You move from a simple [Feedforward Neural Network](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm/4-build-model) to a practical computer vision task: a tiny rock-paper-scissors game that runs efficiently on Arm-based edge devices.
 
 You will train a lightweight CNN to classify images of the letters R, P, and S as "rock," "paper," or "scissors." The script uses a synthetic data renderer to create a large dataset of these images with various transformations and noise, eliminating the need for a massive real-world dataset.
 
 ### What is a Convolutional Neural Network (CNN)?
-A Convolutional Neural Network (CNN) is a type of deep neural network primarily used for analyzing visual imagery. Unlike traditional neural networks, CNNs are designed to process pixel data by using a mathematical operation called **convolution**. This allows them to automatically and adaptively learn spatial hierarchies of features from input images, from low-level features like edges and textures to high-level features like shapes and objects.
-
-![Image of a convolutional neural network architecture](image.png)
-[Image credits](https://medium.com/@atul_86537/learning-ml-from-first-principles-c-linux-the-rick-and-morty-way-convolutional-neural-c76c3df511f4).
 
-CNNs are the backbone of many modern computer vision applications, including:
+A convolutional neural network (CNN) is a deep neural network designed to analyze visual data using the *convolution* operation. Unlike fully connected networks, CNNs learn spatial hierarchies of features directly from pixels, from low-level edges and textures to high-level shapes and objects.
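
As a rough illustration of the convolution operation described above, here is a minimal pure-Python sketch. The function name `conv2d_valid`, the 4x4 image, and the 2x2 edge kernel are hypothetical and are not taken from `rps_tiny.py`. Like most deep learning frameworks, the sketch slides the kernel without flipping it (strictly speaking, cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 4x4 image: dark left half, bright right half
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical kernel that responds to a dark-to-bright step from left to right
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, edge_kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the column where the vertical edge sits. A CNN stacks many such kernels and learns their weights during training instead of using hand-written values.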
 
-- **Image Classification:** Identifying the main object in an image, like classifying a photo as a "cat" or "dog".
-- **Object Detection:** Locating specific objects within an image and drawing a box around them.
-- **Facial Recognition:** Identifying and verifying individuals based on their faces.
+Common CNN applications include:
 
-For the Rock-Paper-Scissors game, you'll use a tiny CNN to classify images of the letters R, P, and S as the corresponding hand gestures.
+- Image classification: identify the main object in an image, such as classifying a photo as a cat or dog
+- Object detection: locate specific objects in an image and draw bounding boxes
+- Facial recognition: identify or verify individuals based on facial features
 
+For the rock-paper-scissors game, you use a tiny CNN to classify the letters R, P, and S as the corresponding hand gestures.
 
+## Environment setup
 
-## Environment Setup
-To get started, follow the first three chapters of the [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) Learning Path. This will set up your development environment and install the necessary tools. Return to this LP once you've run the `./examples/arm/run.sh` script in the ExecuTorch repository.
+To get started, complete the first three sections of [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm). This setup prepares your development environment and installs the required tools. Return here after running the `./examples/arm/run.sh` script in the ExecuTorch repository.
 
-If you just followed the LP above, you should already have your virtual environment activated. If not, activate it using:
+If you just completed the earlier Learning Path, your virtual environment should still be active. If not, activate it:
 
 ```console
 source $HOME/executorch-venv/bin/activate
 ```
 The prompt of your terminal now has `(executorch-venv)` as a prefix to indicate the virtual environment is active.
 
-Run the commands below to install the dependencies.
+Install Python dependencies:
 
-```bash
-pip install argparse numpy pillow torch
+```console
+pip install numpy pillow torch
 ```
-You are now ready to create the model.
 
+You’re now ready to create the model.
diff --git a/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/fine-tune-2.md b/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/fine-tune-2.md
@@ -1,20 +1,20 @@
 ---
-title: Train and Test the Rock-Paper-Scissors Model
+title: Train and test the rock-paper-scissors model
 weight: 3
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
 
-## Build the Model
+## Build the model
 
-Navigate to the Arm examples directory in the ExecuTorch repository.
+Navigate to the Arm examples directory in the ExecuTorch repository:
 
 ```bash
 cd $HOME/executorch/examples/arm
 ```
 
-Using a file editor of your choice, create a file named `rps_tiny.py`, copy and paste the code shown below:
+Create a file named `rps_tiny.py` and paste the following code:
 
 ```python
 #!/usr/bin/env python3
@@ -369,24 +369,24 @@ if __name__ == "__main__":
 ```
 
 
-### About the Script
+### About the script
 The script handles the entire workflow: data generation, model training, and a simple command-line game.
 
-- **Synthetic Data Generation:** The script includes a function `render_rps()` that generates 28x28 grayscale images of the letters 'R', 'P', and 'S' with random rotations, blurs, and noise. This creates a diverse dataset that's used to train the model.
-- **Model Architecture:** The model, a TinyRPS class, is a simple Convolutional Neural Network (CNN). It uses a series of 2D convolutional layers, followed by pooling layers to reduce spatial dimensions, and finally, fully connected linear layers to produce a final prediction. This architecture is efficient and well-suited for edge devices.
-- **Training:** The script generates synthetic training and validation datasets. It then trains the CNN model using the **Adam optimizer** and **Cross-Entropy Loss**. It tracks validation accuracy and saves the best-performing model to `rps_best.pt`.
-- **ExecuTorch Export:** A key part of the script is the `export_to_pte()` function. This function uses the `torch.export module` (or a fallback) to trace the trained PyTorch model and convert it into an ExecuTorch program (`.pte`). This compiled program is highly optimized for deployment on any target hardware, for example Cortex-M or Cortex-A CPUs for embedded devices.
-- **CLI Mini-Game**: After training, you can play an interactive game. The script generates an image of your move and a random opponent's move. It then uses the trained model to classify both images and determines the winner based on the model's predictions.
+- Synthetic data generation: the script includes a function `render_rps()` that generates 28x28 grayscale images of the letters 'R', 'P', and 'S' with random rotations, blurs, and noise. This creates a diverse dataset for training the model.
+- Model architecture: the model, a `TinyRPS` class, is a simple Convolutional Neural Network (CNN). It uses a series of 2D convolutional layers, followed by pooling layers to reduce spatial dimensions, and finally fully connected linear layers to produce the prediction. This architecture is efficient and well-suited for edge devices.
+- Training: the script generates synthetic training and validation datasets. It then trains the CNN model using the **Adam optimizer** and **Cross-Entropy Loss**. It tracks validation accuracy and saves the best-performing model to `rps_best.pt`.
+- ExecuTorch export: a key part of the script is the `export_to_pte()` function. This function uses the `torch.export` module (or a fallback) to trace the trained PyTorch model and convert it into an ExecuTorch program (`.pte`). This compiled program is optimized for deployment on target hardware such as Cortex-M or Cortex-A CPUs in embedded devices.
+- CLI mini-game: after training, you can play an interactive game. The script generates an image of your move and a random opponent's move. It then uses the trained model to classify both images and determines the winner based on the model's predictions.
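
The last step in the list above, turning two classifications into a game result, can be sketched as follows. This is a minimal illustration, not the code from `rps_tiny.py`: the names `LABELS`, `BEATS`, `predict_label`, and `decide`, the class order, and the example logits are all assumptions made for this sketch.

```python
# Assumed class order; the real script may order its classes differently.
LABELS = ("rock", "paper", "scissors")

# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def predict_label(logits):
    """Return the label whose logit is largest (argmax over class scores)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return LABELS[best]


def decide(player, opponent):
    """Return 'draw', 'player', or 'opponent' for two predicted moves."""
    if player == opponent:
        return "draw"
    return "player" if BEATS[player] == opponent else "opponent"


# Hypothetical model outputs for the two rendered images
player_move = predict_label([0.2, 2.9, -1.0])
opponent_move = predict_label([3.1, 0.4, 0.1])
result = decide(player_move, opponent_move)
print(player_move, opponent_move, result)  # paper rock player
```

Because the winner is computed from predicted labels rather than from the typed input, a misclassified image changes the outcome of the round, which makes model errors visible during play.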
 
-### Running the Script:
+## Run the script
 
-To train the model, export it, and play the game, run the following command:
+Train the model, export it, and play the game:
 
 ```bash
 python rps_tiny.py --epochs 8 --export --play
 ```
 
-You'll see the training progress, where the model's accuracy rapidly improves on the synthetic data.
+You’ll see training progress similar to:
 
 ```output
 == Building synthetic datasets ==
@@ -402,7 +402,8 @@ Training done.
 Loaded weights from rps_best.pt
 [export] wrote rps_tiny.pte
 ```
-After training and export, the game will start. Type rock, paper, or scissors and see the model's predictions and what your opponent played.
+
+After training and export, the game starts. Type rock, paper, or scissors, and review the model’s predictions for you and a random opponent:
 
 ```output
 === Rock–Paper–Scissors: Play vs Tiny CNN ===
diff --git a/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/fvp-3.md b/content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/fvp-3.md
@@ -6,13 +6,15 @@ weight: 4
 layout: learningpathall
 ---
 
-This section guides you through the process of compiling your trained Rock-Paper-Scissors model and running it on a simulated Arm-based edge device, the Corstone-320 Fixed Virtual Platform (FVP). This final step demonstrates the end-to-end workflow of deploying a TinyML model for on-device inference.
+## Compile and run the rock-paper-scissors model on Corstone-320 FVP
+
+This section shows how to compile your trained rock-paper-scissors model and run it on the Corstone-320 Fixed Virtual Platform (FVP), a simulated Arm-based edge device. This completes the end-to-end workflow for deploying a TinyML model for on-device inference.
 
 ## Compile and build the executable
 
-First, you'll use the Ahead-of-Time (AOT) Arm compiler to convert your PyTorch model into a format optimized for the Arm architecture and the Ethos-U NPU. This process, known as delegation, offloads parts of the neural network graph that are compatible with the NPU, allowing for highly efficient inference.
+Use the Ahead-of-Time (AoT) Arm compiler to convert your PyTorch model to an ExecuTorch program optimized for Arm and the Ethos-U NPU. During compilation, delegation offloads the NPU-compatible parts of the neural network graph to the Ethos-U for efficient inference.
 
-Set up your environment variables by running the following commands in your terminal:
+Set up environment variables:
 
 ```bash
 export ET_HOME=$HOME/executorch
@@ -34,7 +36,7 @@ You should see:
 PTE file saved as rps_tiny_arm_delegate_ethos-u85-128.pte
 ```
 
-Next, you'll build the **Ethos-U runner**, which is a bare-metal executable that includes the ExecuTorch runtime and your compiled model. This runner is what the FVP will execute. Navigate to the runner's directory and use CMake to configure the build.
+Next, build the Ethos-U runner - a bare-metal executable that includes the ExecuTorch runtime and your compiled model. Configure the build with CMake:
 
 ```bash
 cd $HOME/executorch/examples/arm/executor_runner
@@ -52,7 +54,7 @@ cmake -DCMAKE_BUILD_TYPE=Release \
       -DSYSTEM_CONFIG=Ethos_U85_SYS_DRAM_Mid
 ```
 
-You should see output similar to this, indicating a successful configuration:
+You should see configuration output similar to:
 
 ```bash
 -- *******************************************************
@@ -67,13 +69,13 @@ You should see output similar to this, indicating a successful configuration:
 -- Build files have been written to: ~/executorch/examples/arm/executor_runner/cmake-out
 ```
 
-Now, build the executable with CMake:
+Build the executable:
 
 ```bash
 cmake --build "$ET_HOME/examples/arm/executor_runner/cmake-out" -j --target arm_executor_runner
 ```
 
-### Run the Model on the FVP
+## Run the model on the FVP
 With the `arm_executor_runner` executable ready, you can now run it on the Corstone-320 FVP to see the model run on a simulated Arm device.
 
 ```bash
@@ -88,11 +90,10 @@ FVP_Corstone_SSE-320 \
 ```
 
 {{% notice Note %}}
-The argument `mps4_board.visualisation.disable-visualisation=1` disables the FVP GUI. This can speed up launch time for the FVP.
+`mps4_board.visualisation.disable-visualisation=1` disables the FVP GUI and can reduce launch time.
 {{% /notice %}}
 
-
-Observe the output from the FVP. You'll see messages indicating that the model file has been loaded and the inference is running. This confirms that your ExecuTorch program is successfully executing on the simulated Arm hardware.
+You should see logs indicating that the model file loads and inference begins:
 
 ```output
 telnetterminal0: Listening for serial connection on port 5000
@@ -109,9 +110,7 @@ I [executorch:EthosUBackend.cpp:116 init()] data:0x70000070
 ```
 
 {{% notice Note %}}
-The inference itself may take a longer to run with a model this size - note that this is not a reflection of actual execution time.
+Inference might take longer with a model of this size on the FVP; this does not reflect real device performance.
 {{% /notice %}}
 
-You've now successfully built, optimized, and deployed a computer vision model on a simulated Arm-based system. This hands-on exercise demonstrates the power and practicality of TinyML and ExecuTorch for resource-constrained devices.
-
-In a future learning path, you can explore comparing different model performances and inference times before and after optimization. You could also analyze CPU and memory usage during inference, providing a deeper understanding of how the ExecuTorch framework optimizes your model for edge deployment.
+You have now built, optimized, and deployed a computer vision model on a simulated Arm-based system. In a future Learning Path, you can compare performance and latency before and after optimization and analyze CPU and memory usage during inference for deeper insight into ExecuTorch on edge devices.