Commit 1cdc34f

Merge pull request #2367 from madeline-underwood/training_inference
Training inference_JA to sign off
2 parents 6b84c89 + 10b5697 commit 1cdc34f

4 files changed

+68
-73
lines changed

Lines changed: 22 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -1,32 +1,30 @@
11
---
2-
title: Edge AI with PyTorch & ExecuTorch - Tiny Rock-Paper-Scissors on Arm
2+
title: "Edge AI on Arm: PyTorch and ExecuTorch rock-paper-scissors"
33

44
minutes_to_complete: 60
55

6-
who_is_this_for: This learning path is for machine learning developers interested in deploying TinyML models on Arm-based edge devices. You will learn how to train and deploy a machine learning model for the classic game "Rock-Paper-Scissors" on edge devices. You'll use PyTorch and ExecuTorch, frameworks designed for efficient on-device inference, to build and run a small-scale computer vision model.
7-
6+
who_is_this_for: This is an introductory topic for machine learning developers who want to deploy TinyML models on Arm-based edge devices using PyTorch and ExecuTorch.
87

98
learning_objectives:
10-
- Train a small Convolutional Neural Network (CNN) for image classification using PyTorch.
11-
- Understand how to use synthetic data generation for training a model when real-world data is limited.
12-
- Optimize and convert a PyTorch model into an ExecuTorch program (.pte) for Arm-based devices.
13-
- Run the trained model on a local machine to play an interactive mini-game, demonstrating model inference.
14-
9+
- Train a small Convolutional Neural Network (CNN) for image classification using PyTorch
10+
- Use synthetic data generation for training a model when real data is limited
11+
- Convert and optimize a PyTorch model to an ExecuTorch program (`.pte`) for Arm-based devices
12+
- Run the trained model locally as an interactive mini-game to demonstrate inference
1513

1614
prerequisites:
17-
- A basic understanding of machine learning concepts.
18-
- Familiarity with Python and the PyTorch library.
19-
- Having completed [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm).
20-
- An x86 Linux host machine or VM running Ubuntu 22.04 or higher.
15+
- Basic understanding of machine learning concepts
16+
- Familiarity with Python and the PyTorch library
17+
- Completion of the Learning Path [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm/)
18+
- An x86 Linux host machine or VM running Ubuntu 22.04 or later
2119

2220
author: Dominica Abena O. Amanfo
2321

2422
### Tags
2523
skilllevels: Introductory
2624
subjects: ML
2725
armips:
28-
- Cortex-M
29-
- Ethos-U
26+
- Cortex-M
27+
- Ethos-U
3028
tools_software_languages:
3129
- tinyML
3230
- Computer Vision
@@ -36,23 +34,21 @@ tools_software_languages:
3634
- ExecuTorch
3735

3836
operatingsystems:
39-
- Linux
37+
- Linux
4038

4139
further_reading:
42-
- resource:
43-
title: Run Llama 3 on a Raspberry Pi 5 using ExecuTorch
44-
link: /learning-paths/embedded-and-microcontrollers/rpi-llama3
45-
type: website
46-
- resource:
47-
title: ExecuTorch Examples
48-
link: https://github.com/pytorch/executorch/blob/main/examples/README.md
49-
type: website
50-
51-
40+
- resource:
41+
title: Run Llama 3 on a Raspberry Pi 5 using ExecuTorch
42+
link: /learning-paths/embedded-and-microcontrollers/rpi-llama3
43+
type: website
44+
- resource:
45+
title: ExecuTorch examples
46+
link: https://github.com/pytorch/executorch/blob/main/examples/README.md
47+
type: website
5248

5349
### FIXED, DO NOT MODIFY
5450
# ================================================================================
5551
weight: 1 # _index.md always has weight of 1 to order correctly
5652
layout: "learningpathall" # All files under learning paths have this same wrapper
5753
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
58-
---
54+
---
Lines changed: 18 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,46 +1,45 @@
11
---
2-
title: Environment Setup
2+
title: Set up your environment
33
weight: 2
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
## Overview
10-
This learning path (LP) is a direct follow-up to the [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) learning path. While the previous one introduced you to the core concepts and the toolchain, this one puts that knowledge into practice with a fun, real-world example. You will move from the simple [Feedforward Neural Network](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm/4-build-model) in the previous LP, to a more practical computer vision task: A tiny Rock-Paper-Scissors game, to demonstrate how these tools can be used to solve a tangible problem and run efficiently on Arm-based edge devices.
9+
## Set up your environment for Tiny rock-paper-scissors on Arm
10+
11+
This Learning Path is a direct follow-up to [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm). While the previous Learning Path introduced the core concepts and toolchain, this one puts that knowledge into practice with a small, real-world example. You move from a simple [Feedforward Neural Network](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm/4-build-model) to a practical computer vision task: a tiny rock-paper-scissors game that runs efficiently on Arm-based edge devices.
1112

1213
You will train a lightweight CNN to classify images of the letters R, P, and S as "rock," "paper," or "scissors." The script uses a synthetic data renderer to create a large dataset of these images with various transformations and noise, eliminating the need for a massive real-world dataset.
1314

1415
### What is a Convolutional Neural Network (CNN)?
15-
A Convolutional Neural Network (CNN) is a type of deep neural network primarily used for analyzing visual imagery. Unlike traditional neural networks, CNNs are designed to process pixel data by using a mathematical operation called **convolution**. This allows them to automatically and adaptively learn spatial hierarchies of features from input images, from low-level features like edges and textures to high-level features like shapes and objects.
16-
17-
![Image of a convolutional neural network architecture](image.png)
18-
[Image credits](https://medium.com/@atul_86537/learning-ml-from-first-principles-c-linux-the-rick-and-morty-way-convolutional-neural-c76c3df511f4).
16+
A Convolutional Neural Network (CNN) is a type of deep neural network primarily used for analyzing visual imagery. Unlike traditional neural networks, CNNs are designed to process pixel data by using a mathematical operation called convolution. This allows them to automatically and adaptively learn spatial hierarchies of features from input images, from low-level features like edges and textures to high-level features like shapes and objects.
1917

20-
CNNs are the backbone of many modern computer vision applications, including:
18+
A convolutional neural network (CNN) is a deep neural network designed to analyze visual data using the *convolution* operation. CNNs learn spatial hierarchies of features - from edges and textures to shapes and objects - directly from pixels.
2119

22-
- **Image Classification:** Identifying the main object in an image, like classifying a photo as a "cat" or "dog".
23-
- **Object Detection:** Locating specific objects within an image and drawing a box around them.
24-
- **Facial Recognition:** Identifying and verifying individuals based on their faces.
20+
Common CNN applications include:
2521

26-
For the Rock-Paper-Scissors game, you'll use a tiny CNN to classify images of the letters R, P, and S as the corresponding hand gestures.
22+
- Image classification: identify the main object in an image, such as classifying a photo as a cat or dog
23+
- Object detection: locate specific objects in an image and draw bounding boxes
24+
- Facial recognition: identify or verify individuals based on facial features
2725

26+
For the rock-paper-scissors game, you use a tiny CNN to classify the letters R, P, and S as the corresponding hand gestures.
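
To make the convolution operation described above concrete, here is a minimal pure-Python sketch. It is not taken from `rps_tiny.py`; the `conv2d` helper, the toy 4x4 image, and the edge kernel are all illustrative assumptions. It slides a small kernel over an image and sums the element-wise products, which is what a convolutional layer computes per filter (without padding, stride, or bias).

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D sliding-window product sum.

    Hypothetical helper for illustration only; real CNN layers in PyTorch
    also handle channels, bias, padding, and stride.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Toy 4x4 "image": dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]

# Kernel that responds where brightness increases from left to right.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the dark-to-bright edge sits, which is the kind of low-level feature the early layers of a CNN learn to detect.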
2827

28+
## Environment setup
2929

30-
## Environment Setup
31-
To get started, follow the first three chapters of the [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) Learning Path. This will set up your development environment and install the necessary tools. Return to this LP once you've run the `./examples/arm/run.sh` script in the ExecuTorch repository.
30+
To get started, complete the first three sections of [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm). This setup prepares your development environment and installs the required tools. Return here after running the `./examples/arm/run.sh` script in the ExecuTorch repository.
3231

33-
If you just followed the LP above, you should already have your virtual environment activated. If not, activate it using:
32+
If you just completed the earlier Learning Path, your virtual environment should still be active. If not, activate it:
3433

3534
```console
3635
source $HOME/executorch-venv/bin/activate
3736
```
3837
The prompt of your terminal now has `(executorch-venv)` as a prefix to indicate the virtual environment is active.
3938

40-
Run the commands below to install the dependencies.
39+
Install Python dependencies:
4140

42-
```bash
43-
pip install argparse numpy pillow torch
41+
```console
42+
pip install numpy pillow torch
4443
```
45-
You are now ready to create the model.
4644

45+
You’re now ready to create the model.

content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/fine-tune-2.md

Lines changed: 15 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,20 +1,20 @@
11
---
2-
title: Train and Test the Rock-Paper-Scissors Model
2+
title: Train and test the rock-paper-scissors model
33
weight: 3
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
## Build the Model
9+
## Build the model
1010

11-
Navigate to the Arm examples directory in the ExecuTorch repository.
11+
Navigate to the Arm examples directory in the ExecuTorch repository:
1212

1313
```bash
1414
cd $HOME/executorch/examples/arm
1515
```
1616

17-
Using a file editor of your choice, create a file named `rps_tiny.py`, copy and paste the code shown below:
17+
Create a file named `rps_tiny.py` and paste the following code:
1818

1919
```python
2020
#!/usr/bin/env python3
@@ -369,24 +369,24 @@ if __name__ == "__main__":
369369
```
370370

371371

372-
### About the Script
372+
### About the script
373373
The script handles the entire workflow: data generation, model training, and a simple command-line game.
374374

375-
- **Synthetic Data Generation:** The script includes a function `render_rps()` that generates 28x28 grayscale images of the letters 'R', 'P', and 'S' with random rotations, blurs, and noise. This creates a diverse dataset that's used to train the model.
376-
- **Model Architecture:** The model, a TinyRPS class, is a simple Convolutional Neural Network (CNN). It uses a series of 2D convolutional layers, followed by pooling layers to reduce spatial dimensions, and finally, fully connected linear layers to produce a final prediction. This architecture is efficient and well-suited for edge devices.
377-
- **Training:** The script generates synthetic training and validation datasets. It then trains the CNN model using the **Adam optimizer** and **Cross-Entropy Loss**. It tracks validation accuracy and saves the best-performing model to `rps_best.pt`.
378-
- **ExecuTorch Export:** A key part of the script is the `export_to_pte()` function. This function uses the `torch.export module` (or a fallback) to trace the trained PyTorch model and convert it into an ExecuTorch program (`.pte`). This compiled program is highly optimized for deployment on any target hardware, for example Cortex-M or Cortex-A CPUs for embedded devices.
379-
- **CLI Mini-Game**: After training, you can play an interactive game. The script generates an image of your move and a random opponent's move. It then uses the trained model to classify both images and determines the winner based on the model's predictions.
375+
- Synthetic Data Generation: the script includes a function `render_rps()` that generates 28x28 grayscale images of the letters 'R', 'P', and 'S' with random rotations, blurs, and noise. This creates a diverse dataset that's used to train the model.
376+
- Model Architecture: the model, a TinyRPS class, is a simple Convolutional Neural Network (CNN). It uses a series of 2D convolutional layers, followed by pooling layers to reduce spatial dimensions, and finally, fully connected linear layers to produce a final prediction. This architecture is efficient and well-suited for edge devices.
377+
- Training: the script generates synthetic training and validation datasets. It then trains the CNN model using the **Adam optimizer** and **Cross-Entropy Loss**. It tracks validation accuracy and saves the best-performing model to `rps_best.pt`.
378+
- ExecuTorch Export: a key part of the script is the `export_to_pte()` function. This function uses the `torch.export` module (or a fallback) to trace the trained PyTorch model and convert it into an ExecuTorch program (`.pte`). You can then deploy this compiled program on target hardware such as Cortex-M or Cortex-A CPUs in embedded devices.
379+
- CLI Mini-Game: after training, you can play an interactive game. The script generates an image of your move and a random opponent's move. It then uses the trained model to classify both images and determines the winner based on the model's predictions.
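
The last step of the mini-game, turning two model outputs into a result, can be sketched as follows. This is a simplified illustration rather than the code in `rps_tiny.py`: the names `MOVES`, `BEATS`, `argmax_label`, and `decide_winner` are hypothetical, and the class order (rock, paper, scissors) is an assumption.

```python
# Assumed class order for the model's output scores (not confirmed by the script).
MOVES = ("rock", "paper", "scissors")

# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def argmax_label(scores):
    """Map a list of per-class scores to the highest-scoring move name."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return MOVES[best]


def decide_winner(player_pred, opponent_pred):
    """Return 'player', 'opponent', or 'draw' from two predicted moves."""
    if player_pred == opponent_pred:
        return "draw"
    return "player" if BEATS[player_pred] == opponent_pred else "opponent"


# Example scores standing in for the CNN's outputs on the two rendered images.
player_move = argmax_label([0.1, 2.3, -0.5])    # highest score at index 1
opponent_move = argmax_label([1.9, 0.2, 0.4])   # highest score at index 0
print(player_move, "vs", opponent_move, "->", decide_winner(player_move, opponent_move))
```

Because the winner is decided from the predictions rather than from the moves you typed, a misclassified image can change the outcome of a round.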
380380

381-
### Running the Script:
381+
## Run the script
382382

383-
To train the model, export it, and play the game, run the following command:
383+
Train the model, export it, and play the game:
384384

385385
```bash
386386
python rps_tiny.py --epochs 8 --export --play
387387
```
388388

389-
You'll see the training progress, where the model's accuracy rapidly improves on the synthetic data.
389+
You’ll see training progress similar to:
390390

391391
```output
392392
== Building synthetic datasets ==
@@ -402,7 +402,8 @@ Training done.
402402
Loaded weights from rps_best.pt
403403
[export] wrote rps_tiny.pte
404404
```
405-
After training and export, the game will start. Type rock, paper, or scissors and see the model's predictions and what your opponent played.
405+
406+
After training and export, the game starts. Type rock, paper, or scissors, and review the model’s predictions for you and a random opponent:
406407

407408
```output
408409
=== Rock–Paper–Scissors: Play vs Tiny CNN ===

content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/fvp-3.md

Lines changed: 13 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -6,13 +6,15 @@ weight: 4
66
layout: learningpathall
77
---
88

9-
This section guides you through the process of compiling your trained Rock-Paper-Scissors model and running it on a simulated Arm-based edge device, the Corstone-320 Fixed Virtual Platform (FVP). This final step demonstrates the end-to-end workflow of deploying a TinyML model for on-device inference.
9+
## Compile and run the rock-paper-scissors model on Corstone-320 FVP
10+
11+
This section shows how to compile your trained rock-paper-scissors model and run it on the Corstone-320 Fixed Virtual Platform (FVP), a simulated Arm-based edge device. This completes the end-to-end workflow for deploying a TinyML model for on-device inference.
1012

1113
## Compile and build the executable
1214

13-
First, you'll use the Ahead-of-Time (AOT) Arm compiler to convert your PyTorch model into a format optimized for the Arm architecture and the Ethos-U NPU. This process, known as delegation, offloads parts of the neural network graph that are compatible with the NPU, allowing for highly efficient inference.
15+
Use the Ahead-of-Time (AoT) Arm compiler to convert your PyTorch model to an ExecuTorch program optimized for Arm and the Ethos-U NPU. This process (delegation) offloads supported parts of the neural network to the NPU for efficient inference.
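
As a rough mental model of delegation, the sketch below splits a linear list of operators into runs that an NPU could take and runs that stay on the CPU. It is a conceptual illustration only and does not use or reproduce the ExecuTorch partitioner: the `partition` function, the operator names, and the `SUPPORTED_ON_NPU` set are all hypothetical.

```python
# Hypothetical set of operators the accelerator can run (illustration only).
SUPPORTED_ON_NPU = {"conv2d", "relu", "max_pool2d", "linear"}


def partition(ops, supported):
    """Group consecutive ops into ('npu' | 'cpu', [ops]) segments."""
    segments = []
    for op in ops:
        target = "npu" if op in supported else "cpu"
        if segments and segments[-1][0] == target:
            # Same target as the previous op: extend the current segment.
            segments[-1][1].append(op)
        else:
            # Target changed: start a new segment.
            segments.append((target, [op]))
    return segments


# A made-up operator sequence with one op the NPU cannot handle.
ops = ["conv2d", "relu", "max_pool2d", "custom_op", "linear"]
segments = partition(ops, SUPPORTED_ON_NPU)
for target, group in segments:
    print(target, group)
```

Each switch between targets adds hand-off overhead, so a model whose operators are all supported can run as a single delegated segment.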
1416

15-
Set up your environment variables by running the following commands in your terminal:
17+
Set up environment variables:
1618

1719
```bash
1820
export ET_HOME=$HOME/executorch
@@ -34,7 +36,7 @@ You should see:
3436
PTE file saved as rps_tiny_arm_delegate_ethos-u85-128.pte
3537
```
3638

37-
Next, you'll build the **Ethos-U runner**, which is a bare-metal executable that includes the ExecuTorch runtime and your compiled model. This runner is what the FVP will execute. Navigate to the runner's directory and use CMake to configure the build.
39+
Next, build the Ethos-U runner - a bare-metal executable that includes the ExecuTorch runtime and your compiled model. Configure the build with CMake:
3840

3941
```bash
4042
cd $HOME/executorch/examples/arm/executor_runner
@@ -52,7 +54,7 @@ cmake -DCMAKE_BUILD_TYPE=Release \
5254
-DSYSTEM_CONFIG=Ethos_U85_SYS_DRAM_Mid
5355
```
5456

55-
You should see output similar to this, indicating a successful configuration:
57+
You should see configuration output similar to:
5658

5759
```bash
5860
-- *******************************************************
@@ -67,13 +69,13 @@ You should see output similar to this, indicating a successful configuration:
6769
-- Build files have been written to: ~/executorch/examples/arm/executor_runner/cmake-out
6870
```
6971

70-
Now, build the executable with CMake:
72+
Build the executable:
7173

7274
```bash
7375
cmake --build "$ET_HOME/examples/arm/executor_runner/cmake-out" -j --target arm_executor_runner
7476
```
7577

76-
### Run the Model on the FVP
78+
## Run the model on the FVP
7779
With the `arm_executor_runner` executable ready, you can now run it on the Corstone-320 FVP to see the model run on a simulated Arm device.
7880

7981
```bash
@@ -88,11 +90,10 @@ FVP_Corstone_SSE-320 \
8890
```
8991

9092
{{% notice Note %}}
91-
The argument `mps4_board.visualisation.disable-visualisation=1` disables the FVP GUI. This can speed up launch time for the FVP.
93+
`mps4_board.visualisation.disable-visualisation=1` disables the FVP GUI and can reduce launch time.
9294
{{% /notice %}}
9395

94-
95-
Observe the output from the FVP. You'll see messages indicating that the model file has been loaded and the inference is running. This confirms that your ExecuTorch program is successfully executing on the simulated Arm hardware.
96+
You should see logs indicating that the model file loads and inference begins:
9697

9798
```output
9899
telnetterminal0: Listening for serial connection on port 5000
@@ -109,9 +110,7 @@ I [executorch:EthosUBackend.cpp:116 init()] data:0x70000070
109110
```
110111

111112
{{% notice Note %}}
112-
The inference itself may take a longer to run with a model this size - note that this is not a reflection of actual execution time.
113+
Inference might take longer with a model of this size on the FVP; this does not reflect real device performance.
113114
{{% /notice %}}
114115

115-
You've now successfully built, optimized, and deployed a computer vision model on a simulated Arm-based system. This hands-on exercise demonstrates the power and practicality of TinyML and ExecuTorch for resource-constrained devices.
116-
117-
In a future learning path, you can explore comparing different model performances and inference times before and after optimization. You could also analyze CPU and memory usage during inference, providing a deeper understanding of how the ExecuTorch framework optimizes your model for edge deployment.
116+
You have now built, optimized, and deployed a computer vision model on a simulated Arm-based system. In a future Learning Path, you can compare performance and latency before and after optimization and analyze CPU and memory usage during inference for deeper insight into ExecuTorch on edge devices.
