Commit 91d69be

committed
second Learning Path: First commit
1 parent 8c4a6ef commit 91d69be

File tree

5 files changed: +263 -0 lines changed

5 files changed

+263
-0
lines changed
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
---
title: Deploying DistilBERT on Arm - Training and Inference with PyTorch and ExecuTorch

minutes_to_complete: 120

who_is_this_for: This topic is for machine learning engineers, embedded AI developers, and researchers interested in deploying TinyML models for NLP on Arm-based edge devices using PyTorch and ExecuTorch.

learning_objectives:
    - Fine-tune a DistilBERT model for sentiment analysis using PyTorch.
    - Optimize and convert the model using ExecuTorch for Arm-based edge devices.
    - Deploy and run inference on the Corstone-320 FVP and Raspberry Pi 5.

prerequisites:
    - Basic knowledge of machine learning concepts.
    - Completion of the [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) Learning Path is recommended before starting this one.
    - Familiarity with Python and PyTorch.
    - A Linux host machine or VM running Ubuntu 22.04 or higher.
    - (Optional) A Raspberry Pi 5 or an Arm license to run the Corstone-320 Fixed Virtual Platform (FVP), for hands-on deployment.

author: Dominica Abena O. Amanfo

### Tags
skilllevels: Intermediate
subjects: ML
armips:
    - Cortex-A
tools_software_languages:
    - tinyML
    - Transformers
    - PyTorch
    - ExecuTorch
    - Raspberry Pi
operatingsystems:
    - Linux
    - Raspberry Pi OS

further_reading:
    - resource:
        title: Run Llama 3 on a Raspberry Pi 5 using ExecuTorch
        link: /learning-paths/embedded-and-microcontrollers/rpi-llama3
        type: website
    - resource:
        title: ExecuTorch Examples
        link: https://github.com/pytorch/executorch/blob/main/examples/README.md
        type: website

### FIXED, DO NOT MODIFY
# ================================================================================
weight: 1                       # _index.md always has weight of 1 to order correctly
layout: "learningpathall"       # All files under learning paths have this same wrapper
learning_path_main_page: "yes"  # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
---
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
---
# ================================================================================
# FIXED, DO NOT MODIFY THIS FILE
# ================================================================================
weight: 21                  # Set to always be larger than the content in this path to be at the end of the navigation.
title: "Next Steps"         # Always the same, html page title.
layout: "learningpathall"   # All files under learning paths have this same wrapper for Hugo processing.
---
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
---
title: Environment Setup
weight: 2

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Overview

DistilBERT is a compact Transformer model distilled from BERT: it retains roughly 97% of BERT's language-understanding performance while being about 40% smaller and 60% faster, which makes it well suited to resource-constrained edge devices.

In this course, you will learn how to train and run inference using DistilBERT. You'll deploy the model on the Arm Corstone-320 FVP and optionally on a Raspberry Pi 5 for sentiment analysis.

## Environment Setup

Set up your development environment for TinyML by following the first three chapters of the [Introduction to TinyML on Arm using PyTorch and ExecuTorch](/learning-paths/embedded-and-microcontrollers/introduction-to-tinyml-on-arm) Learning Path (LP).

If you just followed the LP above, you should already have your virtual environment activated. If not, activate it using:

```console
source $HOME/executorch-venv/bin/activate
```

Your terminal prompt now shows `(executorch)` as a prefix, indicating that the virtual environment is active.

Run the command below to install the dependencies.

```bash
pip install transformers datasets torch
```
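Optionally, you can run a quick smoke test to confirm the packages installed correctly. This is a minimal sketch: it pulls the stock `distilbert-base-uncased-finetuned-sst-2-english` checkpoint from the Hugging Face Hub, which is separate from the model you will fine-tune in the next step.

```python
# smoke_test.py: confirm transformers and torch are working
from transformers import pipeline

# Downloads a small, already fine-tuned DistilBERT sentiment checkpoint on first run
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Prints a list with one dict containing a label (POSITIVE/NEGATIVE) and a score
print(classifier("Arm-based edge devices make TinyML deployment practical."))
```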
You are now ready to fine-tune the model.
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
---
title: Fine-Tune DistilBERT
weight: 3

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Fine-Tune the Model

Using a file editor of your choice, create a file named `distilbert-sentiment-analysis.py` with the code shown below:

```python
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load dataset and tokenizer
dataset = load_dataset("imdb")
tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], padding=True, truncation=True)

# Tokenize data
dataset = dataset.map(tokenize, batched=True)

# Load pretrained model with a binary classification head (negative/positive)
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # renamed to eval_strategy in newer transformers releases
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
)

# Trainer on a 2000-sample training subset and a 500-sample evaluation subset
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"].shuffle().select(range(2000)),
    eval_dataset=dataset["test"].shuffle().select(range(500))
)

trainer.train()

# Save model
model.save_pretrained("distilbert_sentiment")
```

This script downloads the IMDB movie-review dataset, tokenizes it with the DistilBERT tokenizer, and fine-tunes the pre-trained `distilbert-base-uncased` model for binary sentiment classification. To keep the run time manageable, it trains on a 2,000-review subset for two epochs, evaluates on 500 held-out reviews after each epoch, and saves the fine-tuned weights to the `distilbert_sentiment` directory.

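To see what the tokenizer produces, you can optionally inspect a single example. This is a small sketch of the same transformation that the `map()` call above applies in batches; the sample sentence is illustrative only.

```python
from transformers import DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")

# Tokenize one review the way tokenize() does for whole batches
sample = tokenizer("This movie was surprisingly good!", truncation=True)
print(sample["input_ids"])       # vocabulary indices, bracketed by [CLS] and [SEP]
print(sample["attention_mask"])  # 1 for real tokens, 0 for padding
```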
Run the script using:

```bash
python distilbert-sentiment-analysis.py
```

The Trainer prints a progress bar as it runs, followed by the training loss and evaluation metrics after each epoch; exact values vary from run to run. Fine-tuning can take some time on a CPU-only host, even with the reduced dataset.
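Before converting the model, you can optionally check that the saved weights load and predict sensibly. This is a minimal sketch: it reloads the `distilbert_sentiment` directory written by the script above, reuses the base tokenizer (the script does not save one), and assumes the IMDB label mapping of 0 = negative, 1 = positive.

```python
# check_model.py: quick inference with the fine-tuned model
import torch
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained("distilbert_sentiment")
model.eval()

inputs = tokenizer("A genuinely moving film with superb acting.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# IMDB labels: index 0 = negative, index 1 = positive
prediction = logits.argmax(dim=-1).item()
print("positive" if prediction == 1 else "negative")
```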
You are now ready to optimize and convert the model using ExecuTorch.

## Compile and build the executable

Start by setting some environment variables that are used by ExecuTorch.

```bash
export ET_HOME=$HOME/executorch
export executorch_DIR=$ET_HOME/build
```

Then, generate a `.pte` file using the Arm examples. The Ahead-of-Time (AoT) Arm compiler enables optimizations for devices such as the Raspberry Pi and the Corstone-320 FVP. Run it from the ExecuTorch root directory.

Navigate to the root directory using:

```bash
cd $ET_HOME
```

You are now in `$HOME/executorch` and ready to create the model file for ExecuTorch.

```bash
python -m examples.arm.aot_arm_compiler --model_name=examples/arm/distilbert-sentiment-analysis.py \
    --delegate --quantize --target=ethos-u85-256 \
    --so_library=cmake-out-aot-lib/kernels/quantized/libquantized_ops_aot_lib.so \
    --system_config=Ethos_U85_SYS_DRAM_Mid --memory_mode=Sram_Only
```

From the Arm examples directory, you build an embedded Arm runner with the `.pte` included. This gets the most performance out of your model and ensures compatibility with the CPU kernels on the FVP. Finally, generate the executable `arm_executor_runner`.

```bash
cd $HOME/executorch/examples/arm/executor_runner

cmake -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_TOOLCHAIN_FILE=$ET_HOME/examples/arm/ethos-u-setup/arm-none-eabi-gcc.cmake \
    -DTARGET_CPU=cortex-m85 \
    -DET_DIR_PATH:PATH=$ET_HOME/ \
    -DET_BUILD_DIR_PATH:PATH=$ET_HOME/cmake-out \
    -DET_PTE_FILE_PATH:PATH=$ET_HOME/distilbert-sentiment-analysis_arm_delegate_ethos-u85-256.pte \
    -DETHOS_SDK_PATH:PATH=$ET_HOME/examples/arm/ethos-u-scratch/ethos-u \
    -DETHOSU_TARGET_NPU_CONFIG=ethos-u85-256 \
    -DPYTHON_EXECUTABLE=$HOME/executorch-venv/bin/python3 \
    -DSYSTEM_CONFIG=Ethos_U85_SYS_DRAM_Mid \
    -B $ET_HOME/examples/arm/executor_runner/cmake-out

cmake --build $ET_HOME/examples/arm/executor_runner/cmake-out --parallel -- arm_executor_runner
```
Now, you can run the model on the Corstone-320 FVP.
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
---
title: Deploy the model on Corstone-320 FVP
weight: 4

### FIXED, DO NOT MODIFY
layout: learningpathall
---

Now run the model on the Corstone-320 with the following command:

```bash
FVP_Corstone_SSE-320 \
    -C mps4_board.subsystem.ethosu.num_macs=256 \
    -C mps4_board.visualisation.disable-visualisation=1 \
    -C vis_hdlcd.disable_visualisation=1 \
    -C mps4_board.telnetterminal0.start_telnet=0 \
    -C mps4_board.uart0.out_file='-' \
    -C mps4_board.uart0.shutdown_on_eot=1 \
    -a "$ET_HOME/examples/arm/executor_runner/cmake-out/arm_executor_runner"
```

{{% notice Note %}}
The argument `mps4_board.visualisation.disable-visualisation=1` disables the FVP GUI. This can speed up launch time for the FVP.
{{% /notice %}}

Observe that the FVP loads the model file.

32+
```output
33+
telnetterminal0: Listening for serial connection on port 5000
34+
telnetterminal1: Listening for serial connection on port 5001
35+
telnetterminal2: Listening for serial connection on port 5002
36+
telnetterminal5: Listening for serial connection on port 5003
37+
I [executorch:arm_executor_runner.cpp:412] Model in 0x70000000 $
38+
I [executorch:arm_executor_runner.cpp:414] Model PTE file loaded. Size: 3360 bytes.
39+
```

You've now fine-tuned a DistilBERT model for sentiment analysis, converted it to a `.pte` file with the ExecuTorch AoT Arm compiler, and run inference on the Corstone-320 FVP.