Commit c6d4dea

Merge branch 'ArmDeveloperEcosystem:main' into win_perf
2 parents 9ace018 + 0991301 commit c6d4dea

25 files changed

+1131
-69
lines changed
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
---
2+
title: MLOps with Arm-hosted GitHub Runners
3+
draft: true
4+
cascade:
5+
draft: true
6+
7+
minutes_to_complete: 30
8+
9+
who_is_this_for: This is an introductory topic for software developers interested in automation for machine learning (ML) tasks.
10+
11+
learning_objectives:
12+
- Set up an Arm-hosted GitHub runner.
13+
- Train and test a PyTorch ML model with the German Traffic Sign Recognition Benchmark (GTSRB) dataset.
14+
- Use PyTorch compiled with OpenBLAS and oneDNN with Arm Compute Library to compare the performance of a trained model.
15+
- Containerize the model and push the container to Docker Hub.
16+
- Automate all the steps in the ML workflow using GitHub Actions.
17+
18+
prerequisites:
19+
- A GitHub account with access to Arm-hosted GitHub runners.
20+
- A Docker Hub account for storing container images.
21+
- Some familiarity with ML and continuous integration and deployment (CI/CD) concepts.
22+
23+
author_primary: Pareena Verma, Annie Tallund
24+
25+
### Tags
26+
skilllevels: Introductory
27+
subjects: CI-CD
28+
armips:
29+
- Neoverse
30+
tools_software_languages:
31+
- Python
32+
- PyTorch
33+
- ACL
34+
- GitHub
35+
operatingsystems:
36+
- Linux
37+
38+
39+
### FIXED, DO NOT MODIFY
40+
# ================================================================================
41+
weight: 1 # _index.md always has weight of 1 to order correctly
42+
layout: "learningpathall" # All files under learning paths have this same wrapper
43+
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
44+
---
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
1+
---
2+
next_step_guidance: Thank you for completing the Learning Path on MLOps with Arm-hosted GitHub runners. You might be interested in learning how to build Arm images and multi-architecture images with these Arm-hosted runners.
3+
4+
recommended_path: /learning-paths/cross-platform/github-arm-runners
5+
6+
further_reading:
7+
- resource:
8+
title: Arm64 on GitHub Actions - Powering faster, more efficient build systems
9+
link: https://github.blog/news-insights/product-news/arm64-on-github-actions-powering-faster-more-efficient-build-systems/
10+
type: blog
11+
- resource:
12+
title: Arm Compute Library
13+
link: https://github.com/ARM-software/ComputeLibrary
14+
type: website
15+
- resource:
16+
title: Streamlining your MLOps pipeline with GitHub Actions and Arm64 runners
17+
link: https://github.blog/enterprise-software/ci-cd/streamlining-your-mlops-pipeline-with-github-actions-and-arm64-runners/
18+
type: blog
19+
20+
21+
# ================================================================================
22+
# FIXED, DO NOT MODIFY
23+
# ================================================================================
24+
weight: 21 # set to always be larger than the content in this path, and one more than 'review'
25+
title: "Next Steps" # Always the same
26+
layout: "learningpathall" # All files under learning paths have this same wrapper
27+
---
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
1+
---
2+
review:
3+
- questions:
4+
question: >
5+
Can Arm-hosted runners be used with GitHub Actions?
6+
answers:
7+
- "Yes"
8+
- "No"
9+
correct_answer: 1
10+
explanation: >
11+
Arm-hosted runners for use with GitHub Actions are available for Linux and Windows.
12+
13+
- questions:
14+
question: >
15+
What is the GTSRB dataset made up of?
16+
answers:
17+
- Sound files of spoken German words
18+
- Sound files of animal sounds
19+
- Images of flower petals
20+
- Images of German traffic signs
21+
correct_answer: 4
22+
explanation: >
23+
GTSRB stands for German Traffic Sign Recognition Benchmark.
24+
25+
- questions:
26+
question: >
27+
ACL is included in PyTorch.
28+
answers:
29+
- "True"
30+
- "False"
31+
correct_answer: 1
32+
explanation: >
33+
While it is possible to use ACL stand-alone, the optimized kernels are built into PyTorch through the oneDNN backend.
34+
35+
36+
37+
# ================================================================================
38+
# FIXED, DO NOT MODIFY
39+
# ================================================================================
40+
title: "Review" # Always the same title
41+
weight: 20 # Set to always be larger than the content in this path
42+
layout: "learningpathall" # All files under learning paths have this same wrapper
43+
---
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
1+
---
2+
title: MLOps background
3+
weight: 2
4+
5+
### FIXED, DO NOT MODIFY
6+
layout: learningpathall
7+
---
8+
9+
## Overview
10+
11+
In this Learning Path, you will learn how to automate an MLOps workflow using Arm-hosted GitHub runners and GitHub Actions.
12+
13+
You will learn how to do the following tasks:
14+
- Train and test a neural network model with PyTorch.
15+
- Compare the model inference time using two different PyTorch backends.
16+
- Containerize the model and save it to Docker Hub.
17+
- Deploy the container image and use API calls to access the model.
18+
19+
## GitHub Actions
20+
21+
GitHub Actions is a platform that automates software development workflows, including continuous integration and continuous delivery. Every repository on GitHub has an `Actions` tab as shown below:
22+
23+
![#actions-gui](images/actions-gui.png)
24+
25+
GitHub Actions runs workflow files to automate processes. Workflows run when specific events occur in a GitHub repository.
26+
27+
Workflows are defined in [YAML](https://yaml.org/) files.
28+
29+
Workflows specify how a job is triggered, the running environment, and the commands to run.
30+
31+
The machine running workflows is called a _runner_.
32+
33+
## Arm-hosted GitHub runners
34+
35+
Hosted GitHub runners are provided by GitHub, so you don't need to set up and manage cloud infrastructure. Arm-hosted GitHub runners use the Arm architecture, so you can build and test software without cross-compiling or instruction emulation.
36+
37+
Arm-hosted GitHub runners enable you to optimize your workflows, reduce cost, and lower energy consumption.
38+
39+
Additionally, the Arm-hosted runners are preloaded with essential tools, making it easier for you to develop and test your applications.
40+
41+
Arm-hosted runners are available for Linux and Windows. This Learning Path uses Linux.
42+
43+
{{% notice Note %}}
44+
You must have a Team or Enterprise Cloud plan to use Arm-hosted runners.
45+
{{% /notice %}}
46+
47+
Getting started with Arm-hosted GitHub runners is straightforward. Follow the steps in [Create a new Arm-hosted runner](/learning-paths/cross-platform/github-arm-runners/runner/#how-can-i-create-an-arm-hosted-runner) to create a runner in your organization.
48+
49+
Once you have created the runner, use the `runs-on` syntax in your GitHub Actions workflow file to execute the workflow on Arm.
50+
51+
Below is an example workflow that executes on an Arm-hosted runner named `ubuntu-22.04-arm-os`:
52+
53+
```yaml
name: Example workflow
on:
  workflow_dispatch:
jobs:
  example-job:
    name: Example Job
    runs-on: ubuntu-22.04-arm-os # Custom ARM64 runner
    steps:
      - name: Example step
        run: echo "This line runs on Arm!"
```
65+
66+
67+
## Machine Learning Operations (MLOps)
68+
69+
Machine learning use cases need reliable workflows to maintain performance and quality.
70+
71+
Many tasks in the ML lifecycle can be automated:
72+
- Model training and re-training
73+
- Model performance analysis
74+
- Data storage and processing
75+
- Model deployment
76+
77+
DevOps, a combination of development and operations, refers to good practices for collaboration and automation, including CI/CD. Applying DevOps practices to the domain-specific needs of ML gives rise to the term MLOps.
78+
79+
## German Traffic Sign Recognition Benchmark (GTSRB)
80+
81+
This Learning Path explains how to train and test a PyTorch model to perform traffic sign recognition.
82+
83+
You will learn how to use the GTSRB dataset to train the model. The dataset is free to use under the [Creative Commons](https://creativecommons.org/publicdomain/zero/1.0/) license. It contains thousands of images of traffic signs found in Germany and has become a well-known resource for showcasing ML applications.
84+
85+
The GTSRB dataset is also well suited to comparing the performance and accuracy of different models, and to comparing different PyTorch backends.
86+
87+
Continue to the next section to learn how to set up an end-to-end MLOps workflow using Arm-hosted GitHub runners.
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
1+
---
2+
title: Compare the performance of PyTorch backends
3+
weight: 5
4+
5+
### FIXED, DO NOT MODIFY
6+
layout: learningpathall
7+
---
8+
9+
Continuously monitoring the performance of your machine learning models in production is crucial to maintaining effectiveness over time. The performance of your ML model can change due to various factors ranging from data-related issues to environmental factors.
10+
11+
In this section, you will change the PyTorch backend being used to test the trained model. You will learn how to measure and continuously monitor the inference performance using your workflow.
12+
13+
## oneDNN with Arm Compute Library (ACL)
14+
15+
In the previous section, you used the PyTorch 2.3.0 Docker image compiled with OpenBLAS from Docker Hub to run your testing workflow. PyTorch can also run with other backends. You will now modify the testing workflow to use the PyTorch 2.3.0 Docker image compiled with oneDNN and the Arm Compute Library.
16+
17+
The [Arm Compute Library](https://github.com/ARM-software/ComputeLibrary) is a collection of low-level machine learning functions optimized for Arm's Cortex-A and Neoverse processors and Mali GPUs. Arm-hosted GitHub runners use Arm Neoverse CPUs, which make it possible to optimize your neural networks to take advantage of processor features. ACL implements kernels (also known as operators or layers), using specific instructions that run faster on AArch64.
18+
19+
ACL is integrated into PyTorch through [oneDNN](https://github.com/oneapi-src/oneDNN), an open-source deep neural network library.
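
To see which backend a given PyTorch build exposes, you can query it from Python. The following is a minimal sketch rather than part of the workflow: the helper name `describe_pytorch_build` is hypothetical, and the exact build settings printed depend on how the image was compiled.

```python
import importlib.util


def describe_pytorch_build() -> str:
    """Summarize the installed PyTorch build, or say that PyTorch is missing."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."

    import torch

    lines = [
        f"PyTorch version: {torch.__version__}",
        # PyTorch still exposes oneDNN under its former name, mkldnn.
        f"oneDNN (mkldnn) backend available: {torch.backends.mkldnn.is_available()}",
        # Build-time settings, useful for checking how the image was compiled.
        torch.__config__.show(),
    ]
    return "\n".join(lines)


print(describe_pytorch_build())
```

Running this inside each container image is one way to confirm which image a workflow is actually using.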
20+
21+
## Modify the test workflow and compare results
22+
23+
Two different PyTorch Docker images for Arm Neoverse CPUs are available on [Docker Hub](https://hub.docker.com/r/armswdev/pytorch-arm-neoverse).
24+
25+
Until now, you have used the `r24.07-torch-2.3.0-openblas` container image to run workflows. An image built with oneDNN and ACL is also available for use in workflows. The two images provide different PyTorch backends, which handle the execution of the PyTorch model.
26+
27+
### Change the Docker image to use oneDNN
28+
29+
In your browser, open and edit the file `.github/workflows/test_model.yml`.
30+
31+
Update the `container.image` parameter to `armswdev/pytorch-arm-neoverse:r24.07-torch-2.3.0-onednn-acl` and save the file by committing the change to the main branch:
32+
33+
```yaml
jobs:
  test-model:
    name: Test the Model
    runs-on: ubuntu-22.04-arm-os # Custom ARM64 runner
    container:
      image: armswdev/pytorch-arm-neoverse:r24.07-torch-2.3.0-onednn-acl
      options: --user root
    # Steps omitted
```
43+
44+
### Run the test workflow
45+
46+
Trigger the **Test Model** job again by clicking the `Run workflow` button on the `Actions` tab.
47+
48+
The test workflow starts running.
49+
50+
Navigate to the workflow run on the `Actions` tab, click into the job, and expand the **Run testing script** step.
51+
52+
The performance results change now that oneDNN and ACL kernels are being used.
53+
54+
The output is similar to:
55+
56+
```output
Accuracy of the model on the test images: 90.48%
---------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                             Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
---------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                  model_inference         4.63%     304.000us       100.00%       6.565ms       6.565ms             1
                     aten::conv2d         0.18%      12.000us        56.92%       3.737ms       1.869ms             2
                aten::convolution         0.30%      20.000us        56.74%       3.725ms       1.863ms             2
               aten::_convolution         0.43%      28.000us        56.44%       3.705ms       1.853ms             2
         aten::mkldnn_convolution        47.02%       3.087ms        55.48%       3.642ms       1.821ms             2
                 aten::max_pool2d         0.15%      10.000us        25.51%       1.675ms     837.500us             2
    aten::max_pool2d_with_indices        25.36%       1.665ms        25.36%       1.665ms     832.500us             2
                     aten::linear         0.18%      12.000us         9.26%     608.000us     304.000us             2
                      aten::clone         0.26%      17.000us         9.08%     596.000us     149.000us             4
                      aten::addmm         8.50%     558.000us         8.71%     572.000us     286.000us             2
---------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 6.565ms
```
74+
75+
For the ACL results, notice that the **Self CPU time total** is lower than in the OpenBLAS run from the previous section.
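
To compare two runs without reading the logs by eye, you could extract the total from each profiler output and compute the ratio. This is a minimal sketch: the helper `self_cpu_total_ms` is hypothetical, and the OpenBLAS figure below is a placeholder rather than a measured result, so substitute the value from your own run.

```python
import re


def self_cpu_total_ms(profiler_output: str) -> float:
    """Return the 'Self CPU time total' from profiler text, in milliseconds."""
    match = re.search(r"Self CPU time total:\s*([\d.]+)(us|ms|s)\b", profiler_output)
    if match is None:
        raise ValueError("no 'Self CPU time total' line found")
    value, unit = float(match.group(1)), match.group(2)
    # Convert every unit the profiler prints to milliseconds.
    return value * {"us": 1e-3, "ms": 1.0, "s": 1e3}[unit]


acl_ms = self_cpu_total_ms("Self CPU time total: 6.565ms")
# Placeholder only: replace with the total from your own OpenBLAS run.
openblas_ms = self_cpu_total_ms("Self CPU time total: 10.000ms")
print(f"oneDNN+ACL: {acl_ms:.3f} ms, speedup over OpenBLAS: {openblas_ms / acl_ms:.2f}x")
```

A single profiled inference is noisy, so comparing totals across several workflow runs gives a more reliable picture.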
76+
77+
The operator names have also changed: `aten::mkldnn_convolution` is the convolution operator provided by oneDNN, which uses ACL kernels optimized for the Arm architecture. This operator is the main reason for the improved inference time.
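
The output has the layout that `torch.profiler` produces. The testing script itself is not shown in this section, so the following is only a sketch of how such a table could be generated. The stand-in model, the 32x32 input size, and the helper name `profile_once` are assumptions for illustration; only the `model_inference` label is taken from the output.

```python
import importlib.util


def profile_once():
    """Profile one forward pass of a stand-in CNN; return None without PyTorch."""
    if importlib.util.find_spec("torch") is None:
        return None

    import torch
    from torch import nn
    from torch.profiler import ProfilerActivity, profile, record_function

    # Hypothetical stand-in model: two conv/pool stages and two linear layers.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 128), nn.ReLU(),
        nn.Linear(128, 43),  # GTSRB has 43 traffic sign classes
    ).eval()
    image = torch.randn(1, 3, 32, 32)  # assumed input size, random data

    with torch.no_grad(), profile(activities=[ProfilerActivity.CPU]) as prof:
        # The label passed to record_function becomes a row in the table.
        with record_function("model_inference"):
            model(image)

    return prof.key_averages().table(sort_by="cpu_time_total", row_limit=10)


table = profile_once()
print(table if table is not None else "PyTorch is not installed in this environment.")
```

Which convolution operator appears in the table depends on the backend the PyTorch build dispatches to, so running the same script in both container images is a quick way to see the difference.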
78+
79+
In the next section, you will learn how to automate the deployment of your model.
