Commit e476ff3

Merge pull request #1344 from madeline-underwood/MLOps_GH_runners
MLOps_GitHub_Runners_KB_to_review
2 parents c437337 + 0b49dbd commit e476ff3

6 files changed, 66 additions and 58 deletions

content/learning-paths/servers-and-cloud-computing/gh-runners/_index.md

Lines changed: 8 additions & 9 deletions
@@ -1,24 +1,23 @@
 ---
-title: MLOps with Arm-hosted GitHub Runners
-draft: true
+title: Optimize MLOps with Arm-hosted GitHub Runners
+
 cascade:
-draft: true

-minutes_to_complete: 30
+minutes_to_complete: 60

-who_is_this_for: This is an introductory topic for software developers interested in automation for machine learning (ML) tasks.
+who_is_this_for: This is an introductory topic for software developers interested in automation for Machine Learning (ML) tasks.

 learning_objectives:
 - Set up an Arm-hosted GitHub runner.
 - Train and test a PyTorch ML model with the German Traffic Sign Recognition Benchmark (GTSRB) dataset.
-- Use PyTorch compiled with OpenBLAS and oneDNN with Arm Compute Library to compare the performance of a trained model.
-- Containerize the model and push the container to DockerHub.
-- Automate all the steps in the ML workflow using GitHub Actions.
+- Compare the performance of two trained PyTorch ML models; one model compiled with OpenBLAS (Open Basic Linear Algebra Subprograms Library) and oneDNN (Deep Neural Network Library), and the other model compiled with Arm Compute Library (ACL).
+- Containerize a ML model and push the container to DockerHub.
+- Automate steps in an ML workflow using GitHub Actions.

 prerequisites:
 - A GitHub account with access to Arm-hosted GitHub runners.
 - A Docker Hub account for storing container images.
-- Some familiarity with ML and continuous integration and deployment (CI/CD) concepts.
+- Familiarity with the concepts of ML and continuous integration and deployment (CI/CD).

 author_primary: Pareena Verma, Annie Tallund

content/learning-paths/servers-and-cloud-computing/gh-runners/_review.md

Lines changed: 9 additions & 9 deletions
@@ -8,29 +8,29 @@ review:
 - "No"
 correct_answer: 1
 explanation: >
-Arm-hosted runners for use with GitHub Actions are available for Linux and Windows.
+You can use Arm-hosted runners with GitHub Actions, and they are available for both Linux and Windows.

 - questions:
 question: >
-What is the GTSRB dataset made up of?
+What does the GTSRB dataset consist of?
 answers:
-- Sound files of spoken German words
-- Sound files of animal sounds
-- Images of flower petals
-- Images of German traffic signs
+- Sound files of spoken German words.
+- Sound files of animal sounds.
+- Images of flower petals.
+- Images of German traffic signs.
 correct_answer: 4
 explanation: >
-GTSRB stands for German Traffic Signs Recognition Benchmark
+GTSRB stands for German Traffic Signs Recognition Benchmark, and the dataset consists of images of German traffic signs.

 - questions:
 question: >
-ACL is included in PyTorch.
+Is ACL included in PyTorch?
 answers:
 - "True"
 - "False"
 correct_answer: 1
 explanation: >
-While it is possible to use ACL stand-alone, the optimized kernels are built into PyTorch through the oneDNN backend.
+While it is possible to use Arm Compute Library independently, the optimized kernels are built into PyTorch through the oneDNN backend.



content/learning-paths/servers-and-cloud-computing/gh-runners/background.md

Lines changed: 26 additions & 16 deletions
@@ -10,33 +10,43 @@ layout: learningpathall

 In this Learning Path, you will learn how to automate an MLOps workflow using Arm-hosted GitHub runners and GitHub Actions.

-You will learn how to do the following tasks:
+You will perform the following tasks:
 - Train and test a neural network model with PyTorch.
 - Compare the model inference time using two different PyTorch backends.
 - Containerize the model and save it to DockerHub.
 - Deploy the container image and use API calls to access the model.

 ## GitHub Actions

-GitHub Actions is a platform that automates software development workflows, including continuous integration and continuous delivery. Every repository on GitHub has an `Actions` tab as shown below:
+GitHub Actions is a platform that automates software development workflows, which includes Continuous Integration and Continuous Delivery (CI/CD).
+
+Every repository on GitHub has an **Actions** tab as shown below:

 ![#actions-gui](images/actions-gui.png)

 GitHub Actions runs workflow files to automate processes. Workflows run when specific events occur in a GitHub repository.

 [YAML](https://yaml.org/) defines a workflow.

-Workflows specify how a job is triggered, the running environment, and the commands to run.
+Workflows specify:
+
+* How a job is triggered.
+* The running environment.
+* The commands to run.

-The machine running workflows is called a _runner_.
+The machine running the workflows is called a _runner_.

 ## Arm-hosted GitHub runners

-Hosted GitHub runners are provided by GitHub so you don't need to setup and manage cloud infrastructure. Arm-hosted GitHub runners use the Arm architecture so you can build and test software without cross-compiling or instruction emulation.
+Hosted GitHub runners are provided by GitHub, so you do not need to set up and manage cloud infrastructure. Arm-hosted GitHub runners use the Arm architecture so you can build and test software without the necessity for cross-compiling or instruction emulation.
+
+Arm-hosted GitHub runners enable you to:

-Arm-hosted GitHub runners enable you to optimize your workflows, reduce cost, and improve energy consumption.
+* Optimize your workflows.
+* Reduce cost.
+* Improve energy consumption.

-Additionally, the Arm-hosted runners are preloaded with essential tools, making it easier for you to develop and test your applications.
+Additionally, the Arm-hosted runners are preloaded with essential tools, which makes it easier for to develop and test your applications.

 Arm-hosted runners are available for Linux and Windows. This Learning Path uses Linux.

@@ -66,22 +76,22 @@ jobs:

 ## Machine Learning Operations (MLOps)

-Machine learning use-cases have a need for reliable workflows to maintain performance and quality.
+Machine learning use cases require reliable workflows to maintain both performance and quality of output.

-There are many tasks that can be automated in the ML lifecycle.
-- Model training and re-training
-- Model performance analysis
-- Data storage and processing
-- Model deployment
+There are tasks that can be automated in the ML lifecycle, such as:
+- Model training and retraining.
+- Model performance analysis.
+- Data storage and processing.
+- Model deployment.

-Developer Operations (DevOps) refers to good practices for collaboration and automation, including CI/CD. The domain-specific needs for ML, combined with DevOps knowledge, creates the new term MLOps.
+Developer Operations (DevOps) refers to good practices for collaboration and automation, including CI/CD. MLOps describes the area of practice where the ML application development intersects with ML system deployment and operations.

 ## German Traffic Sign Recognition Benchmark (GTSRB)

 This Learning Path explains how to train and test a PyTorch model to perform traffic sign recognition.

 You will learn how to use the GTSRB dataset to train the model. The dataset is free to use under the [Creative Commons](https://creativecommons.org/publicdomain/zero/1.0/) license. It contains thousands of images of traffic signs found in Germany. It has become a well-known resource to showcase ML applications.

-The GTSRB dataset is also good for comparing performance and accuracy of different models and to compare and contrast different PyTorch backends.
+The GTSRB dataset is also effective for comparing the performance and accuracy of both different models, and different PyTorch backends.

-Continue to the next section to learn how to setup an end-to-end MLOps workflow using Arm-hosted GitHub runners.
+Continue to the next section to learn how to set up an end-to-end MLOps workflow using Arm-hosted GitHub runners.

content/learning-paths/servers-and-cloud-computing/gh-runners/compare-performance.md

Lines changed: 3 additions & 3 deletions
@@ -14,7 +14,7 @@ In this section, you will change the PyTorch backend being used to test the trai

 In the previous section, you used the PyTorch 2.3.0 Docker Image compiled with OpenBLAS from DockerHub to run your testing workflow. PyTorch can be run with other backends. You will now modify the testing workflow to use PyTorch 2.3.0 Docker Image compiled with OneDNN and the Arm Compute Library.

-The [Arm Compute Library](https://github.com/ARM-software/ComputeLibrary) is a collection of low-level machine learning functions optimized for Arm's Cortex-A and Neoverse processors and Mali GPUs. Arm-hosted GitHub runners use Arm Neoverse CPUs, which make it possible to optimize your neural networks to take advantage of processor features. ACL implements kernels (also known as operators or layers), using specific instructions that run faster on AArch64.
+The [Arm Compute Library](https://github.com/ARM-software/ComputeLibrary) is a collection of low-level machine learning functions optimized for Arm's Cortex-A and Neoverse processors and Mali GPUs. Arm-hosted GitHub runners use Arm Neoverse CPUs, which make it possible to optimize your neural networks to take advantage of processor features. ACL implements kernels, which are also known as operators or layers, using specific instructions that run faster on AArch64.

 ACL is integrated into PyTorch through [oneDNN](https://github.com/oneapi-src/oneDNN), an open-source deep neural network library.

@@ -43,11 +43,11 @@ jobs:

 ### Run the test workflow

-Trigger the **Test Model** job again by clicking the `Run workflow` button on the `Actions` tab.
+Trigger the **Test Model** job again by clicking the **Run workflow** button on the **Actions** tab.

 The test workflow starts running.

-Navigate to the workflow run on the `Actions` tab, click into the job, and expand the **Run testing script** step.
+Navigate to the workflow run on the **Actions** tab, click into the job, and expand the **Run testing script** step.

 You see a change in the performance results with OneDNN and ACL kernels being used.
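
The hunk above refers to a change in performance results between the two PyTorch backends. As a rough illustration only, and not code from this commit or the example repository, a simple wall-clock harness is one way to compare two inference callables outside the workflow. The name `mean_latency_ms` and the two stand-in workloads are hypothetical; they do not use PyTorch, so real model calls would replace them.

```python
import time


def mean_latency_ms(fn, runs=20, warmup=3):
    """Return the mean wall-clock time of fn() in milliseconds."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    # Warm-up calls are excluded from the measurement.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / runs


# Hypothetical stand-ins for "model inference with backend A / backend B".
def baseline_inference():
    sum(i * i for i in range(20000))


def candidate_inference():
    sum(i * i for i in range(10000))


baseline_ms = mean_latency_ms(baseline_inference)
candidate_ms = mean_latency_ms(candidate_inference)
print(f"baseline:  {baseline_ms:.3f} ms per call")
print(f"candidate: {candidate_ms:.3f} ms per call")
```

Averaging over several runs after a warm-up reduces the effect of one-off startup costs, although the PyTorch profiler output used in the Learning Path gives a per-operator breakdown that this sketch does not.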

content/learning-paths/servers-and-cloud-computing/gh-runners/train-test.md

Lines changed: 12 additions & 13 deletions
@@ -6,30 +6,29 @@ weight: 3
 layout: learningpathall
 ---

-In this section, you will fork the example GitHub repository containing the project code and inspect the Python code for training and testing a neural network model.
-
 ## Fork the example repository

-Get started by forking the example repository.
+In this section, you will fork the example GitHub repository containing the project code.

-In a web browser, navigate to the repository at:
+Get started by forking the example repository. In a web browser, navigate to the repository at:

 ```bash
 https://github.com/Arm-Labs/gh_armrunner_mlops_gtsrb
 ```
-
-Fork the repository, using the `Fork` button:
+Fork the repository, using the **Fork** button:

 ![#fork](/images/fork.png)

 Create a fork within a GitHub Organization or Team where you have access to Arm-hosted GitHub runners.

 {{% notice Note %}}
-If a repository with the same name `gh_armrunner_mlops_gtsrb` already exists in your Organization or Team you modify the repository name to make it unique.
+If a repository with the same name `gh_armrunner_mlops_gtsrb` already exists in your Organization or Team, you can modify the repository name to make it unique.
 {{% /notice %}}

 ## Learn about model training and testing

+In this section, you will inspect the Python code for training and testing a neural network model.
+
 Explore the repository using a browser to get familiar with code and the workflow files.

 {{% notice Note %}}
@@ -42,13 +41,13 @@ The purpose is to provide an overview of the code used for training and testing

 In the `scripts` directory, there is a Python script called `train_model.py`. This script loads the GTSRB dataset, defines a neural network, and trains the model on the dataset.

-#### Data pre-processing
+#### Data preprocessing

 The first section loads the GTSRB dataset to prepare it for training. The GTSRB dataset is built into `torchvision`, which makes loading easier.

-The transformations used when loading data are part of the pre-processing step, which makes the data uniform and ready to run through the extensive math operations of the ML model.
+The transformations used when loading data are part of the preprocessing step, which makes the data uniform and ready to run through the extensive math operations of the ML model.

-In accordance with common machine learning practices, data is separated into training and testing data to avoid over-fitting the neural network.
+In accordance with common machine learning practices, data is separated into training and testing data to avoid overfitting the neural network.

 Here is the code to load the dataset:

@@ -67,9 +66,9 @@ train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

 The next step is to define a class for the model, listing the layers used.

-The model defines the forward-pass function used at training time to update the weights. Additionally, the loss function and optimizer for the model are defined.
+The model defines the forward pass function used at training time to update the weights. Additionally, the loss function and optimizer for the model are defined.

-Here is the code defining the model:
+Here is the code that defines the model:

 ```python
 class TrafficSignNet(nn.Module):
@@ -167,7 +166,7 @@ test_loader = DataLoader(test_set, batch_size=64, shuffle=False)

 The testing loop passes each batch of test data through the model and compares predictions to the actual labels to calculate accuracy.

-The accuracy is calculated as a percentage of correctly classified images. Both the accuracy and PyTorch profiler report is printed at the end of the script.
+The accuracy is calculated as a percentage of correctly classified images. Both the accuracy and PyTorch profiler reports are printed at the end of the script.

 Here is the testing loop with profiling:
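
The changed line in this hunk states that accuracy is the percentage of correctly classified images. A minimal sketch of that calculation in plain Python follows, assuming predictions and labels are available as lists of class indices. It is not the repository's testing loop (which the diff does not show), and `accuracy_percent` is a hypothetical name.

```python
def accuracy_percent(predictions, labels):
    """Return the percentage of predictions that match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    # Count the positions where the predicted class equals the true class.
    correct = sum(1 for predicted, actual in zip(predictions, labels)
                  if predicted == actual)
    return 100.0 * correct / len(labels)


# Hypothetical class indices for four test images: three of four match.
print(accuracy_percent([3, 14, 14, 7], [3, 14, 2, 7]))  # 75.0
```

In a batched loop, the same ratio is usually accumulated as a running count of correct predictions and a running total of images, then converted to a percentage once at the end.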

content/learning-paths/servers-and-cloud-computing/gh-runners/workflows.md

Lines changed: 8 additions & 8 deletions
@@ -135,23 +135,23 @@ The `test-model.yml` file needs to be edited to be able to use the saved model f

 Complete the steps below to modify the testing workflow file:

-1. Navigate to the `Actions` tab on your GitHub repository.
+1. Navigate to the **Actions** tab on your GitHub repository.

-2. Click on `Train Model` on the left side of the page.
+2. Click on **Train Model** on the left side of the page.

-3. Click on the completed `Train Model` workflow.
+3. Click on the completed **Train Model** workflow.

-4. Copy the The 11 digit ID number from the end of the URL in your browser address bar.
+4. Copy the 11-digit ID number from the end of the URL in your browser address bar.

 ![#run-id](/images/run-id.png)

-5. Navigate back to the `Code` tab and open the file `.github/workflows/test-model.yml`.
+5. Navigate back to the **Code** tab and open the file `.github/workflows/test-model.yml`.

 6. Click the Edit button, represented by a pencil on the top right of the file contents.

 7. Update the `run-id` parameter with the 11 digit ID number you copied.

-8. Save the file by clicking the `Commit changes` button.
+8. Save the file by clicking the **Commit changes** button.


 #### Run the workflow file
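
Step 4 in the hunk above asks the reader to copy the run ID from the end of the browser URL. As an illustration only (this helper is not part of the commit or the example repository), the same ID could be extracted from a run URL programmatically. The sketch assumes the usual `.../actions/runs/<id>` shape of a workflow run URL; the function name `run_id_from_url` and the sample organization and ID are hypothetical.

```python
import re


def run_id_from_url(url):
    """Extract the numeric workflow run ID from a GitHub Actions run URL."""
    match = re.search(r"/actions/runs/(\d+)", url)
    if match is None:
        raise ValueError(f"no workflow run ID found in: {url}")
    return match.group(1)


# Hypothetical URL; the organization name and ID are placeholders.
example_url = (
    "https://github.com/example-org/gh_armrunner_mlops_gtsrb"
    "/actions/runs/12345678901"
)
print(run_id_from_url(example_url))  # 12345678901
```

The ID is returned as a string so that it can be pasted directly into the `run-id` parameter of `test-model.yml` without any numeric formatting concerns.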
@@ -160,7 +160,7 @@ You are now ready to run the **Test Model** workflow.

 1. Navigate to the `Actions` tab and select the **Test Workflow** on the left side.

-2. Click the `Run workflow` button to run the workflow on the main branch.
+2. Click the **Run workflow** button to run the workflow on the main branch.

 ![#run-workflow](images/run-workflow.png)

@@ -170,7 +170,7 @@ Click on the workflow to view the output from each step.

 ![Actions_test](/images/actions_test.png)

-Click on the "Run testing script" step to see the accuracy of the model and a table of the results from the PyTorch profiler.
+Click on the **Run testing script** step to see the accuracy of the model and a table of the results from the PyTorch profiler.

 The output from is similar to:
