---
title: Modify test workflow and compare performance
weight: 5

### FIXED, DO NOT MODIFY
layout: learningpathall
---

Continuously monitoring the performance of your machine learning models in production is crucial to maintaining their effectiveness over time. Model performance can degrade for many reasons, ranging from data-related issues to model-specific and environmental factors.

In this section, you will change the PyTorch backend used to test the trained model. You will learn how to measure and continuously monitor inference performance within your workflow.

## oneDNN with the Arm Compute Library (ACL)

In the previous section, you used the PyTorch 2.3.0 Docker image compiled with OpenBLAS from DockerHub to run your testing workflow. PyTorch can run with other backends as well. You will now modify the testing workflow to use the PyTorch 2.3.0 Docker image compiled with oneDNN and the Arm Compute Library.

The [Arm Compute Library](https://github.com/ARM-software/ComputeLibrary) (ACL) is a collection of low-level machine learning functions optimized for Arm's Cortex-A and Neoverse processors and the Mali GPUs. The Arm-hosted GitHub runners use Arm Neoverse CPUs, which makes it possible to optimize your neural networks to take advantage of the features available on the runners. ACL implements kernels (which you may know as operators or layers) that use specific instructions that run faster on AArch64. ACL is integrated into PyTorch through the [oneDNN engine](https://github.com/oneapi-src/oneDNN).

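Before switching images, you can confirm that a given PyTorch build includes oneDNN support. The snippet below is a minimal sketch using PyTorch's standard introspection APIs; run it inside whichever container you want to check:

```python
import torch

# oneDNN is exposed in PyTorch under its former name, MKL-DNN.
print(torch.backends.mkldnn.is_available())  # True on builds with oneDNN support

# The full build configuration string; on an ACL-enabled build
# it also references the Compute Library.
print(torch.__config__.show())
```
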
## Modify the test workflow and compare results

Two different PyTorch Docker images for Arm Neoverse CPUs are available on [DockerHub](https://hub.docker.com/r/armswdev/pytorch-arm-neoverse). Up until this point, you used the `r24.07-torch-2.3.0-openblas` container image in your workflows. You will now update `test_model.yml` to use the `r24.07-torch-2.3.0-onednn-acl` container image instead.

Open `.github/workflows/test_model.yml` in your browser, update the `container.image` parameter to `armswdev/pytorch-arm-neoverse:r24.07-torch-2.3.0-onednn-acl`, and save the file:

```yaml
jobs:
  test-model:
    name: Test the Model
    runs-on: ubuntu-22.04-arm-os # Custom ARM64 runner
    container:
      image: armswdev/pytorch-arm-neoverse:r24.07-torch-2.3.0-onednn-acl
      options: --user root
    # Steps omitted
```
Trigger the Test Model job again by clicking the **Run workflow** button on the Actions tab.

Expand the **Run testing script** step in your Actions tab. You should see a change in the performance results now that the oneDNN and ACL kernels are being used.

```output
Accuracy of the model on the test images: 90.48%
---------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                             Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
---------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                  model_inference         4.63%     304.000us       100.00%       6.565ms       6.565ms             1
                     aten::conv2d         0.18%      12.000us        56.92%       3.737ms       1.869ms             2
                aten::convolution         0.30%      20.000us        56.74%       3.725ms       1.863ms             2
               aten::_convolution         0.43%      28.000us        56.44%       3.705ms       1.853ms             2
         aten::mkldnn_convolution        47.02%       3.087ms        55.48%       3.642ms       1.821ms             2
                 aten::max_pool2d         0.15%      10.000us        25.51%       1.675ms     837.500us             2
    aten::max_pool2d_with_indices        25.36%       1.665ms        25.36%       1.665ms     832.500us             2
                     aten::linear         0.18%      12.000us         9.26%     608.000us     304.000us             2
                      aten::clone         0.26%      17.000us         9.08%     596.000us     149.000us             4
                      aten::addmm         8.50%     558.000us         8.71%     572.000us     286.000us             2
---------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 6.565ms
```
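A profile table like the one above can be produced with PyTorch's profiler. The sketch below uses a stand-in model and random input in place of the trained model and test images from the workflow:

```python
import torch
import torch.nn as nn
from torch.profiler import profile, record_function, ProfilerActivity

# Stand-in model and input: substitute your trained model and test data.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3),   # 28x28 -> 26x26
    nn.ReLU(),
    nn.Flatten(),                     # 8 * 26 * 26 = 5408 features
    nn.Linear(8 * 26 * 26, 10),
)
model.eval()
x = torch.randn(1, 1, 28, 28)

with torch.no_grad():
    with profile(activities=[ProfilerActivity.CPU]) as prof:
        with record_function("model_inference"):
            model(x)

# Print a table like the workflow output, sorted by total CPU time.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```

On a build that routes convolutions through oneDNN, the `aten::mkldnn_convolution` row appears in this table, just as in the workflow output above.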
For the ACL results, observe that the **Self CPU time total** is lower than in the OpenBLAS run from the previous section. The layer names have changed as well: `aten::mkldnn_convolution` is the kernel optimized to run on AArch64, and it is the main reason the inference time improves when ACL kernels are used.

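To quantify the gain between the two backend runs, you can divide the two **Self CPU time total** values. The OpenBLAS figure below is a placeholder; substitute the totals from your own workflow runs:

```python
def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Relative speedup of the optimized run over the baseline."""
    return baseline_ms / optimized_ms

acl_total_ms = 6.565      # Self CPU time total from the ACL run above
openblas_total_ms = 10.0  # placeholder: use your OpenBLAS total from the previous section

print(f"ACL speedup over OpenBLAS: {speedup(openblas_total_ms, acl_total_ms):.2f}x")
```
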
In the next section, you will learn how to automate the deployment of your trained and tested model.