diff --git a/assets/contributors.csv b/assets/contributors.csv index 33d3bfb507..d599b57494 100644 --- a/assets/contributors.csv +++ b/assets/contributors.csv @@ -42,3 +42,6 @@ Cyril Rohr,RunsOn,crohr,cyrilrohr,, Rin Dobrescu,Arm,,,, Przemyslaw Wirkus,Arm,PrzemekWirkus,przemyslaw-wirkus-78b73352,, Nader Zouaoui,Day Devs,nader-zouaoui,nader-zouaoui,@zouaoui_nader,https://daydevs.com/ +Alaaeddine Chakroun,Day Devs,Alaaeddine-Chakroun,alaaeddine-chakroun,,https://daydevs.com/ +Koki Mitsunami,Arm,,,, +Chen Zhang,Zilliz,,,, diff --git a/content/install-guides/_images/about-wpa.png b/content/install-guides/_images/about-wpa.png new file mode 100644 index 0000000000..7cd5131059 Binary files /dev/null and b/content/install-guides/_images/about-wpa.png differ diff --git a/content/install-guides/_images/wpa-installation.png b/content/install-guides/_images/wpa-installation.png new file mode 100644 index 0000000000..207ce3bf8a Binary files /dev/null and b/content/install-guides/_images/wpa-installation.png differ diff --git a/content/install-guides/_images/wpa-telemetry-table.png b/content/install-guides/_images/wpa-telemetry-table.png new file mode 100644 index 0000000000..eb85d41270 Binary files /dev/null and b/content/install-guides/_images/wpa-telemetry-table.png differ diff --git a/content/install-guides/_images/wpa-timeline-by-core.png b/content/install-guides/_images/wpa-timeline-by-core.png new file mode 100644 index 0000000000..b35f6df648 Binary files /dev/null and b/content/install-guides/_images/wpa-timeline-by-core.png differ diff --git a/content/install-guides/windows-perf-vs-extension.md b/content/install-guides/windows-perf-vs-extension.md index c55e3298f0..79d0dd668a 100644 --- a/content/install-guides/windows-perf-vs-extension.md +++ b/content/install-guides/windows-perf-vs-extension.md @@ -28,38 +28,33 @@ multi_install: FALSE # Set to true if first page of multi-page article, else fal multitool_install_part: false # Set to true if a sub-page of a multi-page 
article, else false layout: installtoolsall # DO NOT MODIFY. Always true for tool install articles --- +[WindowsPerf](/install-guides/wperf/) is a lightweight performance profiling tool inspired by Linux Perf, and specifically tailored for Windows on Arm. It leverages the AArch64 Performance Monitoring Unit (PMU) and its hardware counters to offer precise profiling capabilities. -## Introduction +Recognizing the complexities of command-line interaction, the WindowsPerf GUI is a Visual Studio 2022 extension created to provide a more intuitive, integrated experience within the Integrated Development Environment (IDE). This tool enables developers to interact with WindowsPerf, adjust settings, and visualize performance data seamlessly in Visual Studio. -WindowsPerf is a lightweight performance profiling tool inspired by Linux Perf, and specifically tailored for Windows on Arm. It leverages the ARM64 PMU (Performance Monitor Unit) and its hardware counters to offer precise profiling capabilities. - -Recognizing the complexities of command-line interaction, the WindowsPerf GUI is a Visual Studio 2022 extension created to provide a more intuitive, integrated experience within the integrated development environment (IDE). This tool enables developers to interact with WindowsPerf, adjust settings, and visualize performance data seamlessly in Visual Studio. - -## A Glimpse of the available features +## Overview of key features The WindowsPerf GUI extension is composed of several key features, each designed to streamline the user experience: -- **WindowsPerf Configuration**: Connect directly to `wperf.exe` for a seamless integration. Configuration is accessible via `Tools -> Options -> Windows Perf -> WindowsPerf Path`. -- **Host Data**: Understand your environment with `Tools -> WindowsPerf Host Data`, offering insights into tests run by WindowsPerf. 
-- **Output Logging**: All commands executed through the GUI are logged, ensuring transparency and aiding in performance analysis. -- **Sampling UI**: Customize your sampling experience by selecting events, setting frequency and duration, choosing programs for sampling, and comprehensively analyzing results. +- **WindowsPerf Configuration**: Connect directly to `wperf.exe` for seamless integration. Configuration is accessible by selecting **Tools > Options > Windows Perf > WindowsPerf Path**. +- **Host Data**: Understand your environment by selecting **Tools**, then **WindowsPerf Host Data**. This offers insights into tests run by WindowsPerf. +- **Output Logging**: All commands executed through the GUI are logged, ensuring transparency and supporting performance analysis. +- **Sampling UI**: Customize your sampling experience by selecting events, setting frequency and duration, choosing programs for sampling, and comprehensively analyzing results. See the screenshot below. ![Sampling preview #center](../_images/wperf-vs-extension-sampling-preview.png "Sampling settings UI Overview") -- **Counting Settings UI**: Build a `wperf stat` command from scratch using the configuration interface, then view the output in the IDE or open it with Windows Performance Analyzer (WPA) - - ![Counting preview #center](../_images/wperf-vs-extension-counting-preview.png "_Counting settings UI Overview_") +- **Counting Settings UI**: Build a `wperf stat` command from scratch using the configuration interface, then view the output in the IDE or open it with Windows Performance Analyzer (WPA). See the screenshot below. -## Getting Started +![Counting preview #center](../_images/wperf-vs-extension-counting-preview.png "Counting settings UI Overview") -### Prerequisites +## Before you begin -- **Visual Studio 2022**: Ensure you have Visual Studio 2022 installed on your Windows on Arm device.
-- **WindowsPerf**: Download and install WindowsPerf by following the [WindowsPerf install guide](/install-guides/wperf/). -- **LLVM** (Recommended): You can install the LLVM toolchain by following the [LLVM toolchain for Windows on Arm install guide](/install-guides/llvm-woa). +Before installing the WindowsPerf Visual Studio Extension, check the following: +1. Ensure Visual Studio 2022 is installed on your Windows on Arm device. +2. Download and install WindowsPerf by following the [WindowsPerf install guide](/install-guides/wperf/). +3. (Recommended) You can install the LLVM toolchain by following the [LLVM toolchain for Windows on Arm install guide](/install-guides/llvm-woa). {{% notice llvm-objdump %}} The disassembly feature needs to have `llvm-objdump` available at `%PATH%` to work properly. @@ -69,11 +64,10 @@ The disassembly feature needs to have `llvm-objdump` available at `%PATH%` to wo To install the WindowsPerf Visual Studio Extension from Visual Studio: -1. Open Visual Studio 2022 -2. Go to the `Extensions` menu -3. Select **Manage Extensions** -4. Click on the search bar ( or tap `Ctrl` + `L` ) and type `WindowsPerf` -5. Click on the install button and restart Visual Studio +1. Open Visual Studio 2022. +2. Go to the **Extensions** menu and select **Manage Extensions**. +3. Click on the search bar (Ctrl+L) and type `WindowsPerf`. +4. Click on the **Install** button and restart Visual Studio. ![WindowsPerf install page #center](../_images/wperf-vs-extension-install-page.png) @@ -83,7 +77,7 @@ You can also install the WindowsPerf Visual Studio Extension from GitHub. Download the installation file directly from the [GitHub release page](https://github.com/arm-developer-tools/windowsperf-vs-extension/releases). -Unzip the downloaded file and double click on the `WindowsPerfGUI.vsix` file +Unzip the downloaded file and double-click on the `WindowsPerfGUI.vsix` file.
{{% notice Note %}} Make sure that any previous version of the extension is uninstalled and that Visual Studio is closed before installing the extension. @@ -97,10 +91,10 @@ Building the source is not required, but offered as an alternative installation ### WindowsPerf Setup -To get started, you must link the GUI with the executable file `wperf.exe` by navigating to `Tools -> Options -> WindowsPerf -> WindowsPerf Path`. This step is crucial for utilizing the GUI, and the extension will not work if you don't do it. +To get started, you must link the GUI with the executable file `wperf.exe` by navigating to **Tools > Options > WindowsPerf > WindowsPerf Path**. This step is required; the extension will not work until you do it. ## Uninstall the WindowsPerfGUI extension -In Visual Studio go to `Extensions` -> `Manage Extensions` -> `Installed` -> `All` -> `WindowsPerfGUI` and select "Uninstall". +In Visual Studio, go to **Extensions > Manage Extensions > Installed > All > WindowsPerfGUI** and select **Uninstall**. -Please note that this will be scheduled by Visual Studio. You may need to close VS instance and follow uninstall wizard to remove the extension. +As the uninstall is scheduled by Visual Studio, you might need to close the VS instance and follow the uninstall wizard to remove the extension.
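As the llvm-objdump notice above says, the disassembly feature needs `llvm-objdump` available on `%PATH%`. A quick way to confirm this before launching Visual Studio is a generic PATH lookup; this small sketch is not part of the extension, just a convenience check:

```python
import shutil

def on_path(tool: str) -> bool:
    """Return True if `tool` resolves to an executable on the PATH."""
    return shutil.which(tool) is not None

if __name__ == "__main__":
    # The extension's disassembly feature needs llvm-objdump on PATH.
    print("llvm-objdump found:", on_path("llvm-objdump"))
```

If this prints `False`, install LLVM (see the install guide linked above) or add its `bin` directory to `%PATH%`.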
diff --git a/content/install-guides/windows-perf-wpa-plugin.md b/content/install-guides/windows-perf-wpa-plugin.md new file mode 100644 index 0000000000..f23f5cc042 --- /dev/null +++ b/content/install-guides/windows-perf-wpa-plugin.md @@ -0,0 +1,139 @@ +--- +### Title the install tools article with the name of the tool to be installed +### Include vendor name where appropriate +title: Windows Performance Analyzer (WPA) Plugin +minutes_to_complete: 15 + +draft: true + +official_docs: https://github.com/arm-developer-tools/windowsperf-wpa-plugin + +author_primary: Alaaeddine Chakroun + +### Optional additional search terms (one per line) to assist in finding the article +additional_search_terms: + - perf + - profiling + - profiler + - windows + - woa + - windows on arm + - windows performance analyzer + - wpa + +### FIXED, DO NOT MODIFY +weight: 1 # Defines page ordering. Must be 1 for first (or only) page. +tool_install: true # Set to true to be listed in main selection page, else false +multi_install: FALSE # Set to true if first page of multi-page article, else false +multitool_install_part: false # Set to true if a sub-page of a multi-page article, else false +layout: installtoolsall # DO NOT MODIFY. Always true for tool install articles +--- + +## What is the Windows Performance Analyzer plugin? + +The Windows Performance Analyzer plugin connects Windows Perf to the Windows Performance Analyzer (WPA). + +[WindowsPerf](https://github.com/arm-developer-tools/windowsperf) is a lightweight performance profiling tool inspired by Linux Perf and designed for Windows on Arm. + +Windows Performance Analyzer (WPA) is a tool that creates graphs and data tables of Event Tracing for Windows (ETW) events that are recorded by Windows Performance Recorder (WPR), Xperf, or an assessment that is run in the Assessment Platform. WPA opens event trace log (ETL) files for analysis. 
+The WPA plugin is built using the [Microsoft Performance Toolkit SDK](https://github.com/microsoft/microsoft-performance-toolkit-sdk), a collection of tools to create and extend performance analysis applications. The plugin parses JSON output from WindowsPerf so that it can be visualized in WPA. + +## What are some of the features of the WPA plugin? + +The WPA plugin is composed of several key features, each designed to streamline the user experience: + +### What is the timeline view? + +The timeline view visualizes the `wperf stat` timeline data plotted by event group. + +![Timeline By Core Table](/install-guides/_images/wpa-timeline-by-core.png) + +### What is the telemetry view? + +The telemetry view displays telemetry events grouped by unit. + +![Telemetry Table](/install-guides/_images/wpa-telemetry-table.png) + +## How do I install the WPA plugin? + +Before using the WPA plugin, make sure you have installed WPA. + +### Windows Performance Analyzer + +WPA is included in the Windows Assessment and Deployment Kit (Windows ADK) that can be downloaded from [Microsoft](https://go.microsoft.com/fwlink/?linkid=2243390). + +{{% notice Note %}} +The WPA plugin requires WPA version `11.0.7.2` or higher. +{{% /notice %}} + +Run the downloaded `adksetup.exe` program. + +Specify the default installation location and accept the license agreement. + +Make sure that "Windows Performance Toolkit" is checked under "Select the features you want to install". + +![WPA Installation](/install-guides/_images/wpa-installation.png) + +Finally, click **Install**. + +### Windows Performance Analyzer plugin + +The plugin is a single `.dll` file. + +Download a `.zip` file from the [GitHub releases page](https://github.com/arm-developer-tools/windowsperf-wpa-plugin/releases).
+To download the latest version from the command prompt: + +```console +mkdir wpa-plugin +cd wpa-plugin +curl -L -O https://github.com/arm-developer-tools/windowsperf-wpa-plugin/releases/download/1.0.2/wpa-plugin-1.0.2.zip +``` + +Extract the `.dll` file from the downloaded `.zip` file. + +```console +tar -xmf wpa-plugin-1.0.2.zip +``` + +You now have the file `WPAPlugin.dll` in your `wpa-plugin` directory. + +There are three ways you can install the `WPAPlugin.dll` file: + +###### 1. Copy the plugin DLL to the CustomDataSources directory next to the WPA executable. + +The default location is: + `C:\\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\CustomDataSources` + +###### 2. Set an environment variable + +Set the `WPA_ADDITIONAL_SEARCH_DIRECTORIES` environment variable to the location of the DLL file. + +###### 3. Start WPA from the command line and pass the plugin directory location using a flag. + +Use the `-addsearchdir` flag for `wpa`: + +```bash +wpa -addsearchdir "%USERPROFILE%\plugins" +``` + +## How can I verify the WPA plugin is installed? + +To verify the plugin is loaded, launch WPA and the plugin should appear under `Help > About Windows Performance Analyzer`. + +![WPA installation confirmation](/install-guides/_images/about-wpa.png) + +## How can I run the WPA plugin from the command line? + +To open a JSON file directly from the command line, you can use the `-i` flag to specify the file path to open. + +For example, to open `timeline_long.json` in your downloads directory, run the command: + +```console +wpa -i "%USERPROFILE%\\Downloads\\timeline_long.json" +``` +## How do I uninstall the WPA plugin? + +To uninstall the plugin, simply delete the `WPAPlugin.dll` file.
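The three loading options above can be summarized in a small sketch. This is illustrative only: it lists the search locations in the order the options are given (the default `CustomDataSources` directory, then `WPA_ADDITIONAL_SEARCH_DIRECTORIES`, assumed here to be semicolon-separated like other Windows path lists, then any `-addsearchdir` directories); it is not WPA's actual plugin-resolution logic:

```python
import os

# Default location mentioned above; may differ on your install.
DEFAULT_DIR = (r"C:\Program Files (x86)\Windows Kits\10"
               r"\Windows Performance Toolkit\CustomDataSources")

def plugin_search_dirs(flag_dirs=None, env=None):
    """Collect candidate plugin directories, mirroring the three options:
    1. the default CustomDataSources directory,
    2. directories from WPA_ADDITIONAL_SEARCH_DIRECTORIES (semicolon-separated),
    3. directories passed on the command line via -addsearchdir."""
    env = os.environ if env is None else env
    dirs = [DEFAULT_DIR]
    dirs += [d for d in env.get("WPA_ADDITIONAL_SEARCH_DIRECTORIES", "").split(";") if d]
    dirs += list(flag_dirs or [])
    return dirs
```

For example, with `WPA_ADDITIONAL_SEARCH_DIRECTORIES` set and one `-addsearchdir` directory, all three locations are candidates, so placing `WPAPlugin.dll` in any one of them is sufficient.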
+ diff --git a/content/learning-paths/cross-platform/gitlab/1-gitlab-runner.md b/content/learning-paths/cross-platform/gitlab/1-gitlab-runner.md index 97d56ca788..d11f820821 100644 --- a/content/learning-paths/cross-platform/gitlab/1-gitlab-runner.md +++ b/content/learning-paths/cross-platform/gitlab/1-gitlab-runner.md @@ -16,9 +16,7 @@ A GitLab Runner works with GitLab CI/CD to run jobs in a pipeline. It acts as an 3. Multi-architecture support: GitLab runners support multiple architectures including - `x86/amd64` and `arm64` ## What is Google Axion? -Axion is Google’s first Arm-based server processor, built using the Armv9 Neoverse V2 CPU. The VM instances are part of the `C4A` family of compute instances. To learn more about Google Axion refer to this [blog](https://cloud.google.com/blog/products/compute/introducing-googles-new-arm-based-cpu). - -Note: These `C4A` VM instances are in public preview and needs a signup to be enabled in your Google Cloud account/project. +Axion is Google’s first Arm-based server processor, built using the Armv9 Neoverse V2 CPU. The VM instances are part of the `C4A` family of compute instances. To learn more about Google Axion, refer to this [page](https://cloud.google.com/products/axion).
## Install GitLab runner on a Google Axion VM diff --git a/content/learning-paths/cross-platform/gitlab/_index.md b/content/learning-paths/cross-platform/gitlab/_index.md index 2ce96256e0..0aeb2d98e7 100644 --- a/content/learning-paths/cross-platform/gitlab/_index.md +++ b/content/learning-paths/cross-platform/gitlab/_index.md @@ -1,8 +1,5 @@ --- title: Build a CI/CD pipeline with GitLab on Google Axion -draft: true -cascade: - draft: true minutes_to_complete: 30 diff --git a/content/learning-paths/servers-and-cloud-computing/csp/google.md b/content/learning-paths/servers-and-cloud-computing/csp/google.md index 282fea34a4..c4430cb9de 100644 --- a/content/learning-paths/servers-and-cloud-computing/csp/google.md +++ b/content/learning-paths/servers-and-cloud-computing/csp/google.md @@ -11,7 +11,7 @@ layout: "learningpathall" As with most cloud service providers, Google Cloud offers a pay-as-you-use [pricing policy](https://cloud.google.com/pricing), including a number of [free](https://cloud.google.com/free/docs/free-cloud-features) services. -This section is to help you get started with [Google Cloud Compute Engine](https://cloud.google.com/compute) compute services, using Arm-based [Tau T2A](https://cloud.google.com/tau-vm) Virtual Machines. This is a general-purpose compute platform, essentially your own personal computer in the cloud. +This section is to help you get started with [Google Cloud Compute Engine](https://cloud.google.com/compute) compute services, using Arm-based Virtual Machines. Google Cloud offers two generations of Arm-based VMs: `C4A` is the latest generation, based on [Google Axion](https://cloud.google.com/products/axion), Google’s first Arm-based server processor, built using the Armv9 Neoverse V2 CPU. The previous generation VMs are based on the Ampere Altra processor and are part of the [Tau T2A](https://cloud.google.com/tau-vm) family of Virtual Machines.
Detailed instructions are available in the Google Cloud [documentation](https://cloud.google.com/compute/docs/instances). @@ -23,7 +23,7 @@ If using an organization's account, you will likely need to consult with your in ## Browse for an appropriate instance -Google Cloud offers a wide range of instance types, covering all performance (and pricing) points. For an overview of the Tau T2A instance types, see the [General-purpose machine family](https://cloud.google.com/compute/docs/general-purpose-machines#t2a_machines) overview. +Google Cloud offers a wide range of instance types, covering all performance (and pricing) points. For an overview of the `C4A` instance types, see this [page](https://cloud.google.com/products/axion). Similarly, to learn more about the `T2A` instance types, see the [General-purpose machine family](https://cloud.google.com/compute/docs/general-purpose-machines#t2a_machines) overview. Also note which [regions](https://cloud.google.com/compute/docs/regions-zones#available) these servers are available in. @@ -49,15 +49,15 @@ Select an appropriate `region` and `zone` that support Arm-based servers. ![google3 #center](https://github.com/ArmDeveloperEcosystem/arm-learning-paths/assets/71631645/f2a19cd0-7565-44d3-9e6f-b27bccad3e86 "Select an appropriate region and zone") -To view the latest information on which available regions and zones support Arm-based servers, see the [Compute Engine documentation](https://cloud.google.com/compute/docs/regions-zones#available). To filter for Arm-based machines, click on `Select a machine type`, then select `T2A` from the pull-down menu. +To view the latest information on which available regions and zones support Arm-based servers, see the [Compute Engine documentation](https://cloud.google.com/compute/docs/regions-zones#available). To filter for Arm-based machines, click on `Select a machine type`, then select `T2A` or `C4A` from the pull-down menu.
![google4 #center](https://github.com/ArmDeveloperEcosystem/arm-learning-paths/assets/71631645/5b1683dc-724f-4c60-aea6-dc945c7bf6bc "Check which regions and zones support Arm-based machines") ### Machine configuration -Select `T2A` from the `Series` pull-down menu. Then select an appropriate `Machine type` configuration for your needs. +Select `C4A` from the `Series` pull-down menu. Then select an appropriate `Machine type` configuration for your needs. -![google5 #center](images/gcp_instance.png "Select an appropriate T2A machine type") +![google5 #center](images/gcp_instance_new.png "Select an appropriate C4A machine type") ### Boot disk configuration diff --git a/content/learning-paths/servers-and-cloud-computing/gcp/jump_server.md b/content/learning-paths/servers-and-cloud-computing/gcp/jump_server.md index 27c8c36046..9c67fa700c 100644 --- a/content/learning-paths/servers-and-cloud-computing/gcp/jump_server.md +++ b/content/learning-paths/servers-and-cloud-computing/gcp/jump_server.md @@ -69,7 +69,7 @@ resource "google_project_iam_member" "project" { resource "google_compute_instance" "bastion_host" { project = var.project name = "bastion-vm" - machine_type = "t2a-standard-1" + machine_type = "c4a-standard-1" zone = var.zone tags = ["public"] boot_disk { @@ -91,7 +91,7 @@ resource "google_compute_instance" "bastion_host" { resource "google_compute_instance" "private" { project = var.project name = "bastion-private" - machine_type = "t2a-standard-1" + machine_type = "c4a-standard-1" zone = var.zone allow_stopping_for_update = true tags = ["private"] diff --git a/content/learning-paths/servers-and-cloud-computing/gcp/terraform.md b/content/learning-paths/servers-and-cloud-computing/gcp/terraform.md index 63a4622941..bfedc010ba 100644 --- a/content/learning-paths/servers-and-cloud-computing/gcp/terraform.md +++ b/content/learning-paths/servers-and-cloud-computing/gcp/terraform.md @@ -39,7 +39,7 @@ provider "google" { resource "google_compute_instance" 
"vm_instance" { name = "instance-arm" - machine_type = "t2a-standard-1" + machine_type = "c4a-standard-1" boot_disk { initialize_params { diff --git a/content/learning-paths/servers-and-cloud-computing/gh-runners/_index.md b/content/learning-paths/servers-and-cloud-computing/gh-runners/_index.md index 3057703032..3e2e93a039 100644 --- a/content/learning-paths/servers-and-cloud-computing/gh-runners/_index.md +++ b/content/learning-paths/servers-and-cloud-computing/gh-runners/_index.md @@ -1,8 +1,5 @@ --- title: Optimize MLOps with Arm-hosted GitHub Runners -draft: true -cascade: - draft: true minutes_to_complete: 60 diff --git a/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/_index.md b/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/_index.md index 4188422e34..d093594ea4 100644 --- a/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/_index.md +++ b/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/_index.md @@ -1,12 +1,12 @@ --- -title: Learn how to migrate an x86 application to multi-architecture with Arm on Google Kubernetes Engine (GKE) +title: Learn how to migrate an x86 application to multi-architecture with Arm-based Google Axion processors on GKE minutes_to_complete: 30 who_is_this_for: This is an advanced topic for software developers who are looking to migrate their existing x86 containerized applications to Arm learning_objectives: - - Add Arm-based nodes to an existing x86-based GKE cluster + - Add Arm-based nodes (Google Axion) to an existing x86-based GKE cluster - Rebuild an x86-based application to make it multi-arch and run on Arm - Learn how to add taints and tolerations to GKE clusters to schedule application pods on architecture specific nodes - Run a multi-arch application across multiple architectures on a single GKE cluster diff --git a/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/how-to-1.md
b/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/how-to-1.md index adff9915da..63016134df 100644 --- a/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/how-to-1.md +++ b/content/learning-paths/servers-and-cloud-computing/gke-multi-arch/how-to-1.md @@ -8,7 +8,7 @@ layout: learningpathall ## Migrate an existing x86-based application to run on Arm-based nodes in a single GKE cluster -Google Kubernetes Engine (GKE) supports hybrid clusters with x86 and Arm based nodes. The Arm-based nodes can be deployed on the `Tau T2A` family of virtual machines. The `Tau T2A` virtual machines are powered by Ampere Altra Arm-based processors. +Google Kubernetes Engine (GKE) supports hybrid clusters with x86 and Arm based nodes. The Arm-based nodes can be deployed on the `C4A` family of virtual machines. The `C4A` VMs are based on [Google Axion](https://cloud.google.com/products/axion), Google’s first Arm-based server processor, built using the Armv9 Neoverse V2 CPU. ## Before you begin @@ -102,13 +102,13 @@ Hello from NODE:gke-multi-arch-cluster-default-pool-45537239-q83v, POD:x86-hello ## Add Arm-based nodes to your GKE cluster -Use the following command to add an Arm-based node pool with VM type `t2a-standard-2` to your GKE cluster: +Use the following command to add an Arm-based node pool with VM type `c4a-standard-2` to your GKE cluster: ```console gcloud container node-pools create arm-pool \ --cluster $CLUSTER_NAME \ --zone $ZONE \ - --machine-type=t2a-standard-2 \ + --machine-type=c4a-standard-2 \ --num-nodes=3 ``` After the Arm-nodes are successfully added to the cluster, run the following command to check if both types of nodes show up in the cluster: diff --git a/content/learning-paths/servers-and-cloud-computing/java-on-axion/1-create-instance.md b/content/learning-paths/servers-and-cloud-computing/java-on-axion/1-create-instance.md index 7d4604d0fc..0e30ae6993 100644 ---
a/content/learning-paths/servers-and-cloud-computing/java-on-axion/1-create-instance.md +++ b/content/learning-paths/servers-and-cloud-computing/java-on-axion/1-create-instance.md @@ -8,11 +8,7 @@ layout: learningpathall ## Create an Axion instance -Axion is Google’s first Arm-based server processor, built using the Armv9 Neoverse V2 CPU. Created specifically for the data center, Axion delivers industry-leading performance and energy efficiency. - -{{% notice Note %}} -The Axion instance type (C4A) is currently in public preview. A GA (General Availability) release will happen in the coming months. -{{% /notice %}} +Axion is Google’s first Arm-based server processor, built using the Armv9 Neoverse V2 CPU. Created specifically for the data center, Axion delivers industry-leading performance and energy efficiency. To learn more about Google Axion, refer to this [page](https://cloud.google.com/products/axion). There are several ways to create an Arm-based Google Axion VM: the Google Cloud console, the gcloud CLI tool, or using your choice of IaC (Infrastructure as Code). diff --git a/content/learning-paths/servers-and-cloud-computing/java-on-axion/2-deploy-java.md b/content/learning-paths/servers-and-cloud-computing/java-on-axion/2-deploy-java.md index 65c89d4af8..98313d2d6a 100644 --- a/content/learning-paths/servers-and-cloud-computing/java-on-axion/2-deploy-java.md +++ b/content/learning-paths/servers-and-cloud-computing/java-on-axion/2-deploy-java.md @@ -90,7 +90,7 @@ java -jar target/*.jar Once the application is running, you can open the web app in a web browser by visiting ```bash -http://[EXTERNAL IP]:8080 +http://<EXTERNAL_IP>:8080 ``` -Where `[EXTERNAL IP]` is the value you obtained in the [last section](/learning-paths/servers-and-cloud-computing/java-on-axion/1-create-instance/#obtain-the-ip-of-your-instance). +Where `<EXTERNAL_IP>` is the value you obtained in the [last section](/learning-paths/servers-and-cloud-computing/java-on-axion/1-create-instance/#obtain-the-ip-of-your-instance).
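To make the external-IP substitution above concrete, here is a hypothetical example using a reserved documentation address (203.0.113.10); replace it with the IP you obtained for your instance:

```shell
# Example only: 203.0.113.10 is a reserved documentation address, not a real VM.
EXTERNAL_IP=203.0.113.10
echo "http://${EXTERNAL_IP}:8080"
# prints http://203.0.113.10:8080
```

Opening the printed URL in a browser (or fetching it with `curl`) should show the running application.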
diff --git a/content/learning-paths/servers-and-cloud-computing/java-on-axion/_index.md b/content/learning-paths/servers-and-cloud-computing/java-on-axion/_index.md index 425b8702a5..b469bfbe86 100644 --- a/content/learning-paths/servers-and-cloud-computing/java-on-axion/_index.md +++ b/content/learning-paths/servers-and-cloud-computing/java-on-axion/_index.md @@ -1,8 +1,5 @@ --- title: Run Java applications on Google Axion processors -draft: true -cascade: - draft: true minutes_to_complete: 20 diff --git a/content/learning-paths/servers-and-cloud-computing/java-on-axion/select_axion_instance.png b/content/learning-paths/servers-and-cloud-computing/java-on-axion/select_axion_instance.png deleted file mode 100644 index d537fd83d1..0000000000 Binary files a/content/learning-paths/servers-and-cloud-computing/java-on-axion/select_axion_instance.png and /dev/null differ diff --git a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_index.md b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_index.md index ff83f6bff4..de2a37aecb 100644 --- a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_index.md +++ b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_index.md @@ -1,15 +1,12 @@ --- title: Migrate containers to Arm using KubeArchInspect -draft: true -cascade: - draft: true minutes_to_complete: 15 -who_is_this_for: This is an introductory topic for software developers who want to know if the containers running in a Kubernetes cluster are available for the Arm architecture. +who_is_this_for: This is an introductory topic for software developers who want to ensure containers running in a Kubernetes cluster support the Arm architecture. learning_objectives: - - Run KubeArchInspect to get a quick report of the containers running in a Kubernetes cluster. + - Run KubeArchInspect to generate a report on the containers running in a Kubernetes cluster. - Discover which images support the Arm architecture. 
- Understand common reasons for an image not supporting Arm. - Make configuration changes to upgrade images with Arm support. diff --git a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_next-steps.md b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_next-steps.md index 3174193d80..8e40a1f93b 100644 --- a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_next-steps.md +++ b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_next-steps.md @@ -1,5 +1,5 @@ --- -next_step_guidance: Now you know how to use the KubeArchInspect tool to understand the Arm support of your Kubernetes cluster images. +next_step_guidance: You now know how to use the KubeArchInspect tool to ensure containers running in a Kubernetes cluster support the Arm architecture. To learn more about related topics, please explore the resources below. recommended_path: /learning-paths/servers-and-cloud-computing/eks-multi-arch/ diff --git a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_review.md b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_review.md index 378a647733..fdc1d2d118 100644 --- a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_review.md +++ b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/_review.md @@ -2,7 +2,7 @@ review: - questions: question: > - Which of the following statements is true about kubearchinspect? + Which of the following statements is true about KubeArchInspect? answers: - KubeArchInspect displays a report of the images running in a Kubernetes cluster, but it does not identify which images support arm64. - KubeArchInspect displays a report of the images running in a Kubernetes cluster and identifies which images support arm64. @@ -25,7 +25,7 @@ review: question: > Which of the following is NOT a way to improve your cluster's Arm compatibility? answers: - - Upgrade images to a newer version -- if they support arm64. 
+ - Upgrade images to a newer version that supports arm64. - Find an alternative image that supports arm64. - Request that the developers of an image build and publish an arm64 version. - Contact the Kubernetes community to upgrade your cluster. diff --git a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/analyse-results.md b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/analyse-results.md index a3bdc0efec..3635d50cb7 100644 --- a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/analyse-results.md +++ b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/analyse-results.md @@ -8,7 +8,12 @@ layout: learningpathall ## Identifying issues and opportunities -After running KubeArchInspect, you can examine the output to determine if the cluster image architectures are suitable for your needs. +After running KubeArchInspect, you can examine the output to determine if the cluster image architectures are suitable for your needs. Each image running in the cluster appears on a separate line, including name, tag (version), and test result: + +* A green tick (✅) indicates the image already supports arm64. +* A red cross (❌) indicates that arm64 support is not available. +* An upward arrow (⬆) shows that arm64 support is included in a newer version. +* A red exclamation mark (❗) is shown when an error occurs checking the image. This may indicate an error connecting to the image registry. If you want to run an all Arm cluster, you need to use images which include arm64 support. @@ -25,17 +30,14 @@ Legends: ... sergrua/kube-tagger:release-0.1.1 ❌ ``` - These images are identified as not supporting arm64 (`❌`). ## Addressing issues The KubeArchInspect report provides valuable information for improving the cluster's performance and compatibility with the Arm architecture. 
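The status symbols in the legend above lend themselves to mechanical triage. A minimal sketch, assuming a simplified report with one `image:tag` and one symbol per line (the real KubeArchInspect output can contain additional columns):

```python
def triage(report: str) -> dict:
    """Group images from a simplified KubeArchInspect-style report by action."""
    actions = {
        "✅": "ok",                        # already supports arm64
        "⬆": "upgrade",                   # newer version adds arm64 support
        "❌": "replace or request arm64",  # no arm64 version available
        "❗": "recheck registry",          # error while checking the image
    }
    groups = {action: [] for action in actions.values()}
    for line in report.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] in actions:
            groups[actions[parts[1]]].append(parts[0])
    return groups

# Example lines modeled on the report excerpt shown above.
print(triage("busybox:1.36 ✅\nsergrua/kube-tagger:release-0.1.1 ❌"))
```

The grouping makes it easy to act on each bucket in turn: upgrade what can be upgraded, then find alternatives or request arm64 builds for the rest.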
-Several approaches can be taken to address the issues identified: +There are several approaches you can take to address issues identified in the report: -* **Upgrade images:** If an image with an available arm64 version (`⬆`) is detected, consider upgrading to that version. This can be done by modifying the deployment configuration and restarting the containers using the new image tag. -* **Find alternative images:** For images with no available arm64 version, look for alternative images that offer arm64 support. For example, instead of a specific image from the registry, try using a more general image like `busybox`, which supports multiple architectures, including arm64. +* **Upgrade images:** If an image with an available arm64 version (`⬆`) is detected, consider upgrading to that version. You can do this by modifying the deployment configuration and restarting the containers using the new image tag. +* **Find alternative images:** For images with no available arm64 version (`❌`), look for alternative images that offer arm64 support. For example, instead of a specific image from the registry, try using a more general image like `busybox`, which supports multiple architectures, including arm64. * **Request Arm support:** If there is no suitable alternative image available, you can contact the image developers or the Kubernetes community and request them to build and publish an arm64 version of the image. - -KubeArchInspect provides an efficient way to understand and improve the Arm architecture support within your Kubernetes cluster, ensuring your cluster runs efficiently and effectively. 
\ No newline at end of file diff --git a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/before-you-begin.md b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/before-you-begin.md index d59fe366bb..9d909ef123 100644 --- a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/before-you-begin.md +++ b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/before-you-begin.md @@ -5,6 +5,11 @@ weight: 2 ### FIXED, DO NOT MODIFY layout: learningpathall --- +## How does KubeArchInspect help? + +KubeArchInspect is a tool developed by Arm. It provides an efficient way to understand and improve the Arm architecture support within your Kubernetes cluster, ensuring your cluster runs efficiently and effectively. + +KubeArchInspect identifies images in a Kubernetes cluster which support the Arm architecture. It does this by checking each image against its source registry and identifying which architectures are available. You can use the results to identify potential issues or opportunities for optimizing the cluster to run on Arm. {{% notice Note %}} KubeArchInspect is a command-line tool which requires a running Kubernetes cluster. diff --git a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/run-kubearchinspect.md b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/run-kubearchinspect.md index 61d3f4830d..18daf3fb29 100644 --- a/content/learning-paths/servers-and-cloud-computing/kubearchinspect/run-kubearchinspect.md +++ b/content/learning-paths/servers-and-cloud-computing/kubearchinspect/run-kubearchinspect.md @@ -6,8 +6,6 @@ weight: 3 layout: learningpathall --- -KubeArchInspect identifies images in a Kubernetes cluster which have support for the Arm architecture. It checks each image against the image registry, checking the available architectures for each image tag.
The results can be used to identify potential issues or opportunities for optimizing the cluster for Arm. - ## How do I run KubeArchInspect? To run KubeArchInspect, you need to have `kubearchinspect` installed and ensure that the `kubectl` command is configured to connect to your cluster. If not already configured, you should set up `kubectl` to connect to your cluster. @@ -65,9 +63,3 @@ quay.io/prometheus/alertmanager:v0.25.0 ✅ 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni-init:v1.15.4-eksbuild.1 ❗ 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni:v1.15.4-eksbuild.1 ❗ ``` - -Each image running in the cluster appears on a separate line, including name, tag (version), and test result. - -A green tick indicates the image already supports arm64, a red cross that arm64 support is not available, an upward arrow shows that arm64 support is included in a newer version. - -A red exclamation mark is shown when an error occurs checking the image. This may indicate an error connecting to the image registry. \ No newline at end of file diff --git a/content/learning-paths/servers-and-cloud-computing/milvus-rag/_index.md b/content/learning-paths/servers-and-cloud-computing/milvus-rag/_index.md index 8cd15700f2..82820bf743 100644 --- a/content/learning-paths/servers-and-cloud-computing/milvus-rag/_index.md +++ b/content/learning-paths/servers-and-cloud-computing/milvus-rag/_index.md @@ -12,7 +12,7 @@ learning_objectives: prerequisites: - A basic understanding of a RAG pipeline. - An AWS Graviton3 C7g.2xlarge instance, or any [Arm-based instance](/learning-paths/servers-and-cloud-computing/csp) from a cloud service provider or an on-premise Arm server. - - A [Zilliz account](https://zilliz.com/cloud), which you can sign up for with a free trial. + - A [Zilliz account](https://zilliz.com/cloud?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm), which you can sign up for with a free trial.
author_primary: Chen Zhang diff --git a/content/learning-paths/servers-and-cloud-computing/milvus-rag/offline_data_loading.md b/content/learning-paths/servers-and-cloud-computing/milvus-rag/offline_data_loading.md index 0299590493..6b0465696d 100644 --- a/content/learning-paths/servers-and-cloud-computing/milvus-rag/offline_data_loading.md +++ b/content/learning-paths/servers-and-cloud-computing/milvus-rag/offline_data_loading.md @@ -9,9 +9,9 @@ layout: learningpathall In this section, you will set up a cluster on Zilliz Cloud. -Begin by [registering](https://docs.zilliz.com/docs/register-with-zilliz-cloud) for a free account on Zilliz Cloud. +Begin by [registering](https://docs.zilliz.com/docs/register-with-zilliz-cloud?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm) for a free account on Zilliz Cloud. -After you register, [create a cluster](https://docs.zilliz.com/docs/create-cluster). +After you register, [create a cluster](https://docs.zilliz.com/docs/create-cluster?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm). Now create a **Dedicated** cluster deployed in AWS using Arm-based machines to store and retrieve the vector data as shown: @@ -22,7 +22,7 @@ When you select the **Create Cluster** Button, you should see the cluster runnin ![running](running_cluster.png) {{% notice Note %}} -You can use self-hosted Milvus as an alternative to Zilliz Cloud. This option is more complicated to set up. You can also deploy [Milvus Standalone](https://milvus.io/docs/install_standalone-docker-compose.md) and [Kubernetes](https://milvus.io/docs/install_cluster-milvusoperator.md) on Arm-based machines. For more information about installing Milvus, see the [Milvus installation documentation](https://milvus.io/docs/install-overview.md). +You can use self-hosted Milvus as an alternative to Zilliz Cloud. This option is more complicated to set up. 
You can also deploy [Milvus Standalone](https://milvus.io/docs/install_standalone-docker-compose.md?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm) and [Kubernetes](https://milvus.io/docs/install_cluster-milvusoperator.md?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm) on Arm-based machines. For more information about installing Milvus, see the [Milvus installation documentation](https://milvus.io/docs/install-overview.md?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm). {{% /notice %}} ## Create the Collection @@ -39,7 +39,7 @@ milvus_client = MilvusClient( ) ``` -Replace ** and ** with the `URI` and `Token` for your running cluster. Refer to [Public Endpoint and Api key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) in Zilliz Cloud for further information. +Replace ** and ** with the `URI` and `Token` for your running cluster. Refer to [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm#free-cluster-details) in Zilliz Cloud for further information. Now, append the following code to `zilliz-llm-rag.py` and save the contents: @@ -60,7 +60,7 @@ milvus_client.create_collection( This code checks if a collection already exists and drops it if it does. If this happens, you can create a new collection with the specified parameters. If you do not specify any field information, Milvus automatically creates a default `id` field for the primary key, and a `vector` field to store the vector data. A reserved JSON field is used to store non-schema defined fields and their values. -You can use inner product distance as the default metric type. For more information about distance types, you can refer to [Similarity Metrics page](https://milvus.io/docs/metric.md?tab=floating).
+You can use inner product distance as the default metric type. For more information about distance types, you can refer to the [Similarity Metrics page](https://milvus.io/docs/metric.md?tab=floating&utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm). You can now prepare the data to use in this collection. diff --git a/content/learning-paths/servers-and-cloud-computing/milvus-rag/prerequisite.md b/content/learning-paths/servers-and-cloud-computing/milvus-rag/prerequisite.md index 1008493283..41ef015331 100644 --- a/content/learning-paths/servers-and-cloud-computing/milvus-rag/prerequisite.md +++ b/content/learning-paths/servers-and-cloud-computing/milvus-rag/prerequisite.md @@ -12,7 +12,7 @@ In this Learning Path, you will learn how to build a Retrieval-Augmented Generat RAG applications often use vector databases to efficiently store and retrieve high-dimensional vector representations of text data. Vector databases are optimized for similarity search and can handle large volumes of vector data, making them ideal for the retrieval component of RAG systems. -In this Learning Path, you will use [Zilliz Cloud](https://zilliz.com/cloud) for your vector storage, which is a fully managed Milvus vector database. Zilliz Cloud is available on major cloud computing service providers; for example, AWS, GCP, and Azure. +In this Learning Path, you will use [Zilliz Cloud](https://zilliz.com/cloud?utm_source=partner&utm_medium=referral&utm_campaign=2024-10-24_web_arm-dev-hub-data-loading_arm) for your vector storage, which is a fully managed Milvus vector database. Zilliz Cloud is available on major cloud computing service providers; for example, AWS, GCP, and Azure. Here, you will use Zilliz Cloud deployed on AWS with an Arm-based server. For the LLM, you will use the Llama-3.1-8B model also running on an AWS Arm-based server, but using `llama.cpp`.
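As an aside, the inner product (IP) metric mentioned above is simply the dot product of two vectors; for normalized embeddings it ranks results the same way as cosine similarity. A minimal illustration in plain Python, using made-up toy vectors rather than real embeddings:

```python
# Illustration of the inner-product (IP) similarity metric used by the
# collection above. The vectors here are toy values, not real embeddings.

def inner_product(a, b):
    # Dot product: higher score means more similar.
    return sum(x * y for x, y in zip(a, b))

query = [0.1, 0.9, 0.2]
docs = {
    "doc_a": [0.1, 0.8, 0.3],
    "doc_b": [0.9, 0.1, 0.0],
}

# Rank documents by IP score, best match first.
ranked = sorted(docs, key=lambda d: inner_product(query, docs[d]), reverse=True)
print(ranked)  # ['doc_a', 'doc_b']: doc_a scores 0.79, doc_b scores 0.18
```

In a real pipeline, Milvus computes these scores server-side over the stored vectors; this is only to show what the metric measures.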
diff --git a/content/learning-paths/servers-and-cloud-computing/mysql_tune/kernel_comp_lib.md b/content/learning-paths/servers-and-cloud-computing/mysql_tune/kernel_comp_lib.md index 95d35ff73b..e42c507ac0 100644 --- a/content/learning-paths/servers-and-cloud-computing/mysql_tune/kernel_comp_lib.md +++ b/content/learning-paths/servers-and-cloud-computing/mysql_tune/kernel_comp_lib.md @@ -14,7 +14,7 @@ The underlying storage technology and the file system format can impact performa Aside from the storage technology, the file system format used with `MySQL` can impact performance. The `xfs` file system is a good starting point. The `ext4` file system is another good alternative. Last, it is recommended to use storage drives that are dedicated to the database (i.e. not shared with the OS or other applications). -When running in the cloud, the disk scheduling algorithm is typically set to `noop` or a similar "dumb" algorithm. This is typically optimal for `MySQL` in the cloud, so no adjustment is needed. However, if running `MySQL` on an on-prem server, it's a good idea to double check what the disk scheduling algorithm is, and possibly change it. According to the [Optimizing InnoDB Disk I/O documentation]https://dev.mysql.com/doc/refman/en/optimizing-innodb-diskio.html), `noop` or `deadline` might be better options. It's worth testing this with on-prem systems. +When running in the cloud, the disk scheduling algorithm is typically set to `noop` or a similar "dumb" algorithm. This is typically optimal for `MySQL` in the cloud, so no adjustment is needed. However, if running `MySQL` on an on-prem server, it's a good idea to double check what the disk scheduling algorithm is, and possibly change it. According to the [Optimizing InnoDB Disk I/O documentation](https://dev.mysql.com/doc/refman/en/optimizing-innodb-diskio.html), `noop` or `deadline` might be better options. It's worth testing this with on-prem systems. 
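On a Linux server, you can check the active scheduler by reading `/sys/block/<device>/queue/scheduler`, where the kernel marks the active scheduler in square brackets (for example `noop [deadline] cfq`). A small sketch of parsing that format; the device name and sample strings below are examples, not values from this guide:

```python
# Sketch: determine the active I/O scheduler for a block device on Linux.
# The kernel lists the available schedulers on one line and wraps the
# active one in square brackets, e.g. "noop [deadline] cfq".

import re

def active_scheduler(text):
    """Return the bracketed (active) scheduler name, or None if absent."""
    m = re.search(r"\[([^\]]+)\]", text)
    return m.group(1) if m else None

# On a real system you would read the sysfs file, e.g. (device name is
# an example):
#   text = open("/sys/block/sda/queue/scheduler").read()
print(active_scheduler("noop [deadline] cfq"))  # deadline
```

Comparing the result against `noop` or `deadline` is a quick first step before testing scheduler changes on an on-prem system.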
## MySQL storage engines diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/1-dev-env-setup.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/1-dev-env-setup.md new file mode 100644 index 0000000000..6c640dd1f8 --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/1-dev-env-setup.md @@ -0,0 +1,61 @@ +--- +title: Create a development environment +weight: 2 + +### FIXED, DO NOT MODIFY +layout: learningpathall +--- + +## Set up your development environment + +In this learning path, you will learn how to build and deploy a simple LLM-based chat app to an Android device using ONNX Runtime. You will learn how to build ONNX Runtime and the ONNX Runtime generate() API, and how to run a Phi-3 model in the Android application. + +Your first task is to prepare a development environment with the required software: + +- Android Studio (latest version recommended) +- Android NDK (tested with version 27.0.12077973) +- Python 3.11 +- CMake (tested with version 3.28.1) +- Ninja (tested with version 1.11.1) + +The following instructions were tested on an x86 Windows machine with at least 16GB of RAM. + +## Install Android Studio and Android NDK + +Follow these steps to install and configure Android Studio: + +1. Download and install the latest version of [Android Studio](https://developer.android.com/studio/). + +2. Navigate to **Tools > SDK Manager**. + +3. In the **SDK Platforms** tab, check **Android 14.0 ("UpsideDownCake")**. + +4. In the **SDK Tools** tab, check **NDK (Side by side)**. + +5. Click **OK** and **Apply**. + +## Install Python 3.11 + +Download and install [Python version 3.11](https://www.python.org/downloads/release/python-3110/). + +## Install CMake + +CMake is an open-source tool that automates the build process for software projects, helping to generate platform-specific build configurations.
+ +[Download and install CMake](https://cmake.org/download/). + +{{% notice Note %}} +The instructions were tested with version 3.28.1. +{{% /notice %}} + +## Install Ninja + +Ninja is a minimalistic build system designed to efficiently handle incremental builds, particularly in large-scale software projects, by focusing on speed and simplicity. The Ninja generator is used to build on Windows for Android. + +[Download and install Ninja](https://github.com/ninja-build/ninja/releases). + +{{% notice Note %}} +The instructions were tested with version 1.11.1. +{{% /notice %}} + +You now have the required development tools installed to follow this learning path. diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/2-build-onnxruntime.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/2-build-onnxruntime.md new file mode 100644 index 0000000000..d6541e2bd6 --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/2-build-onnxruntime.md @@ -0,0 +1,57 @@ +--- +title: Build ONNX Runtime +weight: 3 + +### FIXED, DO NOT MODIFY +layout: learningpathall +--- + +## Cross-compile ONNX Runtime for Android CPU + +Now that you have your environment set up correctly, you can build the ONNX Runtime inference engine. + +ONNX Runtime is an open-source inference engine designed to accelerate the deployment of machine learning models, particularly those in the Open Neural Network Exchange (ONNX) format. ONNX Runtime is optimized for high performance and low latency, making it popular for production deployment of AI models. You can learn more by reading the [ONNX Runtime Overview](https://onnxruntime.ai/).
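One way to sanity-check the toolchain versions listed above is to compare them numerically rather than as strings (string comparison would wrongly rank "3.9" above "3.28"). A small, hypothetical helper; the function names and the idea of a minimum-version check are illustrative, not part of any of the tools:

```python
# Hypothetical helper to check an installed tool version against the
# tested version from this guide (for example CMake 3.28.1 or Ninja 1.11.1).

def version_tuple(version):
    """Turn '3.28.1' into (3, 28, 1) so comparisons are numeric."""
    return tuple(int(part) for part in version.split("."))

def meets_minimum(installed, tested):
    """True if the installed version is at least the tested version."""
    return version_tuple(installed) >= version_tuple(tested)

print(meets_minimum("3.28.1", "3.28.1"))  # True
print(meets_minimum("3.9.0", "3.28.1"))   # False: 9 < 28 numerically
```

You could feed this with the output of `cmake --version` or `ninja --version` on your own machine.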
+ + +### Clone onnxruntime repo + +Open a Windows PowerShell prompt and check out the source tree: + +```bash +cd C:\Users\$env:USERNAME +git clone --recursive https://github.com/Microsoft/onnxruntime.git +cd onnxruntime +git checkout 9b37b3ea4467b3aab9110e0d259d0cf27478697d +``` + +{{% notice Note %}} +You might be able to use a later commit. These steps have been tested with the commit `9b37b3ea4467b3aab9110e0d259d0cf27478697d`. +{{% /notice %}} + +### Build for Android CPU + +You use the Ninja generator to build on Windows for Android. First, set JAVA_HOME to the path to your JDK install. You can point to the JDK from Android Studio, or a standalone JDK install. + +```bash +$env:JAVA_HOME="C:\Program Files\Android\Android Studio\jbr" +``` + +Now run the following command: + +```bash + +./build.bat --config Release --build_shared_lib --android --android_sdk_path C:\Users\$env:USERNAME\AppData\Local\Android\Sdk --android_ndk_path C:\Users\$env:USERNAME\AppData\Local\Android\Sdk\ndk\27.0.12077973 --android_abi arm64-v8a --android_api 27 --cmake_generator Ninja --build_java + +``` + +Because the command includes `--build_java`, an Android Archive (AAR) file is generated, which you can import directly into Android Studio. + +When the build is complete, confirm the shared library and the AAR file have been created: + +```bash +ls build\Windows\Release\onnxruntime.so +ls build\Windows\Release\java\build\android\outputs\aar\onnxruntime-release.aar +``` + + + diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/3-build-onnxruntime-generate-api.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/3-build-onnxruntime-generate-api.md new file mode 100644 index 0000000000..4ca2983bec --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/3-build-onnxruntime-generate-api.md @@ -0,0 +1,47 @@ +--- +title: Build ONNX Runtime Generate() API
+weight: 4 + +### FIXED, DO NOT MODIFY +layout: learningpathall +--- + +## Cross-compile the ONNX Runtime Generate() API for Android CPU + +The Generate() API in ONNX Runtime is designed for text generation tasks using models like Phi-3. It implements the generative AI loop for ONNX models, including: +- pre- and post-processing +- inference with ONNX Runtime +- logits processing +- search and sampling +- KV cache management +You can learn more by reading the [ONNX Runtime generate() API page](https://onnxruntime.ai/docs/genai/). + + +### Clone onnxruntime-genai repo +Within your Windows PowerShell prompt, check out the source repo: + +```bash +cd C:\Users\$env:USERNAME +git clone https://github.com/microsoft/onnxruntime-genai +cd onnxruntime-genai +git checkout 1e4d289502a61265c3b07efb17d8796225bb0b7f +``` + +{{% notice Note %}} +You might be able to use later commits. These steps have been tested with the commit `1e4d289502a61265c3b07efb17d8796225bb0b7f`. +{{% /notice %}} + +### Build for Android CPU + +The Ninja generator is used to build on Windows for Android.
Make sure you have set JAVA_HOME before running the following command: + +```bash +python -m pip install requests +python3.11 build.py --build_java --android --android_home C:\Users\$env:USERNAME\AppData\Local\Android\Sdk --android_ndk_path C:\Users\$env:USERNAME\AppData\Local\Android\Sdk\ndk\27.0.12077973 --android_abi arm64-v8a --config Release +``` + +When the build is complete, confirm the shared library has been created: + +```bash +ls build\Android\Release\onnxruntime-genai.so +``` diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/4-run-benchmark-on-android.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/4-run-benchmark-on-android.md new file mode 100644 index 0000000000..4d231a3eef --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/4-run-benchmark-on-android.md @@ -0,0 +1,89 @@ +--- +title: Run a benchmark on an Android phone +weight: 5 + +### FIXED, DO NOT MODIFY +layout: learningpathall +--- + +## Run a Phi-3 model on your Android phone + +You can now prepare and run a Phi-3-mini model on your Android smartphone, and view performance metrics: + +### Build model runner + +First, cross-compile the model runner to run on Android using the commands below: + +```bash +cd onnxruntime-genai +copy src\ort_genai.h examples\c\include\ +copy src\ort_genai_c.h examples\c\include\ +cd examples\c +mkdir build +cd build +``` +Run the `cmake` command as shown: + +```bash +cmake -DCMAKE_TOOLCHAIN_FILE=C:\Users\$env:USERNAME\AppData\Local\Android\Sdk\ndk\27.0.12077973\build\cmake\android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-27 -DCMAKE_BUILD_TYPE=Release -G "Ninja" .. +ninja +``` + +After a successful build, a binary called `phi3` is created. + +### Prepare Phi-3-mini model + +Phi-3 ONNX models are hosted on Hugging Face.
You can download the Phi-3-mini model by using the `huggingface-cli` command: + +``` bash +pip install huggingface-hub[cli] +huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* --local-dir . +``` +This command downloads the model into a folder called `cpu_and_mobile`. + +The Phi-3-mini (3B) model has a short (4k) context version and a long (128k) context version. The long context version can accept much longer prompts and produce longer output text, but it does consume more memory. In this learning path, you will use the short context version, which is quantized to 4-bits. + + +### Run on Android via adb shell + +#### Connect your Android phone +Connect your phone to your computer using a USB cable. + +You need to enable USB debugging on your Android device. You can follow [Configure on-device developer options](https://developer.android.com/studio/debug/dev-options) to do this. + +Once you have enabled USB debugging and connected via USB, run: + +``` +adb devices +``` + +You should see your device listed to confirm it is connected. + +#### Copy the runner binary and the model files to the phone + +``` bash +adb push cpu-int4-rtn-block-32-acc-level-4 /data/local/tmp +adb push .\phi3 /data/local/tmp +adb push onnxruntime-genai\build\Android\Release\libonnxruntime-genai.so /data/local/tmp +adb push onnxruntime\build\Windows\Release\libonnxruntime.so /data/local/tmp +``` + +#### Run the model + +Use the runner to execute the model on the phone with the `adb` command: + +``` bash +adb shell +cd /data/local/tmp +chmod 777 phi3 +export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp +./phi3 cpu-int4-rtn-block-32-acc-level-4 +``` + +This will allow the runner program to load the model. It will then prompt you to input the text prompt to the model. After you enter your input prompt, the text output by the model will be displayed. 
On completion, performance metrics similar to those shown below should be displayed: + +``` +Prompt length: 64, New tokens: 931, Time to first: 1.79s, Prompt tokens per second: 35.74 tps, New tokens per second: 6.34 tps +``` + +You have successfully run the Phi-3 model on your Android smartphone powered by Arm. diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/5-build-android-chat-app.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/5-build-android-chat-app.md new file mode 100644 index 0000000000..a7a9d85cce --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/5-build-android-chat-app.md @@ -0,0 +1,54 @@ +--- +title: Build and run an Android chat app +weight: 6 + +### FIXED, DO NOT MODIFY +layout: learningpathall +--- + +## Build an Android chat app + +Another way to run the model is to use an Android GUI app. +You can use the Android demo application included in the [onnxruntime-inference-examples repository](https://github.com/microsoft/onnxruntime-inference-examples) to demonstrate local inference. + +### Clone the repo + +``` bash +git clone https://github.com/microsoft/onnxruntime-inference-examples +cd onnxruntime-inference-examples +git checkout 009920df0136d7dfa53944d06af01002fb63e2f5 +``` + +{{% notice Note %}} +You could probably use a later commit but these steps have been tested with the commit `009920df0136d7dfa53944d06af01002fb63e2f5`. +{{% /notice %}} + +### Build the app using Android Studio + +Open the `mobile\examples\phi-3\android` directory with Android Studio. 
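As an aside, the summary line printed by the `phi3` runner in the previous section can be pulled apart programmatically, for example to track tokens-per-second across runs. A sketch, assuming the output format stays exactly as shown; the helper function is illustrative, not part of the runner:

```python
# Sketch: parse the summary line printed by the phi3 runner, assuming
# the "Key: value" comma-separated format shown earlier in this learning path.

import re

def parse_metrics(line):
    """Extract 'Key: number' pairs into a dict of floats."""
    metrics = {}
    for key, value in re.findall(r"([A-Za-z ]+): ([\d.]+)", line):
        metrics[key.strip()] = float(value)
    return metrics

line = ("Prompt length: 64, New tokens: 931, Time to first: 1.79s, "
        "Prompt tokens per second: 35.74 tps, New tokens per second: 6.34 tps")
m = parse_metrics(line)
print(m["New tokens per second"])  # 6.34
```

Collecting these numbers over several runs makes it easier to compare build options or devices.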
+ +#### (Optional) In case you want to use the ONNX Runtime AAR you built + +Copy the ONNX Runtime AAR you built earlier in this learning path: + +```bash +copy onnxruntime\build\Windows\Release\java\build\android\outputs\aar\onnxruntime-release.aar mobile\examples\phi-3\android\app\libs +``` + +Update `build.gradle.kts (:app)` as below: + +``` kotlin +// ONNX Runtime with GenAI +//implementation("com.microsoft.onnxruntime:onnxruntime-android:latest.release") +implementation(files("libs/onnxruntime-release.aar")) +``` + +Finally, click **File > Sync Project with Gradle Files**. + +#### Build and run the app + +When you select **Run**, the project is built, and the app is then copied and installed on the Android device. This app will automatically download the Phi-3-mini model during the first run. After the download, you can input the prompt in the text box and execute it to run the model. + +You should now see a running app on your phone, which looks like this: + +![App screenshot](screenshot.png) diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_index.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_index.md new file mode 100644 index 0000000000..73ef3146ae --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_index.md @@ -0,0 +1,40 @@ +--- +title: Build an Android chat application with ONNX Runtime API + +minutes_to_complete: 60 + +who_is_this_for: This is an advanced topic for software developers interested in learning how to build an Android chat app with ONNX Runtime and ONNX Runtime Generate() API. + +learning_objectives: + - Build ONNX Runtime and ONNX Runtime generate() API for Android. + - Run a Phi-3 model using ONNX Runtime on an Arm-based smartphone. + +prerequisites: + - A Windows x86_64 development machine with at least 16GB of RAM. + - An Android phone with at least 8GB of RAM.
This learning path was tested on a Samsung Galaxy S24. + +author_primary: Koki Mitsunami + +### Tags +skilllevels: Advanced +subjects: ML +armips: + - Cortex-A + - Cortex-X +tools_software_languages: + - Kotlin + - C++ + - ONNX Runtime + - Android + - Mobile +operatingsystems: + - Windows + - Android + + +### FIXED, DO NOT MODIFY +# ================================================================================ +weight: 1 # _index.md always has weight of 1 to order correctly +layout: "learningpathall" # All files under learning paths have this same wrapper +learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content. +--- diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_next-steps.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_next-steps.md new file mode 100644 index 0000000000..3be9b43e25 --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_next-steps.md @@ -0,0 +1,27 @@ +--- +next_step_guidance: Now that you are familiar with building LLM applications with ONNX Runtime, you are ready to incorporate LLMs into your Android applications. You can learn how to further accelerate the performance of your LLMs using KleidiAI.
+ +recommended_path: /learning-paths/cross-platform/kleidiai-explainer/ + +further_reading: + - resource: + title: ONNX Runtime + link: https://onnxruntime.ai/docs/ + type: documentation + - resource: + title: ONNX Runtime generate() API + link: https://onnxruntime.ai/docs/genai/ + type: documentation + - resource: + title: Accelerating AI Developer Innovation Everywhere with New Arm Kleidi + link: https://newsroom.arm.com/blog/arm-kleidi + type: blog + + +# ================================================================================ +# FIXED, DO NOT MODIFY +# ================================================================================ +weight: 21 # set to always be larger than the content in this path, and one more than 'review' +title: "Next Steps" # Always the same +layout: "learningpathall" # All files under learning paths have this same wrapper +--- diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_review.md b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_review.md new file mode 100644 index 0000000000..10e994b600 --- /dev/null +++ b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/_review.md @@ -0,0 +1,44 @@ +--- +review: + - questions: + question: > + What is ONNX Runtime? + answers: + - A cross-platform inference engine for running machine learning models. + - A platform for training machine learning models from scratch. + - A cloud-based data storage service for deep learning models. + correct_answer: 1 + explanation: > + ONNX Runtime is a cross-platform inference engine designed to run machine learning models in the ONNX format. It optimizes model performance across various hardware environments, including CPUs, GPUs, and specialized accelerators. + + - questions: + question: > + What is Phi? + answers: + - A new optimization algorithm for neural networks. + - A family of pre-trained language models.
+ - A toolkit for converting machine learning models to ONNX format. + correct_answer: 2 + explanation: > + Phi models are a series of Large Language Models developed to perform natural language processing tasks such as text generation, completion and comprehension. + + - questions: + question: > + Why is ONNX format important in machine learning? + answers: + - It is a proprietary format developed exclusively for cloud-based AI systems. + - It compresses models to reduce memory usage during training. + - It allows models to be exchanged between different frameworks, such as PyTorch and TensorFlow. + correct_answer: 3 + explanation: > + The ONNX (Open Neural Network Exchange) format is an open-source standard designed to enable the sharing and use of machine learning models across different frameworks such as PyTorch and TensorFlow. It allows models to be exported in a unified format, making them interoperable and ensuring they can run on various platforms or hardware. + + + +# ================================================================================ +# FIXED, DO NOT MODIFY +# ================================================================================ +title: "Review" # Always the same title +weight: 20 # Set to always be larger than the content in this path +layout: "learningpathall" # All files under learning paths have this same wrapper +--- diff --git a/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/screenshot.png b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/screenshot.png new file mode 100644 index 0000000000..8c0724683b Binary files /dev/null and b/content/learning-paths/smartphones-and-mobile/build-android-chat-app-using-onnxruntime/screenshot.png differ diff --git a/tools/requirements.txt b/tools/requirements.txt index d28433e30f..43764fce73 100644 --- a/tools/requirements.txt +++ b/tools/requirements.txt @@ -2,3 +2,4 @@ junit-xml pyyaml inclusivewriting 
pyspellchecker +setuptools