From 7d7b7e72334fb33cdc3710403577c67b83532e88 Mon Sep 17 00:00:00 2001
From: Maddy Underwood
Date: Tue, 29 Apr 2025 21:43:08 +0000
Subject: [PATCH 01/12] Starting content review

---
 .../win_on_arm_build_onnxruntime/1-dev-env-setup.md | 2 +-
 .../3-build-onnxruntime-generate-api.md | 8 ++++----
 .../win_on_arm_build_onnxruntime/_index.md | 12 ++++--------
 3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md
index 2c689167ed..3487e5f44d 100644
--- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md
+++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md
@@ -1,5 +1,5 @@
 ---
-title: Development environment
+title: Set up your Environment
 weight: 2
 
 ### FIXED, DO NOT MODIFY

diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md
index ab91cb0146..894fd51fa8 100644
--- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md
+++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md
@@ -9,10 +9,10 @@ layout: learningpathall
 ## Compile the ONNX Runtime Generate() API for Windows on Arm
 
 The Generate() API in ONNX Runtime is designed for text generation tasks using models like Phi-3. It implements the generative AI loop for ONNX models, including:
-- pre- and post-processing
-- inference with ONNX Runtime- logits processing
-- search and sampling
-- KV cache management
+- Pre- and post-processing.
+- Inference with ONNX Runtime, including logits processing.
+- Search and sampling.
+- KV cache management.
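The generative loop that the bullets above describe can be sketched in plain Python. This is an illustrative toy only, not the onnxruntime-genai API: the "model" below just returns fixed logits over a 5-token vocabulary, and the KV cache is a bare list standing in for the real cached attention state.

```python
# Toy sketch of the Generate() loop: prompt ingestion, logits processing,
# greedy search, and a KV-cache stand-in. Illustrative only - not the
# onnxruntime-genai API. The vocabulary is just the integers 0..4.

def toy_model(token, kv_cache):
    """Pretend forward pass: returns logits favouring (token + 1) % 5."""
    kv_cache.append(token)          # the KV cache grows one entry per step
    logits = [0.0] * 5
    logits[(token + 1) % 5] = 1.0   # "logits processing" happens here
    return logits

def generate(prompt, max_new_tokens):
    kv_cache = []
    logits = None
    for token in prompt:            # pre-processing: feed the prompt tokens
        logits = toy_model(token, kv_cache)
    output = list(prompt)
    for _ in range(max_new_tokens): # search and sampling (greedy here)
        next_token = max(range(len(logits)), key=logits.__getitem__)
        output.append(next_token)
        logits = toy_model(next_token, kv_cache)
    return output

print(generate([0, 1], 3))          # -> [0, 1, 2, 3, 4]
```

The real API plays the same loop against an ONNX model's logits, with configurable search options instead of hard-coded greedy selection.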
You can learn more by reading the [ONNX Runtime Generate() API page](https://onnxruntime.ai/docs/genai/). diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md index 81ead9d54b..bb56d5c533 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md @@ -1,20 +1,16 @@ --- title: Run Phi-3 on a Windows on Arm machine with ONNX Runtime -draft: true -cascade: - draft: true - + minutes_to_complete: 60 -who_is_this_for: A deep-dive for advanced developers looking to build ONNX Runtime on Windows on Arm (WoA) and leverage the Generate() API to run Phi-3 inference with KleidiAI acceleration. +who_is_this_for: This is an advanced topic for developers looking to build ONNX Runtime on Windows on Arm (WoA) and leverage the Generate() API to run Phi-3 inference with KleidiAI acceleration. learning_objectives: - - Build ONNX Runtime and ONNX Runtime Generate() API for Windows on Arm. + - Build ONNX Runtime and the Generate() API for Windows on Arm. - Run a Phi-3 model using ONNX Runtime on a Windows on Arm laptop. - prerequisites: - - A Windows on Arm computer such as the Lenovo Thinkpad X13 running Windows 11 or a Windows on Arm [virtual machine](https://learn.arm.com/learning-paths/cross-platform/woa_azure/) + - A Windows on Arm computer, such as the Lenovo Thinkpad X13 running Windows 11, or a Windows on Arm [virtual machine](/learning-paths/cross-platform/woa_azure/). 
author: Barbara Corriero From d1346f93ebda013755b4fb59b7d0d07c1a723253 Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 13:57:30 +0000 Subject: [PATCH 02/12] Further updates --- .../1-dev-env-setup.md | 26 ++++++++++--------- .../2-build-onnxruntime.md | 16 ++++++++---- .../4-run-benchmark-on-WoA.md | 25 +++++++++++------- .../win_on_arm_build_onnxruntime/_index.md | 11 ++++---- 4 files changed, 46 insertions(+), 32 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md index 3487e5f44d..91240e2d97 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md @@ -8,9 +8,9 @@ layout: learningpathall ## Set up your development environment -In this learning path, you will learn how to build and deploy a LLM on a Windows on Arm (WoA) laptop using ONNX Runtime for inference. +In this learning path, you'll learn how to build and deploy an LLM on a Windows on Arm (WoA) laptop using ONNX Runtime for inference. -You will first learn how to build the ONNX Runtime and ONNX Runtime Generate() API library and then how to download the Phi-3 model and run the inference. You will run the short context (4k) mini (3.3B) variant of Phi 3 model. The short context version accepts a shorter (4K) prompts and produces shorter output text compared to the long (128K) context version. The short version will consume less memory. +You'll first learn how to build the ONNX Runtime and ONNX Runtime Generate() API library and then how to download the Phi-3 model and run the inference. You'll run the short context (4k) mini (3.3B) variant of Phi 3 model. 
The short context version accepts a shorter (4K) prompts and produces shorter output text compared to the long (128K) context version. The short version consumes less memory. Your first task is to prepare a development environment with the required software: @@ -24,30 +24,32 @@ The following instructions were tested on an WoA 64-bit Windows machine with at Follow these steps to install and configure Visual Studio 2022 IDE: -1. Download and install the latest version of [Visual Studio IDE](https://visualstudio.microsoft.com/downloads/). +1. Download the latest [Visual Studio IDE](https://visualstudio.microsoft.com/downloads/). 2. Select the **Community Version**. An installer called *VisualStudioSetup.exe* will be downloaded. -3. From your Downloads folder, double-click the installer to start the installation. +3. Run the downloaded installer (*VisualStudioSetup.exe*) from your Downloads folder. -4. Follow the prompts and acknowledge **License Terms** and **Privacy Statement**. +4. Follow the installation prompts and accept the **License Terms** and **Privacy Statement**. -5. Once "Downloaded" and "Installed" complete select your workloads. As a minimum you should select **Desktop Development with C++**. This will install the **Microsoft Visual Studio Compiler** or **MSVC**. +5. When prompted to select your workloads, select **Desktop Development with C++**. This includes **Microsoft Visual Studio Compiler** (**MSVC**). ## Install Python -Download and install [Python for Windows on Arm](/install-guides/py-woa) +Download and install [Python for Windows on Arm](/install-guides/py-woa). -You will need Python version 3.10 or higher. This learning path was tested with version 3.11.9. +{{% notice Note %}} +You'll need Python version 3.10 or higher. This Learning Path was tested with version 3.11.9. 
+{{% /notice %}} ## Install CMake -CMake is an open-source tool that automates the build process for software projects, helping to generate platform-specific build configurations. +CMake is an open-source tool that automates the build process and helps generate platform-specific build configurations. -[Download and install CMake](/install-guides/cmake) +Download and install [CMake for Windows on Arm](/install-guides/cmake). {{% notice Note %}} -The instructions were tested with version 3.30.5 +The instructions were tested with version 3.30.5. {{% /notice %}} -You now have the required development tools installed to follow this learning path. +You’re now ready to move on to building the ONNX Runtime and running inference with Phi-3. diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md index dba9e29407..a6ff45bbea 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md @@ -6,14 +6,20 @@ weight: 3 layout: learningpathall --- -## Compile ONNX Runtime for Windows on Arm -Now that you have your environment set up correctly, you can build the ONNX Runtime inference engine. +## Build ONNX Runtime for Windows on Arm +Now that your environment is set up, you're ready to build the ONNX Runtime inference engine. -ONNX Runtime is an open-source inference engine designed to accelerate the deployment of machine learning models, particularly those in the Open Neural Network Exchange (ONNX) format. ONNX Runtime is optimized for high performance and low latency, making it popular for production deployment of AI models. You can learn more by reading the [ONNX Runtime Overview](https://onnxruntime.ai/). 
+ONNX Runtime is an open-source inference engine for accelerating the deployment of machine learning models, particularly those in the Open Neural Network Exchange (ONNX) format. ONNX Runtime is optimized for high performance and low latency, widely used in the production deployment of AI models. -### Clone ONNX Runtime Repo +{{% notice Learning Tip %}} +You can learn more about ONNX Runtime by reading the [ONNX Runtime Overview](https://onnxruntime.ai/). +{{% /notice %}} + +### Clone the ONNX Runtime repository + +Open a developer command prompt for Visual Studio to set up the environment including path to compiler, linker, utilities and header files. -Open a Developer Command Prompt for Visual Studio to properly setup the environment including path to compiler, linker, utilities and header files. Create your workspace and check out the source tree: +Create your workspace and check out the source tree: ```bash cd C:\Users\%USERNAME% diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md index 5019c0ff53..05629933b6 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md @@ -6,26 +6,29 @@ weight: 5 layout: learningpathall --- -## Run a Phi-3 model on your Windows on Arm machine +## Run the Phi-3 model on your Windows on Arm machine -In this section, you will learn how to download the Phi3-mini model and run it on your Windows on Arm machine (physical or virtual machine). You will be use a simple model runner program which provides performance metrics. +In this section, you'll download the Phi-3 Mini model and run it on your Windows on Arm machine, which can be a physical or virtual machine. 
You'll be use a simple model runner program which provides performance metrics -The Phi-3-mini (3.3B) model has a short (4k) context version and a long (128k) context version. The long context version can accept much longer prompts and produces longer output text, but it consumes more memory. -In this learning path, you will use the short context version, which is quantized to 4-bits. +The Phi-3 Mini (3.3B) model has a short (4k) context version and a long (128k) context version. The long context version can accept much longer prompts and produces longer output text, but it consumes more memory. -The Phi-3-mini model used here is in an ONNX format. +In this learning path, you'll use the short context version, which is quantized to 4-bits. + +The Phi-3 Mini model used here is in ONNX format. ### Setup [Phi-3 ONNX models](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) are hosted on HuggingFace. -Hugging Face uses Git for version control and to download ONNX model files, which can be quite large. -You will first need to install the Git Large File Storage (LFS) extension. +Hugging Face uses Git for both version control and to download the ONNX model files, which can be quite large. + +You'll first need to install the Git Large File Storage (LFS) extension: ``` bash winget install -e --id GitHub.GitLFS git lfs install ``` If you don’t have winget, download and run the exe from the [official source](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage?platform=windows). + If the extension is already installed for you when you run the above ``git`` command it will say ``Git LFS initialized``. You then need to install the ``HuggingFace CLI``. @@ -34,7 +37,7 @@ You then need to install the ``HuggingFace CLI``. 
pip install huggingface-hub[cli] ``` -### Download the Phi-3-mini (4k) model +### Download the Phi-3-Mini-4K model ``` bash cd C:\Users\%USERNAME% @@ -44,8 +47,9 @@ huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include cpu_and This command downloads the model into a folder called `cpu_and_mobile`. ### Build model runner (ONNX Runtime GenAI C Example) -In the previous section you built ONNX RUntime Generate() API from source. -The headers and dynamic linked libraries that are built need to be copied over to appropriate folders (``lib`` and ``inclue``). + +In the previous section you built ONNX Runtime Generate() API from source. The headers and dynamic linked libraries that are built need to be copied over to appropriate folders (``lib`` and ``inclue``). + Building from source is a better practice because the examples usually are updated to run with the latest changes. ``` bash @@ -66,6 +70,7 @@ cmake --build . --config Release ``` After a successful build, a binary program called `phi3` will be created in the ''onnxruntime-genai'' folder: + ```output dir Release\phi3.exe ``` diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md index bb56d5c533..ca0e1890d4 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md @@ -1,16 +1,16 @@ --- -title: Run Phi-3 on a Windows on Arm machine with ONNX Runtime +title: Run Phi-3 on Windows on Arm using ONNX Runtime minutes_to_complete: 60 -who_is_this_for: This is an advanced topic for developers looking to build ONNX Runtime on Windows on Arm (WoA) and leverage the Generate() API to run Phi-3 inference with KleidiAI acceleration. 
+who_is_this_for: This is an advanced topic for developers looking to build ONNX Runtime for Windows on Arm (WoA) and leverage the Generate() API to run Phi-3 inference with KleidiAI acceleration. learning_objectives: - - Build ONNX Runtime and the Generate() API for Windows on Arm. - - Run a Phi-3 model using ONNX Runtime on a Windows on Arm laptop. + - Build ONNX Runtime and enable the Generate() API for Windows on Arm. + - Run inference with a Phi-3 model using ONNX Runtime with KleidiAI acceleration. prerequisites: - - A Windows on Arm computer, such as the Lenovo Thinkpad X13 running Windows 11, or a Windows on Arm [virtual machine](/learning-paths/cross-platform/woa_azure/). + - A Windows on Arm computer such as a Lenovo Thinkpad X13 running Windows 11, or a Windows on Arm [virtual machine](/learning-paths/cross-platform/woa_azure/). author: Barbara Corriero @@ -26,6 +26,7 @@ tools_software_languages: - Python - Git - cmake + - ONNX Runtime operatingsystems: - Windows From e28610c5b629e8642e2d52373c6e07ba1adabe51 Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 14:16:50 +0000 Subject: [PATCH 03/12] Further updates --- .../win_on_arm_build_onnxruntime/1-dev-env-setup.md | 6 +++--- .../win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md index 91240e2d97..3fb22e381c 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md @@ -18,7 +18,7 @@ Your first task is to prepare a development environment with the required softwa - Python 3.10 or higher - CMake 3.28 or higher -The following instructions were tested on an WoA 64-bit Windows machine with at 
least 16GB of RAM. +The following instructions were tested on a WoA 64-bit Windows machine with at least 16GB of RAM. ## Install Visual Studio 2022 IDE @@ -26,9 +26,9 @@ Follow these steps to install and configure Visual Studio 2022 IDE: 1. Download the latest [Visual Studio IDE](https://visualstudio.microsoft.com/downloads/). -2. Select the **Community Version**. An installer called *VisualStudioSetup.exe* will be downloaded. +2. Select the **Community** edition. An installer called *VisualStudioSetup.exe* will be downloaded. -3. Run the downloaded installer (*VisualStudioSetup.exe*) from your Downloads folder. +3. Run the downloaded installer (*VisualStudioSetup.exe*) from your **Downloads** folder. 4. Follow the installation prompts and accept the **License Terms** and **Privacy Statement**. diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md index 05629933b6..d3ff0f5086 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md @@ -1,5 +1,5 @@ --- -title: Run Phi3 model on a Windows on Arm machine +title: Run Phi3 Model weight: 5 ### FIXED, DO NOT MODIFY From 389641d2ea3c48416c932ef14e313b1a95ae1c17 Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 16:03:32 +0000 Subject: [PATCH 04/12] Further updates --- .../win_on_arm_build_onnxruntime/1-dev-env-setup.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md index 3fb22e381c..381e8b4b56 100644 --- 
a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md @@ -6,9 +6,9 @@ weight: 2 layout: learningpathall --- -## Set up your development environment +## Overview -In this learning path, you'll learn how to build and deploy an LLM on a Windows on Arm (WoA) laptop using ONNX Runtime for inference. +In this Learning Path, you'll learn build and deploy a large language model (LLM) on a Windows on Arm (WoA) laptop using ONNX Runtime for inference. You'll first learn how to build the ONNX Runtime and ONNX Runtime Generate() API library and then how to download the Phi-3 model and run the inference. You'll run the short context (4k) mini (3.3B) variant of Phi 3 model. The short context version accepts a shorter (4K) prompts and produces shorter output text compared to the long (128K) context version. The short version consumes less memory. From be2cd83cdbe0e34550803bcd845a2d59572b5f9b Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 16:20:09 +0000 Subject: [PATCH 05/12] dev env done --- .../1-dev-env-setup.md | 38 ++++++++++++------- 1 file changed, 24 insertions(+), 14 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md index 381e8b4b56..656373a62a 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md @@ -10,29 +10,39 @@ layout: learningpathall In this Learning Path, you'll learn build and deploy a large language model (LLM) on a Windows on Arm (WoA) laptop using ONNX Runtime for inference. 
-You'll first learn how to build the ONNX Runtime and ONNX Runtime Generate() API library and then how to download the Phi-3 model and run the inference. You'll run the short context (4k) mini (3.3B) variant of Phi 3 model. The short context version accepts a shorter (4K) prompts and produces shorter output text compared to the long (128K) context version. The short version consumes less memory. +You'll learn how to: -Your first task is to prepare a development environment with the required software: +* Build ONNX Runtime and the Generate() API library. +* Download the Phi-3 model and run inference. +* Run the short-context (4K) Mini (3.3B) variant of Phi 3 model. -- Visual Studio 2022 IDE (latest version recommended) -- Python 3.10 or higher -- CMake 3.28 or higher +{{% notice Note %}} +The short-context version accepts shorter (4K) prompts and generates shorter outputs than the long-context (128K) version. It also consumes less memory. +{{% /notice %}} + +## Set up your Development Environment + +Your first task is to prepare a development environment with the required software. Start by installing the required tools: + +- Visual Studio 2022 IDE (latest version recommended). +- Python 3.10 or higher. +- CMake 3.28 or higher. -The following instructions were tested on a WoA 64-bit Windows machine with at least 16GB of RAM. +These instructions were tested on a 64-bit WoA machine with at least 16GB of RAM. -## Install Visual Studio 2022 IDE +## Install and Configure Visual Studio 2022 -Follow these steps to install and configure Visual Studio 2022 IDE: +Follow these steps: 1. Download the latest [Visual Studio IDE](https://visualstudio.microsoft.com/downloads/). -2. Select the **Community** edition. An installer called *VisualStudioSetup.exe* will be downloaded. +2. Select the **Community** edition. This downloads an installer called *VisualStudioSetup.exe*. -3. Run the downloaded installer (*VisualStudioSetup.exe*) from your **Downloads** folder. +3. 
Run the installer (*VisualStudioSetup.exe*) from your **Downloads** folder. -4. Follow the installation prompts and accept the **License Terms** and **Privacy Statement**. +4. Follow the prompts and accept the **License Terms** and **Privacy Statement**. -5. When prompted to select your workloads, select **Desktop Development with C++**. This includes **Microsoft Visual Studio Compiler** (**MSVC**). +5. When prompted to select workloads, select **Desktop Development with C++**. This installs the **Microsoft Visual Studio Compiler** (**MSVC**). ## Install Python @@ -44,7 +54,7 @@ You'll need Python version 3.10 or higher. This Learning Path was tested with ve ## Install CMake -CMake is an open-source tool that automates the build process and helps generate platform-specific build configurations. +CMake is an open-source tool that automates the build process and generates platform-specific build configurations. Download and install [CMake for Windows on Arm](/install-guides/cmake). @@ -52,4 +62,4 @@ Download and install [CMake for Windows on Arm](/install-guides/cmake). The instructions were tested with version 3.30.5. {{% /notice %}} -You’re now ready to move on to building the ONNX Runtime and running inference with Phi-3. +You’re now ready to build ONNX Runtime and run inference using the Phi-3 model. 
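The version requirements above can be sanity-checked from a Python prompt. The helper below is a hypothetical sketch, not part of the Learning Path's tooling: `meets_minimum` simply compares dotted version strings numerically.

```python
# Sketch of a pre-flight check for the tool versions listed above.
# Hypothetical helper, not part of the official instructions.
import shutil
import sys

def meets_minimum(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, e.g. '3.30.5' >= '3.28'."""
    as_tuple = lambda s: tuple(int(part) for part in s.split("."))
    return as_tuple(version) >= as_tuple(minimum)

print("Python 3.10+:", sys.version_info >= (3, 10))
print("CMake on PATH:", shutil.which("cmake") is not None)
print(meets_minimum("3.30.5", "3.28"))   # -> True (the tested CMake version)
```

A plain string comparison would get this wrong ("3.9" > "3.10" lexically), which is why the helper converts each component to an integer first.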
From fb0cb0747e39229a99c30052beb0e68f18901d53 Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 16:51:43 +0000 Subject: [PATCH 06/12] Build done --- .../2-build-onnxruntime.md | 23 +++++++++++-------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md index a6ff45bbea..b048634db1 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md @@ -9,7 +9,9 @@ layout: learningpathall ## Build ONNX Runtime for Windows on Arm Now that your environment is set up, you're ready to build the ONNX Runtime inference engine. -ONNX Runtime is an open-source inference engine for accelerating the deployment of machine learning models, particularly those in the Open Neural Network Exchange (ONNX) format. ONNX Runtime is optimized for high performance and low latency, widely used in the production deployment of AI models. +ONNX Runtime is an open-source engine for accelerating machine learning model inference, especially those in the Open Neural Network Exchange (ONNX) format. + +ONNX Runtime is optimized for high performance and low latency, and is widely used in production deployments. {{% notice Learning Tip %}} You can learn more about ONNX Runtime by reading the [ONNX Runtime Overview](https://onnxruntime.ai/). @@ -17,9 +19,9 @@ You can learn more about ONNX Runtime by reading the [ONNX Runtime Overview](htt ### Clone the ONNX Runtime repository -Open a developer command prompt for Visual Studio to set up the environment including path to compiler, linker, utilities and header files. +Open a command prompt for Visual Studio to set up the environment. 
This includes paths to the compiler, linker, utilities, and header files.
 
-Create your workspace and check out the source tree:
+Then, create your workspace and clone the repository:
 
 ```bash
 cd C:\Users\%USERNAME%
@@ -34,17 +36,18 @@ git checkout 4eeefd7260b7fa42a71dd1a08b423d5e7c722050
 You might be able to use a later commit. These steps have been tested with the commit `4eeefd7260b7fa42a71dd1a08b423d5e7c722050`.
 {{% /notice %}}
 
-### Build for Windows
+### Build ONNX Runtime
+
+To build the ONNX Runtime shared library, use one of the following configurations:
 
-You can build the "Release" configuration for a build optimized for performance but without debug information.
+- **Release** configuration for a build optimized for performance but without debug information:
 
 ```bash
 .\build.bat --config Release --build_shared_lib --parallel --compile_no_warning_as_error --skip_submodule_sync --skip_tests
 ```
-
-As an alternative, you can build with "RelWithDebInfo" configuration for a release-optimized build with debug information.
+- **RelWithDebInfo** configuration, which includes debug symbols for profiling or inspection:
 
 ```bash
 .\build.bat --config RelWithDebInfo --build_shared_lib --parallel --compile_no_warning_as_error --skip_submodule_sync --skip_tests
 ```
 
 ### Resulting Dynamic Link Library
 
-When the build is complete, the `onnxruntime.dll` dynamic linked library can be found in:
+When the build is complete, you'll find the `onnxruntime.dll` dynamic linked library in:
+
+* For **Release** build:
 
 ```
 dir .\build\Windows\Release\Release\onnxruntime.dll
 ```
 
-or if you build with debug information it can be found in:
+* For **RelWithDebInfo** build:
 
 ```
 dir .\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxruntime.dll
 ```

From 19df8b6e79aa99c6ed5c5bd9fc7257234e6a7a0a Mon Sep 17 00:00:00 2001
From: Maddy Underwood
Date: Wed, 30 Apr 2025 18:18:59 +0000
Subject: [PATCH 07/12] Updates

---
 .../4-run-benchmark-on-WoA.md | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md
index d3ff0f5086..787d7e7caa 100644
--- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md
+++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md
@@ -8,18 +8,23 @@ layout: learningpathall
 
 ## Run the Phi-3 model on your Windows on Arm machine
 
-In this section, you'll download the Phi-3 Mini model and run it on your Windows on Arm machine, which can be a physical or virtual machine. You'll be use a simple model runner program which provides performance metrics
+In this section, you'll download the Phi-3 Mini model and run it on your WoA machine - either physical or virtual.
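The two DLL locations described above differ only by configuration name, so a small helper can probe both. This is a hypothetical sketch, not part of the build scripts; it assumes the default layout that `build.bat` produces, `build\Windows\<config>\<config>\onnxruntime.dll`.

```python
# Sketch: locate onnxruntime.dll under either build configuration.
# Hypothetical helper - assumes build.bat's default output layout.
from pathlib import Path

CONFIGS = ("Release", "RelWithDebInfo")

def dll_candidates(build_root):
    """Candidate paths: build\\Windows\\<config>\\<config>\\onnxruntime.dll."""
    return [Path(build_root) / "Windows" / c / c / "onnxruntime.dll" for c in CONFIGS]

def find_onnxruntime_dll(build_root):
    """Return the first candidate that exists on disk, else None."""
    for candidate in dll_candidates(build_root):
        if candidate.is_file():
            return candidate
    return None

print(dll_candidates("build")[0].as_posix())  # -> build/Windows/Release/Release/onnxruntime.dll
```

Call `find_onnxruntime_dll` with the `build` directory of your onnxruntime checkout after either build completes.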
You'll use a simple model runner that also reports performance metrics. -The Phi-3 Mini (3.3B) model has a short (4k) context version and a long (128k) context version. The long context version can accept much longer prompts and produces longer output text, but it consumes more memory. +The Phi-3 Mini (3.3B) model is available in two versions: -In this learning path, you'll use the short context version, which is quantized to 4-bits. +- Short context (4K) - supports shorter prompts and uses less memory. +- Long context (128K) - supports longer prompts and outputs but consumes more memory. + +This Learning Path uses the short context version, which is quantized to 4-bits. The Phi-3 Mini model used here is in ONNX format. ### Setup [Phi-3 ONNX models](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) are hosted on HuggingFace. -Hugging Face uses Git for both version control and to download the ONNX model files, which can be quite large. +Hugging Face uses Git for both version control and to download the ONNX model files, which are large. + +### Install Git LFS You'll first need to install the Git Large File Storage (LFS) extension: @@ -27,12 +32,11 @@ You'll first need to install the Git Large File Storage (LFS) extension: winget install -e --id GitHub.GitLFS git lfs install ``` -If you don’t have winget, download and run the exe from the [official source](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage?platform=windows). +If you don’t have winget, [download the installer manually](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage?platform=windows). If the extension is already installed for you when you run the above ``git`` command it will say ``Git LFS initialized``. -You then need to install the ``HuggingFace CLI``. 
- +You then need to install the ``HuggingFace CLI`` ``` bash pip install huggingface-hub[cli] ``` From 1233b554aec0227e3309f776b06b27f9c6e21239 Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 18:30:51 +0000 Subject: [PATCH 08/12] Updates --- .../4-run-benchmark-on-WoA.md | 20 ++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md index 787d7e7caa..094207e047 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md @@ -34,27 +34,29 @@ git lfs install ``` If you don’t have winget, [download the installer manually](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage?platform=windows). -If the extension is already installed for you when you run the above ``git`` command it will say ``Git LFS initialized``. +If Git LFS is already installed, you'll see ``Git LFS initialized``. -You then need to install the ``HuggingFace CLI`` +### Install Hugging Face CLI + +You then need to install the ``HuggingFace CLI``: ``` bash pip install huggingface-hub[cli] ``` -### Download the Phi-3-Mini-4K model +### Download the Phi-3-Mini (4K) model ``` bash cd C:\Users\%USERNAME% cd repos\lp huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* --local-dir . ``` -This command downloads the model into a folder called `cpu_and_mobile`. +This command downloads the model into a folder named `cpu_and_mobile`. 
-### Build model runner (ONNX Runtime GenAI C Example) +### Build the Model Runner (ONNX Runtime GenAI C Example) -In the previous section you built ONNX Runtime Generate() API from source. The headers and dynamic linked libraries that are built need to be copied over to appropriate folders (``lib`` and ``inclue``). +In the previous step, you built the ONNX Runtime Generate() API from source. Now, copy over the resulting headers and dynamic linked libraries into the appropriate folders (``lib`` and ``include``). -Building from source is a better practice because the examples usually are updated to run with the latest changes. +Building from source is a better practice because the examples usually are updated to run with the latest changes: ``` bash copy onnxruntime\build\Windows\Release\Release\onnxruntime.* onnxruntime-genai\examples\c\lib @@ -73,7 +75,7 @@ cd build cmake --build . --config Release ``` -After a successful build, a binary program called `phi3` will be created in the ''onnxruntime-genai'' folder: +After a successful build, the binary `phi3` will be created in the `onnxruntime-genai` folder: ```output dir Release\phi3.exe ``` #### Run the model -Use the runner you just built to execute the model with the following commands: +Execute the model using the following command: ``` bash cd C:\Users\%USERNAME% From 2bac82c9b52eda35b0537e66558732eb434aa39c Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 18:48:31 +0000 Subject: [PATCH 09/12] Updates --- .../1-dev-env-setup.md | 20 +++++++++++-------- .../2-build-onnxruntime.md | 8 ++++---- 2 files changed, 16 insertions(+), 12 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md index 656373a62a..0e7179532e 100644 ---
a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md @@ -8,9 +8,9 @@ layout: learningpathall ## Overview -In this Learning Path, you'll learn build and deploy a large language model (LLM) on a Windows on Arm (WoA) laptop using ONNX Runtime for inference. +In this Learning Path, you'll learn how to build and deploy a large language model (LLM) on a Windows on Arm (WoA) machine using ONNX Runtime for inference. -You'll learn how to: +Specifically, you'll learn how to: * Build ONNX Runtime and the Generate() API library. * Download the Phi-3 model and run inference. @@ -22,25 +22,29 @@ The short-context version accepts shorter (4K) prompts and generates shorter out ## Set up your Development Environment -Your first task is to prepare a development environment with the required software. Start by installing the required tools: +Your first task is to prepare a development environment with the required software. -- Visual Studio 2022 IDE (latest version recommended). +Start by installing the required tools: + +- Visual Studio 2022 IDE (the latest version available is recommended). - Python 3.10 or higher. - CMake 3.28 or higher. +{{% notice Note %}} These instructions were tested on a 64-bit WoA machine with at least 16GB of RAM. +{{% /notice %}} ## Install and Configure Visual Studio 2022 -Follow these steps: +Now, to install and configure Visual Studio, follow these steps: 1. Download the latest [Visual Studio IDE](https://visualstudio.microsoft.com/downloads/). -2. Select the **Community** edition. This downloads an installer called *VisualStudioSetup.exe*. +2. Select the **Community** edition. This downloads an installer called `VisualStudioSetup.exe`. -3. Run the installer (*VisualStudioSetup.exe*) from your **Downloads** folder. +3. Run `VisualStudioSetup.exe` from your **Downloads** folder. -4. 
Follow the prompts and accept the **License Terms** and **Privacy Statement**. +4. Follow the prompts and accept the License Terms and Privacy Statement. 5. When prompted to select workloads, select **Desktop Development with C++**. This installs the **Microsoft Visual Studio Compiler** (**MSVC**). diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md index b048634db1..21e67fc0ae 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md @@ -19,7 +19,7 @@ You can learn more about ONNX Runtime by reading the [ONNX Runtime Overview](htt ### Clone the ONNX Runtime repository -Open a command prompt for Visual Studio to set up the environment. This includes paths to the compiler, linker, utilities, and header files. +Open a command prompt for Visual Studio to set up the environment, which includes paths to the compiler, linker, utilities, and header files. Then, create your workspace and clone the repository: @@ -40,14 +40,14 @@ You might be able to use a later commit. 
These steps have been tested with the c To build the ONNX Runtime shared library, use one of the following configurations: --**Release** configuration for a build optimized for performance but without debug information: +* **Release** configuration, for a build optimized for performance but without debug information: ```bash .\build.bat --config Release --build_shared_lib --parallel --compile_no_warning_as_error --skip_submodule_sync --skip_tests ``` -**RelWithDebInfo** configuration, which includes debug symbols for profiling or inspection: +* **RelWithDebInfo** configuration, which includes debug symbols for profiling or inspection: ```bash .\build.bat --config RelWithDebInfo --build_shared_lib --parallel --compile_no_warning_as_error --skip_submodule_sync --skip_tests @@ -55,7 +55,7 @@ To build the ONNX Runtime shared library, use one of the following configuration ### Resulting Dynamic Link Library -When the build is complete, you'll find the `onnxruntime.dll` dynamic linked library in: +When the build is complete, you'll find the `onnxruntime.dll` dynamic linked library in the following respective directories: * For **Release** build: From e0319b11325f35ccfc5508ff208304e6e33fc705 Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 19:13:11 +0000 Subject: [PATCH 10/12] Updates --- .../1-dev-env-setup.md | 2 +- .../3-build-onnxruntime-generate-api.md | 26 +++++++++++-------- .../4-run-benchmark-on-WoA.md | 2 +- 3 files changed, 17 insertions(+), 13 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md index 0e7179532e..70b741063d 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/1-dev-env-setup.md @@ -20,7 +20,7 @@ Specifically, you'll learn how to: The 
short-context version accepts shorter (4K) prompts and generates shorter outputs than the long-context (128K) version. It also consumes less memory. {{% /notice %}} -## Set up your Development Environment +## Set up your development environment Your first task is to prepare a development environment with the required software. diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md index 894fd51fa8..fe4dd6dcd7 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md @@ -6,21 +6,23 @@ weight: 4 layout: learningpathall --- -## Compile the ONNX Runtime Generate() API for Windows on Arm +## Build the ONNX Runtime Generate() API for Windows on Arm The Generate() API in ONNX Runtime is designed for text generation tasks using models like Phi-3. It implements the generative AI loop for ONNX models, including: - Pre- and post-processing. -- Inference with ONNX Runtime- logits processing. +- Inference with ONNX Runtime (including logits processing). - Search and sampling. - KV cache management. -You can learn more by reading the [ONNX Runtime Generate() API page](https://onnxruntime.ai/docs/genai/). +{{% notice Learning Tip %}} +You can learn more about this area by reading the [ONNX Runtime Generate() API documentation](https://onnxruntime.ai/docs/genai/). +{{% /notice %}} -In this section you will learn how to build the Generate API() from source. +In this section, you'll build the Generate API() from source. 
-### Clone onnxruntime-genai Repo -Within your Windows Developer Command Prompt for Visual Studio, checkout the source repo: +### Clone the onnxruntime-genai repository +From your **Windows Developer Command Prompt for Visual Studio**, clone the repository and check out the following tested commit: ```bash cd C:\Users\%USERNAME% @@ -35,18 +37,20 @@ You might be able to use later commits. These steps have been tested with the co {{% /notice %}} ### Build for Windows on Arm -The build command below has a ---config argument, which takes the following options: -- ```Release``` builds release build -- ```Debug``` builds binaries with debug symbols -- ```RelWithDebInfo``` builds release binaries with debug info +The build script uses a ---config argument, which supports the following options: +- ```Release``` builds release build. +- ```Debug``` builds binaries with debug symbols. +- ```RelWithDebInfo``` builds release binaries with debug info. -You will build the `Release` variant of the ONNX Runtime Generate() API: +To build the `Release` variant of the ONNX Runtime Generate() API: ```bash pip install requests python build.py --config Release --skip_tests ``` +### Verify the output + When the build is complete, confirm the ONNX Runtime Generate() API Dynamic Link Library has been created: ```output diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md index 094207e047..ea0e2981b3 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md @@ -81,7 +81,7 @@ After a successful build, the binary `phi3` will be created in the ''onnxruntime dir Release\phi3.exe ``` -#### Run the model +### Run the model Execute the model using the following command: From 
d1b28a1afc641aa2c0ee85ac5abadd4bcf4461d4 Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 19:25:17 +0000 Subject: [PATCH 11/12] Final tweaks --- .../win_on_arm_build_onnxruntime/2-build-onnxruntime.md | 4 ++-- .../3-build-onnxruntime-generate-api.md | 6 +++--- .../win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md | 2 +- .../win_on_arm_build_onnxruntime/_index.md | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md index 21e67fc0ae..819192f9d6 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/2-build-onnxruntime.md @@ -54,8 +54,8 @@ To build the ONNX Runtime shared library, use one of the following configuration ``` -### Resulting Dynamic Link Library -When the build is complete, you'll find the `onnxruntime.dll` dynamic linked library in the following respective directories: +### Resulting Dynamically Linked Library +When the build is complete, you'll find the `onnxruntime.dll` dynamically linked library in the following respective directories: * For **Release** build: diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md index fe4dd6dcd7..fe28c46f6f 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/3-build-onnxruntime-generate-api.md @@ -18,7 +18,7 @@ The Generate() API in ONNX Runtime is designed for text generation tasks using m You can learn more about this area by 
reading the [ONNX Runtime Generate() API documentation](https://onnxruntime.ai/docs/genai/). {{% /notice %}} -In this section, you'll build the Generate API() from source. +In this section, you'll build the Generate() API from source. ### Clone the onnxruntime-genai repository @@ -37,7 +37,7 @@ You might be able to use later commits. These steps have been tested with the co {{% /notice %}} ### Build for Windows on Arm -The build script uses a ---config argument, which supports the following options: +The build script uses a --config argument, which supports the following options: - ```Release``` builds release build. - ```Debug``` builds binaries with debug symbols. - ```RelWithDebInfo``` builds release binaries with debug info. @@ -51,7 +51,7 @@ python build.py --config Release --skip_tests ### Verify the output -When the build is complete, confirm the ONNX Runtime Generate() API Dynamic Link Library has been created: +When the build is complete, confirm the ONNX Runtime Generate() API Dynamically Linked Library has been created: ```output dir build\Windows\Release\Release\onnxruntime-genai.dll diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md index ea0e2981b3..477178aada 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md @@ -54,7 +54,7 @@ This command downloads the model into a folder named `cpu_and_mobile`. ### Build the Model Runner (ONNX Runtime GenAI C Example) -In the previous step, you built the ONNX Runtime Generate() API from source. Now, copy over the resulting headers and dynamic linked libraries into the appropriate folders (``lib`` and ``include``). +In the previous step, you built the ONNX Runtime Generate() API from source. 
Now, copy over the resulting headers and Dynamically Linked Libraries into the appropriate folders (``lib`` and ``include``). Building from source is a better practice because the examples usually are updated to run with the latest changes: diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md index ca0e1890d4..9f072da1de 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/_index.md @@ -36,7 +36,7 @@ further_reading: link: https://onnxruntime.ai/docs/ type: documentation - resource: - title: ONNX Runtime generate() API + title: ONNX Runtime Generate() API link: https://onnxruntime.ai/docs/genai/ type: documentation - resource: From f62508e36273966ed0831b2921c10f1468e7a38b Mon Sep 17 00:00:00 2001 From: Maddy Underwood Date: Wed, 30 Apr 2025 19:26:44 +0000 Subject: [PATCH 12/12] Correction --- .../win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md index 477178aada..9a91ec8bf9 100644 --- a/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md +++ b/content/learning-paths/laptops-and-desktops/win_on_arm_build_onnxruntime/4-run-benchmark-on-WoA.md @@ -19,7 +19,7 @@ This Learning Path uses the short context version, which is quantized to 4-bits. The Phi-3 Mini model used here is in ONNX format. -### Setup +### Set up [Phi-3 ONNX models](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) are hosted on HuggingFace. 
Hugging Face uses Git both for version control and for downloading the ONNX model files, which are large.
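The `cpu-int4-rtn-block-32` variant downloaded in these patches takes its name from its quantization scheme: round-to-nearest (RTN) 4-bit quantization applied per block of 32 weights. A minimal, hypothetical Python sketch of the idea (the helper names are illustrative, not the actual ONNX Runtime implementation, which also packs two 4-bit codes into each byte):

```python
# Illustrative sketch only: shows what "int4-rtn-block-32" in the model name
# means. Helper names are hypothetical, not the ONNX Runtime quantizer API.

def quantize_block_int4(block):
    """Round-to-nearest (RTN) signed 4-bit quantization of one weight block."""
    scale = max(abs(v) for v in block) / 7.0 or 1.0  # map largest magnitude to 7
    return [max(-8, min(7, round(v / scale))) for v in block], scale

def dequantize_block(codes, scale):
    """Recover approximate float weights from 4-bit codes and the block scale."""
    return [c * scale for c in codes]

# One block of 32 float weights, matching the "block-32" part of the name.
weights = [0.5, -1.2, 3.1, 0.0, 2.2, -0.7, 1.9, -3.0] * 4
codes, scale = quantize_block_int4(weights)

assert all(-8 <= c <= 7 for c in codes)   # every code fits in a signed 4-bit int
restored = dequantize_block(codes, scale)
print(max(abs(w - r) for w, r in zip(weights, restored)) <= scale / 2)  # True
```

Each block stores thirty-two 4-bit codes plus one float scale, which is where the roughly four-fold size reduction relative to 16-bit weights comes from.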