content/learning-paths/servers-and-cloud-computing/onnx/_index.md (+8 −7)
@@ -1,24 +1,25 @@
 ---
-title: Run Phi-3.5 Vision Model with ONNX Runtime on Microsoft Azure Cobalt 100 VMs
+title: Deploy Phi-3.5 Vision with ONNX Runtime on Azure Cobalt 100 on Arm
+
 
 draft: true
 cascade:
   draft: true
 
 minutes_to_complete: 30
 
-who_is_this_for: This is an advanced topic for software developers, ML engineers, and cloud practitioners looking to deploy Microsoft's Phi Models on Arm-based servers using ONNX Runtime.
+who_is_this_for: This is an advanced topic for developers, ML engineers, and cloud practitioners looking to deploy Microsoft's Phi Models on Arm-based servers using ONNX Runtime.
 
 learning_objectives:
-    - Install ONNX Runtime, download and quantize the Phi-3.5 vision model.
-    - Run the Phi-3.5 model with ONNX Runtime on Azure.
+    - Quantize and run the Phi-3.5 vision model with ONNX Runtime on Azure.
     - Analyze performance on Arm Neoverse-N2 based Azure Cobalt 100 VMs.
 
 prerequisites:
-    - An [Armbased instance](/learning-paths/servers-and-cloud-computing/csp/) from an appropriate cloud service provider. This Learning Path has been tested on a Microsoft Azure Cobalt 100 virtual machine with 32 cores, 8GB of RAM, and 32GB of disk space.
+    - An [Arm-based instance](/learning-paths/servers-and-cloud-computing/csp/) from an appropriate cloud service provider. This Learning Path has been tested on a Microsoft Azure Cobalt 100 virtual machine with 32 cores, 8GB of RAM, and 32GB of disk space.
     - Basic understanding of Python and machine learning concepts.
     - Familiarity with ONNX Runtime and Azure cloud services.
-    - Knowledge of LLM (Large Language Model) fundamentals.
+    - Knowledge of Large Language Model (LLM) fundamentals.
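
The revised learning objective folds download and quantization into a single step. For orientation, a minimal sketch of what that step typically looks like with the `onnxruntime-genai` model builder is shown below; the module invocation, model ID, output path, and flags are assumptions for illustration, not taken from this PR:

```bash
# Hypothetical example: download and quantize Phi-3.5 vision to 4-bit ONNX
# with the onnxruntime-genai model builder. The model ID, output directory,
# and precision flag are assumptions, not confirmed by this diff.
python3 -m onnxruntime_genai.models.builder \
    -m microsoft/Phi-3.5-vision-instruct \
    -o ./phi-3.5-vision-int4 \
    -p int4 \
    -e cpu
```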
-After downloading the image, input the image prompt along with the image name, and enter the text prompt as demonstrated in the example below:
+## Try an image + text prompt
+
+After downloading the image, provide the image file name when prompted, followed by the text prompt, as demonstrated in the example below:
 
-## Observe Performance Metrics
+## Observe performance metrics
 
 As shown in the example above, the LLM Chatbot performs inference at a speed of **44 tokens/second**, with the time to first token being approximately **1 second**. This highlights the efficiency and responsiveness of the LLM Chatbot in processing queries and generating outputs.
 
-## Further Interaction and Custom Applications
+## Further interaction and custom applications
 
 You can continue interacting with the chatbot by asking follow-up prompts and observing the performance metrics displayed in the terminal.
 
-This setup demonstrates how to build and configure applications using the Phi3.5 model for text generation with both text and image inputs. It also showcases the optimized performance of running Phi models on Arm CPUs, emphasizing the significant performance gains achieved through this workflow.
+This setup shows how to build applications using the Phi-3.5 model for multimodal generation from text and image inputs. It also highlights the performance benefits of running Phi models on Arm CPUs.
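
The performance paragraph quotes 44 tokens/second with roughly a one-second time to first token. A quick way to sanity-check numbers like these from the terminal is to time a single run; the script name and flag below mirror the `onnxruntime-genai` Python examples but are assumptions, since the PR's actual invocation is not visible in this diff:

```bash
# Hypothetical timing wrapper: measure wall-clock time for one inference run.
# "phi3v.py" and its -m flag are assumptions, not confirmed by this diff.
time python3 phi3v.py -m ./phi-3.5-vision-int4
# tokens/second ≈ generated tokens / generation time;
# e.g. 220 tokens generated in 5 s ≈ 44 tokens/second.
```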
content/learning-paths/servers-and-cloud-computing/onnx/setup.md (+26 −14)
@@ -1,27 +1,35 @@
 ---
 # User change
-title: "Build ONNX Runtime and setup Phi-3.5 vision model"
+title: "Build ONNX Runtime and set up the Phi-3.5 Vision Model"
 
 weight: 2
 
 # Do not modify these elements
 layout: "learningpathall"
 ---
+## Overview
 
-In this Learning Path you will learn how to run quantized Phi models using ONNX Runtime on Microsoft Azure Cobalt 100 servers using ONNX Runtime. Specifically, you will deploy the Phi 3.5 vision model on Arm-based servers running Ubuntu 24.04 LTS. The instructions have been tested on an Azure `Dpls_v6` 32 core instance.
+In this Learning Path, you'll run quantized Phi models with ONNX Runtime on Microsoft Azure Cobalt 100 servers.
+
+Specifically, you'll deploy the Phi-3.5 vision model on Arm-based servers running Ubuntu 24.04 LTS.
+
+{{% notice Note %}}
+These instructions have been tested on a 32-core Azure `Dpls_v6` instance.
+{{% /notice %}}
 
-## Overview
 
 You will learn how to build and configure ONNX Runtime to enable efficient LLM inference on Arm CPUs.
 
-The tutorial covers the following steps:
-- Building ONNX Runtime, quantizing and converting the Phi 3.5 vision model to the ONNX format.
-- Running the model using a Python script with ONNX Runtime to perform LLM inference on the CPU.
-- Analyzing the performance.
+This Learning Path walks you through the following tasks:
+- Build ONNX Runtime.
+- Quantize and convert the Phi-3.5 vision model to ONNX format.
+- Run the model using a Python script with ONNX Runtime for CPU-based LLM inference.
+- Analyze performance on Arm CPUs.
 
 ## Install dependencies
 
-Install the following packages on your Arm-based server instance:
+On your Arm-based server, install the following packages:
 
 ```bash
 sudo apt update
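
The `apt` block is truncated at this hunk boundary, so the full package list is not visible in the diff. A typical dependency set for building ONNX Runtime from source on Ubuntu looks like the sketch below; treat the exact package names as assumptions rather than the PR's actual list:

```bash
# Hypothetical package set for building ONNX Runtime on Ubuntu 24.04;
# the packages actually installed by this Learning Path are not shown above.
sudo apt update
sudo apt install -y build-essential cmake git python3 python3-pip python3-venv
```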
@@ -30,18 +38,17 @@ Install the following packages on your Arm-based server instance:
 
 ## Create a requirements file
 
-Use a file editor of your choice and create a `requirements.txt` file will the python packages shown below:
+Use a file editor of your choice and create a `requirements.txt` file with the Python packages shown below:
 
 ```python
 requests
 torch
 transformers
 accelerate
 huggingface-hub
-pyreadline3
 ```
 
-## Install Python Dependencies
+## Install Python dependencies
 
 Create a virtual environment:
 ```bash
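
The virtual-environment block is cut off at the end of this hunk. A conventional completion, assuming the standard `venv` workflow and the `requirements.txt` created above (the environment name here is invented), would be:

```bash
# Illustrative sketch: the exact commands in the Learning Path are
# truncated in this diff, and the "onnx-env" name is an assumption.
python3 -m venv onnx-env
source onnx-env/bin/activate
pip install -r requirements.txt
```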
@@ -68,13 +75,18 @@ Clone and build the `onnxruntime-genai` repository, which includes the Kleidi AI
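
The body of this last hunk is not included in the capture. For orientation, cloning and building `onnxruntime-genai` from source usually follows the pattern below; the repository URL and build command are standard for that project but should be read as assumptions here:

```bash
# Hypothetical build sketch for onnxruntime-genai; confirm the URL and
# flags against the Learning Path itself, since this hunk's body is
# missing from the capture.
git clone https://github.com/microsoft/onnxruntime-genai.git
cd onnxruntime-genai
python3 build.py --config Release
```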