Commit 5bad00a
Merge pull request #545 from v-thepet/onnx
Freshness 8 - 180 days freshness updates
2 parents 58d3248 + f084316

1 file changed: 34 additions and 37 deletions

---
title: ONNX Runtime and models
titleSuffix: Azure Machine Learning
description: Learn how using the Open Neural Network Exchange (ONNX) can help optimize inference of your machine learning models.
services: machine-learning
ms.service: azure-machine-learning
ms.subservice: core
ms.topic: concept-article
ms.author: mopeakande
author: msakande
ms.reviewer: kritifaujdar
ms.date: 09/30/2024
#customer intent: As a data scientist, I want to learn about ONNX so I can use it to optimize the inference of my machine learning models.
---

# ONNX and Azure Machine Learning

This article describes how the [Open Neural Network Exchange (ONNX)](https://onnx.ai) can help optimize the *inference* of your machine learning models. Inference, or model scoring, is the process of using a deployed model to generate predictions on production data.

Optimizing machine learning models for inference requires you to tune the model and the inference library to make the most of hardware capabilities. This task becomes complex if you want to get optimal performance on different platforms such as cloud, edge, CPU, or GPU, because each platform has different capabilities and characteristics. The complexity increases if you need to run models from various frameworks on different platforms. It can be time-consuming to optimize all the different combinations of frameworks and hardware.

A useful solution is to train your model one time in your preferred framework, and then export or convert it to ONNX so it can run anywhere on the cloud or edge. Microsoft and a community of partners created ONNX as an open standard for representing machine learning models. You can export or convert models from [many frameworks](https://onnx.ai/supported-tools) to the standard ONNX format. Supported frameworks include TensorFlow, PyTorch, scikit-learn, Keras, Chainer, MXNet, and MATLAB. You can run models in the ONNX format on various platforms and devices.
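
To make the export step concrete, here's a minimal sketch (an illustration, not something this article prescribes) that exports a small PyTorch model to the ONNX format with `torch.onnx.export`; the model, file name, and shapes are placeholders:

```python
import torch

# Any trained torch.nn.Module works the same way; this tiny network is a stand-in.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 2),
)
model.eval()

# Export traces the model with a sample input and writes a .onnx file.
dummy_input = torch.randn(1, 10)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}},  # allow a variable batch size
)
```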

This ONNX flow diagram shows available frameworks and deployment options.

:::image type="content" source="media/concept-onnx/onnx.png" alt-text="ONNX flow diagram showing training, converters, and deployment." border="false" lightbox="media/concept-onnx/onnx.png":::

## ONNX Runtime

[ONNX Runtime](https://onnxruntime.ai) is a high-performance inference engine for deploying ONNX models to production. ONNX Runtime is optimized for both cloud and edge, and works on Linux, Windows, and macOS. ONNX is written in C++, but also has C, Python, C#, Java, and JavaScript (Node.js) APIs to use in those environments.

ONNX Runtime supports both deep neural networks (DNN) and traditional machine learning models, and it integrates with accelerators on different hardware such as TensorRT on Nvidia GPUs, OpenVINO on Intel processors, and DirectML on Windows. By using ONNX Runtime, you can benefit from extensive production-grade optimizations, testing, and ongoing improvements.
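
In the Python API, this hardware integration surfaces as *execution providers*. As a hedged illustration (provider availability depends on your installed package and hardware, and `model.onnx` is a placeholder path), you can list the providers on a machine and request a GPU provider with a CPU fallback:

```python
import onnxruntime

# Show which execution providers this build of ONNX Runtime can use.
print(onnxruntime.get_available_providers())

# Prefer CUDA if present, and fall back to CPU otherwise.
session = onnxruntime.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # providers actually in use for this session
```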

High-scale Microsoft services such as Bing, Office, and Azure AI use ONNX Runtime. Although performance gains depend on many factors, these Microsoft services report an average 2x performance gain on CPU by using ONNX. ONNX Runtime runs in Azure Machine Learning and other Microsoft products that support machine learning workloads, including:

- **Windows**. ONNX Runtime is built into Windows as part of [Windows Machine Learning](/windows/ai/windows-ml/) and runs on hundreds of millions of devices.
- **Azure SQL**. [Azure SQL Edge](/azure/azure-sql-edge/onnx-overview) and [Azure SQL Managed Instance](/azure/azure-sql/managed-instance/machine-learning-services-overview) use ONNX to run native scoring on data.
- **ML.NET**. For an example, see [Tutorial: Detect objects using ONNX in ML.NET](/dotnet/machine-learning/tutorials/object-detection-onnx).

## Ways to obtain ONNX models

You can obtain ONNX models in several ways:

- [Train a new ONNX model in Azure Machine Learning](https://github.com/onnx/onnx/tree/main/examples) or use [automated machine learning capabilities](concept-automated-ml.md#automl--onnx).
- Convert an existing model from another format to ONNX, as sketched after this list. For more information, see [ONNX Tutorials](https://github.com/onnx/tutorials).
- Get a pretrained ONNX model from the [ONNX Model Zoo](https://github.com/onnx/models).
- Generate a customized ONNX model from [Azure AI Custom Vision service](/azure/ai-services/custom-vision-service/).
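
As one hedged example of the conversion path, the following sketch uses the community `skl2onnx` converter (an assumption for illustration; it isn't the only converter) to turn a scikit-learn classifier into an ONNX file:

```python
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a simple scikit-learn model on the iris dataset.
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=500).fit(X, y)

# Declare the input name and shape; None marks a variable batch dimension.
onnx_model = convert_sklearn(
    clf, initial_types=[("input", FloatTensorType([None, 4]))]
)

with open("iris.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```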

You can represent many models as ONNX, including image classification, object detection, and text processing models. If you can't convert your model successfully, file a GitHub issue in the repository of the converter you used.

## ONNX model deployment in Azure

You can deploy, manage, and monitor your ONNX models in Azure Machine Learning. Using a standard [MLOps deployment workflow](concept-model-management-and-deployment.md#deploy-models-as-endpoints) with ONNX Runtime, you can create a REST endpoint hosted in the cloud.
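
A deployment like this typically wraps ONNX Runtime in a scoring script that defines `init` and `run` functions. The following is a minimal sketch under stated assumptions: the registered model file is named `model.onnx`, its folder is exposed through the `AZUREML_MODEL_DIR` environment variable, and the JSON request shape is illustrative rather than prescribed by this article:

```python
import json
import os

import numpy as np
import onnxruntime

session = None


def init():
    # Azure ML sets AZUREML_MODEL_DIR to the folder containing the registered model.
    global session
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.onnx")
    session = onnxruntime.InferenceSession(model_path)


def run(raw_data):
    # Expects a payload like: {"data": [[...feature values...], ...]}
    data = np.array(json.loads(raw_data)["data"], dtype=np.float32)
    input_name = session.get_inputs()[0].name
    results = session.run(None, {input_name: data})  # None returns all outputs
    return results[0].tolist()
```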

## Python packages for ONNX Runtime

Python packages for [CPU](https://pypi.org/project/onnxruntime) and [GPU](https://pypi.org/project/onnxruntime-gpu) ONNX Runtime are available on [PyPi.org](https://pypi.org). Be sure to review the [system requirements](https://github.com/Microsoft/onnxruntime#system-requirements) before installation.

To install ONNX Runtime for Python, use one of the following commands:

```python
pip install onnxruntime       # CPU build
pip install onnxruntime-gpu   # GPU build
```

To call ONNX Runtime in your Python script, use the following code:

```python
import onnxruntime
session = onnxruntime.InferenceSession("path to model")
```

The documentation accompanying the model usually tells you the inputs and outputs for using the model. You can also use a visualization tool such as [Netron](https://github.com/lutzroeder/Netron) to view the model.

ONNX Runtime lets you query the model metadata, inputs, and outputs, as follows:

```python
session.get_modelmeta()
first_input_name = session.get_inputs()[0].name
first_output_name = session.get_outputs()[0].name
```

To perform inferencing on your model, use `run` and pass in the list of outputs you want returned and a map of the input values. Leave the output list empty if you want all of the outputs. The result is a list of the outputs.

```python
results = session.run(["output1", "output2"], {
    "input1": indata1, "input2": indata2})
results = session.run([], {"input1": indata1, "input2": indata2})
```
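
Putting these pieces together, here's a hedged end-to-end sketch that loads a model, inspects its first input, and runs it on random NumPy data; `model.onnx` and the concrete batch size are placeholders for your own model:

```python
import numpy as np
import onnxruntime

# Load the model and inspect its first input to build a matching input feed.
session = onnxruntime.InferenceSession("model.onnx")
input_meta = session.get_inputs()[0]
print(input_meta.name, input_meta.shape, input_meta.type)

# Replace any symbolic (dynamic) dimensions, such as the batch size, with 1.
shape = [dim if isinstance(dim, int) else 1 for dim in input_meta.shape]
indata = np.random.rand(*shape).astype(np.float32)

# Passing None (or an empty list) for the outputs returns all of them.
outputs = session.run(None, {input_meta.name: indata})
print(outputs[0])
```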

For the complete ONNX Runtime API reference, see the [Python API documentation](https://onnxruntime.ai/docs/api/python/api_summary.html).

## Related content

- [ONNX project website](https://onnx.ai)
- [ONNX GitHub repository](https://github.com/onnx/onnx)
- [ONNX Runtime project website](https://onnxruntime.ai)
- [ONNX Runtime GitHub repository](https://github.com/Microsoft/onnxruntime)
