Commit 324a016 ("Edits")

1 parent: 0448c8f
File tree: 1 file changed (+145, −19 lines)

articles/machine-learning/how-to-deploy-with-triton.md

Lines changed: 145 additions & 19 deletions
@@ -29,17 +29,32 @@ Learn how to use [NVIDIA Triton Inference Server](https://aka.ms/nvidia-triton-d

Triton is multi-framework, open-source software that is optimized for inference. It supports popular machine learning frameworks like TensorFlow, ONNX Runtime, PyTorch, NVIDIA TensorRT, and more. It can be used for your CPU or GPU workloads.

In this article, you learn how to deploy Triton and a model to a managed online endpoint. Information is provided on using the CLI (command line), the Python SDK v2, and Azure Machine Learning studio.

> [!NOTE]
> * [NVIDIA Triton Inference Server](https://aka.ms/nvidia-triton-docs) is open-source third-party software that is integrated in Azure Machine Learning.
> * While Azure Machine Learning online endpoints are generally available, _using Triton with an online endpoint deployment is still in preview_.

## Prerequisites

# [Azure CLI](#tab/azure-cli)

[!INCLUDE [basic prereqs](../../includes/machine-learning-cli-prereqs.md)]

* A working Python 3.8 (or higher) environment.

* Additional Python packages are required for scoring. You can install them with the commands below. They include:
  * NumPy - An array and numerical computing library
  * [Triton Inference Server Client](https://github.com/triton-inference-server/client) - Facilitates requests to the Triton Inference Server
  * Pillow - A library for image operations
  * Gevent - A networking library used when connecting to the Triton server

```azurecli
pip install numpy
pip install tritonclient[http]
pip install pillow
pip install gevent
```

* Access to NCv3-series VMs for your Azure subscription.

@@ -52,11 +67,57 @@ NVIDIA Triton Inference Server requires a specific model repository structure, w

The information in this document is based on using a model stored in ONNX format, so the directory structure of the model repository is `<model-repository>/<model-name>/1/model.onnx`. Specifically, this model performs image identification.
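To make that layout concrete, here's a minimal sketch that builds such a repository with Python's standard library. The model name `densenet_onnx` is illustrative (not from this article), and in practice you'd copy your exported ONNX file into place rather than create an empty stub:

```python
from pathlib import Path

# Triton expects <model-repository>/<model-name>/<version>/model.onnx.
# "densenet_onnx" is an illustrative model name; "1" is the model version.
version_dir = Path("model_repository") / "densenet_onnx" / "1"
version_dir.mkdir(parents=True, exist_ok=True)

# In practice, copy your exported ONNX model here instead of touching a stub.
(version_dir / "model.onnx").touch()
```

By default, Triton serves the highest-numbered version directory it finds under each model name.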

# [Python](#tab/python)

[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]

[!INCLUDE [basic prereqs](../../includes/machine-learning-cli-prereqs.md)]

* A working Python 3.8 (or higher) environment.

* Additional Python packages are required for scoring. You can install them with the commands below. They include:
  * NumPy - An array and numerical computing library
  * [Triton Inference Server Client](https://github.com/triton-inference-server/client) - Facilitates requests to the Triton Inference Server
  * Pillow - A library for image operations
  * Gevent - A networking library used when connecting to the Triton server

```azurecli
pip install numpy
pip install tritonclient[http]
pip install pillow
pip install gevent
```

* Access to NCv3-series VMs for your Azure subscription.

> [!IMPORTANT]
> You may need to request a quota increase for your subscription before you can use this series of VMs. For more information, see [NCv3-series](../virtual-machines/ncv3-series.md).

[!INCLUDE [clone repo & set defaults](../../includes/machine-learning-cli-prepare.md)]

The information in this article is based on the [Deploy a model to online endpoints using Triton](https://github.com/Azure/azureml-examples/blob/main/sdk/endpoints/online/triton/single-model/online-endpoints-triton.ipynb) notebook in the [azureml-examples](https://github.com/azure/azureml-examples) repository. To run the commands locally without having to copy and paste files, clone the repo and change to the `sdk/endpoints/online/triton/single-model` directory:

```azurecli
git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/sdk/endpoints/online/triton/single-model
```

# [Studio](#tab/azure-studio)

* An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* An Azure Machine Learning workspace. If you don't have one, use the steps in [Manage Azure Machine Learning workspaces in the portal or with the Python SDK](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-workspace?tabs=azure-portal) to create one.

## Define the deployment configuration
# [Azure CLI](#tab/azure-cli)

[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]

This section shows how you can deploy to a managed online endpoint using the Azure CLI with the Machine Learning extension (v2).

> [!IMPORTANT]
> For Triton no-code-deployment, **[testing via local endpoints](how-to-deploy-managed-online-endpoints.md#deploy-and-debug-locally-by-using-local-endpoints)** is currently not supported.
@@ -71,26 +132,13 @@ This section shows how you can deploy Triton to managed online endpoint using th

:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-triton-managed-online-endpoint.sh" ID="set_endpoint_name":::

1. Create a YAML configuration file for your endpoint. The following example configures the name and authentication mode of the endpoint. The one used in the following commands is located at `/cli/endpoints/online/triton/single-model/create-managed-endpoint.yml` in the azureml-examples repo you cloned earlier:

__create-managed-endpoint.yaml__

:::code language="yaml" source="~/azureml-examples-main/cli/endpoints/online/triton/single-model/create-managed-endpoint.yaml":::

1. Create a YAML configuration file for the deployment. The following example configures a deployment named __blue__ to the endpoint defined in the previous step. The one used in the following commands is located at `/cli/endpoints/online/triton/single-model/create-managed-deployment.yml` in the azureml-examples repo you cloned earlier:

> [!IMPORTANT]
> For Triton no-code-deployment (NCD) to work, setting **`type`** to **`triton_model`** is required: `type: triton_model`. For more information, see [CLI (v2) model YAML schema](reference-yaml-model.md).
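For orientation, a deployment YAML with the required model type might look like the following hypothetical sketch; the endpoint and model names here are illustrative, and the `create-managed-deployment.yaml` file in the repo is the authoritative version:

```yaml
name: blue
endpoint_name: my-triton-endpoint    # illustrative name
model:
  name: sample-triton-model          # illustrative name
  version: 1
  path: ./models
  type: triton_model                 # required for Triton no-code-deployment
instance_type: Standard_NC6s_v3
instance_count: 1
```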
@@ -99,11 +147,89 @@ This section shows how you can deploy Triton to managed online endpoint using th

:::code language="yaml" source="~/azureml-examples-main/cli/endpoints/online/triton/single-model/create-managed-deployment.yaml":::
# [Python](#tab/python)

[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]

This section shows how you can define a Triton deployment to deploy to a managed online endpoint using the Azure Machine Learning Python SDK (v2).

> [!IMPORTANT]
> For Triton no-code-deployment, **[testing via local endpoints](how-to-deploy-managed-online-endpoints.md#deploy-and-debug-locally-by-using-local-endpoints)** is currently not supported.

1. To connect to a workspace, you need three identifier parameters: the subscription ID, the resource group name, and the workspace name.

```python
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace_name = "<AML_WORKSPACE_NAME>"
```
1. Use the following code to set the name of the endpoint that will be created. In this example, a random name is created for the endpoint:

```python
import random

endpoint_name = f"endpoint-{random.randint(0, 10000)}"
```
1. Use these details in `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. This example uses the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python). Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id,
    resource_group,
    workspace_name,
)
```
1. Create a `ManagedOnlineEndpoint` object to configure the endpoint. The following example configures the name and authentication mode of the endpoint.

```python
from azure.ai.ml.entities import ManagedOnlineEndpoint

endpoint = ManagedOnlineEndpoint(name=endpoint_name, auth_mode="key")
```
1. Create a `ManagedOnlineDeployment` object to configure the deployment. The following example configures a deployment named __blue__ to the endpoint defined in the previous step and defines a local model inline.

```python
from azure.ai.ml.entities import ManagedOnlineDeployment, Model

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=Model(path="./models", type="triton_model"),
    instance_type="Standard_NC6s_v3",
    instance_count=1,
)
```
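With the endpoint and deployment objects defined, the creation calls would look roughly like the following sketch. It assumes the `ml_client`, `endpoint`, and `deployment` objects from the previous steps and a live Azure connection, so it is not runnable offline; the traffic split shown is an assumption for a single-deployment endpoint.

```python
# Sketch: submit the endpoint and deployment defined above to Azure.
# These are long-running operations; .result() blocks until they finish.
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
ml_client.online_deployments.begin_create_or_update(deployment).result()

# Route all traffic to the "blue" deployment (assumes it is the only one).
endpoint.traffic = {"blue": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
```
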
# [Studio](#tab/azure-studio)

This section shows how you can define a Triton deployment on a managed online endpoint using [Azure Machine Learning studio](https://ml.azure.com).

## Deploy to Azure
1. To create a new endpoint using the YAML configuration, use the following command:

:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-triton-managed-online-endpoint.sh" ID="create_endpoint":::

1. To create the deployment using the YAML configuration, use the following command:

:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-triton-managed-online-endpoint.sh" ID="create_deployment":::

### Test your endpoint

Once your deployment completes, use the following command to make a scoring request to the deployed endpoint.
