Commit c80325f

Merge pull request #268833 from cdpark/azureml-custom-dem108
User Story 226300: Q&M: AzureML Freshness updates - Custom container
2 parents ff396e8 + 9100390 commit c80325f


articles/machine-learning/how-to-deploy-custom-container.md

Lines changed: 35 additions & 35 deletions
@@ -8,7 +8,7 @@ ms.subservice: inferencing
 author: dem108
 ms.author: sehan
 ms.reviewer: mopeakande
-ms.date: 10/13/2022
+ms.date: 03/26/2024
 ms.topic: how-to
 ms.custom: deploy, devplatv2, devx-track-azurecli, cliv2, sdkv2
 ms.devlang: azurecli
@@ -18,40 +18,40 @@ ms.devlang: azurecli
 
 [!INCLUDE [dev v2](includes/machine-learning-dev-v2.md)]
 
-Learn how to use a custom container for deploying a model to an online endpoint in Azure Machine Learning.
+Learn how to use a custom container to deploy a model to an online endpoint in Azure Machine Learning.
 
 Custom container deployments can use web servers other than the default Python Flask server used by Azure Machine Learning. Users of these deployments can still take advantage of Azure Machine Learning's built-in monitoring, scaling, alerting, and authentication.
 
 The following table lists various [deployment examples](https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/custom-container) that use custom containers such as TensorFlow Serving, TorchServe, Triton Inference Server, Plumber R package, and Azure Machine Learning Inference Minimal image.
 
-|Example|Script (CLI)|Description|
+|Example|Script (CLI)|Description|
 |-------|------|---------|
 |[minimal/multimodel](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/minimal/multimodel)|[deploy-custom-container-minimal-multimodel](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-minimal-multimodel.sh)|Deploy multiple models to a single deployment by extending the Azure Machine Learning Inference Minimal image.|
 |[minimal/single-model](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/minimal/single-model)|[deploy-custom-container-minimal-single-model](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-minimal-single-model.sh)|Deploy a single model by extending the Azure Machine Learning Inference Minimal image.|
 |[mlflow/multideployment-scikit](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/mlflow/multideployment-scikit)|[deploy-custom-container-mlflow-multideployment-scikit](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-mlflow-multideployment-scikit.sh)|Deploy two MLFlow models with different Python requirements to two separate deployments behind a single endpoint using the Azure Machine Learning Inference Minimal Image.|
 |[r/multimodel-plumber](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/r/multimodel-plumber)|[deploy-custom-container-r-multimodel-plumber](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-r-multimodel-plumber.sh)|Deploy three regression models to one endpoint using the Plumber R package|
-|[tfserving/half-plus-two](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/tfserving/half-plus-two)|[deploy-custom-container-tfserving-half-plus-two](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-tfserving-half-plus-two.sh)|Deploy a simple Half Plus Two model using a TensorFlow Serving custom container using the standard model registration process.|
-|[tfserving/half-plus-two-integrated](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/tfserving/half-plus-two-integrated)|[deploy-custom-container-tfserving-half-plus-two-integrated](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-tfserving-half-plus-two-integrated.sh)|Deploy a simple Half Plus Two model using a TensorFlow Serving custom container with the model integrated into the image.|
+|[tfserving/half-plus-two](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/tfserving/half-plus-two)|[deploy-custom-container-tfserving-half-plus-two](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-tfserving-half-plus-two.sh)|Deploy a Half Plus Two model using a TensorFlow Serving custom container using the standard model registration process.|
+|[tfserving/half-plus-two-integrated](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/tfserving/half-plus-two-integrated)|[deploy-custom-container-tfserving-half-plus-two-integrated](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-tfserving-half-plus-two-integrated.sh)|Deploy a Half Plus Two model using a TensorFlow Serving custom container with the model integrated into the image.|
 |[torchserve/densenet](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/torchserve/densenet)|[deploy-custom-container-torchserve-densenet](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-torchserve-densenet.sh)|Deploy a single model using a TorchServe custom container.|
 |[torchserve/huggingface-textgen](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/torchserve/huggingface-textgen)|[deploy-custom-container-torchserve-huggingface-textgen](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-torchserve-huggingface-textgen.sh)|Deploy Hugging Face models to an online endpoint and follow along with the Hugging Face Transformers TorchServe example.|
 |[triton/single-model](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/triton/single-model)|[deploy-custom-container-triton-single-model](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-triton-single-model.sh)|Deploy a Triton model using a custom container|
 
 This article focuses on serving a TensorFlow model with TensorFlow (TF) Serving.
 
 > [!WARNING]
-> Microsoft may not be able to help troubleshoot problems caused by a custom image. If you encounter problems, you may be asked to use the default image or one of the images Microsoft provides to see if the problem is specific to your image.
+> Microsoft might not be able to help troubleshoot problems caused by a custom image. If you encounter problems, you might be asked to use the default image or one of the images Microsoft provides to see if the problem is specific to your image.
 
 ## Prerequisites
 
 [!INCLUDE [cli & sdk](includes/machine-learning-cli-sdk-v2-prereqs.md)]
 
-* You, or the service principal you use, must have `Contributor` access to the Azure Resource Group that contains your workspace. You'll have such a resource group if you configured your workspace using the quickstart article.
+* You, or the service principal you use, must have *Contributor* access to the Azure resource group that contains your workspace. You have such a resource group if you configured your workspace using the quickstart article.
 
-* To deploy locally, you must have [Docker engine](https://docs.docker.com/engine/install/) running locally. This step is **highly recommended**. It will help you debug issues.
+* To deploy locally, you must have [Docker engine](https://docs.docker.com/engine/install/) running locally. This step is **highly recommended**. It helps you debug issues.
 
 ## Download source code
 
-To follow along with this tutorial, download the source code below.
+To follow along with this tutorial, clone the source code from GitHub.
 
 # [Azure CLI](#tab/cli)
 
@@ -64,10 +64,10 @@ cd azureml-examples/cli
 
 ```azurecli
 git clone https://github.com/Azure/azureml-examples --depth 1
-cd azureml-examples/sdk
+cd azureml-examples/cli
 ```
 
-See also [the example notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/online/custom-container/online-endpoints-custom-container.ipynb) but note that `3. Test locally` section in the notebook assumes to run under the `azureml-examples/sdk` directory.
+See also [the example notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/online/custom-container/online-endpoints-custom-container.ipynb), but note that `3. Test locally` section in the notebook assumes that it runs under the `azureml-examples/sdk` directory.
 
 ---
 
@@ -91,7 +91,7 @@ Use docker to run your image locally for testing:
 
 ### Check that you can send liveness and scoring requests to the image
 
-First, check that the container is "alive," meaning that the process inside the container is still running. You should get a 200 (OK) response.
+First, check that the container is *alive*, meaning that the process inside the container is still running. You should get a 200 (OK) response.
 
 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="check_liveness_locally":::
 
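To make the liveness and scoring checks above concrete: the half_plus_two example model returns x/2 + 2 for each input instance, so a request with instances [1.0, 2.0, 5.0] should score as [2.5, 3.0, 4.5]. The sketch below reproduces that arithmetic locally; the endpoint path and port in the comment are TF Serving defaults, and the helper function is hypothetical, not part of any SDK or of the referenced script.

```python
# Hypothetical local sketch (not part of any SDK). A real scoring request would
# POST {"instances": [...]} to the running container, for example to
# http://localhost:8501/v1/models/half_plus_two:predict (TF Serving's default REST port).

def half_plus_two(instances):
    """Mirror what the example model computes: y = x / 2 + 2."""
    return [x / 2 + 2 for x in instances]

request_body = {"instances": [1.0, 2.0, 5.0]}
print(half_plus_two(request_body["instances"]))  # [2.5, 3.0, 4.5]
```

Comparing the container's actual response against this expected output is a quick way to confirm the server is serving the model you think it is.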
@@ -101,11 +101,12 @@ Then, check that you can get predictions about unlabeled data:
 
 ### Stop the image
 
-Now that you've tested locally, stop the image:
+Now that you tested locally, stop the image:
 
 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="stop_image":::
 
 ## Deploy your online endpoint to Azure
+
 Next, deploy your online endpoint to Azure.
 
 # [Azure CLI](#tab/cli)
@@ -125,7 +126,8 @@ __tfserving-deployment.yml__
 # [Python SDK](#tab/python)
 
 ### Connect to Azure Machine Learning workspace
-Connect to Azure Machine Learning Workspace, configure workspace details, and get a handle to the workspace as follows:
+
+Connect to your Azure Machine Learning workspace, configure workspace details, and get a handle to the workspace as follows:
 
 1. Import the required libraries:
 
@@ -183,7 +185,7 @@ endpoint = ManagedOnlineEndpoint(
 
 ### Configure online deployment
 
-A deployment is a set of resources required for hosting the model that does the actual inferencing. We'll create a deployment for our endpoint using the `ManagedOnlineDeployment` class.
+A deployment is a set of resources required for hosting the model that does the actual inferencing. Create a deployment for our endpoint using the `ManagedOnlineDeployment` class.
 
 > [!TIP]
 > - `name` - Name of the deployment.
@@ -229,23 +231,23 @@ There are a few important concepts to notice in this YAML/Python parameter:
 
 #### Readiness route vs. liveness route
 
-An HTTP server defines paths for both _liveness_ and _readiness_. A liveness route is used to check whether the server is running. A readiness route is used to check whether the server is ready to do work. In machine learning inference, a server could respond 200 OK to a liveness request before loading a model. The server could respond 200 OK to a readiness request only after the model has been loaded into memory.
+An HTTP server defines paths for both _liveness_ and _readiness_. A liveness route is used to check whether the server is running. A readiness route is used to check whether the server is ready to do work. In machine learning inference, a server could respond 200 OK to a liveness request before loading a model. The server could respond 200 OK to a readiness request only after the model is loaded into memory.
 
-Review the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) for more information about liveness and readiness probes.
+For more information about liveness and readiness probes, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/).
 
 Notice that this deployment uses the same path for both liveness and readiness, since TF Serving only defines a liveness route.
 
 #### Locating the mounted model
 
-When you deploy a model as an online endpoint, Azure Machine Learning _mounts_ your model to your endpoint. Model mounting enables you to deploy new versions of the model without having to create a new Docker image. By default, a model registered with the name *foo* and version *1* would be located at the following path inside of your deployed container: `/var/azureml-app/azureml-models/foo/1`
+When you deploy a model as an online endpoint, Azure Machine Learning _mounts_ your model to your endpoint. Model mounting allows you to deploy new versions of the model without having to create a new Docker image. By default, a model registered with the name *foo* and version *1* would be located at the following path inside of your deployed container: */var/azureml-app/azureml-models/foo/1*
 
-For example, if you have a directory structure of `/azureml-examples/cli/endpoints/online/custom-container` on your local machine, where the model is named `half_plus_two`:
+For example, if you have a directory structure of */azureml-examples/cli/endpoints/online/custom-container* on your local machine, where the model is named *half_plus_two*:
 
 :::image type="content" source="./media/how-to-deploy-custom-container/local-directory-structure.png" alt-text="Diagram showing a tree view of the local directory structure.":::
 
 # [Azure CLI](#tab/cli)
 
-and `tfserving-deployment.yml` contains:
+And *tfserving-deployment.yml* contains:
 
 ```yaml
 model:
@@ -256,26 +258,26 @@ model:
 
 # [Python SDK](#tab/python)
 
-and `Model` class contains:
+And `Model` class contains:
 
 ```python
 model = Model(name="tfserving-mounted", version="1", path="half_plus_two")
 ```
 
 ---
 
-then your model will be located under `/var/azureml-app/azureml-models/tfserving-deployment/1` in your deployment:
+Then your model will be located under */var/azureml-app/azureml-models/tfserving-deployment/1* in your deployment:
 
 :::image type="content" source="./media/how-to-deploy-custom-container/deployment-location.png" alt-text="Diagram showing a tree view of the deployment directory structure.":::
 
-You can optionally configure your `model_mount_path`. It enables you to change the path where the model is mounted.
+You can optionally configure your `model_mount_path`. It lets you change the path where the model is mounted.
 
 > [!IMPORTANT]
 > The `model_mount_path` must be a valid absolute path in Linux (the OS of the container image).
 
 # [Azure CLI](#tab/cli)
 
-For example, you can have `model_mount_path` parameter in your _tfserving-deployment.yml_:
+For example, you can have `model_mount_path` parameter in your *tfserving-deployment.yml*:
 
 ```YAML
 name: tfserving-deployment
@@ -305,37 +307,35 @@ blue_deployment = ManagedOnlineDeployment(
 
 ---
 
-then your model will be located at `/var/tfserving-model-mount/tfserving-deployment/1` in your deployment. Note that it's no longer under `azureml-app/azureml-models`, but under the mount path you specified:
+Then your model is located at */var/tfserving-model-mount/tfserving-deployment/1* in your deployment. Note that it's no longer under *azureml-app/azureml-models*, but under the mount path you specified:
 
 :::image type="content" source="./media/how-to-deploy-custom-container/mount-path-deployment-location.png" alt-text="Diagram showing a tree view of the deployment directory structure when using mount_model_path.":::
 
 ### Create your endpoint and deployment
 
 # [Azure CLI](#tab/cli)
 
-Now that you've understood how the YAML was constructed, create your endpoint.
+Now that you understand how the YAML was constructed, create your endpoint.
 
 ```azurecli
 az ml online-endpoint create --name tfserving-endpoint -f endpoints/online/custom-container/tfserving-endpoint.yml
 ```
 
-Creating a deployment might take few minutes.
+Creating a deployment might take a few minutes.
 
 ```azurecli
 az ml online-deployment create --name tfserving-deployment -f endpoints/online/custom-container/tfserving-deployment.yml --all-traffic
 ```
 
-
-
 # [Python SDK](#tab/python)
 
-Using the `MLClient` created earlier, we'll now create the Endpoint in the workspace. This command will start the endpoint creation and return a confirmation response while the endpoint creation continues.
+Using the `MLClient` created earlier, create the endpoint in the workspace. This command starts the endpoint creation and returns a confirmation response while the endpoint creation continues.
 
 ```python
 ml_client.begin_create_or_update(endpoint)
 ```
 
-Create the deployment by running as well.
+Create the deployment by running:
 
 ```python
 ml_client.begin_create_or_update(blue_deployment)
@@ -353,12 +353,12 @@ Once your deployment completes, see if you can make a scoring request to the dep
 
 # [Python SDK](#tab/python)
 
-Using the `MLClient` created earlier, we'll get a handle to the endpoint. The endpoint can be invoked using the `invoke` command with the following parameters:
+Using the `MLClient` created earlier, you get a handle to the endpoint. The endpoint can be invoked using the `invoke` command with the following parameters:
 - `endpoint_name` - Name of the endpoint
 - `request_file` - File with request data
 - `deployment_name` - Name of the specific deployment to test in an endpoint
 
-We'll send a sample request using a json file. The sample json is in the [example repository](https://github.com/Azure/azureml-examples/tree/main/sdk/python/endpoints/online/custom-container).
+Send a sample request using a JSON file. The sample JSON is in the [example repository](https://github.com/Azure/azureml-examples/tree/main/sdk/python/endpoints/online/custom-container).
 
 ```python
 # test the blue deployment with some sample data
@@ -373,7 +373,7 @@ ml_client.online_endpoints.invoke(
 
 ### Delete the endpoint
 
-Now that you've successfully scored with your endpoint, you can delete it:
+Now that you successfully scored with your endpoint, you can delete it:
 
 # [Azure CLI](#tab/cli)
 
@@ -389,7 +389,7 @@ ml_client.online_endpoints.begin_delete(name=online_endpoint_name)
 
 ---
 
-## Next steps
+## Related content
 
 - [Safe rollout for online endpoints](how-to-safely-rollout-online-endpoints.md)
 - [Troubleshooting online endpoints deployment](./how-to-troubleshoot-online-endpoints.md)
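The model-mounting rules discussed in the hunks above (default mount root, name/version subdirectories, and the optional `model_mount_path` override) can be summarized with a small sketch. The helper below is hypothetical, not part of the Azure ML SDK; it only illustrates where a registered model lands inside the deployed container.

```python
from pathlib import PurePosixPath

# Default mount root used by Azure ML inside the container, per the article.
DEFAULT_MOUNT_ROOT = PurePosixPath("/var/azureml-app/azureml-models")

def model_path(name, version, model_mount_path=None):
    """Hypothetical helper: container path of a registered model.

    Without model_mount_path, the model is mounted under the default root;
    with it, the model lands under the custom absolute Linux path instead.
    """
    root = PurePosixPath(model_mount_path) if model_mount_path else DEFAULT_MOUNT_ROOT
    return str(root / name / str(version))

print(model_path("tfserving-deployment", 1))
# /var/azureml-app/azureml-models/tfserving-deployment/1
print(model_path("tfserving-deployment", 1, "/var/tfserving-model-mount"))
# /var/tfserving-model-mount/tfserving-deployment/1
```

The two printed paths match the deployment directory structures shown in the article's diagrams, with and without `model_mount_path`.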
