
Commit eee5d1d

Update code and text
1 parent 78112c5 commit eee5d1d

File tree

2 files changed

+114
-96
lines changed


articles/machine-learning/how-to-deploy-custom-container.md

Lines changed: 95 additions & 80 deletions
@@ -18,13 +18,28 @@ ms.devlang: azurecli
1818

1919
[!INCLUDE [dev v2](includes/machine-learning-dev-v2.md)]
2020

21-
In Azure Machine Learning, you can use a custom container to deploy a model to an online endpoint.
21+
In Azure Machine Learning, you can use a custom container to deploy a model to an online endpoint. Custom container deployments can use web servers other than the default Python Flask server that Azure Machine Learning uses.
2222

23-
Custom container deployments can use web servers other than the default Python Flask server that Azure Machine Learning uses. When you use a custom deployment, you can still take advantage of the built-in monitoring, scaling, alerting, and authentication that Azure Machine Learning offers.
23+
When you use a custom deployment, you can:
2424

25-
The following table lists various [deployment examples](https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/custom-container) that use custom containers. The examples use various tools and technologies, such as TensorFlow Serving, TorchServe, Triton Inference Server, the Plumber R package, and the Azure Machine Learning inference minimal image.
25+
- Use various tools and technologies, such as TensorFlow Serving, TorchServe, Triton Inference Server, the Plumber R package, and the Azure Machine Learning inference minimal image.
26+
- Still take advantage of the built-in monitoring, scaling, alerting, and authentication that Azure Machine Learning offers.
2627

27-
|Example|Script (CLI)|Description|
28+
This article shows you how to use a TensorFlow (TF) Serving image to serve a TF model.
29+
30+
## Prerequisites
31+
32+
[!INCLUDE [cli & sdk](includes/machine-learning-cli-sdk-v2-prereqs.md)]
33+
34+
* An Azure resource group that contains your workspace and that you or your service principal have Contributor access to. If you use the steps in [Create the workspace](quickstart-create-resources.md#create-the-workspace) to configure your workspace, you meet this requirement.
35+
36+
* [Docker Engine](https://docs.docker.com/engine/install/), installed and running locally. This prerequisite is **highly recommended**. You need it to deploy a model locally, and it's helpful for debugging.
37+
38+
## Deployment examples
39+
40+
The following table lists [deployment examples](https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/custom-container) that use custom containers and take advantage of various tools and technologies.
41+
42+
|Example|Azure CLI script|Description|
2843
|-------|------|---------|
2944
|[minimal/multimodel](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/minimal/multimodel)|[deploy-custom-container-minimal-multimodel](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-minimal-multimodel.sh)|Deploys multiple models to a single deployment by extending the Azure Machine Learning inference minimal image.|
3045
|[minimal/single-model](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/minimal/single-model)|[deploy-custom-container-minimal-single-model](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-minimal-single-model.sh)|Deploys a single model by extending the Azure Machine Learning inference minimal image.|
@@ -35,22 +50,14 @@ The following table lists various [deployment examples](https://github.com/Azure
3550
|[torchserve/densenet](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/torchserve/densenet)|[deploy-custom-container-torchserve-densenet](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-torchserve-densenet.sh)|Deploys a single model by using a TorchServe custom container.|
3651
|[triton/single-model](https://github.com/Azure/azureml-examples/blob/main/cli/endpoints/online/custom-container/triton/single-model)|[deploy-custom-container-triton-single-model](https://github.com/Azure/azureml-examples/blob/main/cli/deploy-custom-container-triton-single-model.sh)|Deploys a Triton model by using a custom container.|
3752

38-
This article focuses on serving a TensorFlow model with TensorFlow (TF) Serving.
53+
This article shows you how to use the tfserving/half-plus-two example.
3954

4055
> [!WARNING]
41-
> Microsoft might not be able to help troubleshoot problems caused by a custom image. If you encounter problems, you might be asked to use the default image or one of the images Microsoft provides to see if the problem is specific to your image.
42-
43-
## Prerequisites
44-
45-
[!INCLUDE [cli & sdk](includes/machine-learning-cli-sdk-v2-prereqs.md)]
46-
47-
* You, or the service principal you use, must have *Contributor* access to the Azure resource group that contains your workspace. You have such a resource group if you configured your workspace using the quickstart article.
48-
49-
* To deploy locally, you must have [Docker engine](https://docs.docker.com/engine/install/) running locally. This step is **highly recommended**. It helps you debug issues.
56+
> Microsoft support teams might not be able to help troubleshoot problems caused by a custom image. If you encounter problems, you might be asked to use the default image or one of the images that Microsoft provides to see whether the problem is specific to your image.
5057
5158
## Download the source code
5259

53-
To follow along with the steps in this article, clone the source code from GitHub.
60+
The steps in this article use code samples from the [azureml-examples](https://github.com/Azure/azureml-examples) repository. Use the following commands to clone the repository:
5461

5562
# [Azure CLI](#tab/cli)
5663

@@ -63,50 +70,55 @@ cd azureml-examples/cli
6370

6471
```azurecli
6572
git clone https://github.com/Azure/azureml-examples --depth 1
66-
cd azureml-examples/sdk/python
73+
cd azureml-examples/cli
6774
```
6875

69-
See also [the example notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/online/custom-container/online-endpoints-custom-container.ipynb), but note that `3. Test locally` section in the notebook assumes that it runs under the `azureml-examples/sdk` directory.
76+
In the examples repository, most Python samples are under the sdk/python folder. For this article, go to the cli folder instead. The folder structure under the cli folder differs slightly from the sdk/python structure, and most steps in this article rely on the cli structure.
77+
78+
To follow along with the example steps, see a [Jupyter notebook in the examples repository](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/online/custom-container/online-endpoints-custom-container.ipynb). But in the following sections of that notebook, the steps run from the azureml-examples/sdk/python folder instead of the cli folder:
79+
80+
- 3. Test locally
81+
- 5. Test the endpoint with sample data
7082

7183
---
7284

7385
## Initialize environment variables
7486

75-
Define environment variables:
87+
To use a TF model, you need several environment variables. Run the following commands to define those variables:
7688

7789
:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="initialize_variables":::
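
For orientation, the following block is only an illustrative sketch of the kinds of variables the referenced script defines. The names and values shown here are placeholders, not the script's actual contents:

```azurecli
# Illustrative placeholders; see the referenced script for the real names and values.
ENDPOINT_NAME=tfserving-endpoint     # Name of the online endpoint to create.
MODEL_NAME=half_plus_two             # Model name that TF Serving loads.
MODEL_BASE_PATH=/var/azureml-app/azureml-models/$MODEL_NAME   # Model location inside the container.
```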
7890

7991
## Download a TensorFlow model
8092

81-
Download and unzip a model that divides an input by two and adds 2 to the result:
93+
Download and unzip a model that divides an input value by two and adds two to the result:
8294

8395
:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="download_and_unzip_model":::
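
As a rough sketch of what this step does (the real download URL and paths are in the referenced script, so the values here are placeholders):

```azurecli
# Placeholder URL; the referenced script contains the real one.
wget https://<model-archive-url>/half_plus_two.tar.gz -O half_plus_two.tar.gz
tar -xzf half_plus_two.tar.gz   # Extracts the saved TensorFlow model.
```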
8496

85-
## Run a TF Serving image locally to test that it works
97+
## Test a TF Serving image locally
8698

87-
Use docker to run your image locally for testing:
99+
Use Docker to run your image locally for testing:
88100

89101
:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="run_image_locally_for_testing":::
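
The referenced script handles this step for you. As a minimal sketch, assuming the model was extracted to a local half_plus_two folder and that you name the test container tfserving-test, such a command might look like this:

```azurecli
# Run TF Serving in the background, mount the model folder, and expose the REST port.
docker run --rm -d --name tfserving-test \
  -p 8501:8501 \
  -v "$PWD/half_plus_two:/models/half_plus_two" \
  -e MODEL_NAME=half_plus_two \
  docker.io/tensorflow/serving:latest
```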
90102

91-
### Check that you can send liveness and scoring requests to the image
103+
### Send liveness and scoring requests to the image
92104

93-
First, check that the container is *alive*, meaning that the process inside the container is still running. You should get a 200 (OK) response.
105+
Send a liveness request to check that the process inside the container is running. You should get a response of 200, or OK.
94106

95107
:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="check_liveness_locally":::
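
If you run the container locally as in the earlier sketch, a liveness check against the TF Serving REST API might look like the following, assuming port 8501 and the half_plus_two model name:

```azurecli
# Returns an HTTP 200 status code and the model version status when the server is up.
curl -i http://localhost:8501/v1/models/half_plus_two
```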
96108

97-
Then, check that you can get predictions about unlabeled data:
109+
Send a scoring request to check that you can get predictions about unlabeled data:
98110

99111
:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="check_scoring_locally":::
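
Against a local TF Serving container, a scoring request might look like the following sketch. Because the half_plus_two model divides each input by two and adds two, the expected predictions for these inputs are 2.5, 3.0, and 4.5:

```azurecli
# Send three input values to the predict route of the half_plus_two model.
curl -X POST -H "Content-Type: application/json" \
  -d '{"instances": [1.0, 2.0, 5.0]}' \
  http://localhost:8501/v1/models/half_plus_two:predict
```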
100112

101113
### Stop the image
102114

103-
Now that you tested locally, stop the image:
115+
When you finish testing locally, stop the image:
104116

105117
:::code language="azurecli" source="~/azureml-examples-main/cli/deploy-custom-container-tfserving-half-plus-two.sh" id="stop_image":::
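
If you started the container with the name from the earlier sketch, stopping it is a single command:

```azurecli
# Stop the local test container. The --rm flag used earlier also removes it.
docker stop tfserving-test
```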
106118

107119
## Deploy your online endpoint to Azure
108120

109-
Next, deploy your online endpoint to Azure.
121+
To deploy your online endpoint to Azure, take the steps in the following sections.
110122

111123
# [Azure CLI](#tab/cli)
112124

@@ -149,78 +161,79 @@ instance_count: 1
149161
150162
# [Python SDK](#tab/python)
151163
152-
### Connect to Azure Machine Learning workspace
164+
### Connect to your Azure Machine Learning workspace
153165
154-
Connect to your Azure Machine Learning workspace, configure workspace details, and get a handle to the workspace as follows:
166+
To configure your Azure Machine Learning workspace, take the following steps:
155167
156168
1. Import the required libraries:
157169
158-
```python
159-
# import required libraries
160-
from azure.ai.ml import MLClient
161-
from azure.ai.ml.entities import (
162-
ManagedOnlineEndpoint,
163-
ManagedOnlineDeployment,
164-
Model,
165-
Environment,
166-
CodeConfiguration,
167-
)
168-
from azure.identity import DefaultAzureCredential
169-
```
170+
```python
171+
# Import the required libraries.
172+
from azure.ai.ml import MLClient
173+
from azure.ai.ml.entities import (
174+
ManagedOnlineEndpoint,
175+
ManagedOnlineDeployment,
176+
Model,
177+
Environment,
178+
CodeConfiguration,
179+
)
180+
from azure.identity import DefaultAzureCredential
181+
```
170182

171-
2. Configure workspace details and get a handle to the workspace:
183+
2. Configure workspace settings and get a handle to the workspace:
172184

173-
```python
174-
# enter details of your Azure Machine Learning workspace
175-
subscription_id = "<SUBSCRIPTION_ID>"
176-
resource_group = "<RESOURCE_GROUP>"
177-
workspace = "<AZUREML_WORKSPACE_NAME>"
178-
179-
# get a handle to the workspace
180-
ml_client = MLClient(
181-
DefaultAzureCredential(), subscription_id, resource_group, workspace
182-
)
183-
```
185+
```python
186+
# Enter information about your Azure Machine Learning workspace.
187+
subscription_id = "<subscription-ID>"
188+
resource_group = "<resource-group-name>"
189+
workspace = "<Azure-Machine-Learning-workspace-name>"
184190
185-
For more information, see [Deploy machine learning models to managed online endpoint using Python SDK v2](how-to-deploy-managed-online-endpoint-sdk-v2.md).
191+
# Get a handle to the workspace.
192+
ml_client = MLClient(
193+
DefaultAzureCredential(), subscription_id, resource_group, workspace
194+
)
195+
```
186196

187-
### Configure online endpoint
197+
For more information, see [Deploy and score a machine learning model by using an online endpoint](how-to-deploy-online-endpoints.md?view=azureml-api-2&tabs=python).
188198

189-
> [!TIP]
190-
> * `name`: The name of the endpoint. It must be unique in the Azure region. The name for an endpoint must start with an upper- or lowercase letter and only consist of '-'s and alphanumeric characters. For more information on the naming rules, see [endpoint limits](how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints).
191-
> * `auth_mode` : Use `key` for key-based authentication. Use `aml_token` for Azure Machine Learning token-based authentication. A `key` doesn't expire, but `aml_token` does expire. For more information on authenticating, see [Authenticate to an online endpoint](how-to-authenticate-online-endpoint.md).
199+
### Configure an online endpoint
192200

193-
Optionally, you can add description, tags to your endpoint.
201+
Use the following code to configure an online endpoint. Keep the following points in mind:
202+
203+
- The name of the endpoint must be unique in its Azure region. An endpoint name must start with a letter and only consist of alphanumeric characters and hyphens. For more information about the naming rules, see [Azure Machine Learning online endpoints and batch endpoints](how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints).
204+
- For the `auth_mode` value, use `key` for key-based authentication. Use `aml_token` for Azure Machine Learning token-based authentication. A key doesn't expire, but a token does expire. For more information about authentication, see [Authenticate clients for online endpoints](how-to-authenticate-online-endpoint.md).
205+
- The description and tags are optional.
194206

195207
```python
196-
# Creating a unique endpoint name with current datetime to avoid conflicts
208+
# To create a unique endpoint name, use a time stamp of the current date and time.
197209
import datetime
198210
199211
online_endpoint_name = "endpoint-" + datetime.datetime.now().strftime("%m%d%H%M%f")
200212
201-
# create an online endpoint
213+
# Configure an online endpoint.
202214
endpoint = ManagedOnlineEndpoint(
203215
name=online_endpoint_name,
204-
description="this is a sample online endpoint",
216+
description="A sample online endpoint",
205217
auth_mode="key",
206-
tags={"foo": "bar"},
218+
tags={"env": "dev"},
207219
)
208220
```
209221

210-
### Configure online deployment
222+
### Configure an online deployment
223+
224+
A deployment is a set of resources that are required to host the model that does the actual inferencing. You can use the `ManagedOnlineDeployment` class to configure a deployment for your endpoint. The constructor of that class takes the following parameters:
211225

212-
A deployment is a set of resources required for hosting the model that does the actual inferencing. Create a deployment for our endpoint using the `ManagedOnlineDeployment` class.
226+
- `name`: The name of the deployment.
227+
- `endpoint_name`: The name of the endpoint to create the deployment under.
228+
- `model`: The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.
229+
- `environment`: The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.
230+
- `environment_variables`: Environment variables that are set during deployment.
231+
  - `MODEL_BASE_PATH`: The parent folder that contains a folder for your model.
232+
  - `MODEL_NAME`: The name of your model.
233+
- `instance_type`: The virtual machine size to use for the deployment. For a list of supported sizes, see [Managed online endpoints SKU list](reference-managed-online-endpoints-vm-sku-list.md).
234+
- `instance_count`: The number of instances to use for the deployment.
213235

214-
> [!TIP]
215-
> - `name` - Name of the deployment.
216-
> - `endpoint_name` - Name of the endpoint to create the deployment under.
217-
> - `model` - The model to use for the deployment. This value can be either a reference to an existing versioned > model in the workspace or an inline model specification.
218-
> - `environment` - The environment to use for the deployment. This value can be either a reference to an existing > versioned environment in the workspace or an inline environment specification.
219-
> - `code_configuration` - the configuration for the source code and scoring script
220-
> - `path`- Path to the source code directory for scoring the model
221-
> - `scoring_script` - Relative path to the scoring file in the source code directory
222-
> - `instance_type` - The VM size to use for the deployment. For the list of supported sizes, see [endpoints SKU list](reference-managed-online-endpoints-vm-sku-list.md).
223-
> - `instance_count` - The number of instances to use for the deployment
236+
Use the following code to configure a deployment for your endpoint:
224237

225238
```python
226239
# create a blue deployment
@@ -251,17 +264,19 @@ blue_deployment = ManagedOnlineDeployment(
251264

252265
---
253266

254-
There are a few important concepts to note in this YAML/Python parameter:
267+
The following sections discuss a few important concepts about the YAML and Python parameters.
255268

256269
#### Base image
257270

258-
The base image is specified as a parameter in environment, and `docker.io/tensorflow/serving:latest` is used in this example. As you inspect the container, you can find that this server uses `ENTRYPOINT` to start an entry point script, which takes the environment variables such as `MODEL_BASE_PATH` and `MODEL_NAME`, and exposes ports such as `8501`. These details are all specific information for this chosen server. You can use this understanding of the server, to determine how to define the deployment. For example, if you set environment variables for `MODEL_BASE_PATH` and `MODEL_NAME` in the deployment definition, the server (in this case, TF Serving) will take the values to initiate the server. Likewise, if you set the port for the routes to be `8501` in the deployment definition, the user request to such routes will be correctly routed to the TF Serving server.
271+
In the `environment` section in YAML, or the `Environment` constructor in Python, you specify the base image as a parameter. This example uses `docker.io/tensorflow/serving:latest` as the `image` value.
272+
273+
If you inspect your container, you can see that this server uses `ENTRYPOINT` commands to start an entry point script. That script takes environment variables such as `MODEL_BASE_PATH` and `MODEL_NAME`, and it exposes ports such as `8501`. These details all pertain to this server, and you can use this information to determine how to define your deployment. For example, if you set the `MODEL_BASE_PATH` and `MODEL_NAME` environment variables in your deployment definition, TF Serving uses those values to initiate the server. Likewise, if you set the port for each route to be `8501` in the deployment definition, user requests to those routes are correctly routed to the TF Serving server.
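
For example, one way to confirm the entry point and exposed ports of the base image locally, assuming Docker Engine is running, is to use `docker inspect`:

```azurecli
docker pull docker.io/tensorflow/serving:latest
# Show the ENTRYPOINT command and the ports that the image exposes.
docker inspect --format '{{json .Config.Entrypoint}}' docker.io/tensorflow/serving:latest
docker inspect --format '{{json .Config.ExposedPorts}}' docker.io/tensorflow/serving:latest
```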
259274

260-
Note that this specific example is based on the TF Serving case, but you can use any containers that will stay up and respond to requests coming to liveness, readiness, and scoring routes. You can refer to other examples and see how the dockerfile is formed (for example, using `CMD` instead of `ENTRYPOINT`) to create the containers.
275+
This example is based on the TF Serving case, but you can use any container that stays up and responds to requests that go to liveness, readiness, and scoring routes. To see how to form a Dockerfile to create a container, you can refer to other examples. Some servers use `CMD` instructions instead of `ENTRYPOINT` instructions.
261276

262-
#### Inference config
277+
#### The inference_config parameter
263278

264-
Inference config is a parameter in environment, and it specifies the port and path for 3 types of the route: liveness, readiness, and scoring route. Inference config is required if you want to run your own container with managed online endpoint.
279+
In the `environment` section or the `Environment` class, `inference_config` is a parameter. It specifies the port and path for three types of routes: liveness, readiness, and scoring routes. The `inference_config` parameter is required if you want to run your own container with a managed online endpoint.
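
As an illustration only, not taken verbatim from the example, the `inference_config` value for a TF Serving deployment might map the three routes to port 8501 and the model's REST paths. The following sketch prints such a fragment so you can compare it with the example's own definition:

```azurecli
# Illustrative sketch of an inference_config fragment for TF Serving.
cat <<'EOF'
inference_config:
  liveness_route:
    port: 8501
    path: /v1/models/half_plus_two
  readiness_route:
    port: 8501
    path: /v1/models/half_plus_two
  scoring_route:
    port: 8501
    path: /v1/models/half_plus_two:predict
EOF
```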
265280

266281
#### Readiness route vs liveness route
267282

@@ -414,7 +429,7 @@ For the request data, you can use a sample JSON file from the [example repositor
414429
response = ml_client.online_endpoints.invoke(
415430
endpoint_name=online_endpoint_name,
416431
deployment_name="blue",
417-
request_file="sample-request.json",
432+
request_file="sample_request.json",
418433
)
419434
```
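
For reference, a sketch of the equivalent call on the Azure CLI, assuming the endpoint name is stored in `ENDPOINT_NAME` and your default resource group and workspace are already configured:

```azurecli
az ml online-endpoint invoke \
  --name $ENDPOINT_NAME \
  --deployment-name blue \
  --request-file sample_request.json
```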
420435
