
Commit 1549448

Addressing PM feedback

1 parent dd22ce6 commit 1549448

File tree

1 file changed: +63 −54 lines changed

articles/machine-learning/how-to-safely-rollout-online-endpoints.md

Lines changed: 63 additions & 54 deletions
@@ -168,20 +168,45 @@ If you cloned the examples repo, your local machine already has copies of the fi
 
 1. Go to [https://github.com/Azure/azureml-examples/](https://github.com/Azure/azureml-examples/).
 1. Go to the **<> Code** button on the page, and then select **Download ZIP** from the **Local** tab.
-1. Locate the folder `/cli/endpoints/online/model-1/model` and the file `/cli/endpoints/online/model-1/onlinescoring/score.py`.
+1. Locate the model folder `/cli/endpoints/online/model-1/model` and the scoring script `/cli/endpoints/online/model-1/onlinescoring/score.py` for the first model, `model-1`.
+1. Locate the model folder `/cli/endpoints/online/model-2/model` and the scoring script `/cli/endpoints/online/model-2/onlinescoring/score.py` for the second model, `model-2`.
 
 ---
 
 ## Define the endpoint and deployment
 
 Online endpoints are used for online (real-time) inferencing. Online endpoints contain deployments that are ready to receive data from clients and send responses back in real time.
 
-To define an endpoint, you need to specify:
+### Define an endpoint
+
+To define an endpoint, you need to specify the following key attributes:
 
 * Endpoint name: The name of the endpoint. It must be unique in the Azure region. For more information on the naming rules, see [managed online endpoint limits](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints).
-* Authentication mode: The authentication method for the endpoint. Choose between key-based authentication and Azure Machine Learning token-based authentication. A key doesn't expire, but a token does expire. For more information on authenticating, see [Authenticate to an online endpoint](how-to-authenticate-online-endpoint.md).
+* Authentication mode: The authentication method for the endpoint. Choose between key-based authentication (`key`) and Azure Machine Learning token-based authentication (`aml_token`). A key doesn't expire, but a token does. For more information on authenticating, see [Authenticate to an online endpoint](how-to-authenticate-online-endpoint.md).
 * Optionally, you can add a description and tags to your endpoint.
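These attributes map onto an endpoint YAML definition. The following is a minimal sketch, not a file from this article; the endpoint name, description, and tag values are illustrative placeholders:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint          # must be unique within the Azure region
auth_mode: key             # or aml_token; keys don't expire, tokens do
description: Sample endpoint for a safe rollout
tags:
  stage: test
```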

+### Define a deployment
+
+A *deployment* is a set of resources required for hosting the model that does the actual inferencing. To deploy a model, you must have:
+
+- Model files (or the name and version of a model that's already registered in your workspace). In the example, we have a scikit-learn model that does regression.
+- A scoring script, that is, code that executes the model on a given input request. The scoring script receives data submitted to a deployed web service and passes it to the model. The script then executes the model and returns its response to the client. The scoring script is specific to your model and must understand the data that the model expects as input and the data that it returns as output. In this example, we have a *score.py* file.
+- An environment in which your model runs. The environment can be a Docker image with Conda dependencies or a Dockerfile.
+- Settings to specify the instance type and scaling capacity.
+
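To make the scoring-script contract concrete, here's a minimal, self-contained sketch. A real *score.py* would load the registered scikit-learn model (for example, with `joblib`) in `init()`; the row-sum "model" below is a stand-in so the sketch runs on its own:

```python
import json

# Stand-in for the model that init() would normally load from disk.
model = None

def init():
    # Called once when the deployment starts: cache the model in memory.
    global model
    model = lambda rows: [sum(r) for r in rows]  # placeholder for joblib.load(...)

def run(raw_data):
    # Called on every invocation: parse the request, score it, and
    # return a JSON-serializable response.
    data = json.loads(raw_data)["data"]
    return model(data)
```

With this stand-in model, calling `init()` and then `run('{"data": [[1, 2], [3, 4]]}')` returns `[3, 7]`.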
+The following table describes the key attributes of a deployment:
+
+| Attribute | Description |
+|-----------|-------------|
+| Name | The name of the deployment. |
+| Endpoint name | The name of the endpoint to create the deployment under. |
+| Model | The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. |
+| Code path | The path to the directory on the local development environment that contains all the Python source code for scoring the model. You can use nested directories and packages. |
+| Scoring script | The relative path to the scoring file in the source code directory. This Python code must have an `init()` function and a `run()` function. The `init()` function is called after the model is created or updated (you can use it to cache the model in memory, for example). The `run()` function is called at every invocation of the endpoint to do the actual scoring and prediction. |
+| Environment | The environment to host the model and code. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. |
+| Instance type | The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](reference-managed-online-endpoints-vm-sku-list.md). |
+| Instance count | The number of instances to use for the deployment. Base the value on the workload you expect. For high availability, we recommend that you set the value to at least `3`. We reserve an extra 20% for performing upgrades. For more information, see [managed online endpoint quotas](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints). |
+
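As a sketch, these attributes correspond to keys in a deployment YAML file such as the article's *blue-deployment.yml*. The values below are illustrative placeholders, not the actual file contents:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue                      # deployment name
endpoint_name: my-endpoint      # endpoint to create the deployment under
model:
  path: ../model                # or a registered model reference, e.g. azureml:my-model:1
code_configuration:
  code: ../onlinescoring        # code path
  scoring_script: score.py      # must define init() and run()
environment:
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
  conda_file: ../environment/conda.yml
instance_type: Standard_DS3_v2
instance_count: 3               # at least 3 recommended for high availability
```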
 # [Azure CLI](#tab/azure-cli)
 
 ### Create online endpoint
@@ -217,7 +242,7 @@ To create an online endpoint:
 
 ### Create the 'blue' deployment
 
-A deployment is a set of resources required for hosting the model that does the actual inferencing. In this article, you'll use the *endpoints/online/managed/sample/blue-deployment.yml* file to configure the key aspects of the deployment<!-- [link to "define the deployment" section in Deploy article] -->. The following snippet shows the contents of the file:
+In this article, you'll use the *endpoints/online/managed/sample/blue-deployment.yml* file to configure the key aspects of the deployment. The following snippet shows the contents of the file:
 
 :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/online/managed/sample/blue-deployment.yml":::
 
@@ -238,14 +263,14 @@ For more information on registering your model as an asset, see [Register your m
 
 ### Create online endpoint
 
-To create a managed online endpoint, use the `ManagedOnlineEndpoint` class. This class allows users to configure the following key aspects of the endpoint:
+To create a managed online endpoint, use the `ManagedOnlineEndpoint` class. This class allows users to configure the key aspects of the endpoint.
 
-* `name` - Name of the endpoint. Needs to be unique at the Azure region level
+<!-- * `name` - Name of the endpoint. Needs to be unique at the Azure region level
 * `auth_mode` - The authentication method for the endpoint. Key-based authentication and Azure Machine Learning token-based authentication are supported. Key-based authentication doesn't expire but Azure Machine Learning token-based authentication does. Possible values are `key` or `aml_token`.
 * `identity`- The managed identity configuration for accessing Azure resources for endpoint provisioning and inference.
 * `type`- The type of managed identity. Azure Machine Learning supports `system_assigned` or `user_assigned` identity.
 * `user_assigned_identities` - List (array) of fully qualified resource IDs of the user-assigned identities. This property is required if `identity.type` is user_assigned.
-* `description`- Description of the endpoint.
+* `description`- Description of the endpoint. -->
 
 1. Configure the endpoint:
 
@@ -260,7 +285,8 @@ To create a managed online endpoint, use the `ManagedOnlineEndpoint` class. This
 
 ### Create the 'blue' deployment
 
-A deployment is a set of resources required for hosting the model that does the actual inferencing. To create a deployment for your managed online endpoint, use the `ManagedOnlineDeployment` class. This class allows users to configure the key aspects of the deployment. <!-- [link to "define the deployment" section in Deploy article] -->
+To create a deployment for your managed online endpoint, use the `ManagedOnlineDeployment` class. This class allows users to configure the key aspects of the deployment.
+The table in the earlier "Define a deployment" section describes those key attributes.
 
 1. Configure blue deployment:
 
@@ -318,7 +344,7 @@ One way to create a managed online endpoint in the studio is from the **Models**
 1. Go to the [Azure Machine Learning studio](https://ml.azure.com).
 1. In the left navigation bar, select the **Models** page.
 1. Select the model named `model-1` by checking the circle next to its name.
-1. Select **Deploy** > **Deploy to real-time endpoint**.
+1. Select **Deploy** > **Real-time endpoint**.
 
 :::image type="content" source="media/how-to-safely-rollout-managed-endpoints/deploy-from-models-page.png" lightbox="media/how-to-safely-rollout-managed-endpoints/deploy-from-models-page.png" alt-text="A screenshot of creating a managed online endpoint from the Models UI.":::
 
@@ -327,12 +353,6 @@ One way to create a managed online endpoint in the studio is from the **Models**
 :::image type="content" source="media/how-to-safely-rollout-managed-endpoints/online-endpoint-wizard.png" lightbox="media/how-to-safely-rollout-managed-endpoints/online-endpoint-wizard.png" alt-text="A screenshot of a managed online endpoint create wizard.":::
 
 1. Enter an __Endpoint name__.
-
-> [!NOTE]
-> * Endpoint name: The name of the endpoint. It must be unique in the Azure region. For more information on the naming rules, see [managed online endpoint limits](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints).
-> * Authentication type: The authentication method for the endpoint. Choose between key-based authentication and Azure Machine Learning token-based authentication. A `key` doesn't expire, but a token does expire. For more information on authenticating, see [Authenticate to an online endpoint](how-to-authenticate-online-endpoint.md).
-> * Optionally, you can add a description and tags to your endpoint.
-
 1. Keep the default selections: __Managed__ for the compute type and __key-based authentication__ for the authentication type.
 1. Select __Next__, until you get to the "Deployment" page. Here, perform the following tasks:
 
@@ -548,11 +568,23 @@ Though `green` has 0% of traffic allocated, you can still invoke the endpoint an
 
 Once you've tested your `green` deployment, you can 'mirror' (or copy) a percentage of the live traffic to it. Mirroring traffic (also called shadowing) doesn't change the results returned to clients. Requests still flow 100% to the `blue` deployment. The mirrored percentage of the traffic is copied and submitted to the `green` deployment so you can gather metrics and logging without impacting your clients. For example, you can use mirroring to check whether latency is within acceptable bounds or whether there are any HTTP errors. Testing the new deployment with traffic mirroring/shadowing is also known as [shadow testing](https://microsoft.github.io/code-with-engineering-playbook/automated-testing/shadow-testing/). The deployment that receives the mirrored traffic (in this case, the `green` deployment) can also be called the shadow deployment.
 
-> [!WARNING]
-> Mirroring traffic uses your [endpoint bandwidth quota](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints) (default 5 MBPS). Your endpoint bandwidth will be throttled if you exceed the allocated quota. For information on monitoring bandwidth throttling, see [Monitor managed online endpoints](how-to-monitor-online-endpoints.md#metrics-at-endpoint-scope).
+Mirroring has the following limitations:
+* Mirrored traffic is supported for the CLI (v2) (version 2.4.0 or above) and Python SDK (v2) (version 1.0.0 or above). If you update the endpoint by using an older version of the CLI/SDK or the studio UI, the setting for mirrored traffic is removed.
+* Mirrored traffic isn't currently supported for Kubernetes online endpoints.
+* You can mirror traffic to only one deployment.
+* The maximum mirrored traffic you can configure is 50%. This limit reduces the effect on your [endpoint bandwidth quota](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints) (default 5 Mbps). Your endpoint bandwidth is throttled if you exceed the allocated quota. For information on monitoring bandwidth throttling, see [Monitor managed online endpoints](how-to-monitor-online-endpoints.md#metrics-at-endpoint-scope).
 
-> [!IMPORTANT]
-> Mirrored traffic is supported for the CLI (v2) (version 2.4.0 or above) and Python SDK (v2) (version 1.0.0 or above). If you update the endpoint using an older version of CLI/SDK or Studio UI, the setting for mirrored traffic will be removed.
+Also note the following behavior:
+* A deployment can be set to receive only live traffic or mirrored traffic, not both.
+* You can send traffic directly to the mirror deployment by specifying the deployment set for mirrored traffic.
+* You can send traffic directly to a live deployment by specifying the deployment set for live traffic, but in this case the traffic won't be mirrored to the mirror deployment. Mirrored traffic is routed from traffic sent to the endpoint without a deployment specified.
+
+> [!TIP]
+> You can use the `--deployment-name` option [for CLI v2](/cli/azure/ml/online-endpoint#az-ml-online-endpoint-invoke-optional-parameters), or the `deployment_name` option [for SDK v2](/python/api/azure-ai-ml/azure.ai.ml.operations.onlineendpointoperations#azure-ai-ml-operations-onlineendpointoperations-invoke), to specify the deployment to route to.
+
+Now, let's set the `green` deployment to receive 10% of mirrored traffic. Clients will still receive predictions from the `blue` deployment only.
+
+:::image type="content" source="./media/how-to-safely-rollout-managed-endpoints/endpoint-concept-mirror.png" alt-text="Diagram showing 10% traffic mirrored to one deployment.":::
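The traffic rules above (live traffic sums to 0% or 100%, at most one mirror deployment at up to 50%, and live or mirrored but not both) can be captured in a small validation sketch. `validate_traffic` is a hypothetical helper for illustration, not part of the Azure ML SDK:

```python
def validate_traffic(live: dict[str, int], mirror: dict[str, int]) -> None:
    """Check an endpoint's traffic allocation against the rules above."""
    total = sum(live.values())
    # Live percentages must sum to 0 (traffic disabled) or 100 (enabled).
    if total not in (0, 100):
        raise ValueError(f"live traffic must sum to 0 or 100, got {total}")
    # Traffic can be mirrored to only one deployment, at up to 50%.
    if len(mirror) > 1:
        raise ValueError("traffic can be mirrored to only one deployment")
    for name, pct in mirror.items():
        if not 0 <= pct <= 50:
            raise ValueError(f"mirror traffic for {name} must be 0-50%, got {pct}")
        # A deployment receives either live or mirrored traffic, not both.
        if live.get(name, 0) > 0:
            raise ValueError(f"{name} already receives live traffic")

# The article's scenario: blue serves 100% of live traffic, green mirrors 10%.
validate_traffic({"blue": 100}, {"green": 10})
```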
 
 # [Azure CLI](#tab/azure-cli)
 
@@ -568,37 +600,6 @@ for i in {1..20} ; do
 done
 ```
 
-# [Python](#tab/python)
-
-The following command mirrors 10% of the traffic to the `green` deployment:
-
-[!notebook-python[](~/azureml-examples-main/sdk/python/endpoints/online/managed/online-endpoints-safe-rollout.ipynb?name=new_deployment_traffic)]
-
-You can test mirror traffic by invoking the endpoint several times:
-[!notebook-python[](~/azureml-examples-main/sdk/python/endpoints/online/managed/online-endpoints-safe-rollout.ipynb?name=several_tests_to_mirror_traffic)]
-
-# [Studio](#tab/azure-studio)
-
-The studio doesn't support mirrored traffic. See the Azure CLI or Python tabs for steps to mirror traffic to a deployment.
-
----
-
-Mirroring has the following limitations:
-* You can only mirror traffic to one deployment.
-* Mirror traffic isn't currently supported for Kubernetes online endpoints.
-* The maximum mirrored traffic you can configure is 50%. This limit is to reduce the effect on your endpoint bandwidth quota.
-
-Also note the following behavior:
-* A deployment can only be set to live or mirror traffic, not both.
-* You can send traffic directly to the mirror deployment by specifying the deployment set for mirror traffic.
-* You can send traffic directly to a live deployment by specifying the deployment set for live traffic, but in this case the traffic won't be mirrored to the mirror deployment. Mirror traffic is routed from traffic sent to endpoint without specifying the deployment.
-
-> [!TIP]
-> You can use `--deployment-name` option [for CLI v2](/cli/azure/ml/online-endpoint#az-ml-online-endpoint-invoke-optional-parameters), or `deployment_name` option [for SDK v2](/python/api/azure-ai-ml/azure.ai.ml.operations.onlineendpointoperations#azure-ai-ml-operations-onlineendpointoperations-invoke) to specify the deployment to be routed to.
-
-:::image type="content" source="./media/how-to-safely-rollout-managed-endpoints/endpoint-concept-mirror.png" alt-text="Diagram showing 10% traffic mirrored to one deployment.":::
-
-# [Azure CLI](#tab/azure-cli)
 You can confirm that the specific percentage of the traffic was sent to the `green` deployment by seeing the logs from the deployment:
 
 ```azurecli
@@ -610,6 +611,14 @@ After testing, you can set the mirror traffic to zero to disable mirroring:
 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-safe-rollout-online-endpoints.sh" ID="reset_mirror_traffic" :::
 
 # [Python](#tab/python)
+
+The following code mirrors 10% of the traffic to the `green` deployment:
+
+[!notebook-python[](~/azureml-examples-main/sdk/python/endpoints/online/managed/online-endpoints-safe-rollout.ipynb?name=new_deployment_traffic)]
+
+You can test mirrored traffic by invoking the endpoint several times:
+[!notebook-python[](~/azureml-examples-main/sdk/python/endpoints/online/managed/online-endpoints-safe-rollout.ipynb?name=several_tests_to_mirror_traffic)]
+
 You can confirm that the specific percentage of the traffic was sent to the `green` deployment by seeing the logs from the deployment:
 
 ```python
@@ -650,12 +659,12 @@ Once you've tested your `green` deployment, allocate a small percentage of traff
 1. Adjust the deployment traffic by allocating 10% to the green deployment and 90% to the blue deployment.
 1. Select **Update**.
 
-> [!TIP]
-> The **Total traffic percentage** must sum to either 0% (to disable traffic) or 100% (to enable traffic).
-
 ---
 
-Now, your `green` deployment will receive 10% of requests.
+> [!TIP]
+> The total traffic percentage must sum to either 0% (to disable traffic) or 100% (to enable traffic).
+
+Now, your `green` deployment will receive 10% of all live traffic. Clients will receive predictions from both the `blue` and `green` deployments.
 
 :::image type="content" source="./media/how-to-safely-rollout-managed-endpoints/endpoint-concept.png" alt-text="Diagram showing traffic split between deployments.":::
 