File: articles/machine-learning/how-to-safely-rollout-online-endpoints.md (21 additions, 25 deletions)
### Download files from the examples repository

Instead of cloning the examples repository, you can download the repository to your local machine:

1. Go to [https://github.com/Azure/azureml-examples/](https://github.com/Azure/azureml-examples/).
1. Select **<> Code**, and then go to the **Local** tab and select **Download ZIP**.

| Attribute | Required or optional | Description |
| --- | --- | --- |
| Name | Required | The name of the deployment. |
| Endpoint name | Required | The name of the endpoint to create the deployment under. |
| Model | Optional | The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. In this article's examples, a `scikit-learn` model does regression. |
| Code path | Optional | The path to the folder on the local development environment that contains all the Python source code for scoring the model. You can use nested directories and packages. |
| Scoring script | Optional | Python code that executes the model on a given input request. This value can be the relative path to the scoring file in the source code folder.<br>The scoring script receives data submitted to a deployed web service and passes it to the model. The script then executes the model and returns its response to the client. The scoring script is specific to your model and must understand the data that the model expects as input and returns as output.<br>This article's examples use a score.py file. This Python code must have an `init` function and a `run` function. The `init` function is called after the model is created or updated. You can use it to cache the model in memory, for example. The `run` function is called at every invocation of the endpoint to do the actual scoring and prediction. |
| Environment | Required | The environment to host the model and code. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. The environment can be a Docker image with Conda dependencies, a Dockerfile, or a registered environment. |
| Instance type | Required | The virtual machine size to use for the deployment. For a list of supported sizes, see [Managed online endpoints SKU list](reference-managed-online-endpoints-vm-sku-list.md). |
| Instance count | Required | The number of instances to use for the deployment. You base the value on the workload you expect. For high availability, we recommend that you use at least three instances. Azure Machine Learning reserves an extra 20 percent for performing upgrades. For more information, see [Azure Machine Learning online endpoints and batch endpoints](how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints). |

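To make the `init`/`run` contract concrete, here's a minimal, self-contained sketch of a scoring script. It isn't the repository's score.py: the model is a stand-in stub rather than the registered `scikit-learn` regressor, and the request shape (`{"data": [...]}`) is an assumption for illustration.

```python
import json

# Stub standing in for the scikit-learn regressor the article deploys.
# A real scoring script would load the model from the AZUREML_MODEL_DIR path.
model = None


def init():
    """Called once after the deployment is created or updated.
    Use it to load and cache the model in memory."""
    global model
    # A real script might use joblib.load on a file under AZUREML_MODEL_DIR.
    # Here we cache a trivial stand-in that sums each input row.
    model = lambda rows: [sum(row) for row in rows]


def run(raw_data):
    """Called on every invocation of the endpoint. Parses the request,
    scores it with the cached model, and returns the predictions."""
    data = json.loads(raw_data)["data"]  # request shape is an assumption
    predictions = model(data)
    return json.dumps(predictions)
```

With this stub, calling `init()` and then `run('{"data": [[1, 2], [3, 4]]}')` returns the JSON string `"[3, 7]"`.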
To see a full list of attributes that you can specify when you create a deployment, see [CLI (v2) managed online deployment YAML schema](/azure/machine-learning/reference-yaml-deployment-managed-online). For version 2 of the Python SDK, see [ManagedOnlineDeployment Class](/python/api/azure-ai-ml/azure.ai.ml.entities.managedonlinedeployment).

# [Azure CLI](#tab/azure-cli)

To create an online endpoint:

1. Set your endpoint name by running the following Unix command. Replace `YOUR_ENDPOINT_NAME` with a unique name.

   > Endpoint names must be unique within an Azure region. For example, in the Azure `westus2` region, there can be only one endpoint with the name `my-endpoint`.

1. Create the endpoint in the cloud by running the following code. This code uses the endpoint.yml file to configure the endpoint:
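The endpoint.yml file defines the endpoint's settings. A minimal sketch of such a file, following the CLI (v2) managed online endpoint YAML schema, might look like this (the repository's actual file may differ):

```yaml
# Sketch of an endpoint.yml file; values shown are illustrative.
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
auth_mode: key
```

Here `auth_mode: key` selects key-based authentication, and you can also supply the endpoint name on the command line instead of in the file.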
For more information about registering your model as an asset, see [Register a model by using the Azure CLI or Python SDK](how-to-manage-models.md#register-a-model-by-using-the-azure-cli-or-python-sdk).

For more information about creating an environment, see [Create a custom environment](how-to-manage-environments-v2.md#create-a-custom-environment).

> [!NOTE]
> To create a deployment for a Kubernetes online endpoint, use the `KubernetesOnlineDeployment` class.

To register the example model, take the steps in the following sections.

:::image type="content" source="media/how-to-safely-rollout-managed-endpoints/register-model-folder.png" alt-text="Screenshot of the Register model from local files page. Under Browse, Browse folder is highlighted." lightbox="media/how-to-safely-rollout-managed-endpoints/register-model-folder.png":::

1. Go to the local copy of the repository you cloned or downloaded earlier, and then select **\azureml-examples\cli\endpoints\online\model-1\model**. When prompted, select **Upload** and wait for the upload to finish.

1. Select **Next**.

#### Configure and register the model

1. On the **Model settings** page, under **Name**, enter a friendly name for the model. The steps in this article assume the model is named `model-1`.

1. Select **Next**, and then select **Register** to complete the registration.

For later examples in this article, you also need to register a model from the \azureml-examples\cli\endpoints\online\model-2\model folder in your local copy of the repository. To register that model, repeat the steps in the previous two sections, but name the model `model-2`.

For more information about working with registered models, see [Work with registered models in Azure Machine Learning](how-to-manage-models.md).

1. Select **Next**.

1. On the **Code and environment for inferencing** page, take the following steps:

   1. Under **Select a scoring script for inferencing**, select **Browse**, and then select the \azureml-examples\cli\endpoints\online\model-1\onlinescoring\score.py file from the repository you cloned or downloaded earlier.

   1. In the search box above the list of environments, start entering **sklearn**, and then select the **sklearn-1.5:19** curated environment.

   1. Select **Next**.

### Test the endpoint by using sample data

You can invoke the endpoint by using the `invoke` command. The following command uses the [sample-request.json](https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/model-1/sample-request.json) JSON file to send a sample request:
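The linked sample-request.json file contains the input payload for the model. Its exact contents may differ from this sketch, but for a regression model of this kind the request takes a shape along these lines:

```json
{
  "data": [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
  ]
}
```

Each inner array is one observation, and the endpoint returns one prediction per row.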
1. Select **Next**.

1. On the **Code and environment for inferencing** page, take the following steps:

   1. Under **Select a scoring script for inferencing**, select **Browse**, and then select the \azureml-examples\cli\endpoints\online\model-2\onlinescoring\score.py file from the repository you cloned or downloaded earlier.

   1. In the search box above the list of environments, start entering **sklearn**, and then select the **sklearn-1.5:19** curated environment.

   1. Select **Next**.

## Test the deployment with mirrored traffic

After you test your `green` deployment, you can *mirror* a percentage of the live traffic to your endpoint by copying that percentage of traffic and sending it to the `green` deployment. Traffic mirroring, which is also called *shadowing*, doesn't change the results returned to clients—100 percent of requests still flow to the `blue` deployment. The mirrored percentage of the traffic is copied and also submitted to the `green` deployment so that you can gather metrics and logging without impacting your clients.

Mirroring is useful when you want to validate a new deployment without impacting clients. For example, you can use mirroring to check whether latency is within acceptable bounds or to check that there are no HTTP errors. The use of traffic mirroring, or shadowing, to test a new deployment is also known as [shadow testing](https://microsoft.github.io/code-with-engineering-playbook/automated-testing/shadow-testing/). The deployment that receives the mirrored traffic, in this case, the `green` deployment, can also be called the *shadow deployment*.
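The mirroring behavior described above can be illustrated with a small simulation. This sketch is illustrative only and isn't how Azure Machine Learning implements mirroring: every request is answered by the live (`blue`) deployment, while a fixed percentage of requests is also copied to the shadow (`green`) deployment, whose responses are discarded.

```python
import random

def route_request(request, live, shadow, mirror_percent, rng=random.random):
    """Answer the request from the live deployment; copy it to the shadow
    deployment roughly mirror_percent percent of the time. Only the live
    response is ever returned to the client."""
    if rng() * 100 < mirror_percent:
        shadow(request)  # shadow response is kept for metrics/logs, then discarded
    return live(request)

# Illustrative deployments: blue serves clients, green is the shadow.
blue_calls, green_calls = [], []

def blue(req):
    blue_calls.append(req)
    return f"blue:{req}"

def green(req):
    green_calls.append(req)
    return f"green:{req}"

responses = [route_request(i, blue, green, mirror_percent=10) for i in range(1000)]

# Every one of the 1,000 responses comes from blue; about 10 percent of
# requests were also copied to green for observation.
print(len(blue_calls), len(green_calls))
```

Note that mirroring increases total load: the shadow deployment processes real traffic, so the bandwidth and quota limits described below still apply.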
664
660
@@ -667,11 +663,11 @@ Mirroring has the following limitations:
667
663
* Mirroring is supported for versions 2.4.0 and later of the Azure Machine Learning CLI and versions 1.0.0 and later of the Python SDK. If you use an older version of the Azure Machine Learning CLI or the Python SDK to update an endpoint, you lose the mirror traffic setting.
668
664
* Mirroring isn't currently supported for Kubernetes online endpoints.
669
665
* You can mirror traffic to only one deployment in an endpoint.
670
-
* The maximum percentage of traffic you can mirror is 50 percent. This cap limits the effect on your [endpoint bandwidth quota](how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints), which has a default value of 5 MBps. Your endpoint bandwidth is throttled if you exceed the allocated quota. For information about monitoring bandwidth throttling, see [Supported metrics for Microsoft.MachineLearningServices/workspaces/onlineEndpoints](monitor-azure-machine-learning-reference.md#supported-metrics-for-microsoftmachinelearningservicesworkspacesonlineendpoints).
666
+
* The maximum percentage of traffic you can mirror is 50 percent. This cap limits the effect on your [endpoint bandwidth quota](how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints), which has a default value of 5 MBps. Your endpoint bandwidth is throttled if you exceed the allocated quota. For information about monitoring bandwidth throttling, see [Bandwidth throttling](how-to-monitor-online-endpoints.md#bandwidth-throttling).
671
667
672
668
Also note the following behavior:
673
669
674
-
*A deployment can be configured to receive only live traffic or mirrored traffic, not both.
670
+
*You can configure a deployment to receive only live traffic or mirrored traffic, not both.
675
671
* When you invoke an endpoint, you can specify the name of any of its deployments—even a shadow deployment—to return the prediction.
676
672
* When you invoke an endpoint and specify the name of a deployment to receive incoming traffic, Azure Machine Learning doesn't mirror traffic to the shadow deployment. Azure Machine Learning mirrors traffic to the shadow deployment from traffic sent to the endpoint when you don't specify a deployment.
677
673
Use the following command to mirror 10 percent of the traffic and send it to the `green` deployment:

You can confirm that the specified percentage of the traffic is sent to the `green` deployment by checking the logs from the deployment:

Use the following steps to delete an individual deployment from a managed online endpoint:

# [Studio](#tab/azure-studio)

> [!NOTE]
> You can't delete a deployment that has live traffic allocated to it. Before you delete the deployment, you must [set the traffic allocation](#send-all-traffic-to-the-new-deployment) for the deployment to 0 percent.

1. On the endpoint page, go to the **Details** tab, and then go to the `blue` deployment card.