In this article, you'll learn how to deploy a new version of a machine learning model in production without causing any disruption. You'll use a blue-green deployment strategy (also known as a safe rollout strategy) to introduce a new version of a web service to production. This strategy will allow you to roll out your new version of the web service to a small subset of users or requests before rolling it out completely.
This article assumes you're using online endpoints, that is, endpoints that are used for online (real-time) inferencing. There are two types of online endpoints: **managed online endpoints** and **Kubernetes online endpoints**. For more information on endpoints and the differences between managed online endpoints and Kubernetes online endpoints, see [What are Azure Machine Learning endpoints?](concept-endpoints.md#managed-online-endpoints-vs-kubernetes-online-endpoints).
The main example in this article uses managed online endpoints for deployment.
In this article, you'll learn to:
> [!div class="checklist"]
> * Define an online endpoint with a deployment called "blue" to serve version 1 of a model
<!-- > * Scale the blue deployment so that it can handle more requests -->
> * Deploy version 2 of the model (called the "green" deployment) to the endpoint, but send the deployment no live traffic
> * Test the green deployment in isolation
> * Mirror a percentage of live traffic to the green deployment to validate it (preview)
If you cloned the examples repo, your local machine already has copies of the files.
## Define the endpoint and deployment
Online endpoints are used for online (real-time) inferencing. Online endpoints contain deployments that are ready to receive data from clients and send responses back in real time.
To define an endpoint, you need to specify:
### Create online endpoint
You'll use the *endpoints/online/managed/sample/endpoint.yml* file to configure the endpoint. The following snippet shows the contents of the file:
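The snippet isn't reproduced in this excerpt. As a sketch only (placeholder values; the file in the examples repo may differ), a minimal managed online endpoint definition has this shape:

```yaml
# Minimal managed online endpoint definition (placeholder values).
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
auth_mode: key
```

The keys shown here are the same ones described in the reference table that follows.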
The reference for the endpoint YAML format is described in the following table. To learn how to specify these attributes, see the [online endpoint YAML reference](reference-yaml-endpoint-online.md). For information about limits related to managed endpoints, see [Manage and increase quotas for resources with Azure Machine Learning](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints).

| Key | Description |
| --- | ----------- |
| `$schema` | (Optional) The YAML schema. To see all available options in the YAML file, you can view the schema in the preceding code snippet in a browser. |
| `name` | The name of the endpoint. |
| `auth_mode` | Use `key` for key-based authentication. Use `aml_token` for Azure Machine Learning token-based authentication. To get the most recent token, use the `az ml online-endpoint get-credentials` command. |
To create an online endpoint:
1. Set your endpoint name:
> Endpoint names must be unique within an Azure region. For example, in the Azure `westus2` region, there can be only one endpoint with the name `my-endpoint`.
1. Create the endpoint in the cloud:
Run the following code to use the `endpoint.yml` file to configure the endpoint:
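Taken together, the naming and creation steps can be sketched as follows. The endpoint name is a placeholder, and the `az ml` call assumes the Azure CLI with the `ml` extension and a configured workspace; the call is guarded so the sketch is a no-op elsewhere:

```shell
# Pick a placeholder name; endpoint names must be unique within an Azure region.
export ENDPOINT_NAME="my-endpoint-$RANDOM"

# Create the endpoint in the cloud from the YAML definition.
# Guarded so the sketch is a no-op on machines without the Azure CLI.
if command -v az >/dev/null 2>&1; then
  az ml online-endpoint create --name "$ENDPOINT_NAME" -f endpoints/online/managed/sample/endpoint.yml
fi
```

Appending `$RANDOM` to the name is just one way to reduce collisions with existing endpoints in the region.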
A deployment is a set of resources required for hosting the model that does the actual inferencing.
# [Studio](#tab/azure-studio)
When you create a managed online endpoint in the Azure Machine Learning studio, you must define an initial deployment for the endpoint. To define a deployment, you must have a registered model in your workspace. Let's begin by registering the model that we'll use for the deployment.
### Register the model
A model registration is a logical entity in the workspace. This entity may contain a single model file or a directory of multiple files. As a best practice for production, you should register the model and environment. When creating the endpoint and deployment in this article, we'll assume that you've registered the [model folder](https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/model-1/model) that contains the model.
To register the example model, follow these steps:
You can test mirror traffic by invoking the endpoint several times:
# [Studio](#tab/azure-studio)
The studio doesn't support mirrored traffic. See the Azure CLI or Python tabs for steps to mirror traffic to a deployment.
---
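The CLI command that enables mirroring isn't shown in the surviving tabs above, so here is a sketch. Mirroring sends a copy of a percentage of live traffic to the green deployment without affecting the responses clients receive; `green=10` and the endpoint name are placeholder values, and the command assumes the Azure CLI with the `ml` extension:

```shell
# Placeholder endpoint name; reuse the name you set when creating the endpoint.
ENDPOINT_NAME="${ENDPOINT_NAME:-my-endpoint}"

# Mirror 10% of live traffic to the green deployment for validation.
# Guarded so the sketch is a no-op on machines without the Azure CLI.
if command -v az >/dev/null 2>&1; then
  az ml online-endpoint update --name "$ENDPOINT_NAME" --mirror-traffic "green=10"
fi
```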
After testing, you can set the mirror traffic to zero to disable mirroring:
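From the CLI, disabling mirroring is the same `update` call with the green deployment's share set to zero (a sketch; the endpoint name is a placeholder, and the command assumes the Azure CLI with the `ml` extension):

```shell
# Placeholder endpoint name; reuse the name you set when creating the endpoint.
ENDPOINT_NAME="${ENDPOINT_NAME:-my-endpoint}"

# Stop mirroring by setting the green deployment's mirror share to zero.
# Guarded so the sketch is a no-op on machines without the Azure CLI.
if command -v az >/dev/null 2>&1; then
  az ml online-endpoint update --name "$ENDPOINT_NAME" --mirror-traffic "green=0"
fi
```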
# [Studio](#tab/azure-studio)
The studio doesn't support mirrored traffic. See the Azure CLI or Python tabs for steps to mirror traffic to a deployment.