Commit 7b836d1

author
Larry Franks
committed
updates for endpoint versions
1 parent b54b630 commit 7b836d1

1 file changed

+35
-7
lines changed

articles/machine-learning/how-to-deploy-azure-kubernetes-service.md

Lines changed: 35 additions & 7 deletions
@@ -228,10 +228,28 @@ For information on using VS Code, see [deploy to AKS via the VS Code extension](
228228
> Deploying through VS Code requires the AKS cluster to be created or attached to your workspace in advance.
229229
230230
## Deploy models to AKS using controlled rollout (preview)
231-
Analyze and promote model versions in a controlled fashion using endpoints. Deploy up to 6 versions behind a single endpoint and configure the % of scoring traffic to each deployed version. You can enable app insights to view operational metrics of endpoints and deployed versions.
231+
232+
Analyze and promote model versions in a controlled fashion using endpoints. You can deploy up to six versions behind a single endpoint. Endpoints provide the following capabilities:
233+
234+
* Configure the __percentage of scoring traffic sent to each endpoint__. For example, route 20% of the traffic to endpoint 'test' and 80% to 'production'.
235+
236+
> [!NOTE]
237+
> If you do not account for 100% of the traffic, any remaining percentage is routed to the __default__ endpoint. For example, if you configure endpoint 'test' to get 10% of the traffic, the remaining 90% is sent to the default endpoint.
238+
>
239+
> The first endpoint created is automatically configured as the default. You can change this by setting `is_default=True` when creating or updating an endpoint version.
240+
241+
* Tag an endpoint as either __control__ or __treatment__. For example, the current production endpoint might be the control, while potential new models are deployed as treatments. After evaluating performance of the treatment endpoints, if one outperforms the current control, it might be promoted to the new production/control.
242+
243+
> [!NOTE]
244+
> You can only have __one__ control endpoint. You can have multiple treatments.
245+
246+
You can enable app insights to view operational metrics of endpoints and deployed versions.
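
The routing rule in the note above (any unassigned percentage goes to the default) can be illustrated with a small, standalone calculation. This is only a sketch of the arithmetic, not part of the Azure Machine Learning SDK: the `effective_traffic` helper and the version names are hypothetical, and the default's configured share of 0% is an assumption made for the example.

```python
def effective_traffic(configured, default):
    """Return the effective traffic share per version (hypothetical helper).

    configured: dict mapping version name -> configured percentage.
    default: name of the default version, which absorbs any unassigned share.
    """
    total = sum(configured.values())
    if total > 100:
        raise ValueError("configured percentages exceed 100")
    if default not in configured:
        raise KeyError(f"default version {default!r} is not configured")
    effective = dict(configured)
    # Whatever is not explicitly assigned is routed to the default version.
    effective[default] += 100 - total
    return effective


# 'test' is configured for 10%; the remaining 90% falls through to the default.
split = effective_traffic({"production": 0, "test": 10}, default="production")
print(split)  # {'production': 90, 'test': 10}
```

The same function reproduces the other figures quoted in this article: a single default version configured for 20% effectively receives 100%.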
232247

233248
### Create an endpoint
234-
Once you are ready to deploy your models, create a scoring endpoint and deploy your first version. The step below shows you how to deploy and create the endpoint using the SDK. The first deployment will be defined as the default version which means that unspecified traffic percentile across all versions will go to the default version.
249+
Once you are ready to deploy your models, create a scoring endpoint and deploy your first version. The following example shows how to deploy and create the endpoint using the SDK. The first deployment is defined as the default version, which means that any traffic percentage not assigned to a specific version goes to the default version.
250+
251+
> [!TIP]
252+
> In the following example, the configuration sets this endpoint to handle 20% of the traffic. Since this is the first endpoint, it's also the default endpoint. And since we don't have any other endpoints for the other 80% of traffic, it is routed to the default endpoint. Until other endpoints that take a percentage of traffic are deployed, this one effectively receives 100% of the traffic.
235253
236254
```python
237255
import azureml.core
@@ -242,8 +260,8 @@ from azureml.core.compute import ComputeTarget
242260
compute = ComputeTarget(ws, 'myaks')
243261
namespace_name = "endpointnamespace"
244262
# define the endpoint and version name
245-
endpoint_name = "mynewendpoint",
246-
version_name= "versiona",
263+
endpoint_name = "mynewendpoint"
264+
version_name= "versiona"
247265
# create the deployment config and define the scoring traffic percentile for the first deployment
248266
endpoint_deployment_config = AksEndpoint.deploy_configuration(cpu_cores = 0.1, memory_gb = 0.2,
249267
enable_app_insights = True,
@@ -253,11 +271,16 @@ endpoint_deployment_config = AksEndpoint.deploy_configuration(cpu_cores = 0.1, m
253271
traffic_percentile = 20)
254272
# deploy the model and endpoint
255273
endpoint = Model.deploy(ws, endpoint_name, [model], inference_config, endpoint_deployment_config, compute)
274+
# Wait for the process to complete
275+
endpoint.wait_for_deployment(True)
256276
```
257277

258278
### Update and add versions to an endpoint
259279

260-
Add another version to your endpoint and configure the scoring traffic percentile going to the version. There are two types of versions, a control and a treatment version. There can be multiple treatment version to help compare against a single control version.
280+
Add another version to your endpoint and configure the scoring traffic percentile going to the version. There are two types of versions, a control and a treatment version. There can be multiple treatment versions to help compare against a single control version.
281+
282+
> [!TIP]
283+
> The previous endpoint is set to receive 20% of the traffic. It's also the default, so it gets any leftover traffic not handled by other endpoints. The second endpoint version created by the next code snippet accepts 10% of traffic. After it is created, the first endpoint version is configured for 20% of the traffic and the new version for 10%. The remaining 70% is sent to the first endpoint version, because it is also the default version.
261284
262285
```python
263286
from azureml.core.webservice import AksEndpoint
@@ -270,9 +293,13 @@ endpoint.create_version(version_name = version_name_add,
270293
tags = {'modelVersion':'b'},
271294
description = "my second version",
272295
traffic_percentile = 10)
296+
endpoint.wait_for_deployment(True)
273297
```
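
Before adding or updating versions, it can help to check a planned layout against the limits described in this article: up to six versions per endpoint and only one control. The following is a rough, SDK-independent sketch. The `check_versions` helper and its dictionary keys are hypothetical, and the checks that percentages total at most 100 and that exactly one version is the default are assumptions drawn from the notes above, not documented service validation.

```python
MAX_VERSIONS = 6  # limit stated in this article


def check_versions(versions):
    """Return a list of problems found in a planned version layout.

    versions: list of dicts with the hypothetical keys
    'name', 'traffic', 'is_default', and 'is_control'.
    """
    problems = []
    if len(versions) > MAX_VERSIONS:
        problems.append(f"more than {MAX_VERSIONS} versions")
    # Assumption: configured percentages should not add up to more than 100.
    if sum(v["traffic"] for v in versions) > 100:
        problems.append("configured traffic exceeds 100%")
    if sum(1 for v in versions if v["is_control"]) > 1:
        problems.append("more than one control version")
    # Assumption: exactly one version is the default at any time.
    if sum(1 for v in versions if v["is_default"]) != 1:
        problems.append("expected exactly one default version")
    return problems


plan = [
    {"name": "versiona", "traffic": 20, "is_default": True, "is_control": True},
    {"name": "versionb", "traffic": 10, "is_default": False, "is_control": False},
]
print(check_versions(plan))  # []
```

An empty list means the plan passed these checks. The service may apply different or additional rules.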
274298

275-
Update existing versions or delete them in an endpoint. You can change the version's default type, control type, and the traffic percentile.
299+
Update existing versions or delete them in an endpoint. You can change the version's default type, control type, and the traffic percentile. In the following example, the second version increases its traffic to 40% and is now the default.
300+
301+
> [!TIP]
302+
> The second version is now the default and is configured for 40%, while the original version is configured for 20%. This means that 40% of traffic is not accounted for by version configurations and is also routed to the second version. It effectively receives 80% of the traffic.
276303
277304
```python
278305
from azureml.core.webservice import AksEndpoint
@@ -283,7 +310,8 @@ endpoint.update_version(version_name=endpoint.versions["versionb"].name,
283310
traffic_percentile=40,
284311
is_default=True,
285312
is_control_version_type=True)
286-
313+
# Wait for the process to complete before deleting
314+
endpoint.wait_for_deployment(True)
287315
# delete a version in an endpoint
288316
endpoint.delete_version(version_name="versionb")
289317
