articles/machine-learning/how-to-deploy-azure-kubernetes-service.md
35 additions & 7 deletions
@@ -228,10 +228,28 @@ For information on using VS Code, see [deploy to AKS via the VS Code extension](
 > Deploying through VS Code requires the AKS cluster to be created or attached to your workspace in advance.
 
 ## Deploy models to AKS using controlled rollout (preview)
-Analyze and promote model versions in a controlled fashion using endpoints. Deploy up to 6 versions behind a single endpoint and configure the % of scoring traffic to each deployed version. You can enable app insights to view operational metrics of endpoints and deployed versions.
+
+Analyze and promote model versions in a controlled fashion using endpoints. You can deploy up to six versions behind a single endpoint. Endpoints provide the following capabilities:
+
+* Configure the __percentage of scoring traffic sent to each version__. For example, route 20% of the traffic to version 'test' and 80% to 'production'.
+
+> [!NOTE]
+> If you do not account for 100% of the traffic, any remaining percentage is routed to the __default__ version. For example, if you configure version 'test' to get 10% of the traffic, the remaining 90% is sent to the default version.
+>
+> The first version created is automatically configured as the default. You can change this by setting `is_default=True` when creating or updating a version.
+
+* Tag a version as either __control__ or __treatment__. For example, the current production version might be the control, while potential new models are deployed as treatments. After evaluating the performance of the treatment versions, if one outperforms the current control, it might be promoted to the new production/control.
+
+> [!NOTE]
+> You can only have __one__ control version. You can have multiple treatments.
+
+You can enable Application Insights to view operational metrics of endpoints and deployed versions.
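
The routing and tagging rules above can be sketched in plain Python. This is an illustrative model only, not the Azure Machine Learning SDK: `effective_traffic` and `validate_versions` are hypothetical helper names, and the dictionaries stand in for per-version settings such as the traffic percentage and the control/treatment tag.

```python
# Illustrative model of the rules described above; not part of the Azure ML SDK.
# All names here (effective_traffic, validate_versions) are hypothetical.

MAX_VERSIONS = 6  # the text states up to six versions per endpoint


def effective_traffic(configured, default):
    """Return the share of traffic each version actually receives.

    configured: dict mapping version name -> configured percentage.
    default: name of the default version, which absorbs unassigned traffic.
    """
    if default not in configured:
        raise ValueError("the default version must be one of the versions")
    total = sum(configured.values())
    if total > 100:
        raise ValueError("configured traffic exceeds 100%")
    result = dict(configured)
    result[default] += 100 - total  # leftover goes to the default version
    return result


def validate_versions(version_types):
    """Check the version-count and single-control rules.

    version_types: dict mapping version name -> "control" or "treatment".
    """
    if len(version_types) > MAX_VERSIONS:
        raise ValueError("an endpoint holds at most six versions")
    controls = [n for n, t in version_types.items() if t == "control"]
    if len(controls) > 1:
        raise ValueError("only one control version is allowed")


# 'test' is configured for 10%; the default 'production' is configured for 0%
# and picks up the remaining 90%.
print(effective_traffic({"production": 0, "test": 10}, default="production"))
validate_versions({"production": "control", "test": "treatment"})
```

In this sketch, the 'test' example from the note yields 90% for the default version and 10% for 'test', and a second control version would be rejected.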
 
 ### Create an endpoint
-Once you are ready to deploy your models, create a scoring endpoint and deploy your first version. The step below shows you how to deploy and create the endpoint using the SDK. The first deployment will be defined as the default version which means that unspecified traffic percentile across all versions will go to the default version.
+Once you are ready to deploy your models, create a scoring endpoint and deploy your first version. The following example shows how to create the endpoint and deploy the first version using the SDK. The first deployment is defined as the default version, which means that any traffic percentage not assigned to a specific version goes to the default version.
+
+> [!TIP]
+> In the following example, the configuration sets the first version to handle 20% of the traffic. Because it is the first version, it is also the default. No other versions exist yet to take the other 80% of traffic, so that traffic is also routed to the default version. Until other versions that take a percentage of traffic are deployed, this one effectively receives 100% of the traffic.
 
 ```python
 import azureml.core
@@ -242,8 +260,8 @@ from azureml.core.compute import ComputeTarget
 compute = ComputeTarget(ws, 'myaks')
 namespace_name= endpointnamespace
 # define the endpoint and version name
-endpoint_name ="mynewendpoint",
-version_name="versiona",
+endpoint_name ="mynewendpoint"
+version_name="versiona"
 # create the deployment config and define the scoring traffic percentile for the first deployment
-Add another version to your endpoint and configure the scoring traffic percentile going to the version. There are two types of versions, a control and a treatment version. There can be multiple treatment version to help compare against a single control version.
+Add another version to your endpoint and configure the percentage of scoring traffic routed to it. There are two types of versions: control and treatment. There can be multiple treatment versions to compare against a single control version.
+
+> [!TIP]
+> The first version is set to receive 20% of the traffic. It's also the default, so it gets any leftover traffic not handled by other versions. The second version, created by the next code snippet, accepts 10% of traffic. After it is created, the first version is configured for 20% of the traffic and the new version for 10%. The remaining 70% is also sent to the first version because it is the default, so it effectively receives 90% of the traffic.
-Update existing versions or delete them in an endpoint. You can change the version's default type, control type, and the traffic percentile.
+Update existing versions or delete them in an endpoint. You can change a version's default type, control type, and traffic percentage. In the following example, the second version increases its traffic to 40% and becomes the default.
+
+> [!TIP]
+> The second version is now the default and is configured for 40%, while the original version is configured for 20%. This leaves 40% of traffic unaccounted for by version configurations, and that traffic is also routed to the second version. It effectively receives 80% of the traffic.
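
To make the arithmetic in the three tips concrete, the following sketch replays the sequence in plain Python. It is a toy model under stated assumptions, not the SDK: `EndpointModel` and its methods are hypothetical names that only mimic the default-version rule. The name `versiona` comes from the example above, while `versionb` is an assumed name for the second version.

```python
# Toy model of the default-version rule; EndpointModel is a hypothetical
# class for illustration and is not the Azure ML SDK's endpoint class.

class EndpointModel:
    def __init__(self):
        self.traffic = {}    # version name -> configured percentage
        self.default = None  # name of the default version

    def create_version(self, name, traffic_percentile, is_default=False):
        self.traffic[name] = traffic_percentile
        # The first version created becomes the default automatically.
        if self.default is None or is_default:
            self.default = name

    def update_version(self, name, traffic_percentile=None, is_default=False):
        if name not in self.traffic:
            raise KeyError(name)
        if traffic_percentile is not None:
            self.traffic[name] = traffic_percentile
        if is_default:
            self.default = name

    def effective(self):
        # Unassigned traffic is added to the default version's share.
        leftover = 100 - sum(self.traffic.values())
        shares = dict(self.traffic)
        shares[self.default] += leftover
        return shares


endpoint = EndpointModel()

endpoint.create_version("versiona", traffic_percentile=20)
step1 = endpoint.effective()   # {'versiona': 100}

endpoint.create_version("versionb", traffic_percentile=10)  # name assumed
step2 = endpoint.effective()   # {'versiona': 90, 'versionb': 10}

endpoint.update_version("versionb", traffic_percentile=40, is_default=True)
step3 = endpoint.effective()   # {'versiona': 20, 'versionb': 80}

print(step1, step2, step3)
```

The three results match the tips: 100% for the only version, then 90%/10% after the second version is added, then 20%/80% once the second version becomes the default at 40%.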