articles/machine-learning/how-to-deploy-azure-kubernetes-service.md
35 additions & 7 deletions
@@ -228,10 +228,28 @@ For information on using VS Code, see [deploy to AKS via the VS Code extension](
228
228
> Deploying through VS Code requires the AKS cluster to be created or attached to your workspace in advance.
229
229
230
230
## Deploy models to AKS using controlled rollout (preview)
231
-
Analyze and promote model versions in a controlled fashion using endpoints. Deploy up to 6 versions behind a single endpoint and configure the % of scoring traffic to each deployed version. You can enable app insights to view operational metrics of endpoints and deployed versions.
231
+
232
+
Analyze and promote model versions in a controlled fashion using endpoints. You can deploy up to six versions behind a single endpoint. Endpoints provide the following capabilities:
233
+
234
+
* Configure the __percentage of scoring traffic sent to each endpoint version__. For example, route 20% of the traffic to endpoint version 'test' and 80% to 'production'.
235
+
236
+
> [!NOTE]
237
+
> If you do not account for 100% of the traffic, any remaining percentage is routed to the __default__ endpoint version. For example, if you configure endpoint version 'test' to get 10% of the traffic, and 'prod' for 30%, the remaining 60% is sent to the default endpoint version.
238
+
>
239
+
> The first endpoint version created is automatically configured as the default. You can change this by setting `is_default=True` when creating or updating an endpoint version.
240
+
241
+
* Tag an endpoint version as either __control__ or __treatment__. For example, the current production endpoint version might be the control, while potential new models are deployed as treatment versions. After evaluating performance of the treatment versions, if one outperforms the current control, it might be promoted to the new production/control.
242
+
243
+
> [!NOTE]
244
+
> You can only have __one__ control. You can have multiple treatments.
245
+
246
+
You can enable Application Insights to view operational metrics of endpoints and deployed versions.
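
The constraints listed above (at most six versions per endpoint and a single control) can be sketched as a small check in plain Python. This is an illustration only: `VersionSpec` and `check_endpoint` are hypothetical names, not part of the Azure Machine Learning SDK, and the "exactly one default" rule is an assumption inferred from the note that the first version becomes the default.

```python
from dataclasses import dataclass

MAX_VERSIONS = 6  # "up to six versions behind a single endpoint"


@dataclass
class VersionSpec:
    """Hypothetical stand-in for one endpoint version; not an SDK class."""
    name: str
    is_default: bool = False
    is_control: bool = False  # False means the version is a treatment


def check_endpoint(versions):
    """Return a list of rule violations; an empty list means the layout is valid."""
    problems = []
    if len(versions) > MAX_VERSIONS:
        problems.append(f"{len(versions)} versions exceed the limit of {MAX_VERSIONS}")
    if sum(v.is_control for v in versions) > 1:
        problems.append("more than one control version")
    # Assumption: an endpoint with any versions has exactly one default.
    if versions and sum(v.is_default for v in versions) != 1:
        problems.append("exactly one version must be the default")
    return problems


layout = [
    VersionSpec("production", is_default=True, is_control=True),
    VersionSpec("test"),  # treatment version
]
print(check_endpoint(layout))  # → []
```

Running a check like this before calling the SDK can catch a second control or a seventh version early, but the service itself remains the authority on what is allowed.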
232
247
233
248
### Create an endpoint
234
-
Once you are ready to deploy your models, create a scoring endpoint and deploy your first version. The step below shows you how to deploy and create the endpoint using the SDK. The first deployment will be defined as the default version which means that unspecified traffic percentile across all versions will go to the default version.
249
+
Once you are ready to deploy your models, create a scoring endpoint and deploy your first version. The following example shows how to deploy and create the endpoint using the SDK. The first deployment is defined as the default version, which means that any traffic percentage not assigned to a specific version goes to the default version.
250
+
251
+
> [!TIP]
252
+
> In the following example, the configuration sets the initial endpoint version to handle 20% of the traffic. Since this is the first endpoint version, it's also the default version. And since there are no other versions to take the other 80% of traffic, that traffic is routed to the default as well. Until other versions that take a percentage of traffic are deployed, this one effectively receives 100% of the traffic.
235
253
236
254
```python
237
255
import azureml.core
@@ -242,8 +260,8 @@ from azureml.core.compute import ComputeTarget
242
260
compute = ComputeTarget(ws, 'myaks')
243
261
namespace_name = "endpointnamespace"  # assumed to be a string literal; the original was unquoted
244
262
# define the endpoint and version name
245
-
endpoint_name ="mynewendpoint",
246
-
version_name="versiona",
263
+
endpoint_name ="mynewendpoint"
264
+
version_name="versiona"
247
265
# create the deployment config and define the scoring traffic percentile for the first deployment
Add another version to your endpoint and configure the scoring traffic percentile going to the version. There are two types of versions, a control and a treatment version. There can be multiple treatment version to help compare against a single control version.
280
+
Add another version to your endpoint and configure the percentage of scoring traffic that goes to it. There are two types of versions: control and treatment. There can be multiple treatment versions to compare against a single control version.
281
+
282
+
> [!TIP]
283
+
> The second version, created by the following code snippet, accepts 10% of traffic. The first version is configured for 20%, so only 30% of the traffic is configured for specific versions. The remaining 70% is sent to the first endpoint version, because it is also the default version.
Update existing versions or delete them in an endpoint. You can change the version's default type, control type, and the traffic percentile.
299
+
Update existing versions in an endpoint, or delete them. You can change whether a version is the default, whether it is the control, and its traffic percentage. In the following example, the second version increases its traffic to 40% and becomes the default.
300
+
301
+
> [!TIP]
302
+
> After the following code snippet, the second version is now default. It is now configured for 40%, while the original version is still configured for 20%. This means that 40% of traffic is not accounted for by version configurations. The leftover traffic will be routed to the second version, because it is now default. It effectively receives 80% of the traffic.
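
The traffic arithmetic described in the tips above can be checked with a small sketch that does not use the SDK. The helper `effective_traffic` and the version name `versionb` are hypothetical, chosen for illustration. The only rule modeled is the one stated in this article: any percentage not assigned to a specific version goes to the default version.

```python
def effective_traffic(configured, default):
    """Add any unassigned percentage to the default version's share.

    configured: dict mapping version name -> configured percentage.
    default: name of the default version (must be a key of configured).
    """
    leftover = 100 - sum(configured.values())
    if leftover < 0:
        raise ValueError("configured percentages exceed 100")
    shares = dict(configured)  # copy so the caller's dict is not modified
    shares[default] += leftover
    return shares


# Stage 1: only the first version exists, configured for 20%.
print(effective_traffic({"versiona": 20}, default="versiona"))
# → {'versiona': 100}

# Stage 2: a second version is added at 10%; the first is still the default.
print(effective_traffic({"versiona": 20, "versionb": 10}, default="versiona"))
# → {'versiona': 90, 'versionb': 10}

# Stage 3: the second version is raised to 40% and made the default.
print(effective_traffic({"versiona": 20, "versionb": 40}, default="versionb"))
# → {'versiona': 20, 'versionb': 80}
```

The three stages reproduce the figures in the tips: 100% for the single first version, then a 90/10 split, then 20/80 once the second version becomes the default.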