
Commit 19e8a4f

edits per review feedback
1 parent 4824419 commit 19e8a4f

File tree

1 file changed: +36 −30 lines changed


articles/machine-learning/how-to-autoscale-endpoints.md

Lines changed: 36 additions & 30 deletions
@@ -9,7 +9,7 @@ author: msakande
 ms.author: mopeakande
 ms.reviewer: sehan
 ms.custom: devplatv2, cliv2, update-code
-ms.date: 07/29/2024
+ms.date: 08/07/2024

 #customer intent: As a developer, I want to autoscale online endpoints in Azure Machine Learning so I can control resource usage in my deployment based on metrics or schedules.
 ---
@@ -18,25 +18,31 @@ ms.date: 07/29/2024

 [!INCLUDE [dev v2](includes/machine-learning-dev-v2.md)]

-The autoscale process lets you automatically run the right amount of resources to handle the load on your application. [Online endpoints](concept-endpoints.md) in Azure Machine Learning support autoscaling through integration with the Azure Monitor autoscale feature.
+In this article, you learn to manage resource usage in a deployment by configuring autoscaling based on metrics and schedules. The autoscale process lets you automatically run the right amount of resources to handle the load on your application. [Online endpoints](concept-endpoints.md) in Azure Machine Learning support autoscaling through integration with the autoscale feature in Azure Monitor.

-Azure Monitor autoscaling provides a rich set of rules. You can configure metrics-based scaling (such as CPU utilization greater than 70%), schedule-based scaling (such as scaling rules for peak business hours), or a combination. For more information, see [Overview of autoscale in Microsoft Azure](../azure-monitor/autoscale/autoscale-overview.md).
+Azure Monitor autoscale allows you to set rules that trigger one or more autoscale actions when conditions of the rules are met. You can configure metrics-based scaling (such as CPU utilization greater than 70%), schedule-based scaling (such as scaling rules for peak business hours), or a combination of the two. For more information, see [Overview of autoscale in Microsoft Azure](../azure-monitor/autoscale/autoscale-overview.md).

 :::image type="content" source="media/how-to-autoscale-endpoints/concept-autoscale.png" border="false" alt-text="Diagram that shows how autoscale adds and removes instances as needed.":::

-You can currently manage autoscaling by using the Azure CLI, the REST APIs, Azure Resource Manager, or the browser-based Azure portal.
+You can currently manage autoscaling by using the Azure CLI, the REST APIs, Azure Resource Manager, the Python SDK, or the browser-based Azure portal.

 ## Prerequisites

 - A deployed endpoint. For more information, see [Deploy and score a machine learning model by using an online endpoint](how-to-deploy-online-endpoints.md).

 - To use autoscale, the role `microsoft.insights/autoscalesettings/write` must be assigned to the identity that manages autoscale. You can use any built-in or custom roles that allow this action. For general guidance on managing roles for Azure Machine Learning, see [Manage users and roles](how-to-assign-roles.md). For more on autoscale settings from Azure Monitor, see [Microsoft.Insights autoscalesettings](/azure/templates/microsoft.insights/autoscalesettings).

+- To use the Python SDK to manage the Azure Monitor service, install the `azure-mgmt-monitor` package with the following command:
+
+```console
+pip install azure-mgmt-monitor
+```
+
 ## Define autoscale profile

-To enable autoscale for a Machine Learning endpoint, you first define an autoscale profile. The profile specifies the default, minimum, and maximum scale set capacity. The following example shows how to set the number of virtual machine (VM) instances for the default, minimum, and maximum scale capacity.
+To enable autoscale for an online endpoint, you first define an autoscale profile. The profile specifies the default, minimum, and maximum scale set capacity. The following example shows how to set the number of virtual machine (VM) instances for the default, minimum, and maximum scale capacity.

-# [Azure CLI](#tab/azure-cli)
+# [Azure CLI](#tab/cli)

 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

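The CLI and Python SDK samples for this step live in includes and external files that the diff doesn't show. As a rough orientation only, a minimal Python SDK sketch of the profile definition described above might look like the following; the subscription, resource group, workspace, endpoint, and deployment names are placeholders, and the capacities (default and minimum of 2, maximum of 5) are taken from the instance counts the article's rules reference.

```python
import random

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import AutoscaleProfile

# Placeholder identifiers: substitute your own values.
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace_name = "<WORKSPACE_NAME>"
endpoint_name = "<ENDPOINT_NAME>"
deployment_name = "blue"

credential = DefaultAzureCredential()
ml_client = MLClient(credential, subscription_id, resource_group, workspace_name)
mon_client = MonitorManagementClient(credential, subscription_id)

# Resolve the endpoint and deployment so their ARM IDs can be used as autoscale targets.
endpoint = ml_client.online_endpoints.get(name=endpoint_name)
deployment = ml_client.online_deployments.get(name=deployment_name, endpoint_name=endpoint_name)

# Autoscale setting names must be unique within the resource group.
autoscale_settings_name = f"autoscale-{endpoint_name}-{deployment_name}-{random.randint(0, 1000)}"

# Profile with default, minimum, and maximum VM instance counts.
mon_client.autoscale_settings.create_or_update(
    resource_group,
    autoscale_settings_name,
    parameters={
        "location": endpoint.location,
        "target_resource_uri": deployment.id,
        "profiles": [
            AutoscaleProfile(
                name="my-scale-settings",
                capacity={"minimum": "2", "maximum": "5", "default": "2"},
                rules=[],
            )
        ],
    },
)
```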
@@ -55,7 +61,7 @@ To enable autoscale for a Machine Learning endpoint, you first define an autosca
 > [!NOTE]
 > For more information, see the [az monitor autoscale](/cli/azure/monitor/autoscale) reference.

-# [Python](#tab/python)
+# [Python SDK](#tab/python)

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

@@ -160,11 +166,11 @@ Leave the configuration pane open. In the next section, you configure the __Rule

 ---

-## Create scale out rule with deployment metrics
+## Create scale-out rule based on deployment metrics

-A common scaling out rule increases the number of VM instances when the average CPU load is high. The following example shows how to allocate two more nodes (up to the maximum) if the CPU average load is greater than 70% for 5 minutes:
+A common scale-out rule is to increase the number of VM instances when the average CPU load is high. The following example shows how to allocate two more nodes (up to the maximum) if the CPU average load is greater than 70% for 5 minutes:

-# [Azure CLI](#tab/azure-cli)
+# [Azure CLI](#tab/cli)

 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

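For reference, a scale-out rule like the one this section describes (average CPU above 70% for 5 minutes adds two instances, up to the maximum) might be expressed through the Python SDK roughly as in the sketch below. It reuses `mon_client`, `resource_group`, `autoscale_settings_name`, `endpoint`, and `deployment` from the profile sketch earlier; the metric name `CpuUtilizationPercentage` is an assumption to verify against the deployment metrics table the article links to.

```python
import datetime

from azure.mgmt.monitor.models import AutoscaleProfile, MetricTrigger, ScaleAction, ScaleRule

# Assumes mon_client, resource_group, autoscale_settings_name, endpoint, and deployment
# are defined as in the profile sketch above.
rule_scale_out = ScaleRule(
    metric_trigger=MetricTrigger(
        metric_name="CpuUtilizationPercentage",  # assumed deployment metric name
        metric_resource_uri=deployment.id,
        time_grain=datetime.timedelta(minutes=1),
        statistic="Average",
        operator="GreaterThan",
        threshold=70,
        time_aggregation="Average",
        time_window=datetime.timedelta(minutes=5),
    ),
    scale_action=ScaleAction(
        direction="Increase",
        type="ChangeCount",
        value="2",  # add two instances per action, capped by the profile maximum
        cooldown=datetime.timedelta(hours=1),
    ),
)

# Re-apply the autoscale setting with the scale-out rule attached to the profile.
mon_client.autoscale_settings.create_or_update(
    resource_group,
    autoscale_settings_name,
    parameters={
        "location": endpoint.location,
        "target_resource_uri": deployment.id,
        "profiles": [
            AutoscaleProfile(
                name="my-scale-settings",
                capacity={"minimum": "2", "maximum": "5", "default": "2"},
                rules=[rule_scale_out],
            )
        ],
    },
)
```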
@@ -175,7 +181,7 @@ The rule is part of the `my-scale-settings` profile, where `autoscale-name` matc
 > [!NOTE]
 > For more information, see the [az monitor autoscale](/cli/azure/monitor/autoscale) Azure CLI syntax reference.

-# [Python](#tab/python)
+# [Python SDK](#tab/python)

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

@@ -232,7 +238,7 @@ The rule is part of the `my-scale-settings` profile, where `autoscale-name` matc

 # [Studio](#tab/azure-studio)

-The following steps continue with the autoscaling configuration.
+The following steps continue with the autoscale configuration.

 1. For the __Rules__ option, select the __Add a rule__ link. The __Scale rule__ page opens.

@@ -254,17 +260,17 @@ Leave the configuration pane open. In the next section, you adjust the __Rules__

 ---

-## Create scale in rule with deployment metrics
+## Create scale-in rule based on deployment metrics

-When the average CPU load is light, a scale in rule can reduce the number of VM instances. The following example shows how to release a single node, down to a minimum of two, if the CPU load is less than 30% for 5 minutes.
+When the average CPU load is light, a scale-in rule can reduce the number of VM instances. The following example shows how to release a single node down to a minimum of two, if the CPU load is less than 30% for 5 minutes.

-# [Azure CLI](#tab/azure-cli)
+# [Azure CLI](#tab/cli)

 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-moe-autoscale.sh" ID="scale_in_on_cpu_util" :::

-# [Python](#tab/python)
+# [Python SDK](#tab/python)

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

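Along the same lines, the scale-in rule this section describes (release one instance when average CPU falls below 30% for 5 minutes) is just another `ScaleRule` with the opposite operator and direction. A hedged sketch, again assuming the variables and `rule_scale_out` from the earlier sketches and the assumed `CpuUtilizationPercentage` metric name:

```python
import datetime

from azure.mgmt.monitor.models import MetricTrigger, ScaleAction, ScaleRule

# Assumes deployment from the profile sketch above.
rule_scale_in = ScaleRule(
    metric_trigger=MetricTrigger(
        metric_name="CpuUtilizationPercentage",  # assumed deployment metric name
        metric_resource_uri=deployment.id,
        time_grain=datetime.timedelta(minutes=1),
        statistic="Average",
        operator="LessThan",
        threshold=30,
        time_aggregation="Average",
        time_window=datetime.timedelta(minutes=5),
    ),
    scale_action=ScaleAction(
        direction="Decrease",
        type="ChangeCount",
        value="1",  # release one instance per action, down to the profile minimum
        cooldown=datetime.timedelta(hours=1),
    ),
)

# Put rule_scale_in alongside rule_scale_out in the profile's rules list, then
# re-apply the setting with create_or_update as shown in the earlier sketches.
```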
@@ -338,7 +344,7 @@ The following steps adjust the __Rules__ configuration to support a scale in rul

 :::image type="content" source="media/how-to-autoscale-endpoints/scale-in-rule.png" lightbox="media/how-to-autoscale-endpoints/scale-in-rule.png" alt-text="Screenshot that shows how to configure the scale in rule for less than 30% CPU for 5 minutes.":::

-If you configure both scale-out and scale in rules, your rules look similar to the following screenshot. The rules specify that if average CPU load exceeds 70% for 5 minutes, two more nodes should be allocated, up to the limit of five. If CPU load is less than 30% for 5 minutes, a single node should be released, down to the minimum of two.
+If you configure both scale-out and scale-in rules, your rules look similar to the following screenshot. The rules specify that if average CPU load exceeds 70% for 5 minutes, two more nodes should be allocated, up to the limit of five. If CPU load is less than 30% for 5 minutes, a single node should be released, down to the minimum of two.

 :::image type="content" source="media/how-to-autoscale-endpoints/autoscale-rules-final.png" lightbox="media/how-to-autoscale-endpoints/autoscale-rules-final.png" alt-text="Screenshot that shows the autoscale settings including the scale in and scale-out rules.":::

@@ -348,15 +354,15 @@ Leave the configuration pane open. In the next section, you specify other scale

 ## Create scale rule based on endpoint metrics

-In the previous examples, you created rules to scale in or out based on deployment metrics. You can also create a rule that applies to the deployment endpoint. The following example shows how to allocate another node when the request latency is greater than an average of 70 milliseconds for 5 minutes.
+In the previous sections, you created rules to scale in or out based on deployment metrics. You can also create a rule that applies to the deployment endpoint. In this section, you learn how to allocate another node when the request latency is greater than an average of 70 milliseconds for 5 minutes.

-# [Azure CLI](#tab/azure-cli)
+# [Azure CLI](#tab/cli)

 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-moe-autoscale.sh" ID="scale_up_on_request_latency" :::

-# [Python](#tab/python)
+# [Python SDK](#tab/python)

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

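The endpoint-level variant swaps in an endpoint metric and the endpoint's resource URI. A minimal sketch for the latency rule described here (average request latency above 70 milliseconds for 5 minutes adds one instance); the metric name `RequestLatency` is an assumption to check against the linked metrics table, and the variables again come from the profile sketch:

```python
import datetime

from azure.mgmt.monitor.models import MetricTrigger, ScaleAction, ScaleRule

# Note that the trigger targets the endpoint's ARM ID instead of the deployment's.
rule_scale_out_endpoint = ScaleRule(
    metric_trigger=MetricTrigger(
        metric_name="RequestLatency",  # assumed endpoint metric name
        metric_resource_uri=endpoint.id,
        time_grain=datetime.timedelta(minutes=1),
        statistic="Average",
        operator="GreaterThan",
        threshold=70,
        time_aggregation="Average",
        time_window=datetime.timedelta(minutes=5),
    ),
    scale_action=ScaleAction(
        direction="Increase",
        type="ChangeCount",
        value="1",
        cooldown=datetime.timedelta(hours=1),
    ),
)

# Append this rule to the profile's rules list and re-apply the autoscale setting
# with create_or_update, as in the earlier sketches.
```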
@@ -440,21 +446,21 @@ The following steps continue the rule configuration on the __Custom autoscale__

 ---

-## Find supported Metrics IDs
+## Find IDs for supported metrics

 If you want to use other metrics in code to set up autoscale rules by using the Azure CLI or the SDK, see the table in [Available metrics](how-to-monitor-online-endpoints.md#available-metrics).

 ## Create scale rule based on schedule

-You can also create rules that apply only on certain days or at certain times. The following example creates a rule that sets the node count to 2 on the weekend.
+You can also create rules that apply only on certain days or at certain times. In this section, you create a rule that sets the node count to 2 on the weekends.

-# [Azure CLI](#tab/azure-cli)
+# [Azure CLI](#tab/cli)

 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-moe-autoscale.sh" ID="weekend_profile" :::

-# [Python](#tab/python)
+# [Python SDK](#tab/python)

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

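For the schedule-based case, Azure Monitor autoscale uses a recurring profile with a fixed capacity rather than a metric rule. The following is only a sketch of holding the node count at 2 on weekends; the time zone is a placeholder and the exact recurrence values are assumptions, with the other identifiers taken from the earlier sketches.

```python
from azure.mgmt.monitor.models import AutoscaleProfile, Recurrence, RecurrentSchedule

# A profile that recurs every Saturday and Sunday and pins the instance count at 2.
weekend_profile = AutoscaleProfile(
    name="weekend-profile",
    capacity={"minimum": "2", "maximum": "2", "default": "2"},
    rules=[],
    recurrence=Recurrence(
        frequency="Week",
        schedule=RecurrentSchedule(
            time_zone="Pacific Standard Time",  # placeholder time zone
            days=["Saturday", "Sunday"],
            hours=[0],
            minutes=[0],
        ),
    ),
)

# In practice you would list your default metric-based profile here as well.
mon_client.autoscale_settings.create_or_update(
    resource_group,
    autoscale_settings_name,
    parameters={
        "location": endpoint.location,
        "target_resource_uri": deployment.id,
        "profiles": [weekend_profile],
    },
)
```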
@@ -508,17 +514,17 @@ The following steps configure the rule with options on the __Custom autoscale__

 ---

-## Enable or disable autoscaling
+## Enable or disable autoscale

 You can enable or disable a specific autoscale profile.

-# [Azure CLI](#tab/azure-cli)
+# [Azure CLI](#tab/cli)

 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-moe-autoscale.sh" ID="disable_profile" :::

-# [Python](#tab/python)
+# [Python SDK](#tab/python)

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

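With the Python SDK, enabling or disabling is controlled by the `enabled` property on the autoscale setting. A minimal sketch, assuming the same variables and profile as earlier, would re-apply the setting with `enabled` set to `False`:

```python
from azure.mgmt.monitor.models import AutoscaleProfile

# Setting enabled to False turns autoscale off without deleting the profiles or rules;
# set it back to True to re-enable.
mon_client.autoscale_settings.create_or_update(
    resource_group,
    autoscale_settings_name,
    parameters={
        "location": endpoint.location,
        "target_resource_uri": deployment.id,
        "enabled": False,
        "profiles": [
            AutoscaleProfile(
                name="my-scale-settings",
                capacity={"minimum": "2", "maximum": "5", "default": "2"},
                rules=[],
            )
        ],
    },
)
```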
@@ -546,13 +552,13 @@ mon_client.autoscale_settings.create_or_update(

 If you're not going to use your deployments, delete the resources with the following steps.

-# [Azure CLI](#tab/azure-cli)
+# [Azure CLI](#tab/cli)

 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-moe-autoscale.sh" ID="delete_endpoint" :::

-# [Python](#tab/python)
+# [Python SDK](#tab/python)

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

@@ -567,7 +573,7 @@ ml_client.online_endpoints.begin_delete(endpoint_name)

 # [Studio](#tab/azure-studio)

-1. In [Azure Machine Learning studio](https://ml.azure.com), go to your workspace, and select __Endpoints__ from the left menu.
+1. In [Azure Machine Learning studio](https://ml.azure.com), go to your workspace and select __Endpoints__ from the left menu.

 1. In the list of endpoints, select the endpoint to delete (check the circle next to the model name).
