articles/machine-learning/how-to-autoscale-endpoints.md
12 additions & 13 deletions
Original file line number
Diff line number
Diff line change
@@ -9,7 +9,6 @@ author: msakande
9
9
ms.author: mopeakande
10
10
ms.reviewer: sehan
11
11
ms.custom: devplatv2, cliv2, update-code
12
-
13
12
ms.date: 07/29/2024
14
13
15
14
#customer intent: As a developer, I want to autoscale online endpoints in Azure Machine Learning so I can control resource usage in my deployment based on metrics or schedules.
@@ -21,7 +20,7 @@ ms.date: 07/29/2024
21
20
22
21
The autoscale process lets you automatically run the right amount of resources to handle the load on your application. [Online endpoints](concept-endpoints.md) in Azure Machine Learning support autoscaling through integration with the Azure Monitor autoscale feature.
23
22
24
-
Azure Monitor autoscaling provides a rich set of rules. You can configure metrics-based scaling, such as CPU utilization greater than 70%, schedule-based scaling, such as scaling rules for peak business hours, or a combination. For more information, see [Overview of autoscale in Microsoft Azure](../azure-monitor/autoscale/autoscale-overview.md).
23
+
Azure Monitor autoscaling provides a rich set of rules. You can configure metrics-based scaling (such as CPU utilization greater than 70%), schedule-based scaling (such as scaling rules for peak business hours), or a combination. For more information, see [Overview of autoscale in Microsoft Azure](../azure-monitor/autoscale/autoscale-overview.md).
25
24
26
25
:::image type="content" source="media/how-to-autoscale-endpoints/concept-autoscale.png" border="false" alt-text="Diagram that shows how autoscale adds and removes instances as needed.":::
27
26
@@ -93,7 +92,7 @@ To enable autoscale for a Machine Learning endpoint, you first define an autosca
93
92
mon_client = MonitorManagementClient(
94
93
credential, subscription_id
95
94
)
96
-
```
95
+
```
97
96
98
97
1. Get the endpoint and deployment objects:
99
98
@@ -132,19 +131,19 @@ To enable autoscale for a Machine Learning endpoint, you first define an autosca
132
131
]
133
132
}
134
133
)
135
-
```
134
+
```
136
135
137
136
# [Studio](#tab/azure-studio)
138
137
139
138
1. In [Azure Machine Learning studio](https://ml.azure.com), go to your workspace, and select __Endpoints__ from the left menu.
140
139
141
140
1. In the list of available endpoints, select the endpoint to configure:
142
141
143
-
:::image type="content" source="media/how-to-autoscale-endpoints/select-endpoint.png" alt-text="Screenshot that shows how to select an endpoint deployment entry for a Machine Learning workspace in the studio.":::
142
+
:::image type="content" source="media/how-to-autoscale-endpoints/select-endpoint.png" alt-text="Screenshot that shows how to select an endpoint deployment entry for a Machine Learning workspace in the studio." lightbox="media/how-to-autoscale-endpoints/select-endpoint.png":::
144
143
145
144
1. On the __Details__ tab for the selected endpoint, select __Configure auto scaling__:
146
145
147
-
:::image type="content" source="media/how-to-autoscale-endpoints/configure-auto-scaling.png" alt-text="Screenshot that shows how to select the option to configure autoscaling for an endpoint.":::
146
+
:::image type="content" source="media/how-to-autoscale-endpoints/configure-auto-scaling.png" alt-text="Screenshot that shows how to select the option to configure autoscaling for an endpoint." lightbox="media/how-to-autoscale-endpoints/configure-auto-scaling.png":::
148
147
149
148
1. For the __Choose how to scale your resources__ option, select __Custom autoscale__ to begin the configuration.
150
149
@@ -155,7 +154,7 @@ To enable autoscale for a Machine Learning endpoint, you first define an autosca
155
154
- __Instance limits__ > __Maximum__: Set the value to 5.
156
155
- __Instance limits__ > __Default__: Set the value to 2.
157
156
158
-
:::image type="content" source="media/how-to-autoscale-endpoints/choose-custom-autoscale.png" alt-text="Screenshot that shows how to configure the autoscale settings in the studio.":::
157
+
:::image type="content" source="media/how-to-autoscale-endpoints/choose-custom-autoscale.png" alt-text="Screenshot that shows how to configure the autoscale settings in the studio." lightbox="media/how-to-autoscale-endpoints/choose-custom-autoscale.png":::
159
158
160
159
Leave the configuration pane open. In the next section, you configure the __Rules__ settings.
161
160
@@ -203,7 +202,7 @@ The rule is part of the `my-scale-settings` profile, where `autoscale-name` matc
203
202
)
204
203
```
205
204
206
-
This rule refers to the last 5-minute average of the `CPUUtilizationpercentage` value from the arguments `metric_name`, `time_window` and `time_aggregation`. When the value of the metric is greater than the `threshold` of 70, two more VM instances are allocated.
205
+
This rule refers to the last 5-minute average of the `CPUUtilizationpercentage` value from the arguments `metric_name`, `time_window`, and `time_aggregation`. When the value of the metric is greater than the `threshold` of 70, the deployment allocates two more VM instances.
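The evaluation that this rule describes can be sketched in plain Python, without calling Azure Monitor. This is an illustrative model only: the function name `scale_out_change`, the `(timestamp, cpu_percent)` sample format, and the sample values are assumptions made for the example, not part of the Azure SDK. The real service collects and evaluates metrics on its own schedule.

```python
from datetime import datetime, timedelta
from statistics import mean


def scale_out_change(samples, now, window=timedelta(minutes=5),
                     threshold=70.0, increase=2):
    """Return how many instances to add under the scale-out rule.

    samples: iterable of (timestamp, cpu_percent) pairs (hypothetical format).
    """
    # Keep only the samples that fall inside the time window ending at `now`.
    recent = [cpu for ts, cpu in samples if now - window <= ts <= now]
    if not recent:
        return 0  # No data in the window, so take no action.
    # Average aggregation, then a strict "greater than" comparison.
    return increase if mean(recent) > threshold else 0


now = datetime(2024, 7, 29, 12, 0)
# Made-up CPU readings taken 1, 2, 4, and 9 minutes before `now`.
samples = [(now - timedelta(minutes=m), cpu)
           for m, cpu in [(1, 82.0), (2, 75.0), (4, 68.0), (9, 10.0)]]
print(scale_out_change(samples, now))  # prints 2
```

The reading from 9 minutes ago falls outside the 5-minute window, so the average is taken over the remaining three values (75), which exceeds the threshold of 70.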
207
206
208
207
1. Update the `my-scale-settings` profile to include this rule:
209
208
@@ -249,7 +248,7 @@ The following steps continue with the autoscaling configuration.
249
248
250
249
1. Select __Add__ to create the rule:
251
250
252
-
:::image type="content" source="media/how-to-autoscale-endpoints/scale-out-rule.png" lightbox="media/how-to-autoscale-endpoints/scale-out-rule.png" alt-text="Screenshot that shows how to configure the scaleout rule for greater than 70% CPU for 5 minutes.":::
251
+
:::image type="content" source="media/how-to-autoscale-endpoints/scale-out-rule.png" lightbox="media/how-to-autoscale-endpoints/scale-out-rule.png" alt-text="Screenshot that shows how to configure the scale-out rule for greater than 70% CPU for 5 minutes.":::
253
252
254
253
Leave the configuration pane open. In the next section, you adjust the __Rules__ settings.
255
254
@@ -339,9 +338,9 @@ The following steps adjust the __Rules__ configuration to support a scale in rul
339
338
340
339
:::image type="content" source="media/how-to-autoscale-endpoints/scale-in-rule.png" lightbox="media/how-to-autoscale-endpoints/scale-in-rule.png" alt-text="Screenshot that shows how to configure the scale in rule for less than 30% CPU for 5 minutes.":::
341
340
342
-
If you configure both scaleout and scale in rules, your rules look similar to the following screenshot. The rules specify that if average CPU load exceeds 70% for 5 minutes, two more nodes should be allocated, up to the limit of five. If CPU load is less than 30% for 5 minutes, a single node should be released, down to the minimum of two.
341
+
If you configure both scale-out and scale in rules, your rules look similar to the following screenshot. The rules specify that if average CPU load exceeds 70% for 5 minutes, two more nodes should be allocated, up to the limit of five. If CPU load is less than 30% for 5 minutes, a single node should be released, down to the minimum of two.
343
342
344
-
:::image type="content" source="media/how-to-autoscale-endpoints/autoscale-rules-final.png" lightbox="media/how-to-autoscale-endpoints/autoscale-rules-final.png" alt-text="Screenshot that shows the autoscale settings including the scale in and scaleout rules.":::
343
+
:::image type="content" source="media/how-to-autoscale-endpoints/autoscale-rules-final.png" lightbox="media/how-to-autoscale-endpoints/autoscale-rules-final.png" alt-text="Screenshot that shows the autoscale settings including the scale in and scale-out rules.":::
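The combined effect of the two rules and the instance limits can be modeled with a small Python function. This is a simplified sketch, not the Azure Monitor implementation: `next_instance_count` is a hypothetical helper, the CPU averages in the loop are made-up inputs, and real autoscale behavior also depends on settings such as cooldown periods, which this model ignores.

```python
def next_instance_count(current, avg_cpu, minimum=2, maximum=5):
    """Apply the scale-out and scale-in rules once and return the new count."""
    if avg_cpu > 70:
        target = current + 2   # Scale out by two instances.
    elif avg_cpu < 30:
        target = current - 1   # Scale in by one instance.
    else:
        target = current       # Between the thresholds: no change.
    # Clamp the result to the configured instance limits.
    return max(minimum, min(maximum, target))


count = 2
# Hypothetical 5-minute CPU averages for successive evaluations.
for avg_cpu in [85, 90, 50, 20, 20, 20, 20]:
    count = next_instance_count(count, avg_cpu)
    print(avg_cpu, count)
```

Starting from two instances, the count rises to four, is capped at five on the second scale-out, and then steps back down one instance at a time until it reaches the minimum of two.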
345
344
346
345
Leave the configuration pane open. In the next section, you specify other scale settings.
347
346
@@ -581,5 +580,5 @@ Alternatively, you can delete a managed online endpoint directly in the [endpoin