|
1 | 1 | ---
|
2 |
| -title: Autoscale settings in Azure Monitor |
| 2 | +title: Understanding autoscale settings in Azure Monitor |
3 | 3 | description: "A detailed breakdown of autoscale settings and how they work. Applies to Virtual Machines, Cloud Services, Web Apps"
|
| 4 | +author: EdB-MSFT |
4 | 5 | ms.topic: conceptual
|
5 |
| -ms.date: 12/18/2017 |
| 6 | +ms.date: 11/02/2022 |
6 | 7 | ms.subservice: autoscale
|
7 |
| -ms.reviewer: riroloff |
| 8 | +ms.custom: ignite-2022 |
| 9 | +ms.author: ebayansh |
8 | 10 | ---
|
9 |
| -# Autoscale settings |
| 11 | +# Understand autoscale settings |
10 | 12 |
|
11 |
| -Autoscale settings help you provision just the right mount of resources to support the load on your application. Configure autoscale settings to be triggered based on metrics that indicate load or performance, or triggered at a scheduled date and time. |
| 13 | +Autoscale settings help ensure that you have the right amount of resources running to handle the fluctuating load of your application. You can configure autoscale settings to be triggered based on metrics that indicate load or performance, or triggered at a scheduled date and time. |
12 | 14 |
|
13 |
| -This article gives a detailed description of the autoscale settings. |
| 15 | +This article gives a detailed explanation of the autoscale settings. |
14 | 16 |
|
15 | 17 | ## Autoscale setting schema
|
16 | 18 |
|
17 |
| -The [Autoscale ARM template resource definition](https://learn.microsoft.com/en-us/azure/templates/microsoft.insights/autoscalesettings?pivots=deployment-language-arm-template#autoscaleprofile-1) contains a description of all template elements. |
18 |
| - |
19 |
| -The example below has the following settings: |
20 |
| - |
21 |
| -- A single, default profile. |
22 |
| -- The profile has two rules. The first rule scales out, the second scales in. |
23 |
| -- The scale-out rule is triggered when the virtual machine scale set's average percentage CPU metric is greater than 85 percent for the past 10 minutes. |
24 |
| -- The scale-in rule is triggered when the virtual machine scale set's average is less than 60 percent for the past minute. |
| 19 | +The following example shows an autoscale setting. This autoscale setting has the following attributes: |
| 20 | +- A single default profile. |
| 21 | +- Two metric rules in this profile: one for scale-out, and one for scale-in. |
| 22 | + - The scale-out rule is triggered when the Virtual Machine Scale Set's average percentage CPU metric is greater than 85 percent for the past 10 minutes. |
| 23 | + - The scale-in rule is triggered when the Virtual Machine Scale Set's average is less than 60 percent for the past minute. |
25 | 24 |
|
26 | 25 | > [!NOTE]
|
27 | 26 | > A setting can have multiple profiles. To learn more, see the [profiles](#autoscale-profiles) section. A profile can also have multiple scale-out rules and scale-in rules defined. To see how they are evaluated, see the [evaluation](#autoscale-evaluation) section.
|
@@ -99,29 +98,28 @@ The table below describes the elements in the above autoscale setting's JSON.
|
99 | 98 | | profiles | name | |The name of the profile. You can choose any name that helps you identify the profile. |
|
100 | 99 | | profiles | capacity.maximum | Instance limits - Maximum |The maximum capacity allowed. It ensures that autoscale doesn't scale your resource above this number when executing the profile. |
|
101 | 100 | | profiles | capacity.minimum | Instance limits - Minimum |The minimum capacity allowed. It ensures that autoscale doesn't scale your resource below this number when executing the profile. |
|
102 |
| -| profiles | capacity.default | Instance limits - Default |If there's a problem reading the resource metric, and the current capacity is below the default, autoscale scales out to the default. This ensures the availability of the resource. If the current capacity is already higher than the default capacity, autoscale does not scale in. | |
| 101 | +| profiles | capacity.default | Instance limits - Default |If there's a problem reading the resource metric, and the current capacity is below the default, autoscale scales out to the default. This ensures the availability of the resource. If the current capacity is already higher than the default capacity, autoscale doesn't scale in. | |
103 | 102 | | profiles | rules | Rules |Autoscale automatically scales between the maximum and minimum capacities, by using the rules in the profile. You can have multiple rules in a profile. Typically there are two rules: one to determine when to scale out, and the other to determine when to scale in. |
|
104 | 103 | | rule | metricTrigger | Scale rule |Defines the metric condition of the rule. |
|
105 | 104 | | metricTrigger | metricName | Metric name |The name of the metric. |
|
106 |
| -| metricTrigger | metricResourceUri | |The resource ID of the resource that emits the metric. In most cases, it is the same as the resource being scaled. In some cases, it can be different. For example, you can scale a virtual machine scale set based on the number of messages in a storage queue. | |
| 105 | +| metricTrigger | metricResourceUri | |The resource ID of the resource that emits the metric. In most cases, it is the same as the resource being scaled. In some cases, it can be different. For example, you can scale a Virtual Machine Scale Set based on the number of messages in a storage queue. | |
107 | 106 | | metricTrigger | timeGrain | Time grain (minutes) |The metric sampling duration. For example, **TimeGrain = “PT1M”** means that the metrics should be aggregated every 1 minute, by using the aggregation method specified in the statistic element. |
|
108 | 107 | | metricTrigger | statistic | Time grain statistic |The aggregation method within the timeGrain period. For example, **statistic = “Average”** and **timeGrain = “PT1M”** means that the metrics should be aggregated every 1 minute, by taking the average. This property dictates how the metric is sampled. |
|
109 | 108 | | metricTrigger | timeWindow | Duration |The amount of time to look back for metrics. For example, **timeWindow = “PT10M”** means that every time autoscale runs, it queries metrics for the past 10 minutes. The time window allows your metrics to be normalized, and avoids reacting to transient spikes. |
|
110 | 109 | | metricTrigger | timeAggregation |Time aggregation |The aggregation method used to aggregate the sampled metrics. For example, **TimeAggregation = “Average”** should aggregate the sampled metrics by taking the average. In the preceding case, take the ten 1-minute samples, and average them. |
|
111 | 110 | | rule | scaleAction | Action |The action to take when the metricTrigger of the rule is triggered. |
|
112 | 111 | | scaleAction | direction | Operation |"Increase" to scale out, or "Decrease" to scale in.|
|
113 | 112 | | scaleAction | value |Instance count |How much to increase or decrease the capacity of the resource. |
|
114 |
| -| scaleAction | cooldown | Cool down (minutes)|The amount of time to wait after a scale operation before scaling again. For example, if **cooldown = “PT10M”**, autoscale does not attempt to scale again for another 10 minutes. The cooldown is to allow the metrics to stabilize after the addition or removal of instances. | |
| 113 | +| scaleAction | cooldown | Cool down (minutes)|The amount of time to wait after a scale operation before scaling again. For example, if **cooldown = “PT10M”**, autoscale doesn't attempt to scale again for another 10 minutes. The cooldown is to allow the metrics to stabilize after the addition or removal of instances. | |
| 114 | + |
115 | 115 |
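The interaction between `timeGrain`, `statistic`, `timeWindow`, and `timeAggregation` described in the table can be sketched as a two-stage aggregation. This is a minimal illustration, not the service's implementation: the function name `aggregate_metric` and the sample data are hypothetical, and it assumes "Average" for both stages with a 1-minute grain and a 10-minute window.

```python
from statistics import mean


def aggregate_metric(samples_per_grain, window_grains=10):
    """Hypothetical two-stage aggregation over raw metric samples.

    samples_per_grain: one inner list of raw samples per 1-minute grain,
    ordered oldest to newest.
    """
    # Stage 1 (statistic = "Average", timeGrain = "PT1M"):
    # collapse each 1-minute grain into a single value.
    per_grain = [mean(grain) for grain in samples_per_grain]
    # Stage 2 (timeAggregation = "Average", timeWindow = "PT10M"):
    # average the ten most recent 1-minute values.
    return mean(per_grain[-window_grains:])


# Illustrative data: five quieter minutes followed by five busier minutes.
raw_samples = [[80, 90]] * 5 + [[90, 100]] * 5
cpu_value = aggregate_metric(raw_samples)

# A scale-out rule with a threshold of 85 compares against the aggregated value.
scale_out_triggered = cpu_value > 85
```

Because the comparison uses the value aggregated over the whole window, a single high sample inside one grain doesn't trigger the rule on its own.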
|
116 | 116 | ## Autoscale profiles
|
117 | 117 |
|
118 | 118 | There are three types of autoscale profiles:
|
119 | 119 |
|
120 |
| -- **Regular profile:** The most common profile. If you don’t need to scale your resource based on the day of the week, or on a particular day, you can use a regular profile. This profile can then be configured with metric rules that dictate when to scale out and when to scale in. You should only have one regular profile defined. |
121 |
| - |
122 |
| - The example profile used earlier in this article is an example of a regular profile. Note that it is also possible to set a profile to scale to a static instance count for your resource. |
123 |
| - |
124 |
| -- **Fixed date profile:** This profile is for special cases. For example, let’s say you have an important event coming up on December 26, 2017 (PST). You want the minimum and maximum capacities of your resource to be different on that day, but still scale on the same metrics. In this case, you should add a fixed date profile to your setting’s list of profiles. The profile is configured to run only on the event’s day. For any other day, autoscale uses the regular profile. |
| 120 | +- **Default profile:** Use the default profile if you don’t need to scale your resource based on a particular date and time, or day of the week. You can only have one default profile. The sample profile used above is an example of a default profile. |
| 121 | + |
| 122 | +- **Fixed date profile:** This profile is relevant for a single date and time. Use the fixed date profile to set scaling rules for a specific event. The profile runs only on the event’s date and time. For all other times, autoscale uses the default profile. |
125 | 123 |
|
126 | 124 | ```json
|
127 | 125 | "profiles": [
|
@@ -160,164 +158,42 @@ There are three types of autoscale profiles:
|
160 | 158 | }
|
161 | 159 | ]
|
162 | 160 | ```
|
163 |
| - |
164 |
| -- **Recurrence profile:** This type of profile enables you to ensure that this profile is always used on a particular day of the week. Recurrence profiles only have a start time. They run until the next recurrence profile or fixed date profile is set to start. An autoscale setting with only one recurrence profile runs that profile, even if there is a regular profile defined in the same setting. The following two examples illustrate how this profile is used: |
165 |
| - |
166 |
| - **Example 1: Weekdays vs. weekends** |
167 |
| - |
168 |
| - Let’s say that on weekends, you want your maximum capacity to be 4. On weekdays, because you expect more load, you want your maximum capacity to be 10. In this case, your setting would contain two recurrence profiles, one to run on weekends and the other on weekdays. |
169 |
| - The setting looks like this: |
170 |
| - |
171 |
| - ``` JSON |
172 |
| - "profiles": [ |
173 |
| - { |
174 |
| - "name": "weekdayProfile", |
175 |
| - "capacity": { |
176 |
| - ... |
177 |
| - }, |
178 |
| - "rules": [{ |
179 |
| - ... |
180 |
| - }], |
181 |
| - "recurrence": { |
182 |
| - "frequency": "Week", |
183 |
| - "schedule": { |
184 |
| - "timeZone": "Pacific Standard Time", |
185 |
| - "days": [ |
186 |
| - "Monday" |
187 |
| - ], |
188 |
| - "hours": [ |
189 |
| - 0 |
190 |
| - ], |
191 |
| - "minutes": [ |
192 |
| - 0 |
193 |
| - ] |
194 |
| - } |
195 |
| - }} |
196 |
| - }, |
197 |
| - { |
198 |
| - "name": "weekendProfile", |
199 |
| - "capacity": { |
200 |
| - ... |
201 |
| - }, |
202 |
| - "rules": [{ |
203 |
| - ... |
204 |
| - }] |
205 |
| - "recurrence": { |
206 |
| - "frequency": "Week", |
207 |
| - "schedule": { |
208 |
| - "timeZone": "Pacific Standard Time", |
209 |
| - "days": [ |
210 |
| - "Saturday" |
211 |
| - ], |
212 |
| - "hours": [ |
213 |
| - 0 |
214 |
| - ], |
215 |
| - "minutes": [ |
216 |
| - 0 |
217 |
| - ] |
218 |
| - } |
219 |
| - } |
220 |
| - }] |
221 |
| - ``` |
222 | 161 |
|
223 |
| - The preceding setting shows that each recurrence profile has a schedule. This schedule determines when the profile starts running. The profile stops when it’s time to run another profile. |
| 162 | +- **Recurrence profile:** A recurrence profile is used for a day or set of days of the week. The schema for a recurring profile doesn't include an end date. The end date and time for a recurring profile is set by the start time of the following profile. When using the portal to configure recurring profiles, the default profile is automatically duplicated, and restarts at the time that you specify for the recurring profile to end. For more information on configuring multiple profiles, see [Autoscale with multiple profiles](./autoscale-multiprofile.md). |
224 | 163 |
|
225 |
| - For example, in the preceding setting, “weekdayProfile” is set to start on Monday at 12:00 AM. That means this profile starts running on Monday at 12:00 AM. It continues until Saturday at 12:00 AM, when “weekendProfile” is scheduled to start running. |
226 |
| - |
227 |
| - **Example 2: Business hours** |
228 |
| - |
229 |
| - Let's say you want to have one metric threshold during business hours (9:00 AM to 5:00 PM), and a different one for all other times. The setting would look like this: |
230 |
| - |
231 |
| - ``` JSON |
232 |
| - "profiles": [ |
233 |
| - { |
234 |
| - "name": "businessHoursProfile", |
235 |
| - "capacity": { |
236 |
| - ... |
237 |
| - }, |
238 |
| - "rules": [{ |
239 |
| - ... |
240 |
| - }], |
241 |
| - "recurrence": { |
242 |
| - "frequency": "Week", |
243 |
| - "schedule": { |
244 |
| - "timeZone": "Pacific Standard Time", |
245 |
| - "days": [ |
246 |
| - "Monday", “Tuesday”, “Wednesday”, “Thursday”, “Friday” |
247 |
| - ], |
248 |
| - "hours": [ |
249 |
| - 9 |
250 |
| - ], |
251 |
| - "minutes": [ |
252 |
| - 0 |
253 |
| - ] |
254 |
| - } |
255 |
| - } |
256 |
| - }, |
257 |
| - { |
258 |
| - "name": "nonBusinessHoursProfile", |
259 |
| - "capacity": { |
260 |
| - ... |
261 |
| - }, |
262 |
| - "rules": [{ |
263 |
| - ... |
264 |
| - }] |
265 |
| - "recurrence": { |
266 |
| - "frequency": "Week", |
267 |
| - "schedule": { |
268 |
| - "timeZone": "Pacific Standard Time", |
269 |
| - "days": [ |
270 |
| - "Monday", “Tuesday”, “Wednesday”, “Thursday”, “Friday” |
271 |
| - ], |
272 |
| - "hours": [ |
273 |
| - 17 |
274 |
| - ], |
275 |
| - "minutes": [ |
276 |
| - 0 |
277 |
| - ] |
278 |
| - } |
279 |
| - } |
280 |
| - }] |
281 |
| - ``` |
282 |
| - |
283 |
| - The preceding setting shows that “businessHoursProfile” begins running on Monday at 9:00 AM, and continues to 5:00 PM. That’s when “nonBusinessHoursProfile” starts running. The “nonBusinessHoursProfile” runs until 9:00 AM Tuesday, and then the “businessHoursProfile” takes over again. This repeats until Friday at 5:00 PM. At that point, “nonBusinessHoursProfile” runs all the way to Monday at 9:00 AM. |
284 |
| - |
285 |
| -> [!Note] |
286 |
| -> The autoscale user interface in the Azure portal enforces end times for recurrence profiles, and begins running the autoscale setting's default profile in between recurrence profiles. |
287 |
| - |
288 | 164 | ## Autoscale evaluation
|
289 |
| -Given that Autoscale settings can have multiple profiles, and each profile can have multiple metric rules, it is important to understand how an Autoscale setting is evaluated. The Autoscale job runs every 30 to 60 seconds, depending on the resource type. Each time the Autoscale job runs, it begins by choosing the profile that is applicable. Then Autoscale evaluates the minimum and maximum values, and any metric rules in the profile, and decides if a scale action is necessary. |
290 |
| - |
291 | 165 |
|
| 166 | +Autoscale settings can have multiple profiles. Each profile can have multiple metric rules. Each time the autoscale job runs, it begins by choosing the applicable profile for that time. Then autoscale evaluates the minimum and maximum values, any metric rules in the profile, and decides if a scale action is necessary. The autoscale job runs every 30 to 60 seconds, depending on the resource type. |
292 | 167 |
|
293 |
| -Given that Autoscale settings can have multiple profiles, and each profile can have multiple metric rules, it is important to understand how an Autoscale setting is evaluated. The Autoscale job runs every 30 to 60 seconds, depending on the resource type. Each time the Autoscale job runs, it begins by choosing the profile that is applicable. Then Autoscale evaluates the minimum and maximum values, and any metric rules in the profile, and decides if a scale action is necessary. |
294 |
| -### Which profile will autoscale pick? |
| 168 | +### Which profile will autoscale use? |
295 | 169 |
|
296 |
| -Autoscale uses the following sequence to pick the profile: |
297 |
| -1. It first looks for any fixed date profile that is configured to run now. If there is,autoscale runs it. If there are multiple fixed date profiles that are supposed to run,autoscale selects the first one. |
298 |
| -2. If there are no fixed date profiles,autoscale looks at recurrence profiles. If a recurrence profile is found, it runs it. |
299 |
| -3. If there are no fixed date or recurrence profiles,autoscale runs the regular profile. |
| 170 | +Each time the autoscale service runs, the profiles are evaluated in the following order: |
300 | 171 |
|
301 |
| -### How does autoscale evaluate multiple rules? |
| 172 | +1. Fixed date profiles |
| 173 | +1. Recurring profiles |
| 174 | +1. Default profile |
302 | 175 |
|
303 |
| -After autoscale determines which profile to run, it evaluates all the scale-out rules in the profile (these are rules with **direction = “Increase”**). |
| 176 | +The first suitable profile found will be used. |
304 | 177 |
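The selection order above can be sketched as a priority search. This is a simplified illustration under stated assumptions: the `kind` and `active` fields and the `pick_profile` function are hypothetical stand-ins for however the service decides that a profile applies at the current time, and aren't part of the autoscale schema.

```python
# Evaluation order: fixed date first, then recurrence, then the default profile.
PRIORITY = ("fixedDate", "recurrence", "default")


def pick_profile(profiles):
    """Return the first suitable profile, checking kinds in priority order."""
    for kind in PRIORITY:
        for profile in profiles:
            if profile["kind"] != kind:
                continue
            # The default profile is the fallback, so it needs no schedule check.
            if kind == "default" or profile["active"]:
                return profile
    return None


# Hypothetical setting: "active" marks whether a schedule covers the current time.
profiles = [
    {"name": "defaultProfile", "kind": "default"},
    {"name": "weekendProfile", "kind": "recurrence", "active": True},
    {"name": "eventProfile", "kind": "fixedDate", "active": False},
]
chosen = pick_profile(profiles)
```

In this sketch, the order of the `profiles` list only matters between profiles of the same kind. A fixed date profile that applies now is chosen ahead of a recurrence profile regardless of where it appears in the list.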
|
305 |
| -If one or more scale-out rules are triggered,autoscale calculates the new capacity determined by the **scaleAction** of each of those rules. Then it scales out to the maximum of those capacities, to ensure service availability. |
| 178 | +### How does autoscale evaluate multiple rules? |
306 | 179 |
|
307 |
| -For example, let's say there is a virtual machine scale set with a current capacity of 10. There are two scale-out rules: one that increases capacity by 10 percent, and one that increases capacity by 3 counts. The first rule would result in a new capacity of 11, and the second rule would result in a capacity of 13. To ensure service availability,autoscale chooses the action that results in the maximum capacity, so the second rule is chosen. |
| 180 | +After autoscale determines which profile to run, it evaluates the scale-out rules in the profile, that is, where **direction = “Increase”**. |
| 181 | +If one or more scale-out rules are triggered, autoscale calculates the new capacity determined by the **scaleAction** specified for each of those rules. If there's more than one scale-out rule triggered, autoscale scales to the maximum specified capacity to ensure service availability. |
| 182 | +For example, suppose there are two rules: rule 1 specifies scaling out by 3 instances, and rule 2 specifies scaling out by 5 instances. If both rules are triggered, autoscale scales out by 5 instances. Similarly, if one rule specifies scaling out by 3 instances and another specifies scaling out by 15 percent, the higher of the two resulting instance counts is used. |
308 | 183 |
|
309 |
| -If no scale-out rules are triggered,autoscale evaluates all the scale-in rules (rules with **direction = “Decrease”**). Autoscale only takes a scale-in action if all of the scale-in rules are triggered. |
| 184 | +If no scale-out rules are triggered, autoscale evaluates all the scale-in rules, that is, rules with **direction = “Decrease”**. Autoscale only scales in if all of the scale-in rules are triggered. |
310 | 185 |
|
311 |
| -Autoscale calculates the new capacity determined by the **scaleAction** of each of those rules. Then it chooses the scale action that results in the maximum of those capacities to ensure service availability. |
| 186 | +Autoscale calculates the new capacity determined by the **scaleAction** of each of those rules. To ensure service availability, autoscale scales in by as little as possible, choosing the action that results in the maximum capacity. For example, assume a current capacity of 10 instances and two scale-in rules: one that decreases capacity by 50 percent, and one that decreases capacity by 3 instances. The first rule results in 5 instances and the second results in 7, so autoscale scales in to 7 instances. |
312 | 187 |
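The rule evaluation described in this section can be sketched as follows. This is an illustration only: the rule dictionaries, the `unit` and `triggered` fields, and the function names are hypothetical, rounding of percentage changes is an assumption, and flapping avoidance isn't modeled.

```python
def apply_action(current, rule):
    """Capacity that a single rule's scale action would produce."""
    if rule["unit"] == "percent":
        # Assumption: percentage changes are rounded to whole instances.
        delta = round(current * rule["value"] / 100)
    else:
        delta = rule["value"]
    if rule["direction"] == "Increase":
        return current + delta
    return current - delta


def evaluate(current, rules, minimum, maximum):
    """Return the target capacity for one autoscale evaluation."""
    scale_out = [r for r in rules if r["direction"] == "Increase"]
    triggered_out = [r for r in scale_out if r["triggered"]]
    if triggered_out:
        # Any triggered scale-out rule is enough; take the largest result.
        target = max(apply_action(current, r) for r in triggered_out)
        return min(target, maximum)

    scale_in = [r for r in rules if r["direction"] == "Decrease"]
    if scale_in and all(r["triggered"] for r in scale_in):
        # Scale in only if every scale-in rule is triggered, and by as
        # little as possible: keep the largest resulting capacity.
        target = max(apply_action(current, r) for r in scale_in)
        return max(target, minimum)

    return current  # No scale action is needed.


# Two scale-in rules, both triggered, with a current capacity of 10.
scale_in_rules = [
    {"direction": "Decrease", "unit": "percent", "value": 50, "triggered": True},
    {"direction": "Decrease", "unit": "count", "value": 3, "triggered": True},
]
result = evaluate(10, scale_in_rules, minimum=2, maximum=20)
```

The asymmetry is the design point: scale-out needs only one triggered rule, while scale-in needs all of them, and in both directions the larger resulting capacity wins.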
|
313 |
| -For example, let's say there is a virtual machine scale set with a current capacity of 10. There are two scale-in rules: one that decreases capacity by 50 percent, and one that decreases capacity by 3 counts. The first rule would result in a new capacity of 5, and the second rule would result in a capacity of 7. To ensure service availability, autoscale chooses the action that results in the maximum capacity, so the second rule is chosen. |
| 188 | +Each time autoscale calculates the result of a scale action, it evaluates whether that action would trigger the opposite scale action. The scenario where a scale action triggers the opposite scale action is known as flapping. Autoscale may defer a scale action to avoid flapping, or may scale by a number less than what was specified in the rule. For more information on flapping, see [Flapping in Autoscale](./autoscale-custom-metric.md). |
314 | 189 |
|
315 | 190 | ## Next steps
|
316 |
| -Learn more about autoscale by referring to the following: |
| 191 | + |
| 192 | +Learn more about autoscale by referring to the following articles: |
317 | 193 |
|
318 | 194 | * [Overview of autoscale](./autoscale-overview.md)
|
319 | 195 | * [Azure Monitor autoscale common metrics](./autoscale-common-metrics.md)
|
320 |
| -* [Best practices for Azure Monitor autoscale](./autoscale-best-practices.md) |
| 196 | +* [Autoscale with multiple profiles](./autoscale-multiprofile.md) |
| 197 | +* [Flapping in Autoscale](./autoscale-custom-metric.md) |
321 | 198 | * [Use autoscale actions to send email and webhook alert notifications](./autoscale-webhook-email.md)
|
322 | 199 | * [Autoscale REST API](/rest/api/monitor/autoscalesettings)
|
323 |
| - |