articles/azure-monitor/alerts/alerts-dynamic-thresholds.md
Lines changed: 34 additions & 36 deletions
@@ -1,6 +1,6 @@
 ---
 title: Create an Azure Monitor metric alert with dynamic thresholds
-description: Create metric alerts with machine learning-based dynamic thresholds.
+description: Get information about creating metric alerts with dynamic thresholds that are based on machine learning.
 author: AbbyMSFT
 ms.author: abbyweisberg
 ms.reviewer: yalavi
@@ -26,7 +26,7 @@ We recommend configuring alert rules with dynamic thresholds on these metrics:

 Dynamic thresholds help you:

-- Create scalable alerts for hundreds of metric series with one alert rule. If you have fewer alert rules, you spend less time creating and managing them. Scalable alerting is especially useful for multiple dimensions or for multiple resources, such as to all resources in a subscription.
+- Create scalable alerts for hundreds of metric series with one alert rule. If you have fewer alert rules, you spend less time creating and managing them. Scalable alerts are especially useful for multiple dimensions or for multiple resources, such as all resources in a subscription.
 - Create rules without having to know what threshold to configure.
 - Configure metric alerts by using high-level concepts without needing extensive domain knowledge about the metric.
 - Prevent noisy (low precision) or wide (low recall) thresholds that don't have an expected pattern.
@@ -36,7 +36,7 @@ You can use dynamic thresholds on:
 - Most Azure Monitor platform and custom metrics.
 - Common application and infrastructure metrics.
 - Noisy metrics, such as machine CPU or memory.
-- Metrics with low dispersion, such as availability and error rate
+- Metrics with low dispersion, such as availability and error rate.

 You can configure dynamic thresholds by using:

@@ -46,73 +46,71 @@ You can configure dynamic thresholds by using:

 ## Alert threshold calculation and preview

-When an alert rule is first created, dynamic thresholds use 10 days of historical data to calculate hourly or daily seasonal patterns. The chart that you see in the alert preview reflects that data.
+When an alert rule is created, dynamic thresholds use 10 days of historical data to calculate hourly or daily seasonal patterns. The chart that you see in the alert preview reflects that data.

-After an alert rule is created, dynamic thresholds continually use all available historical data to learn, and they make adjustments to be more accurate. After three weeks, dynamic thresholds have enough data to identify weekly patterns, and the model is adjusted to include weekly seasonality.
+Dynamic thresholds continually use all available historical data to learn, and they make adjustments to be more accurate. After three weeks, dynamic thresholds have enough data to identify weekly patterns, and the model is adjusted to include weekly seasonality.

 The system automatically recognizes prolonged outages and removes them from the threshold learning algorithm. If there's a prolonged outage, dynamic thresholds understand the data. They detect system issues with the same level of sensitivity as before the outage occurred.

 ## Considerations for using dynamic thresholds

-- To ensure accurate threshold calculation, alerts that use dynamic thresholds don't trigger an alert before collecting three days and at least 30 samples of metric data. New resources or resources that are missing metric data don't trigger an alert until enough data is available.
+- To ensure accurate threshold calculation, alert rules that use dynamic thresholds don't trigger an alert before collecting three days and at least 30 samples of metric data. New resources or resources that are missing metric data don't trigger an alert until enough data is available.
 - Dynamic thresholds need at least three weeks of historical data to detect weekly seasonality. Some detailed patterns, such as bihourly or semiweekly patterns, might not be detected.
 - If the behavior of a metric changed recently, the changes aren't immediately reflected in the dynamic threshold's upper and lower bounds. The borders are calculated based on metric data from the last 10 days. When you view the dynamic threshold's borders for a particular metric, look at the metric trend in the last week and not only for recent hours or days.
 - Dynamic thresholds are good for detecting significant deviations, as opposed to slowly evolving issues. Slow behavior changes probably won't trigger an alert.

 ## Known issues with dynamic threshold sensitivity

-- If an alert rule that uses dynamic thresholds is too noisy or fires too much, you might need to reduce the sensitivity of your alert rule. Use one of the following options:
+- If an alert rule that uses dynamic thresholds is too noisy or fires too much, you might need to reduce its sensitivity. Use one of the following options:

 - **Threshold sensitivity**: Set the sensitivity to **Low** to be more tolerant of deviations.
-- **Number of violations** (under **Advanced settings**): Configure the alert rule to be triggered only if several deviations occur within a certain period of time. This setting makes the rule less susceptible to transient deviations.
+- **Number of violations** (under **Advanced settings**): Configure the alert rule to trigger only if several deviations occur within a certain period of time. This setting makes the rule less susceptible to transient deviations.

-- You might encounter an alert rule that uses dynamic thresholds doesn't fire or isn't sensitive enough, even though it's configured with high sensitivity. This issue can happen when the metric's distribution is highly irregular. Consider one of the following solutions:
+- You might find that an alert rule that uses dynamic thresholds doesn't fire or isn't sensitive enough, even though it's configured with high sensitivity. This scenario can happen when the metric's distribution is highly irregular. Consider one of the following solutions:

 - Move to monitoring a complementary metric that's suitable for your scenario, if applicable. For example, check for changes in success rate rather than failure rate.
 - Try selecting a different value for **Aggregation granularity (Period)**.
-- Check if there has been a drastic change in the metric behavior in the last 10 days, such as an outage. An abrupt change can affect the upper and lower thresholds calculated for the metric and make them broader. Wait a few days until the outage is no longer taken into the thresholds calculation. You can also edit the alert rule to use the **Ignore data before** option in the**Advanced settings**.
-- If your data has weekly seasonality, but not enough history is available for the metric, the calculated thresholds can result in having broad upper and lower bounds. For example, the calculation can treat weekdays and weekends in the same way and build wide borders that don't always fit the data. This issue should resolve itself after enough metric history is available. Then, the correct seasonality is detected and the calculated thresholds update accordingly.
+- Check if a drastic change happened in the metric behavior in the last 10 days, such as an outage. An abrupt change can affect the upper and lower thresholds calculated for the metric and make them broader. Wait a few days until the outage is no longer included in the threshold calculation. You can also edit the alert rule to use the **Ignore data before** option in **Advanced settings**.
+- If your data has weekly seasonality, but not enough history is available for the metric, the calculated thresholds can result in broad upper and lower bounds. For example, the calculation can treat weekdays and weekends in the same way and build wide borders that don't always fit the data. This issue should resolve itself after enough metric history is available. Then, the correct seasonality is detected and the calculated thresholds are updated accordingly.

 - When a metric value exhibits large fluctuations, dynamic thresholds might build a wide model around the metric values, which can result in a lower or higher boundary than expected. This scenario can happen when:

 - The sensitivity is set to low.
 - The metric exhibits an irregular behavior with high variance, which appears as spikes or dips in the data.

-Consider making the model less sensitive by choosing a higher sensitivity or selecting a larger **Lookback period**. You can also use the **Ignore data before** option to exclude a recent irregularity from the historical data used to build the model.
+Consider making the model less sensitive by choosing a higher sensitivity or selecting a larger **Lookback period** value. You can also use the **Ignore data before** option to exclude a recent irregularity from the historical data that's used to build the model.

-## Configure dynamic thresholds
+## Configuration of dynamic thresholds

-Follow the procedure to [create or edit an alert rule](alerts-create-new-alert-rule.md#create-or-edit-an-alert-rule-in-the-azure-portal), by using these settings:
+To configure dynamic thresholds, follow the [procedure to create an alert rule](alerts-create-new-alert-rule.md#create-or-edit-an-alert-rule-in-the-azure-portal) by using these settings on the **Condition** tab:

-1. On the **Conditions** tab:
-1. In the **Thresholds** field, select **Dynamic**.
-1. In the **Aggregation type**, we recommend that you don't select **Maximum**.
-1. In the **Operator** field, select **Greater than** unless behavior represents the application usage.
-1. In **Threshold Sensitivity**, select **Medium** or **Low** to reduce alert noise.
-1. In the **Check every** field, select how often the alert rule checks if the condition is met. To minimize the the business impact of the alert, consider using a lower frequency. Make sure that the **check every** field is less than or equal to the **lookback period** field.
-1. In the **Lookback period**, set the time period to look back at each time the data is checked. Make sure that this **lookback period** is greater than or equal to the **Check every** field.
-1. In the **Advanced Settings**, choose how many violations will trigger the alert within a specific time period. Optionally, set the date from which to start learning the metric historical data and calculate the dynamic thresholds.
-1. Continue with the rest of the process to create an alert rule.
+- For **Threshold**, select **Dynamic**.
+- For **Aggregation type**, we recommend that you don't select **Maximum**.
+- For **Operator**, select **Greater than** unless the behavior represents the application usage.
+- For **Threshold sensitivity**, select **Medium** or **Low** to reduce alert noise.
+- For **Check every**, select how often the alert rule checks if the condition is met. To minimize the business impact of the alert, consider using a lower frequency. Make sure that this value is less than or equal to the **Lookback period** value.
+- For **Lookback period**, set the time period to look back at each time that the data is checked. Make sure that this value is greater than or equal to the **Check every** value.
+- For **Advanced options**, choose how many violations will trigger the alert within a specific time period. Optionally, set the date from which to start learning the metric historical data and calculate the dynamic thresholds.

 > [!NOTE]
-> Metric alert rules created through the portal are created in the same resource group as the target resource.
+> Metric alert rules that you create through the portal are created in the same resource group as the target resource.

-## Understand dynamic thresholds charts
+## Chart for dynamic thresholds

-The following chart shows a metric, its dynamic thresholds limits, and some alerts that fired when the value was outside the allowed thresholds.
+The following chart shows a metric, its dynamic threshold limits, and some alerts that fired when the value was outside the allowed thresholds.

-:::image type="content" source="media/alerts-dynamic-thresholds/threshold-picture-8bit.png" lightbox="media/alerts-dynamic-thresholds/threshold-picture-8bit.png" alt-text="Screenshot that shows a metric, its dynamic thresholds limits, and some alerts that fired.":::
+:::image type="content" source="media/alerts-dynamic-thresholds/threshold-picture-8bit.png" lightbox="media/alerts-dynamic-thresholds/threshold-picture-8bit.png" alt-text="Screenshot of a chart that shows a metric, its dynamic threshold limits, and some alerts that fired.":::

 Use the following information to interpret the chart:

-- **Blue line**: The actual measured metric over time.
-- **Blue shaded area**: Shows the allowed range for the metric. If the metric values stay within this range, no alert is triggered.
-- **Blue dots**: If you left select on part of the chart and then hover over the blue line, a blue dot appears under your cursor that shows an individual aggregated metric value.
-- **Pop-up with blue dot**: Shows the measured metric value (the blue dot) and the upper and lower values of the allowed range.
-- **Red dot with a black circle**: Shows the first metric value out of the allowed range. This value fires a metric alert and puts it in an active state.
-- **Red dots**: Indicate other measured values outside of the allowed range. They don't trigger more metric alerts, but the alert stays in the active state.
-- **Red area**: Shows the time when the metric value was outside of the allowed range. The alert remains in the active state as long as subsequent measured values are out of the allowed range, but no new alerts are fired.
-- **End of red area**: When the blue line is back inside the allowed values, the red area stops and the measured value line turns blue. The status of the metric alert fired at the time of the red dot with black outline is set to resolved.
+- **Blue line**: The metric measured over time.
+- **Blue shaded area**: The allowed range for the metric. If the metric values stay within this range, no alert is triggered.
+- **Blue dots**: Aggregated metric values. If you select part of the chart and then hover over the blue line, a blue dot appears under your cursor to indicate an individual aggregated metric value.
+- **Pop-up with blue dot**: The measured metric value (blue dot) and the upper and lower values of the allowed range.
+- **Red dot with a black circle**: The first metric value out of the allowed range. This value fires a metric alert and puts it in an active state.
+- **Red dots**: Other measured values outside the allowed range. They don't trigger more metric alerts, but the alert stays in the active state.
+- **Red area**: The time when the metric value was outside the allowed range. The alert remains in the active state as long as subsequent measured values are out of the allowed range, but no new alerts are fired.
+- **End of red area**: A return to allowed values. When the blue line is back inside the allowed values, the red area stops and the measured value line turns blue. The status of the metric alert fired at the time of the red dot with a black circle is set to resolved.

 ## Metrics not supported by dynamic thresholds

@@ -197,4 +195,4 @@ Dynamic thresholds are supported for most metrics, but the following metrics can

 - [Manage your alert rules](alerts-manage-alert-rules.md)
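
The chart legend in the `-46,73 +46,71` hunk describes a small alert lifecycle: the first value outside the allowed range fires the alert, later out-of-range values keep it active without firing again, and a return to the allowed range resolves it. A minimal Python sketch of that reading follows. It is not Azure Monitor's implementation. The `Sample` type and `trace_alert_states` function are hypothetical names, the bounds are treated as inclusive, and resolution on the first in-range value is an assumption taken from the legend wording, so the service might behave differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """One aggregated metric value plus its allowed range (hypothetical type)."""
    value: float  # a blue dot on the chart
    lower: float  # lower bound of the blue shaded area
    upper: float  # upper bound of the blue shaded area


def trace_alert_states(samples: List[Sample]) -> List[Tuple[int, str]]:
    """Return (index, event) pairs, where event is "fired" or "resolved"."""
    events: List[Tuple[int, str]] = []
    active = False
    for i, s in enumerate(samples):
        # Assumption: values exactly on a bound count as inside the range.
        in_range = s.lower <= s.value <= s.upper
        if not in_range and not active:
            # First value out of range: the red dot with a black circle.
            events.append((i, "fired"))
            active = True
        elif in_range and active:
            # Back inside the allowed range: the end of the red area.
            events.append((i, "resolved"))
            active = False
        # Other out-of-range values while active (red dots) add no new event.
    return events


samples = [
    Sample(10, 5, 15),  # in range
    Sample(18, 5, 15),  # out of range: alert fires
    Sample(20, 5, 15),  # still out of range: no new alert
    Sample(12, 5, 15),  # back in range: alert resolves
]
print(trace_alert_states(samples))  # [(1, 'fired'), (3, 'resolved')]
```

The sketch keeps one boolean of state per metric series, which is enough to reproduce the legend's rule that only the first out-of-range value produces a new alert.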