Commit 9965b34

Merge pull request #57341 from rboucher/patch-37
editing and updating
2 parents d630de2 + 51b4f09 commit 9965b34

1 file changed: +21 -18 lines changed

articles/monitoring-and-diagnostics/alert-log-troubleshoot.md

Lines changed: 21 additions & 18 deletions
@@ -10,48 +10,50 @@ ms.author: vinagara
1010
ms.component: alerts
1111
---
1212
# Troubleshooting log alerts in Azure Monitor
13-
1413
## Overview
15-
This article shows you handle common issues seen while setting up log alerts inside Azure monitor. And provide solution to frequently asked questions regarding functionality or configuration of log alerts.
16-
The term **Log Alerts** to describe alerts where signal is custom query based on [Log Analytics](../log-analytics/log-analytics-tutorial-viewdata.md) or [Application Insights](../application-insights/app-insights-analytics.md). Learn more about functionality, terminology, and types from [Log alerts - Overview](monitor-alerts-unified-log.md).
14+
This article shows you how to resolve common issues seen when setting up log alerts in Azure monitor. It also provides solutions to frequently asked questions regarding functionality or configuration of log alerts.
15+
16+
The term **Log Alerts** to describe alerts that fire based on a custom query in [Log Analytics](../log-analytics/log-analytics-tutorial-viewdata.md) or [Application Insights](../application-insights/app-insights-analytics.md). Learn more about functionality, terminology, and types in [Log alerts - Overview](monitor-alerts-unified-log.md).
1717

1818
> [!NOTE]
19-
> This article doesn't consider cases when the alert rule is shown as triggered in Azure portal and notification via associated Action Group(s). For such cases, please refer to details in the article on [Action Groups](monitoring-action-groups.md).
19+
> This article doesn't consider cases when the Azure portal shows and alert rule triggered and a notification performed by an associated Action Group(s). For such cases, please refer to details in the article on [Action Groups](monitoring-action-groups.md).
2020
2121

2222
## Log alert didn't fire
2323

24-
Detailed next are some common reasons why a configured [log alert rule in Azure Monitor](alert-log.md) doesn't get triggered when viewed in [Azure Alerts](monitoring-alerts-managing-alert-states.md), when you expect it to be fired.
24+
Here are some common reasons why a configured [log alert rule in Azure Monitor](alert-log.md) state doesn't show [as *fired* when expected](monitoring-alerts-managing-alert-states.md).
2525

2626
### Data Ingestion time for Logs
27-
Log alert works by periodically running customer provided query based on [Log Analytics](../log-analytics/log-analytics-tutorial-viewdata.md) or [Application Insights](../application-insights/app-insights-analytics.md). Both are powered by the power of Analytics, which processes vast amounts of log data and provides functionality on the same. As the Log Analytics service involves processing many terabytes of data from thousands of customers and from varied sources across the world - the service is susceptible to time delay. For more information, see [Data ingestion time in Log Analytics](../log-analytics/log-analytics-data-ingestion-time.md).
27+
Log alert periodically runs your query based on [Log Analytics](../log-analytics/log-analytics-tutorial-viewdata.md) or [Application Insights](../application-insights/app-insights-analytics.md). Because Log Analytics processes many terabytes of data from thousands of customers from varied sources across the world, the service is susceptible to a varying time delay. For more information, see [Data ingestion time in Log Analytics](../log-analytics/log-analytics-data-ingestion-time.md).
2828

29-
To overcome the data ingestion delay that may occur in Log Analytics or Application Insights logs; log alert waits and retries after some time when it finds data is not yet ingested for the alerting time period. Log Alerts has an exponentially increasing wait time set, so as to make sure we wait necessary time for data to be ingested by Log Analytics. Hence if the logs queried by your log alert rule are affected by ingestion delays, then log alert will trigger only after the data is available in Log Analytics post-ingestion and after exponential time gap due to log alert service retrying multiple times in the interim.
29+
To mitigate data ingestion delay, the system waits and retries the alert query multiple times if it finds the needed data is not yet ingested. The system has an exponentially increasing wait time set. The log alert only triggers after the data is available so they delay could be due to slow log data ingestion.
3030

3131
### Incorrect time period configured
32-
As described in the article on [terminology for log alerts](monitor-alerts-unified-log.md#log-search-alert-rule---definition-and-types), time period stated in configuration specifies the time range for the query. The query returns only records that were created within this range of time. Time period restricts the data fetched for log query to prevent abuse and circumvents any time command (like ago) used in log query.
33-
*For example, If the time period is set to 60 minutes, and the query is run at 1:15 PM, only records created between 12:15 PM and 1:15 PM is returned to execute log query. Now if the log query uses time command like ago (1d), the log query would be run only for data between 12:15 PM and 1:15 PM - as if data exists for only the past 60 minutes. And not for seven days of data as specified in log query.*
32+
As described in the article on [terminology for log alerts](monitor-alerts-unified-log.md#log-search-alert-rule---definition-and-types), the time period stated in configuration specifies the time range for the query. The query returns only records that were created within this range of time. Time period restricts the data fetched for log query to prevent abuse and circumvents any time command (like ago) used in log query.
33+
*For example, If the time period is set to 60 minutes, and the query is run at 1:15 PM, only records created between 12:15 PM and 1:15 PM are used for the log query. If the log query uses a time command like *ago (1d)*, the query still only uses data between 12:15 PM and 1:15 PM because the time period is set to that interval.*
34+
35+
Therefore, check that time period in the configuration matches your query. For the example stated earlier, if the log query uses *ago (1d)* as shown with Green marker, then the time period should be set to 24 hours or 1440 minutes (as indicated in Red), to ensure the query executes as intended.
3436

35-
Based on your query logic, check if appropriate time period in the configuration has been provided. For the example stated earlier, if the log query uses ago (1d) as shown with Green marker - then the time period should be set to 24 hours or 1440 minutes (as indicated in Red), to ensure the query provided executes correctly as envisaged.
36-
![Time Period](./media/monitor-alerts-unified/LogAlertTimePeriod.png)
37+
![Time Period](./media/monitor-alerts-unified/LogAlertTimePeriod.png)
3738

3839
### Suppress Alerts option is set
39-
As described in step 8 of the article on [creating a log alert rule in Azure portal](alert-log.md#managing-log-alerts-from-the-azure-portal), log alert provides an option configure automatic suppression of the alert rule and prevent notification/trigger for stipulated amount of time. Suppress Alerts option will cause log alert to execute while not triggering action group for the time specified in **Suppress Alerts** option and hence user may feel that alert didn't fire while in actuality it was suppressed as configured.
40-
![Suppress Alerts](./media/monitor-alerts-unified/LogAlertSuppress.png)
40+
As described in step 8 of the article on [creating a log alert rule in Azure portal](alert-log.md#managing-log-alerts-from-the-azure-portal), log alerts provide a **Suppress Alerts** option to suppress triggering and notification actions for a configured amount of time. As a result, you may think that an alert didn't fire while in actuality it did, but was suppressed.
41+
42+
![Suppress Alerts](./media/monitor-alerts-unified/LogAlertSuppress.png)
4143

4244
### Metric measurement alert rule is incorrect
43-
Metric measurement type of log alert rule is subtype of log alerts, which have special capabilities but in turn employs restriction on the alert query syntax. Metric measurement log alert rule requires the output of alert query to provide a metric time series - a table with distinct equally sized time periods along with corresponding values of AggregatedValue computed. Additionally, users can choose to have in the table additional variables alongside AggregatedValue like Computer, Node, etc. using which data in the table can be sorted.
45+
**Metric measurement log alerts** are a subtype of log alerts, which have special capabilities and a restricted alert query syntax. A metric measurement log alert rule requires the query output to be a metric time series; that is, a table with distinct equally sized time periods along with corresponding aggregated values. Additionally, users can choose to have additional variables in the table alongside AggregatedValue. These variables may be used to sort the table.
4446

45-
For example, suppose metric measurement log alert rule was configured as:
47+
For example, suppose a metric measurement log alert rule was configured as:
4648
- query was: `search *| summarize AggregatedValue = count() by $table, bin(timestamp, 1h)`
4749
- time period of 6 hours
4850
- threshold of 50
4951
- alert logic of three consecutive breaches
5052
- Aggregate Upon chosen as $table
5153

52-
Now since in command, we have used summarize … by and provided two variables: timestamp & $table; alert service will choose $table to “Aggregate Upon” - basically sort the result table by field: $table - as shown below and then look at the multiple AggregatedValue for each table type (like availabilityResults) to see if there was consecutive breaches of 3 or more.
54+
Since the command includes *summarize … by* and provided two variables (timestamp & $table), the system chooses $table to “Aggregate Upon”. It sorts the result table by the field *$table* as shown below and then looks at the multiple AggregatedValue for each table type (like availabilityResults) to see if there was consecutive breaches of 3 or more.
5355

54-
![Metric Measurement query execution with multiple values](./media/monitor-alerts-unified/LogMMQuery.png)
56+
![Metric Measurement query execution with multiple values](./media/monitor-alerts-unified/LogMMQuery.png)
5557

5658
As “Aggregate Upon” is $table – the data is sorted on $table column (as in RED); then we group and look for types of “Aggregate Upon” field (that is) $table – for example: values for availabilityResults will be considered as one plot/entity (as highlighted in Orange). In this value plot/entity – alert service checks for three consecutive breaches occurring (as shown in Green) for which alert will get triggered for table value 'availabilityResults'. Similarly, if for any other value of $table if three consecutive breaches are seen - another alert notification will be triggered for the same; with alert service automatically sorting the values in one plot/entity (as in Orange) by time.
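
The grouping and consecutive-breach check described in this hunk can be sketched in a few lines of Python. This is only an illustration of the logic as the article text describes it, not the alert service's implementation. The function name, the `(group, time_bin, value)` row layout, and the choice of a strict "greater than" comparison for a breach are all assumptions made for the example.

```python
from collections import defaultdict


def groups_with_consecutive_breaches(rows, threshold, required):
    """Return the groups whose time-ordered values breach the threshold
    `required` times in a row.

    rows: iterable of (group, time_bin, aggregated_value) tuples, standing in
    for the query output summarized by $table and bin(timestamp, 1h).
    Hypothetical helper; a breach is assumed to mean value > threshold.
    """
    # Collect the values for each "Aggregate Upon" value (for example $table).
    series_by_group = defaultdict(list)
    for group, time_bin, value in rows:
        series_by_group[group].append((time_bin, value))

    fired = []
    for group, series in series_by_group.items():
        run = 0
        # Order each group's values by time before counting breaches.
        for _, value in sorted(series, key=lambda pair: pair[0]):
            run = run + 1 if value > threshold else 0
            if run >= required:
                fired.append(group)
                break
    return sorted(fired)


rows = [
    ("availabilityResults", 0, 60),
    ("availabilityResults", 1, 70),
    ("availabilityResults", 2, 80),
    ("requests", 0, 60),
    ("requests", 1, 10),
    ("requests", 2, 70),
    ("requests", 3, 90),
]
fired_groups = groups_with_consecutive_breaches(rows, threshold=50, required=3)
```

With the sample rows, only `availabilityResults` has three consecutive values above 50. `requests` has three breaching values in total, but the value of 10 interrupts the run, so it would not trigger under this reading of the alert logic.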
5759

@@ -81,4 +83,5 @@ What is shown in **query to be executed** section is what log alert service will
8183

8284
* Learn about [Log Alerts in Azure Alerts](monitor-alerts-unified-log.md)
8385
* Learn more about [Application Insights](../application-insights/app-insights-analytics.md)
84-
* Learn more about [Log Analytics](../log-analytics/log-analytics-queries.md).
86+
* Learn more about [Log Analytics](../log-analytics/log-analytics-overview.md).
87+
