Commit 1b01405

Update proactive-failure-diagnostics.md
1 parent 1777bc8 commit 1b01405

1 file changed

+12
-52
lines changed

articles/azure-monitor/alerts/proactive-failure-diagnostics.md

Lines changed: 12 additions & 52 deletions
@@ -42,7 +42,6 @@ In the example shown before, the analysis discovered that most failures are abou
4242
When you instrument your service with these calls, the analyzer looks for an exception and a dependency failure that are associated with requests in the identified cluster. It also looks for an example of any trace logs associated with those requests. The alert you receive includes this additional information, which can provide context for the detection and hint at the root cause of the detected problem.
4343

4444
### Alert logic details
45-
4645
Failure Anomalies detection relies on a proprietary machine learning algorithm, so the reasons for an alert firing or not firing are not always deterministic. With that said, the primary factors that the algorithm uses are:
4746

4847
* Analysis of the failure percentage of requests/dependencies in a rolling time window of 20 minutes.
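
The detection algorithm itself is proprietary, so the following is only a minimal sketch of the one factor named above: a failure percentage computed over a rolling 20-minute window. The class name `RollingFailureRate` and its methods are hypothetical and are not part of Application Insights or any Azure SDK.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingFailureRate:
    """Hypothetical helper: failure percentage over a rolling time window."""

    def __init__(self, window=timedelta(minutes=20)):
        self.window = window
        self.events = deque()  # (timestamp, failed) pairs, oldest first
        self.failed = 0        # number of failed events currently in the window

    def add(self, timestamp, failed):
        """Record one request/dependency call and evict events older than the window."""
        self.events.append((timestamp, failed))
        if failed:
            self.failed += 1
        cutoff = timestamp - self.window
        while self.events and self.events[0][0] <= cutoff:
            _, old_failed = self.events.popleft()
            if old_failed:
                self.failed -= 1

    def failure_percentage(self):
        """Return the percentage of failed events in the current window."""
        if not self.events:
            return 0.0
        return 100.0 * self.failed / len(self.events)


# Example: timestamps are assumed to arrive in increasing order.
t0 = datetime(2024, 1, 1, 12, 0)
rate = RollingFailureRate()
rate.add(t0, True)
rate.add(t0 + timedelta(minutes=5), True)
rate.add(t0 + timedelta(minutes=10), False)
rate.add(t0 + timedelta(minutes=15), False)
before = rate.failure_percentage()   # 2 failures out of 4 events
rate.add(t0 + timedelta(minutes=22), False)
after = rate.failure_percentage()    # the event at t0 has left the window
```

A real detector would compare this value against a learned baseline rather than a fixed threshold; the sketch only shows how the window slides.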
@@ -54,61 +53,22 @@ Failure Anomalies detection relies on a proprietary machine learning algorithm,
5453
## Managing Failure Anomalies alert rules
5554

5655
### Alert rule creation
56+
A Failure Anomalies alert rule is created automatically when your Application Insights resource is created. The rule is automatically configured to analyze the telemetry on that resource.
57+
You can create the rule again using the Azure [REST API](https://learn.microsoft.com/rest/api/monitor/smart-detector-alert-rules?view=rest-monitor-2019-06-01) or a [Resource Manager template](./proactive-arm-config#failure-anomalies-alert-rule). This is useful if the automatic creation of the rule failed for some reason, or if you deleted the rule by accident or on purpose.
5758

58-
Failure Anomalies alert rule is automatically created together with your Application Insights resource and is automatically configured to analyze the telemetry on that resource.
59-
If for some reason the Failure Anomalies alert rule was not created, or if you deleted it by chance or on purpose, you can create the rule again using Azure [REST API](https://learn.microsoft.com/rest/api/monitor/smart-detector-alert-rules?view=rest-monitor-2019-06-01) or using a Resource Manager template.
60-
61-
Following is a example of a Resource Manager template for creating a Failure Anomalies alert rule.
62-
63-
```json
64-
{
65-
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
66-
"contentVersion": "1.0.0.0",
67-
"resources": [
68-
{
69-
"type": "microsoft.alertsmanagement/smartdetectoralertrules",
70-
"apiVersion": "2019-03-01",
71-
"name": "Response Latency Degradation - my-app",
72-
"location": "global",
73-
"properties": {
74-
"description": "Response Latency Degradation notifies you of an unusual increase in latency in your app response to requests.",
75-
"state": "Enabled",
76-
"severity": "2",
77-
"frequency": "PT1M",
78-
"detector": {
79-
"id": "FailureAnomaliesDetector",
80-
"parameters": null,
81-
"name": "Failure Anomalies",
82-
"supportedCadences": [
83-
1
84-
],
85-
"supportedResourceTypes": [
86-
"ApplicationInsights"
87-
],
88-
"parameterDefinitions": []
89-
},
90-
"scope": ["/subscriptions/00000000-1111-2222-3333-444444444444/resourceGroups/MyResourceGroup/providers/microsoft.insights/components/my-app"],
91-
"actionGroups": {
92-
"groupIds": ["/subscriptions/00000000-1111-2222-3333-444444444444/resourcegroups/MyResourceGroup/providers/microsoft.insights/actiongroups/MyActionGroup"]
93-
}
94-
}
95-
}
96-
]
97-
}
98-
```
9959
### Alert rule configuration
60+
To configure a Failure Anomalies alert rule in the portal, open the Alerts page and select Alert Rules. Failure Anomalies alert rules are included along with any alerts that you set manually.
10061

101-
You can disable Smart Detection alert rule from the portal or using Azure Resource Manager ([see template example](./proactive-arm-config.md)).
62+
:::image type="content" source="./media/proactive-failure-diagnostics/021.png" alt-text="On the Application Insights resource page, click Alerts tile, then Manage alert rules." lightbox="./media/proactive-failure-diagnostics/021.png":::
10263

103-
This alert rule is created with an associated [Action Group](./action-groups.md) named "Application Insights Smart Detection". By default, this action group is configured for Email Azure Resource Manager Role actions and sends notification to your subscription Azure Resource Manager Monitoring Contributor and Monitoring Reader users. You can modify this action group, remove it, or add more action group, just as for any other Azure alert rule. Notifications sent from this alert rule follow the [common alert schema](./alerts-common-schema.md).
64+
Click the alert rule to configure it.
10465

105-
To configure Failure Anomalies alert rules in the portal, open the Alerts page and select Alert Rules. Failure Anomalies alert rules are included along with any alerts that you have set manually, and you can see whether it is currently in the alert state.
66+
:::image type="content" source="./media/proactive-failure-diagnostics/032.png" alt-text="Rule configuration screen." lightbox="./media/proactive-failure-diagnostics/032.png":::
10667

107-
:::image type="content" source="./media/proactive-failure-diagnostics/021.png" alt-text="On the Application Insights resource page, click Alerts tile, then Manage alert rules." lightbox="./media/proactive-failure-diagnostics/021.png":::
68+
You can disable the Smart Detection alert rule from the portal or by using an Azure Resource Manager template ([see template example](./proactive-arm-config#failure-anomalies-alert-rule)).
10869

109-
Click the alert to configure it.
70+
This alert rule is created with an associated [Action Group](./action-groups.md) named "Application Insights Smart Detection". By default, this action group contains Email Azure Resource Manager Role actions and sends notifications to users who have the Monitoring Contributor or Monitoring Reader Azure Resource Manager role in your subscription. You can remove, change, or add the action groups that the rule triggers, as for any other Azure alert rule. Notifications sent from this alert rule follow the [common alert schema](./alerts-common-schema.md).
11071

111-
:::image type="content" source="./media/proactive-failure-diagnostics/032.png" alt-text="Rule configuration screen." lightbox="./media/proactive-failure-diagnostics/032.png":::
11272

11373
## Delete alerts
11474

@@ -425,17 +385,17 @@ Notice that if you delete an Application Insights resource, the associated Failu
425385

426386
An alert indicates that an abnormal rise in the failed request rate was detected. It's likely that there is some problem with your app or its environment.
427387

428-
To investigate further, click on 'View full details in Application Insights' the links in this page will take you straight to a [search page](../app/diagnostic-search.md) filtered to the relevant requests, exception, dependency, or traces.
388+
To investigate further, click 'View full details in Application Insights'. The links on this page take you straight to a [search page](../app/diagnostic-search.md) filtered to the relevant requests, exceptions, dependencies, or traces.
429389

430390
You can also open the [Azure portal](https://portal.azure.com), navigate to the Application Insights resource for your app, and open the Failures page.
431391

432-
Clicking on 'Diagnose failures' will help you get more details and resolve the issue.
392+
Clicking on 'Diagnose failures' can help you get more details and resolve the issue.
433393

434394
:::image type="content" source="./media/proactive-failure-diagnostics/051.png" alt-text="Diagnostic search." lightbox="./media/proactive-failure-diagnostics/051.png#lightbox":::
435395

436-
From the percentage of requests and number of users affected, you can decide how urgent the issue is. In the example shown before, the failure rate of 78.5% compares with a normal rate of 2.2%, indicates that something bad is going on. On the other hand, only 46 users were affected. If it was your app, you'd be able to assess how serious that is.
396+
From the percentage of requests and number of users affected, you can decide how urgent the issue is. In the example shown before, the failure rate of 78.5%, compared with a normal rate of 2.2%, indicates that something bad is going on. On the other hand, only 46 users were affected. This can help you to assess how serious the problem is.
437397

438-
In many cases, you will be able to diagnose the problem quickly from the request name, exception, dependency failure, and trace data provided.
398+
In many cases, you can diagnose the problem quickly from the request name, exception, dependency failure, and trace data provided.
439399

440400
In this example, there was an exception from SQL Database due to request limit being reached.
441401
