@@ -29,96 +28,101 @@ You must have configured an HDInsight cluster to use Azure Monitor logs, and add
29
28
Learn how to look for specific metrics for your HDInsight cluster.
30
29
31
30
1. Open the Log Analytics workspace that is associated with your HDInsight cluster from the Azure portal.
32
-
1. Select the **Log Search** tile.
33
-
1. Type the following query in the search box to search for all available metrics for all HDInsight clusters configured to use Azure Monitor logs, and then select **RUN**.
31
+
1. Under **General**, select **Logs**.
32
+
1. Type the following query in the search box to search for all available metrics for all HDInsight clusters configured to use Azure Monitor logs, and then select **Run**. Review the results.
34
33
35
-
search *
34
+
```kusto
35
+
search *
36
+
```
36
37
37
38

38
39
39
-
The output should look like the following:
40
-
41
-

42
-
43
-
1. From the left pane, under **Type**, select a metric that you want to dig deep into, and then select **Apply**. The following screenshot shows the `metrics_resourcemanager_queue_root_default_CL` type is selected.
40
+
1. From the left menu, select the **Filter** tab.
44
41
45
-
> [!NOTE]
46
-
> You may need to select the **[+]More** button to find the metric you are looking for. Also, the **Apply** button is at the bottom of the list so you must scroll down to see it.
47
-
48
-
Notice that the query in the text box changes to one shown in the highlighted box in the following screenshot:
42
+
1. Under **Type**, select **Heartbeat**. Then select **Apply & Run**.
49
43
50
44

51
45
52
-
1. Dig deeper into this specific metric. For example, you can refine the existing output based on the average of resources used in a 10-minute interval, categorized by cluster name, using the following query:
53
-
54
-
search in (metrics_resourcemanager_queue_root_default_CL) * | summarize AggregatedValue = avg(UsedAMResourceMB_d) by ClusterName_s, bin(TimeGenerated, 10m)
55
-
56
-
1. Instead of refining based on the average of resources used, you can use the following query to refine the results based on when the maximum resources were used (as well as 90th and 95th percentile) in a 10-minute window:
46
+
1. Notice that the query in the text box changes to:
1. You can dig deeper by using the options available in the left menu. For example:
61
54
62
-
Learn how to look for error messages during a specific time window. The steps here are just one example of how you can arrive at the error message you are interested in. You can use any available property to look for the errors you are trying to find.
55
+
- To see logs from a specific node:
63
56
64
-
1. Open the Log Analytics workspace that is associated to your HDInsight cluster from the Azure portal.
65
-
2. Select the **Log Search** tile.
66
-
3. Type the following query to search for all error messages for all HDInsight clusters configured to use Azure Monitor logs, and then select **RUN**.
67
-
68
-
search "Error"
57
+

69
58
70
-
You should see output like the following:
59
+
- To see logs at certain times:
71
60
72
-

61
+

73
62
74
-
4. From the left pane, under **Type** category, select an error type that you want to dig deep into, and then select **Apply**. Notice the results are refined to only show the error of the type you selected.
63
+
1. Select **Apply & Run** and review the results. Also note that the query was updated to:
75
64
76
-
5. You can dig deeper into this specific error list by using the options available in the left pane. For example:
65
+
```kusto
66
+
search *
67
+
| where Type == "Heartbeat"
68
+
| where (Computer == "zk2-myhado") and (TimeGenerated == "2019-12-02T23:15:02.69Z" or TimeGenerated == "2019-12-02T23:15:08.07Z" or TimeGenerated == "2019-12-02T21:09:34.787Z")
69
+
```
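
The query generated above matches exact timestamps. As a hedged alternative, a minimal sketch that filters the same node over a time range might look like the following. It queries the `Heartbeat` table directly instead of going through `search *`, and the time window shown is an illustrative assumption, not a value taken from this article.

```kusto
// Sketch only: the time window below is an assumed example range.
Heartbeat
| where Computer == "zk2-myhado"
| where TimeGenerated between (datetime(2019-12-02T21:00:00Z) .. datetime(2019-12-02T23:30:00Z))
| project TimeGenerated, Computer
| order by TimeGenerated asc
```

A range filter like this is usually easier to adjust by hand than a list of exact `TimeGenerated` values.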
77
70
78
-
- To see error messages from a specific worker node:
71
+
### Additional sample queries
79
72
80
-

73
+
A sample query based on the average of resources used in a 10-minute interval, categorized by cluster name:
81
74
82
-
    - To see an error that occurred at a certain time:
75
+
```kusto
76
+
search in (metrics_resourcemanager_queue_root_default_CL) *
77
+
| summarize AggregatedValue = avg(UsedAMResourceMB_d) by ClusterName_s, bin(TimeGenerated, 10m)
78
+
```
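
To see how `bin(TimeGenerated, 10m)` groups rows before averaging, the same `summarize` clause can be tried on a small inline table. This is only a sketch: the `datatable` rows and the cluster name `cluster-a` are made-up sample values standing in for `metrics_resourcemanager_queue_root_default_CL`.

```kusto
// Hypothetical sample rows; real data comes from metrics_resourcemanager_queue_root_default_CL.
datatable(TimeGenerated: datetime, ClusterName_s: string, UsedAMResourceMB_d: real)
[
    datetime(2019-12-02 23:01:00), "cluster-a", 1024.0,
    datetime(2019-12-02 23:07:00), "cluster-a", 2048.0,
    datetime(2019-12-02 23:12:00), "cluster-a", 4096.0,
]
// Rows at 23:01 and 23:07 fall into the 23:00 bin; the row at 23:12 falls into the 23:10 bin.
| summarize AggregatedValue = avg(UsedAMResourceMB_d) by ClusterName_s, bin(TimeGenerated, 10m)
```

With these sample rows, the 23:00 bin averages to 1536 and the 23:10 bin to 4096.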
83
79
84
-

80
+
Instead of refining based on the average of resources used, you can use the following query to refine the results based on when the maximum resources were used (as well as the 90th and 95th percentiles) in a 10-minute window:
85
81
86
-
6. To see the specific error, select **[+]show more** to look at the actual error message.
87
-
88
-

82
+
```kusto
83
+
search in (metrics_resourcemanager_queue_root_default_CL) *
The first step to create an alert is to arrive at a query based on which the alert is triggered. You can use any query that you want to create an alert.
93
90
94
91
1. Open the Log Analytics workspace that is associated with your HDInsight cluster from the Azure portal.
95
-
2. Select the **Log Search** tile.
96
-
3. Run the following query on which you want to create an alert, and then select **RUN**.
92
+
1. Under **General**, select **Logs**.
93
+
1. Enter the query on which you want to base the alert, such as the following, and then select **Run**.
97
94
98
-
metrics_resourcemanager_queue_root_default_CL | where AppsFailed_d > 0
95
+
```kusto
96
+
metrics_resourcemanager_queue_root_default_CL | where AppsFailed_d > 0
97
+
```
99
98
100
99
The query provides a list of failed applications running on HDInsight clusters.
101
100
102
-
4. Select **New Alert Rule** on the top of the page.
101
+
1. Select **New alert rule** on the top of the page.
103
102
104
103

105
104
106
-
5. In the **Create rule** window, enter the query and other details to create an alert, and then select **Create alert rule**.
105
+
1. In the **Create rule** window, enter the query and other details to create an alert, and then select **Create alert rule**.
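
The alert query used in the steps above returns every matching record. If a per-cluster view is more useful for an alert, one possible variation is sketched below. It reuses the `AppsFailed_d` and `ClusterName_s` columns that appear elsewhere in this article; the 10-minute bin and the `MaxAppsFailed` column name are assumptions chosen for illustration.

```kusto
// Sketch only: the 10-minute bin is an assumed window, not a recommended setting.
metrics_resourcemanager_queue_root_default_CL
| where AppsFailed_d > 0
| summarize MaxAppsFailed = max(AppsFailed_d) by ClusterName_s, bin(TimeGenerated, 10m)
```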
107
106
108
107

109
108
110
-
To edit or delete an existing alert:
109
+
### Edit or delete an existing alert
111
110
112
111
1. Open the Log Analytics workspace from the Azure portal.
113
-
2. From the left menu, select **Alert**.
114
-
3. Select the alert you want to edit or delete.
115
-
4. You have the following options: **Save**, **Discard**, **Disable**, and **Delete**.
112
+
113
+
1. From the left menu, under **Monitoring**, select **Alerts**.
114
+
115
+
1. Towards the top, select **Manage alert rules**.
116
+
117
+
1. Select the alert you want to edit or delete.
118
+
119
+
1. You have the following options: **Save**, **Discard**, **Disable**, and **Delete**.