Commit 573a0ef

Merge pull request #97575 from dagiro/freshness89
freshness89
2 parents bb24e86 + f4e6d6c commit 573a0ef

12 files changed: +53 -49 lines changed

articles/hdinsight/hdinsight-hadoop-oms-log-analytics-use-queries.md

Lines changed: 53 additions & 49 deletions
@@ -5,17 +5,16 @@ author: hrasheed-msft
 ms.author: hrasheed
 ms.reviewer: jasonh
 ms.service: hdinsight
-ms.custom: hdinsightactive
 ms.topic: conceptual
-ms.date: 11/05/2018
+ms.custom: hdinsightactive
+ms.date: 12/02/2019
 ---
 
 # Query Azure Monitor logs to monitor HDInsight clusters
 
 Learn some basic scenarios on how to use Azure Monitor logs to monitor Azure HDInsight clusters:
 
 * [Analyze HDInsight cluster metrics](#analyze-hdinsight-cluster-metrics)
-* [Search for specific log messages](#search-for-specific-log-messages)
 * [Create event alerts](#create-alerts-for-tracking-events)
 
 [!INCLUDE [azure-monitor-log-analytics-rebrand](../../includes/azure-monitor-log-analytics-rebrand.md)]
@@ -29,96 +28,101 @@ You must have configured an HDInsight cluster to use Azure Monitor logs, and add
 Learn how to look for specific metrics for your HDInsight cluster.
 
 1. Open the Log Analytics workspace that is associated to your HDInsight cluster from the Azure portal.
-1. Select the **Log Search** tile.
-1. Type the following query in the search box to search for all metrics for all available metrics for all HDInsight clusters configured to use Azure Monitor logs, and then select **RUN**.
+1. Under **General**, select **Logs**.
+1. Type the following query in the search box to search for all metrics for all available metrics for all HDInsight clusters configured to use Azure Monitor logs, and then select **Run**. Review the results.
 
-search *
+```kusto
+search *
+```
 
 ![Apache Ambari analytics search all metrics](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-search-all-metrics.png "Search all metrics")
 
-The output shall look like:
-
-![log analytics search all metrics](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-search-all-metrics-output.png "Search all metrics output")
-
-1. From the left pane, under **Type**, select a metric that you want to dig deep into, and then select **Apply**. The following screenshot shows the `metrics_resourcemanager_queue_root_default_CL` type is selected.
+1. From the left menu, select the **Filter** tab.
 
-> [!NOTE]
-> You may need to select the **[+]More** button to find the metric you are looking for. Also, the **Apply** button is at the bottom of the list so you must scroll down to see it.
-
-Notice that the query in the text box changes to one shown in the highlighted box in the following screenshot:
+1. Under **Type**, select **Heartbeat**. Then select **Apply & Run**.
 
 ![log analytics search specific metrics](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-search-specific-metrics.png "Search for specific metrics")
 
-1. To dig deeper into this specific metric. For example, you can refine the existing output based on the average of resources used in a 10-minute interval, categorized by cluster name using the following query:
-
-search in (metrics_resourcemanager_queue_root_default_CL) * | summarize AggregatedValue = avg(UsedAMResourceMB_d) by ClusterName_s, bin(TimeGenerated, 10m)
-
-1. Instead of refining based on the average of resources used, you can use the following query to refine the results based on when the maximum resources were used (as well as 90th and 95th percentile) in a 10-minute window:
+1. Notice that the query in the text box changes to:
 
-search in (metrics_resourcemanager_queue_root_default_CL) * | summarize ["max(UsedAMResourceMB_d)"] = max(UsedAMResourceMB_d), ["pct95(UsedAMResourceMB_d)"] = percentile(UsedAMResourceMB_d, 95), ["pct90(UsedAMResourceMB_d)"] = percentile(UsedAMResourceMB_d, 90) by ClusterName_s, bin(TimeGenerated, 10m)
+```kusto
+search *
+| where Type == "Heartbeat"
+```
 
-## Search for specific log messages
+1. You can dig deeper by using the options available in the left menu. For example:
 
-Learn how to look error messages during a specific time window. The steps here are just one example on how you can arrive at the error message you are interested in. You can use any property that is available to look for the errors you are trying to find.
+- To see logs from a specific node:
 
-1. Open the Log Analytics workspace that is associated to your HDInsight cluster from the Azure portal.
-2. Select the **Log Search** tile.
-3. Type the following query to search for all error messages for all HDInsight clusters configured to use Azure Monitor logs, and then select **RUN**.
-
-search "Error"
+![Search for specific errors output1](./media/hdinsight-hadoop-oms-log-analytics-use-queries/log-analytics-specific-node.png "Search for specific errors output1")
 
-You shall see an output like the following output:
+- To see logs at certain times:
 
-![Azure portal log search errors](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-search-all-errors-output.png "Search all errors output")
+![Search for specific errors output2](./media/hdinsight-hadoop-oms-log-analytics-use-queries/log-analytics-specific-time.png "Search for specific errors output2")
 
-4. From the left pane, under **Type** category, select an error type that you want to dig deep into, and then select **Apply**. Notice the results are refined to only show the error of the type you selected.
+1. Select **Apply & Run** and review the results. Also note that the query was updated to:
 
-5. You can dig deeper into this specific error list by using the options available in the left pane. For example:
+```kusto
+search *
+| where Type == "Heartbeat"
+| where (Computer == "zk2-myhado") and (TimeGenerated == "2019-12-02T23:15:02.69Z" or TimeGenerated == "2019-12-02T23:15:08.07Z" or TimeGenerated == "2019-12-02T21:09:34.787Z")
+```
 
-- To see error messages from a specific worker node:
+### Additional sample queries
 
-![Search for specific errors output1](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-search-specific-error-refined.png "Search for specific errors output1")
+A sample query based on the average of resources used in a 10-minute interval, categorized by cluster name:
 
-- To see an error occurred at a certain time:
+```kusto
+search in (metrics_resourcemanager_queue_root_default_CL) *
+| summarize AggregatedValue = avg(UsedAMResourceMB_d) by ClusterName_s, bin(TimeGenerated, 10m)
+```
 
-![Search for specific errors output2](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-search-specific-error-time.png "Search for specific errors output2")
+Instead of refining based on the average of resources used, you can use the following query to refine the results based on when the maximum resources were used (as well as 90th and 95th percentile) in a 10-minute window:
 
-6. To see the specific error. You can select **[+]show more** to look at the actual error message.
-
-![Search for specific errors output3](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-search-specific-error-arrived.png "Search for specific errors output3")
+```kusto
+search in (metrics_resourcemanager_queue_root_default_CL) *
+| summarize ["max(UsedAMResourceMB_d)"] = max(UsedAMResourceMB_d), ["pct95(UsedAMResourceMB_d)"] = percentile(UsedAMResourceMB_d, 95), ["pct90(UsedAMResourceMB_d)"] = percentile(UsedAMResourceMB_d, 90) by ClusterName_s, bin(TimeGenerated, 10m)
+```
 
 ## Create alerts for tracking events
 
 The first step to create an alert is to arrive at a query based on which the alert is triggered. You can use any query that you want to create an alert.
 
 1. Open the Log Analytics workspace that is associated to your HDInsight cluster from the Azure portal.
-2. Select the **Log Search** tile.
-3. Run the following query on which you want to create an alert, and then select **RUN**.
+1. Under **General**, select **Logs**.
+1. Run the following query on which you want to create an alert, and then select **Run**.
 
-metrics_resourcemanager_queue_root_default_CL | where AppsFailed_d > 0
+```kusto
+metrics_resourcemanager_queue_root_default_CL | where AppsFailed_d > 0
+```
 
 The query provides list of failed applications running on HDInsight clusters.
 
-4. Select **New Alert Rule** on the top of the page.
+1. Select **New alert rule** on the top of the page.
 
 ![Enter query to create an alert1](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-create-alert-query.png "Enter query to create an alert1")
 
-5. In the **Create rule** window, enter the query and other details to create an alert, and then select **Create alert rule**.
+1. In the **Create rule** window, enter the query and other details to create an alert, and then select **Create alert rule**.
 
 ![Enter query to create an alert2](./media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-create-alert.png "Enter query to create an alert2")
 
-To edit or delete an existing alert:
+### Edit or delete an existing alert
 
 1. Open the Log Analytics workspace from the Azure portal.
-2. From the left menu, select **Alert**.
-3. Select the alert you want to edit or delete.
-4. You have the following options: **Save**, **Discard**, **Disable**, and **Delete**.
+
+1. From the left menu, under **Monitoring**, select **Alerts**.
+
+1. Towards the top, select **Manage alert rules**.
+
+1. Select the alert you want to edit or delete.
+
+1. You have the following options: **Save**, **Discard**, **Disable**, and **Delete**.
 
 ![HDInsight Azure Monitor logs alert delete edit](media/hdinsight-hadoop-oms-log-analytics-use-queries/hdinsight-log-analytics-edit-alert.png)
 
 For more information, see [Create, view, and manage metric alerts using Azure Monitor](../azure-monitor/platform/alerts-metric.md).
 
 ## See also
 
+* [Get started with log queries in Azure Monitor](../azure-monitor/log-query/get-started-queries.md)
 * [Create custom views by using View Designer in Azure Monitor](../azure-monitor/platform/view-designer.md)
-* [Create, view, and manage metric alerts using Azure Monitor](../azure-monitor/platform/alerts-metric.md)
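
The queries added in this change can also be written against the tables directly. The sketch below is illustrative only and is not part of the commit. It reuses the table and column names that appear in the diff (`metrics_resourcemanager_queue_root_default_CL`, `UsedAMResourceMB_d`, `ClusterName_s`, `Heartbeat`, `Computer`, `TimeGenerated`); the one-day lookback and the time-range bounds are assumptions chosen for the example. Run each query separately.

```kusto
// Variant 1: query the custom log table directly instead of `search in (...) *`.
// The one-day lookback is an illustrative assumption.
metrics_resourcemanager_queue_root_default_CL
| where TimeGenerated > ago(1d)
| summarize AggregatedValue = avg(UsedAMResourceMB_d) by ClusterName_s, bin(TimeGenerated, 10m)
| order by TimeGenerated asc

// Variant 2: Heartbeat records for one node within a time range,
// rather than matching exact timestamps. The range bounds are assumptions.
Heartbeat
| where Computer == "zk2-myhado"
| where TimeGenerated between (datetime(2019-12-02T21:00:00Z) .. datetime(2019-12-02T23:30:00Z))
| project TimeGenerated, Computer
```

Starting from the table name and filtering on `TimeGenerated` early usually narrows the data scanned compared with `search *`, which may matter on workspaces that collect logs from several clusters.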