Commit 8bba20f

additional metrics added and categorized
1 parent 4d0d2ac commit 8bba20f

1 file changed

+74
-26
lines changed

articles/data-explorer/using-metrics.md

Lines changed: 74 additions & 26 deletions
@@ -33,35 +33,83 @@ In the Metrics pane:
33 33

34 34
![Metrics pane](media/using-metrics/metrics-pane.png)
35 35

36-
1. To create a metric chart, select **Metric** name and relevant **Aggregation** per metric as detailed below. The **Resource** and **Metric Namespace** pickers are pre-selected to your Azure Data Explorer cluster.
37-
38-
**Metric** | **Unit** | **Aggregation** | **Metric description**
39-
|---|---|---|---|
40-
| Cache utilization | Percent | Avg, Max, Min | Percentage of allocated cache resources currently in use by the cluster. Cache is the size of SSD allocated for user activity according to the defined cache policy. An average cache utilization of 80% or less is a sustainable state for a cluster. If the average cache utilization is above 80%, the cluster should be [scaled up](manage-cluster-vertical-scaling.md) to a storage optimized pricing tier or [scaled out](manage-cluster-horizontal-scaling.md) to more instances. Alternatively, adapt the cache policy (fewer days in cache). If cache utilization is over 100%, the size of data to be cached, according to the caching policy, is larger that the total size of cache on the cluster. |
41-
| CPU | Percent | Avg, Max, Min | Percentage of allocated compute resources currently in use by machines in the cluster. An average CPU of 80% or less is sustainable for a cluster. The maximum value of CPU is 100%, which means there are no additional compute resources to process data. When a cluster isn't performing well, check the maximum value of the CPU to determine if there are specific CPUs that are blocked. |
42-
| Events processed (for Event Hubs) | Count | Max, Min, Sum | Total number of events read from event hubs and processed by the cluster. The events are split into events rejected and events accepted by the cluster engine. |
43-
| Ingestion latency | Seconds | Avg, Max, Min | Latency of data ingested, from the time the data was received in the cluster until it's ready for query. The ingestion latency period depends on the ingestion scenario. |
44-
| Ingestion result | Count | Count | Total number of ingestion operations that failed and succeeded. Use **apply splitting** to create buckets of success and fail results and analyze the dimensions (**Value** > **Status**).|
45-
| Ingestion utilization | Percent | Avg, Max, Min | Percentage of actual resources used to ingest data from the total resources allocated, in the capacity policy, to perform ingestion. The default capacity policy is no more than 512 concurrent ingestion operations or 75% of the cluster resources invested in ingestion. Average ingestion utilization of 80% or less is a sustainable state for a cluster. Maximum value of ingestion utilization is 100%, which means all cluster ingestion ability is used and an ingestion queue may result. |
46-
| Ingestion volume (in MB) | Count | Max, Min, Sum | The total size of data ingested to the cluster (in MB) before compression. |
47-
| Keep alive | Count | Avg | Tracks the responsiveness of the cluster. A fully responsive cluster returns value 1 and a blocked or disconnected cluster returns 0. |
48-
| Query duration | Seconds | Count, Avg, Min, Max, Sum | Total time until query results are received (doesn't include network latency). |
49-
| Total number of concurrent queries | Count | Avg, Max, Min, Sum | The number of queries run in parallel in the cluster. This metric is a good way to estimate the load on the cluster. |
50-
| Total number of throttled queries | Count | Avg, Max, Min, Sum | The number of throttled (rejected) queries in the cluster. The maximum number of concurrent (parallel) queries allowed is defined in the concurrent query policy. |
51-
| Total number of throttled commands | Count | Avg, Max, Min, Sum | The number of throttled (rejected) commands in the cluster, since the maximum allowed number of concurrent (parallel) commands was reached. |
52-
| Total number of extents | Count | Avg, Max, Min, Sum | Total number of data extents in the cluster. Changes in this metric can imply massive data structure changes and high load on the cluster, since merging data extents is a CPU-heavy activity. |
53-
| | | | |
36+
1. To create a metric chart, select the **Metric** name and the relevant **Aggregation** for each metric, as detailed below. The **Resource** and **Metric Namespace** pickers are pre-selected for your Azure Data Explorer cluster.
37+
1. Select the **Add metric** button to see multiple metrics plotted in the same chart.
38+
1. Select the **+ New chart** button to see multiple charts in one view.
39+
1. Use the time picker to change the time range (default: past 24 hours).
40+
1. Use [**Add filter** and **Apply splitting**](/azure/azure-monitor/platform/metrics-getting-started#apply-dimension-filters-and-splitting) for metrics that have dimensions.
41+
1. Select **Pin to dashboard** to add your chart configuration to the dashboards so that you can view it again.
42+
1. Set **New alert rule** to visualize your metrics using the set criteria. The new alerting rule will include your target resource, metric, splitting, and filter dimensions from your chart. Modify these settings in the [alert rule creation pane](/azure/azure-monitor/platform/metrics-charts#create-alert-rules).
54 43

55-
Additional information about [supported Azure Data Explorer cluster metrics](/azure/azure-monitor/platform/metrics-supported#microsoftkustoclusters)
44+
Additional information on using the [Metrics Explorer](/azure/azure-monitor/platform/metrics-getting-started).
56 45

57-
2. Select the **Add metric** button to see multiple metrics plotted in the same chart.
58-
3. Select the **+ New chart** button to see multiple charts in one view.
59-
4. Use the time picker to change the time range (default: past 24 hours).
60-
5. Use [**Add filter** and **Apply splitting**](/azure/azure-monitor/platform/metrics-getting-started#apply-dimension-filters-and-splitting) for metrics that have dimensions.
61-
6. Select **Pin to dashboard** to add your chart configuration to the dashboards so that you can view it again.
62-
7. Set **New alert rule** to visualize your metrics using the set criteria. The new alerting rule will include your target resource, metric, splitting, and filter dimensions from your chart. Modify these settings in the [alert rule creation pane](/azure/azure-monitor/platform/metrics-charts#create-alert-rules).
46+
## Supported Azure Data Explorer Metrics
63 47

64-
Additional information on using the [Metrics Explorer](/azure/azure-monitor/platform/metrics-getting-started).
48+
The supported Azure Data Explorer Metrics are separated into various categories according to usage.
49+
50+
### Cluster health metrics
51+
52+
The cluster health metrics track the general health of the cluster. This includes resource and ingestion utilization and responsiveness.
53+
54+
**Metric** | **Unit** | **Aggregation** | **Metric description**
55+
|---|---|---|---|
56+
| Cache utilization | Percent | Avg, Max, Min | Percentage of allocated cache resources currently in use by the cluster. Cache is the size of SSD allocated for user activity according to the defined cache policy. An average cache utilization of 80% or less is a sustainable state for a cluster. If the average cache utilization is above 80%, the cluster should be [scaled up](manage-cluster-vertical-scaling.md) to a storage-optimized pricing tier or [scaled out](manage-cluster-horizontal-scaling.md) to more instances. Alternatively, adapt the cache policy (fewer days in cache). If cache utilization is over 100%, the size of data to be cached, according to the caching policy, is larger than the total size of cache on the cluster. |
57+
| CPU | Percent | Avg, Max, Min | Percentage of allocated compute resources currently in use by machines in the cluster. An average CPU of 80% or less is sustainable for a cluster. The maximum value of CPU is 100%, which means there are no additional compute resources to process data. When a cluster isn't performing well, check the maximum value of the CPU to determine if there are specific CPUs that are blocked. |
58+
| Ingestion utilization | Percent | Avg, Max, Min | Percentage of actual resources used to ingest data from the total resources allocated, in the capacity policy, to perform ingestion. The default capacity policy is no more than 512 concurrent ingestion operations or 75% of the cluster resources invested in ingestion. Average ingestion utilization of 80% or less is a sustainable state for a cluster. Maximum value of ingestion utilization is 100%, which means all cluster ingestion ability is used and an ingestion queue may result. |
59+
| Keep alive | Count | Avg | Tracks the responsiveness of the cluster. A fully responsive cluster returns value 1 and a blocked or disconnected cluster returns 0. |
60+
| Total number of throttled commands | Count | Avg, Max, Min, Sum | The number of throttled (rejected) commands in the cluster, since the maximum allowed number of concurrent (parallel) commands was reached. |
61+
| Total number of extents | Count | Avg, Max, Min, Sum | Total number of data extents in the cluster. Changes in this metric can imply massive data structure changes and high load on the cluster, since merging data extents is a CPU-heavy activity. |
62+
| | | | |
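
As a rough illustration of the thresholds in the table above, the following Python sketch labels an average utilization value. The 80% sustainable level and the 100% ceiling come from the metric descriptions; the function name, constant names, and labels are hypothetical and are not part of any Azure SDK. Per the descriptions, only cache utilization is expected to exceed 100%.

```python
# Thresholds taken from the metric descriptions above.
# All names here are hypothetical, chosen for illustration only.
SUSTAINABLE_MAX_PERCENT = 80.0
FULL_PERCENT = 100.0


def classify_utilization(avg_percent: float) -> str:
    """Label an average utilization percentage for a cluster metric."""
    if avg_percent < 0:
        raise ValueError("utilization cannot be negative")
    if avg_percent <= SUSTAINABLE_MAX_PERCENT:
        return "sustainable"
    if avg_percent <= FULL_PERCENT:
        return "above sustainable level"
    # Only cache utilization is described as able to exceed 100%.
    return "over capacity"


if __name__ == "__main__":
    for sample in (42.0, 80.0, 93.5, 120.0):
        print(f"{sample:6.1f}% -> {classify_utilization(sample)}")
```

A value labeled "above sustainable level" corresponds to the case where the table suggests scaling up, scaling out, or adapting the cache policy.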
63+
64+
### Export health and performance metrics
65+
66+
Export health and performance metrics track the general health and performance of export operations, including lateness, results, number of records, and utilization.
67+
68+
**Metric** | **Unit** | **Aggregation** | **Metric description**
69+
|---|---|---|---|
70+
| Continuous export – num of exported records | Count | Sum | Total number of records exported from the cluster. |
71+
| Continuous export Max Lateness Minutes | Count | Max | Max value of exported lateness in minutes. |
72+
| Continuous export Pending Count | Count | Max | Max value of pending export operations. |
73+
| Continuous export result | Count | Count | Total number of continuous export operations by result. It includes a continuous export name, database |
74+
| Export utilization | Percent | Max | Export usage of defined slot for export operations in percentage. |
75+
| | | | |
76+
77+
### Ingestion health and performance metrics
78+
79+
Ingestion health and performance metrics track the general health and performance of ingestion operations, including latency, results, and volume.
80+
81+
**Metric** | **Unit** | **Aggregation** | **Metric description**
82+
|---|---|---|---|
83+
| Events processed (for Event/IoT Hubs) | Count | Max, Min, Sum | Total number of events read from event hubs and processed by the cluster. The events are split into events rejected and events accepted by the cluster engine. |
84+
| Ingestion latency | Seconds | Avg, Max, Min | Latency of data ingested, from the time the data was received in the cluster until it's ready for query. The ingestion latency period depends on the ingestion scenario. |
85+
| Ingestion result | Count | Count | Total number of ingestion operations that failed and succeeded. Use **apply splitting** to create buckets of success and fail results and analyze the dimensions (**Value** > **Status**).|
86+
| Ingestion volume (in MB) | Count | Max, Sum | The total size of data ingested to the cluster (in MB) before compression. |
87+
| | | | |
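
The success and fail buckets that **apply splitting** produces for the Ingestion result metric can be pictured with a small Python sketch. This is only an illustration of the bucketing idea: the record layout, the `status` key, and the `Succeeded`/`Failed` labels are assumptions made for the example, not the actual schema returned by Azure Monitor.

```python
from collections import Counter


def split_by_status(results):
    """Count ingestion operations per status value (hypothetical record shape)."""
    return Counter(record["status"] for record in results)


def failure_ratio(buckets):
    """Return the fraction of operations in the assumed 'Failed' bucket."""
    total = sum(buckets.values())
    return buckets.get("Failed", 0) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical sample data, invented for illustration.
    sample = [
        {"status": "Succeeded"},
        {"status": "Succeeded"},
        {"status": "Failed"},
        {"status": "Succeeded"},
    ]
    buckets = split_by_status(sample)
    print(dict(buckets), failure_ratio(buckets))
```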
88+
89+
### Query performance
90+
91+
Query performance metrics track query duration and the total number of concurrent or throttled queries.
92+
93+
**Metric** | **Unit** | **Aggregation** | **Metric description**
94+
|---|---|---|---|
95+
| Query duration | Seconds or Milliseconds | Avg, Min, Max, Sum | Total time until query results are received (doesn't include network latency). |
96+
| Total number of concurrent queries | Count | Avg, Max, Min, Sum | The number of queries run in parallel in the cluster. This metric is a good way to estimate the load on the cluster. |
97+
| Total number of throttled queries | Count | Avg, Max, Min, Sum | The number of throttled (rejected) queries in the cluster. The maximum number of concurrent (parallel) queries allowed is defined in the concurrent query policy. |
98+
| | | | |
99+
100+
### Streaming ingest metrics
101+
102+
**Metric** | **Unit** | **Aggregation** | **Metric description**
103+
|---|---|---|---|
104+
| Streaming Ingest Data Rate | Count | Avg, Max, Min, RateRequestsPerSecond | Total volume of data ingested to the cluster. |
105+
| Streaming Ingest Duration | Milliseconds | Avg, Max, Min | Total request duration of all streaming ingest requests. |
106+
| Streaming Ingest Request Rate | Count | Count, Avg, Max, Min, Sum | Total number of streaming ingest requests. |
107+
| Streaming Ingest Result | Count | Avg | Total number of Streaming Ingest requests by result type. |
110+
| | | | |
111+
112+
Additional information about [supported Azure Data Explorer cluster metrics](/azure/azure-monitor/platform/metrics-supported#microsoftkustoclusters).
65 113

66 114

67 115
## Next steps
