articles/azure-monitor/containers/container-insights-metric-alerts.md
16 additions & 9 deletions
@@ -14,6 +14,9 @@ This article reviews the experience and provides guidance on configuring and man
If you're not familiar with Azure Monitor alerts, see [Overview of alerts in Microsoft Azure](../alerts/alerts-overview.md) before you start. To learn more about metric alerts, see [Metric alerts in Azure Monitor](../alerts/alerts-metric-overview.md).
+> [!NOTE]
+> Beginning October 8, 2021, three alerts have been updated to correctly calculate the alert condition: **Container CPU %**, **Container working set memory %**, and **Persistent Volume Usage %**. These new alerts have the same names as their corresponding previously available alerts, but they use new, updated metrics. We recommend that you disable the alerts that use the "Old" metrics, described in this article, and enable the "New" metrics. The "Old" metrics will no longer be available in recommended alerts after they are disabled, but you can manually re-enable them.
+
## Prerequisites
Before you start, confirm the following:
@@ -37,11 +40,11 @@ To alert on what matters, Container insights includes the following metric alert
|Name| Description |Default threshold |
|----|-------------|------------------|
-|Average container CPU % |Calculates average CPU used per container.|When average CPU usage per container is greater than 95%.|
-|Average container working set memory % |Calculates average working set memory used per container.|When average working set memory usage per container is greater than 95%. |
+|**(New)Average container CPU %**|Calculates average CPU used per container.|When average CPU usage per container is greater than 95%.|
+|**(New)Average container working set memory %**|Calculates average working set memory used per container.|When average working set memory usage per container is greater than 95%. |
|Average CPU % |Calculates average CPU used per node. |When average node CPU utilization is greater than 80% |
|Average Disk Usage % |Calculates average disk usage for a node.|When disk usage for a node is greater than 80%. |
-|Average Persistent Volume Usage % |Calculates average PV usage per pod. |When average PV usage per pod is greater than 80%.|
+|**(New)Average Persistent Volume Usage %**|Calculates average PV usage per pod. |When average PV usage per pod is greater than 80%.|
|Average Working set memory % |Calculates average Working set memory for a node. |When average Working set memory for a node is greater than 80%. |
|Restarting container count |Calculates number of restarting containers. | When container restarts are greater than 0. |
|Failed Pod Counts |Calculates if any pod in failed state.|When a number of pods in failed state are greater than 0. |
@@ -76,7 +79,7 @@ The following alert-based metrics have unique behavior characteristics compared
## Metrics collected
-The following metrics are enabled and collected, unless otherwise specified, as part of this feature:
+The following metrics are enabled and collected, unless otherwise specified, as part of this feature. The metrics in **bold** with label "Old" are the ones replaced by "New" metrics collected for correct alert evaluation.
|Metric namespace |Metric |Description |
|---------|----|------------|
@@ -93,10 +96,14 @@ The following metrics are enabled and collected, unless otherwise specified, as
|Insights.container/pods |restartingContainerCount |Count of container restarts by controller, Kubernetes namespace.|
|Insights.container/pods |oomKilledContainerCount |Count of OOMkilled containers by controller, Kubernetes namespace.|
|Insights.container/pods |podReadyPercentage |Percentage of pods in ready state by controller, Kubernetes namespace.|
-|Insights.container/containers |cpuExceededPercentage |CPU utilization percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.<br> Collected |
-|Insights.container/containers |memoryRssExceededPercentage |Memory RSS percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
-|Insights.container/containers |memoryWorkingSetExceededPercentage |Memory Working Set percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
-|Insights.container/persistentvolumes |pvUsageExceededPercentage |PV utilization percentage for persistent volumes exceeding user configurable threshold (default is 60.0) by claim name, Kubernetes namespace, volume name, pod name, and node name.
+|Insights.container/containers |**(Old)cpuExceededPercentage**|CPU utilization percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.<br> Collected |
+|Insights.container/containers |**(New)cpuThresholdViolated**|Metric triggered when CPU utilization percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.<br> Collected |
+|Insights.container/containers |**(Old)memoryRssExceededPercentage**|Memory RSS percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
+|Insights.container/containers |**(New)memoryRssThresholdViolated**|Metric triggered when Memory RSS percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
+|Insights.container/containers |**(Old)memoryWorkingSetExceededPercentage**|Memory Working Set percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
+|Insights.container/containers |**(New)memoryWorkingSetThresholdViolated**|Metric triggered when Memory Working Set percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
+|Insights.container/persistentvolumes |**(Old)pvUsageExceededPercentage**|PV utilization percentage for persistent volumes exceeding user configurable threshold (default is 60.0) by claim name, Kubernetes namespace, volume name, pod name, and node name.|
+|Insights.container/persistentvolumes |**(New)pvUsageThresholdViolated** |Metric triggered when PV utilization percentage for persistent volumes exceeding user configurable threshold (default is 60.0) by claim name, Kubernetes namespace, volume name, pod name, and node name.
## Enable alert rules
@@ -246,4 +253,4 @@ The configuration change can take a few minutes to finish before taking effect,
- View [log query examples](container-insights-log-query.md) to see pre-defined queries and examples to evaluate or customize for alerting, visualizing, or analyzing your clusters.
-- To learn more about Azure Monitor and how to monitor other aspects of your Kubernetes cluster, see [View Kubernetes cluster performance](container-insights-analyze.md).
+- To learn more about Azure Monitor and how to monitor other aspects of your Kubernetes cluster, see [View Kubernetes cluster performance](container-insights-analyze.md).
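
The Old-to-New metric renames in the "Metrics collected" hunk of this file can be summarized as a lookup table. The following is a minimal Python sketch and is not part of this commit: the `OLD_TO_NEW_METRIC` dictionary and the `replacement_metric` helper are hypothetical names introduced here for illustration, and only the metric names themselves come from the diff.

```python
# Old -> New metric names, copied from the "Metrics collected" table in this change.
# The dictionary and helper below are illustrative; they are not an Azure API.
OLD_TO_NEW_METRIC = {
    "cpuExceededPercentage": "cpuThresholdViolated",
    "memoryRssExceededPercentage": "memoryRssThresholdViolated",
    "memoryWorkingSetExceededPercentage": "memoryWorkingSetThresholdViolated",
    "pvUsageExceededPercentage": "pvUsageThresholdViolated",
}


def replacement_metric(name: str) -> str:
    """Return the "New" metric for an "Old" one; any other name passes through unchanged."""
    return OLD_TO_NEW_METRIC.get(name, name)


if __name__ == "__main__":
    # One renamed metric and one metric this change does not touch.
    for metric in ("cpuExceededPercentage", "podReadyPercentage"):
        print(f"{metric} -> {replacement_metric(metric)}")
```

A lookup like this could help when auditing existing alert rules for references to the "Old" metrics, since metrics that were not renamed (for example `podReadyPercentage`) are returned as-is.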
articles/azure-monitor/containers/container-insights-update-metrics.md
3 additions & 3 deletions
@@ -19,10 +19,10 @@ The following metrics are enabled as part of this feature:
| Metric namespace | Metric | Description |
|------------------|--------|-------------|
-| Insights.container/nodes | cpuUsageMillicores, cpuUsagePercentage, memoryRssBytes, memoryRssPercentage, memoryWorkingSetBytes, memoryWorkingSetPercentage, nodesCount, diskUsedPercentage, | As *node* metrics, they include *host* as a dimension. They also include the<br> node's name as value for the *host* dimension. |
+| Insights.container/nodes | cpuUsageMillicores, cpuUsagePercentage, memoryRssBytes, memoryRssPercentage, memoryWorkingSetBytes, memoryWorkingSetPercentage, **cpuUsageAllocatablePercentage**, **memoryWorkingSetAllocatablePercentage**, **memoryRssAllocatablePercentage**, nodesCount, diskUsedPercentage, | As *node* metrics, they include *host* as a dimension. They also include the<br> node's name as value for the *host* dimension. |
| Insights.container/pods | podCount, completedJobsCount, restartingContainerCount, oomKilledContainerCount, podReadyPercentage | As *pod* metrics, they include the following as dimensions - ControllerName, Kubernetes namespace, name, phase. |
To support these new capabilities, a new containerized agent is included in the release, version **microsoft/oms:ciprod05262020** for AKS and version **microsoft/oms:ciprod09252020** for Azure Arc enabled Kubernetes clusters. New deployments of AKS automatically include this configuration change and capabilities. You can update your cluster to support this feature from the Azure portal, Azure PowerShell, or the Azure CLI. With Azure PowerShell and the CLI, you can enable this per cluster or for all clusters in your subscription.