articles/azure-monitor/containers/container-insights-metric-alerts.md
22 additions & 18 deletions
@@ -154,24 +154,28 @@ The following sections present information on the alert rules provided by Contai
 
 ### Community alert rules
 
-These handpicked alerts come from the Prometheus community. Source code for these mixin alerts can be found in [GitHub](https://aka.ms/azureprometheus-mixins):
-
-- KubeJobNotCompleted
-- KubeJobFailed
-- KubePodCrashLooping
-- KubePodNotReady
-- KubeDeploymentReplicasMismatch
-- KubeStatefulSetReplicasMismatch
-- KubeHpaReplicasMismatch
-- KubeHpaMaxedOut
-- KubeQuotaAlmostFull
-- KubeMemoryQuotaOvercommit
-- KubeCPUQuotaOvercommit
-- KubeVersionMismatch
-- KubeNodeNotReady
-- KubeNodeReadinessFlapping
-- KubeletTooManyPods
-- KubeNodeUnreachable
+These handpicked alerts come from the Prometheus community. Source code for these mixin alerts can be found on [GitHub](https://aka.ms/azureprometheus-communityalerts):
+
+| Alert name | Description | Default threshold |
+|:---|:---|:---|
+| NodeFilesystemSpaceFillingUp | An extrapolation algorithm predicts that a device on a node in the cluster will run out of disk space within the next 24 hours. | NA |
+| NodeFilesystemSpaceUsageFull85Pct | Disk space usage for a device on a node in the cluster is greater than 85%. | 85% |
+| KubePodCrashLooping | Pod is in CrashLoop, which means the app dies or is unresponsive and Kubernetes tries to restart it automatically. | NA |
+| KubePodNotReady | Pod has been in a non-ready state for more than 15 minutes. | NA |
+| KubeDeploymentReplicasMismatch | Deployment has not matched the expected number of replicas. | NA |
+| KubeStatefulSetReplicasMismatch | StatefulSet has not matched the expected number of replicas. | NA |
+| KubeJobNotCompleted | Job is taking more than 1 hour to complete. | NA |
+| KubeJobFailed | Job failed to complete. | NA |
+| KubeHpaReplicasMismatch | Horizontal Pod Autoscaler has not matched the desired number of replicas for longer than 15 minutes. | NA |
+| KubeHpaMaxedOut | Horizontal Pod Autoscaler has been running at max replicas for longer than 15 minutes. | NA |
+| KubeCPUQuotaOvercommit | Cluster has overcommitted CPU resource requests for namespaces and cannot tolerate node failure. | 1.5 |
+| KubeMemoryQuotaOvercommit | Cluster has overcommitted memory resource requests for namespaces. | 1.5 |
+| KubeQuotaAlmostFull | Cluster is reaching the allowed limits for a given namespace. | Between 0.9 and 1 |
+| KubeVersionMismatch | Different semantic versions of Kubernetes components are running. | NA |
+| KubeNodeNotReady | Kubernetes node has not been in the Ready state for a certain period. | NA |
+| KubeNodeUnreachable | Kubernetes node is unreachable and some workloads may be rescheduled. | NA |
+| KubeletTooManyPods | A specific node is running at more than 95% of its pod capacity. | 0.95 |
+| KubeNodeReadinessFlapping | The readiness status of a node has changed a few times in the last 15 minutes. | 2 |
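
To make the "Default threshold" column easier to read, here is a minimal Python sketch of how the ratio-style thresholds in the table might be interpreted. It is not the rule source: the real alerts are PromQL expressions in the linked mixin repository. The function names, parameters, and exact comparison operators below are assumptions for illustration only; only the numeric values (0.95, 1.5, 0.9 to 1, and 2) come from the table.

```python
# Default thresholds taken from the table; everything else is hypothetical.
POD_CAPACITY_RATIO = 0.95             # KubeletTooManyPods
QUOTA_OVERCOMMIT_RATIO = 1.5          # KubeCPUQuotaOvercommit / KubeMemoryQuotaOvercommit
QUOTA_ALMOST_FULL_RANGE = (0.9, 1.0)  # KubeQuotaAlmostFull
READINESS_CHANGES = 2                 # KubeNodeReadinessFlapping


def kubelet_too_many_pods(running_pods: int, pod_capacity: int) -> bool:
    """Assumed reading: fire when a node runs more than 95% of its pod capacity."""
    return running_pods / pod_capacity > POD_CAPACITY_RATIO


def quota_overcommitted(requested: float, allocatable: float) -> bool:
    """Assumed reading: fire when quota requests exceed 1.5x what is allocatable."""
    return requested / allocatable > QUOTA_OVERCOMMIT_RATIO


def quota_almost_full(used: float, hard_limit: float) -> bool:
    """Assumed reading: fire when usage is strictly between 90% and 100% of the quota."""
    low, high = QUOTA_ALMOST_FULL_RANGE
    return low < used / hard_limit < high


def node_readiness_flapping(status_changes: int) -> bool:
    """Assumed reading: fire when readiness changed more than 2 times in the window."""
    return status_changes > READINESS_CHANGES


if __name__ == "__main__":
    # Hypothetical node with a 110-pod capacity currently running 106 pods.
    print(kubelet_too_many_pods(running_pods=106, pod_capacity=110))
```

Rows marked NA in the table have no single numeric threshold, so they are left out of this sketch.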