Commit 115a2ec

Merge pull request #229846 from Sohamdg081992/AddTableforDefaultAlerts
Add table of default alerts similar to recommended alerts
2 parents b79ec47 + fc34624 commit 115a2ec

1 file changed: +22 -18 lines changed
articles/azure-monitor/containers/container-insights-metric-alerts.md

Lines changed: 22 additions & 18 deletions
@@ -154,24 +154,28 @@ The following sections present information on the alert rules provided by Contai
 
 ### Community alert rules
 
-These handpicked alerts come from the Prometheus community. Source code for these mixin alerts can be found in [GitHub](https://aka.ms/azureprometheus-mixins):
-
-- KubeJobNotCompleted
-- KubeJobFailed
-- KubePodCrashLooping
-- KubePodNotReady
-- KubeDeploymentReplicasMismatch
-- KubeStatefulSetReplicasMismatch
-- KubeHpaReplicasMismatch
-- KubeHpaMaxedOut
-- KubeQuotaAlmostFull
-- KubeMemoryQuotaOvercommit
-- KubeCPUQuotaOvercommit
-- KubeVersionMismatch
-- KubeNodeNotReady
-- KubeNodeReadinessFlapping
-- KubeletTooManyPods
-- KubeNodeUnreachable
+These handpicked alerts come from the Prometheus community. Source code for these mixin alerts can be found in [GitHub](https://aka.ms/azureprometheus-communityalerts):
+
+| Alert name | Description | Default threshold |
+|:---|:---|:---|
+| NodeFilesystemSpaceFillingUp | An extrapolation algorithm predicts that disk space usage for a node on a device in a cluster will run out of space within the upcoming 24 hours. | NA |
+| NodeFilesystemSpaceUsageFull85Pct | Disk space usage for a node on a device in a cluster is greater than 85%. | 85% |
+| KubePodCrashLooping | Pod is in CrashLoop which means the app dies or is unresponsive and kubernetes tries to restart it automatically. | NA |
+| KubePodNotReady | Pod has been in a non-ready state for more than 15 minutes. | NA |
+| KubeDeploymentReplicasMismatch | Deployment has not matched the expected number of replicas. | NA |
+| KubeStatefulSetReplicasMismatch | StatefulSet has not matched the expected number of replicas. | NA |
+| KubeJobNotCompleted | Job is taking more than 1h to complete. | NA |
+| KubeJobFailed | Job failed complete. | NA |
+| KubeHpaReplicasMismatch | Horizontal Pod Autoscaler has not matched the desired number of replicas for longer than 15 minutes. | NA |
+| KubeHpaMaxedOut | Horizontal Pod Autoscaler has been running at max replicas for longer than 15 minutes. | NA |
+| KubeCPUQuotaOvercommit | Cluster has overcommitted CPU resource requests for Namespaces and cannot tolerate node failure. | 1.5 |
+| KubeMemoryQuotaOvercommit | Cluster has overcommitted memory resource requests for Namespaces. | 1.5 |
+| KubeQuotaAlmostFull | Cluster reaches to the allowed limits for given namespace. | Between 0.9 and 1 |
+| KubeVersionMismatch | Different semantic versions of Kubernetes components running. | NA |
+| KubeNodeNotReady | KubeNodeNotReady alert is fired when a Kubernetes node is not in Ready state for a certain period. | NA |
+| KubeNodeUnreachable | Kubernetes node is unreachable and some workloads may be rescheduled. | NA |
+| KubeletTooManyPods | The alert fires when a specific node is running >95% of its capacity of pods | 0.95 |
+| KubeNodeReadinessFlapping | The readiness status of node has changed few times in the last 15 minutes. | 2 |
 
 ### Recommended alert rules
 
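Reviewer note: the new NodeFilesystemSpaceFillingUp row describes an extrapolation that predicts disk space running out within 24 hours. The sketch below illustrates one way such a prediction could work, assuming a least-squares linear fit over recent free-space samples (similar in spirit to PromQL's `predict_linear`). The function names, the sample layout, and the "predicted free space below zero" condition are illustrative assumptions and are not taken from the actual rule definition.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, free space in any consistent unit)


def predict_linear(samples: Sequence[Sample], horizon_s: float) -> float:
    """Fit a least-squares line through the samples and return the value
    predicted horizon_s seconds after the latest sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    intercept = mean_v - slope * mean_t
    latest_t = max(t for t, _ in samples)
    return intercept + slope * (latest_t + horizon_s)


def filesystem_filling_up(samples: Sequence[Sample], horizon_s: float = 24 * 3600) -> bool:
    """Hypothetical alert condition: predicted free space drops below zero
    within the horizon (24 hours by default, matching the table description)."""
    return predict_linear(samples, horizon_s) < 0


# Free space falling by 1 GiB per hour from 10 GiB: predicted to run out within 24 hours.
shrinking = [(0.0, 10.0), (3600.0, 9.0), (7200.0, 8.0)]
# Free space holding steady: no alert.
steady = [(0.0, 10.0), (3600.0, 10.0), (7200.0, 10.0)]
print(filesystem_filling_up(shrinking), filesystem_filling_up(steady))
```

A fixed-threshold rule such as NodeFilesystemSpaceUsageFull85Pct needs only a single comparison against the current value, whereas a predictive rule needs a window of samples, which may be why the table lists no default threshold (NA) for it.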