         summary: An Alertmanager instance failed to send notifications.
       expr: |
@@ -57,9 +62,12 @@ spec:
         severity: warning
     - alert: AlertmanagerClusterFailedToSendAlerts
       annotations:
-        description: The minimum notification failure rate to {{ $labels.integration }} sent from any instance in the {{$labels.job}} cluster is {{ $value | humanizePercentage }}.
+        description: The minimum notification failure rate to {{ $labels.integration
+          }} sent from any instance in the {{$labels.job}} cluster is {{ $value |
-        description: The minimum notification failure rate to {{ $labels.integration }} sent from any instance in the {{$labels.job}} cluster is {{ $value | humanizePercentage }}.
+        description: The minimum notification failure rate to {{ $labels.integration
+          }} sent from any instance in the {{$labels.job}} cluster is {{ $value |
summary: Alertmanager instances within the same cluster have different configurations.
       expr: |
@@ -100,9 +112,12 @@ spec:
         severity: critical
     - alert: AlertmanagerClusterDown
       annotations:
-        description: '{{ $value | humanizePercentage }} of Alertmanager instances within the {{$labels.job}} cluster have been up for less than half of the last 5m.'
+        description: '{{ $value | humanizePercentage }} of Alertmanager instances
+          within the {{$labels.job}} cluster have been up for less than half of the
-        summary: Half or more of the Alertmanager instances within the same cluster are down.
+        summary: Half or more of the Alertmanager instances within the same cluster
+          are down.
       expr: |
         (
           count by (namespace,service) (
@@ -119,9 +134,12 @@ spec:
         severity: critical
     - alert: AlertmanagerClusterCrashlooping
       annotations:
-        description: '{{ $value | humanizePercentage }} of Alertmanager instances within the {{$labels.job}} cluster have restarted at least 5 times in the last 10m.'
+        description: '{{ $value | humanizePercentage }} of Alertmanager instances
+          within the {{$labels.job}} cluster have restarted at least 5 times in the
- expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal"}[5m])) WITHOUT (cpu, mode) / ON(instance) GROUP_LEFT() count(sum(node_cpu_seconds_total) BY (instance, cpu)) BY (instance)
manifests/kube-state-metrics-prometheusRule.yaml
11 additions & 4 deletions
@@ -16,7 +16,9 @@ spec:
     rules:
     - alert: KubeStateMetricsListErrors
       annotations:
-        description: kube-state-metrics is experiencing errors at an elevated rate in list operations. This is likely causing it to not be able to expose metrics about Kubernetes objects correctly or at all.
+        description: kube-state-metrics is experiencing errors at an elevated rate
+          in list operations. This is likely causing it to not be able to expose metrics
         summary: kube-state-metrics is experiencing errors in list operations.
       expr: |
@@ -29,7 +31,9 @@ spec:
         severity: critical
     - alert: KubeStateMetricsWatchErrors
       annotations:
-        description: kube-state-metrics is experiencing errors at an elevated rate in watch operations. This is likely causing it to not be able to expose metrics about Kubernetes objects correctly or at all.
+        description: kube-state-metrics is experiencing errors at an elevated rate
+          in watch operations. This is likely causing it to not be able to expose
+          metrics about Kubernetes objects correctly or at all.
         summary: kube-state-metrics is experiencing errors in watch operations.
       expr: |
@@ -42,7 +46,9 @@ spec:
         severity: critical
     - alert: KubeStateMetricsShardingMismatch
       annotations:
-        description: kube-state-metrics pods are running with different --total-shards configuration, some Kubernetes objects may be exposed multiple times or not exposed at all.
+        description: kube-state-metrics pods are running with different --total-shards
+          configuration, some Kubernetes objects may be exposed multiple times or