Commit c4c4786

apps sc: update kube-state-metrics alerts
1 parent 8e75cf9 commit c4c4786

2 files changed: +12 -7 lines changed

helmfile.d/charts/prometheus-alerts/CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -7,6 +7,11 @@
 1. helmfile.d/charts/prometheus-alerts/templates/alerts/kubernetes-resources.yaml
     - MODIFIED - KubeCPUOvercommit properly compute per cluster as done upstream
     - MODIFIED - KubeMemoryOvercommit properly compute per cluster as done upstream
+1. helmfile.d/charts/prometheus-alerts/templates/alerts/kube-state-metrics.yaml
+    - MODIFIED - KubeStateMetricsListErrors properly compute per cluster as done upstream
+    - MODIFIED - KubeStateMetricsWatchErrors properly compute per cluster as done upstream
+    - MODIFIED - KubeStateMetricsShardingMismatch properly compute per cluster as done upstream
+    - MODIFIED - KubeStateMetricsShardsMissing properly compute per cluster as done upstream
 
 ## 2022.12.01
 
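
The changelog entries above say the alerts are now computed per cluster. As a rough illustration of why that matters for the error-ratio alerts, the sketch below contrasts one fleet-wide ratio with a ratio per cluster. It only mimics the aggregation in plain Python; the cluster names, rates, and function names are hypothetical and not taken from the commit.

```python
from collections import defaultdict

# Hypothetical per-series request rates (per second), keyed by
# (cluster, result). The cluster names and numbers are made up.
RATES = {
    ("cluster-a", "error"): 0.5,
    ("cluster-a", "success"): 9.5,
    ("cluster-b", "error"): 0.0,
    ("cluster-b", "success"): 990.0,
}

THRESHOLD = 0.01  # mirrors the `> 0.01` comparison used by the rules


def global_error_ratio(rates):
    """Mimic sum(errors) / sum(all) with no grouping: one ratio overall."""
    errors = sum(v for (_, result), v in rates.items() if result == "error")
    total = sum(rates.values())
    return errors / total


def per_cluster_error_ratio(rates):
    """Mimic sum(errors) by (cluster) / sum(all) by (cluster)."""
    errors = defaultdict(float)
    total = defaultdict(float)
    for (cluster, result), value in rates.items():
        total[cluster] += value
        if result == "error":
            errors[cluster] += value
    # Simplification: clusters with zero traffic are skipped here,
    # whereas PromQL would yield NaN for a 0/0 division.
    return {c: errors[c] / total[c] for c in total if total[c] > 0}
```

With these made-up numbers the fleet-wide ratio stays below the threshold because the busy, healthy cluster dominates the denominator, while the per-cluster ratio for `cluster-a` is well above it.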

helmfile.d/charts/prometheus-alerts/templates/alerts/kube-state-metrics.yaml

Lines changed: 7 additions & 7 deletions
@@ -23,9 +23,9 @@ spec:
         runbook_url: {{ .Values.defaultRules.runbookUrl }}kube-state-metrics/kubestatemetricslisterrors
         summary: kube-state-metrics is experiencing errors in list operations.
       expr: |-
-        (sum(rate(kube_state_metrics_list_total{job="kube-state-metrics",result="error"}[5m]))
+        (sum(rate(kube_state_metrics_list_total{job="kube-state-metrics",result="error"}[5m])) by (cluster)
           /
-        sum(rate(kube_state_metrics_list_total{job="kube-state-metrics"}[5m])))
+        sum(rate(kube_state_metrics_list_total{job="kube-state-metrics"}[5m])) by (cluster))
         > 0.01
       for: 15m
       labels:
@@ -39,9 +39,9 @@ spec:
         runbook_url: {{ .Values.defaultRules.runbookUrl }}kube-state-metrics/kubestatemetricswatcherrors
         summary: kube-state-metrics is experiencing errors in watch operations.
       expr: |-
-        (sum(rate(kube_state_metrics_watch_total{job="kube-state-metrics",result="error"}[5m]))
+        (sum(rate(kube_state_metrics_watch_total{job="kube-state-metrics",result="error"}[5m])) by (cluster)
           /
-        sum(rate(kube_state_metrics_watch_total{job="kube-state-metrics"}[5m])))
+        sum(rate(kube_state_metrics_watch_total{job="kube-state-metrics"}[5m])) by (cluster))
         > 0.01
       for: 15m
       labels:
@@ -54,7 +54,7 @@ spec:
         description: kube-state-metrics pods are running with different --total-shards configuration, some Kubernetes objects may be exposed multiple times or not exposed at all.
         runbook_url: {{ .Values.defaultRules.runbookUrl }}kube-state-metrics/kubestatemetricsshardingmismatch
         summary: kube-state-metrics sharding is misconfigured.
-      expr: stdvar (kube_state_metrics_total_shards{job="kube-state-metrics"}) != 0
+      expr: stdvar (kube_state_metrics_total_shards{job="kube-state-metrics"}) by (cluster) != 0
       for: 15m
       labels:
         severity: critical
@@ -67,9 +67,9 @@ spec:
         runbook_url: {{ .Values.defaultRules.runbookUrl }}kube-state-metrics/kubestatemetricsshardsmissing
         summary: kube-state-metrics shards are missing.
       expr: |-
-        2^max(kube_state_metrics_total_shards{job="kube-state-metrics"}) - 1
+        2^max(kube_state_metrics_total_shards{job="kube-state-metrics"}) by (cluster) - 1
           -
-        sum( 2 ^ max by (shard_ordinal) (kube_state_metrics_shard_ordinal{job="kube-state-metrics"}) )
+        sum( 2 ^ max by (shard_ordinal) (kube_state_metrics_shard_ordinal{job="kube-state-metrics"})) by (cluster)
         != 0
       for: 15m
       labels:
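
The KubeStateMetricsShardsMissing expression above relies on a bitmask: with N shards, `2^N - 1` has one bit set per expected ordinal, and subtracting `2^ordinal` for each shard that reports in leaves 0 only when every shard is present. The sketch below works through that arithmetic per cluster in plain Python. The sample data and function names are hypothetical, and the sketch assumes ordinals are deduplicated within each cluster, which is an assumption about the intended semantics rather than something the diff states.

```python
# Hypothetical samples, one per kube-state-metrics pod:
# (cluster, total_shards, shard_ordinal). All values are made up.
SAMPLES = [
    ("cluster-a", 3, 0),
    ("cluster-a", 3, 1),
    ("cluster-a", 3, 2),
    ("cluster-b", 3, 0),
    ("cluster-b", 3, 2),  # ordinal 1 is absent in cluster-b
]


def missing_shard_mask(total_shards, ordinals_seen):
    """Return 2^N - 1 - sum(2^ordinal) over the distinct ordinals seen.

    A result of 0 means every shard 0..N-1 reported in; any other value
    means the reporting shards do not match the expected set.
    """
    expected = 2 ** total_shards - 1
    reported = sum(2 ** o for o in set(ordinals_seen))
    return expected - reported


def shards_missing_by_cluster(samples):
    """Return {cluster: mask} for clusters whose mask is non-zero."""
    total = {}
    ordinals = {}
    for cluster, total_shards, ordinal in samples:
        # max(...) mirrors the max over kube_state_metrics_total_shards.
        total[cluster] = max(total.get(cluster, 0), total_shards)
        ordinals.setdefault(cluster, set()).add(ordinal)
    masks = {c: missing_shard_mask(total[c], ordinals[c]) for c in total}
    return {c: m for c, m in masks.items() if m != 0}
```

For the samples above, `shards_missing_by_cluster(SAMPLES)` flags only `cluster-b`, with a mask of 2 (the bit for ordinal 1). Pooling all ordinals without grouping, as in `missing_shard_mask(3, [0, 1, 2, 0, 2])`, gives 0 and hides the gap. In PromQL an aggregation written `by (shard_ordinal)` keeps only that label, so the per-cluster deduplication in this sketch corresponds to an inner grouping that also retains `cluster`; whether the rule as written behaves that way is worth checking against a multi-cluster setup.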
