Commit 203683b

DanSibbernsen authored and committed
chore(RunbookDocumentation): updating documentation so links from alertmanager to specific alerts work
1 parent 9821d07 commit 203683b

1 file changed, +34 -7 lines changed

runbook.md

Lines changed: 34 additions & 7 deletions
@@ -98,20 +98,47 @@ This page collects this repositories alerts and begins the process of describing
 + *Severity*: warning
 ### Group Name: "kubernetes-system"
 ##### Alert Name: "KubeNodeNotReady"
-+ *Message*: `{{ $labels.node }} has been unready for more than an 15 minutes"`
++ *Message*: `{{ $labels.node }} has been unready for more than 15 minutes."`
++ *Severity*: warning
+##### Alert Name: "KubeNodeUnreachable"
++ *Message*: `{{ $labels.node }} is unreachable and some workloads may be rescheduled.`
++ *Severity*: warning
+##### Alert Name: "KubeletTooManyPods"
++ *Message*: `Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage }} of its Pod capacity.`
++ *Severity*: info
+##### Alert Name: "KubeNodeReadinessFlapping"
++ *Message*: `The readiness status of node {{ $labels.node }} has changed {{ $value }} times in the last 15 minutes.`
++ *Severity*: warning
+##### Alert Name: "KubeletPlegDurationHigh"
++ *Message*: `The Kubelet Pod Lifecycle Event Generator has a 99th percentile duration of {{ $value }} seconds on node {{ $labels.node }}.`
++ *Severity*: warning
+##### Alert Name: "KubeletPodStartUpLatencyHigh"
++ *Message*: `Kubelet Pod startup 99th percentile latency is {{ $value }} seconds on node {{ $labels.node }}.`
++ *Severity*: warning
+##### Alert Name: "KubeletClientCertificateExpiration"
++ *Message*: `Client certificate for Kubelet on node {{ $labels.node }} expires in 7 days.`
++ *Severity*: warning
+##### Alert Name: "KubeletClientCertificateExpiration"
++ *Message*: `Client certificate for Kubelet on node {{ $labels.node }} expires in 1 day.`
++ *Severity*: critical
+##### Alert Name: "KubeletServerCertificateExpiration"
++ *Message*: `Server certificate for Kubelet on node {{ $labels.node }} expires in 7 days.`
++ *Severity*: warning
+##### Alert Name: "KubeletServerCertificateExpiration"
++ *Message*: `Server certificate for Kubelet on node {{ $labels.node }} expires in 1 day.`
++ *Severity*: critical
+##### Alert Name: "KubeletClientCertificateRenewalErrors"
++ *Message*: `Kubelet on node {{ $labels.node }} has failed to renew its client certificate ({{ $value | humanize }} errors in the last 15 minutes).`
++ *Severity*: warning
+##### Alert Name: "KubeletServerCertificateRenewalErrors"
++ *Message*: `Kubelet on node {{ $labels.node }} has failed to renew its server certificate ({{ $value | humanize }} errors in the last 5 minutes).`
 + *Severity*: warning
 ##### Alert Name: "KubeVersionMismatch"
 + *Message*: `There are {{ $value }} different versions of Kubernetes components running.`
 + *Severity*: warning
 ##### Alert Name: "KubeClientErrors"
 + *Message*: `Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ $value | humanizePercentage }} errors.'`
 + *Severity*: warning
-##### Alert Name: "KubeClientErrors"
-+ *Message*: `Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ printf \"%0.0f\" $value }} errors / sec.'`
-+ *Severity*: warning
-##### Alert Name: "KubeletTooManyPods"
-+ *Message*: `Kubelet {{$labels.instance}} is running {{$value}} pods, close to the limit of 110.`
-+ *Severity*: warning
 ##### Alert Name: "KubeClientCertificateExpiration"
 + *Message*: `A client certificate used to authenticate to the apiserver is expiring in less than 7 days.`
 + *Severity*: warning
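
Note on the linking behaviour the commit title refers to: a link to a specific alert only resolves if the fragment after `#` matches the anchor generated from that alert's `##### Alert Name: "..."` heading. The sketch below approximates how a GitHub-style Markdown renderer might derive those anchors. The slug rule (lowercase, drop punctuation, turn spaces into hyphens, add `-1`, `-2`, ... to repeated headings) is an assumption about the renderer, not something stated in this commit, and `heading_to_anchor`, `anchors_for`, and the `example.com` base URL are hypothetical names used only for illustration.

```python
import re
from collections import Counter


def heading_to_anchor(heading: str) -> str:
    """Approximate a GitHub-style anchor slug for one Markdown heading (assumed rule)."""
    text = heading.lstrip("#").strip().lower()
    # Keep only word characters, hyphens, and spaces; this drops ':' and '"'.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


def anchors_for(headings):
    """Return one anchor per heading, suffixing repeats with -1, -2, ... (assumed rule)."""
    seen = Counter()
    anchors = []
    for heading in headings:
        slug = heading_to_anchor(heading)
        count = seen[slug]
        seen[slug] += 1
        anchors.append(slug if count == 0 else f"{slug}-{count}")
    return anchors


if __name__ == "__main__":
    # Headings copied from the diff above; the base URL is a placeholder.
    base_url = "https://example.com/runbook.md"
    headings = [
        '##### Alert Name: "KubeNodeNotReady"',
        '##### Alert Name: "KubeletClientCertificateExpiration"',
        '##### Alert Name: "KubeletClientCertificateExpiration"',
    ]
    for anchor in anchors_for(headings):
        print(f"{base_url}#{anchor}")
```

Under these assumptions, the two `KubeletClientCertificateExpiration` entries (warning and critical) would get distinct anchors, the second carrying a `-1` suffix, so a link built from the alert name alone would land on the first (warning) entry.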
