Commit 3830dfd
Fix KubeClientCertificateExpiration alerts (#941)
1) Change the aggregation from `by (le)` to `without(service,endpoint...)`. This drops only the useless labels and keeps external labels (such as environment) intact; with `by (le)` they were dropped.
2) Change the order of the metrics in the expression: the `apiserver_client_certificate_expiration_seconds_bucket` metric now comes first, so Grafana->Explore queries show the actual expiration time as the result rather than the `apiserver_client_certificate_expiration_seconds_count` value (which is quite useless). This makes it easier to troubleshoot.
3) Finally, change the vector matching from `on(job)` to `on(job, cluster, instance)`. Otherwise a single instance with a certificate expiration problem would be enough to set all apiservers to 'firing' (a false positive).

1 parent c70f03d commit 3830dfd
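The three changes above can be sketched as a single PromQL expression. This is an illustrative reconstruction, not the committed rule: the `job="apiserver"` selector, the `0.01` quantile, the `5m` rate window, and the `604800` (7 days) threshold are all assumptions, and the `without` list shows only the two labels the message names because the message truncates the rest.

```promql
# Sketch only: selector, quantile, rate window, and threshold are assumed values.
# (1) `without (...)` drops only the listed labels, so external labels survive.
# (2) The bucket-based quantile comes first, so its value is what the query returns.
histogram_quantile(
  0.01,
  sum without (service, endpoint) (
    rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m])
  )
) < 604800
# (3) Match per instance, so one affected apiserver does not fire for all of them.
and on (job, cluster, instance)
apiserver_client_certificate_expiration_seconds_count{job="apiserver"} > 0
```

Because `and` keeps the left-hand samples, the result carries the quantile value (seconds until expiry) and the left-hand labels, which is what makes the output readable in Grafana Explore.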
1 file changed, +6 -2 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 50 | 50 | |
| 51 | 51 | |
| 52 | 52 | |
| 53 | | - |
| | 53 | + |
| | 54 | + |
| | 55 | + |
| 54 | 56 | |
| 55 | 57 | |
| 56 | 58 | |
| | | |
| 64 | 66 | |
| 65 | 67 | |
| 66 | 68 | |
| 67 | | - |
| | 69 | + |
| | 70 | + |
| | 71 | + |
| 68 | 72 | |
| 69 | 73 | |
| 70 | 74 | |