Commit 0107bc7

Make FS space alerts thresholds configurable (prometheus#1624)
* Make FS space alerts thresholds configurable (#1)

This makes it possible to tweak the thresholds for the NodeFilesystemSpaceFillingUp alerts. This might be necessary in systems like Kubernetes, where the image garbage collector runs at 85% disk usage, so it is not a problem when the disk reaches that usage level.

Signed-off-by: iuri aranda <[email protected]>
1 parent a7c31ff commit 0107bc7

2 files changed: +14 -2 lines changed

docs/node-mixin/alerts/alerts.libsonnet

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@
 alert: 'NodeFilesystemSpaceFillingUp',
 expr: |||
 (
-node_filesystem_avail_bytes{%(nodeExporterSelector)s,%(fsSelector)s} / node_filesystem_size_bytes{%(nodeExporterSelector)s,%(fsSelector)s} * 100 < 40
+node_filesystem_avail_bytes{%(nodeExporterSelector)s,%(fsSelector)s} / node_filesystem_size_bytes{%(nodeExporterSelector)s,%(fsSelector)s} * 100 < %(fsSpaceFillingUpWarningThreshold)d
 and
 predict_linear(node_filesystem_avail_bytes{%(nodeExporterSelector)s,%(fsSelector)s}[6h], 24*60*60) < 0
 and
@@ -28,7 +28,7 @@
 alert: 'NodeFilesystemSpaceFillingUp',
 expr: |||
 (
-node_filesystem_avail_bytes{%(nodeExporterSelector)s,%(fsSelector)s} / node_filesystem_size_bytes{%(nodeExporterSelector)s,%(fsSelector)s} * 100 < 20
+node_filesystem_avail_bytes{%(nodeExporterSelector)s,%(fsSelector)s} / node_filesystem_size_bytes{%(nodeExporterSelector)s,%(fsSelector)s} * 100 < %(fsSpaceFillingUpCriticalThreshold)d
 and
 predict_linear(node_filesystem_avail_bytes{%(nodeExporterSelector)s,%(fsSelector)s}[6h], 4*60*60) < 0
 and
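
Both hunks replace a hard-coded number with a `%(...)d` placeholder that Jsonnet's string formatting fills in from a config object. The sketch below shows that substitution in isolation. The `config` local and the shortened expressions (selectors omitted) are assumptions made for illustration; the real mixin presumably formats the full expression against its own config object.

```jsonnet
// Stand-in for the mixin's config object (an assumption for this sketch).
// Only the two threshold field names and defaults come from the commit.
local config = {
  fsSpaceFillingUpWarningThreshold: 40,
  fsSpaceFillingUpCriticalThreshold: 20,
};

{
  // The `%` operator fills each named placeholder from the matching field.
  // Label selectors are left out here to keep the strings short.
  warning: 'node_filesystem_avail_bytes / node_filesystem_size_bytes * 100 < %(fsSpaceFillingUpWarningThreshold)d' % config,
  critical: 'node_filesystem_avail_bytes / node_filesystem_size_bytes * 100 < %(fsSpaceFillingUpCriticalThreshold)d' % config,
}
```

With the default values, the rendered expressions end in `< 40` and `< 20`, which matches the behavior before this commit.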

docs/node-mixin/config.libsonnet

Lines changed: 12 additions & 0 deletions
@@ -35,6 +35,18 @@
 // just a warning for K8s nodes.
 nodeCriticalSeverity: 'critical',
 
+// Available disk space (%) thresholds on which to trigger the
+// 'NodeFilesystemSpaceFillingUp' alerts. These alerts fire if the disk
+// usage grows in a way that it is predicted to run out in 4h or 1d
+// and if the provided thresholds have been reached right now.
+// In some cases you'll want to adjust these, e.g. by default Kubernetes
+// runs the image garbage collection when the disk usage reaches 85%
+// of its available space. In that case, you'll want to reduce the
+// critical threshold below to something like 14 or 15, otherwise
+// the alert could fire under normal node usage.
+fsSpaceFillingUpWarningThreshold: 40,
+fsSpaceFillingUpCriticalThreshold: 20,
+
 grafana_prefix: '',
 },
 }
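
The comment in this hunk suggests lowering the critical threshold to 14 or 15 on Kubernetes nodes, because 85% disk usage leaves 15% available. A minimal sketch of that override follows. The `defaults` object stands in for the mixin's config and `imageGCHighThresholdPercent` is a hypothetical local name, both assumptions for illustration. In a real setup the override would more likely be applied to the imported mixin's config than to a standalone object.

```jsonnet
// Defaults as introduced by this commit (stand-in for the mixin's config).
local defaults = {
  fsSpaceFillingUpWarningThreshold: 40,
  fsSpaceFillingUpCriticalThreshold: 20,
};

// Hypothetical name: the disk usage (%) at which Kubernetes image
// garbage collection starts, per the commit's comment.
local imageGCHighThresholdPercent = 85;

// Object composition overrides only the critical threshold.
// 100 - 85 = 15, so the alert needs less than 15% available space.
local k8sConfig = defaults {
  fsSpaceFillingUpCriticalThreshold: 100 - imageGCHighThresholdPercent,
};

k8sConfig
```

The warning threshold keeps its default of 40 here. Whether it should also be lowered depends on how much headroom the node normally runs with.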
