Commit e8deda2

Add systemd service crashlooping alert (#30)
* Add NodeSystemdServiceCrashlooping alert

Signed-off-by: Vitaly Zhuravlev <[email protected]>

* Fix typo

Signed-off-by: Vitaly Zhuravlev <[email protected]>

---------

Signed-off-by: Vitaly Zhuravlev <[email protected]>
1 parent dc6a1e8 commit e8deda2

1 file changed: +14 -0 lines changed

docs/node-observ-lib/linux/alerts.libsonnet

Lines changed: 14 additions & 0 deletions
@@ -414,6 +414,20 @@
               description: 'Systemd service {{ $labels.name }} has entered failed state at {{ $labels.instance }}',
             },
           },
+          {
+            alert: 'NodeSystemdServiceCrashlooping',
+            expr: |||
+              increase(node_systemd_service_restart_total{%(filteringSelector)s}[5m]) > 2
+            ||| % this.config,
+            'for': '15m',
+            labels: {
+              severity: 'warning',
+            },
+            annotations: {
+              summary: 'Systemd service keeps restarting, possibly crash looping.',
+              description: 'Systemd service {{ $labels.name }} has been restarted too many times at {{ $labels.instance }} for the last 15 minutes. Please check if service is crash looping.',
+            },
+          },
         ]
         + if this.config.enableHardware then
           [{
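
The added rule builds its PromQL expression from a `|||` text block formatted with `% this.config`. The following is a minimal sketch of how that templating might render, not the library's actual configuration: the `config` object and the `job="node"` selector value are hypothetical stand-ins for `this.config`, whose real `filteringSelector` value is not shown in this commit.

```jsonnet
// Hypothetical stand-in for this.config; the real selector comes from
// the library's configuration and is an assumption here.
local config = {
  filteringSelector: 'job="node"',
};

{
  alert: 'NodeSystemdServiceCrashlooping',
  // %(filteringSelector)s is replaced with the field of the same name
  // from the object on the right-hand side of the % operator.
  expr: |||
    increase(node_systemd_service_restart_total{%(filteringSelector)s}[5m]) > 2
  ||| % config,
  // 'for' is quoted because `for` is a reserved word in Jsonnet.
  'for': '15m',
}
```

Evaluated with the `jsonnet` command-line tool, the `expr` field should come out as `increase(node_systemd_service_restart_total{job="node"}[5m]) > 2` followed by a trailing newline. Combined with `'for': '15m'`, Prometheus fires the alert only once that condition (more than two restarts within a 5-minute window) has held continuously for 15 minutes.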
