1 parent d806dee commit ce9ed5b
environments/common/files/prometheus/rules/slurm.rules
@@ -9,3 +9,8 @@ groups:
     expr: "slurm_nodes_down > 0\n"
     labels:
       severity: critical
+  - alert: SlurmNodeFail
+    annotations:
+      description: '{{ $value }} Slurm nodes are in fail status'
+      summary: 'At least one Slurm node is failed.'
+    expr: "slurm_nodes_fail > 0\n"
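
One way to check the added rule is a `promtool test rules` unit test. The following is a minimal sketch, not part of this commit. It assumes the test file sits next to `slurm.rules` (the `rule_files` path is hypothetical), that `slurm_nodes_fail` is exported without labels, and that the rule has no `for` clause or labels, as the hunk suggests. The sample values are made up for illustration.

```yaml
# Hypothetical test file, e.g. slurm.rules.test.yml (name and location are assumptions).
rule_files:
  - slurm.rules            # assumed relative path to the rules file changed above

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Illustrative data: 0 failed nodes at 0m and 1m, 2 failed nodes at 2m and 3m.
      - series: 'slurm_nodes_fail'
        values: '0 0 2 2'

    # Check the raw expression used by the alert.
    promql_expr_test:
      - expr: slurm_nodes_fail > 0
        eval_time: 3m
        exp_samples:
          - labels: 'slurm_nodes_fail'
            value: 2

    alert_rule_test:
      # No failed nodes yet, so no alert is expected (exp_alerts omitted).
      - eval_time: 1m
        alertname: SlurmNodeFail
      # Two failed nodes, so the alert should fire with the templated description.
      - eval_time: 3m
        alertname: SlurmNodeFail
        exp_alerts:
          - exp_annotations:
              description: '2 Slurm nodes are in fail status'
              summary: 'At least one Slurm node is failed.'
```

If the rule later gains a `labels` block (for example a `severity`), the expected alert would also need a matching `exp_labels` entry.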