Commit 951ee9e

Auto-restart slurmctld on failure after 1 second.
Signed-off-by: Giacomo Marciani <[email protected]>
1 parent de6eacd commit 951ee9e

1 file changed: +2 -0 lines changed

cookbooks/aws-parallelcluster-slurm/templates/default/slurm/head_node/slurmctld.service.erb

Lines changed: 2 additions & 0 deletions
@@ -12,6 +12,8 @@ ExecReload=/bin/kill -HUP $MAINPID
 LimitNOFILE=562930
 LimitMEMLOCK=infinity
 LimitSTACK=infinity
+Restart=on-failure
+RestartSec=1s
 
 [Install]
 WantedBy=multi-user.target
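
For context, the effect of the two added directives can be sketched as an annotated fragment. This is a minimal illustration assuming standard systemd semantics; it is not a copy of the full slurmctld.service.erb template, and the comments are explanatory additions rather than part of the commit.

```ini
# Illustrative fragment only (assumption: standard systemd.service behavior).
[Service]
# Restart the service when the main process exits with a non-zero status,
# is terminated by a signal other than a clean one (for example SIGTERM),
# or when an operation or watchdog timeout is hit.
# A clean exit (status 0) or an explicit "systemctl stop" does not trigger a restart.
Restart=on-failure
# Wait 1 second between the failure and the restart attempt
# (systemd's default delay is 100ms).
RestartSec=1s
```

On a running head node, a command such as `systemctl show slurmctld -p Restart -p RestartUSec` should report the effective values, assuming the unit is installed under the name slurmctld.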
