Commit 2b690f1

Merge pull request #6793 from grondo/termination-doc-fix
doc: correct description of job termination in flux-config-exec(5)
2 parents bb62002 + c5a8f40 commit 2b690f1

1 file changed: +13 -10 lines changed

doc/man5/flux-config-exec.rst

Lines changed: 13 additions & 10 deletions
@@ -59,10 +59,10 @@ kill-timeout
 :ref:`job_termination` below for details.
 
 max-kill-count
-(optional) The maximum number of times a job will be sent ``kill-signal``
-before the execution system will consider the job unkillable and drains
-the node. The default is 8. See :ref:`job_termination` below for details.
-for details.
+(optional) The maximum number of times ``kill-signal`` will be sent to the
+job shell before the execution system considers the job unkillable and
+drains the node. The default is 8. See :ref:`job_termination` below for
+details.
 
 term-signal
 (optional) A string specifying an alternate signal to ``SIGTERM`` when
@@ -131,12 +131,15 @@ the following sequence
 - The job shells are notified to send ``term-signal`` to job tasks, unless
 the job is being terminated due to a time limit, in which case ``SIGALRM``
 is sent instead.
-- After ``kill-timeout``, any remaining shells are sent ``kill-signal``
-- This continues with an exponential backoff, with the timeout doubling
-after each attempt (capped at 300s)
-- After a total of ``max-kill-count`` attempts, any nodes still running
-processes are drained with the message: "unkillable user processes for job
-JOBID"
+- After ``kill-timeout``, job shells are notified to send ``kill-signal`` to
+tasks. This repeats every ``kill-timeout`` seconds.
+- After a delay of ``5*kill-timeout``, the job execution system transitions
+to sending ``kill-signal`` to the job shells directly.
+- This continues with an exponential backoff starting at ``kill-timeout``,
+with the timeout doubling after each attempt (capped at 300s).
+- After a total of ``max-kill-count`` attempts to signal the job shell,
+any nodes still running processes are drained with the message: "unkillable
+user processes for job JOBID."
 
 EXAMPLES
 ========
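
The revised termination sequence in the second hunk can be read as a timeline. The Python below is a rough sketch of that timeline, not code from flux-core. The function name ``termination_timeline`` and its parameters are hypothetical. It also makes assumptions the doc text does not spell out: the ``5*kill-timeout`` delay is measured from the start of termination, task-level notifications stop once the shells are signaled directly, and the drain happens one backoff interval after the last attempt.

```python
def termination_timeline(kill_timeout, max_kill_count=8, backoff_cap=300.0):
    """Return a list of (seconds_since_start, action) tuples.

    Hypothetical helper illustrating the documented sequence; timing
    details marked "assumed" are not confirmed by the source text.
    """
    # Step 1: shells are asked to deliver term-signal (or SIGALRM) to tasks.
    events = [(0.0, "term-signal -> tasks")]

    # Assumed: the 5*kill-timeout delay counts from the start of termination.
    shell_phase_start = 5 * kill_timeout

    # Step 2: shells are asked to send kill-signal to tasks every kill-timeout.
    t = kill_timeout
    while t < shell_phase_start:
        events.append((t, "kill-signal -> tasks"))
        t += kill_timeout

    # Steps 3-4: kill-signal goes to the job shells directly, with a delay
    # that starts at kill-timeout and doubles after each attempt (capped).
    t = shell_phase_start
    delay = kill_timeout
    for attempt in range(1, max_kill_count + 1):
        events.append((t, f"kill-signal -> shells (attempt {attempt})"))
        t += delay
        delay = min(delay * 2, backoff_cap)

    # Step 5 (assumed timing): nodes are drained after the final wait expires.
    events.append((t, "drain nodes"))
    return events


if __name__ == "__main__":
    for when, action in termination_timeline(5.0):
        print(f"{when:7.1f}s  {action}")
```

Called with an assumed 5-second ``kill-timeout`` and the documented default ``max-kill-count`` of 8, the sketch shows why the 300s cap matters: the last few shell attempts dominate the total time before a node is drained.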
