@@ -59,10 +59,10 @@ kill-timeout
    :ref:`job_termination` below for details.
 
 max-kill-count
-   (optional) The maximum number of times a job will be sent ``kill-signal``
-   before the execution system will consider the job unkillable and drains
-   the node. The default is 8. See :ref:`job_termination` below for details.
-   for details.
+   (optional) The maximum number of times ``kill-signal`` will be sent to the
+   job shell before the execution system considers the job unkillable and
+   drains the node. The default is 8. See :ref:`job_termination` below for
+   details.
 
 term-signal
    (optional) A string specifying an alternate signal to ``SIGTERM`` when
@@ -131,12 +131,15 @@ the following sequence
 - The job shells are notified to send ``term-signal`` to job tasks, unless
   the job is being terminated due to a time limit, in which case ``SIGALRM``
   is sent instead.
-- After ``kill-timeout``, any remaining shells are sent ``kill-signal``
-- This continues with an exponential backoff, with the timeout doubling
-  after each attempt (capped at 300s)
-- After a total of ``max-kill-count`` attempts, any nodes still running
-  processes are drained with the message: "unkillable user processes for job
-  JOBID"
+- After ``kill-timeout``, job shells are notified to send ``kill-signal`` to
+  tasks. This repeats every ``kill-timeout`` seconds.
+- After a delay of ``5*kill-timeout``, the job execution system transitions
+  to sending ``kill-signal`` to the job shells directly.
+- This continues with an exponential backoff starting at ``kill-timeout``,
+  with the timeout doubling after each attempt (capped at 300s).
+- After a total of ``max-kill-count`` attempts to signal the job shell,
+  any nodes still running processes are drained with the message: "unkillable
+  user processes for job JOBID."
 
 EXAMPLES
 ========
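
Review note: the backoff described in the new termination text can be sanity-checked with a small sketch. This is not the execution system's code. The function name `shell_kill_waits` is hypothetical, the 5-second `kill-timeout` is an illustrative value, and the sketch assumes one wait follows each direct `kill-signal` attempt on the job shell, starting at `kill-timeout`, doubling each time, and capped at 300s as the text states.

```python
def shell_kill_waits(kill_timeout: float, max_kill_count: int,
                     cap: float = 300.0) -> list[float]:
    """Return the assumed wait (in seconds) after each direct kill-signal attempt.

    Hypothetical helper: models only the doubling-with-cap rule from the text,
    not the earlier phase where job shells signal their own tasks.
    """
    waits = []
    wait = kill_timeout
    for _ in range(max_kill_count):
        waits.append(min(wait, cap))  # the timeout never exceeds the cap
        wait *= 2                     # double after each attempt
    return waits


if __name__ == "__main__":
    # 8 is the documented default for max-kill-count; 5.0 is illustrative.
    for attempt, wait in enumerate(shell_kill_waits(5.0, 8), start=1):
        print(f"attempt {attempt}: wait {wait:g}s")
```

Under these assumptions the cap takes effect from the seventh attempt onward, so a node is only drained after several minutes of failed attempts rather than after a fixed multiple of `kill-timeout`.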