Commit d4069ef

Add note on troubleshooting laggy cancellations (#97485) (#97499)
Today we document that tasks may not react to cancellations immediately, but in practice it's surprising to users and kind of a bug if they run for too long after being cancelled. This commit adds a little extra detail about the information to collect to troubleshoot such a situation.
1 parent 7d11e41 commit d4069ef

2 files changed: 13 additions & 7 deletions

docs/reference/cluster/nodes-hot-threads.asciidoc

Lines changed: 3 additions & 2 deletions
@@ -6,7 +6,6 @@
 
 Returns the hot threads on each selected node in the cluster.
 
-
 [[cluster-nodes-hot-threads-api-request]]
 ==== {api-request-title}
 
@@ -53,7 +52,9 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=node-id]
 
 `threads`::
 (Optional, integer) Specifies the number of hot threads to provide
-information for. Defaults to `3`.
+information for. Defaults to `3`. If you are using this API for
+troubleshooting, set this parameter to a large number (e.g.
+`9999`) to get information about all the threads in the system.
 
 include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=timeoutparms]
 

docs/reference/cluster/tasks.asciidoc

Lines changed: 10 additions & 5 deletions
@@ -245,11 +245,16 @@ POST _tasks/_cancel?nodes=nodeId1,nodeId2&actions=*reindex
 --------------------------------------------------
 
 A task may continue to run for some time after it has been cancelled because it
-may not be able to safely stop its current activity straight away. The list
-tasks API will continue to list these cancelled tasks until they complete. The
-`cancelled` flag in the response to the list tasks API indicates that the
-cancellation command has been processed and the task will stop as soon as
-possible.
+may not be able to safely stop its current activity straight away, or because
+{es} must complete its work on other tasks before it can process the
+cancellation. The list tasks API will continue to list these cancelled tasks
+until they complete. The `cancelled` flag in the response to the list tasks API
+indicates that the cancellation command has been processed and the task will
+stop as soon as possible. To troubleshoot why a cancelled task does not
+complete promptly, use the list tasks API with the `?detailed` parameter to
+identify the other tasks the system is running and also use the
+<<cluster-nodes-hot-threads>> API to obtain detailed information about the work
+the system is doing instead of completing the cancelled task.
 
 ===== Task Grouping
 
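The troubleshooting steps added in this commit amount to two GET requests: the list tasks API with `detailed`, and the nodes hot threads API with a large `threads` value. The sketch below only builds those two URLs. It is illustrative and not part of the commit: the base URL `http://localhost:9200` and the helper name `troubleshooting_urls` are assumptions, and authentication and TLS settings are left out.

```python
from urllib.parse import urlencode

# Assumption: a cluster reachable at this address; adjust for your deployment.
BASE_URL = "http://localhost:9200"


def troubleshooting_urls(base_url=BASE_URL, threads=9999):
    """Build the two GET URLs described in the updated docs (hypothetical helper)."""
    base = base_url.rstrip("/")
    return {
        # List tasks API with ?detailed, to see what else the system is running.
        "tasks": f"{base}/_tasks?" + urlencode({"detailed": "true"}),
        # Hot threads API with a large thread count, to cover all threads.
        "hot_threads": f"{base}/_nodes/hot_threads?" + urlencode({"threads": threads}),
    }


if __name__ == "__main__":
    for name, url in troubleshooting_urls().items():
        print(f"{name}: GET {url}")
```

Collecting both responses at roughly the same time should make it easier to relate the tasks still running to the threads doing the work.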