
Commit 91035f8

(Doc+) Allocation Explain Examples: THROTTLED, MAX_RETRY (#111558) (#112104)
Adds [Allocation Explain examples](https://www.elastic.co/guide/en/elasticsearch/reference/master/cluster-allocation-explain.html#cluster-allocation-explain-api-examples) for `THROTTLED` and `MAX_RETRY`. Also formats the sub-TOC so that code messages can later link to those docs.
1 parent f3f0d2e commit 91035f8

1 file changed: +100 -1 lines changed


docs/reference/cluster/allocation-explain.asciidoc

Lines changed: 100 additions & 1 deletion
@@ -81,6 +81,7 @@ you might expect otherwise.

 ===== Unassigned primary shard

+====== Conflicting settings
 The following request gets an allocation explanation for an unassigned primary
 shard.

@@ -158,6 +159,56 @@ node.
 <5> The decider which led to the `no` decision for the node.
 <6> An explanation as to why the decider returned a `no` decision, with a helpful hint pointing to the setting that led to the decision. In this example, a newly created index has <<indices-get-settings,an index setting>> that requires that it only be allocated to a node named `nonexistent_node`, which does not exist, so the index is unable to allocate.

+====== Maximum number of retries exceeded
+
+The following response contains an allocation explanation for an unassigned
+primary shard that has reached the maximum number of allocation retry attempts.
+
+[source,js]
+----
+{
+  "index" : "my-index-000001",
+  "shard" : 0,
+  "primary" : true,
+  "current_state" : "unassigned",
+  "unassigned_info" : {
+    "at" : "2017-01-04T18:03:28.464Z",
+    "details" : "failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException",
+    "reason": "ALLOCATION_FAILED",
+    "failed_allocation_attempts": 5,
+    "last_allocation_status": "no"
+  },
+  "can_allocate": "no",
+  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
+  "node_allocation_decisions" : [
+    {
+      "node_id" : "3sULLVJrRneSg0EfBB-2Ew",
+      "node_name" : "node_t0",
+      "transport_address" : "127.0.0.1:9400",
+      "roles" : ["data_content", "data_hot"],
+      "node_decision" : "no",
+      "store" : {
+        "matching_size" : "4.2kb",
+        "matching_size_in_bytes" : 4325
+      },
+      "deciders" : [
+        {
+          "decider": "max_retry",
+          "decision" : "NO",
+          "explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2024-07-30T21:04:12.166Z], failed_attempts[5], failed_nodes[[mEKjwwzLT1yJVb8UxT6anw]], delayed=false, details[failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException], allocation_status[deciders_no]]]"
+        }
+      ]
+    }
+  ]
+}
+----
+// NOTCONSOLE
+
+If the decider message indicates a transient allocation issue, use
+<<cluster-reroute,the cluster reroute API>> to retry allocation.
+
+====== No valid shard copy
+
 The following response contains an allocation explanation for an unassigned
 primary shard that was previously allocated.

@@ -184,6 +235,8 @@ TIP: If a shard is unassigned with an allocation status of `no_valid_shard_copy`

 ===== Unassigned replica shard

+====== Allocation delayed
+
 The following response contains an allocation explanation for a replica that's
 unassigned due to <<delayed-allocation,delayed allocation>>.

@@ -241,8 +294,52 @@ unassigned due to <<delayed-allocation,delayed allocation>>.
 <2> The remaining delay before allocating the replica shard.
 <3> Information about the shard data found on a node.

+====== Allocation throttled
+
+The following response contains an allocation explanation for a replica that's
+queued to allocate but currently waiting on other queued shards.
+
+[source,js]
+----
+{
+  "index" : "my-index-000001",
+  "shard" : 0,
+  "primary" : false,
+  "current_state" : "unassigned",
+  "unassigned_info" : {
+    "reason" : "NODE_LEFT",
+    "at" : "2017-01-04T18:53:59.498Z",
+    "details" : "node_left[G92ZwuuaRY-9n8_tc-IzEg]",
+    "last_allocation_status" : "no_attempt"
+  },
+  "can_allocate": "throttled",
+  "allocate_explanation": "Elasticsearch is currently busy with other activities. It expects to be able to allocate this shard when those activities finish. Please wait.",
+  "node_allocation_decisions" : [
+    {
+      "node_id" : "3sULLVJrRneSg0EfBB-2Ew",
+      "node_name" : "node_t0",
+      "transport_address" : "127.0.0.1:9400",
+      "roles" : ["data_content", "data_hot"],
+      "node_decision" : "no",
+      "deciders" : [
+        {
+          "decider": "throttling",
+          "decision": "THROTTLE",
+          "explanation": "reached the limit of incoming shard recoveries [2], cluster setting [cluster.routing.allocation.node_concurrent_incoming_recoveries=2] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
+        }
+      ]
+    }
+  ]
+}
+----
+// NOTCONSOLE
+
+This is a transient message that might appear when a large number of shards are allocating.
+
 ===== Assigned shard

+====== Cannot remain on current node
+
 The following response contains an allocation explanation for an assigned shard.
 The response indicates the shard is not allowed to remain on its current node
 and must be reallocated.
@@ -295,6 +392,8 @@ and must be reallocated.
 <2> The deciders that factored into the decision of why the shard is not allowed to remain on its current node.
 <3> Whether the shard is allowed to be allocated to another node.

+====== Must remain on current node
+
 The following response contains an allocation explanation for a shard that must
 remain on its current node. Moving the shard to another node would not improve
 cluster balance.
@@ -338,7 +437,7 @@ cluster balance.
 ===== No arguments

 If you call the API with no arguments, {es} retrieves an allocation explanation
-for an arbitrary unassigned primary or replica shard.
+for an arbitrary unassigned primary or replica shard, returning any unassigned primary shards first.

 [source,console]
 ----
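
As an illustration of how a client might act on the two responses this commit documents, here is a minimal Python sketch. The function name `suggest_action` and its return labels are hypothetical and not part of any Elasticsearch client. It only inspects the `can_allocate` field and the decider names that appear in the examples above, and the retry request string is taken from the `max_retry` explanation message.

```python
from typing import Any, Dict

# Request named in the max_retry decider explanation (cluster reroute API).
RETRY_REQUEST = "POST /_cluster/reroute?retry_failed=true"


def suggest_action(explain: Dict[str, Any]) -> str:
    """Map an allocation explain response (already parsed from JSON) to a
    hypothetical label: "wait", "retry_failed", or "investigate"."""
    # A throttled shard is queued behind other recoveries; the docs describe
    # this as transient, so the sketch simply suggests waiting.
    if explain.get("can_allocate") == "throttled":
        return "wait"

    # Collect the names of all deciders reported for any node.
    deciders = {
        decider.get("decider")
        for node in explain.get("node_allocation_decisions", [])
        for decider in node.get("deciders", [])
    }

    # The max_retry decider means the retry limit was reached; once the
    # underlying failure is resolved, allocation can be retried manually.
    if "max_retry" in deciders:
        return "retry_failed"

    # Anything else needs a closer look at the decider explanations.
    return "investigate"
```

When the label is `retry_failed`, the request in `RETRY_REQUEST` is the one the decider message itself suggests. The sketch deliberately does not send it, since retrying is only worthwhile once the underlying failure is resolved.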
