Commit cc4a604

matthewabbott authored and georgewallace committed
Add link to MAX_RETRY allocation explain docs (elastic#113657)
1 parent 673b58b commit cc4a604

4 files changed: +13 -5 lines changed

docs/reference/cluster/allocation-explain.asciidoc

Lines changed: 7 additions & 4 deletions
@@ -159,6 +159,7 @@ node.
 <5> The decider which led to the `no` decision for the node.
 <6> An explanation as to why the decider returned a `no` decision, with a helpful hint pointing to the setting that led to the decision. In this example, a newly created index has <<indices-get-settings,an index setting>> that requires that it only be allocated to a node named `nonexistent_node`, which does not exist, so the index is unable to allocate.
 
+[[maximum-number-of-retries-exceeded]]
 ====== Maximum number of retries exceeded
 
 The following response contains an allocation explanation for an unassigned
@@ -195,17 +196,19 @@ primary shard that has reached the maximum number of allocation retry attempts.
 {
 "decider": "max_retry",
 "decision" : "NO",
-"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2024-07-30T21:04:12.166Z], failed_attempts[5], failed_nodes[[mEKjwwzLT1yJVb8UxT6anw]], delayed=false, details[failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException], allocation_status[deciders_no]]]"
+"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [POST /_cluster/reroute?retry_failed] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2024-07-30T21:04:12.166Z], failed_attempts[5], failed_nodes[[mEKjwwzLT1yJVb8UxT6anw]], delayed=false, details[failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException], allocation_status[deciders_no]]]"
 }
 ]
 }
 ]
 }
 ----
 // NOTCONSOLE
-
-If decider message indicates a transient allocation issue, use
-the <<cluster-reroute,cluster reroute>> API to retry allocation.
+When Elasticsearch is unable to allocate a shard, it will attempt to retry allocation up to
+the maximum number of retries allowed. After this, Elasticsearch will stop attempting to
+allocate the shard in order to prevent infinite retries which may impact cluster
+performance. Run the <<cluster-reroute,cluster reroute>> API to retry allocation, which
+will allocate the shard if the issue preventing allocation has been resolved.
 
 [[no-valid-shard-copy]]
 ====== No valid shard copy
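
The retry call that the new documentation paragraph points to, `POST /_cluster/reroute?retry_failed`, could be issued from Java roughly as sketched below. This is an illustration only, not part of the commit: the class and method names are hypothetical, and the base URL `http://localhost:9200` is an assumed address of an unsecured local node. The sketch builds the request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the Elasticsearch code base.
final class RetryFailedAllocation {

    // Builds (but does not send) the reroute request that retries failed allocations.
    // Assumption: baseUrl is the HTTP address of any node in the cluster.
    static HttpRequest buildRetryRequest(String baseUrl) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/_cluster/reroute?retry_failed"))
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    }

    public static void main(String[] args) {
        // Assumed local, unsecured node; adjust the address for a real cluster.
        HttpRequest request = buildRetryRequest("http://localhost:9200");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send it, the request can be passed to `java.net.http.HttpClient#send` with a `BodyHandlers.ofString()` handler. A secured cluster would additionally need authentication headers, which are omitted here.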

server/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/MaxRetryAllocationDecider.java

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
import org.elasticsearch.cluster.routing.ShardRouting;
1515
import org.elasticsearch.cluster.routing.UnassignedInfo;
1616
import org.elasticsearch.cluster.routing.allocation.RoutingAllocation;
17+
import org.elasticsearch.common.ReferenceDocs;
1718
import org.elasticsearch.common.settings.Setting;
1819

1920
/**
@@ -72,9 +73,11 @@ private static Decision debugDecision(Decision decision, UnassignedInfo info, in
7273
return Decision.single(
7374
Decision.Type.NO,
7475
NAME,
75-
"shard has exceeded the maximum number of retries [%d] on failed allocation attempts - manually call [%s] to retry, [%s]",
76+
"shard has exceeded the maximum number of retries [%d] on failed allocation attempts - "
77+
+ "manually call [%s] to retry, and for more information, see [%s] [%s]",
7678
maxRetries,
7779
RETRY_FAILED_API,
80+
ReferenceDocs.ALLOCATION_EXPLAIN_MAX_RETRY,
7881
info.toString()
7982
);
8083
} else {
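
To see how the new four-placeholder template lines up with its arguments, here is a small standalone sketch. It assumes the template is expanded with `String.format`-style substitution. The class name is hypothetical, and the argument values (`<docs link>`, `unassigned_info[...]`, and the retry API string taken from the documentation example) are placeholders rather than the values Elasticsearch produces.

```java
import java.util.Locale;

// Hypothetical demo class; only TEMPLATE is taken from the diff above.
final class MaxRetryMessageSketch {

    // The format string added by this commit (two concatenated literals).
    static final String TEMPLATE = "shard has exceeded the maximum number of retries [%d] on failed allocation attempts - "
        + "manually call [%s] to retry, and for more information, see [%s] [%s]";

    // Argument order mirrors the diff: maxRetries, retry API, docs link, unassigned info.
    static String render(int maxRetries, String retryApi, String docsLink, String unassignedInfo) {
        return String.format(Locale.ROOT, TEMPLATE, maxRetries, retryApi, docsLink, unassignedInfo);
    }

    public static void main(String[] args) {
        // Placeholder arguments for illustration only.
        System.out.println(render(5, "POST /_cluster/reroute?retry_failed", "<docs link>", "unassigned_info[...]"));
    }
}
```

The docs link is inserted as the third substitution, so the unassigned-info details remain the last bracketed element of the message.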

server/src/main/java/org/elasticsearch/common/ReferenceDocs.java

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -82,6 +82,7 @@ public enum ReferenceDocs {
8282
FORMING_SINGLE_NODE_CLUSTERS,
8383
CIRCUIT_BREAKER_ERRORS,
8484
ALLOCATION_EXPLAIN_NO_COPIES,
85+
ALLOCATION_EXPLAIN_MAX_RETRY,
8586
// this comment keeps the ';' on the next line so every entry above has a trailing ',' which makes the diff for adding new links cleaner
8687
;
8788

server/src/main/resources/org/elasticsearch/common/reference-docs-links.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -44,3 +44,4 @@ X_OPAQUE_ID api-conventions.
4444
FORMING_SINGLE_NODE_CLUSTERS modules-discovery-bootstrap-cluster.html#modules-discovery-bootstrap-cluster-joining
4545
CIRCUIT_BREAKER_ERRORS circuit-breaker-errors.html
4646
ALLOCATION_EXPLAIN_NO_COPIES cluster-allocation-explain.html#no-valid-shard-copy
47+
ALLOCATION_EXPLAIN_MAX_RETRY cluster-allocation-explain.html#maximum-number-of-retries-exceeded
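
Each entry in the links file pairs an enum constant name with a relative documentation path. The sketch below shows one way such a file could be read into a lookup table. It is not the loader Elasticsearch uses: the class name is hypothetical, and it assumes nothing more than one `NAME path` pair per line, separated by whitespace.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for illustration; the real loading logic may differ.
final class ReferenceLinksSketch {

    // Maps each symbol name to its relative docs path, preserving file order.
    static Map<String, String> parse(String contents) {
        Map<String, String> links = new LinkedHashMap<>();
        for (String line : contents.split("\\R")) {
            String trimmed = line.strip();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // Split into name and path on the first run of whitespace.
            String[] parts = trimmed.split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("malformed line: " + line);
            }
            links.put(parts[0], parts[1]);
        }
        return links;
    }

    public static void main(String[] args) {
        String sample = "ALLOCATION_EXPLAIN_NO_COPIES cluster-allocation-explain.html#no-valid-shard-copy\n"
            + "ALLOCATION_EXPLAIN_MAX_RETRY cluster-allocation-explain.html#maximum-number-of-retries-exceeded\n";
        System.out.println(parse(sample).get("ALLOCATION_EXPLAIN_MAX_RETRY"));
    }
}
```

The anchor after `#` in the new entry matches the `[[maximum-number-of-retries-exceeded]]` block ID added to `allocation-explain.asciidoc` in the same commit.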
