Cluster state and recovery constructs for in-place shard split #20979
vikasvb90 wants to merge 1 commit into opensearch-project:main
 * Applies a shard split request to the given cluster state. Updates the split metadata
 * in the index metadata and triggers a reroute so that child shards get allocated.
 */
static ClusterState applyShardSplitRequest(
We should have a separate check, as a guardrail, that disallows a split while the cluster is in mixed-version mode during an upgrade.
Do you mean upgrades from a version older than the first split-supported version? For other version upgrades we can support splitting. An upgrade from a version without split support to one with it is handled implicitly, because neither the REST action nor the admin client (invoked from a plugin) is available on a cluster manager running the older version.
If you want, I can still add an explicit check here.
Actually, I think we should prevent splitting a shard in every case where the shard is supposed to move off its node, except for shard rebalancing.
And if a split has already started, we should cancel it and let the shard relocate in those cases.
What do you think?
I have now added multiple checks in this class to enforce the shard split criteria. Cancellation of a split will happen later, as part of a new cluster state update, so it will come in the next PR. I will also need to add the cancellation reason to the _cat/recovery API to tell the user why the split was cancelled.
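To make the guardrails discussed in this thread concrete, here is a minimal sketch of how such pre-split checks could look. It is not the code in this PR: `NodeVersion`, `SplitGuard`, `PLACEHOLDER_MIN_SPLIT_VERSION`, and the `shardMustLeaveNode` flag are all hypothetical names introduced for illustration, and the version number is a placeholder, not the real minimum supported version.

```java
import java.util.List;

/** Hypothetical stand-in for a node's version; not an OpenSearch type. */
final class NodeVersion {
    final int major;
    final int minor;

    NodeVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    boolean onOrAfter(NodeVersion other) {
        return major > other.major || (major == other.major && minor >= other.minor);
    }
}

final class SplitGuard {
    /** Placeholder value for illustration only; the real minimum version is not stated in this thread. */
    static final NodeVersion PLACEHOLDER_MIN_SPLIT_VERSION = new NodeVersion(3, 0);

    /**
     * Returns null when the split may proceed, otherwise a human-readable rejection reason.
     * shardMustLeaveNode is an assumed flag: true when the shard has to move off its node
     * for a reason other than rebalancing.
     */
    static String rejectionReason(List<NodeVersion> nodeVersions, boolean shardMustLeaveNode) {
        for (NodeVersion version : nodeVersions) {
            if (!version.onOrAfter(PLACEHOLDER_MIN_SPLIT_VERSION)) {
                return "cluster contains a node below the minimum split-supported version";
            }
        }
        if (shardMustLeaveNode) {
            return "shard is required to move off its current node";
        }
        return null;
    }
}
```

Returning a reason string rather than a boolean is one possible design choice here; it would let the same text be surfaced to the user later, for example as a cancellation reason.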
/**
 * Creates a new allocation id for a child shard that is the result of a split.
 */
public static AllocationId newTargetSplit(AllocationId allocationId, String childAllocId) {
The above method generates a transient value. When do we call this method? The flow is not clear here.
Yes, some pieces are missing here to keep this PR from becoming huge, but your concern is valid, so I am linking the source code from my personal repo here. For this comment:
newSplit is called on the parent shard to generate N random UUIDs and store them in splitChildAllocationIds on the parent's allocation object.
newTargetSplit is called on each child shard. It takes one of the generated UUIDs and links the child to its parent by storing parentAllocationId.
This is the same as the relocation flow, where newRelocation on the source shard generates a random UUID for the target shard and the target shard's id becomes that UUID. That is how the two are linked.
For split, this generation happens in createRecoveringChildShards of RoutingNodes.
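The parent/child linking described above can be sketched roughly as follows. This is a simplified illustration, not the OpenSearch `AllocationId` class: `SketchAllocationId` and `newInitializing` are hypothetical, the method signatures are assumptions, and the membership check in `newTargetSplit` is an addition made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

/** Simplified, hypothetical stand-in for an allocation id; not the OpenSearch AllocationId class. */
final class SketchAllocationId {
    final String id;
    final List<String> splitChildAllocationIds; // populated on a splitting parent, otherwise empty
    final String parentAllocationId;            // populated on a child, otherwise null

    private SketchAllocationId(String id, List<String> childIds, String parentId) {
        this.id = id;
        this.splitChildAllocationIds = Collections.unmodifiableList(new ArrayList<>(childIds));
        this.parentAllocationId = parentId;
    }

    static SketchAllocationId newInitializing() {
        return new SketchAllocationId(UUID.randomUUID().toString(), Collections.emptyList(), null);
    }

    /** Parent side: keep the parent's own id and pre-generate one random id per child. */
    static SketchAllocationId newSplit(SketchAllocationId parent, int numChildren) {
        List<String> childIds = new ArrayList<>();
        for (int i = 0; i < numChildren; i++) {
            childIds.add(UUID.randomUUID().toString());
        }
        return new SketchAllocationId(parent.id, childIds, null);
    }

    /** Child side: adopt one pre-generated id and record the parent's id as the back link. */
    static SketchAllocationId newTargetSplit(SketchAllocationId parent, String childAllocId) {
        // Assumed safety check for this sketch: only ids generated by newSplit are accepted.
        if (!parent.splitChildAllocationIds.contains(childAllocId)) {
            throw new IllegalArgumentException("unknown child allocation id: " + childAllocId);
        }
        return new SketchAllocationId(childAllocId, Collections.emptyList(), parent.id);
    }
}
```

In this sketch the parent knows every child through `splitChildAllocationIds` and each child knows its parent through `parentAllocationId`, mirroring how a relocation source and target are tied together by a pre-generated id.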
/**
 * Returns the recovering child shards of this splitting shard, or null if not splitting.
 */
public ShardRouting[] getRecoveringChildShards() {
getRecoveringChildShards() returns the raw ShardRouting[] array directly. Callers can mutate the array contents, which breaks the immutability contract of ShardRouting. It should return a defensive copy or an unmodifiable view (e.g., via Arrays.copyOf, or by exposing a List).
Can you look into this?
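For reference, the two options mentioned in this comment could look roughly like the sketch below. `SplittingShardSketch` is a hypothetical holder, `String` stands in for `ShardRouting`, and `recoveringChildShardList` is an assumed method name; none of this is the PR's actual code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder standing in for a splitting shard; String replaces ShardRouting. */
final class SplittingShardSketch {
    private final String[] recoveringChildShards; // null when the shard is not splitting

    SplittingShardSketch(String[] children) {
        // Copy on the way in so the caller's array cannot alias internal state.
        this.recoveringChildShards = children == null ? null : children.clone();
    }

    /** Option 1: return a fresh copy of the array; null when not splitting. */
    String[] getRecoveringChildShards() {
        return recoveringChildShards == null
            ? null
            : Arrays.copyOf(recoveringChildShards, recoveringChildShards.length);
    }

    /** Option 2: expose an unmodifiable list; empty when not splitting. */
    List<String> recoveringChildShardList() {
        return recoveringChildShards == null ? List.of() : List.of(recoveringChildShards);
    }
}
```

The array copy keeps the existing return type at the cost of an allocation per call, while the list variant changes the signature but makes the read-only intent visible to callers.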
Signed-off-by: vikasvb90 <vikasvb@amazon.com>
Description
Add cluster state infrastructure for in-place shard split
This PR adds the cluster state update service and supporting POJO changes needed to trigger an in-place shard split.
Changes:
Cluster state update service:
Routing POJO changes to support shard split lifecycle:
The REST API is not exposed in this PR. The routing allocation logic to actually assign child shards to nodes will follow in a subsequent PR.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.