-
Notifications
You must be signed in to change notification settings - Fork 25.5k
Add support for delegating write to split target #136241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add support for delegating write to split target #136241
Conversation
…itRequestOnSourceTwoPass Refresh
…com:ankikuma/elasticsearch into 09162025/ReshardSplitRequestOnSourceTwoPass merged
…itRequestOnSourceTwoPass Refresh
…com:ankikuma/elasticsearch into 09162025/ReshardSplitRequestOnSourceTwoPass pull
…itRequestOnSourceTwoPass Refresh
…itRequestOnSourceTwoPass Refresh
…itRequestOnSourceTwoPass Refresh
…itRequestOnSourceTwoPass Refresh
…com:ankikuma/elasticsearch into 09162025/ReshardSplitRequestOnSourceTwoPass Pull
…itRequestOnSourceTwoPass Refresh
…com:ankikuma/elasticsearch into ankikuma-09162025/ReshardSplitRequestOnSourceTwoPass
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing) |
…hardSplitRequestOnSourceTwoPass
…hardSplitRequestOnSourceTwoPass
...r/src/main/java/org/elasticsearch/action/support/replication/TransportReplicationAction.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/action/bulk/ShardBulkSplitHelper.java
Outdated
Show resolved
Hide resolved
public static Tuple<BulkShardResponse, Exception> combineResponses( | ||
BulkShardRequest originalRequest, | ||
Map<ShardId, BulkShardRequest> splitRequests, | ||
Map<ShardId, Tuple<BulkShardResponse, Exception>> responses |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need Either
type :)
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
Outdated
Show resolved
Hide resolved
LGTM |
…hardSplitRequestOnSourceTwoPass
…hardSplitRequestOnSourceTwoPass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. I had a few questions/suggestions but nothing major. We probably want some more end-to-end ITs when this lands.
|
||
private ShardBulkSplitHelper() {} | ||
|
||
public static Map<ShardId, BulkShardRequest> splitRequests(BulkShardRequest request, ProjectMetadata project) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we document this a bit? Like I'm assuming that the caller has blocked handoff here by taking permits. It looks like there's also an expectation that the caller won't call this function unless it has determined that the coordinator's shard summary doesn't match the shard's, so it doesn't need to fast-path for that.
And there looks like there's an expectation on the caller that if the request has no items, an entry for the source shard with no items will be returned, but that otherwise this function should not generate empty sub-requests (e.g., if it created a map eagerly for both source and target that would be a bug).
} | ||
} | ||
BulkShardResponse bulkShardResponse = new BulkShardResponse(originalRequest.shardId(), bulkItemResponses); | ||
// TODO: Decide how to handle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you looked at consumers of this? If so could you leave a breadcrumb to your investigation? If not, we should ticket that and link the ticket here.
import java.util.concurrent.ConcurrentHashMap; | ||
import java.util.function.Supplier; | ||
|
||
public class ReplicationSplitHelper< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to have a little javadoc explaining what this class is for if you get a chance
new ShardId(index, newShardId), | ||
shardNum -> new ArrayList<>() | ||
); | ||
shardRequests.add(new BulkItemRequest(bulkItemRequest.id(), bulkItemRequest.request())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we assert anything about the shard id in bulkItemRequest.request()
? From skimming code I think it's probably null (thin serialization) but I don't know if it always is, or how it would be used if it's not null.
*/ | ||
int route(IndexRouting indexRouting); | ||
|
||
int rerouteAtSourceDuringResharding(IndexRouting indexRouting); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc comment? I think they're nice for interface/abstract methods.
*/ | ||
public abstract int indexShard(IndexRequest indexRequest); | ||
|
||
public abstract int rerouteToTarget(IndexRequest indexRequest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc comment? :)
} | ||
return indexShard(indexRequest); | ||
} else if (addIdWithRoutingHash) { | ||
// TODO: is this correct? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're probably going to have to generate test cases for tsdb/logsdb
IndexMetadata indexMetadata = IndexMetadata.builder(indexName).settings(settings).build(); | ||
indexMetadata = IndexMetadata.builder(indexMetadata).reshardAddShards(2).build(); | ||
|
||
SplitShardCountSummary staleSummary = SplitShardCountSummary.fromInt(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
personally I'd prefer to generate this from a 1 shard metadata instead of assuming the serialization meaning, but I suppose if we changed serialization we'd notice.
This commit adds the logic to delegate bulk shard requests to the split
target when a primary receives a request from a stale coordinator.