Calculate routing num shards correctly during reshard #125601
…tingNumShards refresh
…tingNumShards Refresh elasticsearch branch
…tingNumShards Refresh elasticsearch
| public Builder reshardAddShards(int shardCount) { | ||
| // Assert routingNumShards is null ? | ||
| // Assert numberOfShards > 0 | ||
| public Builder reshardAddShards(int shardCount, IndexMetadata sourceMetadata) { |
Changing the API this way makes it possible to accidentally supply the wrong metadata to the method, and we know we've already provided the right metadata in the builder's constructor.
I think the reason you're passing it in is that getIndexNumberOfRoutingShards wants a metadata object but we've already decomposed it into parts in the constructor. I think it's probably fine to just hold on to a reference to the whole thing for the life of the builder (i.e., this.indexMetadata = indexMetadata in the constructor or something), or to change the interface to getIndexNumberOfRoutingShards, which only has a handful of users that mostly pass in null. Maybe the first option is the simplest?
I agree, I didn't like passing in the sourceMetadata either. But I don't know if I want to hold on to a reference to the whole thing, because it looks like we would end up with some kind of recursion in toXContent(), wouldn't we?
You're right, I hadn't noticed that when IndexMetadata goes over the wire it passes through the Builder interface.
Looking at MetadataCreateIndexService::getIndexNumberOfRoutingShards, the only thing it actually uses sourceMetadata for is to call getNumberOfShards on it if it exists. So one option to avoid passing in this essentially redundant sourceMetadata field would be to refactor getIndexNumberOfRoutingShards a bit to have an inner method that just takes routingNumShards, or 0 if metadata is null, which you could call directly from here. The existing getIndexNumberOfRoutingShards(Settings indexSettings, @Nullable IndexMetadata sourceMetadata) would then just be something like return getIndexNumberOfRoutingShards(indexSettings, sourceMetadata == null ? 0 : sourceMetadata.getRoutingNumShards());
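For readers following along, the overload-delegation shape suggested above might look roughly like the sketch below. It is not the Elasticsearch implementation: `RoutingShardsSketch`, the nested `SourceMetadata` class, and the `Map`-based settings are hypothetical stand-ins for `MetadataCreateIndexService`, `IndexMetadata`, and `Settings`, and the fallback in the inner method is a placeholder rather than the real routing-shard calculation.

```java
import java.util.Map;

class RoutingShardsSketch {
    // Hypothetical stand-in for IndexMetadata; only the accessor the sketch needs.
    static final class SourceMetadata {
        private final int routingNumShards;

        SourceMetadata(int routingNumShards) {
            this.routingNumShards = routingNumShards;
        }

        int getRoutingNumShards() {
            return routingNumShards;
        }
    }

    // Existing-style entry point: unwraps the nullable metadata and delegates.
    static int getIndexNumberOfRoutingShards(Map<String, Integer> indexSettings, SourceMetadata sourceMetadata) {
        return getIndexNumberOfRoutingShards(
            indexSettings,
            sourceMetadata == null ? 0 : sourceMetadata.getRoutingNumShards()
        );
    }

    // Inner overload: takes the source routing shard count directly, with 0 meaning "no source index".
    // A builder that has no IndexMetadata reference could call this one.
    static int getIndexNumberOfRoutingShards(Map<String, Integer> indexSettings, int sourceRoutingNumShards) {
        if (sourceRoutingNumShards > 0) {
            return sourceRoutingNumShards;
        }
        // Placeholder assumption: fall back to the configured shard count.
        // The real method derives a routing factor differently.
        return indexSettings.getOrDefault("index.number_of_shards", 1);
    }
}
```

The point of the split is that callers holding only primitive state can use the inner overload, while existing callers that pass metadata (or null) keep their signature.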
Yes sure, I refactored the code.
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)
…tingNumShards Refresh ES branch
…tingNumShards merge with main
…tingNumShards Refresh elasticsearch branch
…tingNumShards refresh branch
bcully left a comment
Looks good, just the one comment.
| // Assert routingNumShards is null ? | ||
| // Assert numberOfShards > 0 | ||
| if (shardCount % numberOfShards() != 0) { | ||
| public Builder reshardAddShards(int targetShardCount, final int sourceNumShards) { |
Isn't sourceNumShards extractable from this.settings? Passing it in opens the door to the same kind of inconsistency I was worried about before when we were passing in the whole metadata.
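As an illustration of the point above, a builder can read the current shard count from its own settings instead of accepting it as a parameter. The sketch below is an assumption-laden stand-in, not the real `IndexMetadata.Builder`: `ReshardBuilderSketch` and its `Map`-based settings are hypothetical, and both throwing `IllegalArgumentException` and writing the new count back into the settings are illustrative choices. Only the multiple-of check mirrors the `shardCount % numberOfShards() != 0` condition visible in the diff.

```java
import java.util.HashMap;
import java.util.Map;

class ReshardBuilderSketch {
    static final String NUMBER_OF_SHARDS = "index.number_of_shards";

    private final Map<String, Integer> settings;

    ReshardBuilderSketch(Map<String, Integer> settings) {
        if (settings.containsKey(NUMBER_OF_SHARDS) == false) {
            throw new IllegalArgumentException("missing " + NUMBER_OF_SHARDS);
        }
        this.settings = new HashMap<>(settings);
    }

    // Single source of truth for the current shard count.
    int numberOfShards() {
        return settings.get(NUMBER_OF_SHARDS);
    }

    // No sourceNumShards parameter: the source count comes from the builder's own state,
    // so a caller cannot supply a value that disagrees with it.
    ReshardBuilderSketch reshardAddShards(int targetShardCount) {
        int sourceNumShards = numberOfShards();
        if (targetShardCount % sourceNumShards != 0) {
            // Assumption: reject targets that are not a multiple of the current count.
            throw new IllegalArgumentException(
                "target shard count [" + targetShardCount + "] is not a multiple of [" + sourceNumShards + "]"
            );
        }
        // Assumption: record the new shard count in the builder's settings.
        settings.put(NUMBER_OF_SHARDS, targetShardCount);
        return this;
    }
}
```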
You are right. I think I might have finally got it right this time!
Thanks! Sorry about all the iteration.
henningandersen left a comment
Looks good, left a comment.
| // Assert routingNumShards is null ? | ||
| // Assert numberOfShards > 0 | ||
| if (shardCount % numberOfShards() != 0) { | ||
| public Builder reshardAddShards(int targetShardCount, final int sourceNumShards) { |
Can we add a simple unit test of this functionality?
I added an integration test to StatelessReshardIT in the linked stateless PR. Were you thinking of something different?
It would be nice, as a small follow-up, to have coverage in the same repo as the code (just in unit test form) instead of relying on an external repository, since they don't always change in lockstep.
I'll add a follow-up ticket for this.
…tingNumShards refresh branch
This is linked to a serverless PR.