
Conversation

Contributor

@ankikuma ankikuma commented Mar 25, 2025

This is linked to serverless PR

@elasticsearchmachine elasticsearchmachine added the v9.1.0 and serverless-linked labels Mar 25, 2025
@ankikuma ankikuma marked this pull request as ready for review April 8, 2025 15:43
@elasticsearchmachine elasticsearchmachine added the needs:triage label Apr 8, 2025
public Builder reshardAddShards(int shardCount) {
// Assert routingNumShards is null ?
// Assert numberOfShards > 0
public Builder reshardAddShards(int shardCount, IndexMetadata sourceMetadata) {
Contributor

Changing the API this way makes it possible to accidentally supply the wrong metadata to the method, and we know we've already provided the right metadata in the builder's constructor.

I think the reason you're passing it in is that getIndexNumberOfRoutingShards wants a metadata object but we've already decomposed it into parts in the constructor. I think it's probably fine to just hold on to a reference to the whole thing for the life of the builder (i.e., this.indexMetadata = indexMetadata in the constructor or something), or to change the interface to getIndexNumberOfRoutingShards, which only has a handful of users that mostly pass in null. Maybe the first option is the simplest?

Contributor Author

I agree, I didn't like passing in the sourceMetadata either. But I don't know if I want to hold onto a reference to the whole thing, because it looks like we would end up with some kind of recursion in toXContent(), wouldn't we?

Contributor

You're right, I hadn't noticed that when IndexMetadata goes over the wire it passes through the Builder interface.

Looking at MetadataCreateIndexService::getIndexNumberOfRoutingShards, the only thing it actually uses sourceMetadata for is to call getNumberOfShards on it if it exists. So one option to avoid passing in this essentially redundant sourceMetadata field would be to refactor getIndexNumberOfRoutingShards a bit to have an inner method that just takes routingNumShards (or 0 if metadata is null), which you could call directly from here. The existing getIndexNumberOfRoutingShards(Settings indexSettings, @Nullable IndexMetadata sourceMetadata) would then just be something like return getIndexNumberOfRoutingShards(settings, sourceMetadata == null ? 0 : sourceMetadata.getRoutingNumShards());
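
To make the suggested shape concrete, here is a rough sketch of the delegation. It uses hypothetical stand-ins rather than the real Elasticsearch classes: `SourceMetadata` replaces `IndexMetadata`, a plain `numTargetShards` int replaces the `Settings` argument, and the fallback used when there is no source is a placeholder, not the real calculation.

```java
// Sketch only: hypothetical stand-ins, not the real Elasticsearch classes.
final class RoutingShardsSketch {

    /** Stand-in for IndexMetadata, reduced to the one value the helper reads. */
    static final class SourceMetadata {
        private final int routingNumShards;

        SourceMetadata(int routingNumShards) {
            this.routingNumShards = routingNumShards;
        }

        int getRoutingNumShards() {
            return routingNumShards;
        }
    }

    /** Existing-style entry point: unwraps the nullable metadata and delegates. */
    static int getIndexNumberOfRoutingShards(int numTargetShards, SourceMetadata sourceMetadata) {
        return getIndexNumberOfRoutingShards(
            numTargetShards,
            sourceMetadata == null ? 0 : sourceMetadata.getRoutingNumShards()
        );
    }

    /** Inner method: takes a plain int, where 0 means "no source index". */
    static int getIndexNumberOfRoutingShards(int numTargetShards, int sourceRoutingNumShards) {
        if (sourceRoutingNumShards == 0) {
            // Placeholder policy (assumption): fall back to the target shard count.
            return numTargetShards;
        }
        return sourceRoutingNumShards;
    }
}
```

With this split, the builder can call the int-based overload directly, and callers that currently pass null keep working unchanged.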

Contributor Author

Yes sure, I refactored the code.

@ankikuma ankikuma added the Team:Distributed Indexing label Apr 16, 2025
@elasticsearchmachine elasticsearchmachine removed the Team:Distributed Indexing label Apr 16, 2025
@ankikuma ankikuma added the Team:Distributed Indexing label Apr 16, 2025
@elasticsearchmachine elasticsearchmachine removed the Team:Distributed Indexing label Apr 16, 2025
@ankikuma ankikuma added the :Distributed Indexing/Distributed label Apr 18, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Indexing label and removed the needs:triage label Apr 18, 2025
Contributor

@bcully bcully left a comment

Looks good, just the one comment.

// Assert routingNumShards is null ?
// Assert numberOfShards > 0
if (shardCount % numberOfShards() != 0) {
public Builder reshardAddShards(int targetShardCount, final int sourceNumShards) {
Contributor

Isn't sourceNumShards extractable from this.settings? Passing it in opens the door to the same kind of inconsistency I was worried about before, when we were passing in the whole metadata.
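
As a rough illustration of what this would look like, the sketch below has the builder read the source shard count from its own settings, so no caller can supply a mismatched value. `ReshardBuilderSketch` and its `Map`-based settings are hypothetical stand-ins for the real builder, and the exception type and messages are assumptions; only the `index.number_of_shards` key and the "target must be a multiple of the current count" check come from the existing code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a hypothetical stand-in for the real IndexMetadata.Builder.
final class ReshardBuilderSketch {
    static final String NUMBER_OF_SHARDS = "index.number_of_shards";

    // Stand-in for the builder's Settings object.
    private final Map<String, Integer> settings = new HashMap<>();

    ReshardBuilderSketch(int numberOfShards) {
        if (numberOfShards <= 0) {
            throw new IllegalArgumentException("numberOfShards must be > 0");
        }
        settings.put(NUMBER_OF_SHARDS, numberOfShards);
    }

    int numberOfShards() {
        return settings.get(NUMBER_OF_SHARDS);
    }

    /** Takes only the target count; the source count is read from the builder itself. */
    ReshardBuilderSketch reshardAddShards(int targetShardCount) {
        int sourceNumShards = numberOfShards();
        if (targetShardCount % sourceNumShards != 0) {
            // Exception type is an assumption made for this sketch.
            throw new IllegalArgumentException(
                "target shard count [" + targetShardCount
                    + "] must be a multiple of the current shard count [" + sourceNumShards + "]"
            );
        }
        settings.put(NUMBER_OF_SHARDS, targetShardCount);
        return this;
    }
}
```

Because the method has a single parameter, the only way to get an inconsistent source count is for the builder itself to be wrong.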

Contributor Author

You are right. I think I might have finally got it right this time!

Contributor

Thanks! Sorry about all the iteration.

Contributor

@henningandersen henningandersen left a comment

Looks good, left a comment.

// Assert routingNumShards is null ?
// Assert numberOfShards > 0
if (shardCount % numberOfShards() != 0) {
public Builder reshardAddShards(int targetShardCount, final int sourceNumShards) {
Contributor

Can we add a simple unit test of this functionality?

Contributor Author

I added an integration test to StatelessReshardIT in the linked stateless PR. Were you thinking of something different?

Contributor

It would be nice, as a small follow-up, to have coverage in the same repo as the code (just in unit test form) instead of relying on an external repository, since they don't always change in lockstep.

Contributor Author

I'll add a follow-up ticket for this.

@ankikuma ankikuma merged commit 1a7d630 into elastic:main May 8, 2025
17 checks passed
ywangd pushed a commit to ywangd/elasticsearch that referenced this pull request May 9, 2025
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request May 12, 2025

Labels

:Distributed Indexing/Distributed, >non-issue, serverless-linked, Team:Distributed Indexing, v9.1.0

4 participants