@original-brownbear
Contributor

No need to make this so complicated; just count down by one when we're actually done with a specific shard ID.
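
As a rough illustration of the counting described above (a minimal sketch, not the code from this PR): keep one counter initialized to the number of shard IDs and decrement it exactly once when a shard ID is finished, whether it succeeded or failed on every copy. The class name `ShardCountdownSketch` and its methods are hypothetical and do not exist in Elasticsearch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; these names are not taken from the Elasticsearch code base.
public class ShardCountdownSketch {
    private final AtomicInteger outstandingShards;
    private final Runnable onPhaseDone;

    public ShardCountdownSketch(int numberOfShardIds, Runnable onPhaseDone) {
        // One unit of work per shard ID, independent of how many copies each shard has.
        this.outstandingShards = new AtomicInteger(numberOfShardIds);
        this.onPhaseDone = onPhaseDone;
    }

    // Call exactly once per shard ID, after it has produced a result
    // or has failed on every copy that was tried.
    public void shardIdDone() {
        if (outstandingShards.decrementAndGet() == 0) {
            onPhaseDone.run();
        }
    }

    public int remaining() {
        return outstandingShards.get();
    }

    public static void main(String[] args) {
        ShardCountdownSketch phase = new ShardCountdownSketch(3, () -> System.out.println("phase done"));
        phase.shardIdDone();
        phase.shardIdDone();
        phase.shardIdDone(); // the third call brings the counter to zero and runs the callback
    }
}
```

The completion callback runs only on the call that moves the counter to zero, so retries on other copies of the same shard never need to touch the counter.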

@elasticsearchmachine added the Team:Search Foundations label (Meta label for the Search Foundations team in Elasticsearch) Jan 22, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@piergm piergm self-requested a review January 22, 2025 10:14
@original-brownbear
Contributor Author

For others: This is a nice step on the way to #118490 (in addition to just being a considerable simplification) since it aligns the counting with the way it's done for batched execution in that PR.

@original-brownbear
Contributor Author

This also removes all uses of the group shards iterator interfaces and makes #116891 a trivial-to-understand cleanup.

@javanna javanna requested review from javanna and removed request for piergm February 3, 2025 08:39
Member

@javanna javanna left a comment

I left a few comments. Shall we remove the two unused methods from GroupShardsIterator as well (totalSize and totalSizeWith1ForEmpty)?

// it's number of active shards but use 1 as the default if no replica of a shard is active at this point.
// on a per shards level we use shardIt.remaining() to increment the totalOps pointer but add 1 for the current shard result
// we process hence we add one for the non active partition here.
this.expectedTotalOps = shardsIts.totalSizeWith1ForEmpty();
Member

For posterity, this was introduced with faefc77. The intent was to count inactive shards (those still in the process of being allocated) as 1, so that their failure is expected. Otherwise their failure would make the search complete earlier than expected, missing results from other active shards.

Together with it, SearchWhileCreatingIndexTests was added, which is now called SearchWhileCreatingIndexIT. The test went through a lot of changes and refactoring over time, and it was also quite problematic and flaky in the early days. Funnily enough, if I remove the counting of the empty group as 1 (and call totalSize instead), this specific test still succeeds. I may not have run it enough times to cause failures, or perhaps the issue that this was fixing no longer manifests. Either way, a lot of other tests fail because more ops are executed than expected, since the counting also needs to be adjusted accordingly (which is expected).

In principle, I agree that counting each shard as 1, regardless of how many copies it has (whether it is inactive, primary only, or has one or multiple replicas), is simpler.
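
To make the difference concrete, here is a small sketch of the two ways of computing the expected total, based only on the description above. It assumes the old approach sums the number of copies per shard and treats a shard with no active copy as 1, while the new approach counts one per shard. `ExpectedOpsSketch` and the list-of-copy-counts input are hypothetical simplifications, not the real iterator types.

```java
import java.util.List;

// Hypothetical illustration; shards are modeled as a plain list of active copy counts.
public class ExpectedOpsSketch {

    // Old-style counting (assumed from the discussion): one op per copy,
    // but a shard with zero active copies still contributes 1.
    public static int totalSizeWith1ForEmpty(List<Integer> activeCopiesPerShard) {
        int total = 0;
        for (int copies : activeCopiesPerShard) {
            total += Math.max(copies, 1);
        }
        return total;
    }

    // New-style counting: one op per shard, regardless of its copies.
    public static int onePerShard(List<Integer> activeCopiesPerShard) {
        return activeCopiesPerShard.size();
    }

    public static void main(String[] args) {
        // Three shards: one inactive, one primary only, one primary plus two replicas.
        List<Integer> copies = List.of(0, 1, 3);
        System.out.println(totalSizeWith1ForEmpty(copies)); // prints 5
        System.out.println(onePerShard(copies));            // prints 3
    }
}
```

With one op per shard, the expected total no longer depends on replica counts, so no special case for empty groups is needed.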

// but its range was available in the IndexMetadata, in that
// case the shardsIt.remaining() would be 0, expectedTotalOps
// accounts for unavailable shards too.
remainingOpsOnIterator = Math.max(shardsIt.remaining(), 1);
Member

Definitely good to get rid of this special case.

@original-brownbear
Contributor Author

@javanna can we just remove the whole iterator class next, as suggested in #116891? :) It's an absolutely trivial change after this; the methods you mention are essentially all there is to that class today.

Member

@javanna javanna left a comment

Left a minor comment for readability. I would still remove the unused methods from GroupShardsIterator, leaving the removal of the entire class for a follow-up. If you prefer to leave the methods and do the removal later, that's also good, but I would not do it in this PR.

@original-brownbear
Contributor Author

Thanks Luca :)

Removed the methods now and fixed the decrementAndGet (I only used the other direction before because of the var handle; much cleaner now). I'll update my follow-up PR for the removal of the class :)

@original-brownbear original-brownbear merged commit e4fd6c0 into elastic:main Feb 5, 2025
17 checks passed
@original-brownbear original-brownbear deleted the simplify-abstract-async-search branch February 5, 2025 10:56
@original-brownbear
Contributor Author

@javanna I updated #116891 which should be a trivial change after this one :)

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Feb 10, 2025
No need to do this so complicated, just count down one when we're actually done with a specific shard id.
original-brownbear added a commit that referenced this pull request Feb 11, 2025
No need to do this so complicated, just count down one when we're actually done with a specific shard id.

Labels

>non-issue, :Search Foundations/Search (Catch all for Search Foundations), Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch), v8.19.0, v9.1.0
