-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Breakdown undesired allocations by shard routing role #132235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 7 commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
2806d30
Breakdown undesired allocations by tier
nicktindall 0ee7997
Merge branch 'main' into undesired_by_tier
nicktindall 24c2887
Break down by shard routing role instead of tier
nicktindall 5fbd316
Merge remote-tracking branch 'nt-elastic/main' into undesired_by_tier
nicktindall 47c4516
Remove remaining cruft
nicktindall 5ba30bb
Naming
nicktindall 5fb7f21
Only output roles that have undesired allocations
nicktindall e614d67
Merge branch 'main' into undesired_by_tier
nicktindall bef60ec
Break down allocation stats by tier for IngestMetricsService
nicktindall 78ee79a
Return undesired allocations ratio as double (to match setting)
nicktindall 89eee2c
Tidy up DesiredBalanceMetrics
nicktindall 6f41c89
Only publish when node is master AND we have done a reconciliation
nicktindall 1b0d35a
Use AllocationStats instead of map
nicktindall 9725a88
Merge remote-tracking branch 'origin/main' into undesired_by_tier
nicktindall e3e84bb
Fix naming, some pedantry
nicktindall 9b5b4b8
Merge remote-tracking branch 'nt-elastic/undesired_by_tier' into unde…
nicktindall File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to collect it all in a map of roles, then shouldn't it also replace
totalAllocationsandundesiredAllocationsExcludingShuttingDownNodessince those would be the values that the "default" role would have in stateful? but then you get into what to do for the values returned from desired balance API in serverless, and there you'd have to sum up index_only and search_only I guess, and sprinkle a bunch of asserts since index/search and default should be mutually exclusive in this map. Another option is to just add the specific break downs we need here? index/searchTierAllocations and index/searchTierUndesiredAllocations? I think you'd need both total and undesired since we're going to need the ratio in the autoscaler.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also how do we get these two stats (index/TierAllocations and indexTierUndesiredAllocations?) out of the balancer in the autoscaler? Is there a getter for these?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell we have no concept of "tier" in the ES codebase. There is role, but we seem to stop short of defining that as equivalent to a tier. I don't quite understand why but I don't want to make assumptions about what role means in regards to tiers (see
co.elastic.elasticsearch.stateless.allocation.StatelessAllocationDecider#canAllocateShardToNode, it's not even done there)if we make two specific fields, we're baking in the assumption that
Role.INDEX_ONLYmeans indexing tier,Role.SEARCH_ONLYmeans search tier andRole.DEFAULTmeans the only tier in a stateful deploymentSo we'd probably need to add assertions to trigger if these assumptions were no longer valid. I think having a map of role to counts (possibly
Role->record RoleStats(int total, int undesired)) at least leaves the interpretation of the roles to the context they're used in.Maybe these are valid assumptions? Perhaps there is some history regarding the modelling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a getter and put up a serverless PR to illustrate that flow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:))) it feels like we're saying the same thing.
index/searchTierAllocations and index/searchTierUndesiredAllocationsare not good names, I agree with you. I was suggesting instead of the map, we could just add two more variables for total allocation and undesired allocation of index_only (or could be called promotable shards), since we don't need the rest. that's the only part of that map, that we use for now for the specific ticket that initiated this change. we could also use the map and ignore the rest, if you prefer, that's fine.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer the map. It's a breakdown of those stats by node role. In this code we never cared about node role previously, and we should continue to not care.
If we pick out specific roles to populate fields we're baking in knowledge of the roles and their meaning in this code, I'd rather we tried to keep that knowledge in the stateless code, which is the only place we should see anything other than
DEFAULT, and the only place we're using the per-role stats.