-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Add node weight changes to balance round summaries #122195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add node weight changes to balance round summaries #122195
Conversation
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
| * @param numberOfBalancingRounds How many balancing round summaries are combined in this report. | ||
| * @param numberOfShardMoves The sum of shard moves for each balancing round being combined into a single summary. | ||
| */ | ||
| public record CombinedBalancingRoundSummary(int numberOfBalancingRounds, long numberOfShardMoves) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this into the BalancingRoundSummary class because I wanted to make it more obvious that both should be updated at the same time. I have not yet thought of a way to enforce the relationship.
| /** | ||
| * Summarizes the work required to move from an old to new desired balance shard allocation. | ||
| */ | ||
| private BalancingRoundSummary calculateBalancingRoundSummary(DesiredBalance oldDesiredBalance, DesiredBalance newDesiredBalance) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved this logic into the AllocationBalancingRoundSummaryService to make it more unit testable. The summaries are also going to become more complex as I add more summary stats/metrics: that logic seems more appropriate in the summary service.
nicktindall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just some minor comments, biggest concern is probably that delta in assertDoubleEquals
Again, sorry for the delay
...asticsearch/cluster/routing/allocation/allocator/AllocationBalancingRoundSummaryService.java
Show resolved
Hide resolved
.../main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancingRoundSummary.java
Show resolved
Hide resolved
.../main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancingRoundSummary.java
Outdated
Show resolved
Hide resolved
| balancerRoundSummaryService.addBalancerRoundSummary(calculateBalancingRoundSummary(oldDesiredBalance, newDesiredBalance)); | ||
| balancerRoundSummaryService.addBalancerRoundSummary( | ||
| AllocationBalancingRoundSummaryService.createBalancerRoundSummary(oldDesiredBalance, newDesiredBalance) | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I wonder if it's worth just changing the signature (and perhaps the name to reflect the change) of addBalancerRoundSummary to just take the old and new desired balances? then reduces some of the coupling between DesiredBalanceShardsAllocator and the AllocationBalancingRoundSummaryService?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could do that and keep the testability (it could just call a static method on itself that could be exposed for testing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand your suggestion. If I change the method to
public void addBalancerRoundSummary(DesiredBalance oldDesired, DesiredBalance newDesired)
Then for testing addBalancerRoundSummary I'll have to create DesiredBalance instances and reason about the BalancingRoundSummary that results. Right now it's simpler to test addBalancerRoundSummary.
The createBalancerRoundSummary method could still be tested the same way even if it was internal (exposed for testing, as you say). But addBalancerRoundSummary seems more complicated, unless I'm missing something 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I just meant maybe it's easier, for the caller, to provide an overload e.g.
class AllocationBalancerRoundSummary {
public void addBalancerRoundSummary(DesiredBalance old, DesiredBalance new) {
addBalancerRoundSummary(createBalancerRoundSummary(old, new));
}
public void addBalancerRoundSummary(BalancingRoundSummary summary) {
// ...
}
public static BalancingRoundSummary createBalancingRoundSummary(DesiredBalance old, DesiredBalance new) {
// ...
}
}That way DesiredBalanceShardAllocator doesn't need to know about createBalancingRoundSummary?
The overload being just a convenience that I'd argue doesn't need to be explicitly tested itself because it's just delegating to things that are tested.
But no strong feelings about this one. Up to you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, thanks for explaining. Sure, done in 3df5904
...search/cluster/routing/allocation/allocator/AllocationBalancingRoundSummaryServiceTests.java
Outdated
Show resolved
Hide resolved
.../java/org/elasticsearch/cluster/routing/allocation/allocator/BalancingRoundSummaryTests.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with the feedback: e99d3b9
I realized there was a little more cleanup I could do, per this comment. I also don't think I understand your suggestion in this other comment: could you take another look? @nicktindall
...search/cluster/routing/allocation/allocator/AllocationBalancingRoundSummaryServiceTests.java
Outdated
Show resolved
Hide resolved
.../java/org/elasticsearch/cluster/routing/allocation/allocator/BalancingRoundSummaryTests.java
Show resolved
Hide resolved
...asticsearch/cluster/routing/allocation/allocator/AllocationBalancingRoundSummaryService.java
Show resolved
Hide resolved
.../main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancingRoundSummary.java
Show resolved
Hide resolved
.../main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancingRoundSummary.java
Outdated
Show resolved
Hide resolved
| balancerRoundSummaryService.addBalancerRoundSummary(calculateBalancingRoundSummary(oldDesiredBalance, newDesiredBalance)); | ||
| balancerRoundSummaryService.addBalancerRoundSummary( | ||
| AllocationBalancingRoundSummaryService.createBalancerRoundSummary(oldDesiredBalance, newDesiredBalance) | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand your suggestion. If I change the method to
public void addBalancerRoundSummary(DesiredBalance oldDesired, DesiredBalance newDesired)
Then for testing addBalancerRoundSummary I'll have to create DesiredBalance instances and reason about the BalancingRoundSummary that results. Right now it's simpler to test addBalancerRoundSummary.
The createBalancerRoundSummary method could still be tested the same way even if it was internal (exposed for testing, as you say). But addBalancerRoundSummary seems more complicated, unless I'm missing something 🤔
.../main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancingRoundSummary.java
Show resolved
Hide resolved
nicktindall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The life cycle of the node weights during the balancer process:
So I'm grabbing the node weights from the DesiredBalance, comparing the previous and new DesiredBalance instances after a balancing round goes to update the current DesiredBalance reference.
I settled on summarizing the changes by saving the old DesiredBalance weights per node along with a weights diff to reach the new DesiredBalance's weights per node. This allows me to combine multiple summaries by using the oldest summary's base node weights, and summing the diffs across summaries to reach the final node weight diffs.