Conversation


@ywangd ywangd commented Oct 10, 2025

A flamegraph shows that Balancer instantiation takes a considerable amount of time in an allocate call. More than a quarter of the instantiation time is spent computing disk-related stats, which is wasteful when the disk weight factor is zero. This PR skips these computations in that case.

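The optimization described above can be illustrated with a minimal, self-contained sketch. All type and method names here are hypothetical stand-ins rather than the actual Elasticsearch classes; the point is only the guard shape: when the disk weight factor is zero, the expensive average-disk-usage computation is skipped and 0 is used as a placeholder.

```java
import java.util.List;

class DiskUsageGuardSketch {

    // Stand-in for the expensive computation (in the real code,
    // WeightFunction.avgDiskUsageInBytesPerNode walks cluster-wide disk stats).
    static long expensiveAvgDiskUsageInBytesPerNode(List<Long> nodeDiskUsages) {
        long total = 0;
        for (long usage : nodeDiskUsages) {
            total += usage;
        }
        return nodeDiskUsages.isEmpty() ? 0 : total / nodeDiskUsages.size();
    }

    // Mirrors the guarded assignment in the diff: skip the computation
    // entirely when the disk weight factor is zero.
    static long avgDiskUsageInBytesPerNode(float diskWeightFactor, List<Long> nodeDiskUsages) {
        boolean skipDiskUsageCalculation = diskWeightFactor == 0.0f;
        return skipDiskUsageCalculation
            ? 0
            : expensiveAvgDiskUsageInBytesPerNode(nodeDiskUsages);
    }
}
```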
@ywangd ywangd requested a review from nicktindall October 10, 2025 05:19
@ywangd ywangd added >non-issue :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v9.3.0 labels Oct 10, 2025
Comment on lines 331 to 334
avgDiskUsageInBytesPerNode = skipDiskUsageCalculation
? 0
: WeightFunction.avgDiskUsageInBytesPerNode(allocation.clusterInfo(), metadata, routingNodes);
nodes = Collections.unmodifiableMap(buildModelFromAssigned(skipDiskUsageCalculation));
Member Author

The cost saving is realistic. My main question is whether the approach is considered hacky.

Contributor

What if, rather than having the additional flag, we passed the weighting around, and the expensive parts performed the calculation only if the weighting was non-zero?

I'm not sure if that's better, but it's a thought.

Contributor

We could refer to this.balancingWeights perhaps?

Member Author

See the flamegraph below (from the many-shards benchmark), which shows the time spent on disk-related computation (purple) inside allocate calls.
[screenshot: flamegraph, 2025-10-10]

Member Author

We synced offline and agreed to change the boolean flag to be a method on BalancingWeights.
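The agreed approach, a method on BalancingWeights instead of a threaded-through boolean, could look roughly like the following sketch. Only the name diskUsageIgnored() appears in the later diff; everything else here (the interface shape, diskWeightFactor(), FixedWeights) is an assumption for illustration.

```java
// Hypothetical sketch: callers query diskUsageIgnored() on the weights
// abstraction instead of having a separate boolean passed around.
interface BalancingWeightsSketch {
    float diskWeightFactor();

    // Derived from the weight factor, so the skip condition lives in one place.
    default boolean diskUsageIgnored() {
        return diskWeightFactor() == 0.0f;
    }
}

class FixedWeights implements BalancingWeightsSketch {
    private final float diskFactor;

    FixedWeights(float diskFactor) {
        this.diskFactor = diskFactor;
    }

    @Override
    public float diskWeightFactor() {
        return diskFactor;
    }
}
```

One advantage of this shape is that the skip condition cannot drift out of sync with the weight factor, since it is computed from it rather than stored separately.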

@elasticsearchmachine elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Oct 10, 2025
@ywangd ywangd marked this pull request as ready for review October 13, 2025 04:50
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Oct 13, 2025
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

Contributor

@nicktindall nicktindall left a comment

LGTM. Some comments, but nothing worth holding the change up for.

Map.of(),
Map.of(),
Map.of()
);
Contributor

Nit: could use ClusterInfo.builder().shardSizes(...).build()?

Member Author

I forgot there is a builder. Pushed b7e22fc; it looks much nicer! Thanks!

private float maxShardSizeBytes(ProjectIndex index) {
if (balancingWeights.diskUsageIgnored()) {
return 0;
}
Contributor

This bit I have minor apprehensions about. But it seems like it's not an easy one to skip on the caller side. And we do have that information available to us here via the balancingWeights.

Member Author

I can remove this change if you prefer. It does not really show up in the flamegraph, so I could be over-zealous here.

Contributor

Yeah, maybe that would be nicer. The behaviour is potentially a little surprising. It would seem safer for the caller to skip the call than for the callee to just return zero.

Member Author

It turns out that we can check it at the call site. Not sure why I initially thought it was not feasible ... Pushed 0affb99
Let me know if this works for you. Thanks!
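The caller-side guard that was settled on can be sketched as follows. All names here are hypothetical stand-ins (the real method takes a ProjectIndex); the sketch only contrasts the two shapes discussed: the callee keeps one well-defined meaning, and the caller decides whether the disk term is worth computing.

```java
class CallerSideSkipSketch {

    // Stand-in for BalancingWeights, reduced to the one query used here.
    interface Weights {
        boolean diskUsageIgnored();
    }

    // The callee always computes its answer; it never silently returns a
    // placeholder based on configuration it should not need to know about.
    static float maxShardSizeBytes(float[] shardSizes) {
        float max = 0;
        for (float size : shardSizes) {
            max = Math.max(max, size);
        }
        return max;
    }

    // Caller-side guard: the lookup only happens when the disk weight
    // actually contributes to the overall node weight.
    static float diskWeightTerm(Weights weights, float[] shardSizes) {
        return weights.diskUsageIgnored() ? 0 : maxShardSizeBytes(shardSizes);
    }
}
```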

Contributor

@nicktindall nicktindall left a comment

Looks great, ship it!

@ywangd ywangd merged commit c9d59a1 into elastic:main Oct 14, 2025
34 checks passed