@nicktindall nicktindall commented Jul 30, 2025

Change ClusterInfo to contain a list of utilization samples rather than a single one.

I decided not to include hot-spotting information in this change, because it's hard to know what it will look like before we've done the work to calculate and interpret it.

Happy to add it if we think we are set on what that looks like.

newUtilizationSamples.add(
    new NodeUsageStatsForThreadPools.UtilizationSample(previousUtilization.instant(), newWritePoolUtilization)
);
return new NodeUsageStatsForThreadPools.ThreadPoolUsageStats(writeThreadPoolStats.numberOfThreads(), newUtilizationSamples);
Contributor Author

The simulator replaces the most recent utilization value with the new one. We could instead append a new sample to the end, but then what timestamp would we put on it? 🤷

Hard to know the best strategy without knowing how the determination of "hot spotting" is made, and whether we care about that in the simulator.
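
For illustration, the replace-the-latest strategy described above could be sketched roughly as follows. This is a minimal standalone sketch, not the PR's code: the class `ReplaceLatestSampleSketch`, the method `replaceLatest`, and the handling of an empty list are all assumptions, and the record here omits `Writeable`.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names other than UtilizationSample are not from the PR.
final class ReplaceLatestSampleSketch {

    // Simplified stand-in for the PR's record (Writeable omitted).
    record UtilizationSample(Instant instant, float utilization) {}

    /**
     * Returns a copy of the samples in which the last entry carries the new
     * utilization but keeps its original timestamp. Assumes the list is
     * ordered oldest to newest.
     */
    static List<UtilizationSample> replaceLatest(List<UtilizationSample> samples, float newUtilization) {
        if (samples.isEmpty()) {
            // Assumption: with nothing to replace, return an empty list.
            return List.of();
        }
        UtilizationSample previous = samples.get(samples.size() - 1);
        // Copy everything except the last sample, then add the replacement.
        List<UtilizationSample> updated = new ArrayList<>(samples.subList(0, samples.size() - 1));
        updated.add(new UtilizationSample(previous.instant(), newUtilization));
        return List.copyOf(updated);
    }
}
```

Keeping the previous timestamp sidesteps the question of what instant a simulated sample should carry, at the cost of overwriting the most recent real observation.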

Contributor

I think hot-spot detection should be the same for real data and simulation. How about simulating the samples? Maybe apply a fixed value to all of them and then run hot-spot detection?
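
As a rough sketch of that suggestion, every sample could keep its timestamp and take the same simulated utilization, so that whatever detection runs on real data can run unchanged on the result. The class `FixedValueSamplesSketch` and the method `withFixedUtilization` are hypothetical names, the record omits `Writeable`, and the hot-spot detection itself is not shown because it has not been defined in this thread.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical sketch: names other than UtilizationSample are not from the PR.
final class FixedValueSamplesSketch {

    // Simplified stand-in for the PR's record (Writeable omitted).
    record UtilizationSample(Instant instant, float utilization) {}

    /**
     * Returns new samples with the original timestamps, each carrying the
     * same simulated utilization value.
     */
    static List<UtilizationSample> withFixedUtilization(List<UtilizationSample> samples, float simulatedUtilization) {
        return samples.stream()
            .map(sample -> new UtilizationSample(sample.instant(), simulatedUtilization))
            .toList();
    }
}
```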

@nicktindall nicktindall added >non-issue :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Jul 30, 2025
public record UtilizationSample(Instant instant, float utilization) implements Writeable {

    @Override
    public boolean equals(Object o) {
Contributor Author

@nicktindall nicktindall Jul 30, 2025

There were equals, hashCode and toString methods on these records. I'm not sure whether they were once classes or we're doing something special here. I removed them in favour of the ones you get for free with a record.
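
For reference, a small standalone demonstration of the members a record provides automatically. The record below is a simplified stand-in (no `Writeable`), and the class name `RecordEqualitySketch` is hypothetical.

```java
import java.time.Instant;

// Hypothetical demo class; only the record's shape mirrors the PR.
final class RecordEqualitySketch {

    // No explicit equals, hashCode or toString: the compiler derives them
    // from the record components.
    record UtilizationSample(Instant instant, float utilization) {}

    public static void main(String[] args) {
        Instant instant = Instant.parse("2025-07-30T00:00:00Z");
        UtilizationSample first = new UtilizationSample(instant, 0.5f);
        UtilizationSample second = new UtilizationSample(instant, 0.5f);

        System.out.println(first.equals(second));                  // true: component-wise equality
        System.out.println(first.hashCode() == second.hashCode()); // true: consistent with equals
        System.out.println(first);                                 // generated toString listing the components
    }
}
```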

@nicktindall nicktindall changed the title from "Multiple utilization samples & hot-spotting indicator in write load decider" to "Multiple utilization samples in write load decider" Jul 30, 2025
# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
@nicktindall nicktindall marked this pull request as ready for review July 31, 2025 00:19
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Jul 31, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

Contributor

@DaveCTurner DaveCTurner left a comment

I'd rather we delayed this change until we've got to a point where we can see from production experience that this sort of thing will be needed.

@nicktindall nicktindall deleted the cluster_info_for_write_decider branch September 3, 2025 04:20
Labels

:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >non-issue Team:Distributed Coordination Meta label for Distributed Coordination team v9.2.0


4 participants