Skip to content

Conversation

@timgrein
Copy link
Contributor

Fixes #121294

ensureStableCluster doesn't guarantee that the nodes already processed the rate limit assignments, but only guarantees that each node:

  • Responded to a health check
  • Didn't timeout
  • The cluster is fully connected

Therefore we need to wait (set the timeout to 15 seconds to begin with), which should give each node/calculator enough time to process the topology change and calculate the rate limit assignments. This is a pattern I saw throughout the whole ES codebase (using assertBusy(() -> {...})).

@timgrein timgrein added >test Issues or PRs that are addressing/adding tests v9.0.0 v8.18.0 v8.19.0 v9.1.0 labels Jan 31, 2025
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jan 31, 2025
@timgrein timgrein changed the title [Inference API] Wait for rate limit assignments to happen rate limit calculator tests [Inference API] Wait for rate limit assignments to happen in rate limit calculator tests Jan 31, 2025
@ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.SUITE, numDataNodes = 0)
public class InferenceServiceNodeLocalRateLimitCalculatorTests extends ESIntegTestCase {

private static final Integer RATE_LIMIT_ASSIGNMENT_MAX_WAIT_TIME_IN_SECONDS = 15;
Copy link
Contributor Author

@timgrein timgrein Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used once, but I wanted to make it obvious that we've a timeout used throughout this class. Therefore I've put it at the top as a constant.

@timgrein timgrein added :ml Machine learning Team:ML Meta label for the ML team labels Jan 31, 2025
@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Jan 31, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@timgrein timgrein added the auto-backport Automatically create backport pull requests when merged label Jan 31, 2025
@timgrein timgrein merged commit 4642f15 into elastic:main Jan 31, 2025
17 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.0 Commit could not be cherrypicked due to conflicts
8.18 Commit could not be cherrypicked due to conflicts
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 121379

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged backport pending :ml Machine learning Team:ML Meta label for the ML team >test Issues or PRs that are addressing/adding tests v8.18.0 v8.19.0 v9.0.0 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] InferenceServiceNodeLocalRateLimitCalculatorTests class failing

3 participants