-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Remove test dependencies on cluster state API master waiting #129118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
As preparation for running the cluster state API on the local node, we need to update these tests that currently depend on that API running on (and waiting for) the master node. Relates elastic#127212
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
| } | ||
|
|
||
| /** | ||
| * Waits for all nodes in the cluster to have a consistent view of which node is currently the master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A "consistent view" is maybe not the right phrasing as it's not technically consistent (i.e. other cluster states can come in after that moment), but I couldn't come up with a better phrasing. Suggestions are welcome.
| ); | ||
| entries.add(new NamedWriteableRegistry.Entry(NamedDiff.class, ModelRegistryMetadata.TYPE, ModelRegistryMetadata::readDiffFrom)); | ||
|
|
||
| doEnsureClusterStateConsistency(new NamedWriteableRegistry(entries)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I'd forgotten that these tests explicitly customize this method. I'd like to get some opinion from the ML team about whether they think this stuff is all adequately covered elsewhere before just removing it like this.
We could instead make doEnsureClusterStateConsistency just pick any of the states it observes as the "master" state - there's no real need for it to have come from the actual master. Since an external test cluster only has one client, it's not really checking consistency across the cluster any more with this change but it is at least checking that it round-trips through its wire representation faithfully. I suspect there's value in such a test, given that all this ML-specific customization relates to named writeables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is at least checking that it round-trips through its wire representation faithfully
Since we're only able to obtain one cluster state (i.e. the one from the single client the external test cluster exposes), what "round trip" would we be testing exactly? We can check the "one way trip" of serializing the cluster state to XContent, but we don't have any other cluster state to compare it to. We could parse the XContent back to a cluster state, but this doesn't feel like the right place to test cluster state SerDer. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
org.elasticsearch.test.ESIntegTestCase#doEnsureClusterStateConsistency round-trips the state through its transport representation again - see the calls to ClusterState.Builder.toBytes and ClusterState.Builder.fromBytes therein.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some round-trip assertion logic to MlINativeIntegTestCase. I put it there instead of EsIntegTestCase as that is the only usage of ExternalTestCluster (and no new usages should be created as it's deprecated), so we avoid some complexity in EsIntegTestCase. Let me know what you think.
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Nodes that are not publish quorum may not have applied the latest update when awaitMasterNode returns. This PR fixes it by waiting for the desired number of nodes. Relates: elastic#129118 Resolves: elastic#130883
The original cluster should be properly formed before the tests kick off. This PR ensures that. Relates: elastic#129118 Resolves: elastic#130883 Resolves: elastic#130979
The original cluster should be properly formed before the tests kick off. This PR ensures that. Relates: elastic#129118 Resolves: elastic#130883 Resolves: elastic#130979
As preparation for running the cluster state API on the local node, we need to update these tests that currently depend on that API running on (and waiting for) the master node.
Relates #127212