Skip to content

Commit a29392c

Browse files
authored
Fix race in FileSettingsServiceIT.testSettingsAppliedOnStart (#134368)
This was failing very very rarely due to unfortunate timing conditions. Cluster state changes are applied to all nodes prior to being published on the master node itself. However, the cluster state listener was previously attached to the data node, allowing for a very short time window where the state update wasn't visible on the master node itself when checking in `assertClusterStateSaveOK`. This changes the test to attach the listener to the master node itself preventing above condition. I was initially worried it might be attached too late in cases, but I couldn't reproduce any more issues this way. > According to the dashboard, this started to fail on Monday (13/07). It definitely does not look like a test failure, so I'm assigning a medium priority, which we could raise if we discover this is a new bug. I couldn't find any related commit that might have caused this. Still wondering why this started failing around that time 🤔 Fixes #131210
1 parent a375c6e commit a29392c

File tree

2 files changed

+10
-7
lines changed

2 files changed

+10
-7
lines changed

muted-tests.yml

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -411,9 +411,6 @@ tests:
411411
- class: org.elasticsearch.gradle.internal.transport.TransportVersionManagementPluginFuncTest
412412
method: cannot change committed ids to a branch
413413
issue: https://github.com/elastic/elasticsearch/issues/132790
414-
- class: org.elasticsearch.reservedstate.service.FileSettingsServiceIT
415-
method: testSettingsAppliedOnStart
416-
issue: https://github.com/elastic/elasticsearch/issues/131210
417414
- class: org.elasticsearch.packaging.test.ArchiveGenerateInitialCredentialsTests
418415
method: test40VerifyAutogeneratedCredentials
419416
issue: https://github.com/elastic/elasticsearch/issues/132877

server/src/internalClusterTest/java/org/elasticsearch/reservedstate/service/FileSettingsServiceIT.java

Lines changed: 10 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -188,18 +188,23 @@ public void clusterChanged(ClusterChangedEvent event) {
188188
return new Tuple<>(savedClusterState, metadataVersion);
189189
}
190190

191-
private Tuple<CountDownLatch, AtomicLong> setupClusterStateListener(String node, long version) {
191+
private Tuple<CountDownLatch, AtomicLong> setupClusterStateListener(String node, long fileSettingsVersion) {
192192
ClusterService clusterService = internalCluster().clusterService(node);
193193
CountDownLatch savedClusterState = new CountDownLatch(1);
194194
AtomicLong metadataVersion = new AtomicLong(-1);
195195
clusterService.addListener(new ClusterStateListener() {
196196
@Override
197197
public void clusterChanged(ClusterChangedEvent event) {
198198
ReservedStateMetadata reservedState = event.state().metadata().reservedStateMetadata().get(FileSettingsService.NAMESPACE);
199-
if (reservedState != null && reservedState.version() == version) {
199+
if (reservedState != null && reservedState.version() == fileSettingsVersion) {
200200
clusterService.removeListener(this);
201201
metadataVersion.set(event.state().metadata().version());
202202
savedClusterState.countDown();
203+
logger.info(
204+
"done waiting for file settings [version: {}, metadata version: {}]",
205+
event.state().version(),
206+
event.state().metadata().version()
207+
);
203208
}
204209
}
205210
});
@@ -265,15 +270,16 @@ public void testSettingsAppliedOnStart() throws Exception {
265270
FileSettingsService dataFileSettingsService = internalCluster().getInstance(FileSettingsService.class, dataNode);
266271

267272
assertFalse(dataFileSettingsService.watching());
268-
var savedClusterState = setupClusterStateListener(dataNode, versionCounter.incrementAndGet());
273+
long expectedVersion = versionCounter.incrementAndGet();
269274

270275
// In internal cluster tests, the nodes share the config directory, so when we write with the data node path
271276
// the master will pick it up on start
272-
writeJSONFile(dataNode, testJSON, logger, versionCounter.get());
277+
writeJSONFile(dataNode, testJSON, logger, expectedVersion);
273278

274279
logger.info("--> start master node");
275280
final String masterNode = internalCluster().startMasterOnlyNode();
276281
awaitMasterNode(internalCluster().getNonMasterNodeName(), masterNode);
282+
var savedClusterState = setupClusterStateListener(masterNode, expectedVersion);
277283

278284
FileSettingsService masterFileSettingsService = internalCluster().getInstance(FileSettingsService.class, masterNode);
279285

0 commit comments

Comments
 (0)