Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -1731,6 +1731,20 @@ public static boolean indexExists(String index, Client client) {
return getIndexResponse.getIndices().length > 0;
}

public static void ensureIndexExists(String index) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nit, maybe awaitIndexExists to show that it will wait? Otherwise this reads to me as something that will immediately fail if the index doesn't exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep renamed as suggested. Thanks!

ensureIndexExists(index, SAFE_AWAIT_TIMEOUT.seconds(), SAFE_AWAIT_TIMEOUT.timeUnit());
}

public static void ensureIndexExists(String index, long timeout, TimeUnit unit) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather we just always used SAFE_AWAIT_TIMEOUT (I suspect the one case where we wait 30s below is a bug and could be 10s). But if we do need to expose the timeout to callers could we use TimeValue in the API rather than long/TimeUnit? I'll fix up the safeGet overload to use TimeValue too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted the 30 seconds use cases. But still kept the method variant that takes a timeout parameter since the usage for remote cluster has a 120 seconds wait time. I suspect 120s is excessive. But probably justify to be longer than 10s.

safeGet(
clusterAdmin().prepareHealth(new TimeValue(timeout, unit), index)
.setIndicesOptions(IndicesOptions.LENIENT_EXPAND_OPEN_CLOSED)
.execute(),
timeout,
unit
);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised this PR as a draft since I am not sure whether this is what we want. There are scenarios that are not covered by this, e.g. assertBusy for index not exists, assertBusy for an index exists on a remote cluster.

For the former, I wonder whether it is easier to change the existing indexExists method to always talk to the master node. For the later, I don't have a great suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an integ test involving multiple clusters we have a client for each cluster so we should be able to do the same health API call there.

For the assertBusy/assertFalse waits that I could find it looks like it's ok to wait on any node - we're just checking that the index deletion has been committed, not that it has been applied everywhere, so need no change. For instance if the next thing the test does involves another cluster state update then that will of course wait for the previous one to complete. If that weren't the case, a GET _cluster/health?wait_for_events=LANGUID would do the trick I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestions. Yeah the assertFalse case does not seem to be problematic. Unlike assertTrue case, the tests do not anything more against the deleted index. For the remote cluster case, I made the variant of the method to take a client parameter.


/**
* Syntactic sugar for enabling allocation for <code>indices</code>
*/
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -173,7 +173,7 @@ public void onFailure(Exception e) {
// the shrink index generated in the first attempt must've been deleted!
assertBusy(() -> assertFalse(indexExists(firstAttemptShrinkIndexName[0])));

assertBusy(() -> assertTrue(indexExists(secondCycleShrinkIndexName[0])), 30, TimeUnit.SECONDS);
ensureIndexExists(secondCycleShrinkIndexName[0], 30, TimeUnit.SECONDS);

// at this point, the second shrink attempt was executed and the manged index is looping into the `shrunk-shards-allocated` step as
// waiting for the huge numbers of replicas for the shrunk index to allocate. this will never happen, so let's unblock this
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -179,7 +179,7 @@ public void testSingleNodeCluster() throws Exception {
ClusterState clusterState = clusterAdmin().prepareState(TEST_REQUEST_TIMEOUT).get().getState();
RoutingNode routingNodeEntry1 = clusterState.getRoutingNodes().node(node1);
assertThat(routingNodeEntry1.numberOfShardsWithState(STARTED), equalTo(1));
assertBusy(() -> { assertTrue(indexExists("test")); });
ensureIndexExists("test");
IndexLifecycleService indexLifecycleService = internalCluster().getInstance(IndexLifecycleService.class, server_1);
assertThat(indexLifecycleService.getScheduler().jobCount(), equalTo(1));
assertNotNull(indexLifecycleService.getScheduledJob());
Expand Down Expand Up @@ -420,7 +420,7 @@ public void testMasterDedicatedDataDedicated() throws Exception {
RoutingNode routingNodeEntry1 = clusterState.getRoutingNodes().node(node2);
assertThat(routingNodeEntry1.numberOfShardsWithState(STARTED), equalTo(1));

assertBusy(() -> assertTrue(indexExists("test")));
ensureIndexExists("test");
assertBusy(() -> {
LifecycleExecutionState lifecycleState = clusterAdmin().prepareState(TEST_REQUEST_TIMEOUT)
.get()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -367,7 +367,7 @@ protected static void assertThatAuditMessagesMatch(String configId, String... ex
// Make sure we wrote to the audit
// Since calls to write the AbstractAuditor are sent and forgot (async) we could have returned from the start,
// finished the job (as this is a very short analytics job), all without the audit being fully written.
assertBusy(() -> assertTrue(indexExists(NotificationsIndex.NOTIFICATIONS_INDEX)));
ensureIndexExists(NotificationsIndex.NOTIFICATIONS_INDEX);

@SuppressWarnings("unchecked")
Matcher<String>[] itemMatchers = Arrays.stream(expectedAuditMessagePrefixes).map(Matchers::startsWith).toArray(Matcher[]::new);
Expand Down