Skip to content

Commit 7c87b07

Browse files
Speed up internal cluster tests by optimizing #awaitGlobalNettyThreadsFinish (#93890)
Using the #awaitInactivity call costs almost 1s every time it's called since that's the hard coded quiet period of the executor and its thread is mostly started at the end of a test (before #awaitGlobalNettyThreadsFinish) as a result of shutting down the nodes (the global pool is used for resolving pool shutdown listeners). By simply asserting that we don't have any tasks queued on it we get a stronger assertion and save a second per test suite almost. This is particularly helpful when running tests on a laptop with less parallelism. This partically saves over a minute when running `server:check` on my laptop.
1 parent 0b1c91c commit 7c87b07

File tree

3 files changed

+28
-15
lines changed

3 files changed

+28
-15
lines changed

test/framework/src/main/java/org/elasticsearch/test/ESIntegTestCase.java

Lines changed: 5 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -2269,21 +2269,12 @@ public static void afterClass() throws Exception {
22692269
}
22702270

22712271
/**
2272-
* After the cluster is stopped, there are a few netty threads that can linger, so we wait for them to finish otherwise these
2273-
* lingering threads can intermittently trigger the thread leak detector.
2272+
* After the cluster is stopped, there are a few netty threads that can linger, so we make sure we don't leak any tasks on them.
22742273
*/
2275-
static void awaitGlobalNettyThreadsFinish() {
2276-
try {
2277-
GlobalEventExecutor.INSTANCE.awaitInactivity(5, TimeUnit.SECONDS);
2278-
} catch (InterruptedException e) {
2279-
Thread.currentThread().interrupt();
2280-
} catch (IllegalStateException e) {
2281-
if (e.getMessage().equals("thread was not started") == false) {
2282-
throw e;
2283-
}
2284-
// ignore since the thread was never started
2285-
}
2286-
2274+
static void awaitGlobalNettyThreadsFinish() throws Exception {
2275+
// Don't use GlobalEventExecutor#awaitInactivity. It will waste up to 1s for every call and we expect no tasks queued for it
2276+
// except for the odd scheduled shutdown task.
2277+
assertBusy(() -> assertEquals(0, GlobalEventExecutor.INSTANCE.pendingTasks()));
22872278
try {
22882279
ThreadDeathWatcher.awaitInactivity(5, TimeUnit.SECONDS);
22892280
} catch (InterruptedException e) {

test/framework/src/main/java/org/elasticsearch/test/ESTestCase.java

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -182,7 +182,7 @@
182182
@ThreadLeakScope(Scope.SUITE)
183183
@ThreadLeakLingering(linger = 5000) // 5 sec lingering
184184
@TimeoutSuite(millis = 20 * TimeUnits.MINUTE)
185-
@ThreadLeakFilters(filters = { GraalVMThreadsFilter.class })
185+
@ThreadLeakFilters(filters = { GraalVMThreadsFilter.class, NettyGlobalThreadsFilter.class })
186186
@LuceneTestCase.SuppressSysoutChecks(bugUrl = "we log a lot on purpose")
187187
// we suppress pretty much all the lucene codecs for now, except asserting
188188
// assertingcodec is the winner for a codec here: it finds bugs and gives clear exceptions.
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
/*
2+
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
3+
* or more contributor license agreements. Licensed under the Elastic License
4+
* 2.0 and the Server Side Public License, v 1; you may not use this file except
5+
* in compliance with, at your election, the Elastic License 2.0 or the Server
6+
* Side Public License, v 1.
7+
*/
8+
9+
package org.elasticsearch.test;
10+
11+
import com.carrotsearch.randomizedtesting.ThreadFilter;
12+
13+
/**
14+
* Excludes threads started by {@link io.netty.util.concurrent.GlobalEventExecutor} which are static per-JVM and reused across test suits.
15+
* We make sure we don't leak any tasks on them in {@link ESIntegTestCase#awaitGlobalNettyThreadsFinish()}.
16+
*/
17+
public class NettyGlobalThreadsFilter implements ThreadFilter {
18+
@Override
19+
public boolean reject(Thread t) {
20+
return t.getName().startsWith("globalEventExecutor");
21+
}
22+
}

0 commit comments

Comments
 (0)