Commit 68ea296
Fix flaky ResourceAwareTasksTests (opensearch-project#20863)
The test was flaky because of a race condition between request
completion and task resource tracking cleanup.
The sequence of events:
1. Task is cancelled via `CancelTasksRequest`
2. The node operation throws `TaskCancelledException`
3. The response is sent back to the caller, which counts down
`requestCompleteLatch`
4. The test's main thread wakes up from `requestCompleteLatch.await()`
and asserts `resourceTasks.size() == 0`
5. Meanwhile, `TaskResourceTrackingService.stopTracking()` (which
calls `resourceAwareTasks.remove()`) is invoked asynchronously
via a `resourceTrackingCompletionListener` registered in
`TaskManager.register()`
Steps 4 and 5 race. I was able to reproduce the failure locally using
`stress-ng` and verify this fix.
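
The race described in the steps above can be sketched in isolation. This is a minimal, hypothetical model, not the code changed by this commit: the class `TrackingRaceSketch`, the helper `waitUntil`, and the 50 ms delay are all assumptions made for illustration, and only the names `requestCompleteLatch` and `resourceTasks` are borrowed from the message. It shows why an assertion placed immediately after `requestCompleteLatch.await()` can observe a non-empty map, and how polling for the expected state with a timeout avoids depending on the ordering of steps 4 and 5.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class TrackingRaceSketch {
    // Hypothetical stand-in for the map of tracked resource-aware tasks.
    static final Map<Long, String> resourceTasks = new ConcurrentHashMap<>();

    // Hypothetical helper: polls until the condition holds or the timeout elapses.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(10);
        }
        return condition.getAsBoolean();
    }

    // Returns true once the tracking map has drained within the timeout.
    static boolean runScenario() throws InterruptedException {
        resourceTasks.put(1L, "cancelled-task");
        CountDownLatch requestCompleteLatch = new CountDownLatch(1);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.execute(() -> {
                // Step 3: the response reaches the caller, releasing the latch.
                requestCompleteLatch.countDown();
                try {
                    // Artificial delay to widen the race window (assumption).
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // Step 5: tracking cleanup runs after the latch was released.
                resourceTasks.remove(1L);
            });

            // Step 4: the test thread wakes up here.
            requestCompleteLatch.await();

            // Checking resourceTasks.isEmpty() at this point would race with
            // step 5. Waiting for the expected state removes the ordering dependency.
            return waitUntil(resourceTasks::isEmpty, 5_000);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("tracking map drained: " + runScenario());
    }
}
```

Polling with a bounded timeout is one common way to make such an assertion robust; whether the actual three-line change uses this approach or synchronizes on the cleanup directly is not shown in this page.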
Signed-off-by: Andrew Ross <andrross@amazon.com>
1 parent d10224b commit 68ea296
File tree
1 file changed, 3 additions and 1 deletion
- server/src/test/java/org/opensearch/action/admin/cluster/node/tasks
[Diff: original line 413 removed; new lines 413-415 added. Line contents not captured.]