
Conversation

ivancea
Contributor

@ivancea ivancea commented Jul 2, 2025

By default, the `FilterByFilterAggregator` (used by the `filter` and `filters` aggs) was using Lucene's `DefaultBulkScorer`, which has no cancellation mechanism.

This PR wraps it in a `CancellableBulkScorer`, which instead calls the inner scorer over ranges of documents and checks for cancellation between them.

This should fix cases where long-running tasks using these aggregators could not be cancelled, or at least greatly reduce how long they keep running after cancellation.
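
For readers unfamiliar with the pattern, here is a minimal sketch of such a wrapper against Lucene's `BulkScorer` API. It is an illustration only, not the actual Elasticsearch class: the class name, the `INTERVAL` constant, and the `checkCancelled` callback are assumptions.

```java
import java.io.IOException;

import org.apache.lucene.search.BulkScorer;
import org.apache.lucene.search.LeafCollector;
import org.apache.lucene.util.Bits;

/**
 * Sketch of a cancellable bulk scorer: delegates to the wrapped scorer in
 * fixed-size ranges and runs a cancellation check between ranges.
 */
class CancellableBulkScorerSketch extends BulkScorer {
    private static final int INTERVAL = 4096; // assumed batch size

    private final BulkScorer inner;
    private final Runnable checkCancelled; // expected to throw if the task was cancelled

    CancellableBulkScorerSketch(BulkScorer inner, Runnable checkCancelled) {
        this.inner = inner;
        this.checkCancelled = checkCancelled;
    }

    @Override
    public int score(LeafCollector collector, Bits acceptDocs, int min, int max) throws IOException {
        int doc = min;
        while (doc < max) {
            checkCancelled.run();                                  // cancellation point between ranges
            int upTo = (int) Math.min((long) doc + INTERVAL, max); // end of this range, capped at max
            doc = inner.score(collector, acceptDocs, doc, upTo);   // returns the next doc to score
        }
        return doc;
    }

    @Override
    public long cost() {
        return inner.cost();
    }
}
```

The key point is that the wrapped scorer is never asked to score the whole segment in one call, so a cancelled task can stop at the next range boundary instead of running to completion.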

@ivancea ivancea requested review from nik9000 and not-napoleon July 2, 2025 12:53
@ivancea ivancea added the :Analytics/Aggregations, Team:Analytics, auto-backport, v8.18.4, v8.17.9, v9.2.0, v8.19.1 and v8.20.0 labels Jul 2, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Collaborator

Hi @ivancea, I've created a changelog YAML for you.

@ivancea ivancea added >bug and removed >enhancement labels Jul 2, 2025
@elasticsearchmachine
Collaborator

Hi @ivancea, I've updated the changelog YAML for you.

* As CancellableBulkScorer does a minimum of 4096 docs per batch, this number must be low to avoid long test times.
* </p>
*/
private static final long SLEEP_SCRIPT_MS = 1;
Member


Could you use a pair of Semaphores like so:

private static final Semaphore scriptRunPermits = new Semaphore(0);
private static final Semaphore cancelRunPermits = new Semaphore(0);
...

In the test:

cancelRunPermits.acquire();
client().cancelTheRequest();
scriptRunPermits.release(Integer.MAX_VALUE);
...

In the script:

cancelRunPermits.release(1);
scriptRunPermits.acquire();

I'm sure there's a simpler way to do it. But this'd block the cancel task until the script starts, then block the script until the cancel has finished.
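
(For illustration, here is a minimal, self-contained version of that two-semaphore handshake, assuming a single "script" thread; `cancelRequest()` is a placeholder for the real cancellation call, not an Elasticsearch or test API.)

```java
import java.util.concurrent.Semaphore;

public class CancelHandshakeSketch {
    private static final Semaphore scriptRunPermits = new Semaphore(0);
    private static final Semaphore cancelRunPermits = new Semaphore(0);

    static void cancelRequest() {
        System.out.println("request cancelled"); // placeholder for the real cancel call
    }

    public static void main(String[] args) throws InterruptedException {
        Thread script = new Thread(() -> {
            try {
                cancelRunPermits.release();  // signal: the script has started
                scriptRunPermits.acquire();  // block until the cancel has been issued
                System.out.println("script resumed after cancel");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        script.start();

        cancelRunPermits.acquire();                  // wait until the script is running
        cancelRequest();                             // cancel while the script is blocked
        scriptRunPermits.release(Integer.MAX_VALUE); // let all script invocations proceed
        script.join();
    }
}
```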

Contributor Author


Done! It looks far better: faster, and consistently passing with a Repeat(1000).

@ivancea ivancea requested a review from nik9000 July 7, 2025 12:06
@ivancea ivancea added the v9.1.1 label Jul 7, 2025
@ivancea ivancea merged commit 05dff31 into elastic:main Jul 7, 2025
34 checks passed
@ivancea ivancea deleted the aggs-filters-cancellation branch July 7, 2025 15:16
@ivancea ivancea added the v9.0.4 label Jul 7, 2025
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 7, 2025
…30452)

ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 7, 2025
…30452)

ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 7, 2025
…30452)

ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 7, 2025
…30452)

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
…130736)

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
…130745)

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
…130746)

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
…130744)

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
…130743)
ivancea added a commit that referenced this pull request Jul 9, 2025
Fixes #130770

The combination of a bigger thread pool and not draining the semaphore permits occasionally caused test failures due to blocked threads: too many concurrent search threads would consume all the permits in parallel and get stuck.

The test was added in #130452
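
(For illustration, the "draining the permits" part of that fix might look roughly like the sketch below; the semaphore names mirror the ones suggested in the review above, and the resetPermits() helper is hypothetical.)

```java
import java.util.concurrent.Semaphore;

class DrainPermitsSketch {
    private static final Semaphore scriptRunPermits = new Semaphore(0);
    private static final Semaphore cancelRunPermits = new Semaphore(0);

    // Hypothetical per-iteration reset: without draining, permits released by a
    // previous or concurrent search thread could unblock the wrong iteration and
    // leave other threads stuck waiting.
    static void resetPermits() {
        scriptRunPermits.drainPermits();
        cancelRunPermits.drainPermits();
    }
}
```
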
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 9, 2025
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 9, 2025
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 9, 2025
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 9, 2025
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 9, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 9, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 9, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 9, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 9, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 9, 2025
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 17, 2025
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 17, 2025

Labels

:Analytics/Aggregations (Aggregations), auto-backport (Automatically create backport pull requests when merged), >bug, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.17.9, v8.18.4, v8.19.1, v9.0.4, v9.1.1, v9.2.0
