Avoid rewrite round_to with expensive queries #135987

dnhatn · 2025-10-04T01:23:24Z

Today, we use a threshold (defaults to 128) to avoid generating too many sub-queries when replacing round_to with sub-queries. However, we do not account for cases where the main query is expensive. In such cases, running many expensive queries is slower and more costly than running a single query and then reading values and rounding. Our benchmark shows that this query takes 800ms with query-and-tags, but only 40ms without it.

TS metric* 
| WHERE host.name LIKE "host-*" AND @timestamp >= "2025-07-25T12:55:59.000Z" AND @timestamp <= "2025-07-25T17:25:59.000Z"
| STATS AVG(AVG_OVER_TIME(`metrics.system.cpu.load_average.1m`)) BY host.name, TBUCKET(5 minutes)

And this query:

TS new_metrics* 
| WHERE host.name IN("host-0", "host-1", "host-2") AND @timestamp >= "2025-07-25T12:55:59.000Z" AND @timestamp <= "2025-07-25T17:25:59.000Z" 
| STATS AVG(AVG_OVER_TIME(`metrics.system.cpu.load_average.1m`)) BY host.name, TBUCKET(5 minutes)

reduces from 50ms to 10ms.

This change proposes using the threshold as the number of query clauses and assigning higher weights to expensive queries, such as wildcard or prefix queries. This allows us to disable the rewrite when it is less efficient, while still enabling it if the number of sub-queries is small.

I consider this a bug and will backport it to 9.2.1.
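
The weighting idea in the description can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `Query` records are hypothetical stand-ins for Elasticsearch's `QueryBuilder` classes, and the weight of 5 for pattern queries is an assumed value chosen for the example.

```java
import java.util.List;

// Hypothetical stand-ins for Elasticsearch query builders; illustration only.
class ClauseBudgetSketch {
    interface Query {}
    record MatchAll() implements Query {}
    record Term(String field, String value) implements Query {}
    record Wildcard(String field, String pattern) implements Query {}
    record Bool(List<Query> clauses) implements Query {}

    // Assumed weight for expensive pattern queries (not taken from the PR).
    static final int EXPENSIVE_WEIGHT = 5;

    // Rough cost of a query, measured in "clauses".
    static int estimateClauses(Query q) {
        if (q == null || q instanceof MatchAll) {
            return 0; // adds no real work to each sub-query
        }
        if (q instanceof Wildcard) {
            return EXPENSIVE_WEIGHT;
        }
        if (q instanceof Bool b) {
            int total = 0;
            for (Query clause : b.clauses()) {
                total += estimateClauses(clause);
            }
            return total;
        }
        return 1; // any other leaf query
    }

    // Largest number of sub-queries the rewrite may generate for this query.
    static int maxSubQueries(int threshold, Query mainQuery) {
        int clauses = estimateClauses(mainQuery) + 1; // +1 as in the diff
        return (threshold + clauses - 1) / clauses;   // ceiling division for positive ints
    }

    public static void main(String[] args) {
        Query wildcard = new Wildcard("host.name", "host-*");
        Query terms = new Bool(List.of(
            new Term("host.name", "host-0"),
            new Term("host.name", "host-1"),
            new Term("host.name", "host-2")));
        System.out.println(maxSubQueries(128, new MatchAll())); // 128
        System.out.println(maxSubQueries(128, wildcard));       // 22
        System.out.println(maxSubQueries(128, terms));          // 32
    }
}
```

Under these assumed weights, a wildcard filter shrinks the sub-query budget from 128 to 22, so a query with many time buckets would fall back to the single-query path, while a cheap filter keeps most of the budget.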

dnhatn · 2025-10-04T04:08:08Z

.../elasticsearch/xpack/esql/optimizer/rules/physical/local/ReplaceRoundToWithQueryAndTags.java

+        return Math.ceilDiv(threshold, clauses);
+    }
+
+    static int estimateQueryClauses(QueryBuilder q) {


This is a rough estimate - any suggestions are welcome.

This looks good to me.

Would we also want to handle leaf query builders that target doc-value-only fields differently than those that target an indexed field? If so, I guess that should be a separate change.

+1. We might also need to convert to queries, rewrite them, then estimate.

elasticsearchmachine · 2025-10-04T04:08:56Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-10-04T04:08:56Z

Hi @dnhatn, I've created a changelog YAML for you.

martijnvg

I left a few questions, but this LGTM!

martijnvg · 2025-10-04T07:08:52Z

.../elasticsearch/xpack/esql/optimizer/rules/physical/local/ReplaceRoundToWithQueryAndTags.java

+        if (q == null || q instanceof MatchAllQueryBuilder || q instanceof MatchNoneQueryBuilder) {
+            return 0;
+        }
+        if (q instanceof WildcardQueryBuilder || q instanceof RegexpQueryBuilder || q instanceof PrefixQueryBuilder) {


I think we want to add FuzzyQueryBuilder here as well?
Or maybe we should check for MultiTermQueryBuilder? (but that also includes range query builder, which if indexed should count as one, I think?)

Sure, I added it in b055ae0.

the range query too in 2f7fd82
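
One way to read the range discussion above: since the range query builder is also a multi-term query builder, a specific check for ranges has to run before the generic multi-term check. The sketch below uses hypothetical stand-in types rather than the Elasticsearch classes. The `indexed` flag is an assumption, as is the choice to let non-indexed ranges fall through to the multi-term weight; the weight of 3 mirrors the `return 3` hunk quoted later in this review.

```java
// Hypothetical stand-ins; not the Elasticsearch classes themselves.
class RangeOrderingSketch {
    interface Query {}
    interface MultiTerm extends Query {}
    record Fuzzy(String field, String value) implements MultiTerm {}
    // `indexed` is an assumed flag: true when the field has an index structure usable for ranges.
    record Range(String field, boolean indexed) implements MultiTerm {}

    static final int MULTI_TERM_WEIGHT = 3;

    static int estimate(Query q) {
        // The specific case must come first because Range is also a MultiTerm.
        if (q instanceof Range r && r.indexed()) {
            return 1;
        }
        if (q instanceof MultiTerm) {
            return MULTI_TERM_WEIGHT;
        }
        return 1;
    }

    public static void main(String[] args) {
        System.out.println(estimate(new Range("@timestamp", true)));  // 1
        System.out.println(estimate(new Range("@timestamp", false))); // 3
        System.out.println(estimate(new Fuzzy("host.name", "host"))); // 3
    }
}
```

If the two checks were swapped, an indexed range would be charged the multi-term weight, which is the case the comment above wants to avoid.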

martijnvg · 2025-10-04T07:13:40Z

.../elasticsearch/xpack/esql/optimizer/rules/physical/local/ReplaceRoundToWithQueryAndTags.java

+        return Math.ceilDiv(threshold, clauses);
+    }
+
+    static int estimateQueryClauses(QueryBuilder q) {


This looks good to me.

Would we also want to handle leaf query builders that target doc-value-only fields differently than those that target an indexed field? If so, I guess that should be a separate change.

martijnvg · 2025-10-04T07:16:09Z

.../elasticsearch/xpack/esql/optimizer/rules/physical/local/ReplaceRoundToWithQueryAndTags.java

+        }
+        if (q instanceof MultiTermQueryBuilder) {
+            return 3;
+        }


Should we also score phrase queries differently?

dnhatn · 2025-10-04T08:24:04Z

@martijnvg Thank you so much for the quick review!

kkrik-es · 2025-10-04T13:05:59Z

.../elasticsearch/xpack/esql/optimizer/rules/physical/local/ReplaceRoundToWithQueryAndTags.java

+        int clauses = estimateQueryClauses(stats, query) + 1;
+        if (indexMode == IndexMode.TIME_SERIES) {
+            // No doc partitioning for time_series sources; increase the threshold to trade overhead for parallelism.
+            threshold *= 2;


Super nit: consider adding constants for these numbers (2, 5, etc.).
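
A sketch of what this suggestion could look like, assuming the two quoted hunks belong to the same method. The constant names and the `adjustThreshold` method name are hypothetical, `IndexMode` is a local stand-in enum rather than the Elasticsearch class, and the ceiling division is written by hand where the diff uses `Math.ceilDiv`.

```java
class ThresholdConstantsSketch {
    enum IndexMode { STANDARD, TIME_SERIES } // stand-in for illustration

    // Hypothetical names for the literals used in the rule.
    static final int TIME_SERIES_THRESHOLD_MULTIPLIER = 2;
    static final int BASE_CLAUSES = 1;

    static int adjustThreshold(int threshold, int estimatedClauses, IndexMode indexMode) {
        int clauses = estimatedClauses + BASE_CLAUSES;
        if (indexMode == IndexMode.TIME_SERIES) {
            // No doc partitioning for time_series sources; trade overhead for parallelism.
            threshold *= TIME_SERIES_THRESHOLD_MULTIPLIER;
        }
        // Ceiling division for positive ints.
        return (threshold + clauses - 1) / clauses;
    }

    public static void main(String[] args) {
        System.out.println(adjustThreshold(128, 5, IndexMode.STANDARD));    // 22
        System.out.println(adjustThreshold(128, 5, IndexMode.TIME_SERIES)); // 43
    }
}
```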

kkrik-es

:)

elasticsearchmachine · 2025-10-04T15:18:51Z

💚 Backport successful

Status	Branch	Result
✅	9.2

Avoid rewrite round_to with expensive queries

14c6551

elasticsearchmachine added the v9.3.0 label Oct 4, 2025

dnhatn added the v9.2.1, :Analytics/ES|QL, >bug, and auto-backport labels Oct 4, 2025

dnhatn requested review from fang-xing-esql, kkrik-es, martijnvg and nik9000 October 4, 2025 04:08

dnhatn marked this pull request as ready for review October 4, 2025 04:08

elasticsearchmachine added the Team:Analytics label Oct 4, 2025

Update docs/changelog/135987.yaml

c9ce8b8

martijnvg approved these changes Oct 4, 2025

dnhatn added 4 commits October 4, 2025 00:44

Add fuzzy

b055ae0

Add range with points

2f7fd82

harden tests

e0ec420

Merge remote-tracking branch 'dnhatn/round_to_expensive_queries' into round_to_expensive_queries

327f284

dnhatn enabled auto-merge (squash) October 4, 2025 08:25

kkrik-es reviewed Oct 4, 2025

kkrik-es approved these changes Oct 4, 2025

dnhatn merged commit d5ad51a into elastic:main Oct 4, 2025
34 checks passed

dnhatn deleted the round_to_expensive_queries branch October 4, 2025 15:17

dnhatn mentioned this pull request Oct 4, 2025

[9.2] Avoid rewrite round_to with expensive queries (#135987) #135992

Merged

kkrik-es mentioned this pull request Oct 9, 2025

Efficient prefix/wildcard filtering on dimension fields #136252

Closed
