[ES|QL] Substitute date_trunc with round_to when the pre-calculated rounding points are available #128639

fang-xing-esql · 2025-05-29T19:52:58Z

This is a subtask of the filter-by-filter aggregation in ES|QL. And the following changes are included in this PR.

Consolidated min/max per shard into SearchStats, according to the statistics retrieved from Lucene for DateFieldType only, added SearchContextStatsTests and tested min/max.
At date nodes, if min/max are available in SearchStats, provide them to DateTrunc.createRounding, call prepare(min, max), instead of prepareForUnknown().
At data node, pre-calculated rounding points are returned by fixedRoundingPoints(), substitute date_trunc and bucket with round_to function. A new rule LocalSubstituteSurrogateExpressions is added in the local rewrite batch in LocalLogicalPlanOptimizer to do the substitution.
The rewrite from date_trunc/bucket to round_to does not apply to date_nanos yet, as the underlying APIs support millis only, and nanos has not been tested yet.
PreparedRounding.maybeUseArray may hit an assert if the calendar intervals are multiple quantities, like 10 month or 3 quarter, date histogram aggregation has this limitation documented here,this is likely why the assert hasn't been hit before, and the same limitation will apply to this rewrite.
The performance improvement looks promising if the rewrite from date_trunc to round_to can be done, however it is not likely to be reflected in the existing perf regression tests. The nyc_taxis is used by the date_histogram/date_trunc related ES|QL tests, it has a wider date range, from year 1900 to 2253, the fixed rounding points of the date field - dropoff_datetime couldn't be calculated for this wide date range, so the date_trunc function in the existing perf regressions tests are not rewritten to round_to. With some modifications on nyc_taxis's data - keep data from year 2015 only, the query below shows about ~12% performance improvements, ~840ms(on main) vs ~740ms(on branch where date_trunc is rewritten to round_to).

+ curl -k -u esbench:super-secret-password 'https://elasticsearch-0:9200/_query?format=txt' -H Content-Type:application/json '-d{
    "query": "from nyc_taxis | stats c=count(*) by m=bucket(dropoff_datetime, 1 month) | sort m"
}'

**Main**: the measurements are collected from 5 runs of the query above, they are pretty consistent
  "profile" : {
    "query" : {
      "start_millis" : 1750975801544,
      "stop_millis" : 1750975802359,
      "took_millis" : 815,
      "took_nanos" : 814654568
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975797680,
      "stop_millis" : 1750975798512,
      "took_millis" : 832,
      "took_nanos" : 832115025
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975804516,
      "stop_millis" : 1750975805364,
      "took_millis" : 848,
      "took_nanos" : 848035085
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975808248,
      "stop_millis" : 1750975809093,
      "took_millis" : 845,
      "took_nanos" : 845664731
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975811275,
      "stop_millis" : 1750975812132,
      "took_millis" : 857,
      "took_nanos" : 857257746
    },

**Branch**: : the measurements are collected from 5 runs of the query above, they are pretty consistent
[2025-06-26T22:11:32,743][TRACE][o.e.x.e.o.LocalLogicalPlanOptimizer] [elasticsearch-0] About to apply rule local.SubstituteSurrogateExpressionsWithSearchStats
[2025-06-26T22:11:32,744][INFO ][stdout                   ] [elasticsearch-0] field: FieldName[string=dropoff_datetime], min: 1420070400000
[2025-06-26T22:11:32,744][INFO ][stdout                   ] [elasticsearch-0] field: FieldName[string=dropoff_datetime], max: 1451606399000
[2025-06-26T22:11:32,744][INFO ][stdout                   ] [elasticsearch-0] field: FieldName[string=dropoff_datetime], min string: 2015-01-01T00:00:00.000Z
[2025-06-26T22:11:32,744][INFO ][stdout                   ] [elasticsearch-0] field: FieldName[string=dropoff_datetime], max string: 2015-12-31T23:59:59.000Z
[2025-06-26T22:11:32,744][INFO ][stdout                   ] [elasticsearch-0] field name = FieldName[string=dropoff_datetime], min = 1420070400000, max = 1451606399000, roundingPoints = [1420070400000, 1422748800000, 1425168000000, 1427846400000, 1430438400000, 1433116800000, 1435708800000, 1438387200000, 1441065600000, 1443657600000, 1446336000000, 1448928000000]
[2025-06-26T22:11:32,745][TRACE][o.e.x.e.o.LocalLogicalPlanOptimizer] [elasticsearch-0] Rule local.SubstituteSurrogateExpressionsWithSearchStats applied
Aggregate[[m{r}#306],[COUNT(*[KEYWORD],true[BOOLEAN]) AS c#308, m{r}#306]] = Aggregate[[m{r}#306],[COUNT(*[KEYWORD],true[BOOLEAN]) AS c#308, m{r}#306]]
\_Eval[[DATETRUNC(P1M[DATE_PERIOD],dropoff_datetime{f}#310) AS m#306]]     ! \_Eval[[ROUNDTO(dropoff_datetime{f}#310,1420070400000[DATETIME],1422748800000[DATETIME],1425168000000[DATETIME],14278
  \_EsRelation[nyc_taxis][dropoff_datetime{f}#310]                         ! 46400000[DATETIME],1430438400000[DATETIME],1433116800000[DATETIME],1435708800000[DATETIME],1438387200000[DATETIME],1441065600000[DATETIME],1443657600000[DATETIME],1446336000000[DATETIME],1448928000000[DATETIME]) AS m#306]]
                                                                           !   \_EsRelation[nyc_taxis][dropoff_datetime{f}#310]

  "profile" : {
    "query" : {
      "start_millis" : 1750975835802,
      "stop_millis" : 1750975836517,
      "took_millis" : 715,
      "took_nanos" : 714436407
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975838797,
      "stop_millis" : 1750975839547,
      "took_millis" : 750,
      "took_nanos" : 749372816
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975832887,
      "stop_millis" : 1750975833632,
      "took_millis" : 745,
      "took_nanos" : 744764974
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975829550,
      "stop_millis" : 1750975830284,
      "took_millis" : 734,
      "took_nanos" : 733643219
    },
  "profile" : {
    "query" : {
      "start_millis" : 1750975826368,
      "stop_millis" : 1750975827132,
      "took_millis" : 764,
      "took_nanos" : 763297524
    },

The situation on wider date range found during perf tests can be improved by considering the filtering on the date field in the query, use the overlapped the date range from the filtering and search stats to calculate the fixed rounding point. This will be a follow up task, it is not implemented in this PR.

…d_to

elasticsearchmachine · 2025-05-29T19:53:23Z

Hi @fang-xing-esql, I've created a changelog YAML for you.

…r multiple quantities for month/quarter/year

elasticsearchmachine · 2025-06-27T21:35:32Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

nik9000 · 2025-07-01T13:49:20Z

To call it out - if your data is clean - which most log indices are - then you'll see the ~12% performance improvement from this if you are CPU bound. There's a few follow ups coming:

Clamp the index min and max to the min and max from WHERE and filter. This should let even the weird nyc_taxis data benefit from this. And it'll also bring it in play when you do stuff like | STATS COUNT(*) BY DATE_TRUNC(1 minute, @timestamp) on the last 15 minutes of data.
Further improve the performance by flipping to a filter-by-filter execution model. This is what let's aggs run STATS COUNT(*) BY DATE_TRUNC(1 minute, @timestamp) instantly.

nik9000

Looks good to me but probably wants someone with more QL background too.

astefan

In general it looks good, I think the surrogate expression approach can be used for this kind of replacement.
I have objections regarding code abstractions and readability. The code in Bucket and DateTrunc is almost identical with minor exception and can be refactored and simplified.
Also, the SurrogateExpression that we use today is specific to the LogicalPlanOptimizer and is used only there. I would create a separate interface only for things applicable on data nodes, even more so since this new rule will use SearchStats (which is a data node thing).

...ql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateTrunc.java

astefan · 2025-07-09T10:13:47Z

...ql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateTrunc.java

+            var min = searchStats.min(fieldName);
+            var max = searchStats.max(fieldName);
+            // If min/max is available create rounding with them
+            if (min instanceof Long minValue && max instanceof Long maxValue && interval().foldable()) {


min and max here can be null? Can they be something else other than Long? I am assuming yes, and if so why here it only matters if it's long?

Yes, min and max can be null if we don't find statistics of the fields from Lucene. And in this PR, they can only be Long, because:

SearchStats only consolidates min and max for date fields, and they are Long in Lucene.

We only do the substitution for date fields in DateTrunc and Bucket here.

...ql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateTrunc.java

...sql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalLogicalPlanOptimizerTests.java

.../xpack/esql/optimizer/rules/logical/local/SubstituteSurrogateExpressionsWithSearchStats.java

...gin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/grouping/Bucket.java

...ql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateTrunc.java

...gin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/grouping/Bucket.java

fang-xing-esql · 2025-07-11T01:13:06Z

...ava/org/elasticsearch/xpack/esql/expression/function/scalar/date/BinaryDateTimeFunction.java

-
-    private final ZoneId zoneId;
-
-    protected BinaryDateTimeFunction(Source source, Expression argument, Expression timestamp) {


Clean up, this class is not used.

fang-xing-esql · 2025-07-11T01:21:23Z

...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateDiff.java

 public class DateDiff extends EsqlScalarFunction {
    public static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(Expression.class, "DateDiff", DateDiff::new);

-    public static final ZoneId UTC = ZoneId.of("Z");


ExpressionBuilder, Add and Sub all refer to DateUtils.UTC, which is the same as ZoneId.of("Z"), refactor this to point to DateUtils.UTC as well. This seems to be the only place that directly references ZoneId.of("Z").

fang-xing-esql · 2025-07-11T01:23:22Z

...sql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalLogicalPlanOptimizerTests.java

    private LogicalPlan plan(String query, Analyzer analyzer) {
        var analyzed = analyzer.analyze(parser.createStatement(query, EsqlTestUtils.TEST_CFG));
-        // System.out.println(analyzed);
-        var optimized = logicalOptimizer.optimize(analyzed);
-        // System.out.println(optimized);
-        return optimized;
+        return logicalOptimizer.optimize(analyzed);
    }

-    private LogicalPlan plan(String query) {
+    protected LogicalPlan plan(String query) {
        return plan(query, analyzer);
    }

-    private LogicalPlan localPlan(LogicalPlan plan, SearchStats searchStats) {
+    protected LogicalPlan localPlan(LogicalPlan plan, SearchStats searchStats) {
        var localContext = new LocalLogicalOptimizerContext(EsqlTestUtils.TEST_CFG, FoldContext.small(), searchStats);
-        // System.out.println(plan);
-        var localPlan = new LocalLogicalPlanOptimizer(localContext).localOptimize(plan);
-        // System.out.println(localPlan);
-        return localPlan;
+        return new LocalLogicalPlanOptimizer(localContext).localOptimize(plan);
    }


Small refactors.

fang-xing-esql · 2025-07-11T01:45:32Z

In general it looks good, I think the surrogate expression approach can be used for this kind of replacement. I have objections regarding code abstractions and readability. The code in Bucket and DateTrunc is almost identical with minor exception and can be refactored and simplified. Also, the SurrogateExpression that we use today is specific to the LogicalPlanOptimizer and is used only there. I would create a separate interface only for things applicable on data nodes, even more so since this new rule will use SearchStats (which is a data node thing).

Thanks so much for reviewing and for the suggestions on the code refactor @astefan! All of the comments have been addressed.

astefan

LGTM

Is there any existent IT test (csv-spec or yaml) that covers queries that are impacted by this change?
I mean, it feels right to also see this thing in a more real test (like something closer to what users will use and see and feel). If there is no such IT test, please add one which makes sense and which actually uses this thing this PR adds.

nik9000 · 2025-07-15T14:10:03Z

For checking that things are actually rewritten in a test, you can do a new test case in RestEsqlIT and index a bunch of docs and run with a profile. Inside the EvalOperator you'll see it running RoundTo. You could also see the rewritten plan.

Once we push this stuff to the search index we can use the infrastructure in PushQueriesIT.

nik9000 · 2025-07-15T14:10:11Z

Once we push this stuff to the search index we can use the infrastructure in PushQueriesIT.

But that's a followup.

fang-xing-esql · 2025-07-15T14:29:53Z

LGTM

Is there any existent IT test (csv-spec or yaml) that covers queries that are impacted by this change? I mean, it feels right to also see this thing in a more real test (like something closer to what users will use and see and feel). If there is no such IT test, please add one which makes sense and which actually uses this thing this PR adds.

We have quite a few existing CsvTests with date_trunc and bucket that are impacted by this change, and they did uncover some bugs for this change, and they are fixed in this PR, I'll add an IT test in this PR for it as well.

nik9000 · 2025-07-15T14:42:09Z

I'll add an IT test in this PR for it as well.

For me this kind of thing is mostly to make sure that we're really turning on the optimization when we expect to. It's not about today so much - it's about a year from now when you go and make another change and it turns off the optimization unexpectedly.

astefan · 2025-07-15T14:45:58Z

We have quite a few existing CsvTests with date_trunc and bucket that are impacted by this change, and they did uncover some bugs for this change, and they are fixed in this PR

It's good that we have them. I'm ok with the PR as is now, no need for another IT test. Let's merge it 👍

fang-xing-esql · 2025-07-15T14:52:09Z

Thank you for reviewing @nik9000, @astefan !

fang-xing-esql · 2025-07-18T14:26:08Z

To call it out - if your data is clean - which most log indices are - then you'll see the ~12% performance improvement from this if you are CPU bound. There's a few follow ups coming:

Clamp the index min and max to the min and max from WHERE and filter. This should let even the weird nyc_taxis data benefit from this. And it'll also bring it in play when you do stuff like | STATS COUNT(*) BY DATE_TRUNC(1 minute, @timestamp) on the last 15 minutes of data.

This is tracked in #131341

Further improve the performance by flipping to a filter-by-filter execution model. This is what let's aggs run STATS COUNT(*) BY DATE_TRUNC(1 minute, @timestamp) instantly.

consolidate min/max in SearchStats and substitue date_trunc with roun…

b677c8f

…d_to

fang-xing-esql added >enhancement :Analytics/ES|QL AKA ESQL v9.1.0 labels May 29, 2025

Update docs/changelog/128639.yaml

08c3d18

fang-xing-esql and others added 16 commits May 29, 2025 16:11

Merge branch 'main' into date-trunc-search-stats

3f96c15

skip multi-typed fields with mixed date and date_nanos

c3ffa8f

Merge branch 'main' into date-trunc-search-stats

f78fa18

support bucket SubstituteSurrogateExpressionsWithSearchStats

88dbdf8

Merge branch 'main' into date-trunc-search-stats

511faeb

Merge branch 'main' into date-trunc-search-stats

3d28f88

Merge branch 'main' into date-trunc-search-stats

3f1cd81

add tests for min/max in SearchContextStats, disallow substitution fo…

e59ea5a

…r multiple quantities for month/quarter/year

add tests for min/max in SearchContextStats, disallow substitution fo…

2f1aef9

…r multiple quantities for month/quarter/year

Merge branch 'main' into date-trunc-search-stats

debd20e

Merge branch 'main' into date-trunc-search-stats

5d6d9c9

Merge branch 'main' into date-trunc-search-stats

5a152ab

fix test failures

7b669b8

safe guard prepare with min max

54567d4

Merge branch 'main' into date-trunc-search-stats

3208b5f

Merge branch 'main' into date-trunc-search-stats

ee24a7e

tylerperk changed the title ~~[ES|QL] Substitue date_trunc with round_to when the pre-calculated rounding points are available~~ [ES|QL] Substitute date_trunc with round_to when the pre-calculated rounding points are available Jun 24, 2025

elasticsearchmachine added v9.2.0 and removed v9.1.0 labels Jun 26, 2025

fang-xing-esql added 2 commits June 27, 2025 14:20

Merge branch 'main' into date-trunc-search-stats

a76362f

add trace and clean up

57dfe15

fang-xing-esql marked this pull request as ready for review June 27, 2025 21:35

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 27, 2025

Merge branch 'main' into date-trunc-search-stats

d42ea8e

Merge branch 'main' into date-trunc-search-stats

89039cd

nik9000 approved these changes Jul 7, 2025

View reviewed changes

astefan requested changes Jul 9, 2025

View reviewed changes

fang-xing-esql added 2 commits July 10, 2025 11:30

Merge branch 'main' into date-trunc-search-stats

1d453f3

refactor according to review comments

4e82da7

fang-xing-esql commented Jul 11, 2025

View reviewed changes

Merge branch 'main' into date-trunc-search-stats

a2b80be

astefan approved these changes Jul 14, 2025

View reviewed changes

fang-xing-esql merged commit 04ae527 into elastic:main Jul 15, 2025
33 checks passed

fang-xing-esql mentioned this pull request Jul 15, 2025

[ES|QL] Consider min/max from predicates when transform date_trunc/bucket to round_to #131341

Closed

not-napoleon mentioned this pull request Jul 22, 2025

ES|QL: Add TBUCKET function #131449

Merged

fang-xing-esql mentioned this pull request Jul 22, 2025

[ES|QL] Add new queries into nyc_taxis track to measure DateTrunc to RoundTo transformation elastic/rally-tracks#824

Merged


		private final ZoneId zoneId;

		protected BinaryDateTimeFunction(Source source, Expression argument, Expression timestamp) {

[ES|QL] Substitute date_trunc with round_to when the pre-calculated rounding points are available #128639

[ES|QL] Substitute date_trunc with round_to when the pre-calculated rounding points are available #128639

Uh oh!

Conversation

fang-xing-esql commented May 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

elasticsearchmachine commented May 29, 2025

Uh oh!

elasticsearchmachine commented Jun 27, 2025

Uh oh!

nik9000 commented Jul 1, 2025

Uh oh!

nik9000 left a comment

Choose a reason for hiding this comment

Uh oh!

astefan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

astefan Jul 9, 2025

Choose a reason for hiding this comment

Uh oh!

fang-xing-esql Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

fang-xing-esql Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

fang-xing-esql Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

fang-xing-esql Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

fang-xing-esql commented Jul 11, 2025

Uh oh!

astefan left a comment

Choose a reason for hiding this comment

Uh oh!

nik9000 commented Jul 15, 2025

Uh oh!

nik9000 commented Jul 15, 2025

Uh oh!

fang-xing-esql commented Jul 15, 2025

Uh oh!

nik9000 commented Jul 15, 2025

Uh oh!

astefan commented Jul 15, 2025

Uh oh!

fang-xing-esql commented Jul 15, 2025

Uh oh!

Uh oh!

fang-xing-esql commented Jul 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fang-xing-esql commented May 29, 2025 •

edited

Loading