fix(esql): check per-shard DateFieldType for DocValuesSkipper by salvatore-campagna · Pull Request #142752 · elastic/elasticsearch

salvatore-campagna · 2026-02-20T11:46:38Z

Summary

SearchContextStats.min()/max() derived hasDocValuesSkipper once from the first shard's DateFieldType and applied it globally to all shards via doWithContexts. In mixed environments (TSDB shards using DocValuesSkipper + standard shards using PointValues), the wrong API gets called on some shards, causing sentinel values to leak into min/max results.

The two APIs have different sentinel behavior when a shard has no data for the field:

PointValues.getMinPackedValue() returns null when there are no points: callers can check for null and skip.
DocValuesSkipper.globalMinValue() returns Long.MIN_VALUE when a leaf reader has no skipper, and Long.MAX_VALUE when no segments have the field. globalMaxValue() returns the opposite sentinels.

When hasDocValuesSkipper is determined from the first shard (e.g. a TSDB shard) and then applied to a standard shard that only has PointValues, globalMinValue/globalMaxValue are called on readers that have no skipper. This returns the sentinels Long.MIN_VALUE/Long.MAX_VALUE, which propagate as the min/max result.
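To make the sentinel contract concrete, here is a minimal sketch that models the behavior described above in plain Java. It is not Lucene's code: `SentinelModel`, `Leaf`, and `modelGlobalMinValue` are hypothetical names used only for illustration, and the model covers only the two sentinel cases mentioned in this description.

```java
import java.util.List;

final class SentinelModel {
    // Hypothetical stand-in for a leaf reader: whether the leaf has the field at all,
    // and the skipper's minimum (null models "this leaf has no skipper for the field").
    record Leaf(boolean hasField, Long skipperMin) {}

    // Models the sentinel behavior described above; an assumption-based sketch only.
    static long modelGlobalMinValue(List<Leaf> leaves) {
        long min = Long.MAX_VALUE; // stays MAX_VALUE when no leaf has the field
        for (Leaf leaf : leaves) {
            if (leaf.hasField() == false) {
                continue;
            }
            if (leaf.skipperMin() == null) {
                return Long.MIN_VALUE; // a leaf without a skipper yields the sentinel
            }
            min = Math.min(min, leaf.skipperMin());
        }
        return min;
    }
}
```

In this model, a standard shard whose leaves carry the field but no skipper yields `Long.MIN_VALUE`, which is indistinguishable from a real minimum unless the caller filters it.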

This replaces the workaround in #142726, which filtered sentinels after the fact with hasMin/hasMax booleans. Instead, this fix addresses the root cause: min() and max() now always call the right API based on each shard's own DateFieldType.hasDocValuesSkipper(), using doc values skippers when available and BKD trees (point values) otherwise, rather than deriving the choice once from the first shard and applying it globally. This way sentinels can never leak in the first place.
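The difference between deciding once and deciding per shard can be sketched with a toy model. Everything here is an assumption for illustration: `Shard`, `minDecidedOnce`, and `minPerShard` are hypothetical names, and each shard simply records what the skipper-based and point-based lookups would return for it.

```java
import java.util.List;

final class PerShardMinSketch {
    // Hypothetical shard model. skipperMin is what the skipper-based lookup would return
    // (a sentinel when the shard has no skipper); pointsMin is null when there are no points.
    record Shard(boolean hasDocValuesSkipper, long skipperMin, Long pointsMin) {}

    // Previous behavior: the API choice comes from the first shard and is applied to all.
    static Long minDecidedOnce(List<Shard> shards) {
        boolean useSkipper = shards.get(0).hasDocValuesSkipper();
        return aggregate(shards, useSkipper, false);
    }

    // Fixed behavior: each shard's own flag selects the API.
    static Long minPerShard(List<Shard> shards) {
        return aggregate(shards, false, true);
    }

    private static Long aggregate(List<Shard> shards, boolean globalChoice, boolean perShard) {
        Long min = null;
        for (Shard s : shards) {
            boolean useSkipper = perShard ? s.hasDocValuesSkipper() : globalChoice;
            Long v = useSkipper ? Long.valueOf(s.skipperMin()) : s.pointsMin();
            if (v != null) {
                min = (min == null) ? v : Long.valueOf(Math.min(min, v));
            }
        }
        return min;
    }
}
```

With a TSDB shard `new Shard(true, 100L, null)` followed by a standard shard `new Shard(false, Long.MIN_VALUE, 50L)`, the decide-once variant reads the sentinel from the second shard, while the per-shard variant reads its point value instead.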

What changed

min() and max() now iterate contexts directly instead of using doWithContexts, preserving the context-to-leaf-reader association needed to check hasDocValuesSkipper() per shard. Each shard always uses the correct API to retrieve min/max values: doc values skippers when available, BKD trees (point values) otherwise.
Simplified the early-return guard: removed the (hasDocValueSkipper == false && stat.config.indexed == false) check, which was incorrect for mixed TSDB/standard environments (a TSDB shard with indexed=false would cause the global indexed to be false, bailing out even when standard shards have points).
Extracted helper methods (docValuesSkipperMinValue/docValuesSkipperMaxValue and pointMinValue/pointMaxValue) that wrap the underlying APIs and convert sentinel values to null. This gives both code paths a uniform Long-or-null interface, and also filters the Long.MIN_VALUE sentinel from DocValuesSkipper.globalMinValue() that the previous code did not guard against. The per-leaf results are aggregated via nullableMin/nullableMax helpers.
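The Long-or-null convention described in the list above could look roughly like the following. This is a sketch under assumptions, not the code from this PR: `nullableMin` and `nullableMax` are named in the description, but their bodies here are guesses, and `sentinelToNull` is a hypothetical helper standing in for the conversion done inside the wrapper methods.

```java
final class NullableMinMaxSketch {
    // Hypothetical conversion: treat either sentinel as "no value for this shard".
    static Long sentinelToNull(long value) {
        return (value == Long.MIN_VALUE || value == Long.MAX_VALUE) ? null : Long.valueOf(value);
    }

    // Null means "no data so far"; a null operand never wins over a real value.
    static Long nullableMin(Long a, Long b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        return Math.min(a, b);
    }

    static Long nullableMax(Long a, Long b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        return Math.max(a, b);
    }
}
```

One trade-off of this shape is that a field whose genuine minimum is Long.MIN_VALUE (or maximum Long.MAX_VALUE) would also be reported as null; for date fields that seems unlikely to matter in practice.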

Tests

testPointValuesMinMaxDoesNotReturnSentinelValues: exercises the PointValues code path. Creates multiple standard (non-TSDB) contexts where the date field is mapped but has no actual date data. Asserts hasDocValuesSkipper() is false on each context and verifies that min() and max() return null as expected. This is the path that reproduces the original stack trace: without the fix, Long.MAX_VALUE leaks as min and Long.MIN_VALUE as max, causing Rounding.prepare(min, max) to throw IllegalArgumentException: [9223372036852975807] must be <= [-9223372036852975808].
testDocValuesSkipperMinMaxDoesNotReturnSentinelValues: exercises the DocValuesSkipper code path. Creates multiple TSDB contexts where @timestamp is mapped with hasDocValuesSkipper()=true but the Lucene index only contains keyword docs (no timestamp data written). Verifies that min() and max() return null instead of sentinel values. In practice, data streams always have @timestamp populated, but the test intentionally forces the empty-data corner case so that the sentinel handling is self-contained and does not rely on guarantees from upper layers.
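Why returning null matters to callers can be illustrated with a small precondition check modeled on the error message quoted above. This is not the Rounding implementation: `RangeCheckSketch`, `requireOrdered`, and `canPrepare` are hypothetical names, and the check only mirrors the "min must be <= max" shape of the message.

```java
final class RangeCheckSketch {
    // Hypothetical precondition with the same shape as the quoted error message.
    static void requireOrdered(long min, long max) {
        if (min > max) {
            throw new IllegalArgumentException("[" + min + "] must be <= [" + max + "]");
        }
    }

    // Caller-side guard: a null bound means "no data", so no range is prepared.
    static boolean canPrepare(Long min, Long max) {
        if (min == null || max == null) {
            return false;
        }
        requireOrdered(min, max);
        return true;
    }
}
```

With leaked sentinels, a caller would effectively invoke `requireOrdered(Long.MAX_VALUE, Long.MIN_VALUE)` and fail; with null results, the guard skips the range entirely.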

Replaces #142726

Closes #142725

…chContextStats min/max hasDocValuesSkipper was derived once from the first shard's DateFieldType and applied globally via doWithContexts. In mixed environments (TSDB shards with DocValuesSkipper + standard shards with PointValues), calling the wrong API on some shards caused sentinel values (Long.MIN_VALUE/Long.MAX_VALUE) to leak into min/max. Replace doWithContexts with direct per-context iteration so each shard's own DateFieldType determines whether to use DocValuesSkipper or PointValues. This also simplifies the early-return guard by removing the incorrect indexed check that could bail out prematurely in mixed modes.

…null instead of sentinels Extract helper methods that convert sentinel values to null, making both code paths return Long (or null) uniformly. This simplifies the min/max logic and also filters the Long.MIN_VALUE sentinel from DocValuesSkipper.globalMinValue that the previous code did not guard against.

elasticsearchmachine · 2026-02-20T12:43:56Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

romseygeek

LGTM. I wonder if we need to build a more general 'get min and max values of a field' API directly into lucene here?

romseygeek · 2026-02-20T14:35:10Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/stats/SearchContextStats.java

+                        continue;
+                    }
+                    final MappedFieldType ctxFieldType = context.getFieldType(field.string());
+                    boolean ctxHasSkipper = ctxFieldType instanceof DateFieldType dft && dft.hasDocValuesSkipper();


Given that we know we're operating on a DateFieldType here (because of the instanceof check on line 236) we can use ctxFieldType.indexType().hasSkippers() directly here.

Right...no instanceof 💯

salvatore-campagna · 2026-02-20T14:41:58Z

LGTM. I wonder if we need to build a more general 'get min and max values of a field' API directly into lucene here?

Yeah I think it would be better to have something like that. Whatever the underlying data structure is, just give me min/max.

…anceof cast Use MappedFieldType.indexType().hasDocValuesSkipper() to check for doc values skipper support, avoiding the unnecessary instanceof DateFieldType cast since the outer guard already ensures the field type is a DateFieldType.

salvatore-campagna · 2026-02-20T15:28:11Z

I opened this issue in Lucene too: apache/lucene#15740

Link to apache/lucene#15740 so we remember to replace the wrapper helpers once a unified API is available.

salvatore-campagna self-assigned this Feb 20, 2026

salvatore-campagna added the >non-issue and :StorageEngine/TSDB (You know, for Metrics) labels Feb 20, 2026

elasticsearchmachine added the v9.4.0 label Feb 20, 2026

salvatore-campagna mentioned this pull request Feb 20, 2026

ES|QL: fix sentinel values leaking from SearchContextStats min/max #142726

Closed

salvatore-campagna and others added 3 commits February 20, 2026 12:50

Merge branch 'main' into fix/per-shard-doc-values-skipper

1fb89e1

[CI] Auto commit changes from spotless

e09db32

salvatore-campagna added the backport pending, auto-backport (Automatically create backport pull requests when merged), v9.3.2, and v9.2.7 labels Feb 20, 2026

Merge branch 'main' into fix/per-shard-doc-values-skipper

4cf063c

salvatore-campagna marked this pull request as ready for review February 20, 2026 12:43

elasticsearchmachine added the Team:StorageEngine label Feb 20, 2026

kkrik-es requested review from dnhatn and romseygeek February 20, 2026 12:53

salvatore-campagna requested a review from fang-xing-esql February 20, 2026 13:00

romseygeek approved these changes Feb 20, 2026

salvatore-campagna and others added 2 commits February 20, 2026 15:46

Merge branch 'main' into fix/per-shard-doc-values-skipper

5605f96

chore(esql): add TODO referencing Lucene issue for unified min/max API

b1c787b

salvatore-campagna wants to merge 8 commits into elastic:main from salvatore-campagna:fix/per-shard-doc-values-skipper