Skip to content

ESQL: TSTEP #4#144967

Draft
sidosera wants to merge 5 commits intoelastic:mainfrom
sidosera:feature/esql-tstep-fn
Draft

ESQL: TSTEP #4#144967
sidosera wants to merge 5 commits intoelastic:mainfrom
sidosera:feature/esql-tstep-fn

Conversation

@sidosera
Copy link
Contributor

@sidosera sidosera commented Mar 25, 2026

Adds the TStep grouping function with time buckets aligned to the end of TRANGE and count backwards. It is independent of timezone and daylight savings.

This complements TBUCKET and enables PromQL-compatible step-based time grouping.

Stack:

Depends on #144966

This was referenced Mar 25, 2026
@sidosera sidosera added Team:StorageEngine :StorageEngine/ES|QL Timeseries / metrics / PromQL / logsdb capabilities in ES|QL labels Mar 25, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 25, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@elastic elastic deleted a comment from coderabbitai bot Mar 25, 2026
@sidosera
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 25, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

Walkthrough

This pull request adds the TSTEP function to ESQL, a time-bucketing function that creates end-aligned time buckets over @timestamp using a fixed step size. The implementation includes the core TStep grouping expression class, supporting utility methods in DateUtils for offset-aligned time rounding, analyzer rules to resolve the bucket end anchor from TRANGE when available, and comprehensive documentation and test fixtures covering multiple timestamp types (DATETIME and DATE_NANOS), step formats (numeric and string durations), and timezone handling. Supporting infrastructure includes function registration in the ESQL registry, a new capability flag, and modifications to TRange to expose literal builders and to TranslateTimeSeriesAggregate to recognize TStep specifications.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • 🛠️ Update Documentation: Commit on current branch
  • 🛠️ Update Documentation: Create PR

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java`:
- Around line 1789-1805: resolveAggregate currently only rewrites TStep
expressions in aggregate.groupings() and aggregate.aggregates(), but not in
aggregate.child(), so TSTEP aliases in the child plan remain unresolved; update
resolveAggregate to also call rewriteTSteps on aggregate.child() (e.g., newChild
= rewriteTSteps(aggregate.child())), include that in the change-detection logic
(changed |= newChild != aggregate.child()), and when building the new Aggregate
use aggregate.with(newChild, newGroupings, newAggregates) so the rewritten child
(with TSTEP/TRANGE resolved) is returned.
- Around line 1848-1855: matchingTRange currently returns the first TRange found
by child.forEachExpressionDown, making anchoring order-dependent; change
matchingTRange (and stop using Holder<TRange> with early return) to collect all
TRange matches where matchesTimestamp(trange.timestamp(), timestamp) and then
pick one deterministically (for example: choose the innermost/deepest match, or
the TRange with the earliest start or latest end) so TSTEP anchoring is stable
across traversal orders; update matchingTRange and its callers (e.g., any TSTEP
anchoring logic) to use this deterministic selection instead of the first-hit
behavior.
- Around line 1835-1845: resolveTStep currently calls
TRange.rangeEndLiteral(...) which folds range arguments immediately; guard this
call so unresolved/non-foldable TRanges are not folded here. In resolveTStep,
after obtaining TRange trange = matchingTRange(...), check that trange's bounds
are resolved/foldable (e.g. via the existing trange APIs that indicate
foldability/resolution) before calling
trange.rangeEndLiteral(FoldContext.small(), tstep.dataType()); if the trange is
not foldable/resolved, return the original tstep unchanged (so verification can
report the bad TRANGE) instead of invoking rangeEndLiteral and folding early.

In
`@x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStep.java`:
- Around line 121-132: resolveType currently only checks types for step and end
but not millisecond alignment, which lets DATE_NANOS values with sub-ms
precision or step <1ms slip through and later cause stepMillis==0 /
division-by-zero in offset()/floorRemainder; update resolveType to validate that
when step or end are DATE_NANOS they are millisecond-aligned (nanoseconds %
1_000_000 == 0) and that step represents at least 1 millisecond (reject step
values < 1ms), returning a TypeResolution error message referencing FIRST (step)
or SECOND (end) as appropriate; alternatively, if you prefer to support
nanosecond math, implement nanosecond-aware arithmetic in offset() and
associated helpers instead—pick one approach and enforce it consistently for
resolveType, offset(), and any planning code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 3f565efa-3214-4e31-803d-26ccf6ddc1ba

📥 Commits

Reviewing files that changed from the base of the PR and between a33042d and 858becf.

⛔ Files ignored due to path filters (1)
  • docs/reference/query-languages/esql/images/functions/tstep.svg is excluded by !**/*.svg
📒 Files selected for processing (21)
  • docs/reference/query-languages/esql/_snippets/functions/description/tstep.md
  • docs/reference/query-languages/esql/_snippets/functions/examples/tstep.md
  • docs/reference/query-languages/esql/_snippets/functions/layout/tstep.md
  • docs/reference/query-languages/esql/_snippets/functions/parameters/tstep.md
  • docs/reference/query-languages/esql/_snippets/functions/types/tstep.md
  • docs/reference/query-languages/esql/kibana/definition/functions/tstep.json
  • docs/reference/query-languages/esql/kibana/docs/functions/tstep.md
  • server/src/main/java/org/elasticsearch/common/time/DateUtils.java
  • server/src/test/java/org/elasticsearch/common/time/DateUtilsTests.java
  • x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/WindowGroupingAggregatorFunction.java
  • x-pack/plugin/esql/compute/src/test/java/org/elasticsearch/compute/aggregation/WindowGroupingDerivAggregatorFunctionTests.java
  • x-pack/plugin/esql/qa/testFixtures/src/main/resources/tstep.csv-spec
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStep.java
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/TRange.java
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/TranslateTimeSeriesAggregate.java
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/Configuration.java
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStepErrorTests.java
  • x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStepTests.java

Comment on lines +1789 to +1805
private static Aggregate resolveAggregate(Aggregate aggregate) {
if (Expressions.anyMatch(aggregate.groupings(), ResolveTStepAnchors::containsUnresolvedTStep) == false
&& Expressions.anyMatch(aggregate.aggregates(), ResolveTStepAnchors::containsUnresolvedTStep) == false) {
return aggregate;
}

List<Expression> newGroupings = rewriteTSteps(aggregate.groupings(), aggregate.child());
List<NamedExpression> newAggregates = new ArrayList<>(aggregate.aggregates().size());
boolean changed = newGroupings != aggregate.groupings();

for (NamedExpression aggregateExpression : aggregate.aggregates()) {
NamedExpression rewritten = (NamedExpression) rewriteTSteps(aggregateExpression, aggregate.child());
changed |= rewritten != aggregateExpression;
newAggregates.add(rewritten);
}

return changed ? aggregate.with(aggregate.child(), newGroupings, newAggregates) : aggregate;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Resolve TSTEP aliases in the child plan too.

This rule only rewrites TStep nodes that appear directly in aggregate.groupings() or aggregate.aggregates(). Downstream code also extracts time buckets from aggregate.child(), so a shape like EVAL bucket = TSTEP(...) | STATS ... BY bucket never gets the TRANGE end and silently falls back to configuration.now().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java`
around lines 1789 - 1805, resolveAggregate currently only rewrites TStep
expressions in aggregate.groupings() and aggregate.aggregates(), but not in
aggregate.child(), so TSTEP aliases in the child plan remain unresolved; update
resolveAggregate to also call rewriteTSteps on aggregate.child() (e.g., newChild
= rewriteTSteps(aggregate.child())), include that in the change-detection logic
(changed |= newChild != aggregate.child()), and when building the new Aggregate
use aggregate.with(newChild, newGroupings, newAggregates) so the rewritten child
(with TSTEP/TRANGE resolved) is returned.

Comment on lines +1835 to +1845
private static TStep resolveTStep(TStep tstep, LogicalPlan child) {
if (tstep.end() != null) {
return tstep;
}

TRange trange = matchingTRange(child, tstep.timestamp());
if (trange == null) {
return tstep;
}

return tstep.withEnd(trange.rangeEndLiteral(FoldContext.small(), tstep.dataType()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Don't fold unresolved TRANGEs here.

resolveTStep() calls trange.rangeEndLiteral(...) unconditionally, and rangeEndLiteral() immediately folds both range arguments. If the matched TRange is still unresolved/non-foldable, analysis will throw here instead of letting verification report the bad TRANGE.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java`
around lines 1835 - 1845, resolveTStep currently calls
TRange.rangeEndLiteral(...) which folds range arguments immediately; guard this
call so unresolved/non-foldable TRanges are not folded here. In resolveTStep,
after obtaining TRange trange = matchingTRange(...), check that trange's bounds
are resolved/foldable (e.g. via the existing trange APIs that indicate
foldability/resolution) before calling
trange.rangeEndLiteral(FoldContext.small(), tstep.dataType()); if the trange is
not foldable/resolved, return the original tstep unchanged (so verification can
report the bad TRANGE) instead of invoking rangeEndLiteral and folding early.

Comment on lines +1848 to +1855
private static TRange matchingTRange(LogicalPlan child, Expression timestamp) {
Holder<TRange> match = new Holder<>();
child.forEachExpressionDown(TRange.class, trange -> {
if (match.get() == null && matchesTimestamp(trange.timestamp(), timestamp)) {
match.set(trange);
}
});
return match.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Picking the first matching TRANGE makes the anchor order-dependent.

matchingTRange() stops at the first hit in the child tree. With multiple TRANGEs on the same timestamp, TSTEP can anchor to different ends depending on traversal order, which changes the buckets for otherwise equivalent filters.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java`
around lines 1848 - 1855, matchingTRange currently returns the first TRange
found by child.forEachExpressionDown, making anchoring order-dependent; change
matchingTRange (and stop using Holder<TRange> with early return) to collect all
TRange matches where matchesTimestamp(trange.timestamp(), timestamp) and then
pick one deterministically (for example: choose the innermost/deepest match, or
the TRange with the earliest start or latest end) so TSTEP anchoring is stable
across traversal orders; update matchingTRange and its callers (e.g., any TSTEP
anchoring logic) to use this deterministic selection instead of the first-hit
behavior.

Comment on lines +121 to +132
protected TypeResolution resolveType() {
if (childrenResolved() == false) {
return new TypeResolution("Unresolved children");
}
TypeResolution resolution = isType(step, dt -> dt == DataType.TIME_DURATION, sourceText(), FIRST, "time_duration").and(
isType(timestamp, DataType::isMillisOrNanos, sourceText(), IMPLICIT, "date_nanos or datetime")
);
if (resolution.unresolved() || end == null) {
return resolution;
}
return resolution.and(isType(end, DataType::isMillisOrNanos, sourceText(), SECOND, "date_nanos or datetime"));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

What does java.time.Duration.toMillis() return for sub-millisecond durations such as Duration.ofNanos(1)? Does it truncate nanoseconds?

💡 Result:

java.time.Duration.toMillis returns 0 for sub-millisecond durations such as Duration.ofNanos(1). Yes, it truncates (discards) the nanoseconds by performing integer division of the total nanoseconds by 1,000,000.

Citations:


🏁 Script executed:

# First, let's look at the TStep.java file to understand the full context
cat -n x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStep.java | head -250

Repository: elastic/elasticsearch

Length of output: 10720


🏁 Script executed:

# Let's check the DateUtils.floorRemainder method to confirm division-by-zero risk
rg "floorRemainder" --context 15 -A 10

Repository: elastic/elasticsearch

Length of output: 7233


🏁 Script executed:

# Let's look at the roundFloor implementation to see if there's division-by-zero risk when unitMillis = 0
rg "public static long roundFloor" --context 20 -A 15

Repository: elastic/elasticsearch

Length of output: 3433


🏁 Script executed:

# Let's check if there are any tests that demonstrate sub-millisecond step handling
rg "Duration.ofNanos|Duration.ofMicros" x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStepTests.java -A 3 -B 3

Repository: elastic/elasticsearch

Length of output: 47


🏁 Script executed:

# Let's examine the test file to see what step values are tested
cat -n x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStepTests.java | head -100

Repository: elastic/elasticsearch

Length of output: 4791


🏁 Script executed:

# Continue examining the test file
cat -n x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStepTests.java | tail -100

Repository: elastic/elasticsearch

Length of output: 5107


🏁 Script executed:

# Let's verify the DateUtils.toMilliSeconds method to confirm truncation behavior
rg "public static long toMilliSeconds" -A 10

Repository: elastic/elasticsearch

Length of output: 1188


🏁 Script executed:

# Get more of the toMilliSeconds implementation
rg "public static long toMilliSeconds" -A 20

Repository: elastic/elasticsearch

Length of output: 2254


Validate step/end before truncating them to millis.

offset() assumes both values are millisecond-aligned, but resolveType() only checks type. A DATE_NANOS anchor with sub-millisecond precision is silently truncated to the enclosing millisecond, and any step < 1ms causes stepMillis == 0, triggering a division-by-zero exception in floorRemainder during planning. Either make the offset math nanosecond-aware or reject non-millisecond-aligned inputs upfront.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/grouping/TStep.java`
around lines 121 - 132, resolveType currently only checks types for step and end
but not millisecond alignment, which lets DATE_NANOS values with sub-ms
precision or step <1ms slip through and later cause stepMillis==0 /
division-by-zero in offset()/floorRemainder; update resolveType to validate that
when step or end are DATE_NANOS they are millisecond-aligned (nanoseconds %
1_000_000 == 0) and that step represents at least 1 millisecond (reject step
values < 1ms), returning a TypeResolution error message referencing FIRST (step)
or SECOND (end) as appropriate; alternatively, if you prefer to support
nanosecond math, implement nanosecond-aware arithmetic in offset() and
associated helpers instead—pick one approach and enforce it consistently for
resolveType, offset(), and any planning code.

@sidosera sidosera force-pushed the feature/esql-tstep-fn branch from 858becf to 5dc190c Compare March 26, 2026 15:35
…addInitialIntermediateInput

Uses Block.asVector() to detect dense blocks upfront: all-vector fast path skips null
scanning entirely. For mixed pages, a single position scan replaces two separate passes
(null-finding then block.filter), and nullable blocks are identified per-block rather than
per-position.

Also removes stale IntArrayList import from TimeSeriesAggregationOperator.
@sidosera sidosera force-pushed the feature/esql-tstep-fn branch from 5dc190c to d1a4e04 Compare March 26, 2026 21:47
@sidosera sidosera force-pushed the feature/esql-tstep-fn branch from d1a4e04 to da081c0 Compare March 26, 2026 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:StorageEngine/ES|QL Timeseries / metrics / PromQL / logsdb capabilities in ES|QL Team:StorageEngine v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants