Open
Labels: :Analytics/ES|QL, :StorageEngine/TSDB, >enhancement, Team:Analytics, Team:StorageEngine
Description
Time-series aggregations, such as `{agg}_over_time` and `rate`, against time-series indices are currently slow for several reasons:
- They require two phases:
  - First, grouping by each time series (by `tsid` and `timebucket`).
  - Then, grouping by user-specified groups.
- For `rate` aggregations, data must be provided in timestamp order per time series.
This issue proposes some ideas and tracks optimizations to improve the performance of time-series aggregations in ES|QL.
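The two phases described above can be sketched in a few lines of Python. This is only an illustration of the data flow, not the ES|QL implementation: the document layout, the `cluster` dimension, and the simplified `simple_rate` function (sum of increases divided by elapsed time, with a drop treated as a counter reset) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical samples of one counter metric. `cluster` is a dimension,
# so it is constant within a tsid. Rows are deliberately out of order.
DOCS = [
    {"tsid": "a", "cluster": "c1", "ts": 0, "value": 10.0},
    {"tsid": "a", "cluster": "c1", "ts": 30, "value": 40.0},
    {"tsid": "b", "cluster": "c1", "ts": 40, "value": 25.0},
    {"tsid": "b", "cluster": "c1", "ts": 0, "value": 100.0},
    {"tsid": "b", "cluster": "c1", "ts": 20, "value": 5.0},
    {"tsid": "c", "cluster": "c2", "ts": 10, "value": 0.0},
    {"tsid": "c", "cluster": "c2", "ts": 50, "value": 20.0},
]


def simple_rate(samples):
    """Simplified per-series rate (an assumption, not the ES|QL formula)."""
    # The rate needs the samples of one series in timestamp order.
    samples = sorted(samples, key=lambda d: d["ts"])
    increase = 0.0
    for prev, cur in zip(samples, samples[1:]):
        delta = cur["value"] - prev["value"]
        # A drop is treated as a counter reset: count the new value itself.
        increase += delta if delta >= 0 else cur["value"]
    elapsed = samples[-1]["ts"] - samples[0]["ts"]
    return increase / elapsed if elapsed > 0 else 0.0


def phase_one(docs, bucket_width):
    """Phase 1: group by (tsid, time bucket) and aggregate each series."""
    groups = defaultdict(list)
    for doc in docs:
        bucket = doc["ts"] - doc["ts"] % bucket_width
        groups[(doc["tsid"], bucket)].append(doc)
    # Carry the dimension value along so phase 2 can group by it.
    return {
        key: (samples[0]["cluster"], simple_rate(samples))
        for key, samples in groups.items()
    }


def phase_two(per_series):
    """Phase 2: group the per-series results by the user-specified key."""
    totals = defaultdict(float)
    for (_tsid, bucket), (cluster, rate) in per_series.items():
        totals[(cluster, bucket)] += rate
    return dict(totals)


result = phase_two(phase_one(DOCS, bucket_width=60))
print(result)  # {('c1', 0): 1.625, ('c2', 0): 0.5}
```

The sort inside `simple_rate` stands in for the ordering requirement: if the source already delivers each series in timestamp order, that step disappears, and queries without `rate` never need it.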
Source command
- Translate time-series queries without `rate` to `FROM`: Avoid sorted source for time_series aggs without rates #127033
- Avoid comparing `tsid` when iterating over documents in TS source: Optimize time-series source operator #127095
- Extract fields directly from the time-series source: Push down field extraction to time-series source #127445
- Speed up reading dimension fields: Speed up read dimension fields in TS #128283
- One segment: Add optimized path for single segment in TS source #131502
- Optimize loading of time-series data using `FROM`.
- Field extraction for single segment
Execution
- Execute time-series source in a separate driver: Increase concurrency for TS command #128419
- Execute extract fields in a separate driver: Run field extraction concurrently in TS #128643
- Support segment data partitioning for TS
- Emit final results for non-overlapping buckets (drop tsid for these buckets)
- Run one shard at a time to leverage TimeSeriesBlockHash
- Constant blocks: Optimizations with constant blocks #132379
Values aggregation (for dimension fields)
- Emit ordinal output blocks: Emit ordinal output block for values aggregate #127201
- Handle ordinal input blocks: Optimize ordinal inputs in Values aggregation #127849
- Optimize for single-value aggregations (dimension fields?).
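To illustrate why ordinal blocks help the values aggregation, here is a minimal sketch of dictionary (ordinal) encoding. The `OrdinalBlock` class and the helper functions are hypothetical names for this example and do not correspond to the actual ES|QL block classes. The point is that distinct values per group can be collected by comparing small integers, and the output can reuse the input dictionary instead of copying strings.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OrdinalBlock:
    """Hypothetical dictionary-encoded block of keyword values."""
    dictionary: List[str]  # each distinct value stored once
    ordinals: List[int]    # one dictionary index per row


def encode(values: List[str]) -> OrdinalBlock:
    """Build an ordinal block from plain values."""
    index: Dict[str, int] = {}
    ordinals: List[int] = []
    for value in values:
        if value not in index:
            index[value] = len(index)
        ordinals.append(index[value])
    # Dicts preserve insertion order, so keys line up with their ordinals.
    return OrdinalBlock(list(index), ordinals)


def values_by_group(group_ids: List[int], block: OrdinalBlock) -> Dict[int, List[int]]:
    """Collect the distinct ordinals seen in each group (no string compares)."""
    seen: Dict[int, set] = {}
    for group, ordinal in zip(group_ids, block.ordinals):
        seen.setdefault(group, set()).add(ordinal)
    return {group: sorted(ords) for group, ords in seen.items()}


def decode(block: OrdinalBlock, ordinals: List[int]) -> List[str]:
    """Resolve ordinals back to values only when the final output needs them."""
    return [block.dictionary[o] for o in ordinals]


hosts = encode(["h1", "h2", "h1", "h3", "h2"])
per_group = values_by_group([0, 0, 1, 1, 1], hosts)
print(per_group)                       # {0: [0, 1], 1: [0, 1, 2]}
print(decode(hosts, per_group[1]))     # ['h1', 'h2', 'h3']
```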
Block hash
- Enable time-series block hash: Enable time-series block hash #127488
- Leverage ordinal blocks in time-series block hash: Enable time-series block hash #127488
- Emit ordinal blocks in PackedValuesBlockHash.
Planning
- Use a single aggregation for the second phase.
- Optimize for a single target index.
- Skip backing indices with `start_time` and `end_time` outside the `TRANGE` filter.
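The last item amounts to an interval-overlap check at planning time. The sketch below assumes that each backing index accepts timestamps in the half-open range `[start_time, end_time)` and that the `TRANGE` filter can be reduced to a half-open range as well; the `BackingIndex` type, the index names, and the integer timestamps are hypothetical.

```python
from typing import List, NamedTuple


class BackingIndex(NamedTuple):
    """Hypothetical view of a backing index's time bounds."""
    name: str
    start_time: int  # inclusive
    end_time: int    # exclusive (assumed)


def overlaps(index: BackingIndex, query_start: int, query_end: int) -> bool:
    """True if [start_time, end_time) intersects [query_start, query_end)."""
    return index.start_time < query_end and query_start < index.end_time


def prune(indices: List[BackingIndex], query_start: int, query_end: int) -> List[str]:
    """Keep only the backing indices that can contain matching documents."""
    return [i.name for i in indices if overlaps(i, query_start, query_end)]


INDICES = [
    BackingIndex("metrics-000001", 0, 100),
    BackingIndex("metrics-000002", 100, 200),
    BackingIndex("metrics-000003", 200, 300),
]

print(prune(INDICES, 150, 220))  # ['metrics-000002', 'metrics-000003']
print(prune(INDICES, 100, 200))  # ['metrics-000002']
```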
Misc
- Ensure ordinal builder emit ordinal blocks #127949
- Load the first seen value only for last_over_time
- Lossy summation: Use lossy summation for time-series aggregations #132625
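As background for the lossy-summation item, the sketch below contrasts plain floating-point addition with compensated (Kahan) summation. It assumes that "lossy" here means dropping the compensation step in exchange for less work per value; the issue itself does not spell this out, and the function names are illustrative only.

```python
import math


def lossy_sum(values):
    """Plain left-to-right addition: one add per value, rounding error accumulates."""
    total = 0.0
    for v in values:
        total += v
    return total


def compensated_sum(values):
    """Kahan summation: tracks the rounding error of each add and feeds it back."""
    total = 0.0
    compensation = 0.0
    for v in values:
        adjusted = v - compensation
        new_total = total + adjusted
        # (new_total - total) is what was actually added; the rest was lost.
        compensation = (new_total - total) - adjusted
        total = new_total
    return total


samples = [0.1] * 1_000_000
exact = math.fsum(samples)  # correctly rounded reference from the stdlib

lossy_error = abs(lossy_sum(samples) - exact)
compensated_error = abs(compensated_sum(samples) - exact)
print(lossy_error, compensated_error)
```

The compensated variant does several extra floating-point operations per value, which is the cost a lossy sum avoids when a small accumulated error is acceptable.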
Migrated from 105397 and to be considered
- Add support for a sparse index to easily navigate the documents of a time series (Sparse index for tsdb #95701). This is required for determining the last value of a metric and skipping to the last value of the next time series, as well as for other functionality such as interpolation and geofencing. Additionally, a query may be too selective and mask documents that are valid metrics of a time series. A sparse index would allow us to access those metrics even in that case.
- Enhance the time-series grouping operator to also group by time series and time interval. This is the typical use case, and it is what happens when the BUCKET syntax is used.