Member

@dnhatn dnhatn commented Aug 21, 2025

One of the slowest parts of time-series queries is reading metric values: it accounts for 30% of the profiled time for the following query:

TS my*
| WHERE `metrics.system.memory.utilization` IS NOT NULL
        AND @timestamp >= "2025-07-25T14:55:59.000Z"
        AND @timestamp <= "2025-07-25T16:25:59.000Z"
| STATS AVG(AVG_OVER_TIME(`metrics.system.memory.utilization`)) BY host.name, BUCKET(@timestamp, 1h)

This is because the metrics.system.memory.utilization field is sparse, so reading its values requires iterating over its DISI (Lucene's doc ID set iterator) to find the value index for each document. This change adds a flag named nullsFiltered to the column reader, signaling that every target doc has a value for the field. That enables optimizations such as skipping the DISI value-index lookups and copying values in bulk.

We can safely do this because if the filter WHERE metrics.system.memory.utilization IS NOT NULL is pushed down to Lucene, then every document returned from the Lucene operator will have a value for the metrics.system.memory.utilization field.
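
To illustrate why the flag helps, here is a minimal sketch of a sparse column with a slow per-doc lookup and a bulk-copy fast path. It is not the Elasticsearch or Lucene API: `SparseColumnSketch`, its fields, and the use of NaN for missing values are all assumptions made for the example, and the real codec-level optimization may work differently.

```java
import java.util.Arrays;

/** Hypothetical sketch of a sparse numeric column; not the real column reader. */
final class SparseColumnSketch {
    // Doc IDs that have a value, in increasing order (stands in for the DISI).
    private final int[] docsWithValue;
    // values[i] is the value of document docsWithValue[i].
    private final double[] values;

    SparseColumnSketch(int[] docsWithValue, double[] values) {
        this.docsWithValue = docsWithValue;
        this.values = values;
    }

    /**
     * Reads values for the given sorted, distinct doc IDs.
     * Missing values are returned as NaN in this sketch.
     */
    double[] read(int[] docs, boolean nullsFiltered) {
        if (docs.length == 0) {
            return new double[0];
        }
        if (nullsFiltered) {
            // Every target doc is known to have a value, so two lookups
            // are enough to find the range of value indices.
            int first = Arrays.binarySearch(docsWithValue, docs[0]);
            int last = Arrays.binarySearch(docsWithValue, docs[docs.length - 1]);
            if (first < 0 || last < 0) {
                throw new IllegalStateException("nullsFiltered was set but a doc has no value");
            }
            if (last - first + 1 == docs.length) {
                // The targets are exactly a contiguous run of values: bulk copy.
                return Arrays.copyOfRange(values, first, last + 1);
            }
        }
        // Slow path: one index lookup per target doc.
        double[] out = new double[docs.length];
        for (int i = 0; i < docs.length; i++) {
            int idx = Arrays.binarySearch(docsWithValue, docs[i]);
            out[i] = idx >= 0 ? values[idx] : Double.NaN;
        }
        return out;
    }
}
```

With the flag set, the common case of a contiguous run of matching docs needs two lookups and one array copy instead of one lookup per document. Without the flag, the reader has to check every doc because any of them might be missing a value.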

I was able to make changes that reduce the execution time for reading sparse metric values to be comparable with reading dense fields (like timestamp). To keep this PR small, I will open the codec-related changes in a separate PR.

@dnhatn dnhatn force-pushed the reading-dense-values branch from 882b679 to 1a5aecb Compare August 21, 2025 06:11
Member Author

@dnhatn dnhatn left a comment

To the reviewers: apart from the places I have commented on, most of the changes just add a new parameter to the column reader.

/**
* Returns the set of fields that are guaranteed to be dense after the source query.
*/
static Set<String> nullsFilteredFieldsAfterSourceQuery(QueryBuilder sourceQuery) {
Member Author

Here, we extract the fields from the query that can be considered nullsFiltered when reading values. The nullsFiltered flag is then passed to each FieldInfo.

Contributor

Is this done only for TS or for FROM too? For the latter, I wonder if ignoring nulls is desired.

Member Author

It should work for both, but for FROM it will be a no-op. The optimizations are enabled at the codec level and are currently only possible with the TSDB codec.

);
}
if (fields[field].info.nullsFiltered() && block.mayHaveNulls()) {
assert IntStream.range(0, block.getPositionCount()).noneMatch(block::isNull)
Member Author

This can be expensive, so it is only enabled with assertions.
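
A minimal sketch of that pattern follows, assuming a plain `boolean[]` as a stand-in for the block's per-position null flags. `NullCheckSketch`, `noNulls`, and `verify` are hypothetical names for illustration, not code from this PR. The point is that the expression after `assert` is evaluated only when the JVM runs with assertions enabled (`-ea`), so the O(n) scan costs nothing in production.

```java
import java.util.stream.IntStream;

/** Hypothetical sketch of an assertion-only null check; not the real operator code. */
final class NullCheckSketch {
    /** O(n) scan: returns true if no position is null. */
    static boolean noNulls(boolean[] isNull) {
        return IntStream.range(0, isNull.length).noneMatch(i -> isNull[i]);
    }

    /** Verifies the nullsFiltered promise, but only when assertions are enabled. */
    static void verify(boolean nullsFiltered, boolean[] isNull) {
        if (nullsFiltered) {
            // Skipped entirely (including the scan) unless the JVM runs with -ea.
            assert noNulls(isNull) : "nullsFiltered field produced a null value";
        }
    }
}
```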

}
for (ColumnAtATimeWork r : columnAtATimeReaders) {
-    target[r.idx] = (Block) r.reader.read(loaderBlockFactory, docs, offset);
+    target[r.idx] = (Block) r.reader.read(loaderBlockFactory, docs, offset, operator.fields[r.idx].info.nullsFiltered());
Member Author

We pass the nullsFiltered flag from FieldInfo to the column reader.

@dnhatn dnhatn marked this pull request as ready for review August 21, 2025 06:22
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine labels Aug 21, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Member Author

dnhatn commented Aug 21, 2025

[Image: reading-doubles]

case TermsQueryBuilder q -> Set.of(q.fieldName());
case RangeQueryBuilder q -> Set.of(q.fieldName());
case ConstantScoreQueryBuilder q -> nullsFilteredFieldsAfterSourceQuery(q.innerQuery());
// TODO: support SingleValueQuery
Contributor

Should we also find and ignore COALESCE? Or is this part of the default?

Contributor

@kkrik-es kkrik-es left a comment

Looks good!

One note: what happens if we have multiple metric fields? For instance:

TS metrics 
| WHERE cpu_usage IS NOT NULL OR memory_usage IS NOT NULL
| STATS max(avg_over_time(cpu_usage)), max(avg_over_time(memory_usage)) BY tbucket(1 hour)

I'd think we don't necessarily get dense values here? If so, is this covered?

Member Author

dnhatn commented Aug 21, 2025

TS metrics 
| WHERE cpu_usage IS NOT NULL OR memory_usage IS NOT NULL
| STATS max(avg_over_time(cpu_usage)), max(avg_over_time(memory_usage)) BY tbucket(1 hour)

We won't be able to infer nullsFiltered from that query for either cpu_usage or memory_usage. Yes, we have tests for this case.
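
The inference discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: `Exists`, `Range`, `And`, and `Or` are hypothetical stand-ins for the real query builders, and the rule for OR (keep only the fields guaranteed by every branch) is an assumption of this sketch; the actual method may simply infer nothing for disjunctions. It uses `instanceof` patterns instead of the pattern-matching `switch` in the PR so that it compiles on older JDKs.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of inferring which fields are non-null after a filter. */
final class NullsFilteredInferenceSketch {
    interface Query {}
    record Exists(String field) implements Query {}        // field IS NOT NULL
    record Range(String field) implements Query {}         // a range filter on the field
    record And(List<Query> clauses) implements Query {}
    record Or(List<Query> clauses) implements Query {}

    static Set<String> nullsFilteredFields(Query query) {
        if (query instanceof Exists q) {
            return Set.of(q.field());
        }
        if (query instanceof Range q) {
            // A doc can only match a range on a field if it has a value for it.
            return Set.of(q.field());
        }
        if (query instanceof And q) {
            // Every clause must match, so each clause's guarantees hold.
            Set<String> union = new HashSet<>();
            for (Query clause : q.clauses()) {
                union.addAll(nullsFilteredFields(clause));
            }
            return union;
        }
        if (query instanceof Or q) {
            // Only one clause needs to match, so keep a field only if
            // every clause guarantees it.
            Set<String> common = null;
            for (Query clause : q.clauses()) {
                Set<String> fields = nullsFilteredFields(clause);
                if (common == null) {
                    common = new HashSet<>(fields);
                } else {
                    common.retainAll(fields);
                }
            }
            return common == null ? Set.of() : common;
        }
        // Unknown query types guarantee nothing.
        return Set.of();
    }
}
```

For the query above, the filter corresponds to `Or(Exists("cpu_usage"), Exists("memory_usage"))`, and the sketch returns an empty set: a matching document may have only one of the two fields, so neither can be read as dense.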

Member Author

dnhatn commented Aug 21, 2025

Thanks Kostas!

@dnhatn dnhatn merged commit 971c87d into elastic:main Aug 21, 2025
35 checks passed
@dnhatn dnhatn deleted the reading-dense-values branch August 21, 2025 21:37
Labels

:Analytics/ES|QL AKA ESQL >non-issue :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine v9.2.0
