@dnhatn
Member

@dnhatn dnhatn commented Apr 19, 2025

This query against the TSDB track took 50 seconds before this change and takes 19 seconds with it.

TS tsdb 
| STATS sum(rate(kubernetes.container.memory.pagefaults)) by bucket(@timestamp, 5minute)

This change introduces several optimizations to improve the performance of the time-series source operator:

  • Split the leaf queue into two: one for _tsid and another for @timestamp. This avoids repeatedly comparing large _tsid values while iterating over a single _tsid.
  • Track the number of emitted documents per segment and use this data to build forward and backward document maps, reducing the need for expensive sorts.
  • Use ordinal blocks to avoid duplicating the same _tsid multiple times.
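
To illustrate the first point, here is a minimal sketch of a two-level merge. It is not the code from this PR: `SplitLeafQueueSketch`, `Doc`, `Leaf`, and `merge` are hypothetical names, `_tsid` is modelled as a plain `String`, and each leaf is assumed to be pre-sorted by `_tsid` ascending and `@timestamp` descending. The outer queue compares `_tsid` values only when the merge moves to a new time series; while a single `_tsid` is being drained, the inner queue orders leaves by a cheap `long` comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class SplitLeafQueueSketch {
    /** Hypothetical stand-in for a document: its _tsid and @timestamp. */
    public record Doc(String tsid, long timestamp) {}

    /** Hypothetical stand-in for one segment; docs are assumed sorted by (_tsid asc, @timestamp desc). */
    public static final class Leaf {
        private final List<Doc> docs;
        private int pos = 0;

        public Leaf(List<Doc> docs) {
            this.docs = docs;
        }

        boolean exhausted() {
            return pos >= docs.size();
        }

        Doc current() {
            return docs.get(pos);
        }

        void advance() {
            pos++;
        }
    }

    /** Merges the leaves into one stream ordered by (_tsid asc, @timestamp desc). */
    public static List<Doc> merge(List<Leaf> leaves) {
        // Outer queue: orders leaves by the (potentially large) _tsid only.
        PriorityQueue<Leaf> byTsid = new PriorityQueue<>(
            Comparator.comparing((Leaf leaf) -> leaf.current().tsid()));
        // Inner queue: holds only leaves positioned on the current _tsid,
        // ordered by timestamp, newest first.
        PriorityQueue<Leaf> byTimestamp = new PriorityQueue<>(
            Comparator.comparingLong((Leaf leaf) -> leaf.current().timestamp()).reversed());

        for (Leaf leaf : leaves) {
            if (!leaf.exhausted()) {
                byTsid.add(leaf);
            }
        }

        List<Doc> out = new ArrayList<>();
        while (!byTsid.isEmpty()) {
            String tsid = byTsid.peek().current().tsid();
            // Move every leaf positioned on this _tsid into the inner queue.
            while (!byTsid.isEmpty() && byTsid.peek().current().tsid().equals(tsid)) {
                byTimestamp.add(byTsid.poll());
            }
            // Drain the current _tsid using timestamp comparisons only.
            while (!byTimestamp.isEmpty()) {
                Leaf leaf = byTimestamp.poll();
                out.add(leaf.current());
                leaf.advance();
                if (leaf.exhausted()) {
                    continue;
                }
                // A real implementation could detect this boundary with per-segment
                // ordinals; the sketch uses equals() for simplicity.
                if (leaf.current().tsid().equals(tsid)) {
                    byTimestamp.add(leaf);
                } else {
                    byTsid.add(leaf);
                }
            }
        }
        return out;
    }
}
```

A leaf is always polled from its queue before it is advanced and re-added, so the queue never holds an element whose sort key has changed underneath it.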

@dnhatn dnhatn force-pushed the time-series-source branch from d4f7e9b to 57ab327 Compare April 20, 2025 01:03
@dnhatn dnhatn requested review from kkrik-es and martijnvg April 20, 2025 01:03
@dnhatn dnhatn marked this pull request as ready for review April 20, 2025 01:03
@elasticsearchmachine elasticsearchmachine added the Team:Analytics and Team:StorageEngine labels Apr 20, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@dnhatn dnhatn requested a review from kkrik-es April 21, 2025 16:55
Contributor

@kkrik-es kkrik-es left a comment

Nice, a few nits and questions about further improvements. Let's also have Martijn double-check the Lucene part.

Member

@martijnvg martijnvg left a comment

LGTM, great job Nhat 👍

@dnhatn
Member Author

dnhatn commented Apr 22, 2025

@kkrik-es @martijnvg Thanks for reviewing.

@dnhatn dnhatn merged commit 4f506d4 into elastic:main Apr 22, 2025
17 checks passed
@dnhatn dnhatn deleted the time-series-source branch April 22, 2025 21:32
@dnhatn dnhatn mentioned this pull request Apr 27, 2025
28 tasks

Labels

:Analytics/ES|QL (AKA ESQL); >non-issue; :StorageEngine/TSDB (You know, for Metrics); Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)); Team:StorageEngine; v9.1.0

4 participants