Skip to content

Conversation

@martijnvg
Copy link
Member

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).

@martijnvg martijnvg added >bug auto-backport Automatically create backport pull requests when merged :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data v8.18.1 v8.19.0 v9.0.1 v9.1.0 labels Mar 9, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Copy link
Collaborator

Hi @martijnvg, I've created a changelog YAML for you.

Copy link
Contributor

@kkrik-es kkrik-es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat

@martijnvg martijnvg changed the title Improve downsample dimension processing. Avoid reading unnecessary dimension values when downsampling Mar 10, 2025
@martijnvg martijnvg enabled auto-merge (squash) March 10, 2025 10:40
@martijnvg martijnvg merged commit 6afd3ec into elastic:main Mar 10, 2025
16 of 17 checks passed
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Mar 10, 2025
…#124451)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Mar 10, 2025
…#124451)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.18
8.x
9.0

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Mar 10, 2025
…#124451)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
elasticsearchmachine pushed a commit that referenced this pull request Mar 10, 2025
#124470)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
elasticsearchmachine pushed a commit that referenced this pull request Mar 10, 2025
#124468)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
elasticsearchmachine pushed a commit that referenced this pull request Mar 10, 2025
#124469)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Mar 11, 2025
…#124451)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Mar 13, 2025
…#124451)

Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always to same and this avoids unnecessary reads.

Latency of downsampling the tsdb track index into a 1 hour interval downsample index drop by ~16% (running on my local machine).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged >bug :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data Team:StorageEngine v8.18.1 v8.19.0 v9.0.1 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants