TSDB ingest performance: combine routing and tsdb hashing #132566
Merged
Fall back to index.routing_path if the dimensions can't be identified by a simple path match
Collaborator
Pinging @elastic/es-storage-engine (Team:StorageEngine)
Contributor
henningandersen
left a comment
A few more comments, sorry for just missing the merge time.
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
...ata-streams/src/main/java/org/elasticsearch/datastreams/DataStreamIndexSettingsProvider.java
...ata-streams/src/main/java/org/elasticsearch/datastreams/DataStreamIndexSettingsProvider.java
server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
szybia
added a commit
to szybia/elasticsearch
that referenced
this pull request
Sep 22, 2025
* upstream/main: (50 commits) Disable utf-8 parsing optimization (elastic#135172) rest-api-spec: fix master_timeout typo (elastic#135167) Fixes countDistinctWithConditions in csv-spec tests (elastic#135097) Fix test failure by checking for feature flag (elastic#135174) Fix deadlock in ThreadPoolMergeScheduler when a failing merge closes the IndexWriter (elastic#134656) Make SecureString comparisons constant time (elastic#135053) Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search/160_exists_query/Test exists query on mapped geo_point field with no doc values} elastic#135164 ESQL: Replace function count tests (elastic#134951) Mute org.elasticsearch.compute.aggregation.SampleBooleanAggregatorFunctionTests testSimpleWithCranky elastic#135163 Mute org.elasticsearch.xpack.test.rest.XPackRestIT test {p0=analytics/nested_top_metrics_sort/terms order by top metrics numeric not null integer values} elastic#135162 Mute org.elasticsearch.xpack.test.rest.XPackRestIT test {p0=analytics/nested_top_metrics_sort/terms order by top metrics numeric not null double values} elastic#135159 TSDB ingest performance: combine routing and tsdb hashing (elastic#132566) Mute org.elasticsearch.compute.aggregation.SampleBytesRefAggregatorFunctionTests testSimpleWithCranky elastic#135157 Mute org.elasticsearch.xpack.logsdb.qa.BulkStoredSourceChallengeRestIT testHistogramAggregation elastic#135156 Mute org.elasticsearch.xpack.logsdb.qa.StandardVersusStandardReindexedIntoLogsDbChallengeRestIT testHistogramAggregation elastic#135155 Mute org.elasticsearch.xpack.logsdb.qa.LogsDbVersusLogsDbReindexedIntoStandardModeChallengeRestIT testHistogramAggregation elastic#135154 Mute org.elasticsearch.xpack.logsdb.qa.BulkChallengeRestIT testHistogramAggregation elastic#135153 Mute org.elasticsearch.discovery.ClusterDisruptionIT testAckedIndexing elastic#117024 Mute org.elasticsearch.lucene.RollingUpgradeSearchableSnapshotIndexCompatibilityIT testMountSearchableSnapshot {p0=[9.2.0, 
9.2.0, 9.2.0]} elastic#135151 Mute org.elasticsearch.lucene.RollingUpgradeSearchableSnapshotIndexCompatibilityIT testSearchableSnapshotUpgrade {p0=[9.2.0, 9.2.0, 9.2.0]} elastic#135150 ...
felixbarny
added a commit
to felixbarny/elasticsearch
that referenced
this pull request
Sep 22, 2025
With implementations IndexRouting.ExtractFromSource.ForRoutingPath and IndexRouting.ExtractFromSource.ForIndexDimensions. This addresses review comments from elastic#132566.
This was referenced Sep 22, 2025
gmjehovich
pushed a commit
to gmjehovich/elasticsearch
that referenced
this pull request
Sep 22, 2025
…2566) Instead of hashing dimensions during routing and then again during document parsing, this combines the two steps. The tsid is created during routing and then used to create a routing hash. The tsid is then sent to the data nodes which acts as a signal that creating the tsid during document parsing isn't required anymore.
DonalEvans
pushed a commit
to DonalEvans/elasticsearch
that referenced
this pull request
Sep 22, 2025
…2566) Instead of hashing dimensions during routing and then again during document parsing, this combines the two steps. The tsid is created during routing and then used to create a routing hash. The tsid is then sent to the data nodes which acts as a signal that creating the tsid during document parsing isn't required anymore.
felixbarny
added a commit
to felixbarny/elasticsearch
that referenced
this pull request
Sep 23, 2025
In elastic#133232, we've added the ability to provide index metadata with an IndexSettingProvider. It turned out that we don't need that functionality as we ended up using a private index setting in elastic#132566. This also adds the `IndexVersion` as another parameter. This is in preparation for [this](elastic#132566 (comment)) suggestion to conditionally set one or another setting, depending on the index version.
felixbarny
added a commit
that referenced
this pull request
Sep 24, 2025
In #133232, we've added the ability to provide index metadata with an IndexSettingProvider. It turned out that we don't need that functionality as we ended up using a private index setting in #132566. This also adds the `IndexVersion` as another parameter. This is in preparation for [this](#132566 (comment)) suggestion to conditionally set one or another setting, depending on the index version. `IndexSettingProvider`s are now disallowed from providing the `index.version.created` setting. Otherwise, they can't rely on the `IndexVersion` they receive to be the one that will be actually used for the created index as another provider may change it.
felixbarny
added a commit
that referenced
this pull request
Sep 25, 2025
) With implementations IndexRouting.ExtractFromSource.ForRoutingPath and IndexRouting.ExtractFromSource.ForIndexDimensions. This addresses review comments from #132566. Also fixes cases where the tsid is not provided by the coordinating node, such as for translog operations.
This was referenced Sep 25, 2025
Labels
external-contributor
Pull request authored by a developer outside the Elasticsearch team
>non-issue
serverless-linked
Added by automation, don't add manually
:StorageEngine/TSDB
You know, for Metrics
Team:StorageEngine
v9.2.0
Instead of hashing dimensions during routing and then again during document parsing, this combines the two steps. The tsid is created during routing and then used to create a routing hash. The tsid is then sent to the data nodes, where its presence signals that creating the tsid during document parsing is no longer required.
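The flow described above can be illustrated with a rough, self-contained sketch. It is not the code from this PR: the class `CombinedTsidRoutingSketch` and its methods are hypothetical names, SHA-256 stands in for whatever `TsidBuilder` actually computes, and the shard selection is simplified to a modulo. The point is only that the dimensions are hashed once on the coordinating side, the routing hash is derived from that result, and the data node reuses the tsid it receives, building it itself only when none was provided.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names and hashing scheme are assumptions, not Elasticsearch APIs.
final class CombinedTsidRoutingSketch {

    // Coordinating side: hash the dimensions exactly once into a stand-in "tsid".
    static byte[] buildTsid(Map<String, String> dimensions) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // Sort by field name so the same dimensions always produce the same bytes.
            for (Map.Entry<String, String> e : new TreeMap<>(dimensions).entrySet()) {
                digest.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator between name and value
                digest.update(e.getValue().getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator between entries
            }
            return digest.digest();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", ex);
        }
    }

    // Coordinating side: derive the routing decision from the tsid instead of re-hashing dimensions.
    static int shardFor(byte[] tsid, int numberOfShards) {
        return Math.floorMod(Arrays.hashCode(tsid), numberOfShards);
    }

    // Data node side: a tsid that arrives with the request is reused as-is;
    // it is only built locally when the coordinating node did not provide one.
    static byte[] tsidOnDataNode(byte[] tsidFromCoordinator, Map<String, String> dimensions) {
        return tsidFromCoordinator != null ? tsidFromCoordinator : buildTsid(dimensions);
    }

    public static void main(String[] args) {
        Map<String, String> dims = Map.of("host.name", "web-1", "pod", "p1");
        byte[] tsid = buildTsid(dims);
        System.out.println("shard = " + shardFor(tsid, 4));
        System.out.println("reused = " + (tsidOnDataNode(tsid, dims) == tsid));
    }
}
```

In this sketch the null check on the data node plays the role of the "signal" mentioned above: a request that already carries a tsid skips the second hashing pass.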
Instead of populating the `index.routing_path` setting, which can differ from the document dimensions, this now populates a new `index.dimensions` index setting containing all dimensions. This setting isn't user-configurable (todo). If users manually set `index.routing_path`, the new optimization doesn't kick in, so routing and tsid creation work as before. Additionally, if the dimension fields can't be expressed as a simple set of path matches (for example, when using a dynamic template with a `match_mapping_type` that sets `time_series_dimension: true`), it falls back to populating `index.routing_path`.

As an additional benefit, the new `_tsid`s are shorter, which may have benefits at query time. While they're shorter, they still retain the main properties: clustering similar time series together (which helps compression) and making collisions very unlikely. More details are in the JavaDoc of `TsidBuilder`. In fact, based on my testing, the compression is even a bit better after this change.

Remaining issues to work out:
- `index.dimensions` a private setting
- `index.dimensions` when adding a new dimension field to the mappings.
- `index.dimensions` so that the coordinating node always knows which paths will be considered dimensions.

Sub-PRs