Change index.mapping.use_doc_values_skipper default setting to true
#134221
Conversation
Pinging @elastic/es-storage-engine (Team:StorageEngine)

Hi @romseygeek, I've created a changelog YAML for you.
martijnvg left a comment:
Thanks Alan, LGTM!
I think doc value skippers will now also be enabled for the _tsid field? (in snapshot versions) (If I understand TimeSeriesIdFieldMapper#getInstance(...) correctly)
docs/changelog/134221.yaml
Outdated
@@ -0,0 +1,5 @@
pr: 134221
Let's label this as a non-issue? Because released versions will not run with doc value skippers enabled. And when the feature flag gets removed, we can then label it as a feature or enhancement.
Yes, that looks right. It doesn't remove the backing index for this field though, it only enables the skipper.

Yes, the
…ers' into benchmark/main-enabled-skippers
assertThat(dataProfiles, hasSize(1));
List<OperatorStatus> ops = dataProfiles.get(0).operators();
- assertThat(ops, hasSize(5));
+ assertThat(ops, hasSize(6));
This change adds an extra ValuesSourceReaderOperator step into this ESQL query - does that sound right to you, @martijnvg?
Do you know which field(s) both of the value source reader operators are for?
The new one is for @timestamp:
{"operator":"ValuesSourceReaderOperator[fields = [@timestamp]]","status":{"readers_built":{"@timestamp:column_at_a_time:BlockDocValuesReader.SingletonLongs":3},"values_loaded":98,"process_nanos":887208,"pages_received":3,"pages_emitted":3,"rows_received":98,"rows_emitted":98}}
The other is for _tsid, cpu and cluster and is unchanged from before:
{"operator":"ValuesSourceReaderOperator[fields = [_tsid, cpu, cluster]]","status":{"readers_built":{"_tsid:column_at_a_time:BlockDocValuesReader.SingletonOrdinals":3,"cluster:column_at_a_time:BlockDocValuesReader.SingletonOrdinals":3,"cpu:column_at_a_time:BlockDocValuesReader.SingletonDoubles":3},"values_loaded":294,"process_nanos":1226959,"pages_received":3,"pages_emitted":3,"rows_received":98,"rows_emitted":98}}
Can you check whether the LuceneSourceOperator pushes down the buckets as filters? For example, do you see IndexOrDocValuesQuery(indexQuery=@timestamp:[1713139320000 TO 1713139379999], dvQuery=@timestamp:[1713139320000 TO 1713139379999]) in the processedQueries of the LuceneSourceOperator status?
Maybe this change results in the compute engine no longer pushing down filters?
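For reference, a minimal Lucene sketch of what that pushed-down bucket filter looks like when @timestamp is indexed with both points and doc values; the timestamps come from the example above, and this is illustrative rather than Elasticsearch source:

```java
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.SortedNumericDocValuesField;
import org.apache.lucene.search.IndexOrDocValuesQuery;
import org.apache.lucene.search.Query;

// One time-bucket range filter: the points-based query can drive iteration, the
// doc-values query can verify matches, and Lucene picks whichever is cheaper.
Query indexQuery = LongPoint.newRangeQuery("@timestamp", 1713139320000L, 1713139379999L);
Query dvQuery = SortedNumericDocValuesField.newSlowRangeQuery("@timestamp", 1713139320000L, 1713139379999L);
Query pushedDown = new IndexOrDocValuesQuery(indexQuery, dvQuery);
```

With the skipper enabled there is no points index for @timestamp, so a filter like this would not be expected in processedQueries, which matches what is found further down.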
With this change, the field caps API will no longer report the @timestamp field as indexed, which I think is why the query plan is different, with this as a result.
I don't think this blocks merging this PR. Maybe we need to think about marking a field as indexed if it has doc value skippers and is part of the index sort, before we release the feature flag?
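As a quick way to verify that, a hedged sketch using the transport-level field caps request; the index pattern, the "date" type key, and the Client handle are assumptions, and package locations vary by version:

```java
import java.util.Map;
import org.elasticsearch.action.fieldcaps.FieldCapabilities;
import org.elasticsearch.action.fieldcaps.FieldCapabilitiesRequest;
import org.elasticsearch.action.fieldcaps.FieldCapabilitiesResponse;
import org.elasticsearch.client.internal.Client;  // package location varies by version

// Hedged sketch: does field caps still report @timestamp as searchable/indexed?
static boolean timestampReportedAsSearchable(Client client) {
    FieldCapabilitiesRequest request = new FieldCapabilitiesRequest()
        .indices("metrics-*")   // assumed index pattern
        .fields("@timestamp");
    FieldCapabilitiesResponse response = client.fieldCaps(request).actionGet();
    // response.get() maps field name -> (field type -> capabilities); "date" is assumed.
    Map<String, FieldCapabilities> byType = response.get().get("@timestamp");
    FieldCapabilities caps = byType == null ? null : byType.get("date");
    return caps != null && caps.isSearchable();
}
```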
I checked LuceneSourceOperator with this change and there aren't any IndexOrDocValuesQuery instances in the processed queries.
So I think this is good. Let's merge it when CI is green.
…ers' into benchmark/main-enabled-skippers
Buildkite benchmark this with tsdb please

Trying a benchmark again after I've merged in

Buildkite benchmark this with elastic-logs-logsdb please
💚 Build Succeeded
This build ran two elastic-logs-logsdb benchmarks to evaluate the performance impact of this PR.
cc @romseygeek
TSDB benchmark comparison: 12% reduction in index size, and between a 10-50% penalty on query performance. The odd one is the throughput on

12% is a nice reduction, but I do wonder if we would actually get better bang for our buck by looking at other fields. The

We chatted about this field before and it looks like this field should not have a points index to begin with, for the reason you mentioned. In order to achieve this, our builtin stack templates should map this with

I also wonder how much here is noise. If I look at

Some things I'd like to note:
…ers' into benchmark/main-enabled-skippers
I ran the TSDB metricsgen track and got encouraging results. The highlights are:
Some more details on the benchmark:
Click to expand full rally compare...
Very encouraging! The much smaller time range buckets being slower is expected, I guess, as they won't be able to make use of the sparse indexes as efficiently as larger ranges. @martijnvg @kkrik-es I think this performance is close enough for us to merge and see if we can improve the small time range bucket performance in follow-ups. WDYT?

Indeed, very promising! Let's get it in, behind a feature flag, to further assess without affecting serverless.
martijnvg left a comment:
🚢
The index.mapping.use_doc_values_skipper index setting only has an effect today if the doc_values_skipper feature flag is enabled. So the PR as-is is good!
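For illustration, a minimal sketch of that gating with assumed names rather than the actual Elasticsearch source: the setting's default follows the feature flag, so released builds keep the old behaviour while snapshot/flagged builds pick up the new default:

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.util.FeatureFlag;

// Illustrative only: an index-scoped boolean setting whose default is tied to a
// feature flag, in the spirit of index.mapping.use_doc_values_skipper being gated
// behind the doc_values_skipper flag.
public final class DocValuesSkipperSettings {
    public static final FeatureFlag DOC_VALUES_SKIPPER_FLAG = new FeatureFlag("doc_values_skipper");

    public static final Setting<Boolean> USE_DOC_VALUES_SKIPPER = Setting.boolSetting(
        "index.mapping.use_doc_values_skipper",
        DOC_VALUES_SKIPPER_FLAG.isEnabled(),   // default true only when the flag is on
        Setting.Property.IndexScope,
        Setting.Property.Final
    );

    private DocValuesSkipperSettings() {}
}
```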
@timestamp and host.name fields in LogsDB
index.mapping.use_doc_values_skipper default setting to true
…e` (elastic#134221) TSDB and LogsDB indexes with this setting set to true do not use backing indexes for `host.name`, `_tsid` or `@timestamp` fields, and instead enable sparse indexes on their doc_values fields. This was originally added and set to default to `true` a while back, but was then changed to default to `false` when performance suffered. Changes to lucene, and the addition of filter-by-filter optimizations on skipper queries, mean that performance looks more acceptable now. This is still gated behind a feature flag so that we can measure the behaviour in nightly benchmarks before enabling for public use.
TSDB and LogsDB indexes with this setting set to `true` do not use backing indexes for `host.name`, `_tsid` or `@timestamp` fields, and instead enable sparse indexes on their doc_values fields.

This was originally added and set to `true` a while back, but was then changed to `false` when performance suffered. Changes to lucene, and the addition of filter-by-filter optimizations on skipper queries, mean that performance looks more acceptable now.

This is still gated behind a feature flag so that we can measure the behaviour in nightly benchmarks before enabling for public use.
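For completeness, a hedged sketch of explicitly enabling the setting on a LogsDB index via the transport-level client; the index name and the Client wiring are assumptions, and the setting only has an effect when the doc_values_skipper feature flag is enabled:

```java
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.client.internal.Client;   // package location varies by version
import org.elasticsearch.common.settings.Settings;

// Hypothetical usage sketch: create a LogsDB index that opts into doc-values skippers
// explicitly rather than relying on the new default. No effect unless the
// doc_values_skipper feature flag is enabled on the node.
static void createLogsdbIndexWithSkippers(Client client) {
    Settings settings = Settings.builder()
        .put("index.mode", "logsdb")
        .put("index.mapping.use_doc_values_skipper", true)
        .build();
    client.admin()
        .indices()
        .create(new CreateIndexRequest("logs-example").settings(settings))  // index name is made up
        .actionGet();
}
```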