Enable a doc values sparse index on the timestamp field #121673
    final IndexSortConfig indexSortConfig,
    final boolean hasDocValues
) {
    if (index.isConfigured()) {
This does not work when index: true is set explicitly, because isSet is true but getValue() == getDefaultValue(). As in the other PR, I think the implementation of isConfigured is not correct.
In case of @timestamp for isConfigured we have two possibilities:
isConfigured() is true, which means the user explicitly set the index value to true or false. In such case, no matter the value of index, the sparse doc values index should be disabled, and we would use the inverted index.
isConfigured() is false: this means the index parameter is not set by default, which means, for LogsDB, under certain conditions, we can use the sparse index instead of the inverted index.
Maybe we should then just check for isSet()?
If we use isSet, then the other test fails, for instance testFieldTypeWithSkipDocValues_LogsDBMode.
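To make the isSet() versus isConfigured() distinction in this thread concrete, here is a minimal sketch. ParamSketch is a hypothetical stand-in, not the real mapper parameter class, and the isConfigured() semantics (explicitly set and different from the default) are an assumption inferred from the first review comment above.

```java
import java.util.Objects;

// Hypothetical stand-in for a mapper parameter; NOT the real Elasticsearch class.
final class ParamSketch<T> {
    private final T defaultValue;
    private T value;
    private boolean isSet = false;

    ParamSketch(T defaultValue) {
        this.defaultValue = defaultValue;
        this.value = defaultValue;
    }

    // Called when the mapping mentions the parameter explicitly.
    void setValue(T newValue) {
        this.value = newValue;
        this.isSet = true;
    }

    T getValue() {
        return value;
    }

    T getDefaultValue() {
        return defaultValue;
    }

    // True whenever the user wrote the parameter, whatever the value.
    boolean isSet() {
        return isSet;
    }

    // Assumed semantics: explicitly set AND different from the default.
    boolean isConfigured() {
        return isSet && Objects.equals(value, defaultValue) == false;
    }
}
```

Under these assumptions, a parameter whose default is true and that the user explicitly sets to true reports isSet() == true but isConfigured() == false, which is the case the first comment says is not handled.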
configuredSettings.remove("meta");
configuredSettings.remove("format");
configuredSettings.remove("locale");
if (isIndexParamExplicitOverrideAllowed()) {
This is why I had to add the indexSettings
    && IndexMode.LOGSDB.equals(indexMode)
    && hasDocValues
    && indexSortConfig != null
    && indexSortConfig.hasPrimarySortOnField(DataStreamTimestampFieldMapper.DEFAULT_PATH)
I think we should also allow this when the timestamp is the secondary sort?
Yeah I left it like that for the moment because that was not really an issue with the tests I have at the moment. Anyway I will replace that method with something like isSortedOnTimestamp.
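The combined check discussed in this thread could look roughly like the sketch below. It is a simplified, self-contained model: SparseIndexSketch, its IndexMode enum, the list-of-field-names representation of the index sort, and the body of isSortedOnTimestamp are all hypothetical. Only the individual conditions (LogsDB mode, doc values enabled, index sorted on @timestamp, index parameter not explicitly configured) come from the diff and the comments.

```java
import java.util.List;

// Hypothetical, simplified model of the checks discussed in this thread.
final class SparseIndexSketch {
    enum IndexMode { STANDARD, LOGSDB, TIME_SERIES }

    static final String TIMESTAMP = "@timestamp";

    // Hypothetical helper: accepts the timestamp as primary OR secondary sort field.
    static boolean isSortedOnTimestamp(List<String> sortFields) {
        if (sortFields == null) {
            return false; // no index sort configured
        }
        int position = sortFields.indexOf(TIMESTAMP);
        return position == 0 || position == 1;
    }

    static boolean useSparseIndex(
        boolean indexExplicitlyConfigured,
        IndexMode indexMode,
        boolean hasDocValues,
        List<String> sortFields
    ) {
        if (indexExplicitlyConfigured) {
            return false; // the user chose a value for index; keep the inverted index
        }
        return indexMode == IndexMode.LOGSDB && hasDocValues && isSortedOnTimestamp(sortFields);
    }
}
```

With this shape, an index sorted on host.name and then @timestamp would qualify, whereas a primary-sort-only check would reject it.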
Can we extend this so that this is also enabled for TSDB?
Can we extend this to also work for when |
I think we can do this. The scope of this PR is to enable the sparse index in place of the indexed data structures, so that we can analyze the results in the elastic/logs nightly benchmark. Note that everything is behind a feature flag. We can do this in a follow-up after we have analyzed the nightly benchmark results. For TSDB, we will likely have a separate effort to check how using the sparse index affects querying (also on dimension fields).
This PR introduces support for a sparse doc values index for the @timestamp field in DateFieldMapper when specific conditions are met:
@timestamp and mapped as a date field.
index: false).
When all the conditions above hold true, we:
@timestamp field, dropping the inverted index in favor of the sparse doc values index.
Some queries might experience slower performance as a result of using a doc values sparse index instead of an inverted index.
Disabling the inverted index on the @timestamp field while enabling the sparse doc values index is expected to: