Push compute engine value loading down to tsdb codec #132460
Conversation
I benchmarked this change with the following query:
This change only optimizes loading of ….

Without this change (default data partitioning, on my local laptop) the average query time after 36 executions is:

With this change (default data partitioning, on my local laptop) the average query time after 36 executions is:

The detailed profiling output seems to suggest a ~3x improvement. Note that the profile data was captured with …. That improvement isn't as visible in the overall query time, because a big part of the time is still spent in ….
This is the first of many changes that push loading of field values down to the es819 doc values codec for logsdb/tsdb indices, when the field supports it. This change first targets reading field values in bulk mode at the codec level when the doc values type is numeric or sorted, there is only one value per document, and the field is dense (all documents have a value). Multivalued and sparse fields are more complex to support bulk reading for, but it is possible.

With this change, the following field types will support bulk read mode at the codec level under the described conditions: long, date, geo_point, point and unsigned_long. Other number types such as integer, short, double, float and scaled_float will be supported in a follow-up; they are similar to long-based fields but require an additional conversion step to either an int or float vector.

This change originates from elastic#132460 (which adds bulk reading to `@timestamp`, `_tsid` and dimension fields) and is basically the timestamp support part of it. In another follow-up, support for single-valued, dense sorted (set) doc values will be added for fields like `_tsid`. Relates to elastic#128445
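The dense, single-valued restriction is what makes a codec-level bulk read straightforward: every document in a range has exactly one value, so a whole range of doc IDs can be turned into a flat block of values in one sequential pass, with no per-document existence checks or multi-value offsets. Below is a minimal sketch of that contract; the class and method names are hypothetical and this is not the actual es819 codec API, which can decode whole encoded value blocks at once instead of looping per document.

```java
// Minimal sketch (hypothetical names, not the es819 codec API): loading a dense,
// single-valued numeric doc values field for a contiguous doc range into a long[]
// block. Density means every advanceExact() call must succeed, so no null handling
// or multi-value bookkeeping is needed.
import org.apache.lucene.index.NumericDocValues;

import java.io.IOException;

final class DenseNumericBlockReader {
    /** Reads the values of docs [firstDoc, firstDoc + count) into {@code dst}. */
    static void readRange(NumericDocValues values, int firstDoc, int count, long[] dst) throws IOException {
        for (int i = 0; i < count; i++) {
            boolean hasValue = values.advanceExact(firstDoc + i);
            assert hasValue : "field expected to be dense, but doc " + (firstDoc + i) + " has no value";
            dst[i] = values.longValue();
        }
    }
}
```

A codec that knows its values are stored densely in fixed-width blocks can skip the per-document iterator calls shown here and bulk-decode the underlying blocks, which is where the speedup described in the benchmark comment above comes from.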
Evolution of #128334, but targeted at just loading `@timestamp` and `_tsid` field values (`_tsid` support is missing and will be added soon) in the context of queries like:

Relates to #128445 and #132379
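For `_tsid`, which the description above notes is still missing, the field is backed by sorted doc values, so a bulk read would produce ordinals rather than numeric values. A rough sketch under the same dense, single-valued assumption follows; the names are hypothetical and this is not the actual implementation. Since tsdb indices are sorted by `_tsid`, consecutive documents usually share an ordinal, so the ordinal-to-bytes lookup can be reused across runs.

```java
// Rough sketch (hypothetical names): bulk-reading ordinals of a dense, single-valued
// sorted doc values field such as _tsid for a contiguous doc range, then resolving
// ordinals to term bytes while reusing the lookup for runs of equal ordinals.
import org.apache.lucene.index.SortedDocValues;
import org.apache.lucene.util.BytesRef;

import java.io.IOException;

final class DenseOrdinalBlockReader {
    /** Reads the ordinal of each doc in [firstDoc, firstDoc + count) into {@code ords}. */
    static void readOrdinals(SortedDocValues values, int firstDoc, int count, int[] ords) throws IOException {
        for (int i = 0; i < count; i++) {
            boolean hasValue = values.advanceExact(firstDoc + i);
            assert hasValue : "field expected to be dense";
            ords[i] = values.ordValue();
        }
    }

    /** Resolves ordinals to their term bytes, reusing the previous lookup for repeated ordinals. */
    static BytesRef[] resolve(SortedDocValues values, int[] ords) throws IOException {
        BytesRef[] out = new BytesRef[ords.length];
        int prevOrd = -1;
        BytesRef prev = null;
        for (int i = 0; i < ords.length; i++) {
            if (ords[i] != prevOrd) {
                prev = BytesRef.deepCopyOf(values.lookupOrd(ords[i]));
                prevOrd = ords[i];
            }
            out[i] = prev;
        }
        return out;
    }
}
```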