[Improvement](serialize) use streamvbyte_encode in DataTypeFixedLengthObject::serialize (#60526)
This pull request introduces a new serialization format for
`DataTypeFixedLengthObject` columns, leveraging streamvbyte encoding for
efficient storage and transmission of large data blocks. The new format
is activated for BE exec version 10 and above, which is now set as the
maximum supported version. Additionally, the `AggregateFunctionCount`
and `AggregateFunctionCountNotNullUnary` functions now report themselves
as trivial through `is_trivial()`, which may allow optimizations in the
aggregation engine. The most important changes are listed below:
### Serialization and Deserialization Improvements
* Introduced a new serialization/deserialization format for
`DataTypeFixedLengthObject` that uses streamvbyte encoding for large
data, improving efficiency for big data columns. The new logic is gated
behind BE exec version 10 and includes fallback to the previous format
for older versions
(`be/src/vec/data_types/data_type_fixed_length_object.cpp`).
[[1]](diffhunk://#diff-7d29ab3e43d23db58f2216e23cc131705067e133fb7ab2da72f2e67c725beb48L36-L41)
[[2]](diffhunk://#diff-7d29ab3e43d23db58f2216e23cc131705067e133fb7ab2da72f2e67c725beb48L56-R130)
[[3]](diffhunk://#diff-7d29ab3e43d23db58f2216e23cc131705067e133fb7ab2da72f2e67c725beb48L84-R144)
[[4]](diffhunk://#diff-7d29ab3e43d23db58f2216e23cc131705067e133fb7ab2da72f2e67c725beb48R156-R172)
* Updated the calculation of uncompressed serialized bytes to account
for the new serialization format and potential streamvbyte compression
(`be/src/vec/data_types/data_type_fixed_length_object.cpp`).
* Added the `streamvbyte` library include to support the new
encoding/decoding logic
(`be/src/vec/data_types/data_type_fixed_length_object.cpp`).
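
To make the encoding idea concrete, here is a minimal scalar sketch of a stream-vbyte-style layout: a block of control bytes (2 bits per 32-bit value, storing the byte width minus one) followed by the variable-length data bytes. It is an illustrative stand-in, not the `streamvbyte` library and not the code in this PR; the names `byte_width`, `svb_encode`, and `svb_decode` are hypothetical, and the sketch does not model any header fields or size threshold the real format may use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes (1..4) needed to store v without leading zero bytes.
inline uint32_t byte_width(uint32_t v) {
    if (v < (1u << 8)) return 1;
    if (v < (1u << 16)) return 2;
    if (v < (1u << 24)) return 3;
    return 4;
}

// Hypothetical scalar encoder: control bytes first, then little-endian data bytes.
std::vector<uint8_t> svb_encode(const std::vector<uint32_t>& in) {
    const size_t control_len = (in.size() + 3) / 4;  // four 2-bit codes per control byte
    std::vector<uint8_t> out(control_len, 0);
    for (size_t i = 0; i < in.size(); ++i) {
        const uint32_t width = byte_width(in[i]);
        // Store (width - 1) in the 2-bit slot that belongs to value i.
        out[i / 4] |= static_cast<uint8_t>((width - 1) << ((i % 4) * 2));
        for (uint32_t b = 0; b < width; ++b) {
            out.push_back(static_cast<uint8_t>(in[i] >> (8 * b)));
        }
    }
    return out;
}

// Hypothetical scalar decoder; the caller must supply the original value count.
std::vector<uint32_t> svb_decode(const std::vector<uint8_t>& in, size_t count) {
    const size_t control_len = (count + 3) / 4;
    assert(in.size() >= control_len);
    std::vector<uint32_t> out(count, 0);
    size_t pos = control_len;  // data bytes start right after the control block
    for (size_t i = 0; i < count; ++i) {
        const uint32_t width = ((in[i / 4] >> ((i % 4) * 2)) & 0x3u) + 1;
        assert(pos + width <= in.size());
        for (uint32_t b = 0; b < width; ++b) {
            out[i] |= static_cast<uint32_t>(in[pos++]) << (8 * b);
        }
    }
    return out;
}
```

Small values dominate the savings: four values below 256 take one control byte plus four data bytes instead of sixteen raw bytes, while values that need all four bytes cost slightly more than the raw layout because of the control byte.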
### Version Management
* Increased `BeExecVersionManager::max_be_exec_version` from 8 to 10,
with detailed documentation and warnings about the sensitivity of this
field. The new version enables the updated serialization logic
(`be/src/agent/be_exec_version_manager.cpp`).
* Defined a new constant `USE_NEW_FIXED_OBJECT_SERIALIZATION_VERSION =
10` to clearly mark the threshold for the new serialization format
(`be/src/agent/be_exec_version_manager.h`).
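
The version gate described above might look roughly like the following sketch. Only the constant name and its value come from this description; `FixedObjectFormat`, `ExecVersionRange`, `is_supported`, and `choose_format` are hypothetical names introduced for illustration, and the real bounds are owned by `BeExecVersionManager`.

```cpp
#include <cassert>

// Threshold named in the PR description.
constexpr int USE_NEW_FIXED_OBJECT_SERIALIZATION_VERSION = 10;

// Hypothetical tag for the two wire formats.
enum class FixedObjectFormat { Legacy, StreamVByte };

// Hypothetical supported range; the actual values are not assumed here.
struct ExecVersionRange {
    int min_version;
    int max_version;
};

inline bool is_supported(const ExecVersionRange& range, int be_exec_version) {
    return be_exec_version >= range.min_version && be_exec_version <= range.max_version;
}

// Pick the format from the negotiated exec version; older versions keep the legacy format.
inline FixedObjectFormat choose_format(const ExecVersionRange& range, int be_exec_version) {
    assert(is_supported(range, be_exec_version));
    return be_exec_version >= USE_NEW_FIXED_OBJECT_SERIALIZATION_VERSION
                   ? FixedObjectFormat::StreamVByte
                   : FixedObjectFormat::Legacy;
}
```

Keeping the threshold in a named constant means both the serializer and the deserializer branch on the same value, which matters when nodes on different versions exchange data.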
### Aggregate Function Optimization
* Marked `AggregateFunctionCount` and
`AggregateFunctionCountNotNullUnary` as trivial by overriding the
`is_trivial()` method to return `true`, which may allow for performance
optimizations in the aggregation engine
(`be/src/vec/aggregate_functions/aggregate_function_count.h`).
[[1]](diffhunk://#diff-a5dbb09237f197bffdcbd3bec4fdd089913ec143d96806618c8eeb4c5dbb8cfeR64-R65)
[[2]](diffhunk://#diff-a5dbb09237f197bffdcbd3bec4fdd089913ec143d96806618c8eeb4c5dbb8cfeR212-R213)
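
As a rough illustration of the `is_trivial()` override, the sketch below uses a stripped-down stand-in for the aggregate function interface. The class names `AggregateFunctionSketch`, `CountSketch`, and `SumSketch` and the helper `all_trivial` are hypothetical; the description does not say how the engine consumes the flag, so the helper is only one possible consumer.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical minimal interface; the real base class has many more members.
class AggregateFunctionSketch {
public:
    virtual ~AggregateFunctionSketch() = default;
    // Default: a function is not considered trivial unless it opts in.
    virtual bool is_trivial() const { return false; }
};

// Stand-in for a count-style function that opts in.
class CountSketch final : public AggregateFunctionSketch {
public:
    bool is_trivial() const override { return true; }
};

// Stand-in for a function that keeps the default.
class SumSketch final : public AggregateFunctionSketch {};

// One hypothetical consumer: take a fast path only if every function opts in.
inline bool all_trivial(const std::vector<std::unique_ptr<AggregateFunctionSketch>>& fns) {
    return std::all_of(fns.begin(), fns.end(),
                       [](const std::unique_ptr<AggregateFunctionSketch>& fn) {
                           return fn->is_trivial();
                       });
}
```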