[feature](variant) add variant doc snapshot mode #59183
base: master
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
Force-pushed from b307644 to 7d6f45a
run buildall
Force-pushed from 7d6f45a to a1b24a3
run buildall
run p0 10
TPC-H: Total hot run time: 36375 ms
TPC-DS: Total hot run time: 180018 ms
ClickBench: Total hot run time: 27.17 s
Force-pushed from a1b24a3 to 943db93
run buildall
TPC-H: Total hot run time: 35042 ms
TPC-DS: Total hot run time: 178720 ms
ClickBench: Total hot run time: 28.28 s

    Status parse_variant_columns(vectorized::Block& block, const TabletSchema& tablet_schema,
                                 const std::vector<uint32_t>& column_pos);

    // Moved from `vec/common/schema_util.{h,cpp}`.

Remove this comment.

    namespace doris::segment_v2::variant_util {

    // Parse variant columns by picking variant positions from `column_pos` and generating ParseConfig

and generating ParseConfig?

    }

    void ColumnVariant::serialize_from_doc_snapshot_to_json_format(int64_t row_num,
                                                                   BufferWritable& output,

Too much duplicated code.

            BeConsts::DEFAULT_VARIANT_MAX_SPARSE_COLUMN_STATS_SIZE;
    // default to 0, no shard
    int32_t _variant_sparse_hash_shard_count = 0;

It would be better to wrap all the variant-related properties in a struct.
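
A minimal sketch of what such a grouping could look like. The struct name `VariantProperties` and its member names are hypothetical and do not come from the PR; the defaults 0, 0, and 128 are the ones shown in the diff, while the default for the doc snapshot flag is an assumption.

```cpp
#include <cstdint>

// Hypothetical grouping of the variant-related settings.
// Names are illustrative only; they are not taken from the PR.
struct VariantProperties {
    // 0 means no sharding (default shown in the diff).
    int32_t sparse_hash_shard_count = 0;
    // ASSUMED default; the diff does not show the BE-side flag.
    bool enable_doc_snapshot_mode = false;
    // Default shown in the diff.
    int64_t doc_snapshot_min_rows = 0;
    // Default shown in the diff.
    int32_t doc_snapshot_shard_count = 128;
};
```

With a single struct, the schema class would hold one member instead of several loose fields, and any cross-field validation would have one obvious place to live.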
be/src/olap/tablet_schema.h (outdated)

    int64_t _variant_doc_snapshot_min_rows = 0;

    int32_t _variant_doc_snapshot_shard_count = 128;

What happens if `_variant_sparse_hash_shard_count` and `_variant_doc_snapshot_shard_count` are both specified at the same time?
- Which options are mutually exclusive, and which must be set together?
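
To make the question concrete, here is a sketch of how such constraints could be checked in one place. Every rule in it is an assumption chosen for illustration (the PR does not state which options are exclusive), and `VariantOptions` and `validate_variant_options` are hypothetical names.

```cpp
#include <cstdint>
#include <string>

// Hypothetical option bundle; names are illustrative, not from the PR.
struct VariantOptions {
    int32_t sparse_hash_shard_count = 0;
    bool enable_doc_snapshot_mode = false;
    int32_t doc_snapshot_shard_count = 128;
};

// Returns an empty string when the combination is accepted, otherwise a message.
// The rules below are ASSUMED for illustration; the PR does not define them.
inline std::string validate_variant_options(const VariantOptions& opts) {
    if (opts.sparse_hash_shard_count < 0 || opts.doc_snapshot_shard_count < 0) {
        return "shard counts must be non-negative";
    }
    // Assumed rule: sparse hash sharding and doc snapshot mode are exclusive.
    if (opts.enable_doc_snapshot_mode && opts.sparse_hash_shard_count > 0) {
        return "sparse hash sharding cannot be combined with doc snapshot mode";
    }
    // Assumed rule: doc snapshot mode needs at least one shard.
    if (opts.enable_doc_snapshot_mode && opts.doc_snapshot_shard_count == 0) {
        return "doc snapshot mode requires a positive shard count";
    }
    return "";
}
```

Whatever the real rules turn out to be, documenting them next to a check like this would answer the question for future readers.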

    vectorized::MutableColumnPtr variant_column;
    if (var.is_doc_snapshot_mode()) {
        // doc snapshot mode, we need to parse the doc snapshot column

Under what circumstances is this branch reached? Needs comments.
Check the config.

    auto converter = std::make_unique<vectorized::OlapBlockDataConvertor>();
    int column_id = 0;
    int64_t variant_doc_snapshot_min_rows = parent_column.variant_doc_snapshot_min_rows();
    if (variant_doc_snapshot_min_rows == 0 ||

The `variant_doc_snapshot_min_rows == 0` check looks redundant; could it be removed?
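
The diff truncates the condition, so the following is only a sketch under an assumption: that the second operand compares a non-negative row count against the threshold with `>=`. Under that assumption the zero check adds nothing, because any non-negative count already satisfies `num_rows >= 0`. The function names are hypothetical.

```cpp
#include <cstdint>

// ASSUMPTION: the real second operand is not visible in the diff; this models
// it as `num_rows >= min_rows` with num_rows >= 0.
inline bool with_zero_check(int64_t min_rows, int64_t num_rows) {
    return min_rows == 0 || num_rows >= min_rows;
}

inline bool without_zero_check(int64_t min_rows, int64_t num_rows) {
    return num_rows >= min_rows;
}

// Returns true when both forms agree for every pair with
// 0 <= min_rows < limit and 0 <= num_rows < limit.
inline bool forms_agree(int64_t limit) {
    for (int64_t min_rows = 0; min_rows < limit; ++min_rows) {
        for (int64_t num_rows = 0; num_rows < limit; ++num_rows) {
            if (with_zero_check(min_rows, num_rows) !=
                without_zero_check(min_rows, num_rows)) {
                return false;
            }
        }
    }
    return true;
}
```

If the actual comparison runs in the other direction, the two forms differ at `min_rows == 0` and the explicit check is needed.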

    @SerializedName(value = "variantSparseHashShardCount")
    private final int variantSparseHashShardCount;

    @SerializedName(value = "enableVariantDocSnapshotMode")

enable_doc_mode?

    @SerializedName(value = "enableVariantDocSnapshotMode")
    private final boolean enableVariantDocSnapshotMode;

    @SerializedName(value = "variantDocSnapshotMinRows")

doc_materialization_min_rows?

    @SerializedName(value = "variantDocSnapshotMinRows")
    private final long variantDocSnapshotMinRows;

    @SerializedName(value = "variantDocSnapshotShardCount")

doc_hash_shard_count?

run p0 10
Force-pushed from 943db93 to 3cb0f62
run buildall
run buildall
Force-pushed from 1828d8a to 73fd870
run buildall
run buildall
TPC-H: Total hot run time: 36638 ms
TPC-DS: Total hot run time: 179402 ms
ClickBench: Total hot run time: 28.53 s
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)