
[refactor](be) Enforce COW ownership for assume_mutable #63001

Merged
zclllyybb merged 11 commits into apache:master from zclllyybb:cow
May 23, 2026
Conversation

@zclllyybb
Contributor

@zclllyybb zclllyybb commented May 6, 2026

What problem does this PR solve?

Issue Number: N/A

Related PR: N/A

Problem Summary: This PR changes the BE COW contract around assume_mutable: callers may only use it when they already own the column exclusively, and the helper now behaves as an ownership assertion instead of a silent mutable borrow. Shared ColumnPtr / Block data must go through mutate() or one of the scoped owner-slot mutation helpers before being modified. This lets blocks and columns rely on real COW semantics and avoids both unsafe in-place mutation of shared data and unnecessary cross-operator copies.

The main changes are:

  • Make assume_mutable / assume_mutable_ref validate exclusive ownership and document audited usage in docs/dev/be-cow-assume-mutable-audit.md.
  • Add scoped COW mutation APIs for common owner-slot patterns, including whole-block scoped mutation, single-column scoped mutation, and rvalue-only stealing mutation for Block::mutate_columns().
  • Migrate BE paths that previously assumed mutable access to explicit COW ownership transfer, scoped restore, or MutableBlock / MutableColumns usage in hot paths.
  • Fix affected storage, scanner, external reader, parquet/orc/json/table-format, variant, aggregation, local exchange, and schema scanner paths so errors restore moved-out columns and nested subcolumns are written back through their owner slots.
  • Add focused BE UT coverage for the COW contract: shared-column detach, scoped restore on early error, block schema access through scoped guards, LocalExchanger restore-on-error, and table-format partition/missing-column COW behavior.
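
The contract described above can be illustrated with a minimal sketch. It is not the Doris implementation: `ToyColumn`, `ToyColumnPtr`, `ToyMutablePtr`, and `make_column` are hypothetical stand-ins built on `std::shared_ptr`, and the reference count stands in for whatever ownership check the real helpers use. The sketch only shows the intended split: `assume_mutable` asserts exclusive ownership, while `mutate` detaches shared data by cloning.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column; not the Doris IColumn class.
struct ToyColumn {
    std::vector<int> values;
};

using ToyColumnPtr = std::shared_ptr<const ToyColumn>;  // shared, read-only handle
using ToyMutablePtr = std::shared_ptr<ToyColumn>;       // writable handle

// Helper for the sketch: build a column and hand it out as a read-only handle.
inline ToyColumnPtr make_column(std::vector<int> values) {
    auto col = std::make_shared<ToyColumn>();
    col->values = std::move(values);
    return col;
}

// Ownership assertion: only legal when the caller holds the sole reference.
inline ToyMutablePtr assume_mutable(ToyColumnPtr&& col) {
    assert(col.use_count() == 1 && "assume_mutable requires exclusive ownership");
    ToyMutablePtr out = std::const_pointer_cast<ToyColumn>(col);
    col.reset();  // the caller's handle is consumed
    return out;
}

// COW detach: reuse the buffer when exclusive, clone it when shared.
inline ToyMutablePtr mutate(ToyColumnPtr&& col) {
    ToyMutablePtr out;
    if (col.use_count() == 1) {
        out = std::const_pointer_cast<ToyColumn>(col);
    } else {
        out = std::make_shared<ToyColumn>(*col);  // other owners keep the original
    }
    col.reset();
    return out;
}
```

In this sketch, calling `mutate` on a column that has a second owner leaves that owner's data untouched, whereas calling `assume_mutable` on the same column would trip the assertion.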

Release note

None

Check List (For Author)

  • Test:
    • Unit Test: ./run-be-ut.sh --run --filter=BlockTest.ScopedMutableColumnsRestoreOnErrorAndDetachSharedColumn:BlockTest.ScopedMutableColumnsReadSchemaFromLiveBlock:BlockTest.ScopedMutableColumnRestoreOnErrorDetachSharedAndCreateMissingColumn:BlockTest.ScopedMutableBlockRestoreOnErrorAndDetachSharedColumn:LocalExchangerTest.ShuffleExchangerRestoreOutputBlockOnAddRowsError:TableFormatReaderTest.FillPartitionColumnRestoresSharedColumnOnDeserializeError:TableFormatReaderTest.FillMissingNullableColumnDetachesSharedBlockSlot -j 100
    • Manual test: ./build.sh --be -j 100
    • Manual test: PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh
    • Manual test: git diff --check
  • Behavior changed: No user-visible behavior change. BE internal mutable-column ownership is now asserted and COW-safe.
  • Does this need documentation: No user documentation needed; BE developer audit documentation is included in this PR.

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zclllyybb
Contributor Author

run buildall

@zclllyybb
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found a blocking COW correctness issue in complex-type deserialization. The PR changes these paths from mutating the existing child through assume_mutable() to mutating a possibly cloned child, but the cloned child is not written back to the parent complex column. That means deserialization can update offsets while dropping child data when nested children are shared, which is exactly the ownership case this PR is trying to make safe.
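
A minimal sketch of the write-back pattern this comment asks for, using hypothetical toy types (`ToyData`, `ToyArray`, `append_row`) rather than the Doris `ColumnArray` or data-type classes. The child is detached from its owner slot, mutated, and then stored back, so the offsets and the child data stay consistent even when the child was shared:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical nested data column.
struct ToyData {
    std::vector<int> values;
};

// Hypothetical array column: a child slot plus row end offsets.
struct ToyArray {
    std::shared_ptr<const ToyData> data;  // may be shared with other arrays
    std::vector<std::size_t> offsets;     // end offset of each row in data
};

// Append one row. Skipping the final write-back would update offsets
// while the appended values live only in a detached clone.
inline void append_row(ToyArray& arr, const std::vector<int>& row) {
    std::shared_ptr<ToyData> child;
    if (!arr.data) {
        child = std::make_shared<ToyData>();                 // empty slot: create
    } else if (arr.data.use_count() == 1) {
        child = std::const_pointer_cast<ToyData>(arr.data);  // exclusive: reuse
    } else {
        child = std::make_shared<ToyData>(*arr.data);        // shared: clone
    }
    child->values.insert(child->values.end(), row.begin(), row.end());
    arr.offsets.push_back(child->values.size());
    arr.data = std::move(child);  // write the mutated child back to its owner slot
}
```

After `append_row`, `arr.offsets.back()` equals the size of the child that `arr` actually holds, and any other array that shared the old child still sees its original data.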

Critical checkpoint conclusions:

  • Goal/test: The goal is to restore COW-safe mutation after assume_mutable() enforces exclusive ownership. The broad BE unit-test list helps, but the complex-type deserialize path still has an uncovered shared-child case.
  • Scope/focus: The change is focused on COW ownership, but several deserialize updates need the same write-back pattern used elsewhere after mutating detached children.
  • Concurrency/lifecycle/config/compatibility: No new concurrency, lifecycle, configuration, or storage-format compatibility concern identified in the reviewed changes.
  • Parallel paths: Array, Map, and Struct deserialize have the same detached-child pattern and all need to be fixed consistently.
  • Tests: Please add or adjust coverage for deserializing complex columns whose child ColumnPtr is shared so this does not regress.
  • Observability/persistence/write correctness/performance: No additional observability, persistence, transaction, or performance blocker found beyond the data-correctness issue above.

User focus: No additional user-provided review focus was specified.

Comment thread be/src/core/data_type/data_type_array.cpp
Comment thread be/src/core/data_type/data_type_map.cpp
Comment thread be/src/core/data_type/data_type_struct.cpp
@zclllyybb
Contributor Author

run buildall

@zclllyybb
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found several issues that should be addressed before merging. The existing inline threads already cover the Array/Map/Struct COW deserialization ownership problems, so I did not duplicate them.

Critical checkpoint conclusions:

  • Goal/test: The PR appears to make BE column mutation safe under COW and remove streaming-job filtered-row persistence. It is only partially achieved: the ORC schema-change nullable path can copy stale null flags, and filtered-row observability is left inconsistent. Existing tests were updated broadly, but the streaming filtered-row regression test was deleted without replacement coverage for the new expected behavior.
  • Scope/focus: The BE COW change is broad and not fully localized; this increases risk in hot scan/convert paths.
  • Concurrency/lifecycle: I did not find new lock-order or thread-lifecycle issues in the reviewed paths.
  • Config/incompatibility: No new config items. The protobuf field removal needs reserved tag/name handling to avoid future wire/persistence compatibility hazards.
  • Parallel paths: Filtered-row updates were removed from transactional streaming jobs, while non-transactional streaming offset commits and metric definitions still keep filtered rows, leaving behavior inconsistent.
  • Tests/results: Missing replacement coverage for the removed streaming filtered-row behavior. I did not run tests in this review runner.
  • Observability: Filtered-row metrics/job details can now report zero or omit data even when rows are filtered.

User focus: No additional user-provided review focus was specified.

auto converted_column = doris_column->assume_mutable();
if (converted_column->is_nullable()) {
align_orc_null_map(resolved_column,
reinterpret_cast<ColumnNullable*>(converted_column.get()),
Contributor

This copies the source null map from offset 0, but in the non-consistent nullable converter path the source nullable column was built with the destination null map and then deep-mutated. If doris_column already has rows, fill_orc_null_map() appends the newly-read ORC nulls after those old rows in resolved_column; passing only resolved_column->size() here then makes align_orc_null_map() insert the first new_rows entries, which are stale old null flags rather than the newly appended slice. This can corrupt null flags when reading multiple batches through an ORC schema-change conversion. Please preserve the destination old row count and copy from that offset, or avoid cloning a source nullable wrapper that contains the destination's existing null map.
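
The requested fix can be sketched in isolation, assuming the source null map starts with a copy of the destination's existing flags and has the newly read flags appended after them. `NullMap` and `append_new_null_flags` are hypothetical names for this illustration and do not correspond to `align_orc_null_map()` or `fill_orc_null_map()` signatures:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using NullMap = std::vector<std::uint8_t>;  // 1 = null, 0 = not null

// Append to dst only the flags that src gained after the first old_rows entries.
// Copying from offset 0 instead would re-insert the stale prefix.
inline void append_new_null_flags(const NullMap& src, std::size_t old_rows, NullMap& dst) {
    assert(old_rows <= src.size());
    dst.insert(dst.end(), src.begin() + static_cast<std::ptrdiff_t>(old_rows), src.end());
}
```

The caller records the destination's row count before the read and passes it as `old_rows`, which is the "preserve the destination old row count and copy from that offset" option named in the comment.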

zclllyybb added a commit to zclllyybb/doris that referenced this pull request May 7, 2026
Issue Number: None

Related PR: apache#63001

Problem Summary: After the COW mutable ownership changes, constructing a MutableBlock from the cached Iceberg equality-delete block transfers the block columns into the mutable block. The equality-delete loader merged rows into the mutable block but did not write the result back, leaving null columns in the cached block and crashing HybridSet while building delete filters. This also applies the requested BE clang-format updates.

None

- Test: Manual test
    - build-support/check-format.sh
    - ./build.sh --be -j 100
    - git diff --check
- Behavior changed: No
- Does this need documentation: No
@zclllyybb
Contributor Author

run buildall

zclllyybb added a commit to zclllyybb/doris that referenced this pull request May 8, 2026
…sion

Issue Number: None

Related PR: apache#63001

Problem Summary: After assume_mutable started asserting exclusive ownership, sort aggregate state still appended through immutable Block columns on the hot add/merge path, and nullable schema-change conversion could copy a stale null-map prefix after ORC/Parquet COW cloning. Keep the sort aggregate state as a MutableBlock, remove a read-only mutable assertion, and copy nullable null-map slices from the appended source range.

Release note

None

Check List (For Author)

- Test: Unit Test

    - ./run-be-ut.sh --run --filter=AggregateFunctionSortDataTest.merge_does_not_share_rhs_block:OrcReaderFillDataTest.SchemaChangeNullableNullMapUsesAppendedSlice:ParquetColumnConvertTest.AlignNullMapUsesAppendedSourceSlice -j 100

    - ./run-regression-test.sh --run -f regression-test/suites/query_p0/runtimefilterV2/rfv2.groovy

    - ./build.sh --be -j 100

    - build-support/clang-format.sh

    - build-support/check-format.sh

    - git diff --check

    - Local cluster restart and select 1/show backends

    - build-support/run-clang-tidy.sh --build-dir be/build_ASAN attempted; changed-line findings were fixed, remaining failures are existing ORC/test include/jni-util baseline or tooling diagnostics

- Behavior changed: No

- Does this need documentation: No
@zclllyybb
Contributor Author

run buildall

zclllyybb added a commit to zclllyybb/doris that referenced this pull request May 8, 2026
### What problem does this PR solve?

Issue Number: None

Related PR: apache#63001

Problem Summary: Parquet conversion aligned nullable source null maps by source nested row counts. Under the COW assert-mutability contract, cached nullable sources may carry an old destination null-map prefix while their nested columns only contain the current batch, and fixed-length physical columns may store multiple nested values per logical row. This caused ParquetReaderTest.varbinary_string and adjacent varbinary paths to abort. Track destination null-map length separately and copy the appended source null-map slice by logical new rows.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=ParquetReaderTest.varbinary_string -j 100
    - ./run-be-ut.sh --run --filter=ParquetReaderTest.uuid_varbinary:ParquetReaderTest.varbinary_varbinary:ParquetReaderTest.varbinary_string:ParquetReaderTest.varbinary_string2:ParquetColumnConvertTest.*NullMap* -j 100
    - ./run-be-ut.sh --run --filter=ParquetReaderTest.*:ParquetColumnConvertTest.*:OrcReaderFillDataTest.SchemaChangeNullableNullMapUsesAppendedSlice -j 100
    - build-support/check-format.sh
- Behavior changed: No
- Does this need documentation: No
@zclllyybb
Contributor Author

run buildall

@zclllyybb
Contributor Author

/review

@zclllyybb
Contributor Author

run buildall

@zclllyybb
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found two COW correctness issues that should be fixed before this PR lands.

Critical checkpoint conclusions:

  • Goal and tests: the PR goal is valid, and it adds focused BE unit coverage for several mem-reuse/COW paths, but the new shared complex-column paths still miss important invariant coverage.
  • Scope/focus: the change is broad but mostly mechanical and aligned with the new assume_mutable() contract; the issues below are localized.
  • Concurrency/lifecycle/config/compatibility: no new threads, lifecycle objects, config items, or serialization format compatibility changes were identified in the reviewed changes.
  • Parallel paths: many parallel mem-reuse paths were updated; however, shared ColumnArray construction and recursive map dedup remain inconsistent with the same COW-safety principle.
  • Error handling: no ignored Status issue was found in the reviewed changes.
  • Tests: added tests cover several regressions, but not invalid shared ColumnArray::create(ColumnPtr, ...) construction or recursive nested-map dedup with shared subcolumns.
  • Observability: no additional observability appears necessary for these internal COW fixes.
  • Transaction/storage correctness: no version/delete-bitmap/persistence issue was identified for this PR.
  • Performance: no blocking performance regression was identified beyond the expected COW cloning when columns are shared.

User focus: no additional user-provided review focus was supplied.

@@ -98,6 +100,21 @@ ColumnArray::ColumnArray(MutableColumnPtr&& nested_column) : data(std::move(nest
offsets = ColumnOffsets::create();
}

Contributor

This shared-column constructor skips the invariants enforced by the mutable constructor above: offsets_column must actually be ColumnOffset64, and when offsets are non-empty the nested data size must match offsets.back(). Public callers such as ColumnArray::create(const ColumnPtr&, const ColumnPtr&) now route here, so an invalid array can be constructed and later offset_at/size_at, filtering, or serialization can read inconsistent offsets. Please keep this path shared, but still perform the same const-only, offsets-type, and data-size validation using const access.
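
A rough sketch of two of the checks named here (offsets type and data size), done through const access so the shared columns are never mutated. All types and the `valid_array_parts` helper are hypothetical; the real constructor's checks and error handling may differ, and the const-column check is not modelled:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical column hierarchy for the sketch.
struct ToyColumn {
    virtual ~ToyColumn() = default;
    virtual std::size_t size() const = 0;
};

struct ToyOffsets64 : ToyColumn {
    std::vector<std::uint64_t> v;
    std::size_t size() const override { return v.size(); }
};

struct ToyInts : ToyColumn {
    std::vector<int> v;
    std::size_t size() const override { return v.size(); }
};

using ToyColumnPtr = std::shared_ptr<const ToyColumn>;

// Read-only validation: offsets must be the 64-bit offsets type, and when
// offsets are non-empty the nested data size must equal the last offset.
inline bool valid_array_parts(const ToyColumnPtr& data, const ToyColumnPtr& offsets) {
    if (!data || !offsets) {
        return false;
    }
    const auto* off = dynamic_cast<const ToyOffsets64*>(offsets.get());
    if (off == nullptr) {
        return false;
    }
    return off->v.empty() || off->v.back() == data->size();
}
```

Because the check takes const handles only, it can run on the shared path without forcing a clone.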

RETURN_IF_ERROR(const_cast<ColumnMap*>(values_map)->deduplicate_keys(recursive));
}
}

Contributor

This recursive mutation bypasses COW by const-casting the nested ColumnMap reached through values_column. If the map value subcolumn is shared, deduplicate_keys(true) can move/filter the nested map's subcolumns in place and mutate all aliases of that nested map. This is distinct from the top-level filtering below, which correctly detaches with IColumn::mutate(std::move(...)); the recursive value path should similarly detach the nullable/value subcolumn and write the deduplicated nested map back into values_column before continuing.

@hello-stephen
Contributor

TPC-H: Total hot run time: 30127 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a1671565119c77c4cde89027b02b74621bd2f078, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17648	4077	4052	4052
q2	q3	10723	885	619	619
q4	4675	467	351	351
q5	7460	1356	1159	1159
q6	190	169	135	135
q7	932	984	764	764
q8	9278	1369	1296	1296
q9	5783	5486	5519	5486
q10	6254	2224	1944	1944
q11	471	274	256	256
q12	622	423	295	295
q13	18119	3319	2765	2765
q14	292	290	263	263
q15	q16	913	880	796	796
q17	1013	1054	713	713
q18	6611	5882	5672	5672
q19	1290	1163	1019	1019
q20	523	394	264	264
q21	4913	2424	1949	1949
q22	497	403	329	329
Total cold run time: 98207 ms
Total hot run time: 30127 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5117	4937	4988	4937
q2	q3	4751	4895	4320	4320
q4	2132	2169	1401	1401
q5	5244	5258	5523	5258
q6	210	167	137	137
q7	2133	1853	1669	1669
q8	3498	3221	3364	3221
q9	8820	9055	9036	9036
q10	4773	4773	4528	4528
q11	601	434	405	405
q12	752	801	543	543
q13	3261	3589	3151	3151
q14	302	310	287	287
q15	q16	809	778	668	668
q17	1366	1346	1296	1296
q18	8149	7285	7222	7222
q19	1152	1175	1140	1140
q20	2253	2267	2002	2002
q21	6137	5385	4955	4955
q22	555	529	435	435
Total cold run time: 62015 ms
Total hot run time: 56611 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 174246 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a1671565119c77c4cde89027b02b74621bd2f078, data reload: false

query5	4327	669	530	530
query6	340	226	206	206
query7	4223	588	316	316
query8	322	226	218	218
query9	8851	4068	4089	4068
query10	447	380	306	306
query11	5817	2450	2237	2237
query12	193	134	130	130
query13	1295	615	416	416
query14	6153	5524	5269	5269
query14_1	4539	4571	4523	4523
query15	217	211	188	188
query16	1010	449	475	449
query17	1143	821	648	648
query18	2492	520	367	367
query19	225	211	173	173
query20	142	137	137	137
query21	231	135	119	119
query22	13607	13603	13394	13394
query23	17706	16910	16642	16642
query23_1	16778	16801	16849	16801
query24	7426	1791	1402	1402
query24_1	1396	1384	1379	1379
query25	584	512	465	465
query26	1316	320	180	180
query27	2679	605	357	357
query28	4447	2047	2065	2047
query29	1049	641	537	537
query30	316	245	203	203
query31	1121	1087	942	942
query32	90	78	77	77
query33	561	356	307	307
query34	1205	1137	685	685
query35	759	793	694	694
query36	1403	1463	1214	1214
query37	153	98	85	85
query38	3247	3170	3093	3093
query39	940	930	911	911
query39_1	881	876	874	874
query40	241	159	133	133
query41	63	62	60	60
query42	110	109	108	108
query43	324	336	291	291
query44	
query45	211	205	195	195
query46	1072	1181	734	734
query47	2362	2380	2277	2277
query48	407	411	289	289
query49	621	527	429	429
query50	749	284	223	223
query51	4431	4347	4288	4288
query52	104	106	94	94
query53	257	277	207	207
query54	307	264	261	261
query55	94	89	83	83
query56	287	312	292	292
query57	1416	1465	1332	1332
query58	292	271	271	271
query59	1627	1627	1448	1448
query60	348	336	321	321
query61	160	157	155	155
query62	697	671	612	612
query63	237	203	209	203
query64	2409	846	694	694
query65	
query66	1754	519	398	398
query67	30236	30159	29926	29926
query68	
query69	467	334	303	303
query70	1043	1043	971	971
query71	314	277	272	272
query72	2917	2797	2382	2382
query73	828	728	411	411
query74	5105	4997	4835	4835
query75	2784	2707	2356	2356
query76	2256	1147	795	795
query77	425	432	349	349
query78	13337	13312	12597	12597
query79	1482	968	760	760
query80	1363	604	502	502
query81	515	277	239	239
query82	1197	155	121	121
query83	355	272	250	250
query84	266	140	109	109
query85	933	523	451	451
query86	447	347	318	318
query87	3404	3404	3253	3253
query88	3601	2685	2687	2685
query89	454	387	342	342
query90	1904	180	182	180
query91	187	168	140	140
query92	82	80	69	69
query93	1030	989	571	571
query94	737	355	270	270
query95	640	449	342	342
query96	990	760	358	358
query97	2791	2774	2625	2625
query98	241	231	227	227
query99	1199	1185	1043	1043
Total cold run time: 256715 ms
Total hot run time: 174246 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 92.92% (1207/1299) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.79% (27801/37674)
Line Coverage 57.69% (301315/522324)
Region Coverage 54.96% (251300/457220)
Branch Coverage 56.44% (108582/192385)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 90.88% (2032/2236) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.84% (28003/37926)
Line Coverage 57.73% (304172/526892)
Region Coverage 54.92% (254693/463734)
Branch Coverage 56.42% (109957/194888)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 91.06% (2036/2236) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.77% (27978/37926)
Line Coverage 57.69% (303952/526892)
Region Coverage 54.90% (254572/463734)
Branch Coverage 56.38% (109878/194888)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 91.06% (2036/2236) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.77% (27977/37926)
Line Coverage 57.69% (303943/526892)
Region Coverage 54.90% (254604/463734)
Branch Coverage 56.38% (109878/194888)

zclllyybb added 11 commits May 23, 2026 00:45
Issue Number: N/A

Related PR: apache#63001

Problem Summary: Restore ownership-safe COW mutation after assume_mutable became an assertion, and preserve shared immutable subcolumns for return-new wrappers to avoid unnecessary deep copies.

None

- Test: Unit Test / Manual test
    - ./run-be-ut.sh --run --filter=ColumnArrayTest.SharedCreateValidatesOffsetsAndDataSize:ColumnNullableTest.SharedCreatePreservesImmutableSubcolumns:ColumnMapTest2.SharedCreatePreservesImmutableSubcolumns:ColumnMapTest2.ConstFilterAndPermuteKeepInputAliasesUntouched:ColumnMapTest2.DeduplicateNestedNullableMapValuesDetachesSharedValueColumn:ComplexTypeTest.DeserializeStructWritesBackSharedChildren:VariantUtilTest.ParseNullableScalarVariantDetachesNestedAlias -j 100
    - PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh
    - git diff --check
    - ./build.sh --be -j 100
- Behavior changed: No
- Does this need documentation: No
Issue Number: None

Related PR: apache#63001

Problem Summary: The BE COW refactor makes assume_mutable an ownership assertion, but several live Block paths still expressed mutation by manually moving ColumnPtr or MutableBlock owners out and writing them back. That made it easy to leave a Block with moved-from columns on RETURN_IF_ERROR, or to reintroduce hot-path per-row mutate calls.

This change codifies the new pattern:

- std::move(block).mutate_columns() is the stealing API for throwaway/rvalue Blocks.
- block.mutate_columns_scoped() is the RAII API for temporary whole-block mutation and restores on every exit path.
- block.mutate_column_scoped(pos) is the RAII owner-slot API for modifying one live Block column.
- VectorizedUtils::build_scoped_mutable_mem_reuse_block() returns a scoped guard, so callers first hold the guard and then borrow MutableBlock.
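The split between the stealing API and the scoped API can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyBlock`, `ScopedColumns`, `detach`, and `append_then_maybe_fail` are hypothetical names invented for illustration, the columns are plain `std::vector<int>`, and none of this is the actual Doris implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the Doris classes.
using Column = std::vector<int>;
using ColumnPtr = std::shared_ptr<Column>;      // possibly shared owner
using MutableColumn = std::unique_ptr<Column>;  // exclusive owner

// Copy-on-write detach: reuse the buffer when the slot is the only owner,
// otherwise clone so other owners never observe the mutation.
inline MutableColumn detach(ColumnPtr& slot) {
    assert(slot != nullptr);
    MutableColumn out = slot.use_count() == 1
                                ? std::make_unique<Column>(std::move(*slot))
                                : std::make_unique<Column>(*slot);
    slot.reset();
    return out;
}

class ToyBlock {
public:
    explicit ToyBlock(std::vector<ColumnPtr> cols) : cols_(std::move(cols)) {}

    const ColumnPtr& column(std::size_t i) const { return cols_[i]; }

    // Stealing form: the && qualifier means it only compiles on an rvalue,
    // e.g. std::move(block).mutate_columns(). The block is left empty.
    std::vector<MutableColumn> mutate_columns() && {
        std::vector<MutableColumn> out;
        for (auto& slot : cols_) out.push_back(detach(slot));
        cols_.clear();
        return out;
    }

    // RAII form for a block that stays live: columns are moved out on
    // construction and written back by the destructor on every exit path.
    class ScopedColumns {
    public:
        explicit ScopedColumns(ToyBlock& block) : block_(block) {
            for (auto& slot : block_.cols_) cols_.push_back(detach(slot));
        }
        ~ScopedColumns() {
            for (std::size_t i = 0; i < cols_.size(); ++i) {
                block_.cols_[i] = ColumnPtr(std::move(cols_[i]));
            }
        }
        ScopedColumns(const ScopedColumns&) = delete;
        ScopedColumns& operator=(const ScopedColumns&) = delete;

        Column& at(std::size_t i) { return *cols_[i]; }

    private:
        ToyBlock& block_;
        std::vector<MutableColumn> cols_;
    };

private:
    std::vector<ColumnPtr> cols_;
};

// Hypothetical caller: appends to column 0, then may bail out early.
inline bool append_then_maybe_fail(ToyBlock& block, int value, bool fail) {
    ToyBlock::ScopedColumns guard(block);
    guard.at(0).push_back(value);
    if (fail) return false;  // early return: the guard still restores the slot
    return true;
}
```

In this toy, a column that is also referenced elsewhere is cloned before the write, so the other owner keeps its original data, and the block slot is non-null again after the early return.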

The patch removes default ScopedMutableBlock construction, avoids copying schema in ScopedMutableColumns, migrates table/orc/parquet/es readers and executor paths to the scoped APIs, and keeps hot row loops on MutableBlock/MutableColumns instead of repeated mutate(). Tests exercise early-return restore, shared-column detach, null-slot recreation, live schema access, LocalExchanger error paths, and TableFormatReader partition/missing-column COW contracts.

None

- Test:
    - Unit Test: ./run-be-ut.sh --run --filter=BlockTest.ScopedMutableColumnsRestoreOnErrorAndDetachSharedColumn:BlockTest.ScopedMutableColumnsReadSchemaFromLiveBlock:BlockTest.ScopedMutableColumnRestoreOnErrorDetachSharedAndCreateMissingColumn:BlockTest.ScopedMutableBlockRestoreOnErrorAndDetachSharedColumn:LocalExchangerTest.ShuffleExchangerRestoreOutputBlockOnAddRowsError:TableFormatReaderTest.FillPartitionColumnRestoresSharedColumnOnDeserializeError:TableFormatReaderTest.FillMissingNullableColumnDetachesSharedBlockSlot -j 100
    - Regression test: ./run-regression-test.sh --run -d external_table_p0/hive -s test_hive_openx_json
    - Manual test: ./build.sh --be -j 100
    - Static check: git diff --check; build-support/check-format.sh; build-support/run-clang-tidy.sh --build-dir be/build_ASAN
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: apache#63001

Problem Summary: Parquet FIXED_LEN_BYTE_ARRAY physical columns are now read through ColumnFixedLengthObject for COW-safe reuse, but several decoder paths still assumed primitive fixed-size data types. Dictionary and byte-stream-split decoders called get_size_of_value_in_memory() on DataTypeFixedLengthObject, and delta byte array fixed-length decode still wrote through a ColumnInt8 owner. This could fail external scans with DataTypeFixedLengthObject size checks or write through the wrong column representation.

This change teaches the fixed-length parquet decoders to use the ColumnFixedLengthObject item size when that physical column is used, keeps the legacy primitive/ColumnInt8 paths, and adds BE UT coverage for plain, dictionary, byte-stream-split, and delta byte array fixed-length object decoding including filter/null cases.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=FixLengthPlainDecoderTest.*:FixLengthDictDecoderTest.*:ByteStreamSplitDecoderTest.*:DeltaByteArrayDecoderTest.*:DeltaLengthByteArrayDecoderTest.*:ParquetColumnConvertTest.*:ParquetReaderTest.uuid_varbinary:ParquetReaderTest.varbinary_varbinary:ParquetReaderTest.varbinary_string:ParquetReaderTest.varbinary_string2
- Behavior changed: No
- Does this need documentation: No
Issue Number: close #xxx

Related PR: apache#63001

Problem Summary: A few COW mutation call sites still manually moved column owners out of a live Block and restored them only at the end of the local success path. In RowIdStorageReader::read_doris_format_row(), seek_and_read_by_rowid() may return an error after moving one result column, leaving result_block with a moved-from column. StreamingAggLocalState::_pre_agg_with_serialized_key() had the same restore-on-error risk in the mem-reuse output path when streaming_agg_serialize_to_column() returned an error.

This change uses Block::mutate_column_scoped() for rowid single-column owner slots and Block::mutate_columns_scoped() for the streaming pre-agg mem-reuse output block, so every Status return path restores the live Block. Adjacent rowid single-column append paths were also moved to the same scoped owner-slot pattern for consistency.
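The restore-on-error risk described above can be reproduced in a few lines. The sketch below is a toy model, not the Doris code: `Slot`, `ScopedSlot`, `read_row`, `read_manual`, and `read_scoped` are hypothetical names, and the failing reader simply rejects negative row ids.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

using Column = std::vector<int>;
using Slot = std::unique_ptr<Column>;  // toy owner slot; not the Doris ColumnPtr

// Hypothetical reader that fails on negative row ids.
inline bool read_row(Column& out, int rowid) {
    if (rowid < 0) return false;
    out.push_back(rowid);
    return true;
}

// Manual pattern: the slot is written back only on the success path.
inline bool read_manual(Slot& slot, int rowid) {
    Slot col = std::move(slot);                // slot is now empty
    if (!read_row(*col, rowid)) return false;  // BUG: slot stays empty
    slot = std::move(col);
    return true;
}

// Scoped pattern: the destructor writes the column back on every exit path.
class ScopedSlot {
public:
    explicit ScopedSlot(Slot& slot) : slot_(slot), col_(std::move(slot)) {
        assert(col_ != nullptr);
    }
    ~ScopedSlot() { slot_ = std::move(col_); }
    ScopedSlot(const ScopedSlot&) = delete;
    ScopedSlot& operator=(const ScopedSlot&) = delete;

    Column& operator*() { return *col_; }

private:
    Slot& slot_;
    Slot col_;
};

inline bool read_scoped(Slot& slot, int rowid) {
    ScopedSlot col(slot);
    return read_row(*col, rowid);  // an error here still restores the slot
}
```

With the manual version, a failed read leaves the caller holding an empty slot. With the guard, the slot is populated again whether the read succeeds or fails.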

None

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=StreamingAggOperatorTest.*:BlockTest.ScopedMutableColumnRestoreOnErrorDetachSharedAndCreateMissingColumn:BlockTest.ScopedMutableColumnsRestoreOnErrorAndDetachSharedColumn
- Behavior changed: No
- Does this need documentation: No
Issue Number: close #xxx

Related PR: apache#63001

Problem Summary: BE UT merges PR head into the latest master before building tests. After master added string-overflow MutableBlock merge tests, those tests still used the removed MutableBlock(Block*) live-block constructor. That constructor is intentionally unavailable in the new COW model because live output blocks need scoped restore-on-error ownership. This updates the tests to use ScopedMutableBlock while preserving the checked overflow and ignore-overflow assertions.

None

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=BlockTest.merge_returns_error_when_checked_string_append_exceeds_limit:BlockTest.merge_ignore_overflow_keeps_owned_accumulation_convertible -j100
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: apache#63001

Problem Summary: The branch accidentally included a local COW audit document under docs/dev. The document is useful as local working notes, but it should not be part of the BE COW implementation diff. Remove it from the tracked branch state while leaving local notes outside the commit.

### Release note

None

### Check List (For Author)

- Test: No need to test (document removal only)
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: apache#63001

Problem Summary: After rebasing the COW branch onto latest master, several newly introduced or newly exposed code paths still called Block::mutate_columns() on live Block objects. The COW API now intentionally makes mutate_columns() an rvalue-only stealing operation, so live blocks must use scoped owner APIs that restore columns on every exit path. This updates nested-loop join lazy materialization, row binlog block filling, historical row retrieval, and affected tests to use mutate_columns_scoped() when the Block remains live. It also updates the block overflow tests to use the current string overflow debug point instead of removed test helpers/configuration.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - ./build.sh --be
    - ./run-be-ut.sh
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: apache#63001

Problem Summary: BE UT on PR apache#63001 failed in BlockTest.merge_returns_error_when_checked_string_append_exceeds_limit and aborted in ComplexTypeTest.DeserializeArrayWritesBackSharedNestedColumn. The block test relied on a string-overflow debug point that is not used by MutableBlock::merge(), so it expected an error on a path that legitimately succeeded. The complex-type test also constructed the destination array with a raw Int32 nested column, while DataTypeArray canonicalizes nested values as nullable and therefore calls DataTypeNullable::deserialize(). That exposed a real COW gap: DataTypeNullable::deserialize() wrote directly into the nested column and null map even when those subcolumns were still shared after a shallow COW clone.

This patch detaches nullable nested and null-map owner slots before deserializing and writes them back through ColumnNullable::replace_columns(). It also updates the array COW deserialize test to use shared nullable subcolumns and changes the scoped block merge test to use a deterministic schema mismatch error.
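The gap is easier to see in a toy nullable column whose two subcolumns are shared after a shallow clone. This is an illustrative sketch only: `ToyNullable`, `detach`, and `append_deserialized` are hypothetical, and the `replace_columns` here merely mirrors the name used in the description rather than the real `ColumnNullable::replace_columns()` signature.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using Values = std::vector<int>;
using NullMap = std::vector<std::uint8_t>;

// Toy nullable column; NOT the Doris ColumnNullable.
struct ToyNullable {
    std::shared_ptr<Values> nested;
    std::shared_ptr<NullMap> null_map;

    // Shallow COW clone: copies the owner pointers, so subcolumns are shared.
    ToyNullable shallow_clone() const { return *this; }

    // Write both subcolumns back through the owner slots in one step.
    void replace_columns(std::shared_ptr<Values> n, std::shared_ptr<NullMap> m) {
        assert(n->size() == m->size());
        nested = std::move(n);
        null_map = std::move(m);
    }
};

// Return an exclusively owned subcolumn: reuse it if unshared, else clone.
template <typename T>
std::shared_ptr<T> detach(const std::shared_ptr<T>& slot) {
    if (slot.use_count() == 1) return slot;
    return std::make_shared<T>(*slot);
}

// Hypothetical deserialize step: append one value (or a null) COW-safely.
inline void append_deserialized(ToyNullable& col, int value, bool is_null) {
    auto nested = detach(col.nested);
    auto null_map = detach(col.null_map);
    nested->push_back(is_null ? 0 : value);
    null_map->push_back(static_cast<std::uint8_t>(is_null));
    col.replace_columns(std::move(nested), std::move(null_map));
}
```

Writing directly through `col.nested` and `col.null_map` without the detach step would also grow the subcolumns seen by the column the clone was taken from, which is the kind of shared write the patch removes.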

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - PATH=/tmp/codex-clang-format-16:$PATH build-support/clang-format.sh
    - PATH=/tmp/codex-clang-format-16:$PATH build-support/check-format.sh
    - ./run-be-ut.sh --run --filter=BlockTest.merge_returns_error_and_restores_output_block:BlockTest.merge_ignore_overflow_keeps_owned_accumulation_convertible:ComplexTypeTest.DeserializeArrayWritesBackSharedNestedColumn
    - ./run-be-ut.sh --run --filter=ComplexTypeTest.Deserialize*
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve?

Issue Number: None

Related PR: apache#63001

Problem Summary: DataTypeArray now normalizes array elements to nullable physical columns. AggregateFunctionForEach still passed the array data column directly into the nested aggregate when writing results, so nested aggregates such as array_agg could receive ColumnNullable where they expected ColumnArray. This commit unwraps the array element nullable wrapper for non-nullable nested aggregate results, maintains the element null map, and updates tests that still built DataTypeArray columns with the old raw nested shape.

### Release note

None

### Check List (For Author)

- Test: Unit Test

    - ./run-be-ut.sh --run --filter='FunctionVariantCast.*:AggregateFunctionArrayAggTest.*:VRetentionTest.*:SchemaUtilTest.TestArrayDimensions:SchemaUtilTest.TestCastColumnEdgeCases:DataTypeArrayTest.CreateColumnUsesNullableNestedColumn:AIFunctionTest.AIMaskTest:AIFunctionTest.AIExtractTest:AIFunctionTest.AIClassifyTest'

    - ./run-be-ut.sh --run --filter='AggGroupArrayIntersectTest.*:TableFunctionOperatorTest.block_fast_path_explode*'

    - PATH=/tmp/codex-clang-format-16:$PATH build-support/check-format.sh

- Behavior changed: No

- Does this need documentation: No
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: apache#63001

Problem Summary: After rebasing onto upstream master, Segment::seek_and_read_by_rowid expects a sorted vector of row ids instead of a single row id. The COW conflict resolution kept scoped block/column ownership but still called the old single-row form in rowid fetch and point query paths, causing BE compilation to fail. This updates both call sites to pass the batched row id vector while preserving scoped column ownership and restore-on-error behavior.

### Release note

None

### Check List (For Author)

- Test: Unit Test

    - ./build.sh --be

    - PATH=/tmp/codex-clang-format-16:$PATH build-support/check-format.sh

    - ./run-be-ut.sh --run --filter='FunctionVariantCast.*:AggregateFunctionArrayAggTest.*:VRetentionTest.*:SchemaUtilTest.TestArrayDimensions:SchemaUtilTest.TestCastColumnEdgeCases:DataTypeArrayTest.CreateColumnUsesNullableNestedColumn:AIFunctionTest.AIMaskTest:AIFunctionTest.AIExtractTest:AIFunctionTest.AIClassifyTest'

    - ./run-be-ut.sh --run --filter='AggGroupArrayIntersectTest.*:TableFunctionOperatorTest.block_fast_path_explode*'

    - ./run-be-ut.sh --run --filter='BlockTest.ClearSelectedColumnDataClonesSharedColumn:BlockTest.ClearColumnDataPropagatesSharedCloneEmptyFailure:BlockTest.ClearSelectedColumnDataPropagatesSharedCloneEmptyFailure:BlockTest.ScopedMutable*'

- Behavior changed: No

- Does this need documentation: No
@zclllyybb
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31612 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 53830dfdbd1b4b809065283a86171558ca74a451, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17754	4126	4039	4039
q2	q3	10765	1386	809	809
q4	4720	482	345	345
q5	7926	2291	2095	2095
q6	364	175	149	149
q7	929	815	637	637
q8	9585	1841	1633	1633
q9	7058	4976	4894	4894
q10	6433	2349	1887	1887
q11	440	267	244	244
q12	689	425	304	304
q13	18209	3441	2804	2804
q14	276	258	242	242
q15	q16	823	777	710	710
q17	995	1031	933	933
q18	6922	5717	5529	5529
q19	1255	1393	1094	1094
q20	547	412	271	271
q21	6058	2830	2632	2632
q22	448	371	361	361
Total cold run time: 102196 ms
Total hot run time: 31612 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4800	4945	4784	4784
q2	q3	4988	5298	4672	4672
q4	2165	2262	1476	1476
q5	4990	4653	4749	4653
q6	240	189	141	141
q7	1898	1790	1506	1506
q8	2321	1981	1969	1969
q9	7504	7509	7454	7454
q10	4729	4697	4261	4261
q11	549	404	386	386
q12	743	750	555	555
q13	3017	3425	2814	2814
q14	274	283	255	255
q15	q16	680	706	625	625
q17	1304	1294	1275	1275
q18	7316	6948	6973	6948
q19	1139	1094	1085	1085
q20	2243	2233	1982	1982
q21	5352	4692	4493	4493
q22	530	490	425	425
Total cold run time: 56782 ms
Total hot run time: 51759 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172368 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 53830dfdbd1b4b809065283a86171558ca74a451, data reload: false

query5	4325	659	525	525
query6	348	217	200	200
query7	4335	559	306	306
query8	329	250	218	218
query9	8831	4172	4156	4156
query10	453	343	306	306
query11	5803	2550	2272	2272
query12	179	128	126	126
query13	1267	587	428	428
query14	6200	5541	5224	5224
query14_1	4562	4539	4493	4493
query15	214	207	181	181
query16	1041	444	433	433
query17	1177	720	587	587
query18	2623	474	354	354
query19	215	203	165	165
query20	141	132	130	130
query21	218	141	113	113
query22	13696	13531	13317	13317
query23	17367	16546	16313	16313
query23_1	16462	16313	16408	16313
query24	7391	1790	1327	1327
query24_1	1340	1345	1347	1345
query25	561	480	427	427
query26	1329	323	176	176
query27	2664	544	366	366
query28	4443	2069	2050	2050
query29	988	649	524	524
query30	307	249	206	206
query31	1144	1087	975	975
query32	90	77	79	77
query33	558	378	311	311
query34	1196	1166	661	661
query35	797	804	703	703
query36	1390	1421	1227	1227
query37	157	110	94	94
query38	3216	3170	3080	3080
query39	936	927	907	907
query39_1	887	883	888	883
query40	234	153	132	132
query41	71	73	68	68
query42	114	113	111	111
query43	335	347	297	297
query44	
query45	217	208	199	199
query46	1110	1215	719	719
query47	2404	2346	2251	2251
query48	413	416	299	299
query49	642	509	413	413
query50	972	362	262	262
query51	4326	4273	4263	4263
query52	110	107	98	98
query53	255	292	210	210
query54	333	296	272	272
query55	98	94	89	89
query56	322	327	310	310
query57	1439	1443	1339	1339
query58	317	295	289	289
query59	1590	1677	1481	1481
query60	349	370	313	313
query61	182	181	181	181
query62	702	658	601	601
query63	249	214	213	213
query64	2449	863	706	706
query65	
query66	1732	509	377	377
query67	30098	29426	29895	29426
query68	
query69	475	340	318	318
query70	1028	1015	1005	1005
query71	315	278	271	271
query72	2980	2747	2451	2451
query73	833	755	431	431
query74	5162	4989	4793	4793
query75	2703	2610	2286	2286
query76	2322	1155	785	785
query77	432	435	342	342
query78	12387	12461	11833	11833
query79	1503	1096	778	778
query80	1352	543	447	447
query81	521	283	238	238
query82	1073	153	119	119
query83	353	289	249	249
query84	261	143	108	108
query85	938	522	474	474
query86	460	344	348	344
query87	3460	3411	3246	3246
query88	3658	2787	2752	2752
query89	452	394	342	342
query90	1913	183	182	182
query91	178	168	143	143
query92	77	77	72	72
query93	1568	1487	907	907
query94	716	362	308	308
query95	679	475	351	351
query96	1069	772	352	352
query97	2721	2750	2593	2593
query98	238	229	239	229
query99	1184	1149	1051	1051
Total cold run time: 256474 ms
Total hot run time: 172368 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 57.96% (1292/2229) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.78% (20854/38776)
Line Coverage 37.37% (197609/528821)
Region Coverage 33.68% (154843/459690)
Branch Coverage 34.66% (67368/194367)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 90.80% (2024/2229) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.67% (27977/37977)
Line Coverage 57.64% (304018/527472)
Region Coverage 54.70% (253874/464120)
Branch Coverage 56.26% (109770/195097)

@zclllyybb
Contributor Author

skip buildall

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 23, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@zclllyybb zclllyybb merged commit dcee505 into apache:master May 23, 2026
31 of 32 checks passed
@zclllyybb zclllyybb deleted the cow branch May 23, 2026 04:42
zclllyybb added a commit that referenced this pull request May 23, 2026
Follow-up to #63001: we changed the actual meaning of `assume_mutable`