@kaka11chen
Contributor

@kaka11chen kaka11chen commented Dec 3, 2025

What problem does this PR solve?

Related PR: #58124

Problem Summary:

Release note

[Opt] (multi-catalog) Optimize by avoiding building the name_to_index map every time.

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Copilot AI review requested due to automatic review settings December 3, 2025 08:31
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaka11chen
Contributor Author

run buildall

Copilot AI left a comment

Pull request overview

This pull request attempts to optimize file reading performance by avoiding repeated construction of name-to-index maps when processing blocks. However, the PR contains critical bugs that will prevent it from compiling and cause crashes at runtime.
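
To illustrate the optimization this overview refers to, here is a minimal sketch, not the PR's actual code. `MiniBlock`, `build_name_to_index`, and `CachedNameIndex` are hypothetical names introduced for illustration; the real Doris `Block` API is not used. The idea shown is to build the name-to-index map once and reuse it while the column layout is unchanged, rather than rebuilding it for every block.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a block: only the ordered column names matter here.
struct MiniBlock {
    std::vector<std::string> column_names;
};

using NameToIndex = std::unordered_map<std::string, uint32_t>;

// Rebuilding this map for every block is the repeated cost being avoided.
inline NameToIndex build_name_to_index(const MiniBlock& block) {
    NameToIndex result;
    for (uint32_t i = 0; i < block.column_names.size(); ++i) {
        result.emplace(block.column_names[i], i);
    }
    return result;
}

// Builds the map on first use and rebuilds only when the column layout changes.
class CachedNameIndex {
public:
    const NameToIndex& get(const MiniBlock& block) {
        if (!_valid || _cached_names != block.column_names) {
            _map = build_name_to_index(block);
            _cached_names = block.column_names;
            _valid = true;
            ++_rebuild_count;
        }
        return _map;
    }

    int rebuild_count() const { return _rebuild_count; }

private:
    bool _valid = false;
    std::vector<std::string> _cached_names;
    NameToIndex _map;
    int _rebuild_count = 0;
};
```

The layout comparison in `get` is itself linear in the number of columns. A real implementation would more likely invalidate the cache explicitly when the schema or file changes, which is the multi-file concern raised in the per-file summary.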

Key Issues:

  • Critical bugs in scanner_context.cpp: Debug code includes std::cout statements, an infinite loop (while(true)), and syntax errors from improper code commenting
  • Null pointer dereferences: Multiple readers use _col_name_to_block_idx without checking for null or properly initializing it
  • Incorrect API usage: std::unordered_map::erase() called with iterator pairs instead of individual keys
  • Missing variable declarations: Test code references undeclared variables

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 17 comments.

File	Description
be/src/vec/exec/scan/scanner_context.cpp	CRITICAL: Contains debug code, infinite loops, syntax errors, and commented-out critical concurrency logic
be/src/vec/exec/scan/file_scanner.cpp	Adds caching for name-to-index map, but insufficient validation for multi-file scenarios
be/src/vec/exec/jni_connector.h/cpp	Adds setter for map pointer and uses it in _fill_block without null checks
be/src/vec/exec/format/orc/vorc_reader.h/cpp	Updates init_reader signature but fails to assign the parameter to member variable
be/src/vec/exec/format/parquet/vparquet_reader.h/cpp	Properly propagates map pointer to group readers
be/src/vec/exec/format/parquet/vparquet_group_reader.h/cpp	Uses map pointer without null checks throughout
be/src/vec/exec/format/table/*_reader.h/cpp	Updates all table format reader signatures; some missing proper initialization
be/src/vec/exec/format/table/equality_delete.h/cpp	Updates filter methods to accept map pointer; no null checks
be/src/vec/exec/format/table/transactional_hive_*.cpp	Incorrect use of map erase() method with iterator pairs
be/src/vec/exec/format/table/iceberg_reader.cpp	Incorrect use of map erase() method; missing pointer assignment in init_reader
be/src/olap/push_handler.cpp	Passes nullptr for map pointer, will cause crashes
be/test/vec/exec/*.cpp	Test updates; some with missing variable declarations
Comments suppressed due to low confidence (1)

be/src/vec/exec/format/orc/vorc_reader.cpp:360

  • The col_name_to_block_idx parameter is accepted but never assigned to _col_name_to_block_idx member variable. This will cause all subsequent uses of _col_name_to_block_idx to access a nullptr. Add _col_name_to_block_idx = col_name_to_block_idx; after line 360.
Status OrcReader::init_reader(
        const std::vector<std::string>* column_names,
        std::unordered_map<std::string, uint32_t>* col_name_to_block_idx,
        const VExprContextSPtrs& conjuncts, bool is_acid, const TupleDescriptor* tuple_descriptor,
        const RowDescriptor* row_descriptor,
        const VExprContextSPtrs* not_single_slot_filter_conjuncts,
        const std::unordered_map<int, VExprContextSPtrs>* slot_id_to_filter_conjuncts,
        std::shared_ptr<TableSchemaChangeHelper::Node> table_info_node_ptr,
        const std::set<uint64_t>& column_ids, const std::set<uint64_t>& filter_column_ids) {
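
As a minimal sketch of the assignment this comment asks for, the snippet below stores the pointer during initialization. `SketchReader` is a hypothetical class, and it returns `bool` rather than a Doris `Status` so that it stays self-contained. Only the names `col_name_to_block_idx` and `_col_name_to_block_idx` come from the review. Rejecting a null argument here is an assumption, not something the PR is known to do.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

using NameToIndex = std::unordered_map<std::string, uint32_t>;

// Hypothetical reader; it only mirrors the member name mentioned in the review.
class SketchReader {
public:
    // Returns false instead of a Doris Status to keep the sketch self-contained.
    bool init_reader(NameToIndex* col_name_to_block_idx) {
        if (col_name_to_block_idx == nullptr) {
            return false;
        }
        // The assignment the review reports as missing.
        _col_name_to_block_idx = col_name_to_block_idx;
        return true;
    }

    bool is_initialized() const { return _col_name_to_block_idx != nullptr; }

private:
    NameToIndex* _col_name_to_block_idx = nullptr;
};
```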

Comment on lines 528 to 549
// // We need to add it back to task queue to make sure it could be resubmitted.
// if (current_scan_task) {
// // This usually happens when we should downgrade the concurrency.
// _pending_scanners.push(current_scan_task);
// VLOG_DEBUG << fmt::format(
// "{} push back scanner to task queue, because diff <= 0, task_queue size "
// "{}, _num_scheduled_scanners {}",
// ctx_id, _tasks_queue.size(), _num_scheduled_scanners);
// }

// #ifndef NDEBUG
// // This DCHECK is necessary.
// // We need to make sure each scan operator could have at least 1 scan tasks.
// // Or this scan operator will not be re-scheduled.
// if (!_pending_scanners.empty() && _num_scheduled_scanners == 0 && _tasks_queue.empty()) {
// throw doris::Exception(ErrorCode::INTERNAL_ERROR, "Scanner scheduler logical error.");
// }
// #endif

// return Status::OK();
// }

Copilot AI Dec 3, 2025

Large blocks of commented-out code (lines 523-548) should be removed. This includes critical logic for checking margins and concurrency limits that appears to have been disabled. If this is intentional, it needs proper documentation explaining why this critical logic was disabled.

Suggested change
// // We need to add it back to task queue to make sure it could be resubmitted.
// if (current_scan_task) {
// // This usually happens when we should downgrade the concurrency.
// _pending_scanners.push(current_scan_task);
// VLOG_DEBUG << fmt::format(
// "{} push back scanner to task queue, because diff <= 0, task_queue size "
// "{}, _num_scheduled_scanners {}",
// ctx_id, _tasks_queue.size(), _num_scheduled_scanners);
// }
// #ifndef NDEBUG
// // This DCHECK is necessary.
// // We need to make sure each scan operator could have at least 1 scan tasks.
// // Or this scan operator will not be re-scheduled.
// if (!_pending_scanners.empty() && _num_scheduled_scanners == 0 && _tasks_queue.empty()) {
// throw doris::Exception(ErrorCode::INTERNAL_ERROR, "Scanner scheduler logical error.");
// }
// #endif
// return Status::OK();
// }
// [Removed: previously disabled margin and concurrency check logic. If needed, see git history for details.]

Copilot uses AI. Check for mistakes.
Comment on lines +317 to +318
auto& column_with_type_and_name =
block->get_by_position(_col_name_to_block_idx->at(_column_names[i]));
Copilot AI Dec 3, 2025

Potential null pointer dereference. _col_name_to_block_idx is used without checking if it's nullptr. If set_col_name_to_block_idx() is not called before _fill_block(), this will crash. Add a null check or ensure initialization in the constructor.
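
One way to make such a lookup defensive is sketched below. `find_block_position` is a hypothetical helper, not part of the PR or of Doris. It guards against a null map pointer and also avoids the `std::out_of_range` exception that `at()` throws for a missing key.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

using NameToIndex = std::unordered_map<std::string, uint32_t>;

// Hypothetical helper: returns the block position for a column, or std::nullopt
// when the map pointer is null or the column is not present.
inline std::optional<uint32_t> find_block_position(const NameToIndex* col_name_to_block_idx,
                                                   const std::string& column_name) {
    if (col_name_to_block_idx == nullptr) {
        return std::nullopt;
    }
    auto it = col_name_to_block_idx->find(column_name);
    if (it == col_name_to_block_idx->end()) {
        return std::nullopt;
    }
    return it->second;
}
```

A caller could turn an empty result into an error status instead of crashing.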

Status SimpleEqualityDelete::filter_data_block(Block* data_block) {
Status SimpleEqualityDelete::filter_data_block(
Block* data_block, const std::unordered_map<std::string, uint32_t>* col_name_to_block_idx) {
SCOPED_TIMER(equality_delete_time);
Copilot AI Dec 3, 2025

Potential null pointer dereference. col_name_to_block_idx is used with ->at() without checking if it's nullptr. Add a null check before dereferencing.

Suggested change
SCOPED_TIMER(equality_delete_time);
SCOPED_TIMER(equality_delete_time);
if (col_name_to_block_idx == nullptr) {
return Status::InternalError("col_name_to_block_idx is nullptr in filter_data_block");
}

Comment on lines +1340 to +1342
auto col_ptr = block->get_by_position((*_col_name_to_block_idx)[kv.first])
.column->assume_mutable();
Copilot AI Dec 3, 2025

Potential null pointer dereference. _col_name_to_block_idx is dereferenced without null checking. If not properly initialized, this will crash. Add a null check or ensure it's always set before use.
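
A related hazard with the `(*ptr)[key]` form is that `operator[]` inserts a value-initialized entry when the key is missing. The sketch below contrasts it with `find()`. The function names are hypothetical, and nothing here is taken from the PR.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

using NameToIndex = std::unordered_map<std::string, uint32_t>;

// operator[] value-initializes a missing key, so an unknown name silently maps
// to position 0 and the map grows by one entry.
inline uint32_t lookup_with_subscript(NameToIndex& map, const std::string& name) {
    return map[name];
}

// find() leaves the map untouched and lets the caller detect a missing column.
inline bool lookup_with_find(const NameToIndex& map, const std::string& name, uint32_t* out) {
    auto it = map.find(name);
    if (it == map.end()) {
        return false;
    }
    *out = it->second;
    return true;
}
```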

Comment on lines 171 to 180
Status init_reader(
const std::vector<std::string>& file_col_names, const VExprContextSPtrs& conjuncts,
const TupleDescriptor* tuple_descriptor, const RowDescriptor* row_descriptor,
const std::vector<std::string>& file_col_names,
std::unordered_map<std::string, uint32_t>* col_name_to_block_idx,
const VExprContextSPtrs& conjuncts, const TupleDescriptor* tuple_descriptor,
const RowDescriptor* row_descriptor,
const std::unordered_map<std::string, int>* colname_to_slot_id,
const VExprContextSPtrs* not_single_slot_filter_conjuncts,
const std::unordered_map<int, VExprContextSPtrs>* slot_id_to_filter_conjuncts);
Copilot AI Dec 3, 2025

The signature of init_reader has been changed to include col_name_to_block_idx parameter, but the implementation in iceberg_reader.cpp may not have been updated accordingly. Additionally, the implementation needs to store this pointer in the _col_name_to_block_idx member variable to avoid null pointer dereferences in _expand_block_if_need and _shrink_block_if_need. Add _col_name_to_block_idx = col_name_to_block_idx; at the start of the init_reader implementation.

Comment on lines 131 to 132
_col_name_to_block_idx->erase(READ_ROW_COLUMN_NAMES_LOWER_CASE.begin(),
READ_ROW_COLUMN_NAMES_LOWER_CASE.end());
Copilot AI Dec 3, 2025

This erase() call is incorrect. The range overload of std::unordered_map::erase() takes a pair of iterators into the map itself, not iterators over a separate container of keys, so this code will fail to compile. Erase the keys individually in a loop or use a helper function. Example: for (const auto& name : READ_ROW_COLUMN_NAMES_LOWER_CASE) { _col_name_to_block_idx->erase(name); }

Suggested change
_col_name_to_block_idx->erase(READ_ROW_COLUMN_NAMES_LOWER_CASE.begin(),
READ_ROW_COLUMN_NAMES_LOWER_CASE.end());
for (const auto& name : READ_ROW_COLUMN_NAMES_LOWER_CASE) {
_col_name_to_block_idx->erase(name);
}
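
The suggested loop can be wrapped in a small helper, sketched here in self-contained form. `erase_keys` is a hypothetical name, and the key container is assumed to be a `std::vector<std::string>`, which the source does not confirm. It relies on `erase(key)` returning the number of elements removed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

using NameToIndex = std::unordered_map<std::string, uint32_t>;

// Erases each listed key and returns how many entries were actually removed.
// Keys that are absent from the map are skipped without error.
inline std::size_t erase_keys(NameToIndex& map, const std::vector<std::string>& keys) {
    std::size_t removed = 0;
    for (const auto& key : keys) {
        removed += map.erase(key);
    }
    return removed;
}
```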

Comment on lines +113 to +124
if (!col_name_to_block_idx->contains(column_name)) {
return Status::InternalError("Column '{}' not found in data block: {}", column_name,
data_block->dump_structure());
}
auto column_and_type =
data_block->safe_get_by_position(col_name_to_block_idx->at(column_name));
if (!delete_col.type->equals(*column_and_type.type)) {
return Status::InternalError(
"Not support type change in column '{}', src type: {}, target type: {}",
column_name, delete_col.type->get_name(), column_and_type.type->get_name());
}
_data_column_index[column_index++] = name_to_pos_map[column_name];
_data_column_index[column_index++] = col_name_to_block_idx->at(column_name);
Copilot AI Dec 3, 2025

Potential null pointer dereference. col_name_to_block_idx is used with ->contains() and ->at() without checking if it's nullptr. Add a null check before dereferencing.

Comment on lines 219 to 228
Status init_reader(
const std::vector<std::string>& file_col_names, const VExprContextSPtrs& conjuncts,
const TupleDescriptor* tuple_descriptor, const RowDescriptor* row_descriptor,
const std::vector<std::string>& file_col_names,
std::unordered_map<std::string, uint32_t>* col_name_to_block_idx,
const VExprContextSPtrs& conjuncts, const TupleDescriptor* tuple_descriptor,
const RowDescriptor* row_descriptor,
const std::unordered_map<std::string, int>* colname_to_slot_id,
const VExprContextSPtrs* not_single_slot_filter_conjuncts,
const std::unordered_map<int, VExprContextSPtrs>* slot_id_to_filter_conjuncts);
Copilot AI Dec 3, 2025

The signature of init_reader has been changed to include col_name_to_block_idx parameter, but the implementation needs to store this pointer in the _col_name_to_block_idx member variable. Add _col_name_to_block_idx = col_name_to_block_idx; at the start of the init_reader implementation to avoid null pointer dereferences.

positions_to_erase.emplace((*_col_name_to_block_idx)[expand_col]);
}
block->erase(positions_to_erase);
_col_name_to_block_idx->erase(_expand_col_names.begin(), _expand_col_names.end());
Copilot AI Dec 3, 2025

This erase() call is incorrect. The range overload of std::unordered_map::erase() takes a pair of iterators into the map itself, not iterators over a separate container of keys. Erase the keys individually in a loop instead. Example: for (const auto& name : _expand_col_names) { _col_name_to_block_idx->erase(name); }

Suggested change
_col_name_to_block_idx->erase(_expand_col_names.begin(), _expand_col_names.end());
for (const auto& name : _expand_col_names) {
_col_name_to_block_idx->erase(name);
}

Comment on lines 306 to 311
// VLOG_DEBUG << fmt::format(
// "ScannerContext {} get block from queue, task_queue size {}, current scan "
// "task remaing cached_block size {}, eos {}, scheduled tasks {}",
// ctx_id, _tasks_queue.size(), scan_task->cached_blocks.size(), scan_task->is_eos(),
// _num_scheduled_scanners);
else {
Copilot AI Dec 3, 2025

The VLOG_DEBUG statement has been commented out but the logic that follows has been changed from if (scan_task->cached_blocks.empty()) to else. This creates an else without a corresponding if block, which is a syntax error. The commented-out VLOG_DEBUG should either be removed entirely or the control flow should be fixed.

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from 826def9 to ce1ef37 Compare December 3, 2025 09:30
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from ce1ef37 to ccea5d6 Compare December 3, 2025 09:53
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from ccea5d6 to 79b5757 Compare December 3, 2025 10:40
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from 79b5757 to a157c5f Compare December 3, 2025 11:29
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from a157c5f to dd836e2 Compare December 3, 2025 13:58
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from dd836e2 to 998beee Compare December 3, 2025 14:22
@doris-robot

TPC-H: Total hot run time: 34338 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dd836e24c2d3644e6538f6dfc9c628e50c242702, data reload: false

------ Round 1 ----------------------------------
q1	17663	5101	4927	4927
q2	2054	328	204	204
q3	10228	1287	748	748
q4	10243	906	329	329
q5	7532	2410	2196	2196
q6	192	173	142	142
q7	938	798	628	628
q8	9352	1430	1174	1174
q9	6869	5359	5345	5345
q10	6834	2183	1764	1764
q11	524	300	297	297
q12	338	375	224	224
q13	17771	3620	3022	3022
q14	228	233	215	215
q15	596	506	509	506
q16	912	859	807	807
q17	664	800	524	524
q18	7431	7179	7027	7027
q19	1192	958	608	608
q20	359	343	226	226
q21	3962	3212	2493	2493
q22	1038	989	932	932
Total cold run time: 106920 ms
Total hot run time: 34338 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4981	4961	4917	4917
q2	347	419	317	317
q3	2150	2629	2281	2281
q4	1316	1740	1278	1278
q5	4195	4347	4516	4347
q6	228	184	135	135
q7	2070	1968	1853	1853
q8	2680	2485	2446	2446
q9	7530	7653	7579	7579
q10	3079	3271	2809	2809
q11	574	530	486	486
q12	693	763	658	658
q13	3532	3934	3429	3429
q14	295	314	282	282
q15	566	524	521	521
q16	903	939	881	881
q17	1129	1449	1444	1444
q18	8232	7631	7640	7631
q19	901	865	887	865
q20	1996	2052	1938	1938
q21	4610	4197	4096	4096
q22	1076	1058	991	991
Total cold run time: 53083 ms
Total hot run time: 51184 ms

@doris-robot

TPC-DS: Total hot run time: 182015 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dd836e24c2d3644e6538f6dfc9c628e50c242702, data reload: false

query5	5056	666	501	501
query6	334	239	224	224
query7	4694	519	316	316
query8	306	274	240	240
query9	8716	2658	2674	2658
query10	561	378	322	322
query11	15406	15050	14894	14894
query12	180	128	117	117
query13	1688	604	455	455
query14	6500	3260	3006	3006
query14_1	2914	2939	2907	2907
query15	215	199	181	181
query16	7643	690	533	533
query17	1245	810	617	617
query18	2023	416	318	318
query19	202	203	174	174
query20	128	122	120	120
query21	217	131	118	118
query22	3837	3943	3776	3776
query23	16575	16255	16001	16001
query23_1	16032	15930	16037	15930
query24	6918	1632	1207	1207
query24_1	1229	1186	1209	1186
query25	596	515	457	457
query26	1236	278	164	164
query27	2729	507	338	338
query28	4320	2193	2166	2166
query29	801	630	492	492
query30	313	245	214	214
query31	809	706	628	628
query32	78	74	76	74
query33	593	380	328	328
query34	854	879	542	542
query35	785	829	743	743
query36	878	899	851	851
query37	123	116	90	90
query38	3915	3936	3795	3795
query39	765	764	723	723
query39_1	690	694	694	694
query40	224	140	118	118
query41	71	77	61	61
query42	129	111	118	111
query43	440	462	420	420
query44	1319	755	755	755
query45	197	193	182	182
query46	899	1026	638	638
query47	1681	1756	1681	1681
query48	403	425	335	335
query49	795	491	402	402
query50	686	709	423	423
query51	3881	3848	3769	3769
query52	119	116	104	104
query53	237	258	198	198
query54	323	303	290	290
query55	94	93	92	92
query56	334	323	324	323
query57	1107	1179	1103	1103
query58	286	271	297	271
query59	2388	2504	2375	2375
query60	354	363	331	331
query61	165	162	153	153
query62	761	714	647	647
query63	227	193	190	190
query64	4509	1214	904	904
query65	4041	4007	3964	3964
query66	1096	455	335	335
query67	15140	15104	15030	15030
query68	4660	1018	649	649
query69	516	346	314	314
query70	1137	1048	1001	1001
query71	430	345	318	318
query72	5889	5011	5022	5011
query73	658	581	366	366
query74	8886	8826	8587	8587
query75	3042	3050	2574	2574
query76	3255	1160	793	793
query77	512	432	326	326
query78	9428	9679	8821	8821
query79	2013	839	577	577
query80	1681	581	499	499
query81	554	268	247	247
query82	425	141	112	112
query83	368	272	252	252
query84	273	117	92	92
query85	951	505	464	464
query86	386	282	298	282
query87	4159	4043	3938	3938
query88	3061	2326	2310	2310
query89	393	340	289	289
query90	1745	234	227	227
query91	180	178	141	141
query92	74	73	67	67
query93	1279	1039	685	685
query94	741	451	336	336
query95	501	409	407	407
query96	549	552	288	288
query97	2607	2683	2585	2585
query98	239	216	217	216
query99	1331	1373	1294	1294
Total cold run time: 265466 ms
Total hot run time: 182015 ms

@doris-robot

ClickBench: Total hot run time: 27.49 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dd836e24c2d3644e6538f6dfc9c628e50c242702, data reload: false

query1	0.05	0.04	0.05
query2	0.11	0.05	0.05
query3	0.26	0.09	0.09
query4	1.63	0.12	0.11
query5	0.27	0.27	0.27
query6	1.19	0.63	0.63
query7	0.04	0.03	0.02
query8	0.06	0.05	0.05
query9	0.57	0.51	0.49
query10	0.56	0.55	0.55
query11	0.16	0.11	0.11
query12	0.16	0.12	0.12
query13	0.63	0.63	0.61
query14	0.98	0.98	0.98
query15	0.80	0.80	0.80
query16	0.41	0.41	0.40
query17	1.04	1.04	1.06
query18	0.24	0.21	0.21
query19	1.86	1.71	1.77
query20	0.01	0.01	0.01
query21	15.43	0.29	0.14
query22	4.66	0.05	0.04
query23	16.09	0.27	0.10
query24	1.99	0.49	0.93
query25	0.08	0.09	0.09
query26	0.15	0.13	0.14
query27	0.06	0.05	0.07
query28	5.04	1.22	1.03
query29	12.57	4.00	3.16
query30	0.28	0.14	0.12
query31	2.81	0.64	0.39
query32	3.22	0.54	0.46
query33	2.97	3.03	3.04
query34	16.76	5.13	4.59
query35	4.56	4.58	4.53
query36	0.68	0.51	0.49
query37	0.10	0.07	0.07
query38	0.07	0.04	0.04
query39	0.05	0.03	0.03
query40	0.16	0.14	0.14
query41	0.08	0.03	0.02
query42	0.05	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 98.93 s
Total hot run time: 27.49 s

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from 998beee to 5f3512a Compare December 3, 2025 16:14
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34564 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5f3512a390bbb7e6578f149638ceffdaa4e556bb, data reload: false

------ Round 1 ----------------------------------
q1	17622	5121	4997	4997
q2	2008	320	206	206
q3	10246	1333	742	742
q4	10226	901	325	325
q5	7515	2418	2157	2157
q6	190	175	139	139
q7	964	798	634	634
q8	9350	1464	1156	1156
q9	7053	5436	5459	5436
q10	6832	2180	1791	1791
q11	519	320	292	292
q12	343	377	233	233
q13	17815	3697	3039	3039
q14	230	238	215	215
q15	601	526	509	509
q16	890	885	815	815
q17	693	842	526	526
q18	7523	7055	7097	7055
q19	1195	965	596	596
q20	366	355	231	231
q21	4026	3762	2550	2550
q22	1052	1022	920	920
Total cold run time: 107259 ms
Total hot run time: 34564 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5056	5009	4995	4995
q2	332	402	326	326
q3	2119	2820	2252	2252
q4	1302	1777	1303	1303
q5	4248	4527	4544	4527
q6	230	186	136	136
q7	2054	2049	1838	1838
q8	2651	2655	2533	2533
q9	7442	7458	7508	7458
q10	3161	3247	2807	2807
q11	596	531	484	484
q12	708	749	831	749
q13	3526	3918	3256	3256
q14	273	295	274	274
q15	566	514	507	507
q16	915	961	870	870
q17	1200	1411	1404	1404
q18	7810	7629	7542	7542
q19	914	871	893	871
q20	2017	2017	1915	1915
q21	4756	4276	4114	4114
q22	1103	1003	967	967
Total cold run time: 52979 ms
Total hot run time: 51128 ms

@doris-robot

TPC-DS: Total hot run time: 182412 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5f3512a390bbb7e6578f149638ceffdaa4e556bb, data reload: false

query5	4922	661	493	493
query6	344	243	235	235
query7	4654	515	311	311
query8	314	256	244	244
query9	8726	2646	2658	2646
query10	552	374	314	314
query11	15414	15110	14984	14984
query12	188	129	122	122
query13	1687	610	455	455
query14	6555	3420	3125	3125
query14_1	3060	3034	2997	2997
query15	215	200	182	182
query16	7697	703	538	538
query17	1225	794	640	640
query18	2046	438	355	355
query19	222	208	189	189
query20	130	128	122	122
query21	226	138	117	117
query22	3861	4201	3874	3874
query23	16672	16032	15799	15799
query23_1	15995	15866	15933	15866
query24	6871	1651	1228	1228
query24_1	1218	1205	1242	1205
query25	660	558	488	488
query26	775	282	179	179
query27	2721	513	349	349
query28	4329	2187	2180	2180
query29	762	651	518	518
query30	324	250	223	223
query31	848	721	620	620
query32	86	79	73	73
query33	604	402	348	348
query34	838	893	587	587
query35	796	832	741	741
query36	890	935	841	841
query37	133	116	95	95
query38	3809	3840	3794	3794
query39	752	741	715	715
query39_1	721	698	696	696
query40	225	133	127	127
query41	66	62	64	62
query42	127	112	114	112
query43	430	445	405	405
query44	1332	766	759	759
query45	203	194	187	187
query46	906	1005	647	647
query47	1665	1741	1625	1625
query48	411	455	331	331
query49	742	505	423	423
query50	693	697	421	421
query51	3925	3926	3926	3926
query52	117	115	104	104
query53	250	264	200	200
query54	315	303	279	279
query55	104	98	95	95
query56	335	329	350	329
query57	1132	1152	1120	1120
query58	292	278	316	278
query59	2347	2462	2333	2333
query60	355	354	338	338
query61	168	157	162	157
query62	802	719	666	666
query63	238	199	211	199
query64	3635	1242	911	911
query65	4058	4007	3974	3974
query66	918	458	334	334
query67	15174	15110	14712	14712
query68	8391	1001	641	641
query69	541	353	313	313
query70	1068	1049	1051	1049
query71	492	345	321	321
query72	5975	4934	4893	4893
query73	738	592	359	359
query74	8867	8812	8640	8640
query75	3662	3045	2559	2559
query76	3755	1166	762	762
query77	819	447	320	320
query78	9332	9564	8832	8832
query79	2236	862	605	605
query80	661	615	499	499
query81	537	281	241	241
query82	472	146	113	113
query83	273	274	248	248
query84	255	120	98	98
query85	956	504	462	462
query86	407	309	273	273
query87	4037	4077	3923	3923
query88	4183	2330	2286	2286
query89	388	341	301	301
query90	1906	231	225	225
query91	167	179	141	141
query92	77	73	70	70
query93	1946	1055	672	672
query94	719	464	350	350
query95	519	424	420	420
query96	539	563	286	286
query97	2651	2675	2605	2605
query98	246	221	213	213
query99	1382	1384	1264	1264
Total cold run time: 270579 ms
Total hot run time: 182412 ms

@doris-robot

ClickBench: Total hot run time: 27.29 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5f3512a390bbb7e6578f149638ceffdaa4e556bb, data reload: false

query1	0.05	0.05	0.04
query2	0.11	0.05	0.05
query3	0.25	0.09	0.08
query4	1.62	0.13	0.11
query5	0.27	0.28	0.25
query6	1.15	0.64	0.66
query7	0.03	0.02	0.02
query8	0.05	0.04	0.04
query9	0.55	0.51	0.51
query10	0.57	0.58	0.56
query11	0.17	0.10	0.10
query12	0.16	0.13	0.11
query13	0.63	0.62	0.60
query14	1.02	0.99	1.00
query15	0.81	0.80	0.81
query16	0.41	0.42	0.39
query17	1.08	1.07	1.04
query18	0.23	0.21	0.22
query19	1.91	1.86	1.81
query20	0.02	0.01	0.01
query21	15.43	0.28	0.13
query22	4.70	0.05	0.05
query23	15.99	0.28	0.10
query24	2.64	0.59	0.23
query25	0.11	0.06	0.07
query26	0.16	0.13	0.13
query27	0.06	0.06	0.07
query28	5.15	1.20	1.02
query29	12.62	4.06	3.25
query30	0.28	0.14	0.13
query31	2.83	0.62	0.38
query32	3.25	0.56	0.48
query33	3.10	3.06	3.03
query34	16.99	5.16	4.51
query35	4.54	4.52	4.57
query36	0.65	0.52	0.49
query37	0.11	0.07	0.06
query38	0.07	0.04	0.04
query39	0.04	0.03	0.03
query40	0.16	0.14	0.14
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.03
Total cold run time: 100.14 s
Total hot run time: 27.29 s

@doris-robot

BE UT Coverage Report

Increment line coverage 17.73% (36/203) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.35% (18640/34937)
Line Coverage 39.01% (172092/441121)
Region Coverage 33.62% (133187/396192)
Branch Coverage 34.59% (57340/165789)

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from 5f3512a to 0488d41 Compare December 4, 2025 00:49
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34296 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0488d41413c85e45223e3dee694cf5920d111f13, data reload: false

------ Round 1 ----------------------------------
q1	17719	5136	4900	4900
q2	2262	315	215	215
q3	10242	1312	724	724
q4	10221	832	322	322
q5	7504	2385	2148	2148
q6	181	166	141	141
q7	949	782	656	656
q8	9363	1397	1115	1115
q9	7113	5342	5268	5268
q10	7089	2202	1760	1760
q11	651	307	292	292
q12	371	379	233	233
q13	17809	3656	3070	3070
q14	231	247	218	218
q15	604	523	515	515
q16	882	871	793	793
q17	672	760	586	586
q18	7771	7006	7060	7006
q19	1161	938	600	600
q20	350	357	230	230
q21	3953	3652	2547	2547
q22	1068	1002	957	957
Total cold run time: 108166 ms
Total hot run time: 34296 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4958	4930	4892	4892
q2	316	395	318	318
q3	2131	2635	2312	2312
q4	1313	1761	1287	1287
q5	4293	4558	4464	4464
q6	233	187	135	135
q7	2063	2014	1751	1751
q8	2690	2555	2510	2510
q9	7621	7652	7541	7541
q10	3029	3331	2824	2824
q11	595	530	475	475
q12	673	741	658	658
q13	3722	3820	3315	3315
q14	291	297	285	285
q15	552	508	504	504
q16	956	918	895	895
q17	1175	1400	1459	1400
q18	7733	7675	7611	7611
q19	869	846	850	846
q20	2008	2063	1912	1912
q21	4627	4288	4100	4100
q22	1081	1040	984	984
Total cold run time: 52929 ms
Total hot run time: 51019 ms

@doris-robot

TPC-DS: Total hot run time: 181985 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0488d41413c85e45223e3dee694cf5920d111f13, data reload: false

query5	5190	648	494	494
query6	347	254	250	250
query7	4697	518	317	317
query8	305	271	253	253
query9	8714	2634	2649	2634
query10	540	385	324	324
query11	15601	15309	14628	14628
query12	224	118	119	118
query13	1697	586	473	473
query14	6699	3310	3062	3062
query14_1	2962	3037	2932	2932
query15	212	200	196	196
query16	7703	693	526	526
query17	1234	783	647	647
query18	2040	438	354	354
query19	224	219	204	204
query20	129	135	123	123
query21	428	137	114	114
query22	4030	4012	3931	3931
query23	16573	16189	16044	16044
query23_1	16240	15929	15948	15929
query24	6800	1673	1209	1209
query24_1	1231	1229	1259	1229
query25	650	559	498	498
query26	1284	291	190	190
query27	2659	502	353	353
query28	4306	2171	2157	2157
query29	815	657	534	534
query30	327	245	219	219
query31	848	717	660	660
query32	91	79	76	76
query33	642	406	337	337
query34	868	893	587	587
query35	811	825	732	732
query36	903	960	842	842
query37	147	114	92	92
query38	3880	3914	3721	3721
query39	737	727	720	720
query39_1	701	712	688	688
query40	228	130	121	121
query41	68	62	62	62
query42	136	119	113	113
query43	453	440	423	423
query44	1359	755	792	755
query45	200	192	186	186
query46	929	1028	670	670
query47	1688	1714	1609	1609
query48	401	444	322	322
query49	788	505	394	394
query50	705	714	406	406
query51	3851	3925	3872	3872
query52	115	121	113	113
query53	241	263	198	198
query54	325	297	278	278
query55	101	100	94	94
query56	340	345	332	332
query57	1128	1165	1084	1084
query58	281	315	267	267
query59	2300	2432	2361	2361
query60	354	352	329	329
query61	161	157	155	155
query62	788	704	658	658
query63	232	198	199	198
query64	4437	1183	889	889
query65	4061	3977	3968	3968
query66	1094	432	346	346
query67	15126	15073	14800	14800
query68	8431	1019	639	639
query69	522	364	313	313
query70	1102	1016	1009	1009
query71	484	362	343	343
query72	5522	4855	4758	4758
query73	723	583	345	345
query74	8838	8765	8608	8608
query75	3644	3023	2547	2547
query76	3825	1152	737	737
query77	809	421	319	319
query78	9526	9654	8920	8920
query79	2196	834	585	585
query80	650	581	493	493
query81	532	272	231	231
query82	481	144	116	116
query83	274	268	251	251
query84	262	119	99	99
query85	912	489	440	440
query86	396	300	300	300
query87	4084	4078	3957	3957
query88	4121	2286	2280	2280
query89	393	326	296	296
query90	1981	225	212	212
query91	171	174	139	139
query92	87	73	67	67
query93	1910	1018	686	686
query94	745	467	345	345
query95	512	406	409	406
query96	548	561	285	285
query97	2616	2691	2577	2577
query98	245	221	220	220
query99	1332	1371	1241	1241
Total cold run time: 273484 ms
Total hot run time: 181985 ms

@kaka11chen kaka11chen force-pushed the opt_name_to_index_map_cost branch from 0488d41 to 73fc308 Compare December 4, 2025 07:48
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 35399 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 73fc308da83ac6f3fbb18a8b6e9e88bd8400983c, data reload: false

------ Round 1 ----------------------------------
q1	17668	5333	5057	5057
q2	2044	317	197	197
q3	10254	1444	744	744
q4	10203	839	314	314
q5	7554	2698	2357	2357
q6	212	197	142	142
q7	1029	850	643	643
q8	9364	1516	1213	1213
q9	7005	5586	5525	5525
q10	6909	2215	1792	1792
q11	546	337	281	281
q12	359	406	222	222
q13	17785	3732	3064	3064
q14	233	234	204	204
q15	602	519	508	508
q16	904	878	811	811
q17	678	840	610	610
q18	7994	7215	7207	7207
q19	1086	1085	630	630
q20	391	361	223	223
q21	4150	3881	2708	2708
q22	1044	1032	947	947
Total cold run time: 108014 ms
Total hot run time: 35399 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5229	5177	5165	5165
q2	357	413	325	325
q3	2149	2733	2318	2318
q4	1352	1811	1320	1320
q5	4631	4559	4610	4559
q6	265	194	135	135
q7	2093	1940	1835	1835
q8	2982	2732	2766	2732
q9	7603	7577	7517	7517
q10	3189	3313	2881	2881
q11	706	583	498	498
q12	703	780	589	589
q13	3528	4194	3254	3254
q14	285	306	289	289
q15	591	533	505	505
q16	1033	960	896	896
q17	1224	1591	1538	1538
q18	7861	7681	7933	7681
q19	960	873	939	873
q20	2006	2120	1954	1954
q21	5003	4319	4170	4170
q22	1129	1036	984	984
Total cold run time: 54879 ms
Total hot run time: 52018 ms

@doris-robot

TPC-DS: Total hot run time: 179889 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 73fc308da83ac6f3fbb18a8b6e9e88bd8400983c, data reload: false

query5	4853	616	480	480
query6	338	221	209	209
query7	4659	496	298	298
query8	320	245	240	240
query9	8737	2626	2643	2626
query10	558	320	286	286
query11	15328	14883	14811	14811
query12	177	125	122	122
query13	1697	501	387	387
query14	6322	3319	3135	3135
query14_1	2955	2920	2987	2920
query15	216	205	183	183
query16	7682	514	486	486
query17	1224	733	616	616
query18	2064	440	355	355
query19	217	205	170	170
query20	134	120	121	120
query21	225	136	119	119
query22	4004	4061	3889	3889
query23	16646	16120	15906	15906
query23_1	16208	15934	16072	15934
query24	7233	1644	1204	1204
query24_1	1210	1197	1227	1197
query25	634	515	451	451
query26	1275	285	187	187
query27	2872	475	323	323
query28	4399	2188	2183	2183
query29	830	584	475	475
query30	324	243	215	215
query31	829	720	629	629
query32	84	72	71	71
query33	673	364	330	330
query34	855	925	544	544
query35	794	811	732	732
query36	891	914	840	840
query37	118	96	78	78
query38	3872	3787	3752	3752
query39	762	728	728	728
query39_1	707	721	699	699
query40	237	131	133	131
query41	64	63	75	63
query42	123	99	104	99
query43	430	431	399	399
query44	1352	766	759	759
query45	203	188	187	187
query46	892	978	579	579
query47	1688	1729	1619	1619
query48	391	325	241	241
query49	764	436	365	365
query50	698	303	234	234
query51	3850	3933	3866	3866
query52	108	95	85	85
query53	234	229	177	177
query54	342	256	237	237
query55	97	83	76	76
query56	325	289	285	285
query57	1190	1155	1088	1088
query58	304	259	250	250
query59	2317	2332	2255	2255
query60	354	328	332	328
query61	167	164	158	158
query62	795	686	613	613
query63	228	170	178	170
query64	4464	1177	882	882
query65	4052	3933	3986	3933
query66	1176	449	336	336
query67	15131	14785	14655	14655
query68	8275	933	681	681
query69	545	303	270	270
query70	1104	1004	983	983
query71	443	286	278	278
query72	5978	4942	5035	4942
query73	708	576	303	303
query74	8708	8930	8647	8647
query75	3217	3025	2460	2460
query76	3538	1153	763	763
query77	729	412	305	305
query78	9466	9690	8814	8814
query79	1841	862	601	601
query80	703	562	466	466
query81	511	270	236	236
query82	228	127	104	104
query83	296	282	261	261
query84	266	123	97	97
query85	949	506	459	459
query86	382	308	282	282
query87	4034	4038	3984	3984
query88	4348	2176	2165	2165
query89	403	324	281	281
query90	2043	171	168	168
query91	176	168	146	146
query92	79	72	61	61
query93	1892	1084	684	684
query94	752	329	287	287
query95	566	340	338	338
query96	537	512	213	213
query97	2594	2668	2600	2600
query98	256	211	197	197
query99	1383	1310	1205	1205
Total cold run time: 271994 ms
Total hot run time: 179889 ms

@doris-robot

ClickBench: Total hot run time: 27.84 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 73fc308da83ac6f3fbb18a8b6e9e88bd8400983c, data reload: false

query1	0.05	0.06	0.05
query2	0.11	0.05	0.04
query3	0.25	0.08	0.08
query4	1.61	0.12	0.11
query5	0.27	0.25	0.26
query6	1.15	0.63	0.62
query7	0.03	0.02	0.03
query8	0.06	0.04	0.04
query9	0.58	0.52	0.50
query10	0.57	0.54	0.54
query11	0.16	0.11	0.11
query12	0.15	0.11	0.11
query13	0.63	0.60	0.62
query14	0.99	0.99	0.98
query15	0.82	0.80	0.81
query16	0.40	0.42	0.39
query17	1.06	1.07	1.08
query18	0.23	0.22	0.22
query19	1.89	1.89	1.87
query20	0.01	0.02	0.01
query21	15.43	0.28	0.13
query22	4.77	0.05	0.05
query23	15.88	0.29	0.10
query24	1.32	0.58	0.59
query25	0.09	0.07	0.10
query26	0.14	0.13	0.13
query27	0.07	0.06	0.05
query28	4.97	1.21	1.03
query29	12.64	4.00	3.34
query30	0.28	0.13	0.14
query31	2.82	0.63	0.41
query32	3.23	0.56	0.46
query33	3.04	3.11	3.02
query34	16.83	5.22	4.52
query35	4.59	4.58	4.59
query36	0.66	0.49	0.50
query37	0.12	0.06	0.06
query38	0.08	0.04	0.04
query39	0.05	0.04	0.03
query40	0.16	0.14	0.12
query41	0.10	0.04	0.03
query42	0.05	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 98.38 s
Total hot run time: 27.84 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 17.14% (36/210) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.35% (18633/34929)
Line Coverage 39.01% (172079/441148)
Region Coverage 33.63% (133281/396339)
Branch Coverage 34.58% (57350/165838)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 28.10% (59/210) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.14% (24709/34251)
Line Coverage 58.85% (259404/440769)
Region Coverage 53.88% (216161/401159)
Branch Coverage 55.35% (92290/166737)

range);
init_status = ((AvroJNIReader*)(_cur_reader.get()))->init_reader();
// Set col_name_to_block_idx for JNI readers to avoid repeated map creation
if (_cur_reader) {

This if (_cur_reader) is unnecessary.
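
A minimal sketch of why the guard is redundant, assuming `create_unique` behaves like `std::make_unique` (it either returns a valid object or throws). `SketchReader`, `setup_reader`, and the setter name `set_col_name_to_block_idx` are hypothetical stand-ins for illustration, not the actual Doris classes or signatures. The pointer is already dereferenced by the `init_reader()` call on the preceding line, so a later null check cannot protect anything.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

using NameToIndex = std::unordered_map<std::string, unsigned>;

// Hypothetical stand-in for a file reader; not the real Doris class.
struct SketchReader {
    // Assumption: mirrors a factory that never returns a null pointer.
    static std::unique_ptr<SketchReader> create_unique() {
        return std::make_unique<SketchReader>();
    }

    bool init_reader() {
        initialized = true;
        return true;
    }

    // Hypothetical setter: stores a non-owning pointer to a prebuilt map.
    void set_col_name_to_block_idx(const NameToIndex* idx) { col_idx = idx; }

    bool initialized = false;
    const NameToIndex* col_idx = nullptr;
};

// Creates and initializes a reader, then hands it the prebuilt map.
bool setup_reader(std::unique_ptr<SketchReader>& cur, const NameToIndex& idx) {
    cur = SketchReader::create_unique();
    // This call already dereferences `cur`; if it were null, the program
    // would have failed here, before any `if (cur)` check could run.
    bool ok = cur->init_reader();
    // So the setter can be called unconditionally.
    cur->set_col_name_to_block_idx(&idx);
    return ok;
}
```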

_cur_reader =
RemoteDorisReader::create_unique(_file_slot_descs, _state, _profile, range);
init_status = ((RemoteDorisReader*)(_cur_reader.get()))->init_reader();
if (_cur_reader) {

ditto
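
For context, the code comment in the first snippet under review describes setting a column-name-to-block-index map on the reader to avoid repeated map creation. The general pattern can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `build_name_to_index`, `SketchBlockReader`, and `find_column` are hypothetical names, and the real reader interface may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

using NameToIndex = std::unordered_map<std::string, std::size_t>;

// Builds the name -> position map from a list of column names.
// Calling this for every block is the cost the optimization avoids.
NameToIndex build_name_to_index(const std::vector<std::string>& column_names) {
    NameToIndex result;
    result.reserve(column_names.size());
    for (std::size_t i = 0; i < column_names.size(); ++i) {
        result.emplace(column_names[i], i);
    }
    return result;
}

// Hypothetical reader that borrows a map built once by its owner.
class SketchBlockReader {
public:
    // Stores a non-owning pointer; the caller keeps the map alive.
    void set_col_name_to_block_idx(const NameToIndex* idx) { _idx = idx; }

    // Returns the column position, or -1 if the map is unset or the
    // name is unknown.
    long find_column(const std::string& name) const {
        if (_idx == nullptr) {
            return -1;
        }
        auto it = _idx->find(name);
        return it == _idx->end() ? -1 : static_cast<long>(it->second);
    }

private:
    const NameToIndex* _idx = nullptr;  // not owned
};
```

In this sketch the owner builds the map once, passes its address to each reader it creates, and every subsequent lookup reuses the same map instead of rebuilding it per block.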

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 8, 2025
@github-actions
Contributor

github-actions bot commented Dec 8, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Dec 8, 2025

PR approved by anyone and no changes requested.

@morningman morningman merged commit e6b2370 into apache:master Dec 10, 2025
29 of 32 checks passed
nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
kaka11chen added a commit to kaka11chen/doris that referenced this pull request Dec 29, 2025
yiguolei pushed a commit that referenced this pull request Dec 30, 2025
…map every time. (#59453)

### What problem does this PR solve?


Problem Summary:

### Release note

Cherry pick #58679 

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.3-merged reviewed

6 participants