Skip to content

Conversation

@wenzhenghu
Copy link
Contributor

@wenzhenghu wenzhenghu commented Feb 6, 2026

What problem does this PR solve?

Issue Number: #59531

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring
Copy link
Contributor

Code review

Found 2 issues:

  1. Inverted is_disposable logic in beta_rowset_reader.cpp -- Missing ! negation. The old code was is_disposable = disable_file_cache, so when cache is disabled, data is disposable (correct). The new code is is_disposable = enable_file_cache_olap_tables, which marks data as disposable when cache is enabled (wrong). Should be !enable_file_cache_olap_tables, consistent with how file_scanner.cpp handles the same pattern.

_read_options.io_ctx.query_id = &_read_context->runtime_state->query_id();
_read_options.io_ctx.is_disposable =
_read_context->runtime_state->query_options().enable_file_cache_olap_tables;
}

  1. Wrong variable used in file_scanner.cpp -- FileScanner is used for external catalog scans (Hive, S3, etc.), but line 172 uses enable_file_cache_olap_tables instead of enable_file_cache_external_catalogs. Line 1815 in the same file correctly uses enable_file_cache_external_catalogs, confirming the variable at line 172 should match. This defeats the purpose of separating OLAP vs external catalog cache control.

_io_ctx->file_reader_stats = _file_reader_stats.get();
_io_ctx->is_disposable = !_state->query_options().enable_file_cache_olap_tables;

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@wenzhenghu
Copy link
Contributor Author

wenzhenghu commented Feb 9, 2026

Code review

Found 2 issues:

  1. Inverted is_disposable logic in beta_rowset_reader.cpp -- Missing ! negation. The old code was is_disposable = disable_file_cache, so when cache is disabled, data is disposable (correct). The new code is is_disposable = enable_file_cache_olap_tables, which marks data as disposable when cache is enabled (wrong). Should be !enable_file_cache_olap_tables, consistent with how file_scanner.cpp handles the same pattern.

_read_options.io_ctx.query_id = &_read_context->runtime_state->query_id();
_read_options.io_ctx.is_disposable =
_read_context->runtime_state->query_options().enable_file_cache_olap_tables;
}

  1. Wrong variable used in file_scanner.cpp -- FileScanner is used for external catalog scans (Hive, S3, etc.), but line 172 uses enable_file_cache_olap_tables instead of enable_file_cache_external_catalogs. Line 1815 in the same file correctly uses enable_file_cache_external_catalogs, confirming the variable at line 172 should match. This defeats the purpose of separating OLAP vs external catalog cache control.

_io_ctx->file_reader_stats = _file_reader_stats.get();
_io_ctx->is_disposable = !_state->query_options().enable_file_cache_olap_tables;

🤖 Generated with Claude Code

  • If this code review was useful, please react with 👍. Otherwise, react with 👎.

This feature is still under development and self-testing. Since the PR title can no longer be modified on GitHub, I wasn't able to add the draft label. I can request your review at the latest after the Spring Festival holiday. @dataroaring

@wenzhenghu wenzhenghu marked this pull request as draft February 9, 2026 01:06
@xuchenhao xuchenhao force-pushed the refactor_datacache_session_switchvar branch from 940d4db to 72f04db Compare February 11, 2026 05:52
@xuchenhao
Copy link
Contributor

run buildall

@hello-stephen
Copy link
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.32% (1795/2263)
Line Coverage 64.83% (31963/49300)
Region Coverage 65.56% (15951/24330)
Branch Coverage 56.04% (8476/15124)

@hello-stephen
Copy link
Contributor

FE UT Coverage Report

Increment line coverage 58.82% (10/17) 🎉
Increment coverage report
Complete coverage report

@xuchenhao
Copy link
Contributor

run buildall

@xuchenhao
Copy link
Contributor

run buildall

@hello-stephen
Copy link
Contributor

FE UT Coverage Report

Increment line coverage 70.59% (12/17) 🎉
Increment coverage report
Complete coverage report

@doris-robot
Copy link

BE UT Coverage Report

Increment line coverage 72.73% (8/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.77% (19474/36906)
Line Coverage 36.25% (181466/500550)
Region Coverage 32.62% (140687/431236)
Branch Coverage 33.66% (61017/181277)

@doris-robot
Copy link

TPC-H: Total hot run time: 30729 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3b59a3bc88b62728bd1b58199a075a23823e07c9, data reload: false

------ Round 1 ----------------------------------
q1	17618	4556	4313	4313
q2	2024	384	237	237
q3	10132	1375	738	738
q4	10200	786	312	312
q5	7494	2175	1961	1961
q6	198	190	151	151
q7	908	752	596	596
q8	9270	1496	1266	1266
q9	4732	4657	4640	4640
q10	6831	1953	1565	1565
q11	484	258	250	250
q12	343	398	230	230
q13	17780	4159	3228	3228
q14	230	240	214	214
q15	864	812	802	802
q16	673	713	632	632
q17	684	801	558	558
q18	6584	5826	6555	5826
q19	1138	1072	636	636
q20	584	539	417	417
q21	2795	2073	1906	1906
q22	347	305	251	251
Total cold run time: 101913 ms
Total hot run time: 30729 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4586	4558	4701	4558
q2	286	359	248	248
q3	2496	2855	2507	2507
q4	1502	1859	1464	1464
q5	4569	4451	4502	4451
q6	223	179	138	138
q7	2064	1953	1768	1768
q8	2586	2398	2390	2390
q9	7689	7604	7524	7524
q10	2906	3137	2586	2586
q11	501	448	420	420
q12	740	794	626	626
q13	3808	4387	3624	3624
q14	306	321	282	282
q15	876	792	793	792
q16	657	683	642	642
q17	1098	1284	1276	1276
q18	7666	7373	7389	7373
q19	867	809	815	809
q20	1947	2034	1855	1855
q21	4507	4256	4104	4104
q22	492	457	416	416
Total cold run time: 52372 ms
Total hot run time: 49853 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 190832 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3b59a3bc88b62728bd1b58199a075a23823e07c9, data reload: false

query5	4328	624	492	492
query6	351	230	212	212
query7	4224	494	273	273
query8	347	245	236	236
query9	8698	2737	2740	2737
query10	502	386	328	328
query11	16519	16410	16134	16134
query12	195	125	125	125
query13	1260	465	370	370
query14	6700	3340	3006	3006
query14_1	2868	2789	2773	2773
query15	200	190	177	177
query16	985	437	443	437
query17	1103	696	592	592
query18	2714	447	348	348
query19	212	213	192	192
query20	130	124	121	121
query21	216	151	126	126
query22	4981	5119	4840	4840
query23	19197	18513	18357	18357
query23_1	18450	18484	18361	18361
query24	7185	1612	1201	1201
query24_1	1208	1236	1245	1236
query25	557	457	419	419
query26	1240	267	159	159
query27	2749	465	292	292
query28	4489	1888	1886	1886
query29	839	558	471	471
query30	319	261	222	222
query31	912	745	659	659
query32	92	80	73	73
query33	560	347	296	296
query34	949	911	564	564
query35	660	685	609	609
query36	1126	1153	997	997
query37	151	102	92	92
query38	2967	2950	2914	2914
query39	944	924	896	896
query39_1	882	874	880	874
query40	221	142	159	142
query41	68	63	63	63
query42	102	98	101	98
query43	431	444	407	407
query44	1339	711	716	711
query45	199	192	187	187
query46	889	975	615	615
query47	2169	2207	2006	2006
query48	308	320	235	235
query49	594	444	353	353
query50	666	274	212	212
query51	4189	4130	4136	4130
query52	104	103	93	93
query53	294	325	275	275
query54	290	265	255	255
query55	92	87	80	80
query56	326	307	327	307
query57	1430	1409	1277	1277
query58	278	268	259	259
query59	2028	2159	2095	2095
query60	360	331	316	316
query61	150	141	148	141
query62	604	568	516	516
query63	294	287	267	267
query64	4912	1248	963	963
query65	4532	4403	4466	4403
query66	1382	453	337	337
query67	16592	16306	16218	16218
query68	2400	1073	713	713
query69	413	315	286	286
query70	1023	937	971	937
query71	324	322	297	297
query72	2948	2801	2499	2499
query73	515	560	324	324
query74	9675	9696	9573	9573
query75	2774	2760	2402	2402
query76	2323	1049	648	648
query77	365	386	296	296
query78	11434	11517	10860	10860
query79	1063	916	589	589
query80	1336	562	492	492
query81	574	270	260	260
query82	1026	151	117	117
query83	360	264	242	242
query84	249	125	101	101
query85	892	479	447	447
query86	414	306	301	301
query87	3161	3097	3024	3024
query88	3596	2681	2689	2681
query89	392	334	318	318
query90	1927	178	171	171
query91	163	160	132	132
query92	77	76	68	68
query93	1067	890	474	474
query94	695	327	302	302
query95	582	339	370	339
query96	635	512	228	228
query97	2495	2494	2430	2430
query98	225	215	209	209
query99	947	934	799	799
Total cold run time: 264325 ms
Total hot run time: 190832 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 28.99 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3b59a3bc88b62728bd1b58199a075a23823e07c9, data reload: false

query1	0.05	0.04	0.04
query2	0.10	0.05	0.05
query3	0.26	0.09	0.08
query4	1.60	0.12	0.10
query5	0.27	0.24	0.25
query6	1.17	0.69	0.66
query7	0.03	0.03	0.02
query8	0.05	0.04	0.04
query9	0.58	0.50	0.49
query10	0.55	0.55	0.55
query11	0.14	0.10	0.10
query12	0.14	0.10	0.10
query13	0.63	0.62	0.62
query14	1.06	1.06	1.06
query15	0.88	0.86	0.87
query16	0.41	0.40	0.39
query17	1.12	1.15	1.18
query18	0.23	0.21	0.22
query19	2.11	2.04	2.03
query20	0.02	0.01	0.02
query21	15.42	0.26	0.14
query22	5.25	0.05	0.04
query23	15.85	0.30	0.10
query24	0.98	0.24	0.61
query25	0.11	0.09	0.09
query26	0.15	0.14	0.13
query27	0.06	0.07	0.06
query28	4.04	1.16	0.97
query29	12.56	3.96	3.19
query30	0.31	0.13	0.11
query31	2.82	0.64	0.41
query32	3.24	0.59	0.50
query33	3.21	3.27	3.24
query34	16.15	5.33	4.72
query35	4.76	5.28	5.36
query36	0.68	0.55	0.56
query37	0.12	0.08	0.08
query38	0.07	0.04	0.04
query39	0.05	0.04	0.03
query40	0.20	0.17	0.17
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 97.59 s
Total hot run time: 28.99 s

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (11/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.70% (25928/36161)
Line Coverage 54.31% (271194/499311)
Region Coverage 51.59% (224748/435615)
Branch Coverage 53.18% (96782/181981)

@hello-stephen
Copy link
Contributor

FE Regression Coverage Report

Increment line coverage 64.71% (11/17) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants