Skip to content

Conversation

@Userwhite
Copy link
Contributor

@Userwhite Userwhite commented Jul 12, 2025

What problem does this PR solve?

Issue Number: close #53139

Problem Summary:

Release note

use stream load async return to optimize the performance under high concurrency, we treat send_reply as callback of exec_plan_framgnet .

    1. because pipeline may invoke callback when prepare failed , so we use is_prepare_success to ensure no dead lock at on_header.
    1. ctx-> _finish_send_reply ensure send_reply only be invoke one time and request is valid. becuase we can't decide if callback will send_reply again when load fail or load timeout.
    1. ctx-> _can_send_reply ensure send_reply only when on_header fail or handle finished, or sender will receive broken-pipe when meet data-quality error

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

曹俊辉 added 3 commits July 12, 2025 12:04
…e better http performance

* fix some bug for partition

* fix for thrift

* fix the thrift exit bug

* Revert "[feat](http): use async reply to provide better http performance

* ensure free order

* [Fix](stream-load) Fix stream load stuck under high concurrency (apache#36772)

When the concurrency of streamload exceeds the number of threads in the
remote scanner, streamload may get stuck. The reason is that the
libevent thread blocks and waits for streamload to complete, and when
there is no intersection between the tasks handled by the scanner thread
and the libevent thread, it gets stuck.
The solution is to convert the synchronous waiting tasks of libevent
into asynchronous execution by using callbacks in the streamload
executor thread.

See merge request: !740"
Revert commit d9e74efa762c8161a5ca3df4290bbd0ab896f1ef

See merge request: !745"
Revert commit 396cb2ec7e0b1a21bc0d7424c627f0d9321884bc
* fix missing ;


* fix stream load block


See merge request: !855
@hello-stephen
Copy link
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

}

int StreamLoadAction::on_header(HttpRequest* req) {
req->mark_send_reply();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这么改完之后 所有的请求都走async了?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

只有 StreamLoad 请求,另外只是提交 reply 是 async 的,客户端感知仍然是同步的。

}

private:
SendReplyType _send_reply_type = REPLY_SYNC;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个初始值是不是没啥用 因为 on_header 把它改写了

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

还有其他 HttpAction 会用到,他们需要默认是 SYNC 的。

gavinchou
gavinchou previously approved these changes Jul 14, 2025
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 14, 2025
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@dataroaring
Copy link
Contributor

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Jul 14, 2025
gavinchou
gavinchou previously approved these changes Jul 14, 2025
@Userwhite
Copy link
Contributor Author

run buildall

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 14, 2025
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

曹俊辉 added 4 commits July 15, 2025 10:55
…e better http performance

* fix some bug for partition

* fix for thrift

* fix the thrift exit bug

* Revert "[feat](http): use async reply to provide better http performance

* ensure free order

* [Fix](stream-load) Fix stream load stuck under high concurrency (apache#36772)

When the concurrency of streamload exceeds the number of threads in the
remote scanner, streamload may get stuck. The reason is that the
libevent thread blocks and waits for streamload to complete, and when
there is no intersection between the tasks handled by the scanner thread
and the libevent thread, it gets stuck.
The solution is to convert the synchronous waiting tasks of libevent
into asynchronous execution by using callbacks in the streamload
executor thread.

See merge request: !740"
Revert commit d9e74efa762c8161a5ca3df4290bbd0ab896f1ef

See merge request: !745"
Revert commit 396cb2ec7e0b1a21bc0d7424c627f0d9321884bc
* fix missing ;


* fix stream load block


See merge request: !855
@Userwhite
Copy link
Contributor Author

run buildall

@Userwhite
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 32001 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e3fd0ba4fd56f8e4109b3598db618f34d74a32a1, data reload: false

------ Round 1 ----------------------------------
q1	17755	4173	4099	4099
q2	2180	352	238	238
q3	10214	1309	767	767
q4	10230	867	333	333
q5	7510	2135	1851	1851
q6	199	176	143	143
q7	957	804	671	671
q8	9282	1429	1187	1187
q9	5335	4776	4644	4644
q10	7254	1805	1422	1422
q11	558	314	293	293
q12	688	740	613	613
q13	17825	3843	3133	3133
q14	291	297	271	271
q15	594	518	511	511
q16	677	690	617	617
q17	756	734	624	624
q18	6812	6335	6413	6335
q19	1107	975	604	604
q20	399	377	256	256
q21	3169	2537	2403	2403
q22	1069	1023	986	986
Total cold run time: 104861 ms
Total hot run time: 32001 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4089	4033	4060	4033
q2	348	373	322	322
q3	2118	2619	2263	2263
q4	1292	1722	1329	1329
q5	4136	4033	4070	4033
q6	207	170	132	132
q7	1928	2163	1863	1863
q8	2641	2403	2378	2378
q9	7204	7090	7096	7090
q10	2477	2733	2381	2381
q11	568	485	441	441
q12	728	802	623	623
q13	3667	3979	3544	3544
q14	294	315	271	271
q15	550	518	501	501
q16	678	697	623	623
q17	1151	1311	1409	1311
q18	8375	7757	7848	7757
q19	906	890	915	890
q20	2008	2101	1920	1920
q21	4936	4639	4273	4273
q22	1118	1029	964	964
Total cold run time: 51419 ms
Total hot run time: 48942 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 173277 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e3fd0ba4fd56f8e4109b3598db618f34d74a32a1, data reload: false

query5	5029	589	446	446
query6	332	227	206	206
query7	4221	462	275	275
query8	336	250	235	235
query9	8732	2612	2609	2609
query10	518	385	327	327
query11	15250	15042	14852	14852
query12	179	112	117	112
query13	1286	510	383	383
query14	6105	2968	2728	2728
query14_1	2656	2622	2681	2622
query15	199	196	176	176
query16	3196	478	440	440
query17	1077	674	561	561
query18	2186	449	334	334
query19	213	218	185	185
query20	128	116	114	114
query21	533	138	137	137
query22	3956	4119	3841	3841
query23	15953	15611	15354	15354
query23_1	15370	15347	15580	15347
query24	7436	1589	1207	1207
query24_1	1243	1216	1253	1216
query25	577	445	392	392
query26	1221	260	156	156
query27	2744	456	294	294
query28	4511	2190	2197	2190
query29	756	527	428	428
query30	343	232	216	216
query31	817	638	557	557
query32	76	75	64	64
query33	554	335	285	285
query34	898	890	540	540
query35	741	796	713	713
query36	871	914	843	843
query37	151	96	82	82
query38	2745	2790	2673	2673
query39	786	757	721	721
query39_1	728	718	717	717
query40	218	131	120	120
query41	69	66	62	62
query42	111	100	107	100
query43	462	441	457	441
query44	1334	751	751	751
query45	192	184	173	173
query46	859	972	634	634
query47	1411	1472	1314	1314
query48	319	336	250	250
query49	595	443	348	348
query50	664	289	217	217
query51	3785	3759	3814	3759
query52	109	109	100	100
query53	302	333	277	277
query54	310	280	262	262
query55	84	77	75	75
query56	302	305	310	305
query57	1041	1001	968	968
query58	291	264	252	252
query59	2179	2223	2196	2196
query60	346	330	313	313
query61	197	192	184	184
query62	411	374	324	324
query63	303	274	287	274
query64	4671	1411	1131	1131
query65	3823	3759	3693	3693
query66	1412	446	409	409
query67	15091	14741	14545	14545
query68	8277	1023	733	733
query69	499	343	315	315
query70	1007	942	927	927
query71	374	299	285	285
query72	6011	3404	3418	3404
query73	780	736	312	312
query74	8754	8811	8597	8597
query75	2886	2846	2483	2483
query76	3442	1061	677	677
query77	556	383	288	288
query78	9727	9935	9218	9218
query79	1246	921	605	605
query80	665	601	479	479
query81	518	266	232	232
query82	212	146	107	107
query83	266	267	254	254
query84	266	121	109	109
query85	907	508	470	470
query86	390	322	318	318
query87	2885	2860	2765	2765
query88	3272	2311	2316	2311
query89	388	350	335	335
query90	2084	164	155	155
query91	170	171	140	140
query92	73	68	63	63
query93	1122	912	563	563
query94	572	321	306	306
query95	589	339	362	339
query96	604	492	213	213
query97	2351	2389	2283	2283
query98	220	197	200	197
query99	620	576	494	494
Total cold run time: 256607 ms
Total hot run time: 173277 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 27.01 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e3fd0ba4fd56f8e4109b3598db618f34d74a32a1, data reload: false

query1	0.06	0.05	0.05
query2	0.17	0.05	0.05
query3	0.28	0.09	0.09
query4	1.61	0.11	0.12
query5	0.26	0.26	0.27
query6	1.16	0.67	0.66
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.59	0.50	0.50
query10	0.55	0.54	0.55
query11	0.19	0.11	0.12
query12	0.16	0.13	0.13
query13	0.62	0.60	0.60
query14	1.01	1.00	0.97
query15	0.82	0.81	0.80
query16	0.39	0.37	0.40
query17	0.98	1.00	1.06
query18	0.23	0.22	0.22
query19	1.88	1.71	1.81
query20	0.02	0.02	0.01
query21	15.45	0.30	0.13
query22	4.92	0.05	0.06
query23	16.15	0.29	0.10
query24	1.96	0.53	0.57
query25	0.09	0.07	0.05
query26	0.13	0.13	0.14
query27	0.05	0.04	0.05
query28	4.71	1.05	0.88
query29	12.56	3.92	3.13
query30	0.28	0.15	0.12
query31	2.82	0.61	0.38
query32	3.23	0.56	0.47
query33	2.96	3.02	3.03
query34	16.78	5.10	4.45
query35	4.45	4.44	4.49
query36	0.66	0.51	0.49
query37	0.13	0.07	0.06
query38	0.08	0.04	0.04
query39	0.06	0.03	0.03
query40	0.17	0.15	0.14
query41	0.11	0.04	0.03
query42	0.06	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 98.92 s
Total hot run time: 27.01 s

@doris-robot
Copy link

BE UT Coverage Report

Increment line coverage 11.11% (14/126) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.39% (18967/35526)
Line Coverage 39.27% (176015/448245)
Region Coverage 33.81% (136114/402617)
Branch Coverage 34.76% (58806/169183)

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.81% (119/134) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.20% (25077/34731)
Line Coverage 58.94% (263507/447052)
Region Coverage 53.77% (218743/406776)
Branch Coverage 55.37% (93979/169741)

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Stream Load Async Return Optimization

Thanks for this PR! The optimization concept is sound - making HTTP replies asynchronous should improve thread utilization under high concurrency. I've traced through the synchronization logic and verified it handles race conditions correctly.

Critical Issues (Please Fix Before Merge)

1. Typo in Variable Name (http_request.h:108)

std::future<bool> _http_reply_futrue = _http_reply_promise.get_future();
//                         ^^^^^^ should be "future"

2. Hardcoded 3-Second Timeout (http_request.cpp:175)

auto status = _http_reply_futrue.wait_for(std::chrono::seconds(3));

3 seconds is too aggressive for legitimate long-running loads. Consider:

  • Making it configurable via config::stream_load_async_reply_timeout_s
  • Default to something like 300 seconds

3. Debug Stack Trace Logging (http_request.cpp:159)

VLOG_NOTICE << "finish send reply, infos=" << infos
            << ", stack=" << get_stack_trace(); // temp locate problem
  • get_stack_trace() is expensive even when VLOG is disabled (string is still constructed)
  • Comment says "temp locate problem" - should be removed before merge

Minor Issues

  1. Duplicate error logging - Same error logged in both handle() (line 122) and _send_reply() (line 157)

  2. Missing finish_send_reply() in send_error() - http_channel.cpp calls finish_send_reply() in send_reply() and send_file() but not in send_error(). Is this intentional?

  3. Suggestion for is_prepare_success - Using shared_ptr<bool> works but std::atomic<bool> would make the intent clearer

  4. Consider adding comments - The synchronization state machine with _can_send_reply, _finish_send_reply, and the CV is correct but complex. Comments explaining the wait conditions would help future maintainers.


Thread Safety Analysis ✅

I traced through these scenarios and verified the logic is correct:

Scenario Result
Normal success path ✅ Reply sent via callback
on_header failure ✅ Reply sent immediately, handle() skipped
Fragment prepare failure ✅ Callback not invoked, handle() sends reply
Request timeout _finish_send_reply=true prevents use-after-free

Overall

Concept: ✅ Sound optimization approach
Implementation: ⚠️ Needs polish (typo, timeout, debug logging)
Thread Safety: ✅ Correct

Recommend addressing the 3 critical issues before merge.

@dataroaring
Copy link
Contributor

Follow-up on send_error() missing finish_send_reply():

After deeper investigation, this is NOT a bug. Here's why:

  1. StreamLoadAction inherits from HttpHandler (not HttpHandlerWithAuth)
  2. StreamLoadAction never calls send_error() - it uses its own _send_reply() method for all error paths
  3. mark_send_reply() is only called in StreamLoadAction::on_header(), marking it as REPLY_ASYNC
  4. Other handlers (like HttpHandlerWithAuth) that use send_error() never call mark_send_reply(), so they stay in default REPLY_SYNC mode
  5. In REPLY_SYNC mode, wait_finish_send_reply() returns immediately without waiting

Therefore, send_error() doesn't need finish_send_reply() because it's never called from an async reply context. The code is correct as-is.

@Userwhite
Copy link
Contributor Author

Userwhite commented Jan 10, 2026

is_prepare_success must use std::shared_ptr
s_prepare_success is freed after _exec_env->fragment_mgr()->exec_plan_fragment() is called,
but the exec_fragment callback still needs to access it.

@Userwhite
Copy link
Contributor Author

run buildall

@Userwhite
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 31474 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ebb3f8e2a46fa984729197459954c85a0b3acfb9, data reload: false

------ Round 1 ----------------------------------
q1	17612	4184	4039	4039
q2	2013	364	243	243
q3	10158	1337	722	722
q4	10243	884	329	329
q5	8140	2107	1853	1853
q6	247	177	138	138
q7	926	803	660	660
q8	9264	1369	1167	1167
q9	5124	4689	4656	4656
q10	6812	1829	1405	1405
q11	498	293	282	282
q12	759	737	572	572
q13	17761	3845	3104	3104
q14	291	299	273	273
q15	600	523	495	495
q16	673	685	625	625
q17	684	803	536	536
q18	7007	6285	6409	6285
q19	1218	979	620	620
q20	389	368	247	247
q21	3030	2505	2268	2268
q22	1037	1019	955	955
Total cold run time: 104486 ms
Total hot run time: 31474 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4114	4050	4041	4041
q2	332	388	310	310
q3	2098	2576	2199	2199
q4	1351	1794	1314	1314
q5	4116	4065	4152	4065
q6	203	175	134	134
q7	1901	1820	2085	1820
q8	2527	2512	2451	2451
q9	7216	7135	7159	7135
q10	2524	2742	2306	2306
q11	598	499	490	490
q12	738	748	602	602
q13	3726	4017	3364	3364
q14	320	334	274	274
q15	612	598	544	544
q16	649	676	622	622
q17	1155	1330	1402	1330
q18	7918	7903	8000	7903
q19	848	879	827	827
q20	2017	2043	2032	2032
q21	5055	4478	4330	4330
q22	1178	1047	982	982
Total cold run time: 51196 ms
Total hot run time: 49075 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 173067 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ebb3f8e2a46fa984729197459954c85a0b3acfb9, data reload: false

query5	4414	609	428	428
query6	321	246	235	235
query7	4217	461	255	255
query8	327	244	224	224
query9	8725	2651	2700	2651
query10	485	370	308	308
query11	15289	15068	14748	14748
query12	168	124	125	124
query13	1247	492	367	367
query14	6112	2977	2722	2722
query14_1	2634	2614	2641	2614
query15	204	192	170	170
query16	968	490	460	460
query17	1094	675	596	596
query18	2474	425	325	325
query19	227	220	197	197
query20	123	121	116	116
query21	214	138	118	118
query22	4067	4076	3982	3982
query23	15993	15546	15633	15546
query23_1	15384	15456	15369	15369
query24	7324	1564	1176	1176
query24_1	1189	1171	1180	1171
query25	525	478	396	396
query26	1245	263	154	154
query27	2764	454	283	283
query28	4599	2145	2137	2137
query29	761	552	440	440
query30	305	239	212	212
query31	816	623	548	548
query32	72	69	68	68
query33	529	335	281	281
query34	891	884	517	517
query35	735	784	681	681
query36	824	874	843	843
query37	124	94	83	83
query38	2779	2739	2599	2599
query39	782	748	733	733
query39_1	718	727	727	727
query40	219	133	116	116
query41	69	62	63	62
query42	107	105	106	105
query43	482	494	455	455
query44	1364	725	708	708
query45	189	182	174	174
query46	861	1000	598	598
query47	1423	1453	1408	1408
query48	357	332	252	252
query49	632	436	343	343
query50	635	272	215	215
query51	3817	3766	3745	3745
query52	108	110	96	96
query53	315	327	293	293
query54	309	281	269	269
query55	87	82	75	75
query56	309	306	313	306
query57	999	1027	957	957
query58	276	262	262	262
query59	2158	2153	2056	2056
query60	332	326	308	308
query61	197	189	187	187
query62	403	370	311	311
query63	300	276	277	276
query64	5029	1421	1121	1121
query65	3796	3724	3742	3724
query66	1447	437	317	317
query67	15239	14766	15880	14766
query68	4773	1028	717	717
query69	511	367	324	324
query70	1049	969	837	837
query71	370	310	282	282
query72	6098	3420	3437	3420
query73	742	727	301	301
query74	8873	8712	8657	8657
query75	2835	2824	2452	2452
query76	3455	1068	648	648
query77	519	383	279	279
query78	9794	9961	9200	9200
query79	951	903	595	595
query80	1127	584	486	486
query81	541	263	233	233
query82	409	157	111	111
query83	371	263	243	243
query84	260	109	99	99
query85	925	514	453	453
query86	396	328	326	326
query87	2905	2895	2781	2781
query88	3245	2233	2220	2220
query89	387	360	336	336
query90	1919	161	153	153
query91	177	169	140	140
query92	67	68	62	62
query93	1049	882	531	531
query94	637	319	297	297
query95	578	377	311	311
query96	584	478	209	209
query97	2362	2402	2314	2314
query98	228	212	200	200
query99	566	578	509	509
Total cold run time: 251146 ms
Total hot run time: 173067 ms

@hello-stephen
Copy link
Contributor

BE UT Coverage Report

Increment line coverage 11.29% (14/124) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.02% (18980/35797)
Line Coverage 39.09% (175899/449950)
Region Coverage 33.68% (136263/404617)
Branch Coverage 34.70% (58879/169701)

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.64% (117/132) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.87% (25849/34992)
Line Coverage 61.27% (274942/448716)
Region Coverage 56.20% (229696/408729)
Branch Coverage 58.03% (98793/170246)

Copy link
Contributor

@liaoxin01 liaoxin01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 14, 2026
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dataroaring dataroaring merged commit 3ec56d7 into apache:master Jan 14, 2026
29 of 30 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 14, 2026
…e under high concurrency (#53144)

Issue Number: close #53139

Co-authored-by: 曹俊辉 <[email protected]>
Co-authored-by: Yongqiang YANG <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/3.0.x dev/3.1.x dev/4.0.x reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement] use stream load async return to optimize the performance under high concurrency

7 participants