Fix reentrant assert failures in session resumption with custom executions#5844

Open

leikong wants to merge 7 commits into main from kong/fix.retrentrant.bugcheck.in.sesstion.resumption

Conversation

@leikong
Contributor

@leikong leikong commented Mar 4, 2026

When MsQuicLib.CustomExecutions is enabled, all MsQuicSetParam calls execute inline with Connection->State.InlineApiExecution = TRUE (see api.c:1666). For QUIC_PARAM_CONN_RESUMPTION_TICKET, this triggers a debug assertion failure at connection.c:693:

CXPLAT_DBG_ASSERT(!Connection->State.InlineApiExecution || Connection->State.HandleClosed);

The assertion fires because QuicConnIndicateEvent is called while InlineApiExecution is TRUE and HandleClosed is FALSE. Two distinct call paths trigger this:

Path 1: Via streams-available indication

MsQuicSetParam(QUIC_PARAM_CONN_RESUMPTION_TICKET)
  -> InlineApiExecution = TRUE
    -> QuicConnProcessPeerTransportParameters(FromResumptionTicket=TRUE)
      -> QuicStreamSetInitializeTransportParameters(FlushIfUnblocked=FALSE)
        -> QuicStreamSetIndicateStreamsAvailable
          -> QuicConnIndicateEvent  <-- ASSERT

Path 2: Via datagram state change

MsQuicSetParam(QUIC_PARAM_CONN_RESUMPTION_TICKET)
  -> InlineApiExecution = TRUE
    -> QuicConnProcessPeerTransportParameters(FromResumptionTicket=TRUE)
      -> QuicDatagramOnSendStateChanged
        -> QuicConnIndicateEvent  <-- ASSERT

Fix

Relax the assertion so that it does not check app-callback reentrancy when custom execution is enabled, as suggested by @guhetier.

leikong added 2 commits March 4, 2026 08:33
…cutions

When MsQuicLib.CustomExecutions is true, SetParam calls execute inline with
InlineApiExecution = TRUE. For QUIC_PARAM_CONN_RESUMPTION_TICKET, the call
chain reaches QuicStreamSetIndicateStreamsAvailable -> QuicConnIndicateEvent,
which trips the reentrancy assertion.

Since the resumption ticket is set before ConnectionStart, there are no streams
yet, so the STREAMS_AVAILABLE event is meaningless. Gate the indication on
FlushIfUnblocked (which is already FALSE for the resumption ticket path),
consistent with the existing flush guard.
@leikong leikong requested a review from a team as a code owner March 4, 2026 17:02
@leikong leikong changed the title Fix reentrant assertion failures when enabling session resumption with custom executions Fix reentrant assert failures when enabling session resumption with custom executions Mar 4, 2026
@leikong leikong changed the title Fix reentrant assert failures when enabling session resumption with custom executions Fix reentrant assert failures in session resumption with custom executions Mar 4, 2026
@codecov

codecov bot commented Mar 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.63%. Comparing base (9b5b9df) to head (b2c600d).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5844      +/-   ##
==========================================
- Coverage   86.24%   85.63%   -0.61%     
==========================================
  Files          60       60              
  Lines       18732    18732              
==========================================
- Hits        16156    16042     -114     
- Misses       2576     2690     +114     


Collaborator

@guhetier guhetier left a comment


I am not sure we can simply skip those notifications.

With 0-RTT, it should be possible for the app to queue streams and datagrams before the connection is started, and it seems reasonable to me that the app would rely on the STREAMS_AVAILABLE and DATAGRAM_STATE_CHANGED notifications to know what to queue as a reaction to setting the resumption ticket.

The assert is meant to avoid re-entrant notifications. In your scenario, the problem is not a re-entrant notification AFAIU. Can we relax the assertion instead? Or is the assertion reporting a real issue here?

@leikong
Contributor Author

leikong commented Mar 5, 2026

I am not sure we can simply skip those notifications.

With 0-RTT, it should be possible for the app to queue streams and datagrams before the connection is started, and it seems reasonable to me that the app would rely on the STREAMS_AVAILABLE and DATAGRAM_STATE_CHANGED notifications to know what to queue as a reaction to setting the resumption ticket.

The assert is meant to avoid re-entrant notifications. In your scenario, the problem is not a re-entrant notification AFAIU. Can we relax the assertion instead? Or is the assertion reporting a real issue here?

Below are the two assert-failure stacks; I don't see callback reentrancy. Please verify as well, thanks.

Stack with QuicStreamSetIndicateStreamsAvailable:

#5  0x00007edfac619ded in quic_bugcheck (File=0x7edfac95b400 <str> "../../src/ext/msquic/src/core/connection.c", Line=695, Expr=0x7edfac95c3c0 <str> "!Connection->State.InlineApiExecution || Connection->State.HandleClosed") at ../../src/ext/msquic/src/platform/platform_posix.c:93
#6  0x00007edfac542c93 in QuicConnIndicateEvent (Connection=0x7cefab684510, Event=0x7adfa13aaca0) at ../../src/ext/msquic/src/core/connection.c:693
#7  0x00007edfac5132ad in QuicStreamSetIndicateStreamsAvailable (StreamSet=0x7cefab684e10) at ../../src/ext/msquic/src/core/stream_set.c:324
#8  0x00007edfac513d04 in QuicStreamSetInitializeTransportParameters (StreamSet=0x7cefab684e10, BidiStreamCount=256, UnidiStreamCount=0, FlushIfUnblocked=0 '\000') at ../../src/ext/msquic/src/core/stream_set.c:427
#9  0x00007edfac554498 in QuicConnProcessPeerTransportParameters (Connection=0x7cefab684510, FromResumptionTicket=1 '\001') at ../../src/ext/msquic/src/core/connection.c:3073
#10 0x00007edfac56f942 in QuicConnParamSet (Connection=0x7cefab684510, Param=83886096, BufferLength=1798, Buffer=0x7c9fab460080) at ../../src/ext/msquic/src/core/connection.c:6609
#11 0x00007edfac4bd039 in QuicLibrarySetParam (Handle=0x7cefab684510, Param=83886096, BufferLength=1798, Buffer=0x7c9fab460080) at ../../src/ext/msquic/src/core/library.c:1893
#12 0x00007edfac529edc in MsQuicSetParam (Handle=0x7cefab684510, Param=83886096, BufferLength=1798, Buffer=0x7c9fab460080) at ../../src/ext/msquic/src/core/api.c:1670
#13 0x0000590f192046d0 in quicpp::connection::set_resumption_ticket(std::span<unsigned char const, 18446744073709551615ul>) (this=0x7b4fab425ee0, ticket=std::span of length 1798 = {...}) at ../src/src/prod/quicpp/lib/connection.cpp:180
#14 0x0000590f18eaf772 in meru::net::quic_transport::detail::quic_connection::connect_async(quicpp::address const&) [clone .resume] (this=0x7befab692b90, address=...) at ../src/src/prod/quic_transport/lib/quic_connection.cpp:166
#15 0x0000590f186504d4 in std::__n4861::coroutine_handle<void>::resume() const (this=0x7defac81d4c8) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/coroutine:135
#16 0x0000590f18650024 in lnm::detail::scheduler_entry_base::resume(bool) (this=0x7defac81d4c8, invoke_on_resume=true) at ../src/ext/base/src/prod/task_scheduler/inc/task_scheduler/lnm/detail/scheduler_entry_base.h:81
#17 0x0000590f1864f919 in lnm::detail::scheduler_entry<void, void, true>::resume(bool) (this=0x7defac81d4c0, invoke_on_resume=true) at ../src/ext/base/src/prod/task_scheduler/inc/task_scheduler/lnm/detail/scheduler_entry.h:111
#18 0x0000590f195d6f14 in lnm::detail::worker_thread::dispatch_work(lnm::detail::scheduler_entry<void, void, true>*) (this=0x7c6fab3eff80, entry=0x7defac81d4c0) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1844
#19 0x0000590f195d5936 in lnm::detail::worker_thread::dispatch_work() (this=0x7c6fab3eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1368
#20 0x0000590f195d50ab in lnm::detail::worker_thread::dispatch() (this=0x7c6fab3eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1284
#21 0x0000590f195d4692 in lnm::detail::worker_thread::native_dispatch() (this=0x7c6fab3eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1265
#22 0x0000590f195d4cbf in lnm::detail::worker_thread::native_dispatch_affinitized() (this=0x7c6fab3eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1248
#23 0x0000590f195ec559 in lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::operator()(lnm::detail::worker_thread*) const (this=0x7adfa11db020, self=0x7c6fab3eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:291
#24 0x0000590f195ec4e1 in lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::__invoke(lnm::detail::worker_thread*) (self=0x7c6fab3eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:291
#25 0x0000590f195fe2e4 in std::__invoke_impl<void, void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*>(std::__invoke_other, void (*&&)(lnm::detail::worker_thread*), lnm::detail::worker_thread*&&) (__f=@0x7b0fab4010e0: 0x590f195ec420 <lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::__invoke(lnm::detail::worker_thread*)>, __args=@0x7b0fab4010d8: 0x7c6fab3eff80) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61
#26 0x0000590f195fe22d in std::__invoke<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*>(void (*&&)(lnm::detail::worker_thread*), lnm::detail::worker_thread*&&) (__fn=@0x7b0fab4010e0: 0x590f195ec420 <lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::__invoke(lnm::detail::worker_thread*)>, __args=@0x7b0fab4010d8: 0x7c6fab3eff80) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96
#27 0x0000590f195fe202 in std::thread::_Invoker<std::tuple<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (this=0x7b0fab4010d8) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:279
#28 0x0000590f195fe1c5 in std::thread::_Invoker<std::tuple<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*> >::operator()() (this=0x7b0fab4010d8) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:286
#29 0x0000590f195fe019 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*> > >::_M_run() (this=0x7b0fab4010d0) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:231
#30 0x00007edfac2b0253 in  () at /lib/x86_64-linux-gnu/libstdc++.so.6
#31 0x0000590f1851f177 in asan_thread_start(void*) ()
#32 0x00007edfac03fac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#33 0x00007edfac0d18d0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Stack with QuicDatagramOnSendStateChanged:

#5  0x000077783d619ded in quic_bugcheck (File=0x77783d95b400 <str> "../../src/ext/msquic/src/core/connection.c", Line=695, Expr=0x77783d95c3c0 <str> "!Connection->State.InlineApiExecution || Connection->State.HandleClosed") at ../../src/ext/msquic/src/platform/platform_posix.c:93
#6  0x000077783d542c93 in QuicConnIndicateEvent (Connection=0x75883c744110, Event=0x737834978120) at ../../src/ext/msquic/src/core/connection.c:693
#7  0x000077783d5a2245 in QuicDatagramOnSendStateChanged (Datagram=0x75883c745060) at ../../src/ext/msquic/src/core/datagram.c:306
#8  0x000077783d5544ab in QuicConnProcessPeerTransportParameters (Connection=0x75883c744110, FromResumptionTicket=1 '\001') at ../../src/ext/msquic/src/core/connection.c:3079
#9  0x000077783d56f942 in QuicConnParamSet (Connection=0x75883c744110, Param=83886096, BufferLength=1798, Buffer=0x75383c285880) at ../../src/ext/msquic/src/core/connection.c:6609
#10 0x000077783d4bd039 in QuicLibrarySetParam (Handle=0x75883c744110, Param=83886096, BufferLength=1798, Buffer=0x75383c285880) at ../../src/ext/msquic/src/core/library.c:1893
#11 0x000077783d529edc in MsQuicSetParam (Handle=0x75883c744110, Param=83886096, BufferLength=1798, Buffer=0x75383c285880) at ../../src/ext/msquic/src/core/api.c:1670
#12 0x0000571e96ad4740 in quicpp::connection::set_resumption_ticket(std::span<unsigned char const, 18446744073709551615ul>) (this=0x73e83c391230, ticket=std::span of length 1798 = {...}) at ../src/src/prod/quicpp/lib/connection.cpp:176
#13 0x0000571e9677fea5 in meru::net::quic_transport::detail::quic_connection::connect_async(quicpp::address const&) [clone .resume] (this=0x74883c47fa90, address=...) at ../src/src/prod/quic_transport/lib/quic_connection.cpp:166
#14 0x0000571e95f254d4 in std::__n4861::coroutine_handle<void>::resume() const (this=0x768849f21418) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/coroutine:135
#15 0x0000571e95f25024 in lnm::detail::scheduler_entry_base::resume(bool) (this=0x768849f21418, invoke_on_resume=true) at ../src/ext/base/src/prod/task_scheduler/inc/task_scheduler/lnm/detail/scheduler_entry_base.h:81
#16 0x0000571e95f24919 in lnm::detail::scheduler_entry<void, void, true>::resume(bool) (this=0x768849f21410, invoke_on_resume=true) at ../src/ext/base/src/prod/task_scheduler/inc/task_scheduler/lnm/detail/scheduler_entry.h:111
#17 0x0000571e96ea6de4 in lnm::detail::worker_thread::dispatch_work(lnm::detail::scheduler_entry<void, void, true>*) (this=0x75083c1eff80, entry=0x768849f21410) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1844
#18 0x0000571e96ea5806 in lnm::detail::worker_thread::dispatch_work() (this=0x75083c1eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1368
#19 0x0000571e96ea4f7b in lnm::detail::worker_thread::dispatch() (this=0x75083c1eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1284
#20 0x0000571e96ea4562 in lnm::detail::worker_thread::native_dispatch() (this=0x75083c1eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1265
#21 0x0000571e96ea4b8f in lnm::detail::worker_thread::native_dispatch_affinitized() (this=0x75083c1eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:1248
#22 0x0000571e96ebc429 in lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::operator()(lnm::detail::worker_thread*) const (this=0x7378347e0020, self=0x75083c1eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:291
#23 0x0000571e96ebc3b1 in lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::__invoke(lnm::detail::worker_thread*) (self=0x75083c1eff80) at ../src/ext/base/src/prod/task_scheduler/lib/lnm/worker_thread.cpp:291
#24 0x0000571e96ece1b4 in std::__invoke_impl<void, void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*>(std::__invoke_other, void (*&&)(lnm::detail::worker_thread*), lnm::detail::worker_thread*&&) (__f=@0x73a83c2010e0: 0x571e96ebc2f0 <lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::__invoke(lnm::detail::worker_thread*)>, __args=@0x73a83c2010d8: 0x75083c1eff80) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61
#25 0x0000571e96ece0fd in std::__invoke<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*>(void (*&&)(lnm::detail::worker_thread*), lnm::detail::worker_thread*&&) (__fn=@0x73a83c2010e0: 0x571e96ebc2f0 <lnm::detail::worker_thread::init_dispatch_thread(lnm::external_worker::thread_info const&)::$_0::__invoke(lnm::detail::worker_thread*)>, __args=@0x73a83c2010d8: 0x75083c1eff80) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96
#26 0x0000571e96ece0d2 in std::thread::_Invoker<std::tuple<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (this=0x73a83c2010d8) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:279
#27 0x0000571e96ece095 in std::thread::_Invoker<std::tuple<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*> >::operator()() (this=0x73a83c2010d8) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:286
#28 0x0000571e96ecdee9 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(lnm::detail::worker_thread*), lnm::detail::worker_thread*> > >::_M_run() (this=0x73a83c2010d0) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:231
#29 0x000077783d2b0253 in  () at /lib/x86_64-linux-gnu/libstdc++.so.6
#30 0x0000571e95df4177 in asan_thread_start(void*) ()
#31 0x000077783d03fac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#32 0x000077783d0d18d0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

leikong added 2 commits March 4, 2026 16:09
…essing

Instead of skipping STREAMS_AVAILABLE and DATAGRAM_STATE_CHANGED notifications
when called from the resumption ticket path (which breaks 0-RTT scenarios),
defer them by queuing operations that get processed when the worker drains
the operation queue, outside the InlineApiExecution context.

- Add QUIC_OPER_TYPE_STREAMS_AVAILABLE and QUIC_OPER_TYPE_DATAGRAM_STATE_CHANGED
  operation types.
- When FromResumptionTicket, enqueue STREAMS_AVAILABLE instead of calling
  QuicStreamSetIndicateStreamsAvailable directly.
- When FromResumptionTicket, enqueue DATAGRAM_STATE_CHANGED instead of calling
  QuicDatagramOnSendStateChanged directly.
- Handle both new operation types in QuicConnDrainOperations.
- Make QuicSendQueueFlush unconditional since it only enqueues a FLUSH_SEND
  operation and does not trigger app callbacks.
- Export QuicStreamSetIndicateStreamsAvailable declaration in stream_set.h.
@guhetier
Collaborator

guhetier commented Mar 5, 2026

I am not sure we can simply skip those notifications.
With 0-RTT, it should be possible for the app to queue streams and datagrams before the connection is started, and it seems reasonable to me that the app would rely on the STREAMS_AVAILABLE and DATAGRAM_STATE_CHANGED notifications to know what to queue as a reaction to setting the resumption ticket.
The assert is meant to avoid re-entrant notifications. In your scenario, the problem is not a re-entrant notification AFAIU. Can we relax the assertion instead? Or is the assertion reporting a real issue here?

Below are the two assert-failure stacks; I don't see callback reentrancy. Please verify as well, thanks.

Stack with QuicStreamSetIndicateStreamsAvailable:
....

I think we are saying the same thing. The original intent of the assert is to avoid reentrancy during notification handling.
In your scenario (with custom execution), we can hit the assert without reentrancy, because custom execution always executes the API call inline.

My point is that we should consider relaxing the assertion without changing the logic unless we have a clear issue to justify the behavior change.

@leikong
Contributor Author

leikong commented Mar 5, 2026

I am not sure we can simply skip those notifications.
With 0-RTT, it should be possible for the app to queue streams and datagrams before the connection is started, and it seems reasonable to me that the app would rely on the STREAMS_AVAILABLE and DATAGRAM_STATE_CHANGED notifications to know what to queue as a reaction to setting the resumption ticket.
The assert is meant to avoid re-entrant notifications. In your scenario, the problem is not a re-entrant notification AFAIU. Can we relax the assertion instead? Or is the assertion reporting a real issue here?

Below are the two assert-failure stacks; I don't see callback reentrancy. Please verify as well, thanks.
Stack with QuicStreamSetIndicateStreamsAvailable:
....

I think we are saying the same thing. The original intent of the assert is to avoid reentrancy during notification handling. In your scenario (with custom execution), we can hit the assert without reentrancy, because custom execution always executes the API call inline.

My point is that we should consider relaxing the assertion without changing the logic unless we have a clear issue to justify the behavior change.

I have made the change to queue the notifications when FromResumptionTicket is set.
Relaxing the assert looks like a bigger change to me: it risks hiding real reentrancy issues, and I don't feel comfortable changing the API contract. Custom execution is different from the closed-handle case (the only current exception), which can never lead to callback reentrancy; or I could be missing some context.

        //
        // MsQuic shouldn't indicate reentrancy to the app when at all possible.
        // The general exception to this rule is when the connection is being
        // closed because the API MUST block until all work is completed, so we
        // have to execute the event callbacks inline.
        //

@leikong leikong requested a review from guhetier March 5, 2026 23:07
@guhetier
Collaborator

Sorry for the delay.
I don't think that queuing the notifications is a good option.

  1. It opens the door to races between the notification and other operations on the connection. A get/set param could change the connection state while other tasks are already queued; those tasks (like a receive) would then be processed with the new connection state, but before the app is notified.

  2. The comment about re-entrancy was written before custom executions. Custom executions fundamentally change this design: set/get param must be executed inline so as not to block the thread. This isn't the only trade-off that comes with using custom execution, and I think requiring the app to be aware of the possibility of re-entrancy is ok.

I think we should relax the assertion to:

        CXPLAT_DBG_ASSERT(
            !Connection->State.InlineApiExecution ||
            Connection->State.HandleClosed || 
            <isUsingCustomExecution>);

unless there is an actual issue you are currently hitting on release builds. I don't think it is acceptable to change the behavior when custom execution is not used only because a debug assert is hit with custom execution enabled.

As a second step, we can re-think the design of custom executions if needed (maybe an async version of get/set param, or another pattern to deal with this category of issues).

@leikong
Contributor Author

leikong commented Mar 11, 2026

Sorry for the delay. I don't think that queuing the notifications is a good option.

  1. It opens the door to races between the notification and other operations on the connection. A get/set param could change the connection state while other tasks are already queued; those tasks (like a receive) would then be processed with the new connection state, but before the app is notified.
  2. The comment about re-entrancy was written before custom executions. Custom executions fundamentally change this design: set/get param must be executed inline so as not to block the thread. This isn't the only trade-off that comes with using custom execution, and I think requiring the app to be aware of the possibility of re-entrancy is ok.

I think we should relax the assertion to:

        CXPLAT_DBG_ASSERT(
            !Connection->State.InlineApiExecution ||
            Connection->State.HandleClosed || 
            <isUsingCustomExecution>);

unless there is an actual issue you are currently hitting on release builds. I don't think it is acceptable to change the behavior when custom execution is not used only because a debug assert is hit with custom execution enabled.

As a second step, we can re-think the design of custom executions if needed (maybe an async version of get/set param, or another pattern to deal with this category of issues).

Updated the assert per the suggestion; please review again, thanks.

guhetier previously approved these changes Mar 11, 2026
@guhetier
Collaborator

nit: There is a similar assertion in QuicStreamIndicateEvent. A similar fix might be needed there, but we can leave it for later, so we find out whether that path is ever exercised.

@leikong
Contributor Author

leikong commented Mar 11, 2026

nit: There is a similar assertion in QuicStreamIndicateEvent. A similar fix might be needed there, but we can leave it for later, so we find out whether that path is ever exercised.

I haven't run into it so far, but will keep it in mind, thanks for the info.

@leikong
Contributor Author

leikong commented Mar 12, 2026

nit: There is a similar assertion in QuicStreamIndicateEvent. A similar fix might be needed there, but we can leave it for later, so we find out whether that path is ever exercised.

For consistency, and to avoid potential confusion in the future, I decided to also skip the reentrancy assert check for custom execution in QuicStreamIndicateEvent, although I did not hit it in my testing.

Also, InlineApiExecution is only read in these two assertions, so technically we could remove the code that sets InlineApiExecution for CustomExecutions. But that is a bigger change affecting more code paths, so it is better to err on the safe side and keep the fix simple.
