Description
We have an ASAN failure in CI, which I previously tried to address in #7326:
==16142==ERROR: AddressSanitizer: stack-use-after-return on address 0x78264e200ab0 at pc 0x5bf50862ce57 bp 0x7ffd590c6460 sp 0x7ffd590c6458
WRITE of size 8 at 0x78264e200ab0 thread T0
#0 0x5bf50862ce56 in std::__atomic_base<long>::fetch_add(long, std::memory_order) /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.0/../../../../include/c++/13.2.0/bits/atomic_base.h:635:16
#1 0x5bf50862ce56 in std::__atomic_base<long>::operator++(int) /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.0/../../../../include/c++/13.2.0/bits/atomic_base.h:386:16
#2 0x5bf50865ef59 in long asynchost::ConnIDGenerator::get_next_id<std::unordered_map<long, asynchost::proxy_ptr<asynchost::TCPImpl>, std::hash<long>, std::equal_to<long>, std::allocator<std::pair<long const, asynchost::proxy_ptr<asynchost::TCPImpl>>>>>(std::unordered_map<long, asynchost::proxy_ptr<asynchost::TCPImpl>, std::hash<long>, std::equal_to<long>, std::allocator<std::pair<long const, asynchost::proxy_ptr<asynchost::TCPImpl>>>>&) CCF/build/CCF/src/host/rpc_connections.h:63:17
#3 0x5bf50865e629 in asynchost::RPCConnectionsImpl<asynchost::proxy_ptr<asynchost::TCPImpl>>::get_next_id() CCF/build/CCF/src/host/rpc_connections.h:477:20
#4 0x5bf508661821 in asynchost::RPCConnectionsImpl<asynchost::proxy_ptr<asynchost::TCPImpl>>::RPCServerBehaviour::on_accept(asynchost::proxy_ptr<asynchost::TCPImpl>&) CCF/build/CCF/src/host/rpc_connections.h:171:33
#5 0x5bf5083b751b in asynchost::TCPImpl::on_accept(int) CCF/build/CCF/src/host/tcp.h:682:18
#6 0x5bf5083b5243 in asynchost::TCPImpl::on_accept(uv_stream_s*, int) CCF/build/CCF/src/host/tcp.h:650:44
#7 0x782650137c81 (/usr/lib/libuv.so.1+0x1ec81) (BuildId: 89410ca0b8401d55965003994b04e3171b4f89be)
#8 0x78265013ea21 (/usr/lib/libuv.so.1+0x25a21) (BuildId: 89410ca0b8401d55965003994b04e3171b4f89be)
#9 0x78265012bc87 in uv_run (/usr/lib/libuv.so.1+0x12c87) (BuildId: 89410ca0b8401d55965003994b04e3171b4f89be)
#10 0x5bf5081d417e in ccf::run(int, char**) CCF/build/CCF/src/host/run.cpp:1122:7
#11 0x5bf5070aeb11 in main CCF/build/CCF/samples/apps/main.cpp:8:10
#12 0x78264fa8befa (/usr/lib/libc.so.6+0x27efa) (BuildId: 0d12acd21d14b15e173fe794eae4ca16754613fa)
#13 0x78264fa8bfba in __libc_start_main (/usr/lib/libc.so.6+0x27fba) (BuildId: 0d12acd21d14b15e173fe794eae4ca16754613fa)
#14 0x5bf506b1a984 in _start (/__w/CCF/CCF/build/samples/apps/logging/logging+0x252e984)
Address 0x78264e200ab0 is located in stack of thread T0 at offset 2736 in frame
#0 0x5bf5081be2bf in ccf::run_main_loop(host::CCHostConfig&, messaging::BufferProcessor&, ringbuffer::Circuit&, EnclaveConfig&, ccf::LoggerLevel) CCF/build/CCF/src/host/run.cpp:655
This test reproduces the entire situation more simply: https://github.com/cjen1-msft/CCF/blob/repro-asan-failure/src/host/test/uv_loop.cpp
The problem is:
IdGen is freed at the end of the main loop, along with the proxy_ptr<Timer<RPCConnections>>.
This starts closing the Timer, but until the callback fires the RPCConnections is not freed, so the TCP proxy_ptr owned by the RPCConnections is still live.
If an accept fires at this point, the TCP handle is not yet closing, so we cannot use its closing state to guard the access to IdGen.
More generally, we need a way to tell objects, which may be deeply nested, that we are shutting down, so that anything invoked via their UV handle aborts instead of running.
One option is to walk the loop before freeing the state and explicitly close all handles. However, I'm concerned this will interact badly with the proxy_ptrs.
Another option is to use uv_loop_get_data/uv_loop_set_data to hold a stop flag, which callbacks check to learn that the loop is closing.
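A rough sketch of the stop-flag idea, to make the shape of the check concrete. To keep it free of a libuv dependency, FakeLoop only models the loop's user data slot; in real code that slot would be accessed with uv_loop_set_data and uv_loop_get_data. LoopState, the simplified IdGen, and this on_accept signature are all hypothetical names for illustration, not the existing CCF code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Stand-in for uv_loop_t: only the user data slot is modelled.
// With libuv this slot is reached via uv_loop_set_data / uv_loop_get_data.
struct FakeLoop
{
  void* data = nullptr;
};

// Hypothetical per-loop state visible to every callback on the loop.
struct LoopState
{
  std::atomic<bool> stopping{false};
};

// Simplified stand-in for the ID generator owned by run_main_loop's frame.
struct IdGen
{
  std::atomic<int64_t> next{0};
};

// Hypothetical accept callback. It checks the stop flag before touching
// id_gen, and returns -1 without dereferencing it once shutdown has begun.
int64_t on_accept(FakeLoop* loop, IdGen* id_gen)
{
  auto* state = static_cast<LoopState*>(loop->data);
  if (state == nullptr || state->stopping.load())
  {
    return -1; // loop is closing: id_gen may already be out of scope
  }
  return id_gen->next++;
}
```

The flag only helps if LoopState itself outlives everything it guards, so it would have to be set before the main loop's state is torn down and stay valid until the loop has finished closing its handles.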