Skip to content

Commit bc090b9

Browse files
committed
[fuzzer][Fuchsia] Prevent deadlock from suspending threads
Every once in a couple hundred runs of a downstream fuzzer test, we see a fuzzing test freeze while waiting for a thread to be suspended. The main thread is frozen because it's waiting to suspend either the alarm or rss thread which is stuck waiting for an exception they sent out to be handled. Specifically, both threads send out a synthetic `ZX_EXCP_THREAD_STARTING` exception to be handled by the crash handling thread which sets up an exception channel on the whole process with `ZX_EXCEPTION_CHANNEL_DEBUGGER`. This is the only channel type that listens to thread stop/start exceptions. Normally, the exception would be ignored and the alarm or rss thread would continue normally once the crash handling thread closes the read exception. However, the memory snapshot machinery can suspend this thread while its in the process of waiting for or handling a `ZX_EXCP_THREAD_STARTING` sent by either the rss or alarm thread. If this is suspended first, then we attempt to suspend either the alarm or rss thread while they're still waiting for the crash handling thread to handle its exception, we will freeze waiting for those threads to give the suspend signal, which they won't because they're blocked on waiting for the exception handler. This is the deadlock. Until there's a way for the memory snapshot machinery to suspend the thread while it's stuck on an exception, then we can work around this in the meantime by just ensuring the alarm and rss threads start normally via signals on the initial startup path. I can assert locally the freezing doesn't occur after 6000 runs where prior we would see it every couple hundred runs. Note this type of issue can arise again if the fuzzing test launches any dangling threads that happen to not start yet. One of the recommendations for writing a fuzz test is that the test may launch threads, but they should be joined by the end of the test (https://llvm.org/docs/LibFuzzer.html#fuzz-target), so hopefully we won't see this type of bug rise frequently from fuzz tests. More broadly, this can also arise if any process launches its own debugger via `ZX_EXCEPTION_CHANNEL_DEBUGGER`, but I would think in practice this isn't very likely to happen. More context in https://fxbug.dev/436923423.
1 parent 38f0b9e commit bc090b9

File tree

2 files changed

+61
-3
lines changed

2 files changed

+61
-3
lines changed

compiler-rt/lib/fuzzer/FuzzerDriver.cpp

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -306,6 +306,9 @@ static int RunInMultipleProcesses(const std::vector<std::string> &Args,
306306
return HasErrors ? 1 : 0;
307307
}
308308

309+
// Fuchsia needs to do some book checking before starting the RssThread,
310+
// so it has its own implementation.
311+
#if !LIBFUZZER_FUCHSIA
309312
static void RssThread(Fuzzer *F, size_t RssLimitMb) {
310313
while (true) {
311314
SleepSeconds(1);
@@ -321,6 +324,7 @@ static void StartRssThread(Fuzzer *F, size_t RssLimitMb) {
321324
std::thread T(RssThread, F, RssLimitMb);
322325
T.detach();
323326
}
327+
#endif
324328

325329
int RunOneTest(Fuzzer *F, const char *InputFilePath, size_t MaxLen) {
326330
Unit U = FileToVector(InputFilePath);

compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp

Lines changed: 57 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,9 @@ void ExitOnErr(zx_status_t Status, const char *Syscall) {
6868
}
6969

7070
void AlarmHandler(int Seconds) {
71+
// Signal the alarm thread started.
72+
ExitOnErr(_zx_object_signal(SignalHandlerEvent, 0, ZX_USER_SIGNAL_0),
73+
"_zx_object_signal AlarmHandler");
7174
while (true) {
7275
SleepSeconds(Seconds);
7376
Fuzzer::StaticAlarmCallback();
@@ -282,6 +285,7 @@ void CrashHandler() {
282285
Self, ZX_EXCEPTION_CHANNEL_DEBUGGER, &Channel.Handle),
283286
"_zx_task_create_exception_channel");
284287

288+
// Signal the crash thread started.
285289
ExitOnErr(_zx_object_signal(SignalHandlerEvent, 0, ZX_USER_SIGNAL_0),
286290
"_zx_object_signal");
287291

@@ -385,10 +389,49 @@ void StopSignalHandler() {
385389
_zx_handle_close(SignalHandlerEvent);
386390
}
387391

392+
void RssThread(Fuzzer *F, size_t RssLimitMb) {
393+
// Signal the rss thread started.
394+
//
395+
// We must wait for this thread to start because we could accidentally suspend
396+
// it while the crash handler is attempting to handle the
397+
// ZX_EXCP_THREAD_STARTING exception. If the crash handler is suspended by the
398+
// lsan machinery, then there's no way for this thread to indicate it's
399+
// suspended because it's blocked on waiting for the exception to be handled.
400+
ExitOnErr(_zx_object_signal(SignalHandlerEvent, 0, ZX_USER_SIGNAL_0),
401+
"_zx_object_signal rss");
402+
while (true) {
403+
SleepSeconds(1);
404+
size_t Peak = GetPeakRSSMb();
405+
if (Peak > RssLimitMb)
406+
F->RssLimitCallback();
407+
}
408+
}
409+
388410
} // namespace
389411

412+
void StartRssThread(Fuzzer *F, size_t RssLimitMb) {
413+
// Set up the crash handler and wait until it is ready before proceeding.
414+
assert(SignalHandlerEvent == ZX_HANDLE_INVALID);
415+
ExitOnErr(_zx_event_create(0, &SignalHandlerEvent), "_zx_event_create");
416+
417+
if (!RssLimitMb)
418+
return;
419+
std::thread T(RssThread, F, RssLimitMb);
420+
T.detach();
421+
422+
// Wait for the rss thread to start.
423+
ExitOnErr(_zx_object_wait_one(SignalHandlerEvent, ZX_USER_SIGNAL_0,
424+
ZX_TIME_INFINITE, nullptr),
425+
"_zx_object_wait_one rss");
426+
ExitOnErr(_zx_object_signal(SignalHandlerEvent, ZX_USER_SIGNAL_0, 0),
427+
"_zx_object_signal rss clear");
428+
}
429+
390430
// Platform specific functions.
391431
void SetSignalHandler(const FuzzingOptions &Options) {
432+
assert(SignalHandlerEvent != ZX_HANDLE_INVALID &&
433+
"This should've been setup by StartRssThread.");
434+
392435
// Make sure information from libFuzzer and the sanitizers are easy to
393436
// reassemble. `__sanitizer_log_write` has the added benefit of ensuring the
394437
// DSO map is always available for the symbolizer.
@@ -404,6 +447,20 @@ void SetSignalHandler(const FuzzingOptions &Options) {
404447
if (Options.HandleAlrm && Options.UnitTimeoutSec > 0) {
405448
std::thread T(AlarmHandler, Options.UnitTimeoutSec / 2 + 1);
406449
T.detach();
450+
451+
// Wait for the alarm thread to start.
452+
//
453+
// We must wait for this thread to start because we could accidentally
454+
// suspend it while the crash handler is attempting to handle the
455+
// ZX_EXCP_THREAD_STARTING exception. If the crash handler is suspended by
456+
// the lsan machinery, then there's no way for this thread to indicate it's
457+
// suspended because it's blocked on waiting for the exception to be
458+
// handled.
459+
ExitOnErr(_zx_object_wait_one(SignalHandlerEvent, ZX_USER_SIGNAL_0,
460+
ZX_TIME_INFINITE, nullptr),
461+
"_zx_object_wait_one alarm");
462+
ExitOnErr(_zx_object_signal(SignalHandlerEvent, ZX_USER_SIGNAL_0, 0),
463+
"_zx_object_signal alarm clear");
407464
}
408465

409466
// Options.HandleInt and Options.HandleTerm are not supported on Fuchsia
@@ -413,9 +470,6 @@ void SetSignalHandler(const FuzzingOptions &Options) {
413470
!Options.HandleFpe && !Options.HandleAbrt && !Options.HandleTrap)
414471
return;
415472

416-
// Set up the crash handler and wait until it is ready before proceeding.
417-
ExitOnErr(_zx_event_create(0, &SignalHandlerEvent), "_zx_event_create");
418-
419473
SignalHandler = std::thread(CrashHandler);
420474
zx_status_t Status = _zx_object_wait_one(SignalHandlerEvent, ZX_USER_SIGNAL_0,
421475
ZX_TIME_INFINITE, nullptr);

0 commit comments

Comments
 (0)