LLDB TestNetBSDCore.py is flakey on a few bots since main executable detection changes #159377

@DavidSpickett

Around the time #157170 landed, I started seeing reports of this test failing on at least two of the lldb bots, with no obvious change to explain it.

Example 1: x86_64 Windows:
https://lab.llvm.org/buildbot/#/builders/211/builds/2037

PASS: LLDB (C:\buildbot\as-builder-10\lldb-x86-64\build\bin\clang.exe-x86_64) :: test_aarch64_single_threaded (TestNetBSDCore.NetBSD1LWPCoreTestCase.test_aarch64_single_threaded)
PASS: LLDB (C:\buildbot\as-builder-10\lldb-x86-64\build\bin\clang.exe-x86_64) :: test_amd64_single_threaded (TestNetBSDCore.NetBSD1LWPCoreTestCase.test_amd64_single_threaded)
PASS: LLDB (C:\buildbot\as-builder-10\lldb-x86-64\build\bin\clang.exe-x86_64) :: test_aarch64_process_signaled (TestNetBSDCore.NetBSD2LWPProcessSigCoreTestCase.test_aarch64_process_signaled)
Windows fatal exception: code 0xc0000374
Current thread 0x000034f0 (most recent call first):
  File "C:\buildbot\as-builder-10\lldb-x86-64\build\Lib\site-packages\lldb\__init__.py", line 5596 in DeleteTarget
  File "C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\test\API\functionalities\postmortem\netbsd-core\TestNetBSDCore.py", line 133 in do_test
  File "C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\test\API\functionalities\postmortem\netbsd-core\TestNetBSDCore.py", line 217 in test_amd64_process_signaled
  File "C:\Python312\Lib\unittest\case.py", line 589 in _callTestMethod
  File "C:\Python312\Lib\unittest\case.py", line 634 in run
  File "C:\Python312\Lib\unittest\case.py", line 690 in __call__
  File "C:\Python312\Lib\unittest\suite.py", line 122 in run
  File "C:\Python312\Lib\unittest\suite.py", line 84 in __call__
  File "C:\Python312\Lib\unittest\suite.py", line 122 in run
  File "C:\Python312\Lib\unittest\suite.py", line 84 in __call__
  File "C:\Python312\Lib\unittest\runner.py", line 240 in run
  File "C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\packages\Python\lldbsuite\test\dotest.py", line 1161 in run_suite
  File "C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\test\API\dotest.py", line 8 in <module>

Example 2: 32-bit Arm Linux:
https://lab.llvm.org/buildbot/#/builders/18/builds/21208

PASS: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_aarch64_single_threaded (TestNetBSDCore.NetBSD1LWPCoreTestCase)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_amd64_single_threaded (TestNetBSDCore.NetBSD1LWPCoreTestCase)
malloc(): unaligned tcache chunk detected
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
malloc(): unaligned tcache chunk detected
Fatal Python error: Aborted

Thread 0xf7ece020 (most recent call first):
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/local/lib/python3.10/dist-packages/lldb/__init__.py", line 12444 in LoadCore
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/functionalities/postmortem/netbsd-core/TestNetBSDCore.py", line 121 in do_test
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/functionalities/postmortem/netbsd-core/TestNetBSDCore.py", line 212 in test_aarch64_process_signaled
  File "/usr/lib/python3.10/unittest/case.py", line 549 in _callTestMethod
  File "/usr/lib/python3.10/unittest/case.py", line 591 in run
  File "/usr/lib/python3.10/unittest/case.py", line 650 in __call__
  File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.10/unittest/runner.py", line 184 in run
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/dotest.py", line 1161 in run_suite
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py", line 8 in <module>

I reproduced this in the same environment as the 32-bit Arm build and was able to get it to fail after running it repeatedly. The crash point would sometimes differ. Sometimes it had a backtrace within our code:

 #0 0xdd42db44 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0xf03b44)
 #1 0xdd42b034 llvm::sys::RunSignalHandlers() (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0xf01034)
 #2 0xdd42eb30 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0xe831fe50 __default_rt_sa_restorer ./signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:80:0
 #4 0xdcf02504 std::__detail::_Map_base<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, lldb_private::UUID>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, lldb_private::UUID>>, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true>, true>::operator[](std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&&) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x9d8504)
 #5 0xdd1d5bb4 ProcessElfCore::FindModuleUUID(llvm::StringRef) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0xcabbb4)
 #6 0xdf456f4c lldb_private::DynamicLoader::FindModuleViaTarget(lldb_private::FileSpec const&) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x2f2cf4c)
 #7 0xdf457294 lldb_private::DynamicLoader::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long long, unsigned long long, bool) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x2f2d294)
 #8 0xdcf238d0 DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long long, unsigned long long, bool) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x9f98d0)
 #9 0xdcf246ec DynamicLoaderPOSIXDYLD::LoadAllCurrentModules()::$_0::operator()(DYLDRendezvous::SOEntry const&) const DynamicLoaderPOSIXDYLD.cpp:0:0
#10 0xdcca1bb4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<std::function<void ()>>>, void>>::_M_invoke(std::_Any_data const&) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x777bb4)
#11 0xdcca1b0c std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x777b0c)
#12 0xe8352e0c __pthread_once_slow ./nptl/pthread_once.c:118:7
#13 0xdcca1f04 std::__future_base::_Deferred_state<std::thread::_Invoker<std::tuple<std::function<void ()>>>, void>::_M_complete_async() (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x777f04)
#14 0xdcca1fc0 std::_Function_handler<void (), std::shared_future<void> llvm::ThreadPoolInterface::asyncImpl<void>(std::function<void ()>, llvm::ThreadPoolTaskGroup*)::'lambda'()>::_M_invoke(std::_Any_data const&) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0x777fc0)
#15 0xdd3e5854 llvm::StdThreadPool::processTasks(llvm::ThreadPoolTaskGroup*) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0xebb854)
#16 0xdd3e6aa8 void* llvm::thread::ThreadProxy<std::tuple<llvm::StdThreadPool::grow(int)::$_0>>(void*) ThreadPool.cpp:0:0
Fatal Python error: Segmentation fault

Sometimes it was at a higher level somewhere in glibc:

OK
free(): invalid pointer
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
#0 0xe4dd5b44 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0xf03b44)
#1 0xe4dd3034 llvm::sys::RunSignalHandlers() (/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/_lldb.cpython-312-arm-linux-gnueabihf.so+0xf01034)
#2 0xe4dd6b30 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
#3 0xefbcde50 __default_rt_sa_restorer ./signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:80:0
#4 0xefbbe6c6 __libc_do_syscall ./csu/../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47:0
#5 0xefbfe69c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
#6 0xefbccfc6 raise ./signal/../sysdeps/posix/raise.c:27:6
Fatal Python error: Aborted

Thread 0xefb83020 (most recent call first):
  File "/home/david.spickett/build-llvm-arm/local/lib/python3.12/dist-packages/lldb/__init__.py", line 5369 in Terminate
  File "/home/david.spickett/llvm-project/lldb/packages/Python/lldbsuite/test/dotest.py", line 747 in exitTestSuite
  File "/home/david.spickett/llvm-project/lldb/packages/Python/lldbsuite/test/dotest.py", line 1190 in run_suite
  File "/home/david.spickett/llvm-project/lldb/test/API/dotest.py", line 8 in <module>
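
For anyone who wants to repeat the "run it until it fails" step, here is a minimal sketch of a loop that does so. It is not the script I used; the function name `run_until_failure` and its parameters are made up for illustration, and the test command is passed in by the caller rather than assumed. It keeps the output of the failing run, because the crash point differs from run to run.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs):
    """Run cmd up to max_runs times.

    Returns (run_number, output) for the first run that exits with a
    non-zero status (a crash from a signal also counts), or None if
    every run passed. run_number is 1-based.
    """
    for run_number in range(1, max_runs + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep the backtrace next to the test output
            text=True,
            errors="replace",  # crash output is not guaranteed to be valid UTF-8
        )
        if result.returncode != 0:
            return run_number, result.stdout
    return None


# Stand-in command so the sketch runs anywhere: a child that always fails.
failing = [sys.executable, "-c", "import sys; print('boom'); sys.exit(2)"]
print(run_until_failure(failing, max_runs=5)[0])
```

In a real build tree the command would be whatever runs this one test, for example something like `[".../bin/lldb-dotest", "-p", "TestNetBSDCore.py"]`; the build path is an assumption and depends on your setup.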

With a test that fails only sometimes, you can't be 100% sure of anything, but I do not recall seeing these failures in the past, and with #157170 reverted I was able to run the test around 2000 times without issue, whereas normally a failure shows up within about 200 runs.
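
To put a rough number on that: assuming, purely for illustration, that runs are independent and that each run fails with probability 1/200 (my reading of "around 200 runs"), the chance of 2000 clean runs in a row by luck alone can be estimated as below. The function name is made up for this sketch.

```python
def prob_all_pass(runs, failure_rate):
    """Probability that `runs` independent runs all pass, given that each
    run fails with probability `failure_rate`."""
    return (1.0 - failure_rate) ** runs


# Assumed figures: about 1 failure per 200 runs, and 2000 runs with the
# change reverted. Both are approximate.
p = prob_all_pass(2000, 1 / 200)
print(f"{p:.1e}")
```

Under those assumptions the result is roughly 4e-5, so the clean runs are strong evidence but, as said, not proof.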

This may be more common on 32-bit due to the reduced address space, and perhaps common on Windows because its allocator is stricter. Not sure.