Description
Split off from #153123, as I believe the illegal instruction crash and the hang are two separate issues, based on the bisect result below: BOLTing clang right before the bad commit still results in the illegal instruction crash from that issue, whereas it fixes this hang.
I have noticed that a hang occurs when running a BOLT-instrumented clang in a Fedora Linux virtual machine on my M1 Max Mac Studio.
$ llvm-bolt \
--instrument \
--instrumentation-file=/tmp/clang.fdata \
--instrumentation-file-append-pid \
-o clang.inst \
clang-21
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: d8e9216c27b82b4292e83437d58aebf594adb111
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x6c00000, offset 0x6c00000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-WARNING: 1 collisions detected while hashing binary objects. Use -v=1 to see the list.
BOLT-INFO: number of removed linker-inserted veneers: 0
BOLT-INFO: 0 out of 129351 functions in the binary (0.0%) have non-empty execution profile
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 47228
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 127221
BOLT-INSTRUMENTER: Number of function descriptors: 127221
BOLT-INSTRUMENTER: Number of branch counters: 1381499
BOLT-INSTRUMENTER: Number of ST leaf node counters: 678042
BOLT-INSTRUMENTER: Number of direct call counters: 0
BOLT-INSTRUMENTER: Total number of counters: 2059541
BOLT-INSTRUMENTER: Total size of counters: 16476328 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 14682724 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 144601856 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /tmp/clang.fdata
BOLT-INFO: Starting stub-insertion pass
BOLT-INFO: Inserted 3810 stubs in the hot area and 0 stubs in the cold area. Shared 64123 times, iterated 4 times.
BOLT-INFO: padding code to 0x10400000 to accommodate hot text
BOLT-INFO: output linked against instrumentation runtime library, lib entry point is 0x121639fc
BOLT-INFO: clear procedure is 0x121600f0
BOLT-INFO: setting __bolt_runtime_start to 0x121639fc
BOLT-INFO: setting __bolt_runtime_fini to 0x12163a8c
BOLT-INFO: setting __hot_start to 0x6e00000
BOLT-INFO: setting __hot_end to 0x10318b94
$ ./clang-21 --version
ClangBuiltLinux clang version 21.1.0-rc2 (https://github.com/llvm/llvm-project.git d8e9216c27b82b4292e83437d58aebf594adb111)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/nathan
$ ./clang.inst --version
<hangs indefinitely>
On an Ampere Altra system running Fedora on bare metal, clang.inst --version shows the same output as clang-21 --version.
I tested various versions of llvm-bolt with the same clang-21 binary as input to see whether this was a regression at some point (I had never tried building LLVM in a virtual machine on my Mac until I recently needed to while the Altra system was having issues). In testing, I saw that llvm-bolt @ llvmorg-19-init produced an instrumented binary that ran correctly (or at least had the same result for --version) but llvm-bolt @ llvmorg-20-init did not. The bisect landed on b98e6a5, which seems like a reasonable change to blame (cc @ElvinaYakubova):
# bad: [10c6d6349e51bb245b9deec4aafca9885971135b] Clear release notes for upcoming LLVM 20 dev cycle
# good: [987087df90026605fc8d03ebda5a1cd31b71e609] Bump trunk version to 19.0.0git
git bisect start 'llvmorg-20-init' 'llvmorg-19-init'
# bad: [bd34bc6dc2e4e60813ddea31bfb4ca46d3a96013] [Clang][AArch64] Extend diagnostics when warning non/streaming about vector size difference (#88380)
git bisect bad bd34bc6dc2e4e60813ddea31bfb4ca46d3a96013
# bad: [851ab41d33fcbc72bc334dfc2d5d4c0902ccbb23] [clang][test] Fix constant __builtin_popcountg test requiring __int128 (#84412)
git bisect bad 851ab41d33fcbc72bc334dfc2d5d4c0902ccbb23
# good: [78d401b02a2dc1ed5446546a149030184f24bee0] Revert "[libc][NFC] Use user defined literals to build 128 and 256 bit constants." (#81771)
git bisect good 78d401b02a2dc1ed5446546a149030184f24bee0
# good: [822142ffdfbe93f213c2c6b3f2aec7fe5f0af072] [OpenMP][OMPD] libompd must not link libomp (#83119)
git bisect good 822142ffdfbe93f213c2c6b3f2aec7fe5f0af072
# bad: [bbeb946652f2830b3211dcd8c6836bce4dbdd188] [clang][analyzer] Change value of checker option in unix.StdCLibraryFunctions (second try). (#80457)
git bisect bad bbeb946652f2830b3211dcd8c6836bce4dbdd188
# bad: [22f5e30c1798280c7476c0374280342b48880bb5] [libc][NFC] Rename `LIBC_COMPILER_HAS_C23_FLOAT16` to `LIBC_TYPES_HAS_FLOAT16` (#83396)
git bisect bad 22f5e30c1798280c7476c0374280342b48880bb5
# bad: [ad43ea3328dad844c245caf93509c2facba1ec32] [TableGen] Add support for DefaultMode in per-HwMode encode/decode. (#83029)
git bisect bad ad43ea3328dad844c245caf93509c2facba1ec32
# bad: [cf1c97b2d29c51d6c2e79454f6ec3d1f8f98e672] [AMDGPU] Do not attempt to fallback to default mutations (#83208)
git bisect bad cf1c97b2d29c51d6c2e79454f6ec3d1f8f98e672
# bad: [9ca8db352d22444feabd859380252f13826a8aff] [SHT_LLVM_BB_ADDR_MAP] Adds pretty printing of BFI and BPI for PGO Analysis Map in tools. (#82292)
git bisect bad 9ca8db352d22444feabd859380252f13826a8aff
# skip: [f7cf1f6236ee299d65c2b33429c1d3b729f54c32] [CodeGen][MISched] dumpSched direction depends on field in DAG.
git bisect skip f7cf1f6236ee299d65c2b33429c1d3b729f54c32
# good: [55783bd0f9cfc30aa93c718919dab5419d86a2c6] [HIP] fix host min/max in header (#82956)
git bisect good 55783bd0f9cfc30aa93c718919dab5419d86a2c6
# bad: [9106b58ce4e8dada167eec50178a9e154342e4ba] [CodeGen][MISched] Add misched post-regalloc bottom-up scheduling
git bisect bad 9106b58ce4e8dada167eec50178a9e154342e4ba
# bad: [a6b4e29c77ceb49e16bda38cfc4eddc2c4c76c0b] [libc] Re-enable several GPU math smoke tests (#83147)
git bisect bad a6b4e29c77ceb49e16bda38cfc4eddc2c4c76c0b
# good: [8a87f763a6841832e71bcd24dea45eac8d2dbee1] Aim debugserver workaround more precisely. (#83099)
git bisect good 8a87f763a6841832e71bcd24dea45eac8d2dbee1
# bad: [183b6b56f2602ea171502f9f2843c2c1caca2919] [clang][Interp] Ignore unnamed bitfields when checking init
git bisect bad 183b6b56f2602ea171502f9f2843c2c1caca2919
# bad: [b98e6a5ced8328fdefa9a519ae98052a29462e23] [BOLT][AArch64] Skip BBs only instead of functions (#81989)
git bisect bad b98e6a5ced8328fdefa9a519ae98052a29462e23
# first bad commit: [b98e6a5ced8328fdefa9a519ae98052a29462e23] [BOLT][AArch64] Skip BBs only instead of functions (#81989)
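A bisect like the one above could probably be automated with git bisect run. The following is a minimal sketch under stated assumptions, not the procedure actually used for this report: it assumes each bisect step has already rebuilt llvm-bolt and regenerated the instrumented binary, it treats a timeout as the hang, and it relies on git's exit code convention for bisect run (0 for good, 125 for skip, other codes from 1 to 127 for bad). The function name classify, the 30 second timeout, and the script name in the usage note are hypothetical.

```python
import subprocess
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def classify(cmd, reference_cmd=None, timeout=30.0):
    """Return GOOD, BAD or SKIP for one bisect step.

    cmd           -- command line of the instrumented binary (a list of strings)
    reference_cmd -- optional uninstrumented command whose stdout must match
    timeout       -- seconds to wait before treating the run as a hang
    """
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except FileNotFoundError:
        return SKIP  # the binary was not produced, so this commit is untestable
    except subprocess.TimeoutExpired:
        return BAD  # assumption: a timeout is the hang from this report
    if result.returncode != 0:
        return BAD
    if reference_cmd is not None:
        expected = subprocess.run(reference_cmd, capture_output=True, timeout=timeout)
        if expected.stdout != result.stdout:
            return BAD
    return GOOD


# Only acts when run as a script with a command to test on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(classify(sys.argv[1:]))
```

Saved under a hypothetical name such as check_hang.py, this could be driven with git bisect run python3 check_hang.py ./clang.inst --version, provided something else rebuilds llvm-bolt and re-instruments clang-21 at every step. Returning 125 for a missing binary keeps build failures from being recorded as bad commits.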
Running the binary under gdb shows it hanging in llvm::InitLLVM::InitLLVM.
$ gdb --args ./clang.inst --version
GNU gdb (Fedora Linux) 16.3-4.fc43
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./clang.inst...
(No debugging symbols found in ./clang.inst)
(gdb) run
Starting program: /run/host/home/nathan/tmp/bolt-clang-aarch64-issues/clang.inst --version
This GDB supports auto-downloading debuginfo from the following URLs:
<ima:enforcing>
<https://debuginfod.fedoraproject.org/>
<ima:ignore>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGINT, Interrupt.
0x0000aaaab4be38e4 in llvm::InitLLVM::InitLLVM(int&, char const**&, bool) ()
(gdb) info registers
x0 0x1 1
x1 0x1 1
x2 0xaaaabcdc4b50 187650289716048
x3 0x4e225 320037
x4 0xaaaabcdc4b60 187650289716064
x5 0xaaaab150e848 187650096031816
x6 0x0 0
x7 0x0 0
x8 0xaaaab150f5a0 187650096035232
x9 0xaaaab150f5b0 187650096035248
x10 0xaaaab150f5b0 187650096035248
x11 0x1 1
x12 0x1 1
x13 0xfffff7fff340 281474842489664
x14 0x2ef 751
x15 0xb 11
x16 0xaaaab14aef68 187650095640424
x17 0xfffff7ba0b00 281474837908224
x18 0x0 0
x19 0xffffffffe208 281474976702984
x20 0xffffffffe2a0 281474976703136
x21 0xffffffffe278 281474976703096
x22 0xffffffffe448 281474976703560
x23 0x2 2
x24 0xaaaab150f588 187650096035208
x25 0xaaaab150f598 187650096035224
x26 0xfffff7fea7a0 281474842404768
x27 0x1 1
x28 0x0 0
x29 0xffffffffe190 281474976702864
x30 0xaaaab4be37d0 187650153527248
sp 0xffffffffe190 0xffffffffe190
pc 0xaaaab4be38e4 0xaaaab4be38e4 <llvm::InitLLVM::InitLLVM(int&, char const**&, bool)+2256>
cpsr 0x40001000 [ EL=0 BTYPE=0 SSBS Z ]
fpsr 0x10 [ IXC ]
fpcr 0x0 [ Len=0 Stride=0 RMode=0 ]
tpidr 0xfffff7fea7a0 0xfffff7fea7a0
tpidr2 0x0 0x0
(gdb) disas 0xaaaab4be3800,0xaaaab4be3930
Dump of assembler code from 0xaaaab4be3800 to 0xaaaab4be3930:
0x0000aaaab4be3800 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2028>: mov w8, #0x1 // #1
0x0000aaaab4be3804 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2032>: ldaxr w9, [x25]
0x0000aaaab4be3808 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2036>: cbnz w9, 0xaaaab4be3894 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2176>
0x0000aaaab4be380c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2040>: add x9, x24, #0x10
0x0000aaaab4be3810 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2044>: stlxr w10, w8, [x9]
0x0000aaaab4be3814 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2048>: cbnz w10, 0xaaaab4be3804 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2032>
0x0000aaaab4be3818 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2052>: mov x8, x24
0x0000aaaab4be381c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2056>: stp x0, x1, [sp, #-16]!
0x0000aaaab4be3820 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2060>: mrs x1, nzcv
0x0000aaaab4be3824 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2064>: adrp x0, 0xaaaabc10c000
0x0000aaaab4be3828 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2068>: add x0, x0, #0x370
0x0000aaaab4be382c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2072>: str x2, [sp, #-16]!
0x0000aaaab4be3830 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2076>: mov x2, #0x1 // #1
0x0000aaaab4be3834 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2080>: stadd x2, [x0]
0x0000aaaab4be3838 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2084>: ldr x2, [sp], #16
0x0000aaaab4be383c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2088>: msr nzcv, x1
0x0000aaaab4be3840 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2092>: ldp x0, x1, [sp], #16
0x0000aaaab4be3844 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2096>: adrp x10, 0xaaaab74ab000 <_ZN4llvm12write_doubleERNS_11raw_ostreamEdNS_10FloatStyleESt8optionalImE+244>
0x0000aaaab4be3848 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2100>: add x10, x10, #0xe10
0x0000aaaab4be384c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2104>: stp x10, xzr, [x8]
0x0000aaaab4be3850 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2108>: mov w8, #0x2 // #2
0x0000aaaab4be3854 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2112>: stlr w8, [x9]
0x0000aaaab4be3858 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2116>: bl 0xaaaab4be4100 <_ZL16RegisterHandlersv.llvm.7055692918415979921>
0x0000aaaab4be385c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2120>: adrp x0, 0xaaaab150e000 <_ZZN4llvm11BuryPointerEPKvE9GraveYard+8>
0x0000aaaab4be3860 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2124>: add x0, x0, #0x848
0x0000aaaab4be3864 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2128>: bl 0xaaaab11bd640 <__cxa_guard_release@plt>
0x0000aaaab4be3868 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2132>: stp x0, x1, [sp, #-16]!
0x0000aaaab4be386c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2136>: mrs x1, nzcv
0x0000aaaab4be3870 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2140>: adrp x0, 0xaaaabc10c000
0x0000aaaab4be3874 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2144>: add x0, x0, #0x2d8
0x0000aaaab4be3878 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2148>: str x2, [sp, #-16]!
0x0000aaaab4be387c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2152>: mov x2, #0x1 // #1
0x0000aaaab4be3880 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2156>: stadd x2, [x0]
0x0000aaaab4be3884 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2160>: ldr x2, [sp], #16
0x0000aaaab4be3888 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2164>: msr nzcv, x1
0x0000aaaab4be388c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2168>: ldp x0, x1, [sp], #16
0x0000aaaab4be3890 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2172>: b 0xaaaab4be31e8 <_ZN4llvm8InitLLVMC2ERiRPPKcb+468>
0x0000aaaab4be3894 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2176>: adrp x10, 0xaaaab150f000 <_ZZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamEiE10StackTrace+648>
0x0000aaaab4be3898 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2180>: add x10, x10, #0x5b0
0x0000aaaab4be389c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2184>: adrp x8, 0xaaaab150f000 <_ZZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamEiE10StackTrace+648>
0x0000aaaab4be38a0 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2188>: add x8, x8, #0x5a0
0x0000aaaab4be38a4 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2192>: mov w11, #0x1 // #1
0x0000aaaab4be38a8 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2196>: clrex
0x0000aaaab4be38ac <_ZN4llvm8InitLLVMC2ERiRPPKcb+2200>: ldaxr w9, [x10]
0x0000aaaab4be38b0 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2204>: cbnz w9, 0xaaaab4be38ec <_ZN4llvm8InitLLVMC2ERiRPPKcb+2264>
0x0000aaaab4be38b4 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2208>: stp x0, x1, [sp, #-16]!
0x0000aaaab4be38b8 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2212>: mrs x1, nzcv
0x0000aaaab4be38bc <_ZN4llvm8InitLLVMC2ERiRPPKcb+2216>: adrp x0, 0xaaaabc10c000
0x0000aaaab4be38c0 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2220>: add x0, x0, #0x378
0x0000aaaab4be38c4 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2224>: str x2, [sp, #-16]!
0x0000aaaab4be38c8 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2228>: mov x2, #0x1 // #1
0x0000aaaab4be38cc <_ZN4llvm8InitLLVMC2ERiRPPKcb+2232>: stadd x2, [x0]
0x0000aaaab4be38d0 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2236>: ldr x2, [sp], #16
0x0000aaaab4be38d4 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2240>: msr nzcv, x1
0x0000aaaab4be38d8 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2244>: ldp x0, x1, [sp], #16
0x0000aaaab4be38dc <_ZN4llvm8InitLLVMC2ERiRPPKcb+2248>: add x9, x8, #0x10
0x0000aaaab4be38e0 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2252>: stlxr w12, w11, [x9]
=> 0x0000aaaab4be38e4 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2256>: cbnz w12, 0xaaaab4be38ac <_ZN4llvm8InitLLVMC2ERiRPPKcb+2200>
0x0000aaaab4be38e8 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2260>: b 0xaaaab4be381c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2056>
0x0000aaaab4be38ec <_ZN4llvm8InitLLVMC2ERiRPPKcb+2264>: adrp x10, 0xaaaab150f000 <_ZZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamEiE10StackTrace+648>
0x0000aaaab4be38f0 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2268>: add x10, x10, #0x5c8
0x0000aaaab4be38f4 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2272>: adrp x8, 0xaaaab150f000 <_ZZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamEiE10StackTrace+648>
0x0000aaaab4be38f8 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2276>: add x8, x8, #0x5b8
0x0000aaaab4be38fc <_ZN4llvm8InitLLVMC2ERiRPPKcb+2280>: mov w11, #0x1 // #1
0x0000aaaab4be3900 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2284>: clrex
0x0000aaaab4be3904 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2288>: ldaxr w9, [x10]
0x0000aaaab4be3908 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2292>: cbnz w9, 0xaaaab4be3944 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2352>
0x0000aaaab4be390c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2296>: stp x0, x1, [sp, #-16]!
0x0000aaaab4be3910 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2300>: mrs x1, nzcv
0x0000aaaab4be3914 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2304>: adrp x0, 0xaaaabc10c000
0x0000aaaab4be3918 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2308>: add x0, x0, #0x380
0x0000aaaab4be391c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2312>: str x2, [sp, #-16]!
0x0000aaaab4be3920 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2316>: mov x2, #0x1 // #1
0x0000aaaab4be3924 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2320>: stadd x2, [x0]
0x0000aaaab4be3928 <_ZN4llvm8InitLLVMC2ERiRPPKcb+2324>: ldr x2, [sp], #16
0x0000aaaab4be392c <_ZN4llvm8InitLLVMC2ERiRPPKcb+2328>: msr nzcv, x1
End of assembler dump.
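One possible reading of this dump, which nothing in this report confirms: the program counter sits on the cbnz that retries the loop from +2200 to +2256, and x12 is 1 in the register dump, meaning the stlxr at +2252 reported failure. Between the ldaxr at +2200 and that stlxr there is a counter update sequence (stp, str, stadd, ldr, ldp) that looks like BOLT instrumentation. The Arm architecture only guarantees forward progress for a load-exclusive/store-exclusive loop when no other explicit memory accesses occur between the two instructions, so on some implementations such a loop may never succeed. The sketch below is a rough heuristic for spotting that pattern in gdb disassembly text. It scans linearly and ignores control flow, its mnemonic lists are assumptions rather than a complete A64 table, and the function name is hypothetical.

```python
import re

# Matches the "<symbol+offset>:  mnemonic" part of a gdb disassembly line.
LINE_RE = re.compile(r"<[^>]*\+(\d+)>:\s+(\S+)")

LOAD_EXCLUSIVE = {"ldxr", "ldaxr", "ldxrb", "ldaxrb", "ldxrh", "ldaxrh", "ldxp", "ldaxp"}
STORE_EXCLUSIVE = {"stxr", "stlxr", "stxrb", "stlxrb", "stxrh", "stlxrh", "stxp", "stlxp"}
# Heuristic: treat any other mnemonic with these prefixes as a memory access.
MEMORY_PREFIXES = ("ld", "st", "cas", "swp")


def find_suspect_exclusive_pairs(disassembly):
    """Return (load_offset, store_offset, mnemonics) for each exclusive pair
    that has ordinary memory accesses between the load and the store."""
    findings = []
    load_offset = None  # offset of the most recent unmatched exclusive load
    between = []        # memory-access mnemonics seen since that load
    for line in disassembly.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        offset, mnemonic = int(match.group(1)), match.group(2)
        if mnemonic in LOAD_EXCLUSIVE:
            load_offset, between = offset, []
        elif mnemonic == "clrex":
            load_offset = None  # the monitor is cleared, so stop tracking
        elif load_offset is not None:
            if mnemonic in STORE_EXCLUSIVE:
                if between:
                    findings.append((load_offset, offset, between))
                load_offset = None
            elif mnemonic.startswith(MEMORY_PREFIXES):
                between.append(mnemonic)
    return findings
```

Applied to the dump above, this should flag only the pair at +2200 and +2252. The earlier loop at +2032 to +2048 has no memory accesses between its exclusive load and store, and the ldaxr at +2288 has no matching store within the dumped range.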
I have no way to run Linux on bare metal on this machine, as I cannot install Asahi, so it is entirely possible that I am hitting a virtualization issue. However, I can reproduce this with two different VMMs (OrbStack and QEMU via UTM).
I have uploaded the original and instrumented binaries used above, compressed with zstd -19 to stay under GitHub's single-file size limit. If there is any other information I can provide or patches I can test, please let me know.