
Commit 6f6fa6b

vtjnash authored and KristofferC committed
stackwalk: fix heuristic termination (#57801)
When getting stacktraces on non-X86 platforms, the first frame may not have been set up yet, which incorrectly triggers the bad-frame detection logic. This should fix the issue of async unwind failing after getting only 2 frames when the first frame happens to land in the function header. This is not normally an issue on X86 or outside of signal handlers, but using the same logic there is not expected to cause any problems either.

Fix #52334

After (on arm64-apple-darwin24.3.0):
```
julia> f(1)
Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
ERROR: StackOverflowError:
Stacktrace:
     [1] f(x::Int64)
       @ Main ./REPL[3]:1
     [2] g(x::Int64)
       @ Main ./REPL[4]:1
--- the above 2 lines are repeated 39990 more times ---
 [79983] f(x::Int64)
       @ Main ./REPL[3]:1
```

n.b. This will not fix, and is not related to, any issues where profiling gets only a single stack frame while profiling syscalls on Apple AArch64. This fix is specific to the bug where it gets exactly 2 frames.

(cherry picked from commit f82917a)
1 parent f814101 commit 6f6fa6b


src/stackwalk.c

Lines changed: 7 additions & 3 deletions
@@ -97,9 +97,13 @@ static int jl_unw_stepn(bt_cursor_t *cursor, jl_bt_element_t *bt_data, size_t *b
         }
         uintptr_t oldsp = thesp;
         have_more_frames = jl_unw_step(cursor, from_signal_handler, &return_ip, &thesp);
-        if (oldsp >= thesp && !jl_running_under_rr(0)) {
-            // The stack pointer is clearly bad, as it must grow downwards.
+        if ((n < 2 ? oldsp > thesp : oldsp >= thesp) && !jl_running_under_rr(0)) {
+            // The stack pointer is clearly bad, as it must grow downwards,
             // But sometimes the external unwinder doesn't check that.
+            // Except for n==0 when there is no oldsp and n==1 on all platforms but i686/x86_64.
+            // (on x86, the platform first pushes the new stack frame, then does the
+            // call, on almost all other platforms, the platform first does the call,
+            // then the user pushes the link register to the frame).
             have_more_frames = 0;
         }
         if (return_ip == 0) {
@@ -131,11 +135,11 @@ static int jl_unw_stepn(bt_cursor_t *cursor, jl_bt_element_t *bt_data, size_t *b
         // * The way that libunwind handles it in `unw_get_proc_name`:
         //   https://lists.nongnu.org/archive/html/libunwind-devel/2014-06/msg00025.html
         uintptr_t call_ip = return_ip;
+        #if defined(_CPU_ARM_)
         // ARM instruction pointer encoding uses the low bit as a flag for
         // thumb mode, which must be cleared before further use. (Note not
         // needed for ARM AArch64.) See
         // https://github.com/libunwind/libunwind/pull/131
-        #ifdef _CPU_ARM_
         call_ip &= ~(uintptr_t)0x1;
         #endif
         // Now there's two main cases to adjust for:
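
The changed condition in the first hunk can be read in isolation. The following is a minimal sketch, not code from src/stackwalk.c: `sp_looks_bad` and `sp_looks_bad_examples` are hypothetical names, `n` is assumed to be the count of frames already walked, and the `jl_running_under_rr(0)` check is left out. It shows that for the first two steps an unchanged stack pointer is tolerated, while a stack pointer that moves the wrong way is always rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in the Julia source): returns 1 when the stack
 * pointer observed after an unwind step should end the walk.
 * Assumption: n is the number of frames walked so far. */
int sp_looks_bad(size_t n, uintptr_t oldsp, uintptr_t thesp)
{
    if (n < 2) {
        /* Early frames: the new frame may not be set up yet, so an
         * unchanged stack pointer is accepted; only a decrease is bad. */
        return oldsp > thesp;
    }
    /* Later frames: the stack pointer must strictly increase while unwinding. */
    return oldsp >= thesp;
}

/* Illustrative checks with made-up addresses. */
void sp_looks_bad_examples(void)
{
    assert(sp_looks_bad(1, 0x1000, 0x1000) == 0); /* equal sp tolerated early */
    assert(sp_looks_bad(2, 0x1000, 0x1000) == 1); /* equal sp rejected later */
    assert(sp_looks_bad(0, 0x2000, 0x1000) == 1); /* sp moved the wrong way */
    assert(sp_looks_bad(5, 0x1000, 0x2000) == 0); /* normal progress */
}
```

Before this commit the strict `>=` comparison applied at every step, which is the case the commit message describes as stopping after exactly 2 frames.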
