[lldb] Add test showing UnwindAssemblyInstEmulation can't handle backwards branches #168398

felipepiovezan · 2025-11-17T16:31:04Z

If we have a conditional branch, followed by an epilogue, followed by more code, LLDB will incorrectly compute unwind information through instruction emulation. Consider this:

// ...
<+16>: b.ne   ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE

// epilogue start
<+20>: ldp    x29, x30, [sp, #0x20]
<+24>: add    sp, sp, #0x30
<+28>: ret
// epilogue end

AFTER_EPILOGUE:
<+32>: do something
// ...
<+48>: ret

DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE:
<+52>: stp    x22, x23, [sp, #0x10]
<+56>: mov    x22, #0x1
<+64>: b      ; <+32> AFTER_EPILOGUE

LLDB will think that the unwind state of +32 is the same as +28. This is false, as +32 never executes after +28.

The root cause of the problem is the order in which instructions are visited; they are visited in the order they appear in the text, with unwind state always being forwarded to positive branch offsets, but never to negative offsets.

In the example above, AFTER_EPILOGUE should inherit the state of the branch in +64, but it doesn't because AFTER_EPILOGUE is visited right after the ret in +28.

Fixing this should be simple: maintain a stack of instructions to visit. While the stack is not empty, take the next instruction on stack and visit it.

After visiting a non-branching instruction, push the next instruction and forward unwind state to it.
After visiting a branch with one or more known targets, push the known branch targets and forward state to them.
In all other cases (ret, or branch to register), don't push nor forward anything.

Never push an instruction already on the stack. Like the algorithm today, this new algorithm also assumes that, if two instructions branch to the same target, the unwind state in both better be the same.

(Note: yes, branch to register is also handled incorrectly today, and will still be incorrect).

llvmbot · 2025-11-17T16:31:38Z

@llvm/pr-subscribers-lldb

Author: Felipe de Azevedo Piovezan (felipepiovezan)

Changes

If we have a conditional branch, followed by an epilogue, followed by more code, LLDB will incorrectly compute unwind information through instruction emulation. Consider this:

// ...
&lt;+16&gt;: b.ne   ; &lt;+52&gt; DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE

// epilogue start
&lt;+20&gt;: ldp    x29, x30, [sp, #<!-- -->0x20]
&lt;+24&gt;: add    sp, sp, #<!-- -->0x30
&lt;+28&gt;: ret
// epilogue end

AFTER_EPILOGUE:
&lt;+32&gt;: do something
// ...
&lt;+48&gt;: ret

DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE:
&lt;+52&gt;: stp    x22, x23, [sp, #<!-- -->0x10]
&lt;+56&gt;: mov    x22, #<!-- -->0x1
&lt;+64&gt;: b      ; &lt;+32&gt; AFTER_EPILOGUE

LLDB will think that the unwind state of +32 is the same as +28. This is false, as +32 never executes after +28.

The root cause of the problem is the order in which instructions are visited; they are visited in the order they appear in the text, with unwind state always being forwarded to positive branch offsets, but never to negative offsets.

In the example above, AFTER_EPILOGUE should inherit the state of the branch in +64, but it doesn't because AFTER_EPILOGUE is visited right after the ret in +28.

Fixing this should be simple: maintain a stack of instructions to visit. While the stack is not empty, take the next instruction on stack and visit it.

After visiting a non-branching instruction, push the next instruction and forward unwind state to it.
After visiting a branch with one or more known targets, push the known branch targets and forward state to them.
In all other cases (ret, or branch to register), don't push nor forward anything.

Never push an instruction already on the stack. Like the algorithm today, this new algorithm also assume that, if two instructions branch to the same target, the unwind state in both better be the same.

(Note: yes, branch to register is also handled incorrectly today, and will still be incorrect).

Full diff: https://github.com/llvm/llvm-project/pull/168398.diff

1 Files Affected:

(modified) lldb/unittests/UnwindAssembly/ARM64/TestArm64InstEmulation.cpp (+97)

diff --git a/lldb/unittests/UnwindAssembly/ARM64/TestArm64InstEmulation.cpp b/lldb/unittests/UnwindAssembly/ARM64/TestArm64InstEmulation.cpp
index eaf23fd72d6d1..4ab9bb554fca2 100644
--- a/lldb/unittests/UnwindAssembly/ARM64/TestArm64InstEmulation.cpp
+++ b/lldb/unittests/UnwindAssembly/ARM64/TestArm64InstEmulation.cpp
@@ -856,3 +856,100 @@ TEST_F(TestArm64InstEmulation, TestCFAResetToSP) {
   EXPECT_TRUE(row->GetCFAValue().GetRegisterNumber() == gpr_sp_arm64);
   EXPECT_TRUE(row->GetCFAValue().IsRegisterPlusOffset() == true);
 }
+
+TEST_F(TestArm64InstEmulation, TestMidFunctionEpilogueAndBackwardsJump) {
+  ArchSpec arch("arm64-apple-ios15");
+  std::unique_ptr<UnwindAssemblyInstEmulation> engine(
+      static_cast<UnwindAssemblyInstEmulation *>(
+          UnwindAssemblyInstEmulation::CreateInstance(arch)));
+  ASSERT_NE(nullptr, engine);
+
+  const UnwindPlan::Row *row;
+  AddressRange sample_range;
+  UnwindPlan unwind_plan(eRegisterKindLLDB);
+  UnwindPlan::Row::AbstractRegisterLocation regloc;
+
+  uint8_t data[] = {
+      0xff, 0xc3, 0x00, 0xd1, // <+0>:  sub    sp, sp, #0x30
+      0xfd, 0x7b, 0x02, 0xa9, // <+4>:  stp    x29, x30, [sp, #0x20]
+      0xfd, 0x83, 0x00, 0x91, // <+8>:  add    x29, sp, #0x20
+      0x1f, 0x04, 0x00, 0xf1, // <+12>: cmp    x0, #0x1
+      0x21, 0x01, 0x00, 0x54, // <+16>: b.ne   ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE
+      0xfd, 0x7b, 0x42, 0xa9, // <+20>: ldp    x29, x30, [sp, #0x20]
+      0xff, 0xc3, 0x00, 0x91, // <+24>: add    sp, sp, #0x30
+      0xc0, 0x03, 0x5f, 0xd6, // <+28>: ret
+      // AFTER_EPILOGUE:  LLDB computes the next 5 unwind states incorrectly.
+      0x37, 0x00, 0x80, 0xd2, // <+32>: mov    x23, #0x1
+      0xf6, 0x5f, 0x41, 0xa9, // <+36>: ldp    x22, x23, [sp, #0x10]
+      0xfd, 0x7b, 0x42, 0xa9, // <+40>: ldp    x29, x30, [sp, #0x20]
+      0xff, 0xc3, 0x00, 0x91, // <+44>: add    sp, sp, #0x30
+      0xc0, 0x03, 0x5f, 0xd6, // <+48>: ret
+      // DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE
+      0xf6, 0x5f, 0x01, 0xa9, // <+52>: stp    x22, x23, [sp, #0x10]
+      0x36, 0x00, 0x80, 0xd2, // <+56>: mov    x22, #0x1
+      0x37, 0x00, 0x80, 0xd2, // <+60>: mov    x23, #0x1
+      0xf8, 0xff, 0xff, 0x17, // <+64>: b      ; <+32> AFTER_EPILOGUE
+  };
+
+  // UnwindPlan we expect:
+  // row[0]:    0: CFA=sp +0 =>
+  // row[1]:    4: CFA=sp+48 =>
+  // row[2]:    8: CFA=sp+16 => fp=[CFA-16] lr=[CFA-8]
+  // row[3]:   12: CFA=fp+16 => fp=[CFA-16] lr=[CFA-8]
+  // row[5]:   24: CFA=sp+48 => fp=<same> lr=<same>
+  // row[5]:   28: CFA=sp+ 0 => fp=<same> lr=<same>
+
+  sample_range = AddressRange(0x1000, sizeof(data));
+
+  EXPECT_TRUE(engine->GetNonCallSiteUnwindPlanFromAssembly(
+      sample_range, data, sizeof(data), unwind_plan));
+
+  // At the end of prologue (+12), CFA = fp + 16.
+  // <+0>:  sub    sp, sp, #0x30
+  // <+4>:  stp    x29, x30, [sp, #0x20]
+  // <+8>:  add    x29, sp, #0x20
+  row = unwind_plan.GetRowForFunctionOffset(12);
+  EXPECT_EQ(12, row->GetOffset());
+  EXPECT_TRUE(row->GetCFAValue().IsRegisterPlusOffset());
+  EXPECT_EQ(row->GetCFAValue().GetRegisterNumber(), gpr_fp_arm64);
+  EXPECT_EQ(row->GetCFAValue().GetOffset(), 16);
+
+  // +16 and +20 are the same as +12.
+  // <+12>: cmp    x0, #0x1
+  // <+16>: b.ne   ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE
+  EXPECT_EQ(12, unwind_plan.GetRowForFunctionOffset(16)->GetOffset());
+  EXPECT_EQ(12, unwind_plan.GetRowForFunctionOffset(20)->GetOffset());
+
+  // After restoring $fp to caller's value, CFA = $sp + 48
+  // <+20>: ldp    x29, x30, [sp, #0x20]
+  row = unwind_plan.GetRowForFunctionOffset(24);
+  EXPECT_EQ(24, row->GetOffset());
+  EXPECT_TRUE(row->GetCFAValue().IsRegisterPlusOffset());
+  EXPECT_TRUE(row->GetCFAValue().GetRegisterNumber() == gpr_sp_arm64);
+  EXPECT_EQ(row->GetCFAValue().GetOffset(), 48);
+
+  // $sp has been restored
+  // <+24>: add    sp, sp, #0x30
+  row = unwind_plan.GetRowForFunctionOffset(28);
+  EXPECT_EQ(28, row->GetOffset());
+  EXPECT_TRUE(row->GetCFAValue().IsRegisterPlusOffset());
+  EXPECT_TRUE(row->GetCFAValue().GetRegisterNumber() == gpr_sp_arm64);
+  EXPECT_EQ(row->GetCFAValue().GetOffset(), 0);
+
+  // FIXME: Row for offset +32 incorrectly inherits the state of the `ret`
+  // instruction, but +32 _never_ executes after the `ret`.
+  // <+28>: ret
+  // <+32>: mov    x23, #0x1
+  row = unwind_plan.GetRowForFunctionOffset(32);
+  // FIXME: EXPECT_NE(32, row->GetOffset());
+
+  // Check that the state of this branch
+  // <+16>: b.ne   ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE
+  // was forwarded to the branch target:
+  // <+52>: stp    x22, x23, [sp, #0x10]
+  row = unwind_plan.GetRowForFunctionOffset(52);
+  EXPECT_EQ(52, row->GetOffset());
+  EXPECT_TRUE(row->GetCFAValue().IsRegisterPlusOffset());
+  EXPECT_EQ(row->GetCFAValue().GetRegisterNumber(), gpr_fp_arm64);
+  EXPECT_EQ(row->GetCFAValue().GetOffset(), 16);
+}

felipepiovezan · 2025-11-17T16:32:21Z

@DavidSpickett you voluntarily reviewed some other NFC patches in this area, so I tagged you here in case you're interested :)

github-actions · 2025-11-17T16:32:43Z

✅ With the latest revision this PR passed the C/C++ code formatter.

DavidSpickett

Certainly shows the problem, LGTM.

…wards branches If we have a conditional branch, followed by an epilogue, followed by more code, LLDB will incorrectly compute unwind information through instruction emulation. Consider this: ``` // ... <+16>: b.ne ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE // epilogue start <+20>: ldp x29, x30, [sp, #0x20] <+24>: add sp, sp, #0x30 <+28>: ret // epilogue end AFTER_EPILOGUE: <+32>: do something // ... <+48>: ret DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE: <+52>: stp x22, x23, [sp, #0x10] <+56>: mov x22, #0x1 <+64>: b ; <+32> AFTER_EPILOGUE ``` LLDB will think that the unwind state of +32 is the same as +28. This is false, as +32 _never_ executes after +28. The root cause of the problem is the order in which instructions are visited; they are visited in the order they appear in the text, with unwind state always being forwarded to positive branch offsets, but never to negative offsets. In the example above, `AFTER_EPILOGUE` should inherit the state of the branch in +64, but it doesn't because `AFTER_EPILOGUE` is visited right after the `ret` in +28. Fixing this should be simple: maintain a stack of instructions to visit. While the stack is not empty, take the next instruction on stack and visit it. * After visiting a non-branching instruction, push the next instruction and forward unwind state to it. * After visiting a branch with one or more known targets, push the known branch targets and forward state to them. * In all other cases (ret, or branch to register), don't push nor forward anything. Never push an instruction already on the stack. Like the algorithm today, this new algorithm also assume that, if two instructions branch to the same target, the unwind state in both better be the same. (Note: yes, branch to register is also handled incorrectly today, and will still be incorrect).

felipepiovezan · 2025-11-19T14:40:53Z

Rebased to resolve conflicts with newly added test in another PR.

github-actions · 2025-11-19T14:47:04Z

🐧 Linux x64 Test Results

33168 tests passed
490 tests skipped

felipepiovezan requested a review from JDevlieghere as a code owner November 17, 2025 16:31

llvmbot added the lldb label Nov 17, 2025

felipepiovezan changed the title ~~[lldb] Add test showing UnwindAssemblyInstEmulation can't handle back…~~ [lldb] Add test showing UnwindAssemblyInstEmulation can't handle backwards branches Nov 17, 2025

felipepiovezan requested review from DavidSpickett and jasonmolenda November 17, 2025 16:31

felipepiovezan force-pushed the felipe/gc_lldb_add_test_showing_unwindassemblyinstemulation_cant_handle_backwards_branches_xljg4 branch from 022df2e to f984113 Compare November 17, 2025 16:40

DavidSpickett approved these changes Nov 18, 2025

View reviewed changes

felipepiovezan force-pushed the felipe/gc_lldb_add_test_showing_unwindassemblyinstemulation_cant_handle_backwards_branches_xljg4 branch from f984113 to 2f81505 Compare November 19, 2025 14:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[lldb] Add test showing UnwindAssemblyInstEmulation can't handle backwards branches #168398

[lldb] Add test showing UnwindAssemblyInstEmulation can't handle backwards branches #168398

felipepiovezan commented Nov 17, 2025 •

edited

Loading

Uh oh!

llvmbot commented Nov 17, 2025

Uh oh!

felipepiovezan commented Nov 17, 2025

Uh oh!

github-actions bot commented Nov 17, 2025 •

edited

Loading

Uh oh!

DavidSpickett left a comment

Uh oh!

felipepiovezan commented Nov 19, 2025

Uh oh!

github-actions bot commented Nov 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[lldb] Add test showing UnwindAssemblyInstEmulation can't handle backwards branches #168398

Are you sure you want to change the base?

[lldb] Add test showing UnwindAssemblyInstEmulation can't handle backwards branches #168398

Conversation

felipepiovezan commented Nov 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Nov 17, 2025

Uh oh!

felipepiovezan commented Nov 17, 2025

Uh oh!

github-actions bot commented Nov 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DavidSpickett left a comment

Choose a reason for hiding this comment

Uh oh!

felipepiovezan commented Nov 19, 2025

Uh oh!

github-actions bot commented Nov 19, 2025

🐧 Linux x64 Test Results

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

felipepiovezan commented Nov 17, 2025 •

edited

Loading

github-actions bot commented Nov 17, 2025 •

edited

Loading