
[BOLT][DWARF] Reusing Stubs in Longjmp causes temporary issues with unwinding #160989

@bgergely0

Description


The LongJmp pass reuses stubs that jump to the same target to save binary size.

The diff that BOLT applies could look like this:

```
<func_a>:
    inst a
    inst b
-   bl <target>
+   bl <stub>
    inst c      // the stub might not be at the exact place where the "bl <target>" was
    inst d
+   b cont
+   adrp
+   add
+   br x16
+cont:
    inst e
    ...

<func_b>:
    inst a
-   bl <target>
+   bl <stub>
    inst b
```

The problem is that BOLT does not modify the DWARF CFI to account for this, which means we cannot unwind from the stub.

What should happen?

DWARF CFIs should indicate that while we are in the Stub, the return address is in x30/LR[1].
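A minimal sketch of the idea, expressed as GAS `.cfi` directives around the inlined stub (the exact directives and operands are my assumption; BOLT would emit the equivalent CFI instructions directly rather than assembler directives):

```
<stub>:
    .cfi_remember_state         // save func_a's current unwind rules
    .cfi_same_value 30          // reg 30 (x30/LR) now holds the return address
    adrp    x16, target         // illustrative operands
    add     x16, x16, :lo12:target
    br      x16
    .cfi_restore_state          // back to func_a's rules for the code after the stub
cont:
    ...
```

Since the stub only clobbers x16 and does not touch SP or any callee-saved register, the return-address rule should be the only rule that needs to change for that address range.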

What happens instead?

After applying BOLT to Clang and running `llvm-objdump --dwarf=frames clang.bolt`, I see this:

```
  DW_CFA_advance_loc: 4 to 0xadab298
  DW_CFA_offset: reg28 -8
  DW_CFA_offset: reg27 -16
  // The stub is at 0xadab364; the CFI jumps over it.
  DW_CFA_advance_loc2: 272 to 0xadab3a8
  DW_CFA_restore: reg26
  DW_CFA_restore: reg25
  DW_CFA_advance_loc: 4 to 0xadab3ac
```

As there is no CFI indicating that the return address changed after the `bl <stub>`, the unwinder still applies func_a's return-address rule while inside the stub. This is incorrect.

I also checked with AOSP's unwind_reg_info helper tool[2], and it likewise showed that the return address location is not updated.

Note: this is temporary

Once execution continues to <target>, we can unwind again: unwinding from there skips the stub (which is expected).
So the problem is only present while execution is inside a stub.

Because of this, the issue does not affect exception handling, only unwinding triggered by external (asynchronous) sources, e.g. signal handlers or debuggers.

Probability of incorrect unwinding

When optimizing nginx, the output binary had ~278k instructions in the .text section, while BOLT generated ~200 stubs, i.e. ~600 stub instructions.

If an external unwind request arrives at a random moment, and we assume all instructions are equally likely to be executing, we get roughly a 0.2% chance of incorrect unwinding (see the estimate below).
In reality it is even less likely, because many stubs are placed to jump from hot code to cold code, so they are not part of the hot instructions.
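A back-of-the-envelope estimate under that uniform-execution assumption:

```
P(PC is inside a stub) ≈ 600 stub instructions / 278,000 .text instructions ≈ 0.2%
```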

I don't think processes receive many unwind requests, so this is unlikely to cause issues in practice, but if we can generate DWARF that matches the stubs, we can eliminate any chance of such a problem occurring.


[1]: I think a DW_CFA_remember_state / DW_CFA_restore_state pair is needed around the stub, plus an indication that the return address is now in x30.
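In `--dwarf=frames` terms, the FDE could then carry something like this around the stub's address range (a rough sketch; the advance amounts and exact rendering are my assumption):

```
  DW_CFA_advance_loc: ... to 0xadab364   // start of the stub
  DW_CFA_remember_state:
  DW_CFA_same_value: reg30               // return address is live in x30 here
  DW_CFA_advance_loc: ... to <end of stub>
  DW_CFA_restore_state:
```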

[2]: Just to avoid confusion: this tool does not require that the input ELF is an AOSP binary. I used an AArch64 Linux Clang binary. Source of the tool: https://cs.android.com/android/platform/superproject/main/+/main:system/unwinding/libunwindstack/tools/unwind_reg_info.cpp
