
[BOLT][DWARF] Reusing Stubs in Longjmp causes temporary issues with unwinding #160989

@bgergely0

Description


The LongJmp pass reuses stubs that jump to the same target to save binary size.

The diff that BOLT applies could look like this:

```
<func_a>:
    inst a
    inst b
-   bl <target>
+   bl <stub>
    inst c      // the stub might not be at the exact place where the "bl <target>" was
    inst d
+   b cont
+   adrp
+   add
+   br x16
+cont:
    inst e
    ...

<func_b>:
    inst a
-   bl <target>
+   bl <stub>
    inst b
```

The problem is that BOLT does not modify the DWARF CFI to account for this, which means we cannot unwind from the stub.

What should happen?

DWARF CFIs should indicate that while we are in the Stub, the return address is in x30/LR[1].
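A minimal sketch of the idea, expressed as GAS `.cfi` directives around the inlined stub (the exact directives and operands are my assumption; BOLT would emit the equivalent CFI instructions directly rather than assembler directives):

```
<stub>:
    .cfi_remember_state         // save func_a's current unwind rules
    .cfi_same_value 30          // reg 30 (x30/LR) now holds the return address
    adrp    x16, target         // illustrative operands
    add     x16, x16, :lo12:target
    br      x16
    .cfi_restore_state          // back to func_a's rules for the code after the stub
cont:
    ...
```

Since the stub only clobbers x16 and does not touch SP or any callee-saved register, the return-address rule should be the only rule that needs to change for that address range.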

What happens instead?

After applying BOLT to Clang and running `llvm-objdump --dwarf=frames clang.bolt`, I see this:

```
  DW_CFA_advance_loc: 4 to 0xadab298
  DW_CFA_offset: reg28 -8
  DW_CFA_offset: reg27 -16
  // The stub is at 0xadab364; the CFI jumps over it.
  DW_CFA_advance_loc2: 272 to 0xadab3a8
  DW_CFA_restore: reg26
  DW_CFA_restore: reg25
  DW_CFA_advance_loc: 4 to 0xadab3ac
```

As there is no CFI indicating that the return address changed after the `bl <stub>`, the unwinder still applies func_a's return-address rule while inside the stub. This is incorrect.

I also checked with AOSP's unwind_reg_info helper tool[2], and it likewise showed that the return address location is not updated.

Note: this is temporary

Once execution continues to <target>, we can unwind again: unwinding from there skips the stub (which is expected).
So the problem is only present while execution is inside a stub.

Because of this, the issue does not affect exception handling, only unwinding triggered by external (asynchronous) sources, e.g. signal handlers or debuggers.

Probability of incorrect unwinding

When optimizing nginx, the output binary had ~278k instructions in the .text section, while BOLT generated ~200 stubs, i.e. ~600 stub instructions.

If an external unwind request arrives at a random moment, and we assume all instructions are equally likely to be executing, we get roughly a 0.2% chance of incorrect unwinding (see the estimate below).
In reality it is even less likely, because many stubs are placed to jump from hot code to cold code, so they are not part of the hot instructions.
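A back-of-the-envelope estimate under that uniform-execution assumption:

```
P(PC is inside a stub) ≈ 600 stub instructions / 278,000 .text instructions ≈ 0.2%
```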

I don't think processes receive many unwind requests, so this is unlikely to cause issues in practice, but if we can generate DWARF that matches the stubs, we can eliminate any chance of such a problem occurring.


[1]: I think a DW_CFA_remember_state / DW_CFA_restore_state pair is needed around the stub, plus an indication that the return address is now in x30.
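In `--dwarf=frames` terms, the FDE could then carry something like this around the stub's address range (a rough sketch; the advance amounts and exact rendering are my assumption):

```
  DW_CFA_advance_loc: ... to 0xadab364   // start of the stub
  DW_CFA_remember_state:
  DW_CFA_same_value: reg30               // return address is live in x30 here
  DW_CFA_advance_loc: ... to <end of stub>
  DW_CFA_restore_state:
```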

[2]: Just to avoid confusion: this tool does not require that the input ELF is an AOSP binary. I used an AArch64 Linux Clang binary. Source of the tool: https://cs.android.com/android/platform/superproject/main/+/main:system/unwinding/libunwindstack/tools/unwind_reg_info.cpp
