Skip to content

Conversation

@Asher8118
Copy link
Contributor

There are cases in which getEntryIDForSymbol is called, where the given Symbol is in a constant island, and so BOLT can not find its function. This causes BOLT to reach llvm_unreachable("symbol not found") and crash. This patch adds a check that avoids this crash and gives a warning to the user.

There are cases in which `getEntryIDForSymbol` is called,
where the given Symbol is in a constant island, and so BOLT
can not find its function. This causes BOLT to reach
`llvm_unreachable("symbol not found")` and crash. This patch adds
a check that avoid this crash and gives a warning to the user.
@llvmbot
Copy link
Member

llvmbot commented Sep 22, 2025

@llvm/pr-subscribers-bolt

Author: Asher Dobrescu (Asher8118)

Changes

There are cases in which getEntryIDForSymbol is called, where the given Symbol is in a constant island, and so BOLT can not find its function. This causes BOLT to reach llvm_unreachable("symbol not found") and crash. This patch adds a check that avoids this crash and gives a warning to the user.


Full diff: https://github.com/llvm/llvm-project/pull/160143.diff

2 Files Affected:

  • (modified) bolt/lib/Core/BinaryContext.cpp (+9-1)
  • (added) bolt/test/AArch64/check-symbol-area.s (+49)
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index 72c72bbaf4a65..5f3b3c0e57152 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -2374,8 +2374,16 @@ BinaryFunction *BinaryContext::getFunctionForSymbol(const MCSymbol *Symbol,
     return nullptr;
 
   BinaryFunction *BF = BFI->second;
-  if (EntryDesc)
+  if (EntryDesc) {
+    const uint64_t Address = BF->getAddress() + Symbol->getOffset();
+    if (BF->isInConstantIsland(Address)) {
+      this->outs() << "BOLT-WARNING: symbol " << Symbol->getName()
+                   << " is in data region of function 0x"
+                   << Twine::utohexstr(Address) << ".\n";
+      return nullptr;
+    }
     *EntryDesc = BF->getEntryIDForSymbol(Symbol);
+  }
 
   return BF;
 }
diff --git a/bolt/test/AArch64/check-symbol-area.s b/bolt/test/AArch64/check-symbol-area.s
new file mode 100644
index 0000000000000..43ce25e16181f
--- /dev/null
+++ b/bolt/test/AArch64/check-symbol-area.s
@@ -0,0 +1,49 @@
+// This test checks that when looking for a function
+// correspnding to a symbol, BOLT is not looking 
+// through a data area (constant island).
+
+# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
+# RUN: llvm-bolt %t.exe -o %t.bolt 2>&1 | FileCheck %s
+
+// Before adding a check for constant islands, BOLT would exit with an error
+// of the form: "symbol not found" and throw an LLVM UNREACHABLE error.
+# CHECK-NOT: symbol not found
+# CHECK-NOT: UNREACHABLE
+
+// Now BOLT throws a warning and does not crash.
+# CHECK: BOLT-WARNING: symbol [[SYM:.*]]  is in data region of function 0x{{.*}}.
+
+.text
+.global main
+main:
+        stp     x14, x15, [sp, -8]!
+        mov     x14, sp
+        adrp    x1, .test
+        add     x0, x1, :lo12:.test
+        bl      first_block
+        ret
+
+.global first_block
+$d:
+first_block:
+        stp     x14, x15, [sp, -8]!
+        mov     x14, sp
+        bl      second_block
+        ret
+second_block:
+        stp     x14, x15, [sp, -8]!
+        mov     x14, sp
+        bl      third_block
+        ret
+$x:
+third_block:
+        stp     x14, x15, [sp, -8]!
+        mov     x14, sp
+        adrp    x1, .data
+        add     x0, x1, :lo12:.test
+        ret
+
+.data
+.test:
+        .string "test"

BinaryFunction *BF = BFI->second;
if (EntryDesc)
if (EntryDesc) {
const uint64_t Address = BF->getAddress() + Symbol->getOffset();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I'm not sure Symbol->getOffset() will always produce the expected output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look. Could you please elaborate on what might be the problem here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, sorry. Symbol->getOffset() is not guaranteed to return the offset from the start of the function at arbitrary point of execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, if Symbol->getOffset() does not return the symbol offset, then it returns 0. In which case const uint64_t Address = BF->getAddress() + Symbol->getOffset(); becomes the same as BF->getAddress(). For the purpose of this patch, I think that is fine. Whether we're checking the function address or the symbol address, the point is to check if we're in a data area (constant island) before proceeding. This check is only reached in getFunctionForSymbol as a way to prevent calling getEntryIDForSymbol, and avoid hitting the llvm_unreachable, if we're in a data area.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'd prefer to make getEntryIDForSymbol() return an optional (i.e. remove the unreachable) and not rely on the implicit behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll look into making it an optional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now returning an optional. I did keep the Symbol->getOffset() check in order to give a warning to the user, but we are no longer relying on it to avoid thellvm_unreachable.

Copy link
Member

@paschalis-mpeis paschalis-mpeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Ash,

Thanks for the patch! Trying to understand the cases we hit this.

Is this the case where we fail on a symbol in a code-marked region
that follows a data marked region, like L2 below?

.text
.global foo
$d.foo:
foo:
        nop
$x.foo:
L2:
        ret

Maybe the below is a more realistic example than the above?
Note that replacing BL with a B does not trigger it.

.text
.global foo
foo:
        adrp    x0, .Lp
        add     x0, x0, :lo12:.Lp
        ldr     w0, [x0]
        bl L2
        ret
$d.L1:
.Lp:
        .word   0x123FFF
$x.L2:
L2:
        ret

@Asher8118
Copy link
Contributor Author

Thanks for your comments, @paschalis-mpeis. I've tried to address them.

Trying to understand the cases we hit this. Is this the case where we fail on a symbol in a code-marked region

We hit this when we ask BOLT to read code from a data area. So references to a symbol from a data area (marked with $d) that are called from a code area (marked with $x) will hit this. The two examples you offered hit this issue.

Note that replacing BL with a B does not trigger it.

That's because there's no link to a data area without BL, as with BL we save the return address to the link register.

@maksfb
Copy link
Contributor

maksfb commented Oct 7, 2025

.text
.global foo
foo:
adrp x0, .Lp
add x0, x0, :lo12:.Lp
ldr w0, [x0]
bl L2
ret
$d.L1:
.Lp:
.word 0x123FFF
$x.L2:
L2:
ret

I tried this example with the patch, and I'm getting:

BOLT-WARNING: symbol L2/1 is in data region of function foo(0x10284).

which is incorrect since L2 is not in the data area. Without looking further into it, I suspect it's the consequence of using Symbol->getOffset() to establish constant island an symbol association.

@maksfb
Copy link
Contributor

maksfb commented Oct 7, 2025

Note that replacing BL with a B does not trigger it.

That's because there's no link to a data area without BL, as with BL we save the return address to the link register.

With B we will generate different code eliminating the branch and making Symbol->getOffset() return a different value resulting in a different code path.

@maksfb
Copy link
Contributor

maksfb commented Oct 7, 2025

Hi Ash,

Overall, it's a good change. Besides the elimination of the unreachable/failure, we should also check that we generate correct code.

The warning, as it is, will not be helpful to the user without any further context, i.e. why it is a problem that there's a symbol in the data area of a function.

We should probably warn the user whenever the data is being accessed as code, e.g. via branch/call, and this should be done while building CFG (or with post-CFG analysis). It's possible, we will have to ignore/skip functions containing such patterns.

@Asher8118
Copy link
Contributor Author

The warning, as it is, will not be helpful to the user without any further context, i.e. why it is a problem that there's a symbol in the data area of a function.

Hi Maks, thank you for your comments. I see what you mean, the symbol being in a data area is more of a symptom than the actual underlying problem. Instead we should check when control flow goes into data and skip those functions. I'll have to look into that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants