Skip to content
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 9 additions & 1 deletion bolt/lib/Core/BinaryContext.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2374,8 +2374,16 @@ BinaryFunction *BinaryContext::getFunctionForSymbol(const MCSymbol *Symbol,
return nullptr;

BinaryFunction *BF = BFI->second;
if (EntryDesc)
if (EntryDesc) {
const uint64_t Address = BF->getAddress() + Symbol->getOffset();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I'm not sure Symbol->getOffset() will always produce the expected output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look. Could you please elaborate on what might be the problem here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, sorry. Symbol->getOffset() is not guaranteed to return the offset from the start of the function at arbitrary point of execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, if Symbol->getOffset() does not return the symbol offset, then it returns 0. In which case const uint64_t Address = BF->getAddress() + Symbol->getOffset(); becomes the same as BF->getAddress(). For the purpose of this patch, I think that is fine. Whether we're checking the function address or the symbol address, the point is to check if we're in a data area (constant island) before proceeding. This check is only reached in getFunctionForSymbol as a way to prevent calling getEntryIDForSymbol, and avoid hitting the llvm_unreachable, if we're in a data area.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'd prefer to make getEntryIDForSymbol() return an optional (i.e. remove the unreachable) and not rely on the implicit behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll look into making it an optional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now returning an optional. I did keep the Symbol->getOffset() check in order to give a warning to the user, but we are no longer relying on it to avoid thellvm_unreachable.

if (BF->isInConstantIsland(Address)) {
this->outs() << "BOLT-WARNING: symbol " << Symbol->getName()
<< " is in data region of function 0x"
<< Twine::utohexstr(BF->getAddress()) << ".\n";
return nullptr;
}
*EntryDesc = BF->getEntryIDForSymbol(Symbol);
}

return BF;
}
Expand Down
49 changes: 49 additions & 0 deletions bolt/test/AArch64/check-symbol-area.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
// This test checks that when looking for a function
// corresponding to a symbol, BOLT is not looking
// through a data area (constant island).

# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o
# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
# RUN: llvm-bolt %t.exe -o %t.bolt 2>&1 | FileCheck %s

// Before adding a check for constant islands, BOLT would exit with an error
// of the form: "symbol not found" and throw an LLVM UNREACHABLE error.
# CHECK-NOT: symbol not found
# CHECK-NOT: UNREACHABLE

// Now BOLT throws a warning and does not crash.
# CHECK: BOLT-WARNING: symbol [[SYM:.*]] is in data region of function 0x{{.*}}.

.text
.global main
main:
stp x14, x15, [sp, -8]!
mov x14, sp
adrp x1, .test
add x0, x1, :lo12:.test
bl first_block
ret

.global first_block
$d:
first_block:
stp x14, x15, [sp, -8]!
mov x14, sp
bl second_block
ret
second_block:
stp x14, x15, [sp, -8]!
mov x14, sp
bl third_block
ret
$x:
third_block:
stp x14, x15, [sp, -8]!
mov x14, sp
adrp x1, .data
add x0, x1, :lo12:.test
ret

.data
.test:
.string "test"
Loading