Skip to content

Commit 29fef3a

Browse files
authored
[BOLT] Improve DWARF CFI generation for pac-ret binaries (#163381)
During InsertNegateRAState pass we check the annotations on instructions, to decide where to generate the OpNegateRAState CFIs in the output binary. As only instructions in the input binary were annotated, we have to make a judgement on instructions generated by other BOLT passes. Incorrect placement may cause issues when an (async) unwind request is received during the new "unknown" instructions. This patch adds more logic to make a more informed decision on by taking into account: - unknown instructions in a BasicBlock with other instruction have the same RAState. Previously, if the BasicBlock started with an unknown instruction, the RAState was copied from the preceding block. Now, the RAState is copied from the succeeding instructions in the same block. - Some BasicBlocks may only contain instructions with unknown RAState, As explained in issue #160989, these blocks already have incorrect unwind info. Because of this, the last known RAState based on the layout order is copied. Updated bolt/docs/PacRetDesign.md to reflect changes.
1 parent 8ceeba8 commit 29fef3a

File tree

6 files changed

+525
-32
lines changed

6 files changed

+525
-32
lines changed

bolt/docs/PacRetDesign.md

Lines changed: 14 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -200,15 +200,22 @@ This pass runs after optimizations. It performns the _inverse_ of MarkRAState pa
200200
Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
201201
to know what RA state these have.
202202

203-
The current solution has the `inferUnknownStates` function to cover these, using
204-
a fairly simple strategy: unknown states inherit the last known state.
203+
> [!important]
204+
> As issue #160989 explains, unwind info is missing from stubs.
205+
> For this same reason, we cannot generate correct pac-specific unwind info: the
206+
> signedness of the _incorrect_ return address is meaningless.
205207
206-
This will be updated to a more robust solution.
208+
Assignment of RAStates to newly generated instructions is done in `inferUnknownStates`.
209+
We have two different cases to cover:
207210

208-
> [!important]
209-
> As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
210-
> For this same reason, we cannot generate correct pac-specific unwind info: the signess
211-
> of the _incorrect_ return address is meaningless.
211+
1. If a BasicBlock has some instructions with known RA state, and some without, we
212+
can copy the RAState of known instructions to the unknown ones. As the control
213+
flow only changes between BasicBlocks, instructions in the same BasicBlock have
214+
the same return address. (The exception is noreturn calls, but these would only
215+
cause problems, if the newly inserted instruction is right after the call.)
216+
217+
2. If a BasicBlock has no instructions with known RAState, we have to copy the
218+
RAState of the previous BasicBlock in layout order.
212219

213220
### Optimizations requiring special attention
214221

bolt/include/bolt/Passes/InsertNegateRAStatePass.h

Lines changed: 23 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
//===- bolt/Passes/InsertNegateRAStatePass.cpp ----------------------------===//
1+
//===- bolt/Passes/InsertNegateRAStatePass.h ------------------------------===//
22
//
33
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
44
// See https://llvm.org/LICENSE.txt for license information.
@@ -30,9 +30,30 @@ class InsertNegateRAState : public BinaryFunctionPass {
3030
private:
3131
/// Because states are tracked as MCAnnotations on individual instructions,
3232
/// newly inserted instructions do not have a state associated with them.
33-
/// New states are "inherited" from the last known state.
33+
/// Uses fillUnknownStateInBB and fillUnknownStubs.
3434
void inferUnknownStates(BinaryFunction &BF);
3535

36+
/// Simple case: copy RAStates to unknown insts from previous inst.
37+
/// If the first inst has unknown state, copy set it to the first known state.
38+
/// Accounts for signing and authenticating insts.
39+
void fillUnknownStateInBB(BinaryContext &BC, BinaryBasicBlock &BB);
40+
41+
/// Fill in RAState in BasicBlocks consisting entirely of new instructions.
42+
/// As of #160989, we have to copy the RAState from the previous BB in the
43+
/// layout, because CFIs are already incorrect here.
44+
void fillUnknownStubs(BinaryFunction &BF);
45+
46+
/// Returns the first known RAState from \p BB, or std::nullopt if all are
47+
/// unknown.
48+
std::optional<bool> getFirstKnownRAState(BinaryContext &BC,
49+
BinaryBasicBlock &BB);
50+
51+
/// \p Return true if all instructions have unknown RAState.
52+
bool isUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB);
53+
54+
/// Set all instructions in \p BB to \p State.
55+
void markUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB, bool State);
56+
3657
/// Support for function splitting:
3758
/// if two consecutive BBs with Signed state are going to end up in different
3859
/// functions (so are held by different FunctionFragments), we have to add a

bolt/lib/Passes/InsertNegateRAStatePass.cpp

Lines changed: 124 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -52,8 +52,8 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
5252
MCInst &Inst = *It;
5353
if (BC.MIB->isCFI(Inst))
5454
continue;
55-
auto RAState = BC.MIB->getRAState(Inst);
56-
if (!RAState) {
55+
std::optional<bool> RAState = BC.MIB->getRAState(Inst);
56+
if (!RAState.has_value()) {
5757
BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
5858
<< " in function " << BF.getPrintName() << "\n";
5959
PassFailed = true;
@@ -74,6 +74,20 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
7474
}
7575
}
7676

77+
void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
78+
BinaryContext &BC = BF.getBinaryContext();
79+
80+
// Fill in missing RAStates in simple cases (inside BBs).
81+
for (BinaryBasicBlock &BB : BF) {
82+
fillUnknownStateInBB(BC, BB);
83+
}
84+
// BasicBlocks which are made entirely of "new instructions" (instructions
85+
// without RAState annotation) are stubs, and do not have correct unwind info.
86+
// We should iterate in layout order and fill them based on previous known
87+
// RAState.
88+
fillUnknownStubs(BF);
89+
}
90+
7791
void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
7892
FunctionFragment &FF) {
7993
BinaryContext &BC = BF.getBinaryContext();
@@ -92,8 +106,8 @@ void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
92106
// If a function is already split in the input, the first FF can also start
93107
// with Signed state. This covers that scenario as well.
94108
auto II = (*FirstNonEmpty)->getFirstNonPseudo();
95-
auto RAState = BC.MIB->getRAState(*II);
96-
if (!RAState) {
109+
std::optional<bool> RAState = BC.MIB->getRAState(*II);
110+
if (!RAState.has_value()) {
97111
BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
98112
<< " in function " << BF.getPrintName() << "\n";
99113
PassFailed = true;
@@ -104,32 +118,119 @@ void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
104118
MCCFIInstruction::createNegateRAState(nullptr));
105119
}
106120

107-
void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
121+
std::optional<bool>
122+
InsertNegateRAState::getFirstKnownRAState(BinaryContext &BC,
123+
BinaryBasicBlock &BB) {
124+
for (const MCInst &Inst : BB) {
125+
if (BC.MIB->isCFI(Inst))
126+
continue;
127+
std::optional<bool> RAState = BC.MIB->getRAState(Inst);
128+
if (RAState.has_value())
129+
return RAState;
130+
}
131+
return std::nullopt;
132+
}
133+
134+
bool InsertNegateRAState::isUnknownBlock(BinaryContext &BC,
135+
BinaryBasicBlock &BB) {
136+
std::optional<bool> FirstRAState = getFirstKnownRAState(BC, BB);
137+
return !FirstRAState.has_value();
138+
}
139+
140+
void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
141+
BinaryBasicBlock &BB) {
142+
143+
auto First = BB.getFirstNonPseudo();
144+
if (First == BB.end())
145+
return;
146+
// If the first instruction has unknown RAState, we should copy the first
147+
// known RAState.
148+
std::optional<bool> RAState = BC.MIB->getRAState(*First);
149+
if (!RAState.has_value()) {
150+
std::optional<bool> FirstRAState = getFirstKnownRAState(BC, BB);
151+
if (!FirstRAState.has_value())
152+
// We fill unknown BBs later.
153+
return;
154+
155+
BC.MIB->setRAState(*First, *FirstRAState);
156+
}
157+
158+
// At this point we know the RAState of the first instruction,
159+
// so we can propagate the RAStates to all subsequent unknown instructions.
160+
MCInst Prev = *First;
161+
for (auto It = First + 1; It != BB.end(); ++It) {
162+
MCInst &Inst = *It;
163+
if (BC.MIB->isCFI(Inst))
164+
continue;
165+
166+
// No need to check for nullopt: we only entered this loop after the first
167+
// instruction had its RAState set, and RAState is always set for the
168+
// previous instruction in the previous iteration of the loop.
169+
std::optional<bool> PrevRAState = BC.MIB->getRAState(Prev);
170+
171+
std::optional<bool> RAState = BC.MIB->getRAState(Inst);
172+
if (!RAState.has_value()) {
173+
if (BC.MIB->isPSignOnLR(Prev))
174+
PrevRAState = true;
175+
else if (BC.MIB->isPAuthOnLR(Prev))
176+
PrevRAState = false;
177+
BC.MIB->setRAState(Inst, *PrevRAState);
178+
}
179+
Prev = Inst;
180+
}
181+
}
182+
183+
void InsertNegateRAState::markUnknownBlock(BinaryContext &BC,
184+
BinaryBasicBlock &BB, bool State) {
185+
// If we call this when an Instruction has either kRASigned or kRAUnsigned
186+
// annotation, setRASigned or setRAUnsigned would fail.
187+
assert(isUnknownBlock(BC, BB) &&
188+
"markUnknownBlock should only be called on unknown blocks");
189+
for (MCInst &Inst : BB) {
190+
if (BC.MIB->isCFI(Inst))
191+
continue;
192+
BC.MIB->setRAState(Inst, State);
193+
}
194+
}
195+
196+
void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
108197
BinaryContext &BC = BF.getBinaryContext();
109198
bool FirstIter = true;
110199
MCInst PrevInst;
111-
for (BinaryBasicBlock &BB : BF) {
112-
for (MCInst &Inst : BB) {
113-
if (BC.MIB->isCFI(Inst))
114-
continue;
200+
for (FunctionFragment &FF : BF.getLayout().fragments()) {
201+
for (BinaryBasicBlock *BB : FF) {
202+
if (FirstIter) {
203+
FirstIter = false;
204+
if (isUnknownBlock(BC, *BB))
205+
// If the first BasicBlock is unknown, the function's entry RAState
206+
// should be used.
207+
markUnknownBlock(BC, *BB, BF.getInitialRAState());
208+
} else if (isUnknownBlock(BC, *BB)) {
209+
// As explained in issue #160989, the unwind info is incorrect for
210+
// stubs. Indicating the correct RAState without the rest of the unwind
211+
// info being correct is not useful. Instead, we copy the RAState from
212+
// the previous instruction.
213+
std::optional<bool> PrevRAState = BC.MIB->getRAState(PrevInst);
214+
if (!PrevRAState.has_value()) {
215+
// No non-cfi instruction encountered in the function yet.
216+
// This means the RAState is the same as at the function entry.
217+
markUnknownBlock(BC, *BB, BF.getInitialRAState());
218+
continue;
219+
}
115220

116-
auto RAState = BC.MIB->getRAState(Inst);
117-
if (!FirstIter && !RAState) {
118221
if (BC.MIB->isPSignOnLR(PrevInst))
119-
RAState = true;
222+
PrevRAState = true;
120223
else if (BC.MIB->isPAuthOnLR(PrevInst))
121-
RAState = false;
122-
else {
123-
auto PrevRAState = BC.MIB->getRAState(PrevInst);
124-
RAState = PrevRAState ? *PrevRAState : false;
125-
}
126-
BC.MIB->setRAState(Inst, *RAState);
127-
} else {
128-
FirstIter = false;
129-
if (!RAState)
130-
BC.MIB->setRAState(Inst, BF.getInitialRAState());
224+
PrevRAState = false;
225+
markUnknownBlock(BC, *BB, *PrevRAState);
131226
}
132-
PrevInst = Inst;
227+
// This function iterates on BasicBlocks, so the PrevInst has to be
228+
// updated to the last instruction of the current BasicBlock. If the
229+
// BasicBlock is empty, or only has PseudoInstructions, PrevInst will not
230+
// be updated.
231+
auto Last = BB->getLastNonPseudo();
232+
if (Last != BB->rend())
233+
PrevInst = *Last;
133234
}
134235
}
135236
}

bolt/unittests/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,3 +7,4 @@ endfunction()
77

88
add_subdirectory(Core)
99
add_subdirectory(Profile)
10+
add_subdirectory(Passes)
Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
set(LLVM_LINK_COMPONENTS
2+
DebugInfoDWARF
3+
Object
4+
MC
5+
${BOLT_TARGETS_TO_BUILD}
6+
)
7+
8+
add_bolt_unittest(PassTests
9+
InsertNegateRAState.cpp
10+
11+
DISABLE_LLVM_LINK_LLVM_DYLIB
12+
)
13+
14+
target_link_libraries(PassTests
15+
PRIVATE
16+
LLVMBOLTCore
17+
LLVMBOLTRewrite
18+
LLVMBOLTPasses
19+
LLVMBOLTProfile
20+
LLVMBOLTUtils
21+
)
22+
23+
foreach (tgt ${BOLT_TARGETS_TO_BUILD})
24+
include_directories(
25+
${LLVM_MAIN_SRC_DIR}/lib/Target/${tgt}
26+
${LLVM_BINARY_DIR}/lib/Target/${tgt}
27+
)
28+
string(TOUPPER "${tgt}" upper)
29+
target_compile_definitions(PassTests PRIVATE "${upper}_AVAILABLE")
30+
endforeach()

0 commit comments

Comments
 (0)