-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[BOLT] Improve InsertNegateRAStatePass::inferUnknownStates #163381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-bolt Author: Gergely Bálint (bgergely0) ChangesPrevious implementation used a simple heuristic. This can be improved in
Updated bolt/docs/PacRetDesign.md to reflect changes. Full diff: https://github.com/llvm/llvm-project/pull/163381.diff 3 Files Affected:
diff --git a/bolt/docs/PacRetDesign.md b/bolt/docs/PacRetDesign.md
index f3fe5fbd522cb..c7c76cac3a100 100644
--- a/bolt/docs/PacRetDesign.md
+++ b/bolt/docs/PacRetDesign.md
@@ -200,16 +200,29 @@ This pass runs after optimizations. It performns the _inverse_ of MarkRAState pa
Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
to know what RA state these have.
-The current solution has the `inferUnknownStates` function to cover these, using
-a fairly simple strategy: unknown states inherit the last known state.
-
-This will be updated to a more robust solution.
-
> [!important]
> As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
> For this same reason, we cannot generate correct pac-specific unwind info: the signess
> of the _incorrect_ return address is meaningless.
+Assignment of RAStates to newly generated instructions is done in `inferUnknownStates`.
+We have three different cases to cover:
+
+1. If a BasicBlock has some instructions with known RA state, and some without, we
+ can copy the RAState of known instructions to the unknown ones. As the control
+ flow only changes between BasicBlocks, instructions in the same BasicBlock have the
+ same return address.
+
+2. If all instructions in a BasicBlock are unknown, we can look at all CFG neighbors
+ (that is predecessors/successors). The RAState should be the same as of the
+ neighboring blocks. Conflicting RAStates in neighbors indicate an error. Such
+ functions should be ignored.
+
+3. If a BasicBlock has no CFG neighbors, we have to copy the RAState of the previous
+ BasicBlock in layout order.
+
+If any BasicBlocks remain with unknown instructions, the function will be ignored.
+
### Optimizations requiring special attention
Marking states before optimizations ensure that instructions can be moved around
diff --git a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
index 836948bf5e9c0..b4b428207b657 100644
--- a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
+++ b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
@@ -1,4 +1,4 @@
-//===- bolt/Passes/InsertNegateRAStatePass.cpp ----------------------------===//
+//===- bolt/Passes/InsertNegateRAStatePass.h ------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -30,9 +30,39 @@ class InsertNegateRAState : public BinaryFunctionPass {
private:
/// Because states are tracked as MCAnnotations on individual instructions,
/// newly inserted instructions do not have a state associated with them.
- /// New states are "inherited" from the last known state.
void inferUnknownStates(BinaryFunction &BF);
+ /// Simple case: copy RAStates to unknown insts from previous inst.
+ /// Account for signing and authenticating insts.
+ void fillUnknownStateInBB(BinaryContext &BC, BinaryBasicBlock &BB);
+
+ /// Fill unknown RAStates in BBs with no successors/predecessors. These are
+ /// Stubs inserted by LongJmp. As of #160989, we have to copy the RAState from
+ /// the previous BB in the layout, because CFIs are already incorrect here.
+ void fillUnknownStubs(BinaryFunction &BF);
+
+ /// Fills unknowns RAStates of BBs with successors/predecessors. Uses
+ /// getRAStateByCFG to determine the RAState. Does more than one iteration if
+ /// needed. Reports an error, if it cannot find the RAState for all BBs with
+ /// predecessors/successors.
+ void fillUnknownBlocksInCFG(BinaryFunction &BF);
+
+ /// For BBs which only hold instructions with unknown RAState, we check
+ /// CFG neighbors (successors, predecessors) of the BB. If they have different
+ /// RAStates, we report an inconsistency. Otherwise, we return the found
+ /// RAState.
+ std::optional<bool> getRAStateByCFG(BinaryBasicBlock &BB, BinaryFunction &BF);
+ /// Returns the first known RAState from \p BB, or std::nullopt if all are
+ /// unknown.
+ std::optional<bool> getFirstKnownRAState(BinaryContext &BC,
+ BinaryBasicBlock &BB);
+
+ /// \p Return true if all instructions have unknown RAState.
+ bool isUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB);
+
+ /// Set all instructions in \p BB to \p State.
+ void markUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB, bool State);
+
/// Support for function splitting:
/// if two consecutive BBs with Signed state are going to end up in different
/// functions (so are held by different FunctionFragments), we have to add a
diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index c29f26c10f253..dce634ae88d5a 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -50,8 +50,10 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
continue;
auto RAState = BC.MIB->getRAState(Inst);
if (!RAState) {
- BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
- << " in function " << BF.getPrintName() << "\n";
+ BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates"
+ << " in function " << BF.getPrintName()
+ << ". Function will not be optimized.\n";
+ BF.setIgnored();
}
if (!FirstIter) {
// Consecutive instructions with different RAState means we need to
@@ -69,6 +71,20 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
}
}
+void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
+ BinaryContext &BC = BF.getBinaryContext();
+
+ // Fill in missing RAStates in simple cases (inside BBs).
+ for (BinaryBasicBlock &BB : BF) {
+ fillUnknownStateInBB(BC, BB);
+ }
+ // Some stubs have no predecessors. For those, we iterate once in the layout
+ // order to fill their RAState.
+ fillUnknownStubs(BF);
+
+ fillUnknownBlocksInCFG(BF);
+}
+
void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
FunctionFragment &FF) {
BinaryContext &BC = BF.getBinaryContext();
@@ -86,43 +102,217 @@ void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
});
// If a function is already split in the input, the first FF can also start
// with Signed state. This covers that scenario as well.
- auto RAState = BC.MIB->getRAState(*(*FirstNonEmpty)->begin());
+ auto II = (*FirstNonEmpty)->getFirstNonPseudo();
+ auto RAState = BC.MIB->getRAState(*II);
if (!RAState) {
- BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
- << " in function " << BF.getPrintName() << "\n";
+ BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates"
+ << " in function " << BF.getPrintName()
+ << ". Function will not be optimized.\n";
+ BF.setIgnored();
return;
}
if (*RAState)
- BF.addCFIInstruction(*FirstNonEmpty, (*FirstNonEmpty)->begin(),
+ BF.addCFIInstruction(*FirstNonEmpty, II,
MCCFIInstruction::createNegateRAState(nullptr));
}
-void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
+std::optional<bool>
+InsertNegateRAState::getFirstKnownRAState(BinaryContext &BC,
+ BinaryBasicBlock &BB) {
+ for (const MCInst &Inst : BB) {
+ if (BC.MIB->isCFI(Inst))
+ continue;
+ auto RAStateOpt = BC.MIB->getRAState(Inst);
+ if (RAStateOpt)
+ return RAStateOpt;
+ }
+ return std::nullopt;
+}
+
+void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
+ BinaryBasicBlock &BB) {
+
+ auto First = BB.getFirstNonPseudo();
+ if (First == BB.end())
+ return;
+ // If the first instruction has unknown RAState, we should copy the first
+ // known RAState.
+ auto RAStateOpt = BC.MIB->getRAState(*First);
+ if (!RAStateOpt) {
+ auto FirstRAState = getFirstKnownRAState(BC, BB);
+ if (!FirstRAState)
+ // We fill unknown BBs later.
+ return;
+
+ BC.MIB->setRAState(*First, *FirstRAState);
+ }
+
+ // At this point we know the RAState of the first instruction,
+ // so we can propagate the RAStates to all subsequent unknown instructions.
+ MCInst Prev = *First;
+ for (auto It = BB.begin() + 1; It != BB.end(); ++It) {
+ MCInst &Inst = *It;
+ if (BC.MIB->isCFI(Inst))
+ continue;
+
+ auto PrevRAState = BC.MIB->getRAState(Prev);
+ if (!PrevRAState)
+ llvm_unreachable("Previous Instruction has no RAState.");
+
+ auto RAState = BC.MIB->getRAState(Inst);
+ if (!RAState) {
+ if (BC.MIB->isPSignOnLR(Prev))
+ PrevRAState = true;
+ else if (BC.MIB->isPAuthOnLR(Prev))
+ PrevRAState = false;
+ BC.MIB->setRAState(Inst, *PrevRAState);
+ }
+ Prev = Inst;
+ }
+}
+
+bool InsertNegateRAState::isUnknownBlock(BinaryContext &BC,
+ BinaryBasicBlock &BB) {
+ for (const MCInst &Inst : BB) {
+ if (BC.MIB->isCFI(Inst))
+ continue;
+ auto RAState = BC.MIB->getRAState(Inst);
+ if (RAState)
+ return false;
+ }
+ return true;
+}
+
+void InsertNegateRAState::markUnknownBlock(BinaryContext &BC,
+ BinaryBasicBlock &BB, bool State) {
+ // If we call this when an Instruction has either kRASigned or kRAUnsigned
+ // annotation, setRASigned or setRAUnsigned would fail.
+ assert(isUnknownBlock(BC, BB) &&
+ "markUnknownBlock should only be called on unknown blocks");
+ for (MCInst &Inst : BB) {
+ if (BC.MIB->isCFI(Inst))
+ continue;
+ BC.MIB->setRAState(Inst, State);
+ }
+}
+
+std::optional<bool> InsertNegateRAState::getRAStateByCFG(BinaryBasicBlock &BB,
+ BinaryFunction &BF) {
BinaryContext &BC = BF.getBinaryContext();
- bool FirstIter = true;
- MCInst PrevInst;
- for (BinaryBasicBlock &BB : BF) {
- for (MCInst &Inst : BB) {
- if (BC.MIB->isCFI(Inst))
+
+ auto checkRAState = [&](std::optional<bool> &NeighborRAState, MCInst &Inst) {
+ auto RAState = BC.MIB->getRAState(Inst);
+ if (!RAState)
+ return;
+ if (!NeighborRAState) {
+ NeighborRAState = *RAState;
+ return;
+ }
+ if (NeighborRAState != *RAState) {
+ BC.outs() << "BOLT-WARNING: Conflicting RAState found in function "
+ << BF.getPrintName() << ". Function will not be optimized.\n";
+ BF.setIgnored();
+ }
+ };
+
+ // Holds the first found RAState from CFG neighbors.
+ std::optional<bool> NeighborRAState = std::nullopt;
+ if (BB.pred_size() != 0) {
+ for (BinaryBasicBlock *PredBB : BB.predecessors()) {
+ // find last inst of Predecessor with known RA State.
+ auto LI = PredBB->getLastNonPseudo();
+ if (LI == PredBB->rend())
+ continue;
+ MCInst &LastInst = *LI;
+ checkRAState(NeighborRAState, LastInst);
+ }
+ } else if (BB.succ_size() != 0) {
+ for (BinaryBasicBlock *SuccBB : BB.successors()) {
+ // find first inst of Successor with known RA State.
+ auto FI = SuccBB->getFirstNonPseudo();
+ if (FI == SuccBB->end())
continue;
+ MCInst &FirstInst = *FI;
+ checkRAState(NeighborRAState, FirstInst);
+ }
+ } else {
+ llvm_unreachable("Called getRAStateByCFG on a BB with no preds or succs.");
+ }
+
+ return NeighborRAState;
+}
- auto RAState = BC.MIB->getRAState(Inst);
- if (!FirstIter && !RAState) {
- if (BC.MIB->isPSignOnLR(PrevInst))
- RAState = true;
- else if (BC.MIB->isPAuthOnLR(PrevInst))
- RAState = false;
- else {
+void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
+ BinaryContext &BC = BF.getBinaryContext();
+ bool FirstIter = true;
+ MCInst PrevInst;
+ for (FunctionFragment &FF : BF.getLayout().fragments()) {
+ for (BinaryBasicBlock *BB : FF) {
+ if (!FirstIter && isUnknownBlock(BC, *BB)) {
+ // If we have no predecessors or successors, current BB is a Stub called
+ // from another BinaryFunction. As of #160989, we have to copy the
+ // PrevInst's RAState, because CFIs are already incorrect here.
+ if (BB->pred_size() == 0 && BB->succ_size() == 0) {
auto PrevRAState = BC.MIB->getRAState(PrevInst);
- RAState = PrevRAState ? *PrevRAState : false;
+ if (!PrevRAState) {
+ BF.dump();
+ BB->dump();
+ llvm_unreachable(
+ "Previous Instruction has no RAState in fillUnknownStubs.");
+ continue;
+ }
+
+ if (BC.MIB->isPSignOnLR(PrevInst)) {
+ PrevRAState = true;
+ } else if (BC.MIB->isPAuthOnLR(PrevInst)) {
+ PrevRAState = false;
+ }
+ markUnknownBlock(BC, *BB, *PrevRAState);
}
- BC.MIB->setRAState(Inst, *RAState);
- } else {
+ }
+ if (FirstIter) {
FirstIter = false;
- if (!RAState)
- BC.MIB->setRAState(Inst, BF.getInitialRAState());
+ if (isUnknownBlock(BC, *BB))
+ markUnknownBlock(BC, *BB, false);
}
- PrevInst = Inst;
+ auto Last = BB->getLastNonPseudo();
+ if (Last != BB->rend())
+ PrevInst = *Last;
+ }
+ }
+}
+void InsertNegateRAState::fillUnknownBlocksInCFG(BinaryFunction &BF) {
+ BinaryContext &BC = BF.getBinaryContext();
+
+ auto fillUnknowns = [&](BinaryFunction &BF) -> std::pair<int, bool> {
+ int Unknowns = 0;
+ bool Updated = false;
+ for (BinaryBasicBlock &BB : BF) {
+ // Only try to iterate if the BB has either predecessors or successors.
+ if (isUnknownBlock(BC, BB) &&
+ (BB.pred_size() != 0 || BB.succ_size() != 0)) {
+ auto RAStateOpt = getRAStateByCFG(BB, BF);
+ if (RAStateOpt) {
+ markUnknownBlock(BC, BB, *RAStateOpt);
+ Updated = true;
+ } else {
+ Unknowns++;
+ }
+ }
+ }
+ return std::pair<int, bool>{Unknowns, Updated};
+ };
+
+ while (true) {
+ std::pair<int, bool> Iter = fillUnknowns(BF);
+ if (Iter.first == 0)
+ break;
+ if (!Iter.second) {
+ BC.errs() << "BOLT-WARNING: Could not infer RAState for " << Iter.first
+ << " BBs in function " << BF.getPrintName()
+ << ". Function will not be optimized.\n";
+ BF.setIgnored();
+ break;
}
}
}
|
b066501 to
1ee52ed
Compare
dfdaf23 to
e8149e4
Compare
e8149e4 to
75cb9f6
Compare
195e99c to
6ad31fe
Compare
How to test this?I have a few concerns on testing:
ProposalI think the following plan is more reasonable:
I also think this single-pass gtest way of testing would be useful in other future additions to BOLT. |
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
9f5be17 to
ad33b18
Compare
| BC.outs() << "BOLT-WARNING: Conflicting RAState found in function " | ||
| << BF.getPrintName() << ". Function will not be optimized.\n"; | ||
| BF.setIgnored(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: decide if we want to ignore + continue execution, or create a fatal error.
| checkRAState(NeighborRAState, FirstInst); | ||
| } | ||
| } else { | ||
| llvm_unreachable("Called getRAStateByCFG on a BB with no preds or succs."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: change to fatal BOLT error, based on the discussion in #162820
| llvm_unreachable( | ||
| "Previous Instruction has no RAState in fillUnknownStubs."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: change to fatal BOLT error, based on the discussion in #162820
b9102ab to
d1f66f7
Compare
0453667 to
18993b2
Compare
18993b2 to
56251e7
Compare
|
force push: synced after #162820 got merged. |
paschalis-mpeis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Gergely,
Thanks for your patch. This is not a full review of the patch.
I've added a few comments that may benefit from a bit more context, and some points for clarification.
| return; | ||
| // If the first instruction has unknown RAState, we should copy the first | ||
| // known RAState. | ||
| auto RAStateOpt = BC.MIB->getRAState(*First); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for my own understanding:
In this case, is it okay to copy from the first known state, which follows the first unknown instruction? Is this a case that occurs in practice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I encountered it in practice. I included it because:
- I don't have a "global view" of all passes, with each insertion exactly understood,
- even if it does not occur currently, it could happen later, as BOLT is changing,
- if we don't assign an RA state to each and every instruction here, the pass will crash later, as we expect all to have an RA state
So we have to add an RA state, and this is a more sophisticated method, then copying the RA state of the previous BB.
As of soundness: all instructions in a BB have the same RA state, except:
- if the BB has psign/pauth
- if a BB has a noreturn call, and it is incorrectly parsed as one BB instead of two (see: [BOLT] Lack of support for noreturn functions results in incorrect CFG generation. #115154)
I do not think BOLT ever inserts a new call to a noreturn function, so this should be correct.
| // At this point we know the RAState of the first instruction, | ||
| // so we can propagate the RAStates to all subsequent unknown instructions. | ||
| MCInst Prev = *First; | ||
| for (auto It = BB.begin() + 1; It != BB.end(); ++It) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to start at Prev+1 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a first glance, that should also be correct. Although it wouldn't be a big difference, as we skip CFIs early in the loop.
| MCInst &LastInst = *LI; | ||
| checkRAState(NeighborRAState, LastInst); | ||
| } | ||
| } else if (BB.succ_size() != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be an else if, or shall we confirm a matching state with successors as well?
| for (BinaryBasicBlock &BB : BF) { | ||
| // Only try to iterate if the BB has either predecessors or successors. | ||
| if (isUnknownBlock(BC, BB) && | ||
| (BB.pred_size() != 0 || BB.succ_size() != 0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to check if that block is an entrypoint too?
This could also clarify that the BB is indeed reachable by the program, ie just a single block in a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess so? But checking if there is a BB with no predecessors or successors AND it is not an entry point is a check unrelated to Pauth, and should be done elsewhere.
Or I am not sure what error would it exactly indicate.
WDYT?
Previous implementation used a simple heuristic. This can be improved in several ways: - If a BasicBlock has instruction both with known RAState and unknown RAState, use the known states to work out the unknown ones. - If a BasicBlock only consists of instructions with unknown RAState, use the last known RAState from its predecessors, or the first known from its successors to set the RAStates in the BasicBlock. This includes error checking: all predecessors/successors should have the same RAState. - Some BasicBlocks may only contain instructions with unknown RAState, and have no CFG neighbors. These already have incorrect unwind info. For these, we copy the last known RAState based on the layout order. Updated bolt/docs/PacRetDesign.md to reflect changes.
This commit creates a new directory: bolt/unittests/Passes, to be used by unittests that need to register and run passes with the BinaryFunctionPassManager. An example test is created for InsertNegateRAState pass. Actual tests will be added in followup commits.
56251e7 to
fb20d93
Compare

Previous implementation used a simple heuristic. This can be improved in
several ways:
use the known states to work out the unknown ones.
use the last known RAState from its predecessors, or the first known
from its successors to set the RAStates in the BasicBlock. This includes
error checking: all predecessors/successors should have the same RAState.
and have no CFG neighbors. These already have incorrect unwind info.
For these, we copy the last known RAState based on the layout order.
Updated bolt/docs/PacRetDesign.md to reflect changes.