Commit 11e897d

[BOLT] Update bolt/docs/PacRetDesign.md
1 parent c0b4df4 commit 11e897d

1 file changed

+47
-45
lines changed

bolt/docs/PacRetDesign.md

Lines changed: 47 additions & 45 deletions
@@ -14,7 +14,7 @@ in assembly, or `OpNegateRAState` in BOLT sources. In this document, I will use
1414

1515
### Pointer Authentication
1616

17-
Refer to the [pac-ret section of the BOLT-binary-analysis document](BinaryAnalysis.md#pac-ret-analysis).
17+
For more information, see the [pac-ret section of the BOLT-binary-analysis document](BinaryAnalysis.md#pac-ret-analysis).
1818

1919
### DW_CFA_AARCH64_negate_ra_state
2020

@@ -27,14 +27,14 @@ The DW_CFA_AARCH64_negate_ra_state operation negates bit[0] of the RA_SIGN_STATE
2727

2828
This bit indicates to the unwinder whether the current return address is signed
2929
or not (hence the name). The unwinder uses this information to authenticate the
30-
pointer, and remove the Pointer Authentication Code (PAC) bits. Incorrect
31-
negate-ra-state placement can lead to the unwinder trying to authenticate an
32-
unsigned pointer (which segfaults), or skipping authenticating a signed pointer,
33-
and trying to access an incorrect location (also leading to a segfault).
30+
pointer, and remove the Pointer Authentication Code (PAC) bits.
31+
Incorrect placement of negate-ra-state CFIs causes the unwinder to either attempt
32+
to authenticate an unsigned pointer (resulting in a segmentation fault), or skip
33+
authentication on a signed pointer, which can also cause a fault.
3434

35-
Note: not *all* unwinders do this. Some use the `xpac` instruction to strip the
36-
PAC bits without authenticating the pointer. This is an incorrect (incomplete)
37-
implementation, as it allows control-flow modification in the case of unwinding.
35+
Note: some unwinders use the `xpac` instruction to strip the PAC bits without
36+
authenticating the pointer. This is an incorrect (incomplete) implementation,
37+
as it allows control-flow modification in the case of unwinding.
3838

3939
There are no DWARF instructions to directly set or clear the RA State. However,
4040
two other CFIs can also affect the RA state:
@@ -56,12 +56,12 @@ is not widely used, and is [likely to become deprecated](https://github.com/ARM-
5656

5757
### Where are these CFIs needed?
5858

59-
In all locations, where two consecutive instructions have different RA state,
60-
this needs to be indicated to the unwinder. This happens at pointer signing and
61-
authenticating. The other case where two consecutive instructions have different
62-
RA state, but neither of them is signing or authenticating means that they are
63-
not next to each other in control flow. One is part of an execution path with
64-
signed RA, the other is part of a path with an unsigned RA.
59+
Whenever two consecutive instructions have different RA states, the unwinder must
60+
be informed of the change. This typically occurs during pointer signing or
61+
authentication. If adjacent instructions differ in RA state but neither signs
62+
nor authenticates the return address, they must belong to different control flow
63+
paths. One is part of an execution path with signed RA, the other is part of a
64+
path with an unsigned RA.
6565

6666
In the example below, the first BasicBlock ends in a conditional branch, and
6767
jumps to two different BasicBlocks, each with their own authentication, and
@@ -103,7 +103,7 @@ negate-ra-state CFIs will become invalid during BasicBlock reordering.
103103

104104
## Solution design
105105

106-
The patch introduces two new passes:
106+
The implementation introduces two new passes:
107107
1. `MarkRAStatesPass`: assigns the RA state to each instruction based on the CFIs
108108
in the input binary
109109
2. `InsertNegateRAStatePass`: reads those assigned instruction RA states after
@@ -123,24 +123,21 @@ with the CFI processing that already happens in BOLT (e.g. remember-state and
123123
restore-state CFIs are removed in `normalizeCFIState` for reasons unrelated to PAC).
124124

125125
As we add the MCAnnotations *to instructions*, we have to account for the case
126-
where the function starts with a CFI altering the RA state. If a function starts
127-
with a negate-ra-state CFI for example, we cannot save the annotation on the
128-
first instruction, because that itself should already be signed. This is why all
129-
BinaryFunctions have an `initialRAState` bool. If the `Offset` the CFI refers to
130-
is zero, we don't store an annotation, but set the `initialRAState` in
131-
`FillCFIInfoFor`. This information is then used in `MarkRAStates`.
126+
where the function starts with a CFI altering the RA state. As CFIs modify the RA
127+
state of the instructions before them, we cannot add the annotation to the first
128+
instruction.
129+
This special case is handled by adding an `initialRAState` bool to each BinaryFunction.
130+
If the `Offset` the CFI refers to is zero, we don't store an annotation, but set
131+
the `initialRAState` in `FillCFIInfoFor`. This information is then used in
132+
`MarkRAStates`.
132133

133134
### Binaries without DWARF info
134135

135136
In some cases, the DWARF tables are stripped from the binary. These programs
136-
usually have some other unwind-mechanism. To account for code that uses Pointer
137-
Authentication, but does not have DWARF CFIs, the passes only run on functions
138-
that had at least one negate-ra-state CFI. This information is saved on the
139-
functions during CFI reading.
140-
141-
This also makes sure that the passes don't run on functions that do not store
142-
the return address to the stack, and don't need Pointer Authentication, saving
143-
on runtime overhead.
137+
usually have some other unwind-mechanism.
138+
These passes only run on functions that include at least one negate-ra-state CFI.
139+
This avoids processing functions that do not use Pointer Authentication, and
140+
functions that use Pointer Authentication but do not have DWARF info.
144141

145142
In summary:
146143
- pointer auth is not used: no change, the new passes do not run.
@@ -149,20 +146,20 @@ In summary:
149146
- pointer auth is used, and we have DWARF CFIs: passes run, and rewrite the
150147
negate-ra-state CFI.
151148

152-
### MarkRAStates Pass
149+
### MarkRAStates pass
153150

154151
This pass runs before optimizations reorder anything.
155152

156153
It processes MCAnnotations generated during the CFI reading stage to check if
157154
instructions have any of the three CFIs that can modify RA state:
158-
- negate-ra-state
159-
- remember-state
160-
- restore-state
155+
- negate-ra-state,
156+
- remember-state,
157+
- restore-state.
161158

162159
Then it adds new MCAnnotations to each instruction, indicating their RA state.
163160
Those annotations are:
164-
- Signed
165-
- Unsigned
161+
- Signed,
162+
- Unsigned.
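
A minimal sketch of how such a marking step could work, written as standalone C++ rather than against BOLT's real data structures. The names `Inst`, `CFIKind`, `RAState` and `markRAStatesSketch` are hypothetical, and the sketch assumes that a CFI attached to an instruction changes the state seen by the instructions that follow it. The `InitialRAState` parameter plays the role of the per-function `initialRAState` bool described earlier.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; these are not BOLT's real types.
enum class RAState { Unsigned, Signed };
enum class CFIKind { None, NegateRAState, RememberState, RestoreState };

struct Inst {
  CFIKind Cfi = CFIKind::None;       // CFI annotation read from the input
  RAState State = RAState::Unsigned; // annotation written by the sketch
};

// Assumption: a CFI attached to an instruction affects the instructions
// after it. Returns false when a restore-state has no matching
// remember-state, which the caller could treat as an inconsistency.
bool markRAStatesSketch(std::vector<Inst> &Insts, bool InitialRAState) {
  RAState Current = InitialRAState ? RAState::Signed : RAState::Unsigned;
  std::vector<RAState> Remembered; // stack used by remember/restore-state
  for (Inst &I : Insts) {
    I.State = Current;
    switch (I.Cfi) {
    case CFIKind::None:
      break;
    case CFIKind::NegateRAState:
      Current = Current == RAState::Signed ? RAState::Unsigned
                                           : RAState::Signed;
      break;
    case CFIKind::RememberState:
      Remembered.push_back(Current);
      break;
    case CFIKind::RestoreState:
      if (Remembered.empty())
        return false;
      Current = Remembered.back();
      Remembered.pop_back();
      break;
    }
  }
  return true;
}
```

In this sketch, a function shaped like "sign, body, authenticate, return" ends up with the body and the authenticating instruction marked Signed, and the first and last instructions marked Unsigned.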
166163

167164
Below is a simple example that shows the two different types of annotations:
168165
what we have before the pass, and after it.
@@ -179,9 +176,9 @@ what we have before the pass, and after it.
179176
##### Error handling in MarkRAState Pass:
180177

181178
Whenever the MarkRAStates pass finds inconsistencies in the current
182-
BinaryFunction, it ignores it by calling `BF.setIgnored()`. This prevents BOLT
183-
from optimizing that function, but it will still be emitted as part of the
184-
original section (`.bolt.org.text`) in its original form.
179+
BinaryFunction, it marks the function as ignored using `BF.setIgnored()`. BOLT
180+
will not optimize this function but will emit it unchanged in the original section
181+
(`.bolt.org.text`).
185182

186183
The inconsistencies are as follows:
187184
- finding a `pac*` instruction when already in signed state
@@ -193,8 +190,7 @@ exact functions ignored, and the found inconsistency.
193190

194191
### InsertNegateRAStatePass
195192

196-
This pass runs after the optimizations are done. In essence, it does the _inverse_
197-
of MarkRAState pass:
193+
This pass runs after optimizations. It performs the _inverse_ of the MarkRAState pass:
198194
1. it reads the RA state annotations attached to the instructions, and
199195
2. whenever the state changes, it adds a PseudoInstruction that holds an
200196
OpNegateRAState CFI.
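
The two steps above can be sketched roughly as follows. This is standalone, illustrative C++: `RAState` and `negateRAStatePositions` are hypothetical names, and the real pass works on BOLT's instruction and BasicBlock types rather than on a flat vector. The sketch assumes the function's entry state is known and that every instruction in the final layout already carries a state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-instruction RA state annotation.
enum class RAState { Unsigned, Signed };

// Walks the instructions in their final layout order and returns the indices
// of the instructions before which a negate-ra-state CFI would be needed,
// i.e. every position where the state differs from the previous one.
std::vector<std::size_t>
negateRAStatePositions(const std::vector<RAState> &Layout, RAState Initial) {
  std::vector<std::size_t> Positions;
  RAState Prev = Initial;
  for (std::size_t I = 0; I < Layout.size(); ++I) {
    if (Layout[I] != Prev)
      Positions.push_back(I);
    Prev = Layout[I];
  }
  return Positions;
}
```

Because the positions depend only on the states in the final layout, reordering instructions or blocks changes where the CFIs land without needing the original CFI locations.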
@@ -205,8 +201,14 @@ Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
205201
to know what RA state these have.
206202

207203
The current solution has the `inferUnknownStates` function to cover these, using
208-
a fairly simple strategy: unknown states inherit the last known state. Testing so
209-
far has shown that this implementation is sufficient.
204+
a fairly simple strategy: unknown states inherit the last known state.
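
The "inherit the last known state" strategy can be illustrated with a short standalone sketch. It is not the body of BOLT's `inferUnknownStates`: the name `inferUnknownStatesSketch`, the use of `std::optional` to model an unknown state, and the `Initial` fallback for unknown states at the very start are all assumptions made for illustration.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for the per-instruction RA state annotation.
enum class RAState { Unsigned, Signed };

// An empty optional models an instruction added by a later pass, whose RA
// state is unknown. Each unknown entry takes the most recent known state;
// entries before the first known state fall back to Initial.
void inferUnknownStatesSketch(std::vector<std::optional<RAState>> &States,
                              RAState Initial) {
  RAState LastKnown = Initial;
  for (std::optional<RAState> &S : States) {
    if (S)
      LastKnown = *S;
    else
      S = LastKnown;
  }
}
```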
205+
206+
This will be updated to a more robust solution.
207+
208+
> [!important]
209+
> As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
210+
> For this same reason, we cannot generate correct pac-specific unwind info: the signedness
211+
> of the _incorrect_ return address is meaningless.
210212
211213
### Optimizations requiring special attention
212214

@@ -221,6 +223,6 @@ to indicate this.
221223

222224
## Option to disallow the feature
223225

224-
To aid debugging, we added the `--disallow-pacret` flag. If the flag is used,
225-
and a function `containedNegateRAState()` after `FillCFIInfoFor()`, BOLT exits
226-
with an error. With this flag, the feature is on by default.
226+
The feature can be guarded with the `--update-branch-protection` flag, which is
227+
on by default. If the flag is set to false, and a function
228+
`containedNegateRAState()` after `FillCFIInfoFor()`, BOLT exits with an error.
