@@ -14,7 +14,7 @@ in assembly, or `OpNegateRAState` in BOLT sources. In this document, I will use
1414
1515### Pointer Authentication
1616
17- Refer to the [ pac-ret section of the BOLT-binary-analysis document] ( BinaryAnalysis.md#pac-ret-analysis ) .
17+ For more information, see the [pac-ret section of the BOLT-binary-analysis document](BinaryAnalysis.md#pac-ret-analysis).
1818
1919### DW_CFA_AARCH64_negate_ra_state
2020
@@ -27,14 +27,14 @@ The DW_CFA_AARCH64_negate_ra_state operation negates bit[0] of the RA_SIGN_STATE
2727
2828This bit indicates to the unwinder whether the current return address is signed
2929or not (hence the name). The unwinder uses this information to authenticate the
30- pointer, and remove the Pointer Authentication Code (PAC) bits. Incorrect
31- negate-ra-state placement can lead to the unwinder trying to authenticate an
32- unsigned pointer (which segfaults), or skipping authenticating a signed pointer,
33- and trying to access an incorrect location ( also leading to a segfault) .
30+ pointer, and remove the Pointer Authentication Code (PAC) bits.
31+ Incorrect placement of negate-ra-state CFIs causes the unwinder to either attempt
32+ to authenticate an unsigned pointer (resulting in a segmentation fault), or skip
33+ authentication of a signed pointer and access an incorrect location, which also faults.
3434
35- Note: not * all * unwinders do this. Some use the ` xpac ` instruction to strip the
36- PAC bits without authenticating the pointer. This is an incorrect (incomplete)
37- implementation, as it allows control-flow modification in the case of unwinding.
35+ Note: some unwinders use the `xpac` instruction to strip the PAC bits without
36+ authenticating the pointer. This is an incorrect (incomplete) implementation,
37+ as it allows control-flow modification in the case of unwinding.
3838
3939There are no DWARF instructions to directly set or clear the RA State. However,
4040two other CFIs can also affect the RA state:
@@ -56,12 +56,12 @@ is not widely used, and is [likely to become deprecated](https://github.com/ARM-
5656
5757### Where are these CFIs needed?
5858
59- In all locations, where two consecutive instructions have different RA state,
60- this needs to be indicated to the unwinder . This happens at pointer signing and
61- authenticating. The other case where two consecutive instructions have different
62- RA state, but neither of them is signing or authenticating means that they are
63- not next to each other in control flow . One is part of an execution path with
64- signed RA, the other is part of a path with an unsigned RA.
59+ Whenever two consecutive instructions have different RA states, the unwinder must
60+ be informed of the change. This typically occurs during pointer signing or
61+ authentication. If adjacent instructions differ in RA state but neither signs
62+ nor authenticates the return address, they must belong to different control flow
63+ paths. One is part of an execution path with signed RA, the other is part of a
64+ path with an unsigned RA.
6565
6666In the example below, the first BasicBlock ends in a conditional branch, and
6767jumps to two different BasicBlocks, each with their own authentication, and
@@ -103,7 +103,7 @@ negate-ra-state CFIs will become invalid during BasicBlock reordering.
103103
104104## Solution design
105105
106- The patch introduces two new passes:
106+ The implementation introduces two new passes:
1071071 . ` MarkRAStatesPass ` : assigns the RA state to each instruction based on the CFIs
108108 in the input binary
1091092 . ` InsertNegateRAStatePass ` : reads those assigned instruction RA states after
@@ -123,24 +123,21 @@ with the CFI processing that already happens in BOLT (e.g. remember-state and
123123restore-state CFIs are removed in ` normalizeCFIState ` for reasons unrelated to PAC).
124124
125125As we add the MCAnnotations * to instructions* , we have to account for the case
126- where the function starts with a CFI altering the RA state. If a function starts
127- with a negate-ra-state CFI for example, we cannot save the annotation on the
128- first instruction, because that itself should already be signed. This is why all
129- BinaryFunctions have an ` initialRAState ` bool. If the ` Offset ` the CFI refers to
130- is zero, we don't store an annotation, but set the ` initialRAState ` in
131- ` FillCFIInfoFor ` . This information is then used in ` MarkRAStates ` .
126+ where the function starts with a CFI altering the RA state. As CFIs modify the RA
127+ state of the instructions before them, we cannot add the annotation to the first
128+ instruction.
129+ This special case is handled by adding an `initialRAState` bool to each BinaryFunction.
130+ If the `Offset` the CFI refers to is zero, we don't store an annotation, but set
131+ the `initialRAState` in `FillCFIInfoFor`. This information is then used in
132+ `MarkRAStates`.
132133
133134### Binaries without DWARF info
134135
135136In some cases, the DWARF tables are stripped from the binary. These programs
136- usually have some other unwind-mechanism. To account for code that uses Pointer
137- Authentication, but does not have DWARF CFIs, the passes only run on functions
138- that had at least one negate-ra-state CFI. This information is saved on the
139- functions during CFI reading.
140-
141- This also makes sure that the passes don't run on functions that do not store
142- the return address to the stack, and don't need Pointer Authentication, saving
143- on runtime overhead.
137+ usually have some other unwind mechanism.
138+ These passes only run on functions that include at least one negate-ra-state CFI.
139+ This avoids processing functions that do not use Pointer Authentication, as well
140+ as functions that use Pointer Authentication but do not have DWARF info.
144141
145142In summary:
146143- pointer auth is not used: no change, the new passes do not run.
@@ -149,20 +146,20 @@ In summary:
149146- pointer auth is used, and we have DWARF CFIs: passes run, and rewrite the
150147 negate-ra-state CFI.
151148
152- ### MarkRAStates Pass
149+ ### MarkRAStates pass
153150
154151This pass runs before optimizations reorder anything.
155152
156153It processes MCAnnotations generated during the CFI reading stage to check if
157154instructions have either of the three CFIs that can modify RA state:
158- - negate-ra-state
159- - remember-state
160- - restore-state
155+ - negate-ra-state,
156+ - remember-state,
157+ - restore-state.
161158
162159Then it adds new MCAnnotations to each instruction, indicating their RA state.
163160Those annotations are:
164- - Signed
165- - Unsigned
161+ - Signed,
162+ - Unsigned.
166163
167164Below is a simple example that shows the two different types of annotations:
168165what we have before the pass, and after it.
@@ -179,9 +176,9 @@ what we have before the pass, and after it.
179176##### Error handling in MarkRAState Pass:
180177
181178Whenever the MarkRAStates pass finds inconsistencies in the current
182- BinaryFunction, it ignores it by calling ` BF.setIgnored() ` . This prevents BOLT
183- from optimizing that function, but it will still be emitted as part of the
184- original section (` .bolt.org.text ` ) in its original form .
179+ BinaryFunction, it marks the function as ignored using `BF.setIgnored()`. BOLT
180+ will not optimize this function but will emit it unchanged in the original section
181+ (`.bolt.org.text`).
185182
186183The inconsistencies are as follows:
187184- finding a ` pac* ` instruction when already in signed state
@@ -193,8 +190,7 @@ exact functions ignored, and the found inconsistency.
193190
194191### InsertNegateRAStatePass
195192
196- This pass runs after the optimizations are done. In essence, it does the _ inverse_
197- of MarkRAState pass:
193+ This pass runs after optimizations. It performs the _inverse_ of the MarkRAStates pass:
1981941 . it reads the RA state annotations attached to the instructions, and
1991952 . whenever the state changes, it adds a PseudoInstruction that holds an
200196 OpNegateRAState CFI.
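
The two steps above can be modelled with a small sketch. This is not BOLT code: it assumes a function is a flat list of per-instruction RA states in final layout order (`True` for signed, `False` for unsigned), and the name `negate_ra_state_positions` is hypothetical. The real pass works on BinaryFunctions and emits CFI pseudo-instructions; the sketch only shows where a state change would require one.

```python
from typing import List


def negate_ra_state_positions(states: List[bool],
                              initial_signed: bool = False) -> List[int]:
    """Return each index i where a negate-ra-state CFI is needed before
    instruction i, because its RA state differs from the previous one.

    `initial_signed` models the RA state at function entry (an assumption
    standing in for the per-function initial state).
    """
    positions = []
    current = initial_signed
    for index, signed in enumerate(states):
        if signed != current:
            # The state flips here, so the unwinder must be told.
            positions.append(index)
            current = signed
    return positions


# Unsigned on entry, signed in the body, unsigned again before returning.
layout = [False, True, True, True, False]
print(negate_ra_state_positions(layout))  # [1, 4]
```

Because only the final layout order is consulted, the same routine gives correct positions no matter how the instructions were reordered earlier, which is the reason the states are attached to instructions rather than kept as CFIs.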
@@ -205,8 +201,14 @@ Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
205201to know what RA state these have.
206202
207203The current solution has the ` inferUnknownStates ` function to cover these, using
208- a fairly simple strategy: unknown states inherit the last known state. Testing so
209- far has shown that this implementation is sufficient.
204+ a fairly simple strategy: unknown states inherit the last known state.
205+
206+ This will be updated to a more robust solution.
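
The inherit-the-last-known-state strategy can be illustrated with a minimal sketch. It is a toy model, not the actual `inferUnknownStates` implementation: instructions are reduced to a list of optional states (`None` for an instruction added by a BOLT pass), the function name `infer_unknown_states` is hypothetical, and falling back to the function's initial state for a leading unknown is an assumption made for the example.

```python
from typing import List, Optional


def infer_unknown_states(states: List[Optional[bool]],
                         initial_signed: bool = False) -> List[bool]:
    """Fill in unknown (None) RA states with the last known state.

    True means signed, False means unsigned. Unknown entries before the
    first known state take `initial_signed` (an assumption of this sketch).
    """
    resolved = []
    last_known = initial_signed
    for state in states:
        if state is not None:
            last_known = state
        resolved.append(last_known)
    return resolved


# Two instructions inserted in the signed region, one after the final
# unsigned instruction.
print(infer_unknown_states([False, True, None, None, False, None]))
# [False, True, True, True, False, False]
```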
207+
208+ > [!important]
209+ > As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
210+ > For this same reason, we cannot generate correct pac-specific unwind info: the signedness
211+ > of the _incorrect_ return address is meaningless.
210212
211213### Optimizations requiring special attention
212214
@@ -221,6 +223,6 @@ to indicate this.
221223
222224## Option to disallow the feature
223225
224- To aid debugging, we added the ` --disallow-pacret ` flag. If the flag is used,
225- and a function ` containedNegateRAState() ` after ` FillCFIInfoFor() ` , BOLT exits
226- with an error. With this flag, the feature is on by default .
226+ The feature can be guarded with the `--update-branch-protection` flag, which is
227+ on by default. If the flag is set to false, and a function
228+ `containedNegateRAState()` after `FillCFIInfoFor()`, BOLT exits with an error.