Skip to content

Commit bf5f5f0

Browse files
author
MUSTAPHA BARKI
authored
Merge branch 'main' into Uncontrolled-data-used-in-path-expression
2 parents f708b0b + 69f9138 commit bf5f5f0

File tree

1,257 files changed

+103164
-14996
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

1,257 files changed

+103164
-14996
lines changed

.ci/generate_test_report_github.py

Lines changed: 10 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,10 +8,15 @@
88

99
import generate_test_report_lib
1010

11-
PLATFORM_TITLES = {
12-
"Windows": ":window: Windows x64 Test Results",
13-
"Linux": ":penguin: Linux x64 Test Results",
14-
}
11+
def compute_platform_title() -> str:
12+
logo = ":window:" if platform.system() == "Windows" else ":penguin:"
13+
# On Linux the machine value is x86_64 on Windows it is AMD64.
14+
if platform.machine() == "x86_64" or platform.machine() == "AMD64":
15+
arch = "x64"
16+
else:
17+
arch = platform.machine()
18+
return f"{logo} {platform.system()} {arch} Test Results"
19+
1520

1621
if __name__ == "__main__":
1722
parser = argparse.ArgumentParser()
@@ -22,7 +27,7 @@
2227
args = parser.parse_args()
2328

2429
report = generate_test_report_lib.generate_report_from_files(
25-
PLATFORM_TITLES[platform.system()], args.return_code, args.build_test_logs
30+
compute_platform_title(), args.return_code, args.build_test_logs
2631
)
2732

2833
print(report)

.ci/utils.sh

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -56,6 +56,7 @@ function start-group {
5656
export PIP_BREAK_SYSTEM_PACKAGES=1
5757
pip install -q -r "${MONOREPO_ROOT}"/.ci/all_requirements.txt
5858

59-
if [[ "$GITHUB_ACTIONS" != "" ]]; then
59+
# The ARM64 builders run on AWS and don't have access to the GCS cache.
60+
if [[ "$GITHUB_ACTIONS" != "" ]] && [[ "$RUNNER_ARCH" != "ARM64" ]]; then
6061
python .ci/cache_lit_timing_files.py download
6162
fi

.github/workflows/build-ci-container-tooling.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
name: Build CI Container
1+
name: Build CI Tooling Containers
22

33
permissions:
44
contents: read
@@ -101,7 +101,7 @@ jobs:
101101
}
102102
103103
podman login -u ${{ github.actor }} -p $GITHUB_TOKEN ghcr.io
104-
for f in $(find . -iname *.tar); do
104+
for f in $(find . -iname '*.tar'); do
105105
image_name=$(podman load -q -i $f | sed 's/Loaded image: //g')
106106
push_container $image_name
107107

.github/workflows/build-ci-container.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -103,7 +103,7 @@ jobs:
103103
}
104104
105105
podman login -u ${{ github.actor }} -p $GITHUB_TOKEN ghcr.io
106-
for f in $(find . -iname *.tar); do
106+
for f in $(find . -iname '*.tar'); do
107107
image_name=$(podman load -q -i $f | sed 's/Loaded image: //g')
108108
push_container $image_name
109109

.github/workflows/premerge.yaml

Lines changed: 41 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -24,17 +24,45 @@ concurrency:
2424

2525
jobs:
2626
premerge-checks-linux:
27-
name: Build and Test Linux
27+
name: Build and Test Linux${{ (startsWith(matrix.runs-on, 'depot-ubuntu-24.04-arm') && ' AArch64') || '' }}
2828
if: >-
2929
github.repository_owner == 'llvm' &&
3030
(github.event_name != 'pull_request' || github.event.action != 'closed')
31-
runs-on: llvm-premerge-linux-runners
31+
strategy:
32+
fail-fast: false
33+
matrix:
34+
runs-on:
35+
- depot-ubuntu-24.04-arm-16
36+
- llvm-premerge-linux-runners
37+
runs-on: ${{ matrix.runs-on }}
38+
container:
39+
# The llvm-premerge agents are already containers and running the
40+
# this same image, so we can't use a container for the github action
41+
# job. The depot containers are running on VMs, so we can use a
42+
# container. This helps ensure the build environment is as close
43+
# as possible on both the depot runners and the llvm-premerge runners.
44+
image: ${{ (startsWith(matrix.runs-on, 'depot-ubuntu-24.04-arm') && format('ghcr.io/{0}/arm64v8/ci-ubuntu-24.04',github.repository_owner) ) || null }}
45+
# --privileged is needed to run the lldb tests that disable aslr.
46+
# The SCCACHE environment variables are need to be copied from the host
47+
# to the container to make sure it is configured correctly to use the
48+
# depot cache.
49+
options: >-
50+
--privileged
51+
--env SCCACHE_WEBDAV_ENDPOINT
52+
--env SCCACHE_WEBDAV_TOKEN
53+
defaults:
54+
run:
55+
# The run step defaults to using sh as the shell when running in a
56+
# container, so make bash the default to ensure consistency between
57+
# container and non-container jobs.
58+
shell: bash
3259
steps:
3360
- name: Checkout LLVM
3461
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
3562
with:
3663
fetch-depth: 2
3764
- name: Build and Test
65+
continue-on-error: ${{ runner.arch == 'ARM64' }}
3866
run: |
3967
git config --global --add safe.directory '*'
4068
@@ -54,11 +82,16 @@ jobs:
5482
export CC=/opt/llvm/bin/clang
5583
export CXX=/opt/llvm/bin/clang++
5684
57-
# This environment variable is passes into the container through the
58-
# runner pod definition. This differs between our two clusters which
59-
# why we do not hardcode it.
60-
export SCCACHE_GCS_BUCKET=$CACHE_GCS_BUCKET
61-
export SCCACHE_GCS_RW_MODE=READ_WRITE
85+
# The linux-premerge runners are hosted on GCP and have a different
86+
# cache setup than the depot runners.
87+
if [[ "${{ matrix.runs-on }}" = "llvm-premerge-linux-runners" ]]; then
88+
# This environment variable is passes into the container through the
89+
# runner pod definition. This differs between our two clusters which
90+
# why we do not hardcode it.
91+
export SCCACHE_GCS_BUCKET=$CACHE_GCS_BUCKET
92+
export SCCACHE_GCS_RW_MODE=READ_WRITE
93+
fi
94+
env
6295
6396
# Set the idle timeout to zero to ensure sccache runs for the
6497
# entire duration of the job. Otherwise it might stop if we run
@@ -78,7 +111,7 @@ jobs:
78111
if: '!cancelled()'
79112
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
80113
with:
81-
name: Premerge Artifacts (Linux)
114+
name: Premerge Artifacts (Linux ${{ runner.arch }})
82115
path: artifacts/
83116
retention-days: 5
84117
include-hidden-files: 'true'

.github/workflows/release-asset-audit.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,8 @@ def _get_uploaders(release_version):
5454
"tru",
5555
"tstellar",
5656
"github-actions[bot]",
57+
"c-rhodes",
58+
"dyung",
5759
]
5860
)
5961

bolt/docs/PacRetDesign.md

Lines changed: 228 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,228 @@
1+
# Optimizing binaries with pac-ret hardening
2+
3+
This is a design document about processing the `DW_CFA_AARCH64_negate_ra_state`
4+
DWARF instruction in BOLT. As it describes internal design decisions, the
5+
intended audience is BOLT developers. The document is an updated version of the
6+
[RFC posted on the LLVM Discourse](https://discourse.llvm.org/t/rfc-bolt-aarch64-handle-opnegaterastate-to-enable-optimizing-binaries-with-pac-ret-hardening/86594).
7+
8+
9+
`DW_CFA_AARCH64_negate_ra_state` is also referred to as `.cfi_negate_ra_state`
10+
in assembly, or `OpNegateRAState` in BOLT sources. In this document, I will use
11+
**negate-ra-state** as a shorthand.
12+
13+
## Introduction
14+
15+
### Pointer Authentication
16+
17+
For more information, see the [pac-ret section of the BOLT-binary-analysis document](BinaryAnalysis.md#pac-ret-analysis).
18+
19+
### DW_CFA_AARCH64_negate_ra_state
20+
21+
The negate-ra-state CFI is a vendor-specific Call Frame Instruction defined in
22+
the [Arm ABI](https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id1).
23+
24+
```
25+
The DW_CFA_AARCH64_negate_ra_state operation negates bit[0] of the RA_SIGN_STATE pseudo-register.
26+
```
27+
28+
This bit indicates to the unwinder whether the current return address is signed
29+
or not (hence the name). The unwinder uses this information to authenticate the
30+
pointer, and remove the Pointer Authentication Code (PAC) bits.
31+
Incorrect placement of negate-ra-state CFIs causes the unwinder to either attempt
32+
to authenticate an unsigned pointer (resulting in a segmentation fault), or skip
33+
authentication on a signed pointer, which can also cause a fault.
34+
35+
Note: some unwinders use the `xpac` instruction to strip the PAC bits without
36+
authenticating the pointer. This is an incorrect (incomplete) implementation,
37+
as it allows control-flow modification in the case of unwinding.
38+
39+
There are no DWARF instructions to directly set or clear the RA State. However,
40+
two other CFIs can also affect the RA state:
41+
- `DW_CFA_remember_state`: this CFI stores register rules onto an implicit stack.
42+
- `DW_CFA_restore_state`: this CFI pops rules from this stack.
43+
44+
Example:
45+
46+
| CFI | Effect on RA state |
47+
| ------------------------------ | ------------------------------ |
48+
| (default) | 0 |
49+
| DW_CFA_AARCH64_negate_ra_state | 0 -> 1 |
50+
| DW_CFA_remember_state | 1 pushed to the stack |
51+
| DW_CFA_AARCH64_negate_ra_state | 1 -> 0 |
52+
| DW_CFA_restore_state | 0 -> 1 (popped from the stack) |
53+
54+
The Arm ABI also defines the DW_CFA_AARCH64_negate_ra_state_with_pc CFI, but it
55+
is not widely used, and is [likely to become deprecated](https://github.com/ARM-software/abi-aa/issues/327).
56+
57+
### Where are these CFIs needed?
58+
59+
Whenever two consecutive instructions have different RA states, the unwinder must
60+
be informed of the change. This typically occurs during pointer signing or
61+
authentication. If adjacent instructions differ in RA state but neither signs
62+
nor authenticates the return address, they must belong to different control flow
63+
paths. One is part of an execution path with signed RA, the other is part of a
64+
path with an unsigned RA.
65+
66+
In the example below, the first BasicBlock ends in a conditional branch, and
67+
jumps to two different BasicBlocks, each with their own authentication, and
68+
return. The instructions on the border of the second and third BasicBlock have
69+
different RA states. The `ret` at the end of the second BasicBlock is in unsigned
70+
state. The start of the third BasicBlock is after the `paciasp` in the control
71+
flow, but before the authentication. In this case, a negate-ra-state is needed
72+
at the end of the second BasicBlock.
73+
74+
```
75+
+----------------+
76+
| paciasp |
77+
| |
78+
| b.cc |
79+
+--------+-------+
80+
|
81+
+----------------+
82+
| |
83+
| +--------v-------+
84+
| | |
85+
| | autiasp |
86+
| | ret | // RA: unsigned
87+
| +----------------+
88+
+----------------+
89+
|
90+
+--------v-------+ // RA: signed
91+
| |
92+
| autiasp |
93+
| ret |
94+
+----------------+
95+
```
96+
97+
> [!important]
98+
> The unwinder does not follow the control flow graph. It reads unwind
99+
> information in the layout order.
100+
101+
Because these locations are dependent on how the function layout looks,
102+
negate-ra-state CFIs will become invalid during BasicBlock reordering.
103+
104+
## Solution design
105+
106+
The implementation introduces two new passes:
107+
1. `MarkRAStatesPass`: assigns the RA state to each instruction based on the CFIs
108+
in the input binary
109+
2. `InsertNegateRAStatePass`: reads those assigned instruction RA states after
110+
optimizations, and emits `DW_CFA_AARCH64_negate_ra_state` CFIs at the correct
111+
places: wherever there is a state change between two consecutive instructions
112+
in the layout order.
113+
114+
To track metadata on individual instructions, the `MCAnnotation` class was
115+
extended. These also have helper functions in `MCPlusBuilder`.
116+
117+
### Saving annotations at CFI reading
118+
119+
CFIs are read and added to BinaryFunctions in `CFIReaderWriter::FillCFIInfoFor`.
120+
At this point, we add MCAnnotations about negate-ra-state, remember-state and
121+
restore-state CFIs to the instructions they refer to. This is to not interfere
122+
with the CFI processing that already happens in BOLT (e.g. remember-state and
123+
restore-state CFIs are removed in `normalizeCFIState` for reasons unrelated to PAC).
124+
125+
As we add the MCAnnotations *to instructions*, we have to account for the case
126+
where the function starts with a CFI altering the RA state. As CFIs modify the RA
127+
state of the instructions before them, we cannot add the annotation to the first
128+
instruction.
129+
This special case is handled by adding an `initialRAState` bool to each BinaryFunction.
130+
If the `Offset` the CFI refers to is zero, we don't store an annotation, but set
131+
the `initialRAState` in `FillCFIInfoFor`. This information is then used in
132+
`MarkRAStates`.
133+
134+
### Binaries without DWARF info
135+
136+
In some cases, the DWARF tables are stripped from the binary. These programs
137+
usually have some other unwind-mechanism.
138+
These passes only run on functions that include at least one negate-ra-state CFI.
139+
This avoids processing functions that do not use Pointer Authentication, or on
140+
functions that use Pointer Authentication, but do not have DWARF info.
141+
142+
In summary:
143+
- pointer auth is not used: no change, the new passes do not run.
144+
- pointer auth is used, but DWARF info is stripped: no change, the new passes
145+
do not run.
146+
- pointer auth is used, and we have DWARF CFIs: passes run, and rewrite the
147+
negate-ra-state CFI.
148+
149+
### MarkRAStates pass
150+
151+
This pass runs before optimizations reorder anything.
152+
153+
It processes MCAnnotations generated during the CFI reading stage to check if
154+
instructions have either of the three CFIs that can modify RA state:
155+
- negate-ra-state,
156+
- remember-state,
157+
- restore-state.
158+
159+
Then it adds new MCAnnotations to each instruction, indicating their RA state.
160+
Those annotations are:
161+
- Signed,
162+
- Unsigned.
163+
164+
Below is a simple example, that shows the two different type of annotations:
165+
what we have before the pass, and after it.
166+
167+
| Instruction | Before | After |
168+
| ----------------------------- | --------------- | -------- |
169+
| paciasp | negate-ra-state | unsigned |
170+
| stp x29, x30, [sp, #-0x10]! | | signed |
171+
| mov x29, sp | | signed |
172+
| ldp x29, x30, [sp], #0x10 | | signed |
173+
| autiasp | negate-ra-state | signed |
174+
| ret | | unsigned |
175+
176+
##### Error handling in MarkRAState Pass:
177+
178+
Whenever the MarkRAStates pass finds inconsistencies in the current
179+
BinaryFunction, it marks the function as ignored using `BF.setIgnored()`. BOLT
180+
will not optimize this function but will emit it unchanged in the original section
181+
(`.bolt.org.text`).
182+
183+
The inconsistencies are as follows:
184+
- finding a `pac*` instruction when already in signed state
185+
- finding an `aut*` instruction when already in unsigned state
186+
- finding `pac*` and `aut*` instructions without `.cfi_negate_ra_state`.
187+
188+
Users will be informed about the number of ignored functions in the pass, the
189+
exact functions ignored, and the found inconsistency.
190+
191+
### InsertNegateRAStatePass
192+
193+
This pass runs after optimizations. It performns the _inverse_ of MarkRAState pa s:
194+
1. it reads the RA state annotations attached to the instructions, and
195+
2. whenever the state changes, it adds a PseudoInstruction that holds an
196+
OpNegateRAState CFI.
197+
198+
##### Covering newly generated instructions:
199+
200+
Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
201+
to know what RA state these have.
202+
203+
The current solution has the `inferUnknownStates` function to cover these, using
204+
a fairly simple strategy: unknown states inherit the last known state.
205+
206+
This will be updated to a more robust solution.
207+
208+
> [!important]
209+
> As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
210+
> For this same reason, we cannot generate correct pac-specific unwind info: the signess
211+
> of the _incorrect_ return address is meaningless.
212+
213+
### Optimizations requiring special attention
214+
215+
Marking states before optimizations ensure that instructions can be moved around
216+
freely. The only special case is function splitting. When a function is split,
217+
the split part becomes a new function in the emitted binary. For unwinding to
218+
work, it needs to "replay" all CFIs that lead up to the split point. BOLT does
219+
this for other CFIs. As negate-ra-state is not read (only stored as an Annotation),
220+
we have to do this manually in InsertNegateRAStatePass. Here, if the split part
221+
starts with an instruction that has Signed RA state, we add a negate-ra-state CFI
222+
to indicate this.
223+
224+
## Option to disallow the feature
225+
226+
The feature can be guarded with the `--update-branch-prediction` flag, which is
227+
on by default. If the flag is set to false, and a function
228+
`containedNegateRAState()` after `FillCFIInfoFor()`, BOLT exits with an error.

0 commit comments

Comments
 (0)