21 changes: 19 additions & 2 deletions .github/workflows/ci.yml
@@ -113,7 +113,7 @@ jobs:
-e ENABLE_SANITIZER_UNDEFINED_BEHAVIOR="${ENABLE_SANITIZER_UNDEFINED_BEHAVIOR}" \
-e ENABLE_SANITIZER_ADDRESS="${ENABLE_SANITIZER_ADDRESS}" \
-e CI=true \
$DEV_IMAGE \
"$DEV_IMAGE" \
cmake --preset ci -DPE_USE_VENDORED_Z3=OFF -DLLVM_EXTERNAL_LIT=/usr/local/bin/lit -DLLVM_Z3_INSTALL_DIR=/usr/local
- name: Build ${{ matrix.build-type }} with sanitizers set ${{ matrix.sanitizers }}
@@ -139,6 +139,23 @@ jobs:
run: |
bash ./scripts/ghidra/build-headless-docker.sh
- name: Run cached patch matrix
if: matrix.build-type == 'Debug' && matrix.sanitizers == 'OFF'
run: |
docker run --rm \
-v ${{ github.workspace }}:/workspace \
-v /tmp/.gitconfig:/root/.gitconfig:ro \
-w /workspace \
-v /var/run/docker.sock:/var/run/docker.sock \
-e CI=true \
-e HOST_WORKSPACE=${{ github.workspace }} \
$DEV_IMAGE \
bash ./scripts/test-patch-matrix.sh \
--build-type Debug \
--build-root ci \
--rebuild-firmware \
--rebuild-fixtures
- name: Test ${{ matrix.build-type }} with sanitizers set ${{ matrix.sanitizers }}
run: |
docker run --rm \
@@ -149,4 +166,4 @@ jobs:
-e CI=true \
-e HOST_WORKSPACE=${{ github.workspace }} \
$DEV_IMAGE \
lit ./builds/ci/test -D BUILD_TYPE=${{ matrix.build-type }} -v -DCI_OUTPUT_FOLDER=/workspace/builds/ci/test/ghidra/Output
lit ./builds/ci/test -D BUILD_TYPE=${{ matrix.build-type }} -v -DCI_OUTPUT_FOLDER=/workspace/builds/ci/test/ghidra/Output
3 changes: 2 additions & 1 deletion .gitignore
@@ -5,6 +5,8 @@
.vscode/
build/
builds/
firmwares/output/
firmwares/repos/
cmake-build-*/
prefix/
.clangd
@@ -16,4 +18,3 @@ _site/
.classpath
.project
.settings

124 changes: 116 additions & 8 deletions docs/GettingStarted/build.md
@@ -2,6 +2,11 @@
# Building
We mostly rely on a build container, but some dependencies are still needed outside that container: our fork of [LLVM20](https://github.com/trail-of-forks/clangir), a local copy of `lld`, and LLVM [LIT](https://llvm.org/docs/CommandGuide/lit.html).

From a fresh checkout, initialize vendored sources first:
```sh
git submodule update --init --recursive
```

In order to set up those and build Patchestry, please follow the first-time instructions for your development environment of choice:
- [macOS](#first-time-setup-macos)
- [Linux](#first-time-setup-linux)
@@ -27,10 +32,15 @@ See also: [Development](#development)
```
mkdir -p ~/.docker/cli-plugins
ln -s $(which docker-buildx) ~/.docker/cli-plugins/docker-buildx
colima restart
colima start --vm-type vz
docker buildx version
docker ps
```

The validated Apple Silicon macOS path uses the `vz` backend.
The `linux/amd64` emulation path is materially slower and is not the routine
workflow described in this document.

4. Log into Docker Hub (this may not be needed - it is not needed on Linux):
```
docker login -u <username>
@@ -52,16 +62,32 @@ See also: [Development](#development)

The targets list of `"host;AArch64;ARM;X86"` is intentional (to always build host arch, AArch64, ARM, and x86), even if host arch is almost certainly either AArch64 or X86.

This must be the patched `trail-of-forks/clangir` toolchain, or an equivalent
install built from the same fork. A stock Homebrew `llvm` or `llvm@20` install
is not a supported substitute for host-native patchestry builds.
The `.devcontainer/README-HOST-BUILD.md` workflow builds a Linux arm64 toolchain
for container images; it is not a host-native macOS ClangIR install.


6. Build with:
6. Configure and build with the patched ClangIR toolchain you just installed:
```
CC=$(which clang) CXX=$(which clang++) cmake \
--preset default \
-DCMAKE_PREFIX_PATH=<path_to_llvm_install>/lib/cmake/ \
export LLVM_INSTALL_PREFIX=<path_to_llvm_install>
export CC="${LLVM_INSTALL_PREFIX}/bin/clang"
export CXX="${LLVM_INSTALL_PREFIX}/bin/clang++"
export CMAKE_PREFIX_PATH="${LLVM_INSTALL_PREFIX}/lib/cmake/llvm;${LLVM_INSTALL_PREFIX}/lib/cmake/mlir;${LLVM_INSTALL_PREFIX}/lib/cmake/clang"

cmake \
--fresh --preset default \
-DLLVM_EXTERNAL_LIT=$(which lit)

cmake --build --preset debug -j
```

This setup provides a complete development environment for building and running the project on MacOS. The configuration uses Colima as a Docker backend, which provides better performance and resource management compared to Docker Desktop on MacOS.
This setup provides a host-native development environment when the patched
ClangIR fork is already installed. The configuration uses Colima as the Docker
backend for Docker-backed workflows on macOS.
This workflow expects `CC` and `CXX` to point at the patched ClangIR toolchain,
not AppleClang or a stock Homebrew LLVM install.
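
One way to catch a mismatched compiler before configuring is a small preflight check. The sketch below is illustrative only: `check_patched_toolchain` is a hypothetical helper, not part of the repository, and it assumes `LLVM_INSTALL_PREFIX`, `CC`, and `CXX` are exported as in step 6. It only confirms that both compilers live under the install prefix; it cannot prove that the install was built from the fork.

```sh
# Hypothetical preflight helper (not part of the repository).
# Returns 0 when CC and CXX both point under ${LLVM_INSTALL_PREFIX}/bin.
check_patched_toolchain() {
  if [ -z "${LLVM_INSTALL_PREFIX:-}" ]; then
    echo "LLVM_INSTALL_PREFIX is not set" >&2
    return 1
  fi
  for tool in "${CC:-}" "${CXX:-}"; do
    case "$tool" in
      "${LLVM_INSTALL_PREFIX}/bin/"*)
        # Compiler path starts with the expected prefix; keep checking.
        ;;
      *)
        echo "compiler '${tool}' is not under ${LLVM_INSTALL_PREFIX}/bin" >&2
        return 1
        ;;
    esac
  done
  return 0
}
```

Calling `check_patched_toolchain && cmake --fresh --preset default ...` would then skip the configure step when `CC` or `CXX` still points at AppleClang or a Homebrew LLVM.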

# First Time Development Setup: Linux
If you'd like to either follow step by step instructions or run a script to automatically follow them in a fresh Linux instance, here's a [Gist](https://gist.github.com/kaoudis/e734c6197dbed595586ab659844df737) that sets everything up from zero in a fresh VM for you and runs the Patchestry tests to confirm the setup works. This Gist should stay reasonably up to date since it's used to initialize ephemeral coding environments. It's been tested on Ubuntu 24.04. The only thing that should be different for other Ubuntus or for Debian is the `apt` package naming.
@@ -77,8 +103,90 @@ Steps followed in the [Gist](https://gist.github.com/kaoudis/e734c6197dbed595586
# Development

## CMake Commands
- to build, see the command referenced in step 6 [above](#first-time-development-setup-macos) or the commands used for [Linux](#first-time-development-setup-linux). You'll use the `default` preset to configure and most likely the `debug` or `release` presets for the subsequent build command after configuration.
- to run tests, ensure the headless container is available first by running `scripts/ghidra/build-headless-docker.sh`, then you may `cmake --build builds/default/ -j$((`nproc`+1)) --preset debug --target test` (using the preset of your choice but selecting the `test` target)
- To build, configure with the `default` preset and build with `cmake --build --preset debug` or `cmake --build --preset release`.
- To run tests, first build the headless container with `scripts/ghidra/build-headless-docker.sh`, then run `ctest --preset debug --output-on-failure` or `lit ./builds/default/test -D BUILD_TYPE=Debug -v`.
- To run the cached patch/contract matrix from one command, use `scripts/test-patch-matrix.sh --build-type Debug`.
- To run the example firmware end-to-end flow and get a report, use `scripts/test-example-firmwares.sh --build-type Debug`.

## Fresh checkout to validated build

The validated Apple Silicon macOS path is the host-native patched ClangIR
workflow:

```sh
git submodule update --init --recursive

export LLVM_INSTALL_PREFIX=<path_to_llvm_install>
export CC="${LLVM_INSTALL_PREFIX}/bin/clang"
export CXX="${LLVM_INSTALL_PREFIX}/bin/clang++"
export CMAKE_PREFIX_PATH="${LLVM_INSTALL_PREFIX}/lib/cmake/llvm;${LLVM_INSTALL_PREFIX}/lib/cmake/mlir;${LLVM_INSTALL_PREFIX}/lib/cmake/clang"

cmake --fresh --preset default \
-DLLVM_EXTERNAL_LIT=$(which lit)

cmake --build --preset debug -j

cmake -S lib/patchestry/intrinsics -B lib/patchestry/intrinsics/build_standalone \
-DCMAKE_BUILD_TYPE=Release
cmake --build lib/patchestry/intrinsics/build_standalone -j

bash ./scripts/ghidra/build-headless-docker.sh

lit ./builds/default/test -D BUILD_TYPE=Debug -v
```

This validates:
1. native configure against the patched fork,
2. the Debug patchestry build,
3. the standalone intrinsics library,
4. the headless Ghidra Docker image on Apple Silicon,
5. the full lit tree.

To validate the documented example firmware patching flow and generate a report:

```sh
scripts/test-example-firmwares.sh --build-type Debug
```

This writes per-case artifacts plus:

- `builds/example-firmware-e2e/summary.md`
- `builds/example-firmware-e2e/summary.tsv`

To validate the broader patch/contract matrix from cached generated fixtures:

```sh
scripts/test-patch-matrix.sh --build-type Debug
```

This reuses firmware artifacts in `firmwares/output/` and fixture caches in
`builds/test-fixtures/` when present. Use `--rebuild-firmware`,
`--rebuild-ghidra`, `--rebuild-fixtures`, or `--clean` to refresh caches
explicitly.

This writes per-case artifacts plus:

- `builds/patch-matrix/summary.md`
- `builds/patch-matrix/summary.tsv`

Docker-backed workflows are still required for `build.sh` and Ghidra headless
tasks. On Apple Silicon, the routine workflow is the host-native path described
above. The default `linux/amd64` emulation path remains available, but with the
expected emulation overhead.
The validated Ghidra image build used Colima with the `vz` backend and built
Ghidra natives for `linux_arm_64`.

CI uses the same high-level sequence on Linux:
1. Configure with `cmake --preset ci`.
2. Build with `cmake --build --preset ci --config <Debug|Release>`.
3. Build the standalone intrinsics library.
4. Build the headless Ghidra Docker image.
5. Run `scripts/test-patch-matrix.sh --build-type Debug --rebuild-firmware --rebuild-fixtures`.
6. Run `lit ./builds/ci/test`.

The narrower example firmware runner remains available for focused inspection
and reporting. Use the opt-in CTest target by configuring with
`-DPE_ENABLE_EXAMPLE_FIRMWARE_E2E=ON` if you want CTest to invoke it.

## Ghidra

131 changes: 126 additions & 5 deletions docs/GettingStarted/firmware_examples.md
@@ -1,5 +1,76 @@
# How To Run Patchestry on Firmware Examples

## Automated end-to-end runner

The repository runner provides one command that:

1. builds the example firmware artifacts,
2. decompiles representative example functions to JSON,
3. converts JSON to CIR,
4. applies the in-repo example patch specs,
5. lowers the patched CIR to LLVM IR,
6. writes a report and per-case logs/artifacts.

```sh
scripts/test-example-firmwares.sh --build-type Debug
```

Artifacts and reports are written to:

```sh
builds/example-firmware-e2e/
```

The runner currently validates these repository-supported example cases:

- `pulseox_measurement_update`
- `bloodlight_usb_send_message`
- `bloodview_device_process_entry`

Generated reports:

- `builds/example-firmware-e2e/summary.md`
- `builds/example-firmware-e2e/summary.tsv`

The tested endpoint remains patched CIR and LLVM IR/bitcode, not a final
rewritten firmware binary.

## Cached patch/contract matrix runner

The matrix runner provides one command that:

1. reuses or rebuilds the example firmware artifacts,
2. reuses or rebuilds cached decompile JSON and base CIR fixtures,
3. validates the repository-supported patch and contract spec matrix,
4. lowers each patched CIR to LLVM IR,
5. writes a summary report plus per-case logs and artifacts.

```sh
scripts/test-patch-matrix.sh --build-type Debug
```

Artifacts and reports are written to:

```sh
builds/patch-matrix/
```

Fixture caches are written to:

```sh
builds/test-fixtures/
```

Firmware caches remain under:

```sh
firmwares/output/
```

By default the runner reuses any existing caches. Use `--rebuild-firmware`,
`--rebuild-ghidra`, `--rebuild-fixtures`, or `--clean` when you want to force
fresh inputs.
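
To see which caches a run would pick up before choosing a rebuild flag, a quick inspection along these lines may help. This is a sketch under stated assumptions: `cache_state` is a hypothetical helper, not part of the repository, and it treats a cache as present whenever the directory exists and is non-empty, which may not match the runner's own reuse logic.

```sh
# Hypothetical helper (not part of the repository): report whether a cache
# directory exists and contains at least one entry.
cache_state() {
  dir="$1"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

# Run from the repository root; paths are the cache locations named above.
echo "firmware cache: $(cache_state firmwares/output)"
echo "fixture cache:  $(cache_state builds/test-fixtures)"
```

If a cache is reported as present but you suspect it is stale, pass the matching `--rebuild-*` flag or `--clean` to the runner.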

## Build the Ghidra docker image

First, make sure that the firmware decompilation Ghidra docker image is set up correctly:
@@ -31,18 +102,68 @@ For each firmware blob you want to decompile, use the decompile-headless script
scripts/ghidra/decompile-headless.sh --input firmwares/output/bloodlight-firmware.elf --output ~/temp/patchestry/bloodlight-firmware.json
```

This should produce the output json file, which can be used with tools like `pcode-lifter`.
This should produce the output JSON file, which can be consumed by `patchir-decomp`.

## Convert it to JSON to CIR
## Convert JSON to CIR

The JSON (which encompasses Ghidra high-pcode) can then be converted to ClangIR via `pcode-lifter` as follows:
The JSON (which encompasses Ghidra high-pcode) can then be converted to CIR via
`patchir-decomp` as follows:
```sh
builds/default/tools/pcode-lifter/Release/pcode-lifter --input ~/temp/patchestry/pulseox-firmware.json --emit-cir --output ~/temp/patchestry/pulseox-firmware_cir --print-tu
builds/default/tools/patchir-decomp/Debug/patchir-decomp \
--input ~/temp/patchestry/pulseox-firmware.json \
--emit-cir \
--output ~/temp/patchestry/pulseox-firmware_cir \
--print-tu
```

The `--print-tu` argument is optional, it will emit C along with the ClangIR. The output looks like:
The `--print-tu` argument is optional; it emits C alongside the CIR. The output
looks like:
```sh
ls -1 ~/temp/patchestry/pulseox-firmware_cir*
/Users/artem/temp/patchestry/pulseox-firmware_cir.c
/Users/artem/temp/patchestry/pulseox-firmware_cir.cir
```

## Optional patching and lowering flow

Once you have CIR, the repository-supported patching flow is:

```sh
# Validate a YAML patch specification
builds/default/tools/patchir-yaml-parser/Debug/patchir-yaml-parser patch.yaml --validate

# Apply the patch spec to CIR
builds/default/tools/patchir-transform/Debug/patchir-transform \
~/temp/patchestry/pulseox-firmware_cir.cir \
--spec patch.yaml \
-o ~/temp/patchestry/pulseox-firmware_patched.cir

# Lower patched CIR to LLVM IR
builds/default/tools/patchir-cir2llvm/Debug/patchir-cir2llvm \
-S \
~/temp/patchestry/pulseox-firmware_patched.cir \
-o ~/temp/patchestry/pulseox-firmware_patched.ll
```

This repository's native tested endpoint is patched CIR and LLVM IR/bitcode.
Producing a final rewritten firmware binary is downstream of patchestry and
typically handled by external tooling.
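
The three commands above can be chained so that a failing step stops the flow. The wrapper below is a minimal sketch, not something the repository ships: `patch_and_lower` and its argument order are hypothetical, and it assumes the `Debug` tool layout under `builds/default/tools` shown above. The tool names and flags are the ones used in the commands above.

```sh
# Illustrative wrapper (hypothetical; not part of the repository).
# Usage: patch_and_lower <tools-dir> <input.cir> <patch.yaml> <output-stem>
patch_and_lower() {
  tools="$1"
  cir="$2"
  spec="$3"
  stem="$4"

  # Validate the spec first so a malformed file never reaches the transform.
  "${tools}/patchir-yaml-parser/Debug/patchir-yaml-parser" "$spec" --validate || return 1

  # Apply the patch spec to the input CIR.
  "${tools}/patchir-transform/Debug/patchir-transform" \
    "$cir" --spec "$spec" -o "${stem}_patched.cir" || return 1

  # Lower the patched CIR to textual LLVM IR.
  "${tools}/patchir-cir2llvm/Debug/patchir-cir2llvm" \
    -S "${stem}_patched.cir" -o "${stem}_patched.ll" || return 1
}
```

For example, `patch_and_lower builds/default/tools ~/temp/patchestry/pulseox-firmware_cir.cir patch.yaml ~/temp/patchestry/pulseox-firmware` would write `pulseox-firmware_patched.cir` and `pulseox-firmware_patched.ll` next to the input.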

## Opt-in automation via CTest

If you want this flow exposed through CTest, reconfigure with:

```sh
cmake --fresh --preset default \
-DPE_ENABLE_EXAMPLE_FIRMWARE_E2E=ON \
-DLLVM_EXTERNAL_LIT=$(which lit)
```

Then run:

```sh
ctest --preset debug -R example-firmware-e2e-tests --output-on-failure
```

This target is opt-in because it builds external example firmware repositories
and requires Docker-backed Ghidra decompilation.