Skip to content

Commit 2b58b00

Browse files
authored
feat: python package refactor (#75)
* Rename to internal * Reorg * Fill in jingle types * More shuffling * More shuffling * Add some stuff * uv venv * uv nox stuff * stuff * test * test * blah * blah * hate doing this garbage * Fix ruff lints * Fix formatting * Add in pydantic models * Ruff fmt * Ruff fmt * pydantic * pydantic * does this work? * Try it out * blah * Bump min wheel version * Readme tweaks * Readme tweaks * Readme tweaks * Readme tweaks * Readme tweaks * Defining the pydantic interface I would like to have * Reorg * Incremental progress * More stuff * Log bridging * More stuff! * More stuff! * fmt + clippy * Fix? * Reorg * Update README.md * fmt clippy and sdist * README.md * readme grammar * python readme * Typing stuff * Typing fixes
1 parent 8911e8d commit 2b58b00

31 files changed

+1116
-329
lines changed

.github/workflows/python-style.yml

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
name: Python Style
2+
3+
on:
4+
push:
5+
branches:
6+
- main
7+
pull_request:
8+
branches:
9+
- main
10+
workflow_dispatch:
11+
12+
jobs:
13+
lint:
14+
name: ${{ matrix.session }}
15+
runs-on: ubuntu-latest
16+
strategy:
17+
matrix:
18+
session: [ruff, mypy]
19+
steps:
20+
- uses: actions/checkout@v4
21+
- uses: actions/setup-python@v5
22+
with:
23+
python-version: '3.x'
24+
- name: Install uv
25+
run: |
26+
curl -LsSf https://astral.sh/uv/install.sh | sh
27+
- name: Create uv virtual environment
28+
run: |
29+
cd crackers_python
30+
uv venv
31+
- name: Install dev dependencies
32+
run: |
33+
cd crackers_python
34+
uv pip install -e .[dev]
35+
- name: Run nox session
36+
run: |
37+
cd crackers_python
38+
uv run nox -s ${{ matrix.session }}

.github/workflows/python.yml

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ on:
1414
- main
1515
workflow_dispatch:
1616
release:
17-
types: [published]
17+
types: [ published ]
1818

1919
permissions:
2020
contents: read
@@ -34,14 +34,11 @@ jobs:
3434
- uses: actions/checkout@v4
3535
with:
3636
submodules: true
37-
- uses: actions/setup-python@v5
38-
with:
39-
python-version: 3.x
4037
- name: Build wheels
4138
uses: PyO3/maturin-action@v1
4239
with:
4340
target: ${{ matrix.platform.target }}
44-
args: --release --out dist --find-interpreter --auditwheel skip
41+
args: --release --out dist --interpreter python3.10 python3.11 python3.12 python3.13 python3.13t pypy3.10 pypy3.11 --auditwheel skip
4542
working-directory: crackers_python
4643
manylinux: manylinux2_28
4744
before-script-linux: |

README.md

Lines changed: 141 additions & 84 deletions
Original file line numberDiff line numberDiff line change
@@ -7,45 +7,129 @@
77

88
# `crackers`: A Tool for Synthesizing Code-Reuse Attacks from `p-code` Programs
99

10-
This repo contains the source code of `crackers`, a procedure for synthesizing
11-
code-reuse attacks (e.g. ROP). `crackers` takes as input a "reference program", usually
12-
written in an assembly language, a binary (of the same architecture) in which to look
13-
for gadgets, and user-provided constraints to enforce on synthesized chains. It will always
14-
return an answer (though there is no strict bound to runtime), reporting either that the problem
15-
is UNSAT, or providing an assignment of gadgets that meet all constraints, and a model
16-
of the memory state of the PCODE virtual machine at every stage of the computation.
10+
[![Build](https://github.com/toolCHAINZ/crackers/actions/workflows/build.yml/badge.svg)](https://github.com/toolCHAINZ/crackers/actions/workflows/build.yml)
11+
[![docs.rs](https://docs.rs/crackers/badge.svg)](https://docs.rs/crackers)
12+
13+
This repository contains the source code for `crackers`, a tool for synthesizing
14+
code-reuse attacks (e.g., ROP) built around the Z3 SMT Solver and Ghidra's SLEIGH code translator.
1715

18-
`crackers` will assume that _all_ system state is usable unless the user prohibits it by providing a constraint.
19-
This approach, while requiring more human guidance, allows it to minimize the assumptions it makes about the
20-
arrangement and roles of memory in an exploit.
16+
## How does it work?
2117

22-
To validate chains, `crackers` builds a mathematical model of the trace through the gadgets. This model, along
23-
with user-provided constraints, is verified against a model of the "reference program". When this verification
18+
`crackers` takes as input a "reference program," usually
19+
written in an assembly language, a binary (of the same architecture) in which to look
20+
for gadgets, and user-provided constraints to enforce on synthesized chains. It will always
21+
return an answer (though there is no strict bound on runtime), reporting either that the problem
22+
is UNSAT or providing an assignment of gadgets that meet all constraints, along with a model
23+
of the memory state of the PCODE virtual machine at every step in the chain. This memory model can
24+
then be used to derive the inputs necessary to realize the ROP chain.
25+
26+
`crackers` itself makes _no_ assumptions about the layout of memory in the target program, nor the extent of an attacker's
27+
control over it: it assumes that _all_ system state is usable unless explicitly prohibited through a constraint.
28+
This approach increases flexibility, with the drawback of requiring more human-guided
29+
configuration than many other ROP tools.
30+
31+
To validate chains, `crackers` builds a mathematical model of the execution of a candidate chain and makes assertions on it
32+
against a model generated from a specification (itself expressed as a sequence of PCODE operations). When this verification
2433
returns SAT, `crackers` returns the Z3 model of the memory state of the chain at every point of its execution. This
2534
memory model may be used to derive the contents of memory needed to invoke the chain, and transitively the input needed to
2635
provide to the program to realize it.
2736

28-
Note that the provided CLI of crackers currently only prints the selected gadgets to the command line.
29-
To make use of the logical memory model it must be used through the programmatic API.
30-
31-
There is also an experimental Python binding for `crackers`, allowing for basic configuration, execution, and model
32-
inspection from python. These bindings also feature cross-FFI support with the python z3 bindings, allowing
33-
constraints to be expressed as python functions or `lambda` expressions returning Python Z3 `BoolRef` instances.
37+
`crackers` is available as a command-line tool, a Rust crate, or a Python package.
3438

3539
### This software is still in alpha and may change at any time
3640

37-
## CLI Usage
41+
## How do I use it?
42+
43+
You have three options:
44+
45+
### Python Package
46+
47+
[![PyPI](https://img.shields.io/pypi/v/crackers)](https://pypi.org/project/crackers/)
48+
49+
The easiest way to use `crackers` is through the [PyPI](https://pypi.org/project/crackers/) package. For every release, we provide wheels for `[MacOS, Windows, Linux] x [3.10, 3.11, 3.12, 3.13]`.
50+
51+
A simple usage looks like the following:
52+
53+
```python
54+
import logging
55+
logging.basicConfig(level=logging.INFO)
56+
57+
from z3 import BoolRef, BoolVal
58+
59+
from crackers import State, ModeledBlock
60+
from crackers.config import MetaConfig, LibraryConfig, SleighConfig,
61+
ReferenceProgramConfig, SynthesisConfig, ConstraintConfig, CrackersConfig
62+
from crackers.config.constraint import RegisterValuation,
63+
RegisterStringValuation, MemoryValuation, PointerRange,
64+
CustomStateConstraint, CustomTransitionConstraint, PointerRangeRole
65+
from crackers.config.log_level import LogLevel
66+
from crackers.config.synthesis import SynthesisStrategy
67+
68+
# Custom state constraint example
69+
def my_constraint(s: State, _addr: int) -> BoolRef:
70+
rdi = s.read_register("RDI")
71+
rcx = s.read_register("RCX")
72+
return rdi == (rcx ^ 0x5a5a5a5a5a5a5a5a)
73+
74+
75+
# Custom transition constraint example
76+
def my_transition_constraint(block: ModeledBlock) -> BoolRef:
77+
# Dummy: always true
78+
return BoolVal(True)
79+
80+
81+
meta = MetaConfig(log_level=LogLevel.INFO, seed=42)
82+
library = LibraryConfig(max_gadget_length=8, path="libz.so.1", sample_size=None,
83+
base_address=None)
84+
sleigh = SleighConfig(ghidra_path="/Applications/ghidra")
85+
reference_program = ReferenceProgramConfig(path="sample.o", max_instructions=8, base_address=library.base_address)
86+
synthesis = SynthesisConfig(strategy=SynthesisStrategy.SAT, max_candidates_per_slot=200, parallel=8, combine_instructions=True)
87+
constraint = ConstraintConfig(
88+
precondition=[
89+
RegisterValuation(name="rdi", value=0xdeadbeef),
90+
MemoryValuation(space="ram", address=0x1000, size=4, value=0x41),
91+
RegisterStringValuation(reg="rsi", value="/bin/sh"),
92+
CustomStateConstraint(code=my_constraint)
93+
],
94+
postcondition=[
95+
RegisterValuation(name="rax", value=0x1337),
96+
CustomStateConstraint(code=my_constraint)
97+
],
98+
transition=[
99+
PointerRange(role=PointerRangeRole.READ, min=0x2000, max=0x3000),
100+
CustomTransitionConstraint(code=my_transition_constraint)
101+
]
102+
)
103+
config = CrackersConfig(meta=meta, library=library, sleigh=sleigh, specification=reference_program, synthesis=synthesis, constraint=constraint)
104+
config.run()
105+
```
106+
107+
### Rust CLI
108+
109+
You can install the `crackers` CLI from `crates.io` by running:
110+
111+
```sh
112+
cargo install --all-features crackers
113+
```
114+
115+
You can then run:
116+
117+
```sh
118+
crackers new my_config.toml
119+
```
120+
121+
to generate a new configuration for the tool at `my_config.toml`. This config file can be adjusted
122+
for your use case and then used with:
38123

39124
```sh
40-
cargo install --all-features --path .
125+
crackers synth my_config.toml
41126
```
42127

43-
This will install the `crackers` binary in your path. `crackers` takes a single command line argument,
44-
pointing to a config file. An example file follows:
128+
There are many options to configure in this file. An example is below:
45129

46130
```toml
47-
# location to find a ghidra installation. This is only used for
48-
# locating architecture definitions
131+
# Location to find a Ghidra installation. This is only used for
132+
# SLEIGH architecture definitions
49133
[sleigh]
50134
ghidra_path = "/Applications/ghidra"
51135

@@ -56,34 +140,34 @@ ghidra_path = "/Applications/ghidra"
56140
# * "optimize" is a weighted SAT problem, giving preference to shorter gadgets
57141
# Optimize tends to perform better when only one validation worker is present and SAT scales better with more workers
58142
strategy = "sat"
59-
# the maximum number of candidates that are considered for each sub-slice of the specification
60-
# if you don't want to cap this, just set it arbitrarily high. Might make it optional later
143+
# The maximum number of candidates considered for each sub-slice of the specification
144+
# If you don't want to cap this, just set it arbitrarily high. Might make it optional later
61145
max_candidates_per_slot = 50
62146
# The number of chain validation workers to use
63147
parallel = 8
64148

65149
# crackers works by taking in an "example" computation and synthesizing a compatible chain
66-
# right now, it does not support specifications with controlflow
150+
# Right now, it does not support specifications with control flow
67151
[specification]
68-
# the path at which to find the raw binary containing the bytes of the specification computation
152+
# The path at which to find the raw binary containing the bytes of the specification computation
69153
path = "bin/execve_instrs.bin"
70-
# the number of assembly instructions in the specification
154+
# The number of assembly instructions in the specification
71155
max_instructions = 5
72156

73-
# settings involving the file from which to pull gadgets
157+
# Settings involving the file from which to pull gadgets
74158
[library]
75-
# the path to the file. It can be any type of object file that gimli_object can parse (e.g. ELF, PE)
159+
# The path to the file. It can be any type of object file that gimli_object can parse (e.g., ELF, PE)
76160
path = "bin/libc_wrapper"
77-
# the maximum length of gadget to extract. Raising this number increases both the complexity of the gadgets
161+
# The maximum length of gadget to extract. Raising this number increases both the complexity of the gadgets
78162
# that are reasoned about and the total number of found gadgets
79163
max_gadget_length = 4
80-
# optionally randomly sample the set of parsed gadgets to a given size
164+
# Optionally randomly sample the set of parsed gadgets to a given size
81165
random_sample_size = 20000
82-
# optionally use a set seed for gadget selection
166+
# Optionally use a set seed for gadget selection
83167
# random_sample_seed = 0x234
84168

85-
# from this point on are constraints that we put on the synthesis
86-
# these are fairly self-explanatory
169+
# From this point on are constraints that we put on the synthesis
170+
# These are fairly self-explanatory
87171
[constraint.precondition.register]
88172
RAX = 0
89173
RCX = 0x440f30
@@ -109,75 +193,48 @@ RAX = 0x3b
109193
RSI = 0
110194
RDX = 0
111195

112-
# this constraint enforces that the value pointed to by this register
196+
# This constraint enforces that the value pointed to by this register
113197
# must be equal to the given string
114198
[constraint.postcondition.pointer]
115199
RDI = "/bin/sh"
116200

117-
# ANY pointer access, read or write must fall in this range
118-
# might separate read/write later
201+
# ANY pointer access, read or write, must fall in this range
202+
# Might separate read/write later
119203
[constraint.pointer]
120204
min = 0x7fffffffde00
121205
max = 0x7ffffffff000
122206
```
123207

124-
A successful synthesis will print out a listing of the gadgets were selected.
125-
126-
## Library Usage
208+
Note that using the CLI, a successful synthesis will print out a listing of the gadgets that were selected,
209+
but not the memory model found in synthesis.
127210

128-
`crackers` intended mode of use is as a library. All of the above settings from the config correspond
129-
to settings that can be set programmatically by API consumers.
211+
### Rust Crate
130212

131-
When using the API, rather than getting a listing of gadgets as an output, you get a model of the synthesized chain.
132-
This model of the chain includes information about what gadgets were selected as well as a Z3 `Model` representing the
133-
memory at all states of execution in the gadget chain. This model can be queried to derive the memory conditions
134-
necessary to execute the chain.
213+
[![Crates.io](https://img.shields.io/crates/v/crackers.svg)](https://crates.io/crates/crackers)
135214

136-
### Constraints
215+
`crackers` is on `crates.io` and can be added to your project with:
137216

138-
Constraints work a little differently with the API. Instead of specifying registers and register equality,
139-
`crackers` allows consumers to provide a closure of the following types:
140-
141-
```rust
142-
pub type StateConstraintGenerator = dyn for<'a, 'b> Fn(&'a Context, &'b State<'a>) -> Result<Bool<'a>, CrackersError>
143-
+ Send
144-
+ Sync
145-
+ 'static;
146-
pub type PointerConstraintGenerator = dyn for<'a, 'b> Fn(
147-
&'a Context,
148-
&'b ResolvedVarnode<'a>,
149-
&'b State<'a>,
150-
) -> Result<Option<Bool<'a>>, CrackersError>
151-
+ Send
152-
+ Sync
153-
+ 'static;
217+
```sh
218+
cargo add crackers
154219
```
155220

156-
The first type is used for asserting initial and final space constraints. These functions take a z3 context, and a handle
157-
to the program state, returning a `Result<Bool>`. The decision procedure will automatically evaluate
158-
provided functions and assert the booleans they return.
221+
API documentation can be found on [docs.rs](https://docs.rs/crackers/latest/crackers/).
159222

160-
The second type is used for asserting read/write invariants. These functions take in a handle to z3, as well as a struct containing
161-
the bitvector corresponding to the read/write address, as well as the state the read/write is being performed on.
162-
Any time a chain reads or writes from memory, the procedure will automatically call these functions and assert the returned
163-
booleans. This can allow for setting safe/unsafe ranges of memory or even the register space.
223+
** The API is unstable and largely undocumented at this time. **
164224

165225
# Research Paper
166226

167-
`crackers` was developed in support of our research paper _Synthesis of Code-Reuse Attacks from `p-code` Programs_.
168-
You can find the author accepted manuscript [here](https://ora.ox.ac.uk/objects/uuid:906d32ca-407c-4cab-beab-b90200f81d65).
169-
This work has been accepted to [Usenix Security 2025](https://www.usenix.org/conference/usenixsecurity25/presentation/denhoed).
227+
`crackers` was initially developed in support of our research paper, _Synthesis of Code-Reuse Attacks from `p-code` Programs_,
228+
presented at [Usenix Security 2025](https://www.usenix.org/conference/usenixsecurity25/presentation/denhoed).
170229

171-
You can cite this work with the following BibTex:
230+
If you found the paper or the implementation useful, you can cite it with the following BibTeX:
172231

173232
```bibtex
174-
@inproceedings {denhoed2025synthesis,
175-
author = {Mark DenHoed and Thomas Melham},
176-
title = {Synthesis of Code-Reuse Attacks from p-code Programs},
177-
booktitle = {34th USENIX Security Symposium (USENIX Security 25)},
178-
year = {2025},
179-
address = {Seattle, WA},
180-
publisher = {USENIX Association},
181-
month = aug
233+
@inproceedings{denhoed2025synthesis,
234+
title={Synthesis of ${Code-Reuse}$ Attacks from p-code Programs},
235+
author={DenHoed, Mark and Melham, Tom},
236+
booktitle={34th USENIX Security Symposium (USENIX Security 25)},
237+
pages={395--411},
238+
year={2025}
182239
}
183240
```

crackers/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ toml = ["dep:toml_edit"]
2424
z3-gh-release = ["z3/gh-release"]
2525

2626
[dependencies]
27-
jingle = { version = "0.3.0", features = ["gimli"] }
27+
jingle = { version = "0.3.2", features = ["gimli"] }
2828
z3 = "0.18.2"
2929
serde = { version = "1.0.203", features = ["derive"] }
3030
thiserror = "2.0"

crackers_python/Cargo.toml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,11 +12,13 @@ crate-type = ["cdylib"]
1212
[dependencies]
1313
pyo3 = { version = "0.26", features = ["extension-module", "py-clone"] }
1414
crackers = {path = "../crackers", features = ["pyo3"], version = "0.5.4" }
15-
jingle = { version = "0.3.0", features = ["pyo3"]}
15+
jingle = { version = "0.3.2", features = ["pyo3"]}
1616
toml_edit = { version = "0.23.4", features = ["serde"] }
1717
z3 = "0.18.2"
1818
serde_json = "1.0.140"
1919
lazy_static = "1.5.0"
20+
tracing = "0.1.41"
21+
tracing-subscriber = "0.3.20"
2022

2123
[dev-dependencies]
2224
pyo3 = { version = "0", features = ["extension-module"] }

0 commit comments

Comments
 (0)