Skip to content

Commit af520d4

Browse files
committed
Readme update
1 parent b899d00 commit af520d4

File tree

1 file changed

+10
-108
lines changed

1 file changed

+10
-108
lines changed

README.md

Lines changed: 10 additions & 108 deletions
Original file line numberDiff line numberDiff line change
@@ -97,11 +97,20 @@ min = 0x7fffffffde00
9797
max = 0x7ffffffff000
9898
```
9999

100+
A successful synthesis will print out a listing of the gadgets were selected.
101+
100102
## Library Usage
101103

102-
`crackers` can also be used as a library. All of the above settings from the config correspond
104+
`crackers` intended mode of use is as a library. All of the above settings from the config correspond
103105
to settings that can be set programmatically by API consumers.
104106

107+
When using the API, rather than getting a listing of gadgets as an output, you get a model of the synthesized chain.
108+
This model of the chain includes information about what gadgets were selected as well as a Z3 `Model` representing the
109+
memory at all states of execution in the gadget chain. This model can be queried to derive the memory conditions
110+
necessary to execute the chain.
111+
112+
### Constraints
113+
105114
Constraints work a little differently with the API. Instead of specifying registers and register equality,
106115
`crackers` allows consumers to provide a closure of the following types:
107116

@@ -128,110 +137,3 @@ The second type is used for asserting read/write invariants. These functions tak
128137
the bitvector corresponding to the read/write address, as well as the state the read/write is being performed on.
129138
Any time a chain reads or writes from memory, the procedure will automatically call these functions and assert the returned
130139
booleans. This can allow for setting safe/unsafe ranges of memory or even the register space.
131-
132-
## How it Works (Roughly)
133-
134-
### Library generation
135-
136-
A library is taken in and parsed. The executable sections are identified. For every executable section,
137-
we attempt disassembly at every byte offset. If disassembly succeeds and returns a terminating basic block
138-
within N instructions (where N is set in the config), then we call that a gadget and save it.
139-
140-
### Candidate selection
141-
142-
`crackers` works by taking in an "example" computation and synthesizing a chain that is compatible with the example.
143-
So for instance, if you want to call execve on linux, your example computation might look like this:
144-
145-
```
146-
00000000 4889c7 mov rdi, rax
147-
00000003 48c7c03b000000 mov rax, 0x3b
148-
0000000a 48c7c600000000 mov rsi, 0x0
149-
00000011 48c7c200000000 mov rdx, 0x0
150-
00000018 0f05 syscall
151-
```
152-
153-
This computation sets `rax`, `rsi`, and `rdx` to set values, and `rdi` to some indeterminate value, which we will
154-
constrain later.
155-
156-
For each instruction in this computation, we identify a set of "gadget candidates". These candidates are selected out of
157-
the library we assembled. To be a candidate for an instruction a gadget must pass the following checks:
158-
* If the specification instruction contains a jump, the gadget must have terminating control flow that is capable of branching
159-
to the same destination.
160-
* The gadget must write to every direct address that the specification instruction writes to.
161-
* For every indirect access, the gadget must also make an indirect access using the same pointer storage, of at least
162-
as many bytes.
163-
* Taken in isolation, the gadget must be able to have the same effect as the specification instruction:
164-
(e.g. `mov eax, ebx` can stand in for `mov eax, 0`, but `mov eax, 2` cannot).
165-
166-
167-
### Decision Loop
168-
169-
The overall flow of the procedure is as follows:
170-
171-
* Ask the assignment problem for an assignment
172-
* If it returns UNSAT, then no possible assignment exists (under the given parameters) and we return
173-
* If it returns SAT, then we send that assignment to the theory solver
174-
* If the PCODE theory solver returns SAT, we have a valid chain
175-
* If the PCODE theory solver returns UNSAT, it also provides a set of conflict clauses identifying
176-
which `decisions` participated in the UNSAT proof. These clauses are communicated back to the assignment
177-
problem to allow it to outlaw that combination of `decisions`.
178-
179-
We introduce parallelism to this workflow by running the PCODE theory solvers in threads and generating multiple
180-
unique assignments for each worker to solve in parallel.
181-
182-
A description of the assignment problem and the theory problem follow:
183-
### Assignment Problem Setup
184-
185-
Once all the candidates have been found for all instructions, we check and make sure that we have found
186-
at least one candidate for each. If any instruction has no candidates, then we immediately return UNSAT. This
187-
usually indicates that we just did not sample a gadget that touches the needed memory.
188-
189-
For each spec instruction, each candidate is assigned an index. The same gadget can exist as a candidate for multiple indices and
190-
each copy is treated as logically separate from each other.
191-
192-
Using these indices, we construct a simple boolean SAT problem:
193-
* We define a `decision` as a tuple `(i: usize, c: usize)`, indicating that index `i` is using choice `c`. A `decision`
194-
uniquely identifies a given gadget being used in a given slot.
195-
* Each decision is mapped to a Z3 Bool.
196-
* We then construct a boolean SAT problem using these variables with the following constraints:
197-
* For all indices `i`:
198-
* We make exactly 1 choice `c` (e.g. for every `i`, one AND ONLY ONE `decision` with matching `i` must be true)
199-
* In the case of the `optimize` solver, we additionally impose a penalty on every `decision`, proportional to the
200-
number of instructions in the gadget. This pressures the solver into selecting the shortest gadgets that it can.
201-
202-
### PCODE Theory Problem Setup
203-
204-
This procedure runs operates on a single assignment of gadgets. This assignment is evaluated against
205-
the specification computation, as well as any provided constraints.
206-
207-
First, we form the specification computation into a trace, by asserting state equality between the end state
208-
of every instruction and the beginning state of its successor.
209-
210-
Then, we do the same for our assignment of gadgets. We tag these assertions as being `memory` assertions.
211-
212-
We assert all preconditions on the initial state of the gadget chain, and all postconditions on the final state of the gadget chain.
213-
We tag these as `constraint` assertions.
214-
215-
For every instruction `i` and its corresponding gadget `g`:
216-
* Assert that for all `v` in `output(i)`: `g[v]` = `i[v]`.
217-
* If `i` has control flow, assert that the control flow of `g` branches to the same destination as `i`
218-
* These are tagged as `semantic` assertions.
219-
220-
For every gadget `g1` and its successor `g2`:
221-
* Assert that the address of `g2` is the jump target of `g1`. These are tagged as `branch` assertions. The conflict
222-
associated with a `branch` assertions only references `g1` instead of the conjunction of `g1` and `g2` because,
223-
as a heuristic, when `g1` is unable to branch to `g2` it is almost always because of some conflict in `g1`, and not
224-
something about the address of `g2`
225-
226-
We give all these assertions to `z3` using the `assert_and_track` API, which makes Z3 express the unsat core
227-
in terms of booleans representing our varying assertions.
228-
229-
If z3 comes back with SAT, then the chain assignment is valid.
230-
231-
If it comes back UNSAT, then we analyze the UNSAT CORE:
232-
233-
* If the UNSAT core is composed only of `memory` and `constraint` assertions, then, as the formulae are currently tracked,
234-
we have no way to make strong conflicts ouf of this core. As a fallback, we simply return a clause outlawing the complete
235-
assignment.
236-
* Otherwise, we form a conjunction of all participating `decisions` for all `branch` and `semantic` conflicts
237-
and return that to the assignment problem.

0 commit comments

Comments
 (0)