Commit e8d34d7

doc: adding documentation on available operations
1 parent de17663 commit e8d34d7

7 files changed

+175
-8
lines changed

Cargo.lock

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ resolver = "2"
33
members = ["bmatcher", "bmatcher-core", "bmatcher-proc", "example"]
44

55
[workspace.package]
6-
version = "0.1.1"
6+
version = "0.1.2"
77
authors = ["Markus Hadenfeldt <[email protected]>"]
88
edition = "2021"
99
description = "bmatcher is a flexible and efficient binary pattern matching library designed to help you search and match binary data."

GRAMMA.MD

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
1+
# Syntax of a Binary Pattern
2+
3+
The syntax for this crate's binary patterns is primarily inspired by the [pelite](https://docs.rs/pelite/latest/pelite/pattern/fn.parse.html) crate's pattern system, aligning with existing de facto standards to simplify migration. Additionally, numerous enhancements have been introduced to facilitate matching against generated assembly instructions for function or code signatures.
4+
5+
Below, all available operators are defined and explained.
6+
7+
# Available Operators
8+
9+
## Binary Data (`<hex>`)
10+
11+
The **Match Binary Data** operator is the most fundamental operator. It performs a byte-by-byte comparison of the input with the specified hexadecimal values. Each byte must be written in hexadecimal format and padded to two digits.
12+
13+
The following example searches for the hexadecimal sequence `0xFF 0xDE 0x01 0x23` in the target:
14+
15+
```pattern
16+
FF DE 01 23
17+
```
18+
19+
Spaces between the hexadecimal values are optional. The following example is equivalent to the one above:
20+
21+
```pattern
22+
FFDE0123
23+
```
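
To make the "spaces are optional" rule concrete, here is a minimal sketch of how such a hex sequence could be turned into bytes. The function name `parse_hex_bytes` is hypothetical and is not part of bmatcher; the sketch is also slightly more lenient than the grammar, since it ignores whitespace anywhere.

```rust
/// Hypothetical helper (not bmatcher's API): parses two-digit hex bytes,
/// ignoring any whitespace. Returns `None` on an odd number of digits or
/// on a character that is not a hex digit.
fn parse_hex_bytes(text: &str) -> Option<Vec<u8>> {
    // Drop all whitespace so "FF DE" and "FFDE" are treated the same.
    let digits: Vec<char> = text.chars().filter(|c| !c.is_whitespace()).collect();
    if digits.len() % 2 != 0 {
        return None;
    }
    digits
        .chunks(2)
        .map(|pair| {
            let hi = pair[0].to_digit(16)?;
            let lo = pair[1].to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}
```

Both spellings from the examples above would yield the same four bytes under this sketch.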
24+
25+
## Byte Wildcard (`?`)
26+
27+
The **Byte Wildcard** operator (`?`) matches a single byte of any value. It is the counterpart to the Match Binary Data operator, which requires an exact value.
28+
For example, the following pattern matches any 32-bit relative call instruction (`E8 rel32`) followed by a return (`C3`) in x86 assembly:
29+
30+
```pattern
31+
E8 ? ? ? ? C3
32+
```
33+
34+
Note that a single question mark matches a whole byte.
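
The semantics can be illustrated with a small stand-alone matcher. This is only a sketch under the assumption that a pattern is a flat list of exact bytes and wildcards; the `Element` alias and `find_first` are hypothetical names and do not reflect how bmatcher represents patterns internally.

```rust
/// One pattern element: `Some(b)` must equal the input byte,
/// `None` stands for a `?` wildcard. (Illustrative type, not bmatcher's.)
type Element = Option<u8>;

/// Returns the offset of the first position where every element matches.
fn find_first(haystack: &[u8], pattern: &[Element]) -> Option<usize> {
    if pattern.is_empty() || pattern.len() > haystack.len() {
        return None;
    }
    haystack.windows(pattern.len()).position(|window| {
        window
            .iter()
            .zip(pattern)
            // A wildcard accepts anything; an exact byte must be equal.
            .all(|(byte, element)| element.map_or(true, |expected| expected == *byte))
    })
}
```

With the pattern `E8 ? ? ? ? C3`, the four wildcard slots accept any displacement bytes while the opcode and the trailing `C3` must match exactly.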
35+
36+
## Range Wildcard (`[<min>-<max>]` / `[<count>]`)
37+
38+
The **Range Wildcard** operator (`[<min>-<max>]` / `[<count>]`) extends the capabilities of the byte wildcard operator by allowing you to match a specific range or a fixed count of bytes with any value.
39+
40+
- Fixed Count Wildcard (`[<count>]`)
41+
Matches an exact number of bytes. For example, the following matches a 32-bit relative call instruction (`E8`), skips four bytes, and then matches a return instruction (`C3`):
42+
43+
```pattern
44+
E8 [4] C3
45+
```
46+
47+
- Variable Range Wildcard (`[<min>-<max>]`)
48+
Matches a variable range of bytes.
49+
The matcher tries to align the remaining pattern at every offset within the range.
50+
For instance, the following matches a sequence starting with `0xFF`, followed by four to eight arbitrary bytes, and ending with another `0xFF`:
51+
52+
```pattern
53+
FF [4-8] FF
54+
```
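
A variable range can be thought of as trying each allowed gap length in turn. The following sketch assumes the gap lengths are tried from smallest to largest, which the text above does not specify; `match_with_gap` is a hypothetical helper, not a bmatcher function.

```rust
/// Illustrative only: matches `prefix`, then between `min` and `max`
/// arbitrary bytes, then `suffix`, all anchored at the start of `data`.
/// Returns the gap length that allowed the match.
fn match_with_gap(
    data: &[u8],
    prefix: &[u8],
    min: usize,
    max: usize,
    suffix: &[u8],
) -> Option<usize> {
    if !data.starts_with(prefix) {
        return None;
    }
    let rest = &data[prefix.len()..];
    // Try every gap length in the range, smallest first (an assumption).
    (min..=max).find(|&gap| {
        rest.get(gap..)
            .map_or(false, |tail| tail.starts_with(suffix))
    })
}
```

A fixed count such as `[4]` corresponds to calling this with `min == max`.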
55+
56+
## Save Cursor (`'`)
57+
58+
The **Save Cursor** operator (`'`) acts as a bookmark to save the current cursor's relative virtual address (RVA) in the save array returned by the matcher.
59+
The following example saves the RVA of the start of the `01 02 03 04` sequence in the result array at index 1:
60+
61+
```pattern
62+
FF ' 01 02 03 04
63+
```
64+
65+
Note:
66+
The first index (index 0) in the returned array from the matcher always contains the start address of the matched pattern.
67+
68+
## Rel/Abs Jump (`%` / `$` / `@`)
69+
70+
The **Jump** operator follows either a relative or absolute jump, allowing the pattern to continue matching at the resolved jump target. The following jump modes are supported:
71+
72+
- **1-byte relative jump**: `%`
73+
- **4-byte relative jump**: `$`
74+
- **8-byte absolute jump**: `@`
75+
76+
When using a jump operator, subsequent operations will be performed at the resolved jump location.
77+
78+
Example:
79+
The following pattern matches a function call (`E8`), resolves a 4-byte relative jump (`$`), saves the function's start address to the save array, and confirms the function begins with `push rsp` (`54`):
80+
81+
```pattern
82+
E8 $ ' 54
83+
```
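
For readers unfamiliar with relative jumps, the sketch below shows how a 4-byte relative target is conventionally computed on x86: the signed displacement is added to the address of the byte following the displacement. It assumes a little-endian displacement and treats buffer offsets as addresses; `resolve_rel32` is a hypothetical name and not part of bmatcher.

```rust
/// Illustrative only: resolves a 4-byte relative jump whose displacement
/// starts at `cursor`. The target is relative to the end of the
/// displacement, as for x86 `E8 rel32`. Returns `None` if the
/// displacement or the target lies outside `data`.
fn resolve_rel32(data: &[u8], cursor: usize) -> Option<usize> {
    let end = cursor.checked_add(4)?;
    let raw = data.get(cursor..end)?;
    // Assumption: the displacement is stored little-endian and signed.
    let displacement = i32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]) as i64;
    let target = end as i64 + displacement;
    if target < 0 || target as usize >= data.len() {
        None
    } else {
        Some(target as usize)
    }
}
```

In terms of the pattern `E8 $ ' 54`, the offset returned here is where the cursor would be saved and where `54` would be compared.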
84+
85+
## Rel/Abs Jump with Sub-Pattern (`%` / `$` / `@` with `{}`)
86+
87+
The **Jump** operator can also match a sub-pattern at the resolved jump destination while returning the cursor to its original location after the jump. This is achieved by enclosing the sub-pattern in curly braces (`{}`) immediately following the jump symbol.
88+
89+
Behavior:
90+
91+
- The sub-pattern within the curly braces is matched at the resolved jump destination.
92+
- After the sub-pattern is matched successfully, the cursor returns to the original location before the jump.
93+
- The bytes defining the jump are skipped, and matching continues from that point.
94+
95+
Example:
96+
97+
The following pattern matches a function call (`E8`), resolves a 4-byte relative jump (`$`), saves the target address, confirms that the jump target begins with `push rsp` (`54`), and then continues matching after the jump bytes:
98+
99+
```pattern
100+
E8 $ { ' 54 }
101+
```
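
The behavior described above can be sketched as a function that checks a sub-pattern at the jump target and reports where the outer pattern resumes. This is an illustration under the same assumptions as a plain x86 `rel32` jump (signed, little-endian, relative to the end of the displacement); `match_at_jump_target` is a hypothetical name, and the sub-pattern is simplified to exact bytes.

```rust
/// Illustrative only: checks `sub_pattern` at the target of the rel32
/// displacement stored at `cursor`. On success returns `(target, resume)`,
/// where `resume` is the offset just past the 4 displacement bytes,
/// i.e. where the outer pattern would continue matching.
fn match_at_jump_target(
    data: &[u8],
    cursor: usize,
    sub_pattern: &[u8],
) -> Option<(usize, usize)> {
    let resume = cursor.checked_add(4)?;
    let raw = data.get(cursor..resume)?;
    let displacement = i32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]) as i64;
    let target = resume as i64 + displacement;
    if target < 0 {
        return None;
    }
    let target = target as usize;
    // The sub-pattern is compared at the target, not at the cursor.
    if data.get(target..)?.starts_with(sub_pattern) {
        Some((target, resume))
    } else {
        None
    }
}
```

The key difference from the plain jump is that the cursor does not stay at the target: only `resume` is used for the rest of the outer pattern.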
102+
103+
## OR / Branch (`(<pattern a> | <pattern b> [ | <pattern n> ])`)
104+
105+
The **Branch** operator matches one of several alternative sub-patterns. It is especially useful when multiple opcode variations or byte sequences are valid at the same position.
106+
107+
Example:
108+
The following pattern matches any of these sequences: 0xFF 0x01 0xFF, 0xFF 0x03 0xFF, or 0xFF 0xFF 0xFF:
109+
110+
```pattern
111+
FF ( 01 | 03 | FF ) FF
112+
```
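
A branch can be pictured as trying each alternative in order until the rest of the pattern also matches. The sketch below assumes left-to-right evaluation and restricts alternatives to exact byte sequences; `match_branch` is a hypothetical helper, not bmatcher's implementation.

```rust
/// Illustrative only: matches `prefix`, then one of `alternatives`, then
/// `suffix`, anchored at the start of `data`. Returns the index of the
/// first alternative that lets the whole pattern match.
fn match_branch(
    data: &[u8],
    prefix: &[u8],
    alternatives: &[&[u8]],
    suffix: &[u8],
) -> Option<usize> {
    let rest = data.strip_prefix(prefix)?;
    alternatives.iter().position(|alt| {
        rest.strip_prefix(*alt)
            .map_or(false, |tail| tail.starts_with(suffix))
    })
}
```

For `FF ( 01 | 03 | FF ) FF`, the prefix and suffix are both the single byte `FF` and the alternatives are `01`, `03`, and `FF`.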
113+
114+
## Read Value (`r1` / `r2` / `r4`)
115+
116+
The **Read Value** operator reads and saves a value from the matched bytes. It supports reading 1, 2, or 4 bytes and stores the result in the matched stack. This operator is particularly useful for extracting values like offsets, addresses, or immediate data from matched byte sequences.
117+
118+
Example:
119+
The following pattern matches a 32-bit relative call instruction (`E8`) and saves the 4-byte value that follows the opcode (the call's relative displacement) into the matched stack at index 1:
120+
121+
```pattern
122+
E8 r4
123+
```
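
As an illustration of what reading 1, 2, or 4 bytes amounts to, the sketch below decodes the bytes at the cursor as an unsigned little-endian value. The byte order and the `u32` result type are assumptions made for this example, and `read_value` is a hypothetical name rather than a bmatcher function.

```rust
/// Illustrative only: reads `width` bytes (1, 2, or 4) at `cursor` as a
/// little-endian unsigned value. Returns `None` for any other width or
/// if the read would run past the end of `data`.
fn read_value(data: &[u8], cursor: usize, width: usize) -> Option<u32> {
    if !matches!(width, 1 | 2 | 4) {
        return None;
    }
    let raw = data.get(cursor..cursor.checked_add(width)?)?;
    // Fold from the most significant (last) byte down to the first.
    Some(raw.iter().rev().fold(0u32, |acc, &b| (acc << 8) | u32::from(b)))
}
```

For `E8 r4`, the cursor would sit directly after the `E8` opcode when the value is read.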
124+
125+
# Formal Syntax Specification
126+
127+
The following ABNF-style grammar specifies the general syntax:
128+
129+
```abnf
130+
match_string := *(operand " ")
131+
132+
operand := operand_bin / operand_wildcard_byte / operand_wildcard_range / operand_jump / operand_read / operand_cursor_save / operand_branch
133+
134+
operand_bin := 1*(2HEXDIG)
135+
operand_wildcard_byte := "?"
136+
operand_wildcard_range := "[" (wildcard_fixed / wildcard_range) "]"
137+
operand_jump := ("%" / "$" / "@") [jump_target_matcher]
138+
operand_read := "r" ("1" / "2" / "4")
139+
operand_cursor_save := "'"
140+
operand_branch := "(" match_string *("|" match_string) ")"
141+
142+
wildcard_range := 1*DIGIT "-" 1*DIGIT
143+
wildcard_fixed := 1*DIGIT
144+
145+
jump_target_matcher := "{" *(match_string) "}"
146+
```

README.MD

Lines changed: 5 additions & 1 deletion
@@ -26,6 +26,10 @@ To use `bmatcher`, add it as a dependency in your `Cargo.toml`:
2626
bmatcher = "0.1"
2727
```
2828

29+
## Creating a pattern
30+
31+
An exhaustive overview of the pattern syntax and operators can be found [here](./GRAMMA.MD).
32+
2933
## Basic Usage
3034

3135
Here's a simple example demonstrating how to use bmatcher to match a call signature binary pattern:
@@ -36,7 +40,7 @@ let pattern = pattern!("
3640
/*
3741
* call my_function
3842
* $ = follow 4 byte relative jump
39-
* ' = save cursor position to the matched stack)
43+
* ' = save cursor position to the matched stack
4044
*/
4145
E8 $ { ' }
4246

bmatcher-core/src/pattern.rs

Lines changed: 2 additions & 2 deletions
@@ -51,7 +51,7 @@ pub trait BinaryPattern: Send + Sync + Debug {
5151

5252
/// An implementation of the [BinaryPattern] interface that borrows the [Atom]s and byte sequence array.
5353
///
54-
/// This struct is primarily used alongside the [pattern!] macro to generate patterns at runtime.
54+
/// This struct is primarily used alongside the [bmatcher_proc::pattern] macro to generate patterns at compile time.
5555
#[derive(Debug, Clone, Copy)]
5656
pub struct BorrowedBinaryPattern<'a> {
5757
atoms: &'a [Atom],
@@ -79,7 +79,7 @@ impl BinaryPattern for BorrowedBinaryPattern<'_> {
7979

8080
/// An implementation of the [BinaryPattern] interface that allocates a `Vec` for the [Atom]s and the byte sequence.
8181
///
82-
/// This struct is primarily used with [compiler::parse] to parse binary patterns at runtime.
82+
/// This struct is primarily used with [crate::compiler::parse_pattern] to parse binary patterns at runtime.
8383
#[derive(Debug, Default)]
8484
pub struct OwnedBinaryPattern {
8585
atoms: Vec<Atom>,

bmatcher-proc/src/lib.rs

Lines changed: 3 additions & 1 deletion
@@ -5,7 +5,9 @@ extern crate proc_macro;
55

66
mod macro_pattern;
77

8-
/// Parse a binary pattern and generate an instance of [BorrowedBinaryPattern] at compile time.
8+
/// Parse a binary pattern and generate an instance of [bmatcher::BorrowedBinaryPattern] at compile time.
9+
/// An exhaustive overview of the pattern syntax and operators can be found here: [bmatcher::doc_pattern_syntax].
10+
///
911
/// # Example
1012
/// ```rust,ignore
1113
/// static MY_PATTERN: &dyn BinaryPattern = &pattern!("01 02 ? 03 [4]");

bmatcher/src/lib.rs

Lines changed: 15 additions & 0 deletions
@@ -1,3 +1,18 @@
1+
/*!
2+
A flexible and efficient binary pattern matching library designed to help you search and match binary data.
3+
4+
# bmatcher's syntax for patterns
5+
An exhaustive overview of the pattern syntax and operators can be found here: [doc_pattern_syntax].
6+
7+
# How to create patterns?
8+
To create a pattern, you need some knowledge of reverse engineering programs.
9+
10+
A guide on how to create signatures can be found here: <https://wiki.alliedmods.net/Signature_scanning#Finding_the_Signature_of_a_Function>
11+
*/
12+
113
#![cfg_attr(not(test), no_std)]
214
pub use bmatcher_core::*;
315
pub use bmatcher_proc::pattern;
16+
17+
#[doc = include_str!("../../GRAMMA.MD")]
18+
pub fn doc_pattern_syntax() {}
