Commit 3cf6a49

Update README with new syntax
1 parent 3d2b422 commit 3cf6a49

1 file changed: +54 -2 lines changed

README.md

Lines changed: 54 additions & 2 deletions
@@ -72,16 +72,53 @@ Below is a summary of the support of libhat OS APIs on various platforms:
 | `hp::module::for_each_segment` ||| |
 
 ## Quick start
-### Pattern scanning
+### Defining patterns
+libhat's signature syntax consists of space-delimited tokens and is backwards compatible with IDA syntax:
+
+- 8-character sequences are interpreted as binary
+- 2-character sequences are interpreted as hex
+- 1-character sequences must be a wildcard (`?`)
+
+Any digit can be replaced with a wildcard, for example:
+- `????1111` is a binary sequence, and matches any byte with all ones in the lower nibble
+- `A?` is a hex sequence, and matches any byte of the form `1010????`
+- Both `????????` and `??` are equivalent to `?`, and will match any byte
+
+A complete pattern might look like `AB ? 12 ?3`. This matches any 4-byte
+subrange `s` for which all of the following conditions are met:
+- `s[0] == 0xAB`
+- `s[2] == 0x12`
+- `(s[3] & 0x0F) == 0x03`
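One way to read the token rules above is that every pattern byte carries a value and a mask. The following is a minimal standalone sketch of that reading, not libhat's actual parser. The names `masked_byte`, `parse_token`, `parse_pattern`, and `matches` are hypothetical and introduced here only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type (not part of libhat): one pattern byte.
struct masked_byte {
    std::uint8_t value; // expected bits, already restricted to the mask
    std::uint8_t mask;  // 1 bits are compared, 0 bits are wildcards
};

// Value of one digit in base 2 or 16, or nullopt for an invalid character.
std::optional<unsigned> digit_value(char c, unsigned base) {
    unsigned v = 0;
    if (c >= '0' && c <= '9') v = static_cast<unsigned>(c - '0');
    else if (c >= 'a' && c <= 'f') v = static_cast<unsigned>(c - 'a') + 10;
    else if (c >= 'A' && c <= 'F') v = static_cast<unsigned>(c - 'A') + 10;
    else return std::nullopt;
    if (v >= base) return std::nullopt;
    return v;
}

// 1 character: wildcard, 2 characters: hex, 8 characters: binary.
std::optional<masked_byte> parse_token(const std::string& tok) {
    if (tok == "?") return masked_byte{0, 0};
    unsigned base = 0, bits = 0;
    if (tok.size() == 2) { base = 16; bits = 4; }
    else if (tok.size() == 8) { base = 2; bits = 1; }
    else return std::nullopt;

    unsigned value = 0, mask = 0;
    for (char c : tok) {
        value <<= bits;
        mask <<= bits;
        if (c == '?') continue; // wildcard digit: leave its mask bits at 0
        auto d = digit_value(c, base);
        if (!d) return std::nullopt;
        value |= *d;
        mask |= (1u << bits) - 1;
    }
    assert(value <= 0xFF && mask <= 0xFF);
    return masked_byte{static_cast<std::uint8_t>(value), static_cast<std::uint8_t>(mask)};
}

// Split on whitespace and parse every token; nullopt if any token is invalid.
std::optional<std::vector<masked_byte>> parse_pattern(const std::string& text) {
    std::vector<masked_byte> out;
    std::istringstream in(text);
    for (std::string tok; in >> tok;) {
        auto b = parse_token(tok);
        if (!b) return std::nullopt;
        out.push_back(*b);
    }
    return out;
}

// True if `s` has the pattern's length and every byte agrees under its mask.
bool matches(const std::vector<masked_byte>& pattern, const std::vector<std::uint8_t>& s) {
    if (s.size() != pattern.size()) return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if ((s[i] & pattern[i].mask) != pattern[i].value) return false;
    }
    return true;
}
```

Under this sketch, `parse_pattern("AB ? 12 ?3")` yields four entries, and `matches` accepts the bytes `AB 00 12 F3` because only the lower nibble of the last byte is compared.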
+
+Due to how various scanning algorithms are implemented, there are some restrictions when defining a pattern:
+
+1) A pattern must contain at least one fully masked byte (e.g. `AB` or `10011001`)
+2) The first byte with a non-zero mask must have a full mask
+- `?1 02` is disallowed
+- `01 02` is allowed
+- `?? 01` is allowed
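The two restrictions above can be checked from the per-byte masks alone. The sketch below assumes a hypothetical representation with one mask per pattern byte (`0xFF` fully specified, `0x00` full wildcard, anything else partial); `is_scannable` is an illustrative name, not a libhat function, and libhat's own validation may differ.

```cpp
#include <cstdint>
#include <vector>

// Returns true if the masks satisfy both restrictions listed above.
// If the first non-zero mask is full (rule 2), the pattern necessarily
// contains a fully masked byte (rule 1), so one pass is enough.
bool is_scannable(const std::vector<std::uint8_t>& masks) {
    for (std::uint8_t m : masks) {
        if (m == 0x00) continue; // full wildcard: keep looking
        return m == 0xFF;        // first non-zero mask must be a full mask
    }
    return false; // empty or all wildcards: no fully masked byte
}
```

With this representation, `?1 02` corresponds to the masks `{0x0F, 0xFF}` and is rejected, while `01 02` (`{0xFF, 0xFF}`) and `?? 01` (`{0x00, 0xFF}`) are accepted.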
+
+In code, there are a few ways to initialize a signature from its string representation:
+
 ```cpp
 #include <libhat/scanner.hpp>
 
 // Parse a pattern's string representation to an array of bytes at compile time
 constexpr hat::fixed_signature pattern = hat::compile_signature<"48 8D 05 ? ? ? ? E8">();
 
-// ...or parse it at runtime
+// Parse using the user-defined literals (UDLs) at compile time
+using namespace hat::literals;
+constexpr hat::fixed_signature udl_pattern = "48 8D 05 ? ? ? ? E8"_sig; // stack owned
+constexpr hat::signature_view udl_view = "48 8D 05 ? ? ? ? E8"_sigv; // static lifetime
+
+// Parse it at runtime
 using parsed_t = hat::result<hat::signature, hat::signature_parse_error>;
 parsed_t runtime_pattern = hat::parse_signature("48 8D 05 ? ? ? ? E8");
+```
+
+### Scanning patterns
+```cpp
+#include <libhat/scanner.hpp>
 
 // Scan for this pattern using your CPU's vectorization features
 auto begin = /* a contiguous iterator over std::byte */;
@@ -109,6 +146,21 @@ const std::byte* address = result.get();
 const std::byte* relative_address = result.rel(3);
 ```
 
+libhat has a few optimizations for searching for patterns in `x86_64` machine code:
+```cpp
+#include <libhat/scanner.hpp>
+
+// If a byte pattern matches at the start of a function, the result will be aligned to 16 bytes.
+// This can be indicated via the defaulted `alignment` parameter (all overloads have this parameter):
+std::span<std::byte> range = /* ... */;
+hat::signature_view pattern = /* ... */;
+hat::scan_result aligned_result = hat::find_pattern(range, pattern, hat::scan_alignment::X16);
+
+// Additionally, x86_64 machine code has a non-uniform distribution of byte pairs. By passing the `x86_64`
+// scan hint, the search can be based on the least common byte pair that is found in the pattern.
+hat::scan_result hinted_result = hat::find_pattern(range, pattern, hat::scan_alignment::X1, hat::scan_hint::x86_64);
+```
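To make the effect of the alignment parameter concrete, here is a naive reference scan that only tests candidate positions whose address is a multiple of the requested alignment. It is a sketch under assumptions: `find_aligned` and its value/mask vectors are hypothetical, it takes "aligned" to mean the address of the match, and it does not reflect libhat's vectorized implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive reference scan for illustration; NOT libhat's implementation.
// `values` and `masks` hold one entry per pattern byte, with each value
// already restricted to its mask. `alignment` must be non-zero (e.g. 1 or 16).
const std::uint8_t* find_aligned(const std::uint8_t* begin, const std::uint8_t* end,
                                 const std::vector<std::uint8_t>& values,
                                 const std::vector<std::uint8_t>& masks,
                                 std::size_t alignment) {
    assert(begin <= end);
    const std::size_t n = values.size();
    const std::size_t total = static_cast<std::size_t>(end - begin);
    if (n == 0 || masks.size() != n || alignment == 0 || total < n) return nullptr;

    // Offset of the first candidate whose address is a multiple of `alignment`.
    const auto addr = reinterpret_cast<std::uintptr_t>(begin);
    const std::size_t first = (alignment - addr % alignment) % alignment;

    // Step by `alignment`, so unaligned positions are never compared.
    for (std::size_t off = first; off + n <= total; off += alignment) {
        bool ok = true;
        for (std::size_t i = 0; i < n && ok; ++i) {
            ok = (begin[off + i] & masks[i]) == values[i];
        }
        if (ok) return begin + off;
    }
    return nullptr;
}
```

With an alignment of 16 the loop visits roughly one sixteenth of the positions that an alignment of 1 would, which is where the saving for function-start patterns comes from in this simplified model.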
+
 ### Accessing offsets
 ```cpp
 #include <libhat/access.hpp>
