@@ -72,16 +72,53 @@ Below is a summary of the support of libhat OS APIs on various platforms:
7272| ` hp::module::for_each_segment ` | ✅ | ✅ | |
7373
7474## Quick start
75- ### Pattern scanning
75+ ### Defining patterns
76+ libhat's signature syntax consists of space-delimited tokens and is backwards compatible with IDA syntax:
77+
78+ - 8-character sequences are interpreted as binary
79+ - 2-character sequences are interpreted as hex
80+ - A 1-character token must be a wildcard (`?`)
81+
82+ Any digit can be replaced with a wildcard, for example:
83+ - `????1111` is a binary sequence, and matches any byte with all ones in the lower nibble
84+ - `A?` is a hex sequence, and matches any byte of the form `1010????`
85+ - Both `????????` and `??` are equivalent to `?`, and will match any byte
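
The token rules above can be sketched as a small parser that turns one token into a value/mask pair. This is only an illustration of the syntax, not libhat's parser: `masked_byte`, `digit_value`, and `parse_token` are hypothetical names, and the sketch assumes that a set mask bit means "this bit must match".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical illustration type (not part of libhat): a byte `b` matches
// when (b & mask) == value.
struct masked_byte {
    std::uint8_t value;
    std::uint8_t mask;
};

// Returns the numeric value of a digit in the given base (2 or 16), or -1.
constexpr int digit_value(char c, int base) {
    int v = -1;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    return (v >= 0 && v < base) ? v : -1;
}

// Parses one space-delimited token according to the rules listed above.
constexpr std::optional<masked_byte> parse_token(std::string_view token) {
    if (token == "?") return masked_byte{0x00, 0x00}; // lone wildcard

    int bits_per_digit = 0;
    if (token.size() == 2) bits_per_digit = 4;      // hex token
    else if (token.size() == 8) bits_per_digit = 1; // binary token
    else return std::nullopt;                       // any other length is rejected

    const int base = 1 << bits_per_digit;
    const unsigned digit_mask = static_cast<unsigned>(base - 1);
    unsigned value = 0;
    unsigned mask = 0;
    for (char c : token) {
        value <<= bits_per_digit;
        mask <<= bits_per_digit;
        if (c == '?') continue; // wildcard digit: value and mask bits stay clear
        const int d = digit_value(c, base);
        if (d < 0) return std::nullopt;
        value |= static_cast<unsigned>(d);
        mask |= digit_mask;
    }
    return masked_byte{static_cast<std::uint8_t>(value), static_cast<std::uint8_t>(mask)};
}

// Usage: the examples from the list above.
inline void parse_token_examples() {
    assert(parse_token("????1111")->mask == 0x0F); // lower nibble must be all ones
    assert(parse_token("A?")->value == 0xA0);      // upper nibble must be 1010
    assert(parse_token("??")->mask == 0x00);       // same as "?"
}
```

Under this representation a wildcard digit simply clears the corresponding mask bits, which is why `????????`, `??`, and `?` all end up as the same value/mask pair.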
86+
87+ A complete pattern might look like `AB ? 12 ?3`. This matches any 4-byte
88+ subrange `s` for which all the following conditions are met:
89+ - `s[0] == 0xAB`
90+ - `s[2] == 0x12`
91+ - `(s[3] & 0x0F) == 0x03`
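
To make the matching conditions concrete, here is a naive reference matcher for `AB ? 12 ?3`, with the pattern written out by hand as value/mask pairs. It is a sketch for illustration only: `masked_byte`, `matches_at`, and `find_first` are hypothetical names, the code uses `std::uint8_t` instead of `std::byte` for brevity, and it does not reflect how libhat's vectorized scanners work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration type (not part of libhat): a byte `b` matches
// when (b & mask) == value.
struct masked_byte {
    std::uint8_t value;
    std::uint8_t mask;
};

// "AB ? 12 ?3" as value/mask pairs.
constexpr std::array<masked_byte, 4> example_pattern{{
    {0xAB, 0xFF}, // AB: every bit must match
    {0x00, 0x00}, // ? : any byte
    {0x12, 0xFF}, // 12: every bit must match
    {0x03, 0x0F}, // ?3: only the lower nibble must match
}};

// Returns true when the pattern matches the bytes starting at `s`.
// The caller must guarantee that at least `count` bytes are readable.
constexpr bool matches_at(const std::uint8_t* s, const masked_byte* pattern, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if ((s[i] & pattern[i].mask) != pattern[i].value) return false;
    }
    return true;
}

// Naive linear scan: returns the offset of the first match, or `size` if there is none.
constexpr std::size_t find_first(const std::uint8_t* data, std::size_t size,
                                 const masked_byte* pattern, std::size_t count) {
    if (count == 0 || count > size) return size;
    for (std::size_t pos = 0; pos + count <= size; ++pos) {
        if (matches_at(data + pos, pattern, count)) return pos;
    }
    return size;
}

// Usage: the pattern is found at offset 1 of this buffer.
inline void find_first_example() {
    const std::uint8_t data[] = {0x00, 0xAB, 0x77, 0x12, 0xF3, 0x00};
    assert(find_first(data, sizeof(data), example_pattern.data(), example_pattern.size()) == 1);
}
```

Note the parentheses around the masking step: in C++, `==` binds more tightly than `&`, so the comparison has to be written as `(s[i] & mask) == value`.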
92+
93+ Due to how various scanning algorithms are implemented, there are some restrictions when defining a pattern:
94+
95+ 1) A pattern must contain at least one fully masked byte (e.g. `AB` or `10011001`)
96+ 2) The first byte with a non-zero mask must have a full mask
97+    - `?1 02` is disallowed
98+    - `01 02` is allowed
99+    - `?? 01` is allowed
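
The two restrictions can be checked with a single pass over the per-byte masks. The sketch below is an illustration, not libhat's validation code: `satisfies_restrictions` is a hypothetical helper, and it assumes masks are stored one per pattern byte, with `0xFF` meaning fully masked and `0x00` meaning wildcard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of libhat). `masks` holds one mask per pattern
// byte: 0xFF = fully masked, 0x00 = wildcard, anything else = partially masked.
constexpr bool satisfies_restrictions(const std::uint8_t* masks, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (masks[i] == 0x00) continue; // leading wildcards are allowed
        // Restriction 2: the first byte with a non-zero mask must be fully masked.
        // If it is, restriction 1 (at least one fully masked byte) holds as well.
        return masks[i] == 0xFF;
    }
    // Only wildcards (or an empty pattern): restriction 1 is violated.
    return false;
}

// Usage: the three examples from the list above.
inline void restriction_examples() {
    const std::uint8_t partial_first[] = {0x0F, 0xFF}; // ?1 02
    const std::uint8_t full_first[] = {0xFF, 0xFF};    // 01 02
    const std::uint8_t wildcard_first[] = {0x00, 0xFF}; // ?? 01
    assert(!satisfies_restrictions(partial_first, 2));
    assert(satisfies_restrictions(full_first, 2));
    assert(satisfies_restrictions(wildcard_first, 2));
}
```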
100+
101+ In code, there are a few ways to initialize a signature from its string representation:
102+
76103``` cpp
77104#include <libhat/scanner.hpp>
78105
79106// Parse a pattern's string representation to an array of bytes at compile time
80107constexpr hat::fixed_signature pattern = hat::compile_signature<"48 8D 05 ? ? ? ? E8">();
81108
82- // ...or parse it at runtime
109+ // Parse using the UDLs at compile time
110+ using namespace hat::literals;
111+ constexpr hat::fixed_signature pattern = "48 8D 05 ? ? ? ? E8"_sig; // stack owned
112+ constexpr hat::signature_view pattern = "48 8D 05 ? ? ? ? E8"_sigv; // static lifetime
113+
114+ // Parse it at runtime
83115using parsed_t = hat::result<hat::signature, hat::signature_parse_error>;
84116parsed_t runtime_pattern = hat::parse_signature("48 8D 05 ? ? ? ? E8");
117+ ```
118+
119+ ### Scanning patterns
120+ ```cpp
121+ #include <libhat/scanner.hpp>
85122
86123// Scan for this pattern using your CPU's vectorization features
87124auto begin = /* a contiguous iterator over std::byte */;
@@ -109,6 +146,21 @@ const std::byte* address = result.get();
109146const std::byte* relative_address = result.rel(3);
110147```
111148
149+ libhat has a few optimizations for searching for patterns in `x86_64` machine code:
150+ ``` cpp
151+ #include <libhat/scanner.hpp>
152+
153+ // If a byte pattern matches at the start of a function, the result will be aligned to 16 bytes.
154+ // This can be indicated via the defaulted `alignment` parameter (all overloads have this parameter):
155+ std::span<std::byte> range = /* ... */;
156+ hat::signature_view pattern = /* ... */;
157+ hat::scan_result result = hat::find_pattern(range, pattern, hat::scan_alignment::X16);
158+
159+ // Additionally, x86_64 machine code has a non-uniform distribution of byte pairs. By passing the `x86_64`
160+ // scan hint, the search can be based on the least common byte pair that is found in the pattern.
161+ hat::scan_result result = hat::find_pattern(range, pattern, hat::scan_alignment::X1, hat::scan_hint::x86_64);
162+ ```
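
To illustrate why an alignment hint helps, the sketch below searches for an exact byte sequence but only tests addresses that are multiples of the requested alignment, so a 16-byte alignment inspects roughly one candidate in sixteen. It is not libhat's implementation: `find_aligned` is a hypothetical helper, it handles only exact bytes (no wildcards), and it assumes `alignment` is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of libhat): exact-byte search that only tests
// addresses which are multiples of `alignment` (assumed to be a power of two).
inline const std::uint8_t* find_aligned(const std::uint8_t* begin, const std::uint8_t* end,
                                        const std::uint8_t* needle, std::size_t needle_size,
                                        std::size_t alignment) {
    const auto address = reinterpret_cast<std::uintptr_t>(begin);
    const std::size_t misalignment = static_cast<std::size_t>(address & (alignment - 1));
    // Distance from `begin` to the first aligned address inside the range.
    const std::size_t skip = misalignment == 0 ? 0 : alignment - misalignment;
    const std::size_t size = static_cast<std::size_t>(end - begin);
    for (std::size_t pos = skip; pos <= size && needle_size <= size - pos; pos += alignment) {
        if (std::memcmp(begin + pos, needle, needle_size) == 0) return begin + pos;
    }
    return nullptr; // no aligned occurrence
}

// Usage: the unaligned occurrence at offset 5 is skipped when scanning with alignment 16.
inline void find_aligned_example() {
    alignas(16) std::uint8_t buffer[64] = {};
    const std::uint8_t needle[] = {0x48, 0x8D, 0x05};
    std::memcpy(buffer + 5, needle, sizeof(needle));
    std::memcpy(buffer + 32, needle, sizeof(needle));
    assert(find_aligned(buffer, buffer + 64, needle, sizeof(needle), 16) == buffer + 32);
    assert(find_aligned(buffer, buffer + 64, needle, sizeof(needle), 1) == buffer + 5);
}
```

The trade-off is that an aligned scan misses any occurrence that does not start on the chosen boundary, so it is only appropriate when the target is known to be aligned, such as the function starts described above.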
163+
112164### Accessing offsets
113165``` cpp
114166#include <libhat/access.hpp>