|
| 1 | +.. _aflpp integration: |
| 2 | + |
| 3 | +================= |
| 4 | +AFL++ Integration |
| 5 | +================= |
| 6 | + |
| 7 | +The C++ backend of *Grammarinator* provides seamless integration with |
| 8 | +AFL++ via its custom mutator interface. This allows *Grammarinator* |
| 9 | +to be used not only as a blackbox test case generator, but also as an |
| 10 | +**in-process input synthesizer**, where its internal derivation trees are |
| 11 | +evolved and mutated during fuzzing runs. The mutator operates on serialized |
| 12 | +``.grt*`` trees and performs grammar-aware transformations based on the compiled |
| 13 | +ANTLR grammar. |
| 14 | + |
| 15 | +Overview |
| 16 | +-------- |
| 17 | + |
| 18 | +The integration is built on the AFL++ `custom mutator hooks`_ API. |
| 19 | +AFL++ loads a shared library implementing these hooks and delegates mutation |
| 20 | +and related workflow operations to Grammarinator. |
| 21 | + |
| 22 | +This enables grammar-aware, structure-preserving mutation and recombination |
| 23 | +of test cases at runtime -- improving coverage and syntactic correctness |
| 24 | +compared to purely byte-level fuzzing. |
| 25 | + |
| 26 | +.. _`custom mutator hooks`: https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/custom_mutators.md |
| 27 | + |
| 28 | +Building the AFL++-Compatible Mutator |
| 29 | +------------------------------------- |
| 30 | + |
| 31 | +To enable this integration in a real AFL++ fuzzing setup, a specialized |
| 32 | +shared library must be generated from the C++ generator class produced by |
| 33 | +:ref:`grammarinator-process<grammarinator-process>`. This can be compiled |
| 34 | +by :ref:`the build script<cpp_compilation>` using the ``--grafl`` flag |
| 35 | +(short for *grammarinator-afl*). |
| 36 | + |
| 37 | +Example using the `HTML grammar`_:: |
| 38 | + |
| 39 | + python3 grammarinator-cxx/dev/build.py --clean \ |
| 40 | + --generator HTMLGenerator \ |
| 41 | + --includedir <dir-to-HTMLGenerator> \ |
| 42 | + --afl-includedir <AFLplusplus-root>/include \ |
| 43 | + --serializer SimpleSpaceSerializer \ |
| 44 | + --grafl |
| 45 | + |
| 46 | +This command produces a shared library:: |
| 47 | + |
| 48 | + grammarinator-cxx/build/lib/libgrafl-html.so |
| 49 | + |
| 50 | +AFL++ will load this ``.so`` as the custom mutator library through the |
| 51 | +``AFL_CUSTOM_MUTATOR_LIBRARY`` environment variable. |
| 52 | + |
| 53 | +Test inputs are expected to be encoded as ``.grt*`` trees |
| 54 | +(e.g., FlatBuffer-encoded). During fuzzing, mutations will occur in a |
| 55 | +**grammar-aware** manner, resulting in: |
| 56 | + |
| 57 | +- higher syntactic validity of inputs, |
| 58 | +- better exploration of the structured input space, |
| 59 | +- and potentially deeper semantic bugs found in the target. |
| 60 | + |
| 61 | +Note that only ``.grt*``-style inputs (e.g., ``.grtf`` for FlatBuffer-encoded |
| 62 | +trees) are supported by the AFL++ integration. |
| 63 | + |
| 64 | +Fuzzing Configuration |
| 65 | +--------------------- |
| 66 | + |
| 67 | +Unlike the :ref:`grammarinator-generate<grammarinator-generate>` utility, the |
| 68 | +AFL++ custom mutator integration cannot be configured through command-line |
| 69 | +arguments. Instead, the behavior of the mutator can be controlled via |
| 70 | +environment variables prefixed with ``GRAFL_``. |
| 71 | + |
| 72 | +The following options are currently supported: |
| 73 | + |
| 74 | +* **GRAFL_MAX_DEPTH**: Equivalent to ``--max-depth`` (integer) |
| 75 | +* **GRAFL_MAX_TOKENS**: Equivalent to ``--max-tokens`` (integer) |
| 76 | +* **GRAFL_MEMO_SIZE**: Equivalent to ``--memo-size`` (integer) |
| 77 | +* **GRAFL_RANDOM_MUTATORS**: Enables random mutators; inverse of |
| 78 | + ``--disable-random-mutators`` (boolean; accepts ``1``, ``true``, or ``yes`` |
| 79 | + case-insensitively) |
| 80 | +* **GRAFL_WEIGHTS**: Equivalent to ``--weights`` (path to a JSON file) |
| 81 | +* **GRAFL_MAX_TRIM_STEPS**: Maximum number of mutation steps performed during |
| 82 | + trimming of a single test input (integer) |
| 83 | + |
| 84 | +Verifying the Setup |
| 85 | +------------------- |
| 86 | + |
| 87 | +To run a fuzzing session with AFL++ equipped with Grammarinator, a compiler |
| 88 | +wrapper (e.g., ``afl-clang-fast``) and the ``afl-fuzz`` utility must first be |
| 89 | +obtained. Both can be installed or built by following the instructions in the |
| 90 | +official `AFL++ documentation`_. |
| 91 | + |
| 92 | +Once the target application is compiled with the AFL++ compiler wrapper, the |
| 93 | +required instrumentation is automatically injected into the binary. This |
| 94 | +instrumentation is later used by ``afl-fuzz`` to guide the fuzzing process. |
| 95 | + |
| 96 | +Next, select or create a grammar that describes the expected input format (e.g., |
| 97 | +`HTML grammar`_), then :ref:`build<cpp_compilation>` the required binaries with |
| 98 | +``--grafl``, and optionally also with the ``--generate`` and ``--decode`` flags. |
| 99 | + |
| 100 | +The next step is to prepare an initial tree corpus that serves as the starting |
| 101 | +point for the fuzzing session. One option is to generate this corpus from |
| 102 | +scratch using the :ref:`grammarinator-generate<grammarinator-generate>` |
| 103 | +utility. For example:: |
| 104 | + |
| 105 | + grammarinator-generate-html \ |
| 106 | + -n 100 \ |
| 107 | + -o html-src/%d.html \ |
| 108 | + --population html-trees/ \ |
| 109 | + --keep-trees |
| 110 | + |
| 111 | +Alternatively, an initial tree corpus can be created by converting existing |
| 112 | +source files (e.g., HTML documents) into tree format using the |
| 113 | +:ref:`grammarinator-parse<grammarinator-parse>` utility. For example:: |
| 114 | + |
| 115 | + grammarinator-parse html-src \ |
| 116 | + -o html-trees \ |
| 117 | + -g HTMLLexer.g4 HTMLParser.g4 \ |
| 118 | + --tree-format flatbuffers |
| 119 | + |
| 120 | +To test the integration, run AFL++ in custom-mutator-only mode and point it to |
| 121 | +the generated shared library:: |
| 122 | + |
| 123 | + AFL_CUSTOM_MUTATOR_ONLY=1 \ |
| 124 | + AFL_CUSTOM_MUTATOR_LIBRARY=grammarinator-cxx/build/lib/libgrafl-html.so \ |
| 125 | + afl-fuzz -i html-trees -o outdir -- ./target_app @@ |
| 126 | + |
| 127 | +Setting ``AFL_CUSTOM_MUTATOR_ONLY=1`` is **mandatory**. Without this setting, |
| 128 | +AFL++ would apply its built-in byte-level mutators to the test cases, which |
| 129 | +would corrupt the encoded tree representation used by Grammarinator. |
| 130 | + |
| 131 | +**Note 1:** When using AFL++ with Grammarinator integration, both the input |
| 132 | +and output corpora must be in tree format. Therefore, any existing input corpus |
| 133 | +must first be converted into trees using the |
| 134 | +:ref:`grammarinator-parse<grammarinator-parse>` utility. After the fuzzing |
| 135 | +session, the resulting tree corpus can be converted back into source-level test |
| 136 | +cases using the :ref:`grammarinator-decode<grammarinator-decode-cpp>` utility. |
| 137 | + |
| 138 | +**Note 2:** The items of a tree corpus can be minimized using the ``afl-tmin`` |
| 139 | +tool in a grammar-aware manner by providing the appropriate custom |
| 140 | +mutator-related environment variables. For example:: |
| 141 | + |
| 142 | + AFL_CUSTOM_MUTATOR_ONLY=1 \ |
| 143 | + AFL_CUSTOM_MUTATOR_LIBRARY=grammarinator-cxx/build/lib/libgrafl-html.so \ |
| 144 | + afl-tmin -i html-trees -o html-trimmed -e -- ./target_app @@ |
| 145 | + |
| 146 | +.. _AFL++ documentation: https://aflplus.plus/docs/install/ |
| 147 | +.. _`HTML grammar`: https://github.com/antlr/grammars-v4/tree/master/html |
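
The ``GRAFL_`` variables listed under "Fuzzing Configuration" are read from the same environment as the AFL++ variables, so they can be exported before launching the fuzzer. The following is a minimal sketch of such a setup, not a recommended configuration: the numeric values are illustrative assumptions, ``run_grafl_fuzz`` is a hypothetical helper name, and the paths are the ones used in the examples of the document.

```shell
# Grammarinator mutator settings (values are illustrative assumptions).
export GRAFL_MAX_DEPTH=20          # counterpart of --max-depth
export GRAFL_MAX_TOKENS=500        # counterpart of --max-tokens
export GRAFL_RANDOM_MUTATORS=true  # accepts 1, true, or yes (case-insensitive)

# AFL++ settings taken from the document's own example.
export AFL_CUSTOM_MUTATOR_ONLY=1
export AFL_CUSTOM_MUTATOR_LIBRARY=grammarinator-cxx/build/lib/libgrafl-html.so

# Hypothetical helper: starts the fuzzing session with the environment above.
# It is only defined here, not called, so sourcing this snippet has no side effects.
run_grafl_fuzz() {
    afl-fuzz -i html-trees -o outdir -- ./target_app @@
}
```

Calling ``run_grafl_fuzz`` afterwards starts the session. Keeping the exports separate from the launch command means the same environment can also be reused for the ``afl-tmin`` invocation shown in Note 2.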