|
| 1 | +================= |
| 2 | +Allocation Tokens |
| 3 | +================= |
| 4 | + |
| 5 | +.. contents:: |
| 6 | + :local: |
| 7 | + |
| 8 | +Introduction |
| 9 | +============ |
| 10 | + |
| 11 | +Clang provides support for allocation tokens to enable allocator-level heap |
| 12 | +organization strategies. Clang assigns mode-dependent token IDs to allocation |
| 13 | +calls; the runtime behavior depends entirely on the implementation of a |
| 14 | +compatible memory allocator. |
| 15 | + |
| 16 | +Possible allocator strategies include: |
| 17 | + |
| 18 | +* **Security Hardening**: Placing allocations into separate, isolated heap |
| 19 | + partitions. For example, separating pointer-containing types from raw data |
| 20 | + can mitigate exploits that rely on overflowing a primitive buffer to corrupt |
| 21 | + object metadata. |
| 22 | + |
| 23 | +* **Memory Layout Optimization**: Grouping related allocations to improve data |
| 24 | + locality and cache utilization. |
| 25 | + |
| 26 | +* **Custom Allocation Policies**: Applying different management strategies to |
| 27 | + different partitions. |
| 28 | + |
| 29 | +Token Assignment Mode |
| 30 | +===================== |
| 31 | + |
| 32 | +The default mode to calculate tokens is: |
| 33 | + |
| 34 | +* ``typehash``: This mode assigns a token ID based on the hash of the allocated |
| 35 | + type's name. |
| 36 | + |
| 37 | +Other token ID assignment modes are supported, but they may be subject to |
| 38 | +change or removal. These may (experimentally) be selected with ``-mllvm |
| 39 | +-alloc-token-mode=<mode>``: |
| 40 | + |
| 41 | +* ``random``: This mode assigns a statically-determined random token ID to each |
| 42 | + allocation site. |
| 43 | + |
| 44 | +* ``increment``: This mode assigns a simple, incrementally increasing token ID |
| 45 | + to each allocation site. |
| 46 | + |
| 47 | +Allocation Token Instrumentation |
| 48 | +================================ |
| 49 | + |
| 50 | +To enable instrumentation of allocation functions, code can be compiled with |
| 51 | +the ``-fsanitize=alloc-token`` flag: |
| 52 | + |
| 53 | +.. code-block:: console |
| 54 | +
|
| 55 | + % clang++ -fsanitize=alloc-token example.cc |
| 56 | +
|
| 57 | +The instrumentation transforms allocation calls to include a token ID. For |
| 58 | +example: |
| 59 | + |
| 60 | +.. code-block:: c |
| 61 | +
|
| 62 | + // Original: |
| 63 | + ptr = malloc(size); |
| 64 | +
|
| 65 | + // Instrumented: |
| 66 | + ptr = __alloc_token_malloc(size, <token id>); |
| 67 | +
|
| 68 | +The following command-line options affect generated token IDs: |
| 69 | + |
| 70 | +* ``-falloc-token-max=<N>`` |
| 71 | + Configures the maximum number of tokens. There is no maximum by default
| 72 | + (tokens are bounded only by ``SIZE_MAX``).
| 73 | + |
| 74 | + .. code-block:: console |
| 75 | +
|
| 76 | + % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc |
| 77 | +
|
| 78 | +Runtime Interface |
| 79 | +----------------- |
| 80 | + |
| 81 | +A compatible runtime must be provided that implements the token-enabled |
| 82 | +allocation functions. The instrumentation generates calls to functions that |
| 83 | +take a final ``size_t token_id`` argument. |
| 84 | + |
| 85 | +.. code-block:: c |
| 86 | +
|
| 87 | + // C standard library functions |
| 88 | + void *__alloc_token_malloc(size_t size, size_t token_id); |
| 89 | + void *__alloc_token_calloc(size_t count, size_t size, size_t token_id); |
| 90 | + void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id); |
| 91 | + // ... |
| 92 | +
|
| 93 | + // C++ operators (mangled names) |
| 94 | + // operator new(size_t, size_t) |
| 95 | + void *__alloc_token__Znwm(size_t size, size_t token_id); |
| 96 | + // operator new[](size_t, size_t) |
| 97 | + void *__alloc_token__Znam(size_t size, size_t token_id); |
| 98 | + // ... other variants like nothrow, etc., are also instrumented. |
| 99 | +
|
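As an illustration of the interface above, the following is a minimal sketch of a runtime that accepts the token argument. It is not the implementation of any real allocator. The ``sketch_`` names, the partition count, and the modulo mapping are all assumptions made for this example. A production allocator would allocate from a per-partition heap instead of forwarding to the system allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical partition count chosen for this sketch; Clang does not define it. */
#define SKETCH_NUM_PARTITIONS 4

/* Per-partition counters, kept only to make the routing observable. */
static size_t sketch_partition_allocs[SKETCH_NUM_PARTITIONS];

/* Assumed policy: fold the token ID into a partition index with a modulo. */
static size_t sketch_partition_for(size_t token_id) {
  return token_id % SKETCH_NUM_PARTITIONS;
}

/* Hypothetical accessor used to inspect the counters. */
size_t sketch_partition_count(size_t partition) {
  assert(partition < SKETCH_NUM_PARTITIONS);
  return sketch_partition_allocs[partition];
}

/* A real allocator would allocate from the selected partition's heap here.
   This sketch only records the choice and forwards to the system allocator. */
void *__alloc_token_malloc(size_t size, size_t token_id) {
  sketch_partition_allocs[sketch_partition_for(token_id)]++;
  return malloc(size);
}

void *__alloc_token_calloc(size_t count, size_t size, size_t token_id) {
  sketch_partition_allocs[sketch_partition_for(token_id)]++;
  return calloc(count, size);
}

void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id) {
  sketch_partition_allocs[sketch_partition_for(token_id)]++;
  return realloc(ptr, size);
}
```

Memory returned by these functions is released with the ordinary ``free``, because the sketch forwards every request to the system allocator.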
| 100 | +Fast ABI |
| 101 | +-------- |
| 102 | + |
| 103 | +An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``, |
| 104 | +which encodes the token ID hint in the allocation function name. |
| 105 | + |
| 106 | +.. code-block:: c |
| 107 | +
|
| 108 | + void *__alloc_token_0_malloc(size_t size); |
| 109 | + void *__alloc_token_1_malloc(size_t size); |
| 110 | + void *__alloc_token_2_malloc(size_t size); |
| 111 | + ... |
| 112 | + void *__alloc_token_0_Znwm(size_t size); |
| 113 | + void *__alloc_token_1_Znwm(size_t size); |
| 114 | + void *__alloc_token_2_Znwm(size_t size); |
| 115 | + ... |
| 116 | +
|
| 117 | +This ABI provides a more efficient alternative where |
| 118 | +``-falloc-token-max`` is small. |
| 119 | + |
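One way a runtime might provide the per-token entry points is to generate them with a macro, as in the sketch below. This is only an illustration, assuming a build with ``-falloc-token-max=4``. The ``sketch_`` and ``SKETCH_`` names and the shared backend function are hypothetical and not part of Clang or any particular allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed to match a build with -falloc-token-max=4. */
#define SKETCH_MAX_TOKENS 4

/* Records the most recent token, only to make the dispatch observable. */
static size_t sketch_last_token = (size_t)-1;

/* Hypothetical shared backend; a real allocator would select a
   per-partition heap here instead of calling malloc. */
static void *sketch_malloc_from_partition(size_t size, size_t token_id) {
  assert(token_id < SKETCH_MAX_TOKENS);
  sketch_last_token = token_id;
  return malloc(size);
}

/* Hypothetical accessor for the recorded token. */
size_t sketch_get_last_token(void) { return sketch_last_token; }

/* Defines one fast-ABI entry point whose name encodes the token ID. */
#define SKETCH_DEFINE_TOKEN_MALLOC(id)                         \
  void *__alloc_token_##id##_malloc(size_t size) {             \
    return sketch_malloc_from_partition(size, (size_t)(id));   \
  }

SKETCH_DEFINE_TOKEN_MALLOC(0)
SKETCH_DEFINE_TOKEN_MALLOC(1)
SKETCH_DEFINE_TOKEN_MALLOC(2)
SKETCH_DEFINE_TOKEN_MALLOC(3)
```

Because the token is a compile-time constant inside each entry point, a runtime written this way can resolve the partition without passing or decoding an extra argument at the call site.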
| 120 | +Disabling Instrumentation |
| 121 | +------------------------- |
| 122 | + |
| 123 | +To exclude specific functions from instrumentation, you can use the |
| 124 | +``no_sanitize("alloc-token")`` attribute: |
| 125 | + |
| 126 | +.. code-block:: c |
| 127 | +
|
| 128 | + __attribute__((no_sanitize("alloc-token"))) |
| 129 | + void* custom_allocator(size_t size) { |
| 130 | + return malloc(size); // Uses original malloc |
| 131 | + } |
| 132 | +
|
| 133 | +Note: Independent of any given allocator support, the instrumentation aims to |
| 134 | +remain performance neutral. As such, ``no_sanitize("alloc-token")`` |
| 135 | +functions may be inlined into instrumented functions and vice-versa. If |
| 136 | +correctness is affected, such functions should explicitly be marked |
| 137 | +``noinline``. |
| 138 | + |
| 139 | +The ``disable_sanitizer_instrumentation`` attribute is also supported; it
| 140 | +disables this and all other sanitizer instrumentation for the function.
| 141 | + |
| 142 | +Suppressions File (Ignorelist) |
| 143 | +------------------------------ |
| 144 | + |
| 145 | +AllocToken respects the ``src`` and ``fun`` entity types in the |
| 146 | +:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source |
| 147 | +files or functions from instrumentation. |
| 148 | + |
| 149 | +.. code-block:: bash |
| 150 | +
|
| 151 | + [alloc-token] |
| 152 | + # Exclude specific source files |
| 153 | + src:third_party/allocator.c |
| 154 | + # Exclude function name patterns |
| 155 | + fun:*custom_malloc* |
| 156 | + fun:LowLevel::* |
| 157 | +
|
| 158 | +.. code-block:: console |
| 159 | +
|
| 160 | + % clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc |
| 161 | +
|
| 162 | +Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__`` |
| 163 | +----------------------------------------------------------- |
| 164 | + |
| 165 | +In some cases, one may need to execute different code depending on whether |
| 166 | +AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro |
| 167 | +can be used for this purpose. |
| 168 | + |
| 169 | +.. code-block:: c |
| 170 | +
|
| 171 | + #ifdef __SANITIZE_ALLOC_TOKEN__ |
| 172 | + // Code specific to -fsanitize=alloc-token builds |
| 173 | + #endif |