|
| 1 | +# User-space, Statically Defined Tracing (USDT) for Bitcoin Core |
| 2 | + |
| 3 | +Bitcoin Core includes statically defined tracepoints to allow for more |
| 4 | +observability during development, debugging, code review, and production usage. |
| 5 | +These tracepoints make it possible to keep track of custom statistics and |
| 6 | +enable detailed monitoring of otherwise hidden internals. They have |
| 7 | +little to no performance impact when unused. |
| 8 | + |
| 9 | +``` |
| 10 | +eBPF and USDT Overview |
| 11 | +====================== |
| 12 | + |
| 13 | + ┌──────────────────┐ ┌──────────────┐ |
| 14 | + │ tracing script │ │ bitcoind │ |
| 15 | + │==================│ 2. │==============│ |
| 16 | + │ eBPF │ tracing │ hooks │ │ |
| 17 | + │ code │ logic │ into┌─┤►tracepoint 1─┼───┐ 3. |
| 18 | + └────┬───┴──▲──────┘ ├─┤►tracepoint 2 │ │ pass args |
| 19 | + 1. │ │ 4. │ │ ... │ │ to eBPF |
| 20 | + User compiles │ │ pass data to │ └──────────────┘ │ program |
| 21 | + Space & loads │ │ tracing script │ │ |
| 22 | + ─────────────────┼──────┼─────────────────┼────────────────────┼─── |
| 23 | + Kernel │ │ │ │ |
| 24 | + Space ┌──┬─▼──────┴─────────────────┴────────────┐ │ |
| 25 | + │ │ eBPF program │◄──────┘ |
| 26 | + │ └───────────────────────────────────────┤ |
| 27 | + │ eBPF kernel Virtual Machine (sandboxed) │ |
| 28 | + └──────────────────────────────────────────┘ |
| 29 | + |
| 30 | +1. The tracing script compiles the eBPF code and loads the eBPF program into a kernel VM |
| 31 | +2. The eBPF program hooks into one or more tracepoints |
| 32 | +3. When the tracepoint is called, the arguments are passed to the eBPF program |
| 33 | +4. The eBPF program processes the arguments and returns data to the tracing script |
| 34 | +``` |
| 35 | + |
| 36 | +The Linux kernel can hook into the tracepoints during runtime and pass data to |
| 37 | +sandboxed [eBPF] programs running in the kernel. These eBPF programs can, for |
| 38 | +example, collect statistics or pass data back to user-space scripts for further |
| 39 | +processing. |
| 40 | + |
| 41 | +[eBPF]: https://ebpf.io/ |
| 42 | + |
| 43 | +The two main eBPF front-ends with support for USDT are [bpftrace] and |
| 44 | +[BPF Compiler Collection (BCC)]. BCC is used for complex tools and daemons and |
| 45 | +`bpftrace` is preferred for one-liners and shorter scripts. Examples for both can |
| 46 | +be found in [contrib/tracing]. |
| 47 | + |
| 48 | +[bpftrace]: https://github.com/iovisor/bpftrace |
| 49 | +[BPF Compiler Collection (BCC)]: https://github.com/iovisor/bcc |
| 50 | +[contrib/tracing]: ../contrib/tracing/ |
| 51 | + |
| 52 | +## Tracepoint documentation |
| 53 | + |
| 54 | +The currently available tracepoints are listed here. |
| 55 | + |
| 56 | +## Adding tracepoints to Bitcoin Core |
| 57 | + |
| 58 | +To add a new tracepoint, `#include <util/trace.h>` in the compilation unit where |
| 59 | +the tracepoint is inserted. Use one of the `TRACEx` macros listed below |
| 60 | +depending on the number of arguments passed to the tracepoint. Up to 12 |
| 61 | +arguments can be provided. The `context` and `event` specify the names by which |
| 62 | +the tracepoint is referred to. Please use `snake_case` and try to make sure that |
| 63 | +the tracepoint names make sense even without detailed knowledge of the |
| 64 | +implementation details. Do not forget to update the tracepoint list in this |
| 65 | +document. |
| 66 | + |
| 67 | +```c |
| 68 | +#define TRACE(context, event) |
| 69 | +#define TRACE1(context, event, a) |
| 70 | +#define TRACE2(context, event, a, b) |
| 71 | +#define TRACE3(context, event, a, b, c) |
| 72 | +#define TRACE4(context, event, a, b, c, d) |
| 73 | +#define TRACE5(context, event, a, b, c, d, e) |
| 74 | +#define TRACE6(context, event, a, b, c, d, e, f) |
| 75 | +#define TRACE7(context, event, a, b, c, d, e, f, g) |
| 76 | +#define TRACE8(context, event, a, b, c, d, e, f, g, h) |
| 77 | +#define TRACE9(context, event, a, b, c, d, e, f, g, h, i) |
| 78 | +#define TRACE10(context, event, a, b, c, d, e, f, g, h, i, j) |
| 79 | +#define TRACE11(context, event, a, b, c, d, e, f, g, h, i, j, k) |
| 80 | +#define TRACE12(context, event, a, b, c, d, e, f, g, h, i, j, k, l) |
| 81 | +``` |
| 82 | + |
| 83 | +For example: |
| 84 | + |
| 85 | +```C++ |
| 86 | +TRACE6(net, inbound_message, |
| 87 | + pnode->GetId(), |
| 88 | + pnode->GetAddrName().c_str(), |
| 89 | + pnode->ConnectionTypeAsString().c_str(), |
| 90 | + sanitizedType.c_str(), |
| 91 | + msg.data.size(), |
| 92 | + msg.data.data() |
| 93 | +); |
| 94 | +``` |
| 95 | + |
| 96 | +### Guidelines and best practices |
| 97 | + |
| 98 | +#### Clear motivation and use-case |
| 99 | +Tracepoints need a clear motivation and use-case. The motivation should |
| 100 | +outweigh the impact on, for example, code readability. There is no point in |
| 101 | +adding tracepoints that don't end up being used. |
| 102 | + |
| 103 | +#### Provide an example |
| 104 | +When adding a new tracepoint, provide an example. Examples can show the use case |
| 105 | +and help reviewers test that the tracepoint works as intended. The examples |
| 106 | +can be kept simple but should give others a starting point when working with |
| 107 | +the tracepoint. See existing examples in [contrib/tracing/]. |
| 108 | + |
| 109 | +[contrib/tracing/]: ../contrib/tracing/ |
| 110 | + |
| 111 | +#### No expensive computations for tracepoints |
| 112 | +Data passed to the tracepoint should be inexpensive to compute. Although the |
| 113 | +tracepoint itself only has overhead when enabled, the code to compute arguments |
| 114 | +is always run, even if the tracepoint is not used. For example, avoid |
| 115 | +serialization and parsing. |
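
The cost difference can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `EXAMPLE_TRACE1` is only a stand-in that evaluates and discards its argument, which is assumed here to approximate a compiled-in tracepoint with no tracer attached. It is not the macro from `util/trace.h`, and `SerializeToHex`, `HandleExpensive`, and `HandleCheap` are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts how often the expensive helper below runs.
static int g_serialize_calls = 0;

// Stand-in for a TRACEx macro (assumption, not the util/trace.h definition):
// it evaluates its argument and then discards it.
#define EXAMPLE_TRACE1(context, event, a) do { (void)(a); } while (0)

// Hypothetical expensive helper: builds a hex string from the data.
std::string SerializeToHex(const std::vector<unsigned char>& data)
{
    ++g_serialize_calls;
    static const char* digits = "0123456789abcdef";
    std::string out;
    for (unsigned char byte : data) {
        out.push_back(digits[byte >> 4]);
        out.push_back(digits[byte & 0x0f]);
    }
    return out;
}

// Discouraged: the serialization runs on every call, whether or not
// anything is listening to the tracepoint.
void HandleExpensive(const std::vector<unsigned char>& data)
{
    EXAMPLE_TRACE1(example, data_seen, SerializeToHex(data).c_str());
}

// Preferred: pass values that already exist, such as the size.
void HandleCheap(const std::vector<unsigned char>& data)
{
    EXAMPLE_TRACE1(example, data_seen, data.size());
}
```

Calling `HandleCheap` leaves `g_serialize_calls` untouched, while each call to `HandleExpensive` increments it. If the formatted value is really needed, one option is to pass the raw data and do the formatting in the tracing script.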
| 116 | + |
| 117 | +#### Semi-stable API |
| 118 | +Tracepoints should have a semi-stable API. Users should be able to rely on the |
| 119 | +tracepoints for scripting. This means tracepoints need to be documented, and the |
| 120 | +argument order ideally should not change. If there is an important reason to |
| 121 | +change argument order, make sure to document the change and update the examples |
| 122 | +using the tracepoint. |
| 123 | + |
| 124 | +#### eBPF Virtual Machine limits |
| 125 | +Keep the eBPF Virtual Machine limits in mind. eBPF programs receiving data from |
| 126 | +the tracepoints run in a sandboxed Linux kernel VM. This VM has a limited stack |
| 127 | +size of 512 bytes. Before passing larger amounts of data, check that doing so |
| 128 | +makes sense, for example, by writing a tracing script that can handle the data. |
| 129 | + |
| 130 | +#### `bpftrace` argument limit |
| 131 | +While tracepoints can have up to 12 arguments, bpftrace scripts currently only |
| 132 | +support reading from the first six arguments (`arg0` through `arg5`) on `x86_64`. |
| 133 | +bpftrace currently lacks real support for handling and printing binary data, |
| 134 | +like block header hashes and txids. When a tracepoint passes more than six |
| 135 | +arguments, string and integer arguments should preferably be placed in the |
| 136 | +first six argument fields. Binary data can be placed in later arguments. BCC |
| 137 | +supports reading from all 12 arguments. |
| 139 | +#### Strings as C-style strings |
| 140 | +Generally, strings should be passed into the `TRACEx` macros as pointers to |
| 141 | +C-style strings (a null-terminated sequence of characters). For C++ |
| 142 | +`std::string`s, [`c_str()`] can be used. It's recommended to document the |
| 143 | +maximum expected string size if known. |
| 144 | + |
| 145 | + |
| 146 | +[`c_str()`]: https://www.cplusplus.com/reference/string/string/c_str/ |
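
As a minimal sketch of passing a `std::string` as a C-style string, the snippet below uses hypothetical names throughout. `EXAMPLE_TRACE1` and `ExampleProbe` are stand-ins invented for illustration, not the macros from `util/trace.h`, and `GetPeerName` and `TracePeer` are likewise made up. The stand-in measures the string with `strlen`, which only works because the string is null-terminated.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Length of the last string seen by the stand-in probe below.
static std::size_t g_last_len = 0;

// Stand-in probe (assumption): reads the string up to its terminating null
// byte, so it depends on the argument being a valid C-style string.
inline void ExampleProbe(const char* s)
{
    assert(s != nullptr);
    g_last_len = std::strlen(s);
}
#define EXAMPLE_TRACE1(context, event, a) ExampleProbe(a)

// Hypothetical getter that returns a std::string by value.
std::string GetPeerName() { return "peer-1.example:8333"; }

void TracePeer()
{
    // The temporary std::string lives until the end of the full expression,
    // so the pointer returned by c_str() is still valid when the stand-in
    // probe reads it.
    EXAMPLE_TRACE1(example, peer_seen, GetPeerName().c_str());
}
```

Storing the result of `c_str()` in a variable and using it in a later statement would leave a dangling pointer if the `std::string` is a temporary, so the call is best kept inside the macro invocation.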
| 147 | + |
| 148 | + |
| 149 | +## Listing available tracepoints |
| 150 | + |
| 151 | +Multiple tools can list the available tracepoints in a `bitcoind` binary with |
| 152 | +USDT support. |
| 153 | + |
| 154 | +### GDB - GNU Project Debugger |
| 155 | + |
| 156 | +To list probes in Bitcoin Core, use `info probes` in `gdb`: |
| 157 | + |
| 158 | +``` |
| 159 | +$ gdb ./src/bitcoind |
| 160 | +… |
| 161 | +(gdb) info probes |
| 162 | +Type Provider Name Where Semaphore Object |
| 163 | +stap net inbound_message 0x000000000014419e /src/bitcoind |
| 164 | +stap net outbound_message 0x0000000000107c05 /src/bitcoind |
| 165 | +stap validation block_connected 0x00000000002fb10c /src/bitcoind |
| 166 | +… |
| 167 | +``` |
| 168 | + |
| 169 | +### With `readelf` |
| 170 | + |
| 171 | +The `readelf` tool can be used to display the USDT tracepoints in Bitcoin Core. |
| 172 | +Look for the notes with the description `NT_STAPSDT`. |
| 173 | + |
| 174 | +``` |
| 175 | +$ readelf -n ./src/bitcoind | grep NT_STAPSDT -A 4 -B 2 |
| 176 | +Displaying notes found in: .note.stapsdt |
| 177 | + Owner Data size Description |
| 178 | + stapsdt 0x0000005d NT_STAPSDT (SystemTap probe descriptors) |
| 179 | + Provider: net |
| 180 | + Name: outbound_message |
| 181 | + Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000000000 |
| 182 | + Arguments: -8@%r12 8@%rbx 8@%rdi 8@192(%rsp) 8@%rax 8@%rdx |
| 183 | +… |
| 184 | +``` |
| 185 | + |
| 186 | +### With `tplist` |
| 187 | + |
| 188 | +The `tplist` tool is provided by BCC (see [Installing BCC]). It displays kernel |
| 189 | +tracepoints or USDT probes and their formats (for more information, see the |
| 190 | +[`tplist` usage demonstration]). There are slight binary naming differences |
| 191 | +between distributions. For example, on |
| 192 | +[Ubuntu the binary is called `tplist-bpfcc`][ubuntu binary]. |
| 193 | + |
| 194 | +[Installing BCC]: https://github.com/iovisor/bcc/blob/master/INSTALL.md |
| 195 | +[`tplist` usage demonstration]: https://github.com/iovisor/bcc/blob/master/tools/tplist_example.txt |
| 196 | +[ubuntu binary]: https://github.com/iovisor/bcc/blob/master/INSTALL.md#ubuntu---binary |
| 197 | + |
| 198 | +``` |
| 199 | +$ tplist -l ./src/bitcoind -v |
| 200 | +b'net':b'outbound_message' [sema 0x0] |
| 201 | + 1 location(s) |
| 202 | + 6 argument(s) |
| 203 | +… |
| 204 | +``` |