Commit 84ace9a

0xB10C and laanwj committed
doc: Add initial USDT documentation
Both added files are extended in the following commits. doc/usdt.md is based on earlier work by laanwj. Co-authored-by: W. J. van der Laan <[email protected]>
1 parent 979f410 commit 84ace9a

2 files changed: +249 -0 lines changed

contrib/tracing/README.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
Example scripts for User-space, Statically Defined Tracing (USDT)
=================================================================

This directory contains scripts showcasing User-space, Statically Defined
Tracing (USDT) support for Bitcoin Core on Linux. For more information on
USDT support in Bitcoin Core see the [USDT documentation].

[USDT documentation]: ../../doc/tracing.md


Examples are provided for the two main eBPF front-ends with USDT support,
[bpftrace] and [BPF Compiler Collection (BCC)]. BCC is used for complex tools
and daemons and `bpftrace` is preferred for one-liners and shorter scripts.

[bpftrace]: https://github.com/iovisor/bpftrace
[BPF Compiler Collection (BCC)]: https://github.com/iovisor/bcc


To develop and run bpftrace and BCC scripts you need to install the
corresponding packages. See [installing bpftrace] and [installing BCC] for more
information. For development, there are a [bpftrace Reference Guide], a
[BCC Reference Guide], and a [bcc Python Developer Tutorial].

[installing bpftrace]: https://github.com/iovisor/bpftrace/blob/master/INSTALL.md
[installing BCC]: https://github.com/iovisor/bcc/blob/master/INSTALL.md
[bpftrace Reference Guide]: https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md
[BCC Reference Guide]: https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md
[bcc Python Developer Tutorial]: https://github.com/iovisor/bcc/blob/master/docs/tutorial_bcc_python_developer.md

## Examples

The bpftrace examples contain a relative path to the `bitcoind` binary. By
default, the scripts should be run from the repository root and assume a
self-compiled `bitcoind` binary. The paths in the examples can be changed, for
example, to point to release builds if needed. See the
[Bitcoin Core USDT documentation] on how to list available tracepoints in your
`bitcoind` binary.

[Bitcoin Core USDT documentation]: ../../doc/tracing.md#listing-available-tracepoints

**WARNING: eBPF programs require root privileges to be loaded into a Linux
kernel VM. This means the bpftrace and BCC examples must be executed with root
privileges. Make sure to carefully review any scripts that you run with root
privileges first!**

doc/tracing.md

Lines changed: 204 additions & 0 deletions
@@ -0,0 +1,204 @@
# User-space, Statically Defined Tracing (USDT) for Bitcoin Core

Bitcoin Core includes statically defined tracepoints to allow for more
observability during development, debugging, code review, and production usage.
These tracepoints make it possible to keep track of custom statistics and
enable detailed monitoring of otherwise hidden internals. They have
little to no performance impact when unused.

```
            eBPF and USDT Overview
            ======================

            ┌──────────────────┐            ┌──────────────┐
            │ tracing script   │            │ bitcoind     │
            │==================│      2.    │==============│
            │  eBPF  │ tracing │      hooks │              │
            │  code  │ logic   │      into┌─┤►tracepoint 1─┼───┐ 3.
            └────┬───┴──▲──────┘          ├─┤►tracepoint 2 │   │ pass args
        1.       │      │ 4.              │ │ ...          │   │ to eBPF
User    compiles │      │ pass data to    │ └──────────────┘   │ program
Space    & loads │      │ tracing script  │                    │
─────────────────┼──────┼─────────────────┼────────────────────┼───
Kernel           │      │                 │                    │
Space       ┌──┬─▼──────┴─────────────────┴────────────┐       │
            │  │  eBPF program                         │◄──────┘
            │  └───────────────────────────────────────┤
            │ eBPF kernel Virtual Machine (sandboxed)  │
            └──────────────────────────────────────────┘

1. The tracing script compiles the eBPF code and loads the eBPF program into a kernel VM
2. The eBPF program hooks into one or more tracepoints
3. When the tracepoint is called, the arguments are passed to the eBPF program
4. The eBPF program processes the arguments and returns data to the tracing script
```

The Linux kernel can hook into the tracepoints during runtime and pass data to
sandboxed [eBPF] programs running in the kernel. These eBPF programs can, for
example, collect statistics or pass data back to user-space scripts for further
processing.

[eBPF]: https://ebpf.io/

The two main eBPF front-ends with support for USDT are [bpftrace] and
[BPF Compiler Collection (BCC)]. BCC is used for complex tools and daemons and
`bpftrace` is preferred for one-liners and shorter scripts. Examples for both can
be found in [contrib/tracing].

[bpftrace]: https://github.com/iovisor/bpftrace
[BPF Compiler Collection (BCC)]: https://github.com/iovisor/bcc
[contrib/tracing]: ../contrib/tracing/

## Tracepoint documentation

The currently available tracepoints are listed here.

## Adding tracepoints to Bitcoin Core

To add a new tracepoint, `#include <util/trace.h>` in the compilation unit where
the tracepoint is inserted. Use one of the `TRACEx` macros listed below
depending on the number of arguments passed to the tracepoint. Up to 12
arguments can be provided. The `context` and `event` specify the names by which
the tracepoint is referred to. Please use `snake_case` and try to make sure that
the tracepoint names make sense even without detailed knowledge of the
implementation details. Do not forget to update the tracepoint list in this
document.

```c
#define TRACE(context, event)
#define TRACE1(context, event, a)
#define TRACE2(context, event, a, b)
#define TRACE3(context, event, a, b, c)
#define TRACE4(context, event, a, b, c, d)
#define TRACE5(context, event, a, b, c, d, e)
#define TRACE6(context, event, a, b, c, d, e, f)
#define TRACE7(context, event, a, b, c, d, e, f, g)
#define TRACE8(context, event, a, b, c, d, e, f, g, h)
#define TRACE9(context, event, a, b, c, d, e, f, g, h, i)
#define TRACE10(context, event, a, b, c, d, e, f, g, h, i, j)
#define TRACE11(context, event, a, b, c, d, e, f, g, h, i, j, k)
#define TRACE12(context, event, a, b, c, d, e, f, g, h, i, j, k, l)
```

For example:

```C++
TRACE6(net, inbound_message,
    pnode->GetId(),
    pnode->GetAddrName().c_str(),
    pnode->ConnectionTypeAsString().c_str(),
    sanitizedType.c_str(),
    msg.data.size(),
    msg.data.data()
);
```
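
How such a macro can map onto a probe is sketched below. This is a minimal
illustration, not the contents of `util/trace.h`: it assumes the SystemTap
`<sys/sdt.h>` header (which provides `DTRACE_PROBE2`) may or may not be
installed, and the `example` context, the `string_seen` event, and the
`CountBytes` function are hypothetical names chosen for the sketch.

```cpp
// Sketch only: a hypothetical stand-in for util/trace.h, not the real header.
#include <cstddef>
#include <string>

#if defined(__linux__) && defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>  // SystemTap SDT header providing DTRACE_PROBEn
#    define SKETCH_HAVE_SDT 1
#  endif
#endif

#ifdef SKETCH_HAVE_SDT
// Header available: forward to the two-argument SDT probe macro.
#define TRACE2(context, event, a, b) DTRACE_PROBE2(context, event, a, b)
#else
// Header missing: the macro expands to nothing and no probe is emitted.
#define TRACE2(context, event, a, b)
#endif

// Hypothetical function carrying a tracepoint with two cheap arguments:
// a pointer to a null-terminated string and its length.
std::size_t CountBytes(const std::string& payload)
{
    TRACE2(example, string_seen, payload.c_str(), payload.size());
    return payload.size();
}
```

Because the fallback macro expands to nothing, the same source should also
build on systems without the SDT header; the function then simply contains no
probe.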

### Guidelines and best practices

#### Clear motivation and use-case
Tracepoints need a clear motivation and use-case. The motivation should
outweigh the impact on, for example, code readability. There is no point in
adding tracepoints that don't end up being used.

#### Provide an example
When adding a new tracepoint, provide an example. Examples can show the use case
and help reviewers test that the tracepoint works as intended. The examples
can be kept simple but should give others a starting point when working with
the tracepoint. See existing examples in [contrib/tracing/].

[contrib/tracing/]: ../contrib/tracing/

#### No expensive computations for tracepoints
Data passed to the tracepoint should be inexpensive to compute. Although the
tracepoint itself only has overhead when enabled, the code to compute arguments
is always run, even if the tracepoint is not used. For example, avoid
serialization and parsing.
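
The cost of argument evaluation can be illustrated with a small sketch. It
assumes a hypothetical stand-in `TRACE1` that, like a compiled-in probe site,
always evaluates its argument. `ProbeSite`, `ExpensiveSerializedSize`, and the
`example:tx_seen` tracepoint are made-up names, not Bitcoin Core code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a compiled-in probe site: it does nothing with
// the value, but the caller still has to compute the argument first.
inline void ProbeSite(std::size_t /*arg*/) {}
#define TRACE1(context, event, a) ProbeSite(a)

// Counts how often the costly helper below actually runs.
int g_serialize_calls = 0;

// Made-up helper standing in for serialization or parsing work.
std::size_t ExpensiveSerializedSize(const std::string& tx)
{
    ++g_serialize_calls;
    return tx.size() * 2;  // pretend the data was hex-encoded
}

// Discouraged: the helper runs on every call, attached tracer or not.
void ProcessCostly(const std::string& tx)
{
    TRACE1(example, tx_seen, ExpensiveSerializedSize(tx));
}

// Preferred: pass a value that is already at hand.
void ProcessCheap(const std::string& tx)
{
    TRACE1(example, tx_seen, tx.size());
}
```

In this sketch, each call to `ProcessCostly` increments `g_serialize_calls`
even though nothing consumes the value, while `ProcessCheap` leaves the counter
untouched.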

#### Semi-stable API
Tracepoints should have a semi-stable API. Users should be able to rely on the
tracepoints for scripting. This means tracepoints need to be documented, and the
argument order ideally should not change. If there is an important reason to
change argument order, make sure to document the change and update the examples
using the tracepoint.

#### eBPF Virtual Machine limits
Keep the eBPF Virtual Machine limits in mind. eBPF programs receiving data from
the tracepoints run in a sandboxed Linux kernel VM. This VM has a limited stack
size of 512 bytes. Before passing larger amounts of data, check that it makes
sense, for example by writing a tracing script that can handle the passed data.

#### `bpftrace` argument limit
While tracepoints can have up to 12 arguments, bpftrace scripts currently only
support reading from the first six arguments (`arg0` through `arg5`) on `x86_64`.
bpftrace also currently lacks real support for handling and printing binary
data, like block header hashes and txids. When a tracepoint passes more than six
arguments, string and integer arguments should preferably be placed in the
first six argument fields. Binary data can be placed in later arguments. BCC
supports reading from all 12 arguments.

#### Strings as C-style strings
Generally, strings should be passed into the `TRACEx` macros as pointers to
C-style strings (a null-terminated sequence of characters). For C++
`std::string`s, [`c_str()`] can be used. It's recommended to document the
maximum expected string size if known.


[`c_str()`]: https://www.cplusplus.com/reference/string/string/c_str/


## Listing available tracepoints

Multiple tools can list the available tracepoints in a `bitcoind` binary with
USDT support.

### GDB - GNU Project Debugger

To list probes in Bitcoin Core, use `info probes` in `gdb`:

```
$ gdb ./src/bitcoind

(gdb) info probes
Type Provider   Name             Where              Semaphore Object
stap net        inbound_message  0x000000000014419e           /src/bitcoind
stap net        outbound_message 0x0000000000107c05           /src/bitcoind
stap validation block_connected  0x00000000002fb10c           /src/bitcoind

```

### With `readelf`

The `readelf` tool can be used to display the USDT tracepoints in Bitcoin Core.
Look for the notes with the description `NT_STAPSDT`.

```
$ readelf -n ./src/bitcoind | grep NT_STAPSDT -A 4 -B 2
Displaying notes found in: .note.stapsdt
  Owner                Data size        Description
  stapsdt              0x0000005d       NT_STAPSDT (SystemTap probe descriptors)
    Provider: net
    Name: outbound_message
    Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000000000
    Arguments: -8@%r12 8@%rbx 8@%rdi 8@192(%rsp) 8@%rax 8@%rdx

```
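
Each entry in the `Arguments` line has the form `size@location`, where the size
is given in bytes and a negative size marks a signed value. The following
sketch splits such a line into its parts. `ProbeArg` and
`ParseStapsdtArguments` are hypothetical names, and the parser assumes that
operands contain no whitespace, which holds for the `x86_64` output above but
may not hold on every architecture.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// One decoded entry of the "Arguments:" field of an NT_STAPSDT note.
struct ProbeArg {
    int size_bytes;        // absolute size of the argument in bytes
    bool is_signed;        // true when the size was written with a leading '-'
    std::string location;  // operand text, e.g. "%r12" or "192(%rsp)"
};

// Splits a string such as "-8@%r12 8@%rbx" into its entries.
// Tokens that do not look like "<number>@<operand>" are skipped.
std::vector<ProbeArg> ParseStapsdtArguments(const std::string& field)
{
    std::vector<ProbeArg> args;
    std::istringstream tokens(field);
    std::string token;
    while (tokens >> token) {
        const std::size_t at = token.find('@');
        if (at == std::string::npos || at == 0) continue;

        // Parse the size prefix and require that all of it was numeric.
        const std::string size_text = token.substr(0, at);
        char* end = nullptr;
        const long size = std::strtol(size_text.c_str(), &end, 10);
        if (*end != '\0') continue;

        args.push_back({static_cast<int>(std::labs(size)), size < 0,
                        token.substr(at + 1)});
    }
    return args;
}
```

Applied to the `Arguments` line shown above, the sketch yields six entries: the
first is a signed 8-byte value in `%r12`, and the fourth is an unsigned 8-byte
value at `192(%rsp)`.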

### With `tplist`

The `tplist` tool is provided by BCC (see [Installing BCC]). It displays kernel
tracepoints or USDT probes and their formats (for more information, see the
[`tplist` usage demonstration]). There are slight binary naming differences
between distributions. For example, on
[Ubuntu the binary is called `tplist-bpfcc`][ubuntu binary].

[Installing BCC]: https://github.com/iovisor/bcc/blob/master/INSTALL.md
[`tplist` usage demonstration]: https://github.com/iovisor/bcc/blob/master/tools/tplist_example.txt
[ubuntu binary]: https://github.com/iovisor/bcc/blob/master/INSTALL.md#ubuntu---binary

```
$ tplist -l ./src/bitcoind -v
b'net':b'outbound_message' [sema 0x0]
  1 location(s)
  6 argument(s)

```
