Commit ac796e6

theihor and jordalgo authored
Update HOWTO.md (#99)
Existing howto doc is quite outdated now. Update and add videos. Signed-off-by: Ihor Solodrai <[email protected]> Co-authored-by: Jordan Rome <[email protected]>
1 parent 095865b commit ac796e6

2 files changed

+158
-124
lines changed


HOWTO.md

Lines changed: 153 additions & 115 deletions
@@ -1,16 +1,17 @@
1+
### Disclaimer
2+
3+
Like many other debugging tools, **bpfvv** may help you better understand **what** is happening with the verification of your BPF program but it is up to you to figure out **why** it is happening.
4+
15
# How to use bpfvv
26

3-
> [!WARNING]
4-
> The bpfvv app is in early stages of development, and you should expect
5-
> bugs, UI inconveniences and significant changes from week to week.
6-
>
7-
> If you're working with BPF and you think this tool (or a better
8-
> version of it) would be useful, feel free to use it and don't be shy
9-
> to report issues and request features via github. Thanks!
7+
The tool itself is hosted here: https://libbpf.github.io/bpfvv/
108

11-
Go here: https://libbpf.github.io/bpfvv/
9+
You can load a log by pasting it into the text box or choosing a local file.
1210

13-
Load a log by pasting it into the text box or choosing a file.
11+
You can also use the `url` query parameter to link to a raw log file, for example:
12+
```
13+
https://libbpf.github.io/bpfvv/?url=https://gist.githubusercontent.com/theihor/e0002c119414e6b40e2192bd7ced01b1/raw/866bcc155c2ce848dcd4bc7fd043a97f39a2d370/gistfile1.txt
14+
```
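A link like the one above can also be built programmatically. The following is a minimal sketch, assuming the app reads `url` through standard query-string parsing (so a percent-encoded value is accepted); the `bpfvv_link` helper and the example log URL are hypothetical and not part of bpfvv.

```python
from urllib.parse import urlencode

# Base address of the hosted app, as given in this document.
BPFVV_BASE = "https://libbpf.github.io/bpfvv/"


def bpfvv_link(raw_log_url: str) -> str:
    """Return a bpfvv link that asks the app to load the log at raw_log_url.

    Hypothetical helper: it only assembles the `url` query parameter.
    """
    return BPFVV_BASE + "?" + urlencode({"url": raw_log_url})


if __name__ == "__main__":
    # Placeholder address; substitute a link to a real raw log file.
    print(bpfvv_link("https://example.com/logs/verifier.log"))
```

Percent-encoding the value keeps characters such as `&` or `#` in the log URL from being mistaken for part of the bpfvv URL itself.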
1415

1516
The app expects a BPF verifier log of `BPF_LOG_LEVEL1`[^1]. This is a log
1617
that you get when your BPF program has failed verification on load
@@ -42,14 +43,32 @@ lot of information about the interpreted state of the program on each
4243
instruction. The app parses the log and re-constructs program states
4344
in order to display potentially useful information in an interactive way.
4445

45-
There are two main views of the program:
46-
* (on the left) formatted log, instruction stream
47-
* (on the right) program state: known values of registers and stack slots
48-
<img width="1306" alt="Screenshot 2025-04-25 at 4 19 21 PM" src="https://github.com/user-attachments/assets/ccd9337a-14b0-4c13-afcc-cdfc1b2d46e5" />
46+
## UI overview
47+
48+
There are three main views of the program:
49+
* (on the left) C source view
50+
* (in the middle) interactive instruction stream
51+
* (on the right) program state: known values of registers and stack slots for the *selected log line*
52+
53+
The left and right views are collapsible.
4954

50-
## What's in the log
55+
https://github.com/user-attachments/assets/758d650b-22f1-49f0-ab46-ae1a089667a8
5156

52-
Notice that the displayed text has different content than the raw log.
57+
### Top bar
58+
59+
The top bar contains basic app controls such as:
60+
* clear current log
61+
* load an example log
62+
* load a local file
63+
* link to this howto doc
64+
65+
https://github.com/user-attachments/assets/4d3f8aa0-cb9d-46e0-ae46-a1224c7a5600
66+
67+
### The instruction stream
68+
69+
The main view of the log is the interactive instruction stream.
70+
71+
Notice that the displayed text has content different from the raw log.
5372
For example, consider this line:
5473
```
5574
1: (7b) *(u64 *)(r10 -24) = r2 ; R2_w=1 R10=fp0 fp-24_w=1
@@ -70,157 +89,152 @@ interactive features. Notable example is call instructions.
7089

7190
For example, consider the following raw log line:
7291
```
73-
23: (85) call bpf_map_lookup_elem#1 ; R0=map_value_or_null(id=3,map=eventmap,ks=4,vs=2452)
92+
7: (85) call bpf_probe_read_user#112
7493
```
7594

7695
It is displayed like this:
7796
```
78-
r0 = call bpf_map_lookup_elem#1(r1, r2, r3, r4, r5)
97+
r0 = bpf_probe_read_user(dst: r1, size: r2, unsafe_ptr: r3)
7998
```
8099

100+
If bpfvv is aware of a helper signature, it knows the number and names of arguments and displays them in the format `name: reg`.
101+
For known helpers, the name is also a link to the documentation for that helper.
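The rewriting of a call line can be sketched roughly as follows. This is an illustration only, not bpfvv's actual code: the `HELPER_ARGS` table and the `format_call` function are hypothetical, the argument names for `bpf_probe_read_user` are copied from the example above, and the fallback for unknown helpers (listing `r1` to `r5`) is an assumption.

```python
import re
from typing import Optional

# Hypothetical signature table with a single entry taken from the example above.
HELPER_ARGS = {
    "bpf_probe_read_user": ["dst", "size", "unsafe_ptr"],
}

# Matches raw log lines such as "7: (85) call bpf_probe_read_user#112".
CALL_RE = re.compile(r"^\d+: \(85\) call (\w+)#\d+")


def format_call(raw_line: str) -> Optional[str]:
    """Render a helper call line in the `name: reg` style, or None if not a call."""
    m = CALL_RE.match(raw_line)
    if m is None:
        return None
    name = m.group(1)
    arg_names = HELPER_ARGS.get(name)
    if arg_names is None:
        # Assumed fallback: without a signature, list all five argument registers.
        args = ", ".join(f"r{i}" for i in range(1, 6))
    else:
        # BPF passes helper arguments in r1..r5, in order.
        args = ", ".join(f"{arg}: r{i}" for i, arg in enumerate(arg_names, start=1))
    return f"r0 = {name}({args})"


if __name__ == "__main__":
    print(format_call("7: (85) call bpf_probe_read_user#112"))
```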
102+
81103
Notice also that the lines not recognized by the parser are greyed
82104
out. If you notice an unrecognized instruction, please submit a bug
83105
report.
84106

85-
### Subprogram calls
107+
#### Data dependencies
108+
109+
The app computes a use-def analysis [^2] and you can interactively view dependencies between the instructions.
110+
111+
The concept is simple. Every instruction may read some slots (registers, stack, memory) and write to others.
112+
Knowing these sets, it is possible to determine, for a given slot, where its value came from: which slot, and at which instruction.
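To make the idea concrete, here is a minimal use-def sketch. It is not bpfvv's implementation: the instruction tuples with hand-written read and write sets, and the `find_def` and `dependency_chain` functions, are hypothetical. Stopping the chain when an instruction reads zero or several slots is a simplification chosen for this sketch.

```python
from typing import List, Optional, Set, Tuple

# (instruction text, slots read, slots written); the sets are written by hand here.
Insn = Tuple[str, Set[str], Set[str]]


def find_def(insns: List[Insn], idx: int, slot: str) -> Optional[int]:
    """Index of the most recent instruction before idx that wrote slot, or None."""
    for i in range(idx - 1, -1, -1):
        if slot in insns[i][2]:
            return i
    return None


def dependency_chain(insns: List[Insn], idx: int, slot: str) -> List[int]:
    """Follow definitions upward while the defining instruction reads exactly one slot."""
    chain = []
    while True:
        d = find_def(insns, idx, slot)
        if d is None:
            break
        chain.append(d)
        reads = insns[d][1]
        if len(reads) != 1:
            break  # a constant (no reads) or several candidates: stop here
        slot = next(iter(reads))
        idx = d
    return chain


PROGRAM: List[Insn] = [
    ("r1 = 13", set(), {"r1"}),
    ("r7 = 0", set(), {"r7"}),
    ("r2 = r1", {"r1"}, {"r2"}),
    ("*(u64 *)(r10 -8) = r2", {"r2"}, {"fp-8"}),
]

if __name__ == "__main__":
    # Where did the value in fp-8 come from, looking back from the end?
    print(dependency_chain(PROGRAM, len(PROGRAM), "fp-8"))
```

Here the value in `fp-8` is traced through `r2` and `r1` back to the constant assignment, skipping the unrelated write to `r7`.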
113+
114+
You can view the results of this analysis by clicking on some instruction operands (registers and stack slots).
115+
116+
The selected slot is identified by a box around it. This selection changes the log view, greying out "irrelevant" instructions, and leaving only data-dependent instructions in the foreground.
117+
118+
On the left side of the instruction stream are the lines visualizing the dependencies. The lines are interactive and can be used for navigation.
119+
120+
https://github.com/user-attachments/assets/82ae80d6-314e-47bf-9892-f5dded4b9944
121+
122+
#### Subprogram calls
86123

87124
When there is a subprogram call in the log instruction stream, the
88-
stack frames are tracked by the app when computing state. When subprogram
89-
call is detected there is indentation and comments in the main log view to
90-
visualize it.
125+
stack frames are tracked by the app when computing state. When a subprogram
126+
call is detected it is visualized in the main log view.
91127

92-
<img width="1211" alt="Screenshot 2025-04-25 at 4 35 14 PM" src="https://github.com/user-attachments/assets/3b53abaf-609e-4d6f-b28b-a977016a00c0" />
128+
https://github.com/user-attachments/assets/14b2302e-9814-4d9a-ae94-e176727fd11a
93129

130+
### The state panel
94131

95-
## What can you do?
132+
The state panel displays the current state of the program based on the loaded log, with the current state determined by the line selected in the instruction stream view.
96133

97-
### Step through the instruction stream
134+
Remember that the verifier log is a trace through the program.
135+
This means that a particular instruction may be visited more than once, and the state at the same instruction (but a different point of execution) is usually also different. And so a log line roughly represents a particular point of the program execution, as interpreted by the BPF verifier.
98136

99-
The most basic feature of the visualizer is "stepping" through the
100-
log, similar to what you'd do in a debugger.
137+
The verifier reports changes in the program state like this:
138+
```
139+
1: (7b) *(u64 *)(r10 -24) = r2 ; R2_w=1 R10=fp0 fp-24_w=1
140+
```
141+
After the semicolon `;`, there are expressions showing relevant register and stack slot states. The visualizer accumulates this information from all the prior instructions, and in the state panel this accumulated state is displayed.
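The accumulation described above can be sketched like this. It is a simplified illustration under stated assumptions, not the parser bpfvv uses: `parse_state` and `accumulate` are hypothetical names, state expressions are assumed to be whitespace-separated `name=value` pairs, and the `_w` suffix is simply dropped.

```python
from typing import Dict, List


def parse_state(log_line: str) -> Dict[str, str]:
    """Extract the name=value expressions that follow ';' on an instruction line."""
    if log_line.startswith(";"):
        return {}  # a C source line, not an instruction
    _, sep, tail = log_line.partition(";")
    state: Dict[str, str] = {}
    if not sep:
        return state
    for expr in tail.split():
        name, eq, value = expr.partition("=")
        if not eq:
            continue
        if name.endswith("_w"):
            name = name[:-2]  # drop the marker suffix for this sketch
        state[name.lower()] = value
    return state


def accumulate(lines: List[str]) -> List[Dict[str, str]]:
    """Return, for each log line, the state accumulated from all lines so far."""
    state: Dict[str, str] = {}
    snapshots = []
    for line in lines:
        state.update(parse_state(line))
        snapshots.append(dict(state))
    return snapshots


LOG = [
    "1: (7b) *(u64 *)(r10 -24) = r2 ; R2_w=1 R10=fp0 fp-24_w=1",
    "1800: (b4) w0 = -22 ; R0_w=0xffffffea",
]

if __name__ == "__main__":
    for snapshot in accumulate(LOG):
        print(snapshot)
```

Each snapshot corresponds to what a state panel could show when that log line is selected.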
101142

102-
You can select a line by clicking on it, or by navigating with arrows
103-
(you can also use pgup, pgdown, home and end). The selected line has
104-
light-blue background.
143+
The header of the state panel shows the context of the state: log line number, C line number, program counter (PC) and the stack frame index.
105144

106-
When a line is selected, current state of known values is displayed in
107-
the panel on the right. By moving the selected line up/down the log,
108-
you can see how the values change with each instruction.
145+
The known values of the registers and stack slots are displayed in a table.
109146

110-
In the "state panel", the values that are written by selected
111-
instruction are marked with light-red background and the previous
112-
value is also often displayed, for example:
147+
The background color of a row in the state panel indicates that the relevant value has been affected by the selected instruction.
148+
Rows marked with a red background indicate a "write" and the previous value is also often displayed, for example:
113149
```
114150
r6 scalar(id=1) -> 0
115151
```
116-
Means that current instruction changes the value of `r6` from
117-
`scalar(id=1)` to `0`.
152+
This means that the current instruction changes the value of `r6` from `scalar(id=1)` to `0`.
118153

119-
The values that are read by current instruction have light-green
120-
background.
154+
The values that are read by the current instruction have a blue background.
121155

122-
Note that for "update" instructions (such as `r1 += 8`), the slot
123-
will be marked as written.
156+
Note that for "update" instructions (such as `r1 += 8`), the slot will be marked as written.
124157

125-
#### Sometimes a value of a slot has changed, but it's not highlighted as a write. Is that a bug?
158+
This then allows you to "step through" the instruction stream and watch how the values are changing, similar to classic debugger UIs.
159+
You can click on the lines that interest you, or use arrow keys to navigate.
126160

127-
Currently the visualizer only considers writes derived from the instructions
128-
themselves. For example, `r1 = r2` is a write by definition, or a call would
129-
scratch some registers.
161+
https://github.com/user-attachments/assets/c6b5b5b1-30fb-4309-a90a-1832a0a33502
130162

131-
But remember that we are looking at the BPF verifier log. BPF verifier
132-
simulates execution of a program, which requires maintaining and continuously
133-
updating a virtual state of the program. This means that whenever the verifier
134-
gains some knowledge about a value (which is not necesarily a write instruction),
135-
it will update it.
163+
#### The rows in the state panel are clickable!
136164

137-
For example when processing conditional jumps such as `if (r2 == 0) goto pc+6`,
138-
the verifier usually explores both branches. But in both cases it gained information
139-
about r2: it's either 0 or not. And so while there was no explicit write into r2,
140-
it's value is known (and has changed) after the jump instruction, when you look at
141-
it in the verifier log.
165+
It is sometimes useful to jump to the source of a particular slot value from the selected instruction, even if the slot is not relevant to that instruction.
142166

143-
Going forward the visualizer will likely treat all value updates as writes,
144-
as it is useful to know at what point verifier inferred a particular value.
167+
https://github.com/user-attachments/assets/8f5d03cc-54a5-426b-8428-c8b11f4ccf11
145168

146-
### View data dependencies
169+
### The C source view
147170

148-
The app computes a use-def analysis [^2] and you can interactively
149-
view dependencies between the instructions.
171+
The C source view panel (on the left) shows reconstructed C source lines.
150172

151-
The concept is simple. Every instruction may read some slots
152-
(registers, stack, memory) and write to others. Knowing these sets
153-
(verifier log contains enough information to compute them), it is
154-
possible to determine for a slot used by current instruction, where
155-
its value came from (from what slot in what instruction).
173+
A raw verifier log might contain source line information, and bpfvv attempts to reconstruct the source code and associate it with the instructions.
174+
Here is how it looks in the raw log:
175+
```
176+
1800: R1=scalar() R10=fp0
177+
; int rb_print(rbtree_t __arg_arena *rbtree) @ rbtree.bpf.c:507
178+
1800: (b4) w0 = -22 ; R0_w=0xffffffea
179+
; if (unlikely(!rbtree)) @ rbtree.bpf.c:517
180+
1801: (15) if r1 == 0x0 goto pc+132 ; R1=scalar(umin=1)
181+
```
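Associating such source comments with instructions can be sketched as follows. This is an assumption-laden illustration rather than bpfvv's parser: `map_pc_to_source` is a hypothetical function, and the two regular expressions only cover the line shapes shown in the excerpt above.

```python
import re
from typing import Dict, List, Tuple

# "; <C text> @ <file>:<line>" as seen in the raw log excerpt.
SRC_RE = re.compile(r"^; (.*) @ (\S+):(\d+)$")
# "<pc>: (<opcode>) ..." instruction lines.
INSN_RE = re.compile(r"^(\d+): \([0-9a-f]{2}\) ")


def map_pc_to_source(lines: List[str]) -> Dict[int, Tuple[str, int]]:
    """Map each instruction's pc to the (file, line) of the nearest preceding source comment."""
    current = None
    mapping: Dict[int, Tuple[str, int]] = {}
    for line in lines:
        src = SRC_RE.match(line)
        if src:
            current = (src.group(2), int(src.group(3)))
            continue
        insn = INSN_RE.match(line)
        if insn and current is not None:
            mapping[int(insn.group(1))] = current
    return mapping


RAW_LOG = [
    "1800: R1=scalar() R10=fp0",
    "; int rb_print(rbtree_t __arg_arena *rbtree) @ rbtree.bpf.c:507",
    "1800: (b4) w0 = -22 ; R0_w=0xffffffea",
    "; if (unlikely(!rbtree)) @ rbtree.bpf.c:517",
    "1801: (15) if r1 == 0x0 goto pc+132 ; R1=scalar(umin=1)",
]

if __name__ == "__main__":
    print(map_pc_to_source(RAW_LOG))
```

Because the log is a trace, the same pc can appear more than once; this sketch keeps only the last association seen.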
156182

157-
You can view the results of this analysis by clicking on some
158-
instruction operands (registers and stack slots).
183+
Of course, the original source code is not available in the log, so bpfvv doesn't have enough information to even format it properly.
159184

160-
The selected slot is identified by a box. This selection changes the
161-
log view, greying out "irrelevant" instructions, and leaving only
162-
data-dependent instructions in the foreground.
163-
<img width="762" alt="Screenshot 2025-04-25 at 4 50 11 PM" src="https://github.com/user-attachments/assets/7cfc0109-6c8c-4a94-a9b5-37af9a4e877a" />
185+
However, it allows you to see a rough correspondence between BPF instructions and the original C source code.
164186

165-
#### What's clickable?
187+
Be aware though that this information is noisy and may be inaccurate, since it takes a long route to reach the visualizer:
188+
* the compiler generated DWARF with line info, which is already "best-effort"
189+
* DWARF was transformed into BTF with line data
190+
* BTF was processed by the verifier and available information was dumped interleaved with the program trace
166191

167-
Registers r0-r9 and explicit stack accesses such as `*(u32 *)(r10 -8)`.
192+
https://github.com/user-attachments/assets/3e8c52f0-3823-4d5f-abbd-f7c2d8e31d19
168193

169-
r10 (stack frame pointer) is not clickable because it's effectively a
170-
constant [^3].
194+
### The bottom panel
171195

172-
Note that the stack slots may be accessed indirectly: if say `r6 = fp-64`
173-
and then you do `*(u32 *)(r6 -8)` it's equivalent to `*(u32 *)(r10 -72)`.
174-
The visualizer does not show such dependencies (yet). Although state values
175-
are tracked correctly.
196+
The bottom panel shows the original log text for the selected line and for the currently hovered line.
197+
It is sometimes useful to check the source of the information displayed by the visualizer.
176198

177-
#### How deep is the displayed dependency chain?
178199

179-
It depends, but usually not deep.
200+
## Not frequently asked questions
180201

181-
The problem with showing all dependencies is that it's too much
182-
information, which renders it useless.
202+
### What exactly do "read" and "written" values mean here?
183203

184-
Currently the upstream instruction is highlighted if it's an
185-
unambiguous dependency. For example:
186-
```
187-
42: r1 = 13
188-
43: r7 = 0
189-
44: r2 = r1
190-
```
204+
Here are a couple of trivial examples:
205+
* `r1 = 0` this is a write to `r1`
206+
* `r2 = r3` this is a read of `r3` and write to `r2`
207+
* `r2 += 1` this is a read of `r2` and write to `r2`, aka an update
191208

192-
Instruction 42 is an unambiguous dependency of instruction 44, because
193-
r1 is the only read slot, and there were no modifications to it along
194-
the way.
209+
Here are a couple of more complicated examples:
210+
* `*(u64 *)(r10 -32) = r1` this is a read of `r1` and a write to `fp-32`
211+
* `r10` is effectively constant[^3], as it is always a pointer to the top of a BPF stack frame, so stores to `r10-offset` are writes to the stack slots, identified by `fp-off` or `fp[frame]-off` in the visualizer
212+
* `r1 = *(u64 *)(r2 -8)` this is a write to `r1` and a read of `r2`, however it may also be a read of the stack, if `r2` happens to contain a pointer to the stack slot
195213

196-
All such direct dependencies up the chain are shown.
214+
Most instructions have intrinsic "read" and "write" sets, defined by their semantics. However, context also matters, as you can see from the last example.
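The intrinsic sets for the simpler examples above can be derived from the instruction text alone. The sketch below is hypothetical (`rw_sets` is not a bpfvv function) and deliberately covers only register assignments, register updates and direct stores through `r10`; loads such as `r1 = *(u64 *)(r2 -8)` are rejected because classifying them needs the context discussed above.

```python
import re
from typing import Set, Tuple

# Direct store to the stack through r10, e.g. "*(u64 *)(r10 -32) = r1".
STORE_RE = re.compile(r"\*\(u\d+ \*\)\(r10 -(\d+)\) = (r\d+)")
# Register assignment or update, e.g. "r1 = 0", "r2 = r3", "r2 += 1".
ALU_RE = re.compile(r"(r\d+) (\S*=) (\S+)")
REG_RE = re.compile(r"r\d+")


def rw_sets(insn: str) -> Tuple[Set[str], Set[str]]:
    """Return (reads, writes) for a small subset of instruction shapes."""
    m = STORE_RE.fullmatch(insn)
    if m:
        # r10 is treated as a constant, so only the source register is read.
        return {m.group(2)}, {f"fp-{m.group(1)}"}
    m = ALU_RE.fullmatch(insn)
    if m is None:
        raise ValueError(f"unsupported instruction in this sketch: {insn}")
    dst, op, src = m.groups()
    reads: Set[str] = set()
    if op != "=":
        reads.add(dst)  # an update such as += reads its destination first
    if REG_RE.fullmatch(src):
        reads.add(src)
    return reads, {dst}


if __name__ == "__main__":
    for text in ["r1 = 0", "r2 = r3", "r2 += 1", "*(u64 *)(r10 -32) = r1"]:
        print(text, rw_sets(text))
```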
197215

198-
However, when more than one value is read in the upstream instruction,
199-
the UI will stop highlighting at that instruction.
216+
The visualizer takes into account a few important things when determining data dependencies:
217+
* it is aware of scratch and callee-saved register semantics of subprogram/helper calls
218+
* it is aware of the stack frames: we enter new stack memory in a subprogram, and pop back on exit
219+
* it is aware of indirect stack slot access and basic pointer arithmetic
200220

201-
Consider an example:
202-
```
203-
42: r1 = r2
204-
43: r3 = *(u32 *)(r10 -16)
205-
44: r1 += r3
206-
45: *(u32 *)(r10 -64) = r1
207-
```
221+
### Side effects?
208222

209-
If you select `r1` at instruction 45, only instruction 44 will be
210-
highlighted, even though 42 and 43 are its transitive dependencies
211-
(`r1 += r3` reads both `r1` and `r3`).
223+
One counterintuitive thing about data dependencies in the context of BPF verification is that instructions which don't do any arithmetic or memory stores can still change the program state.
212224

213-
The reason for this UI behavior is that showing all dependencies (both
214-
r1 and r3 and in turn all their dependencies) may very quickly cover
215-
most of the instructions. This is especially true for call
216-
instructions, which read up to 5 registers.
225+
Remember, we are looking at the BPF verifier log.
226+
The BPF verifier simulates the execution of a program, which requires maintaining a virtual state of the program.
227+
This means that whenever the verifier gains some knowledge about a value (which is not necessarily an intrinsic write instruction), it will update the program state.
217228

218-
On the other hand the app can't know what the user is looking for, and
219-
there is no point in guessing. So, for an instruction like `r1 += r3`,
220-
the user must choose specific operand (r1 or r3 in this case) to
221-
expand the dependency chain further.
229+
For example, when processing conditional jumps such as `if (r2 == 0) goto pc+6`,
230+
the verifier usually explores both branches. But in both cases it gained information
231+
about `r2`: it's either 0 or not. And so while there was no explicit write into r2,
232+
its value is known (and has changed) after the jump instruction, when you look at
233+
it in the verifier log.
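The effect of a conditional jump on the tracked state can be modeled with a toy example. This is only a sketch: `explore_jeq` is a hypothetical function, and the value strings (such as `not 0`) are invented for illustration and are not the format the verifier prints. The jump target is computed as `pc + 1 + offset`, since BPF jump offsets are relative to the next instruction.

```python
import re
from typing import Dict, Tuple

# Accepts both "if (r2 == 0) goto pc+6" and "if r2 == 0 goto pc+6".
JEQ_RE = re.compile(r"if \(?(r\d+) == (\w+)\)? goto pc\+(\d+)")

State = Dict[str, str]


def explore_jeq(pc: int, insn: str, state: State) -> Tuple[Tuple[int, State], Tuple[int, State]]:
    """Return (next pc, refined state) for the taken and the fall-through branch."""
    m = JEQ_RE.fullmatch(insn)
    if m is None:
        raise ValueError(f"not an equality jump: {insn}")
    reg, imm, off = m.group(1), m.group(2), int(m.group(3))
    # Taken branch: the register is now known to equal the immediate.
    taken = {**state, reg: imm}
    # Fall-through: the register is known to differ (toy notation).
    fallthrough = {**state, reg: f"not {imm}"}
    return (pc + 1 + off, taken), (pc + 1, fallthrough)


if __name__ == "__main__":
    print(explore_jeq(10, "if (r2 == 0) goto pc+6", {"r2": "scalar()"}))
```

No instruction wrote to `r2`, yet both resulting states carry more knowledge about it than the state before the jump.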
222234

223-
#### Note on memory stores and loads
235+
https://github.com/user-attachments/assets/94d271e2-f033-439b-8554-d9f8a66b4143
236+
237+
### What if we write to memory or a BPF arena?
224238

225239
Currently non-stack memory access is a "black hole" from the point of
226240
view of use-def analysis in this app. The reason is that it's
@@ -235,6 +249,28 @@ dependencies. If you see `*(u32 *)(r8 +0)` down the instruction
235249
stream, even if value of r8 hasn't changed, the analysis does not
236250
recognize these slots as "the same".
237251

252+
**Unless** `r8` contains a pointer to a stack slot.
253+
In that case you can click both on the register to see where its value came from, and on the dereference expression to see where the stack slot value came from.
254+
255+
https://github.com/user-attachments/assets/f345ec63-b91d-411c-b1d2-3890ed8f1c99
256+
257+
### An instruction is highlighted as dependency, but I don't understand why. Is that a bug?
258+
259+
Probably not[^4].
260+
261+
The visualizer has a single source of information: the verifier log.
262+
The log contains two streams of information: the instructions and the associated state change, as reported by the verifier.
263+
264+
Some of the state that the visualizer computes is derived from the instructions themselves.
265+
However, the state reported by the verifier always takes precedence.
266+
267+
Since the values in the context of the visualizer are just strings, if the verifier reported a slightly different string, we treat it as an update.
268+
For example, you might see something like this:
269+
```
270+
r8 ptr_or_null_node_data(id=9,ref_obj_id=9,off=16) -> ptr_node_data(ref_obj_id=9,off=16)
271+
```
272+
273+
The verifier reported a different value, and that's what bpfvv shows.
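The string-based comparison described above can be sketched in a few lines. The `describe_updates` function is hypothetical and only illustrates the rule "a different string is an update"; it says nothing about how bpfvv stores its state internally.

```python
from typing import Dict


def describe_updates(previous: Dict[str, str], reported: Dict[str, str]) -> Dict[str, str]:
    """Return slots whose reported string differs from the previously known one."""
    updates = {}
    for slot, new in reported.items():
        old = previous.get(slot)
        if old != new:
            # Any textual difference counts as an update, even a purely cosmetic one.
            updates[slot] = new if old is None else f"{old} -> {new}"
    return updates


if __name__ == "__main__":
    before = {"r8": "ptr_or_null_node_data(id=9,ref_obj_id=9,off=16)"}
    after = {"r8": "ptr_node_data(ref_obj_id=9,off=16)"}
    print(describe_updates(before, after))
```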
238274

239275
## Footnotes
240276

@@ -247,3 +283,5 @@ the browser will not be happy to render it.
247283
[^2]: https://en.wikipedia.org/wiki/Use-define_chain
248284

249285
[^3]: https://docs.cilium.io/en/latest/reference-guides/bpf/architecture/
286+
287+
[^4]: But maybe yes... If you suspect a bug, please report it.

README.md

Lines changed: 5 additions & 9 deletions
@@ -1,21 +1,17 @@
1-
> [!WARNING]
2-
> The bpfvv app is in early stages of development, and you should expect
3-
> bugs, UI inconveniences and significant changes from week to week.
4-
>
5-
> If you're working with BPF and you think this tool (or a better
6-
> version of it) would be useful, feel free to use it and don't be shy
7-
> to report issues and request features via github. Thanks!
8-
91
[![CI](https://github.com/libbpf/bpfvv/actions/workflows/ci.yml/badge.svg)](https://github.com/libbpf/bpfvv/actions/workflows/ci.yml)
102

113
**bpfvv** stands for BPF Verifier Visualizer
124

135
https://libbpf.github.io/bpfvv/
146

15-
This project is an experiment about visualizing Linux Kernel BPF verifier log to help BPF programmers with debugging verification failures.
7+
BPF Verifier Visualizer is a tool to analyze Linux Kernel BPF verifier logs.
8+
9+
The goal of bpfvv is to help BPF programmers debug verification failures.
1610

1711
The user can load a text file, and the app will attempt to parse it as a verifier log. Successfully parsed lines produce a state which is then visualized in the UI. You can think of this as a primitive debugger UI, except it interprets a log and not a runtime state of a program.
1812

13+
For more information on how to use **bpfvv**, see [HOWTO.md](https://github.com/libbpf/bpfvv/blob/master/HOWTO.md).
14+
1915
## Development
2016

2117
- Fork the website repo: https://github.com/libbpf/bpfvv.git
