Conversation

@pinoOgni
Contributor

This PR is a proposal to try to find a more uniform way to print ebpf debug information.

I only modified a few debug prints as a test and wrote the bpf-print-format.md file to document the proposal.

I'm open to suggestions.

@pinoOgni pinoOgni requested a review from a team as a code owner November 27, 2025 09:44
@codecov

codecov bot commented Nov 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 55.85%. Comparing base (b4c3b17) to head (cf8fcf3).

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main     #941      +/-   ##
==========================================
+ Coverage   47.13%   55.85%   +8.71%
==========================================
  Files         255      255
  Lines       22139    22139
==========================================
+ Hits        10436    12366    +1930
+ Misses      11041     8935    -2106
- Partials      662      838     +176
```
| Flag | Coverage Δ |
| --- | --- |
| integration-test | 22.86% <ø> (?) |
| integration-test-arm | 0.00% <ø> (?) |
| integration-test-vm-${ARCH}-${KERNEL_VERSION} | 0.00% <ø> (?) |
| k8s-integration-test | 2.67% <ø> (?) |
| oats-test | 0.00% <ø> (?) |
| unittests | 47.13% <ø> (ø) |

Flags with carried forward coverage won't be shown.


@marctc
Contributor

marctc commented Nov 27, 2025

Is there a way to avoid passing __FUNCTION__ all the time? In the logs you changed, it's almost always there; I wonder if it could be part of the logger function itself.

@pinoOgni
Contributor Author

Is there a way to avoid passing __FUNCTION__ all the time? In the logs you changed, it's almost always there; I wonder if it could be part of the logger function itself.

Hi @marctc I hope I understand the question. I pass __FUNCTION__ so that the name of the current function is printed. I don't know of any other way for the logger function to print the name of the caller.

Contributor

@grcevski grcevski left a comment

This is a good idea; I'd propose that we think a bit more about how to do this so it has the best outcome. At the moment the log line is limited to 80 chars, since we send a lot of them and we want to capture as many as possible. I'm afraid that if we prefix the line with the function name, we'll end up losing valuable information, which is the values we keep.

@grcevski
Contributor

Another thing to watch out for is that kernel 5.15 and lower support only up to 3 arguments in the format, so adding one more will cause us to go to separate lines for many of our prints.

Perhaps the best way to do this is to ensure that all probes have printed the function name on all entry points.

@pinoOgni
Contributor Author

Another thing to watch out for is that kernel 5.15 and lower support only up to 3 arguments in the format, so adding one more will cause us to go to separate lines for many of our prints.

There's a section at the end of the document about this problem. The solutions I could think of were splitting or hard-coding the function name.

Perhaps the best way to do this is to ensure that all probes have printed the function name on all entry points.

Yes, but that's not enough. If there are two printouts with the same (or similar) message in two different probes, we could have a mix-up. The only way to ensure uniqueness is for each printout to print the function name.

I had also thought about adding an additional argument to some functions so that the probe name would be passed and then printed, but it seemed like too much.

@mmat11
Contributor

mmat11 commented Nov 27, 2025

This is a good idea; I'd propose that we think a bit more about how to do this so it has the best outcome. At the moment the log line is limited to 80 chars, since we send a lot of them and we want to capture as many as possible. I'm afraid that if we prefix the line with the function name, we'll end up losing valuable information, which is the values we keep.

Good point. I thought about this and wondered why we send data to userspace via a ringbuffer instead of reading directly from the trace pipe. That just requires tracefs/debugfs to be mounted, but I think debug will only be enabled in dev/test environments, so it should be ok.

That will have benefits such as:

  • halve the number of objects we compile
  • remove the limitation of 80 bytes per log event
  • keep the instruction count at a similar level to prod builds, since we won't need to use bpf_dbg_helper anymore
  • make the code simpler

Are there things I'm not taking into consideration for this approach?

@marctc
Contributor

marctc commented Nov 27, 2025

requires tracefs/debugfs to be mounted but I think debug will only be enabled in dev/test environments so it should be ok.

I don't know how tracefs/debugfs works, but sometimes we ask users to enable BPF logging to investigate bugs regardless of the environment they use to run OBI.

Contributor

@rafaelroquetto rafaelroquetto left a comment

Nice, I think this will definitely help with the current state of the onion.

In addition to my inline comments, I think we need some sort of linter to validate this (for the cases which cannot be enforced by code), otherwise people will eventually and unintentionally deviate from these rules.

```c
if (invocation) {
bpf_dbg_printk(
    "Found Go HTTP invocation, resetting the span id to %x%x", tcp->seq, tcp->ack);
bpf_dbg_printk("%s: found Go HTTP invocation, resetting the span id to seq=%x, ack=%x",
```
Contributor

I believe this is a mistake - the span id here is indeed meant to be the combination of %x%x, i.e. the concatenation of seq and ack.

Contributor Author

oops, thanks!

Addressing issues like logging multiple variables without a separator (e.g., avoiding `%x%x`). Example:

```c
bpf_dbg_printk("%s: found HTTP info, resetting the span id to seq=%x, ack=%x", __FUNCTION__, tcp->seq, tcp->ack);
```
Contributor

I get the point, but this is a bad example (see previous comment)

Contributor Author

yes, thanks

At the beginning of an eBPF probe (not every probe, as it can be too verbose!), print the function name in the following way:

```c
bpf_dbg_printk("=== %s ===", __FUNCTION__);
```
Contributor

we could wrap this in a macro, so that one simply writes:

bpf_dbg_log_func()


#### Function name in generic functions

To print the function name in a generic function, use the **`__FUNCTION__`** identifier in **every log statement** without the bounding **`===`**. The triple equals signs are reserved exclusively for entry points of eBPF probes. Example:
Contributor

I see the value here, but there's an issue as others have pointed out: some kernels only allow at most 3 arguments, and now we are "wasting" one with __FUNCTION__ - this has been a problem before, and you will notice that because of that some bpf_dbg_printk statements had to be split into at least 2 as a workaround.

This is why I've explicitly opted to spell out the function name as part of the format string rather than using __FUNCTION__ in tpinjector.c and friends. I think it's a fair trade-off, given we are dealing with eBPF, but YMMV.

Now, if we want something like that anyway, we could probably have a macro like this:

```c
#define bpf_func_printk(fmt, ...) \
    bpf_printk("%s: " fmt, __func__, ##__VA_ARGS__)
```

(or wrap around bpf_dbg_printk, I haven't tried with that).

If we do that, it should be a separate macro, whilst bpf_dbg_printk remains as is, giving the caller the flexibility to pick - think of it as a convenience macro.

It can happen, sometimes, that the number of arguments to print is 3, but when `__FUNCTION__` is included, the total reaches 4. In these cases, `bpf_dbg_printk` calls `bpf_printk`, which subsequently calls `___bpf_pick_printk`. This chain then selects and calls `__bpf_vprintk` (for 4+ arguments), which finally invokes `bpf_trace_vprintk`. Since `bpf_trace_vprintk` is only available starting in [kernel version 5.16](https://docs.ebpf.io/linux/helper-function/bpf_trace_vprintk/), two options are available to avoid errors when loading the eBPF program:
1. Split the print into two separate calls: one with one argument and one with two arguments, where `__FUNCTION__` is added to both.
Contributor

So, in light of this comment, perhaps the way is to simply provide the convenience macro above - makes it flexible.

@rafaelroquetto
Contributor

Good point. I thought about this and wondered why we send data to userspace via a ringbuffer instead of reading directly from the trace pipe. That just requires tracefs/debugfs to be mounted, but I think debug will only be enabled in dev/test environments, so it should be ok.

As @marctc pointed out, this is because 99.9% of the logs end users can provide us come from the usual OBI logs - people don't know how, or don't have access, to trace_pipe.

It also means log lines can sometimes get lost when the ring buffer is under pressure, but that's the nature of it.

@mmat11
Contributor

mmat11 commented Nov 27, 2025

I don't know how tracefs/debugfs works, but sometimes we ask users to enable BPF logging to investigate bugs regardless of the environment they use to run OBI.

As @marctc pointed out, this is because 99.9% of the logs end users can provide us come from the usual OBI logs - people don't know how, or don't have access, to trace_pipe.

Could mounting the host's /sys in the helm chart work?

@grcevski
Contributor

I don't know how tracefs/debugfs works, but sometimes we ask users to enable BPF logging to investigate bugs regardless of the environment they use to run OBI.

As @marctc pointed out, this is because 99.9% of the logs end users can provide us come from the usual OBI logs - people don't know how, or don't have access, to trace_pipe.

Could mounting the host's /sys in the helm chart work?

It might, but in a sense it's not always enough. For some platforms like GKE Autopilot, we need to explicitly declare that access as something we require, which reminds me I need to follow up on the update about the cgroup. Also, sometimes the trace pipe might be held by another program, although it's rare.

@grcevski
Contributor

grcevski commented Nov 27, 2025

Yes, but that's not enough. If there are two printouts with the same (or similar) message in two different probes, we could have a mix-up. The only way to ensure uniqueness is for each printout to print the function name.

I had also thought about adding an additional argument to some functions so that the probe name would be passed and then printed, but it seemed like too much.

I know, how about this? What if we changed bpf_dbg_printk to add the line for __FUNCTION__ automatically, but not do it for the userspace log shipping? Can we do macro magic here, assuming we keep the rule that we only do this if snprintf is available, which will also remove the 3-argument max?

@rafaelroquetto
Contributor

@mmat11 in addition to what @grcevski said, we removed the dependency on bpffs because it was not possible to mount it without privileged mode and messing with the k8s/docker AppArmor - I suspect the same could be true for debugfs and anything under /sys.

In my experience, we tend to know which customers have access to the node/host, and those few select ones are indeed able to provide us with trace_pipe output - for those who cannot, debugfs would fall into the same bucket, and we'd still need to live with the actual logs from stdout/userspace instead.

Some people literally only have access to logs from a web interface (e.g. Grafana Dashboard/logs) and they end up pasting that. Or sometimes granting us access to their web UI where we can check the logs ourselves, so regardless of trace_pipe and friends, we still need to be able to see logs coming via the ring buffer (and it sucks they get truncated).

@mmat11
Contributor

mmat11 commented Nov 27, 2025

@mmat11 in addition to what @grcevski said, we removed the dependency on bpffs because it was not possible to mount it without privileged mode and messing with the k8s/docker AppArmor - I suspect the same could be true for debugfs and anything under /sys.

In my experience, we tend to know which customers have access to the node/host, and those few select ones are indeed able to provide us with trace_pipe output - for those who cannot, debugfs would fall into the same bucket, and we'd still need to live with the actual logs from stdout/userspace instead.

Some people literally only have access to logs from a web interface (e.g. Grafana Dashboard/logs) and they end up pasting that. Or sometimes granting us access to their web UI where we can check the logs ourselves, so regardless of trace_pipe and friends, we still need to be able to see logs coming via the ring buffer (and it sucks they get truncated).

I see - I think my proposal is not viable for now, then. What about using dynamic log sizes, similar to large buffer events? It would at least remove the "truncated" limit.

@rafaelroquetto
Contributor

What about using dynamic log sizes, similar to large buffer events? It would at least remove the "truncated" limit.

@mmat11 I think that's a great idea, and probably the way to go.

One thing along those lines to consider (not sure if it's a good idea): if we want true structured logging, that can be achieved in this fashion - imagine something like this:

```c
obi_bpf_log("found TP", "tp", tp, "pid", pid);
```

this gets serialised into a "log event" struct that gets shipped to userspace, which can then decide to unpack it as JSON or as a line (similar to TracePrinter).

but I digress
