@kernel-patches-daemon-bpf-rc

Pull request for series with
subject: xdp: Delegate fast path return decision to page_pool
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1020822

This brings back the tasklet-based code so that the benchmark
can run in softirq context.

One additional test is added which benchmarks the
impact of page_pool_napi_local().
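
For context, a minimal sketch of the tasklet pattern the benchmark relies
on (illustrative names, not the bench module's actual code): tasklet
callbacks execute in softirq context, so in_serving_softirq() is true
there and the page_pool direct-recycling fast path becomes reachable.

  #include <linux/module.h>
  #include <linux/interrupt.h>

  static void bench_softirq_run(struct tasklet_struct *t)
  {
  	/* Runs in softirq context, the precondition for page_pool
  	 * direct recycling. A timed page_pool get/put loop would
  	 * go here.
  	 */
  	WARN_ON(!in_serving_softirq());
  }

  static DECLARE_TASKLET(bench_tasklet, bench_softirq_run);

  static int __init bench_init(void)
  {
  	/* Defer the benchmark loop into softirq context. */
  	tasklet_schedule(&bench_tasklet);
  	return 0;
  }

  static void __exit bench_exit(void)
  {
  	tasklet_kill(&bench_tasklet);
  }

  module_init(bench_init);
  module_exit(bench_exit);
  MODULE_LICENSE("GPL");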

Signed-off-by: Dragos Tatulea <[email protected]>
XDP uses the BPF_RI_F_RF_NO_DIRECT flag to mark contexts where it is not
allowed to do direct recycling, even though the direct flag was set by
the caller. This is confusing and can lead to races which are hard to
detect [1].

Furthermore, the page_pool already contains an internal
mechanism which checks if it is safe to switch the direct
flag from off to on.

This patch drops the use of the BPF_RI_F_RF_NO_DIRECT flag and always
calls the page_pool release with the direct flag set to false. The
page_pool will decide if it is safe to do direct recycling. This
is not free, but it is worth it to make the XDP code safer. The
next paragraphs discuss the performance impact.
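
The shape of the change in __xdp_return() for the MEM_TYPE_PAGE_POOL case
looks roughly like this (a simplified sketch based on the description
above; exact identifiers may differ from the tree):

  /* Before: XDP itself decides whether direct recycling is allowed. */
  case MEM_TYPE_PAGE_POOL:
  	if (napi_direct && xdp_return_frame_no_direct())
  		napi_direct = false;
  	page_pool_put_full_netmem(netmem_get_pp(netmem), netmem,
  				  napi_direct);
  	break;

  /* After: always pass allow_direct == false and let the page_pool's
   * internal page_pool_napi_local() check decide whether direct
   * recycling is safe.
   */
  case MEM_TYPE_PAGE_POOL:
  	page_pool_put_full_netmem(netmem_get_pp(netmem), netmem, false);
  	break;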

Performance-wise, there are 3 cases to consider, looking from
__xdp_return() for the MEM_TYPE_PAGE_POOL case:

1) napi_direct == false:
  - Before: 1 comparison in __xdp_return() + call of
    page_pool_napi_local() from page_pool_put_unrefed_netmem().
  - After: Only one call to page_pool_napi_local().

2) napi_direct == true && BPF_RI_F_RF_NO_DIRECT:
  - Before: 2 comparisons in __xdp_return() + call of
    page_pool_napi_local() from page_pool_put_unrefed_netmem().
  - After: Only one call to page_pool_napi_local().

3) napi_direct == true && !BPF_RI_F_RF_NO_DIRECT:
  - Before: 2 comparisons in __xdp_return().
  - After: Only one call to page_pool_napi_local().

Cases 1 & 2 are the slower paths and can only gain from the change.
But they are slow anyway, so the gain is small.

Case 3 is the fast path and is the one that needs closer consideration.
The 2 comparisons in __xdp_return() are traded for the more expensive
page_pool_napi_local() call.
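
For reference, a simplified sketch of the check page_pool_napi_local()
performs (modeled on net/core/page_pool.c; field names may differ across
kernel versions):

  static bool page_pool_napi_local(const struct page_pool *pool)
  {
  	const struct napi_struct *napi;
  	u32 cpuid;

  	/* Direct recycling is never safe outside softirq context. */
  	if (unlikely(!in_softirq()))
  		return false;

  	/* Pool bound to the current CPU? */
  	cpuid = smp_processor_id();
  	if (READ_ONCE(pool->cpuid) == cpuid)
  		return true;

  	/* Otherwise the pool's NAPI must be owned by this CPU. */
  	napi = READ_ONCE(pool->p.napi);
  	return napi && READ_ONCE(napi->list_owner) == cpuid;
  }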

The page_pool benchmark compares the fast path with the newly added
NAPI-aware mode to measure [2] how expensive page_pool_napi_local() is:

  bench_page_pool: time_bench_page_pool01_fast_path(): in_serving_softirq fast-path
  bench_page_pool: Type:tasklet_page_pool01_fast_path Per elem: 15 cycles(tsc) 7.537 ns (step:0)

  bench_page_pool: time_bench_page_pool04_napi_aware(): in_serving_softirq fast-path
  bench_page_pool: Type:tasklet_page_pool04_napi_aware Per elem: 20 cycles(tsc) 10.490 ns (step:0)

... and the slow path for reference:

  bench_page_pool: time_bench_page_pool02_ptr_ring(): in_serving_softirq fast-path
  bench_page_pool: Type:tasklet_page_pool02_ptr_ring Per elem: 30 cycles(tsc) 15.395 ns (step:0)

So the impact on the fast path is small, but not negligible. Keep in
mind that the napi_direct comparisons are also dropped, so the actual
impact will be smaller than what the benchmark measures.

[1] Commit 2b986b9 ("bpf, cpumap: Disable page_pool direct xdp_return need larger scope")
[2] Intel Xeon Platinum 8580

Signed-off-by: Dragos Tatulea <[email protected]>
@kernel-patches-daemon-bpf-rc

Upstream branch: f8c67d8
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1020822
version: 1

@kernel-patches-daemon-bpf-rc

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=1020822 expired. Closing PR.

kernel-patches-daemon-bpf-rc bot deleted the series/1020822=>bpf-next branch on November 9, 2025 at 19:34.