@kernel-patches-daemon-bpf-rc

Pull request for series with
subject: bpf: Optimize recursion detection for arm64
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1019553

@kernel-patches-daemon-bpf-rc

Upstream branch: 11369e6
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1019553
version: 1

@kernel-patches-daemon-bpf-rc

Upstream branch: efa4756
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1019553
version: 1

@kernel-patches-daemon-bpf-rc

Upstream branch: b3387b3
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1019553
version: 1

@kernel-patches-daemon-bpf-rc

Upstream branch: 4cb4897
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1019553
version: 1

BPF programs detect recursion using a per-CPU active flag in struct
bpf_prog. The trampoline sets and clears this flag with atomic
operations to prevent inter-context recursion.

Some arm64 platforms, for example the Neoverse V2, have slow per-CPU
atomic operations. This commit therefore changes the recursion detection
mechanism to allow four levels of recursion (normal -> softirq -> hardirq
-> NMI). Allowing this limited recursion removes the need for atomic
operations. The approach is similar to get_recursion_context() in perf.

Change active to a per-CPU array of four u8 values, one for each context,
and use non-atomic increment/decrement on them.
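
As a rough userspace illustration of the scheme described above (not the
patch itself), the sketch below keeps one u8 counter per context level and
uses plain increments and decrements. All names here (ctx_level,
prog_active, prog_enter, prog_exit) are hypothetical, and a single struct
stands in for one CPU's slot of the per-CPU array. It assumes, as the
existing trampoline does, that the exit path always runs, even when entry
reports recursion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the kernel's identifiers. */
enum ctx_level { CTX_NORMAL, CTX_SOFTIRQ, CTX_HARDIRQ, CTX_NMI, CTX_MAX };

/* Stands in for one CPU's slot of the per-CPU "active" array. */
struct prog_active {
    uint8_t level[CTX_MAX];
};

/*
 * Non-atomic increment of the counter for the current context.
 * Returns true if the program may run, false if it is already
 * active in this same context (recursion).
 */
static bool prog_enter(struct prog_active *a, enum ctx_level ctx)
{
    return ++a->level[ctx] == 1;
}

/* Non-atomic decrement; called on every exit, including rejected entries. */
static void prog_exit(struct prog_active *a, enum ctx_level ctx)
{
    a->level[ctx]--;
}
```

In this sketch a nested context only touches its own counter and restores
it before returning, so the interrupted context never sees a torn update.
That is the property that lets the increments be non-atomic, at the cost
of letting the program run once per context level instead of once per CPU.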

This improves performance on arm64 (64-CPU Neoverse-N1):

 +----------------+-------------------+-------------------+---------+
 |    Benchmark   |     Base run      |   Patched run     |  Δ (%)  |
 +----------------+-------------------+-------------------+---------+
 | fentry         |  3.694 ± 0.003M/s |  3.828 ± 0.007M/s | +3.63%  |
 | fexit          |  1.389 ± 0.006M/s |  1.406 ± 0.003M/s | +1.22%  |
 | fmodret        |  1.366 ± 0.011M/s |  1.398 ± 0.002M/s | +2.34%  |
 | rawtp          |  3.453 ± 0.026M/s |  3.714 ± 0.003M/s | +7.56%  |
 | tp             |  2.596 ± 0.005M/s |  2.699 ± 0.006M/s | +3.97%  |
 +----------------+-------------------+-------------------+---------+

 Benchmarked using: tools/testing/selftests/bpf/benchs/run_bench_trigger.sh

Signed-off-by: Puranjay Mohan <[email protected]>

@kernel-patches-daemon-bpf-rc

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=1019553 expired. Closing PR.
