bpf: Fix RCU stall in bpf_fd_array_map_clear()#11594

Open
kernel-patches-daemon-bpf[bot] wants to merge 1 commit into bpf_base from series/1074913=>bpf

@kernel-patches-daemon-bpf

Pull request for series with
subject: bpf: Fix RCU stall in bpf_fd_array_map_clear()
version: 3
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1074913

@kernel-patches-daemon-bpf

Upstream branch: c369299
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1074913
version: 3

@kernel-patches-daemon-bpf

Upstream branch: dbf00d8
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1074913
version: 3

@kernel-patches-daemon-bpf

Upstream branch: a8502a7
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1074913
version: 3

Add a missing cond_resched() to the loop in bpf_fd_array_map_clear().

For PROG_ARRAY maps with many entries, this loop calls
prog_array_map_poke_run() for each entry, which can be expensive.
Without a yield point, the loop can trigger RCU stalls under load:

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu: 	(detected by 0, t=6502 jiffies, g=729293, q=305 ncpus=1)
  rcu: All QSes seen, last rcu_preempt kthread activity 6502 (4295096514-4295090012), jiffies_till_next_fqs=1, root ->qsmask 0x0
  rcu: rcu_preempt kthread starved for 6502 jiffies! g729293 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
  rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
  rcu: RCU grace-period kthread stack dump:
  task:rcu_preempt     state:R  running task     stack:0     pid:15    tgid:15    ppid:2      task_flags:0x208040 flags:0x00004000
  Call Trace:
   <TASK>
   context_switch kernel/sched/core.c:5382 [inline]
   __schedule+0x697/0x1430 kernel/sched/core.c:6767
   __schedule_loop kernel/sched/core.c:6845 [inline]
   schedule+0x10a/0x3e0 kernel/sched/core.c:6860
   schedule_timeout+0x145/0x2c0 kernel/time/sleep_timeout.c:99
   rcu_gp_fqs_loop+0x255/0x1350 kernel/rcu/tree.c:2046
   rcu_gp_kthread+0x347/0x680 kernel/rcu/tree.c:2248
   kthread+0x465/0x880 kernel/kthread.c:464
   ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:153
   ret_from_fork_asm+0x19/0x30 arch/x86/entry/entry_64.S:245
   </TASK>
  rcu: Stack dump where RCU GP kthread last ran:
  CPU: 0 UID: 0 PID: 30932 Comm: kworker/0:2 Not tainted 6.14.0-13195-g967e8def1100 #2 PREEMPT(undef)
  Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  Workqueue: events prog_array_map_clear_deferred
  RIP: 0010:write_comp_data+0x38/0x90 kernel/kcov.c:246
  Call Trace:
   <TASK>
   prog_array_map_poke_run+0x77/0x380 kernel/bpf/arraymap.c:1096
   __fd_array_map_delete_elem+0x197/0x310 kernel/bpf/arraymap.c:925
   bpf_fd_array_map_clear kernel/bpf/arraymap.c:1000 [inline]
   prog_array_map_clear_deferred+0x119/0x1b0 kernel/bpf/arraymap.c:1141
   process_one_work+0x898/0x19d0 kernel/workqueue.c:3238
   process_scheduled_works kernel/workqueue.c:3319 [inline]
   worker_thread+0x770/0x10b0 kernel/workqueue.c:3400
   kthread+0x465/0x880 kernel/kthread.c:464
   ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:153
   ret_from_fork_asm+0x19/0x30 arch/x86/entry/entry_64.S:245
   </TASK>

Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Fixes: da765a2 ("bpf: Add poke dependency tracking for prog array maps")
Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
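
The shape of the fix described above can be illustrated with a small user-space sketch. This is not the kernel patch itself: `struct fd_array_sketch`, `yield_point()`, `delete_elem()`, and `fd_array_clear()` are hypothetical names invented for illustration, and `sched_yield()` only approximates what `cond_resched()` does inside the kernel. The placement of the yield (once per iteration, after the delete) is an assumption about one plausible form of the change.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a PROG_ARRAY-style map: each slot holds an opaque pointer. */
struct fd_array_sketch {
	size_t max_entries;
	void **slots;
};

/* User-space approximation of a voluntary yield; the kernel patch uses cond_resched(). */
void yield_point(void)
{
	sched_yield();
}

/*
 * Hypothetical stand-in for the per-entry delete. In the kernel this step can be
 * expensive because it runs the poke logic for each entry.
 * Returns 1 if the slot was populated, 0 otherwise.
 */
int delete_elem(struct fd_array_sketch *arr, size_t i)
{
	if (arr->slots[i] == NULL)
		return 0;
	arr->slots[i] = NULL;
	return 1;
}

/*
 * Clear every slot, offering the scheduler a chance to run after each entry
 * so that a large map does not monopolize the CPU.
 * Returns the number of populated slots that were cleared.
 */
size_t fd_array_clear(struct fd_array_sketch *arr)
{
	size_t i, cleared = 0;

	assert(arr != NULL);
	for (i = 0; i < arr->max_entries; i++) {
		cleared += (size_t)delete_elem(arr, i);
		yield_point(); /* assumed placement: one yield per iteration */
	}
	return cleared;
}
```

Calling `fd_array_clear()` on an array with two populated slots out of four would clear both and yield four times. Yielding on every iteration keeps the loop simple at the cost of a cheap check per entry.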
@kernel-patches-daemon-bpf

Upstream branch: e2d072d
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1074913
version: 3
