
Commit bed54ae

Authored and committed by Alexei Starovoitov
Merge branch 'Wait for busy refill_work when destroying bpf memory allocator'
Hou Tao says:

====================
From: Hou Tao <[email protected]>

Hi,

This patchset fixes a problem with bpf memory allocator destruction on a PREEMPT_RT kernel, or on a kernel where arch_irq_work_has_interrupt() is false (e.g. a 1-cpu arm32 host or mips). The root cause is that refill_work may still be busy while the allocator is being destroyed, which can trigger an oops or other problems, as shown in patch #1.

Patch #1 fixes the problem by waiting for the completion of the irq work during destruction, and patch #2 is a clean-up on top of patch #1. Please see the individual patches for more details. Comments are always welcome.

Change Log:

v2:
* patch 1: fix typos and add notes about the overhead of irq_work_sync()
* patch 1 & 2: add Acked-by tags from [email protected]

v1: https://lore.kernel.org/bpf/[email protected]/T/#t
====================

Signed-off-by: Alexei Starovoitov <[email protected]>
2 parents dbe69b2 + fa4447c commit bed54ae
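
The shape of the fix is the usual irq_work teardown rule: before freeing a structure that embeds an irq_work, wait for a possibly in-flight callback with irq_work_sync(), because on PREEMPT_RT (or when arch_irq_work_has_interrupt() is false) the callback runs from a per-CPU kthread or a timer interrupt and may still be executing when destruction starts. Below is a minimal sketch of that pattern; the names (my_cache, my_refill, my_cache_destroy) are hypothetical and this is not the memalloc.c code itself.

#include <linux/container_of.h>
#include <linux/irq_work.h>

struct my_cache {
        struct irq_work refill_work;    /* queued from the allocation path */
        /* free lists, counters, ... */
};

static void my_refill(struct irq_work *work)
{
        struct my_cache *c = container_of(work, struct my_cache, refill_work);

        /* Refill c's free lists. Depending on the configuration this runs
         * in hard irq context, from a timer interrupt, or in a per-CPU RT
         * kthread, so it may still be running when destruction begins.
         */
}

static void my_cache_init(struct my_cache *c)
{
        init_irq_work(&c->refill_work, my_refill);
        /* the allocation path would later call irq_work_queue(&c->refill_work) */
}

static void my_cache_destroy(struct my_cache *c)
{
        /* Wait for a queued or still-running my_refill() before the lists
         * it touches are drained and freed; without this the callback can
         * race with (or outlive) the teardown, e.g. on PREEMPT_RT.
         */
        irq_work_sync(&c->refill_work);
        /* drain and free c's lists here */
}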

File tree: 1 file changed (+16, -2 lines)


kernel/bpf/memalloc.c

Lines changed: 16 additions & 2 deletions
@@ -418,14 +418,17 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
         /* No progs are using this bpf_mem_cache, but htab_map_free() called
          * bpf_mem_cache_free() for all remaining elements and they can be in
          * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
+         *
+         * Except for waiting_for_gp list, there are no concurrent operations
+         * on these lists, so it is safe to use __llist_del_all().
          */
         llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
                 free_one(c, llnode);
         llist_for_each_safe(llnode, t, llist_del_all(&c->waiting_for_gp))
                 free_one(c, llnode);
-        llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist))
+        llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist))
                 free_one(c, llnode);
-        llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
+        llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist_extra))
                 free_one(c, llnode);
 }

@@ -493,6 +496,16 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
                 rcu_in_progress = 0;
                 for_each_possible_cpu(cpu) {
                         c = per_cpu_ptr(ma->cache, cpu);
+                        /*
+                         * refill_work may be unfinished for PREEMPT_RT kernel
+                         * in which irq work is invoked in a per-CPU RT thread.
+                         * It is also possible for kernel with
+                         * arch_irq_work_has_interrupt() being false and irq
+                         * work is invoked in timer interrupt. So waiting for
+                         * the completion of irq work to ease the handling of
+                         * concurrency.
+                         */
+                        irq_work_sync(&c->refill_work);
                         drain_mem_cache(c);
                         rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
                 }
@@ -507,6 +520,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
                         cc = per_cpu_ptr(ma->caches, cpu);
                         for (i = 0; i < NUM_CACHES; i++) {
                                 c = &cc->cache[i];
+                                irq_work_sync(&c->refill_work);
                                 drain_mem_cache(c);
                                 rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
                         }
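
The new comment in drain_mem_cache() leans on the llist naming convention: llist_del_all() detaches the whole list with an atomic exchange and so tolerates concurrent access (here, an in-flight RCU callback may still touch waiting_for_gp), while the double-underscore __llist_del_all() is a plain read-and-clear that is only correct when the caller is the sole user of the list. A simplified sketch of the two variants, modeled on include/linux/llist.h rather than copied verbatim, with illustrative example_ names:

#include <linux/llist.h>

/* Atomic variant: safe when other contexts may add to or detach the list. */
static inline struct llist_node *example_llist_del_all(struct llist_head *head)
{
        return xchg(&head->first, NULL);
}

/* Non-atomic variant: only valid when the caller is the sole accessor, as in
 * drain_mem_cache() for free_by_rcu, free_llist and free_llist_extra once
 * refill_work has been synced.
 */
static inline struct llist_node *example___llist_del_all(struct llist_head *head)
{
        struct llist_node *first = head->first;

        head->first = NULL;
        return first;
}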
