Commit 7be88a8

Frederic Weisbecker authored and Neeraj Upadhyay committed
rcu/nocb: Assert no callbacks while nocb kthread allocation fails
When a NOCB CPU fails to create a nocb kthread on bringup, the CPU is then deoffloaded. The barrier mutex is locked at this stage. It is typically used to protect against concurrent (de-)offloading and/or concurrent rcu_barrier() that would otherwise risk a nocb locking imbalance.

However:

* rcu_barrier() can't run concurrently if it's the boot CPU on early boot-up.
* rcu_barrier() can run concurrently if it's a secondary CPU but it is expected to see 0 callbacks on this target because it's the first time it boots.
* (de-)offloading can't happen concurrently with smp_init(), as rcutorture is initialized later, at least not before device_initcall(), and userspace isn't available yet.
* (de-)offloading can't happen concurrently with cpu_up(), courtesy of cpu_hotplug_lock.

But:

* The lazy shrinker might run concurrently with cpu_up(). It shouldn't try to grab the nocb_lock and risk an imbalance due to lazy_len supposed to be 0 but be extra cautious.
* Also be cautious against resume from hibernation potential subtleties.

So keep the locking and add some assertions and comments.

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
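The "nocb locking imbalance" that the commit message worries about can be illustrated with a small userspace model. This is only a sketch under an assumption: that the lock and unlock helpers each check the offloaded state before touching the lock, so a state change between the two calls leaves the lock held. All names here (`model_rdp`, `model_nocb_lock()`, `model_nocb_unlock()`, `model_critical_section()`) are hypothetical and are not the kernel's.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these are not the kernel's types or helpers. */
struct model_rdp {
	bool offloaded;   /* stands in for the offloaded state of a CPU */
	int lock_depth;   /* stands in for nocb_lock: 1 when held, 0 when free */
};

/* Assumption: the lock is taken only when the CPU is offloaded. */
static void model_nocb_lock(struct model_rdp *rdp)
{
	if (rdp->offloaded)
		rdp->lock_depth++;
}

/* Assumption: the lock is released only when the CPU is offloaded. */
static void model_nocb_unlock(struct model_rdp *rdp)
{
	if (rdp->offloaded)
		rdp->lock_depth--;
}

/*
 * Run one lock/unlock pair. When deoffload_inside is true, the offloaded
 * state is cleared between the two calls, as an unserialized deoffload
 * could do. Returns the leftover lock depth: 0 means the pair was balanced.
 */
static int model_critical_section(struct model_rdp *rdp, bool deoffload_inside)
{
	model_nocb_lock(rdp);
	if (deoffload_inside)
		rdp->offloaded = false;
	model_nocb_unlock(rdp);
	return rdp->lock_depth;
}
```

In this model, a pair that runs without interference returns 0, while a pair interrupted by a deoffload returns 1 because the unlock is skipped. Serializing the deoffload against such lock users, as the barrier mutex does in the commit, is what rules out the second case.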
1 parent ff81428 commit 7be88a8

1 file changed, +11 -3 lines changed

kernel/rcu/tree_nocb.h

Lines changed: 11 additions & 3 deletions
@@ -1442,7 +1442,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 				"rcuog/%d", rdp_gp->cpu);
 		if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) {
 			mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
-			goto end;
+			goto err;
 		}
 		WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
 		if (kthread_prio)
@@ -1454,7 +1454,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	t = kthread_create(rcu_nocb_cb_kthread, rdp,
 			   "rcuo%c/%d", rcu_state.abbr, cpu);
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
-		goto end;
+		goto err;
 
 	if (rcu_rdp_is_offloaded(rdp))
 		wake_up_process(t);
@@ -1467,7 +1467,15 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	WRITE_ONCE(rdp->nocb_cb_kthread, t);
 	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
 	return;
-end:
+
+err:
+	/*
+	 * No need to protect against concurrent rcu_barrier()
+	 * because the number of callbacks should be 0 for a non-boot CPU,
+	 * therefore rcu_barrier() shouldn't even try to grab the nocb_lock.
+	 * But hold barrier_mutex to avoid nocb_lock imbalance from shrinker.
+	 */
+	WARN_ON_ONCE(system_state > SYSTEM_BOOTING && rcu_segcblist_n_cbs(&rdp->cblist));
 	mutex_lock(&rcu_state.barrier_mutex);
 	if (rcu_rdp_is_offloaded(rdp)) {
 		rcu_nocb_rdp_deoffload(rdp);
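
The control flow of the new `err:` path can be modelled in userspace as follows. This is a minimal sketch, not kernel code: every name in it (`model_cpu`, `model_state`, `model_warn_on()`, `model_spawn()`) is hypothetical, the barrier mutex is reduced to a depth counter, and the unlock at the end is assumed because the diff above stops before that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the error path; not kernel code. */
enum model_state { MODEL_BOOTING, MODEL_RUNNING };

struct model_cpu {
	bool offloaded;  /* CPU is configured for callback offloading */
	long n_cbs;      /* queued callbacks, expected to be 0 on first bringup */
};

static int model_barrier_depth;  /* 1 while the modelled barrier mutex is held */
static int model_warnings;       /* number of assertions that fired */

/* Record a warning when cond is true and keep going, loosely like WARN_ON(). */
static bool model_warn_on(bool cond)
{
	if (cond)
		model_warnings++;
	return cond;
}

/* Returns true on success; on failure, deoffloads the CPU and returns false. */
static bool model_spawn(struct model_cpu *cpu, enum model_state state,
			bool create_ok)
{
	if (!create_ok)
		goto err;
	return true;

err:
	/* Past boot, a CPU coming up for the first time should have no callbacks. */
	model_warn_on(state > MODEL_BOOTING && cpu->n_cbs != 0);

	/* The mutex is still taken, to serialize against other lock users. */
	model_barrier_depth++;
	assert(model_barrier_depth == 1);
	if (cpu->offloaded)
		cpu->offloaded = false;
	model_barrier_depth--;  /* assumed unlock; not visible in the diff */
	return false;
}
```

The point of the sketch is that the assertion only documents an expectation: the failure path warns and continues, and the deoffload still runs under the mutex whether or not the warning fired.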
