Commit 5ba2ffb

Author: Peter Zijlstra

sched: Fix CPU hotplug / tighten is_per_cpu_kthread()
Prior to commit 1cf12e0 ("sched/hotplug: Consolidate task migration on CPU unplug") we'd leave any task on the dying CPU and break affinity and force them off at the very end.

This scheme had to change in order to enable migrate_disable(). One cannot wait for migrate_disable() to complete while stuck in stop_machine(). Furthermore, since we need at the very least: idle, hotplug and stop threads at any point before stop_machine, we can't break affinity and/or push those away.

Under the assumption that all per-cpu kthreads are sanely handled by CPU hotplug, the new code no longer breaks affinity or migrates any of them (which then includes the critical ones above).

However, there's an important difference between per-cpu kthreads and kthreads that happen to have a single CPU affinity which is lost. The latter class very much relies on the forced affinity breaking and migration semantics previously provided.

Use the new kthread_is_per_cpu() infrastructure to tighten is_per_cpu_kthread() and fix the hot-unplug problems stemming from the change.

Fixes: 1cf12e0 ("sched/hotplug: Consolidate task migration on CPU unplug")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Tested-by: Valentin Schneider <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Parent: 975707f

1 file changed (+35 −4 lines)

kernel/sched/core.c
@@ -1796,13 +1796,28 @@ static inline bool rq_has_pinned_tasks(struct rq *rq)
  */
 static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
 {
+	/* When not in the task's cpumask, no point in looking further. */
 	if (!cpumask_test_cpu(cpu, p->cpus_ptr))
 		return false;
 
-	if (is_per_cpu_kthread(p) || is_migration_disabled(p))
+	/* migrate_disabled() must be allowed to finish. */
+	if (is_migration_disabled(p))
 		return cpu_online(cpu);
 
-	return cpu_active(cpu);
+	/* Non kernel threads are not allowed during either online or offline. */
+	if (!(p->flags & PF_KTHREAD))
+		return cpu_active(cpu);
+
+	/* KTHREAD_IS_PER_CPU is always allowed. */
+	if (kthread_is_per_cpu(p))
+		return cpu_online(cpu);
+
+	/* Regular kernel threads don't get to stay during offline. */
+	if (cpu_rq(cpu)->balance_push)
+		return false;
+
+	/* But are allowed during online. */
+	return cpu_online(cpu);
 }
 
 /*
@@ -3121,6 +3136,13 @@ bool cpus_share_cache(int this_cpu, int that_cpu)
 
 static inline bool ttwu_queue_cond(int cpu, int wake_flags)
 {
+	/*
+	 * Do not complicate things with the async wake_list while the CPU is
+	 * in hotplug state.
+	 */
+	if (!cpu_active(cpu))
+		return false;
+
 	/*
 	 * If the CPU does not share cache, then queue the task on the
 	 * remote rqs wakelist to avoid accessing remote data.
@@ -7276,8 +7298,14 @@ static void balance_push(struct rq *rq)
 	/*
 	 * Both the cpu-hotplug and stop task are in this case and are
 	 * required to complete the hotplug process.
+	 *
+	 * XXX: the idle task does not match kthread_is_per_cpu() due to
+	 * histerical raisins.
 	 */
-	if (is_per_cpu_kthread(push_task) || is_migration_disabled(push_task)) {
+	if (rq->idle == push_task ||
+	    ((push_task->flags & PF_KTHREAD) && kthread_is_per_cpu(push_task)) ||
+	    is_migration_disabled(push_task)) {
+
 		/*
 		 * If this is the idle task on the outgoing CPU try to wake
 		 * up the hotplug control thread which might wait for the
@@ -7309,7 +7337,7 @@ static void balance_push(struct rq *rq)
 	/*
 	 * At this point need_resched() is true and we'll take the loop in
 	 * schedule(). The next pick is obviously going to be the stop task
-	 * which is_per_cpu_kthread() and will push this task away.
+	 * which kthread_is_per_cpu() and will push this task away.
 	 */
 	raw_spin_lock(&rq->lock);
 }
@@ -7497,6 +7525,9 @@ int sched_cpu_deactivate(unsigned int cpu)
 	 * preempt-disabled and RCU users of this state to go away such that
 	 * all new such users will observe it.
 	 *
+	 * Specifically, we rely on ttwu to no longer target this CPU, see
+	 * ttwu_queue_cond() and is_cpu_allowed().
+	 *
 	 * Do sync before park smpboot threads to take care the rcu boost case.
 	 */
 	synchronize_rcu();
