
Commit 49bef33

Valentin Schneider authored and Peter Zijlstra committed
sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
John reported that push_rt_task() can end up invoking
find_lowest_rq(rq->curr) when curr is not an RT task (in this case a CFS
one), which causes mayhem down convert_prio().

This can happen when current gets demoted to e.g. CFS when releasing an
rt_mutex, and the local CPU gets hit with an rto_push_work irqwork before
getting the chance to reschedule. Exactly who triggers this work isn't
entirely clear to me - switched_from_rt() only invokes rt_queue_pull_task()
if there are no RT tasks on the local RQ, which means the local CPU can't
be in the rto_mask.

My current suspected sequence is something along the lines of the below,
with the demoted task being current.

  mark_wakeup_next_waiter()
    rt_mutex_adjust_prio()
      rt_mutex_setprio() // deboost originally-CFS task
        check_class_changed()
          switched_from_rt() // Only rt_queue_pull_task() if !rq->rt.rt_nr_running
          switched_to_fair() // Sets need_resched
        __balance_callbacks() // if pull_rt_task(), tell_cpu_to_push() can't select local CPU per the above
        raw_spin_rq_unlock(rq)

        // need_resched is set, so task_woken_rt() can't
        // invoke push_rt_tasks(). Best I can come up with is
        // local CPU has rt_nr_migratory >= 2 after the demotion, so stays
        // in the rto_mask, and then:

        <some other CPU running rto_push_irq_work_func() queues rto_push_work on this CPU>
          push_rt_task()
            // breakage follows here as rq->curr is CFS

Move an existing check to check rq->curr vs the next pushable task's
priority before getting anywhere near find_lowest_rq(). While at it, add
an explicit sched_class of rq->curr check prior to invoking
find_lowest_rq(rq->curr). Align the DL logic to also reschedule regardless
of next_task's migratability.

Fixes: a7c8155 ("sched: Fix migrate_disable() vs rt/dl balancing")
Reported-by: John Keeping <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Dietmar Eggemann <[email protected]>
Tested-by: John Keeping <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
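For context on the "mayhem down convert_prio()": find_lowest_rq() hands rq->curr's priority to cpupri, whose convert_prio() only understands values in the -1..MAX_RT_PRIO range. The sketch below is a rough paraphrase of kernel/sched/cpupri.c as it looked around this commit, reproduced from memory rather than quoted from this tree; treat the exact cases as illustrative.

/* Rough paraphrase of convert_prio() from kernel/sched/cpupri.c (~v5.17). */
static int convert_prio(int prio)
{
	int cpupri;

	switch (prio) {
	case CPUPRI_INVALID:			/* -1 */
		cpupri = CPUPRI_INVALID;
		break;
	case 0 ... MAX_RT_PRIO - 2:		/* RT priorities */
		cpupri = MAX_RT_PRIO - 1 - prio;
		break;
	case MAX_RT_PRIO - 1:
		cpupri = CPUPRI_NORMAL;
		break;
	case MAX_RT_PRIO:
		cpupri = CPUPRI_HIGHER;
		break;
	}

	/*
	 * A CFS task's prio (e.g. the default 120) matches none of the
	 * cases above, so cpupri is returned uninitialized here - the
	 * breakage the changelog describes when rq->curr is CFS.
	 */
	return cpupri;
}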
1 parent 3eba050 commit 49bef33

2 files changed: +28, -16 lines

kernel/sched/deadline.c

Lines changed: 6 additions & 6 deletions
@@ -2240,12 +2240,6 @@ static int push_dl_task(struct rq *rq)
 		return 0;
 
 retry:
-	if (is_migration_disabled(next_task))
-		return 0;
-
-	if (WARN_ON(next_task == rq->curr))
-		return 0;
-
 	/*
 	 * If next_task preempts rq->curr, and rq->curr
 	 * can move away, it makes sense to just reschedule
@@ -2258,6 +2252,12 @@ static int push_dl_task(struct rq *rq)
 		return 0;
 	}
 
+	if (is_migration_disabled(next_task))
+		return 0;
+
+	if (WARN_ON(next_task == rq->curr))
+		return 0;
+
 	/* We might release rq lock */
 	get_task_struct(next_task);
 

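Assembled from the two hunks above, the reordered prologue of push_dl_task() now reads roughly as below. The preemption test's condition sits outside the diff context, so it is reproduced here from the surrounding upstream code as a best-effort sketch.

retry:
	/*
	 * If next_task preempts rq->curr, and rq->curr
	 * can move away, it makes sense to just reschedule
	 * without going further in pushing next_task.
	 */
	if (dl_task(rq->curr) &&
	    dl_time_before(next_task->dl.deadline, rq->curr->dl.deadline) &&
	    rq->curr->nr_cpus_allowed > 1) {
		resched_curr(rq);
		return 0;
	}

	if (is_migration_disabled(next_task))
		return 0;

	if (WARN_ON(next_task == rq->curr))
		return 0;

	/* We might release rq lock */
	get_task_struct(next_task);

The point of the reordering: a migration-disabled next_task no longer suppresses the reschedule of a lower-priority rq->curr, aligning the DL path with the RT-side change below.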
kernel/sched/rt.c

Lines changed: 22 additions & 10 deletions
@@ -2026,13 +2026,35 @@ static int push_rt_task(struct rq *rq, bool pull)
 		return 0;
 
 retry:
+	/*
+	 * It's possible that the next_task slipped in of
+	 * higher priority than current. If that's the case
+	 * just reschedule current.
+	 */
+	if (unlikely(next_task->prio < rq->curr->prio)) {
+		resched_curr(rq);
+		return 0;
+	}
+
 	if (is_migration_disabled(next_task)) {
 		struct task_struct *push_task = NULL;
 		int cpu;
 
 		if (!pull || rq->push_busy)
 			return 0;
 
+		/*
+		 * Invoking find_lowest_rq() on anything but an RT task doesn't
+		 * make sense. Per the above priority check, curr has to
+		 * be of higher priority than next_task, so no need to
+		 * reschedule when bailing out.
+		 *
+		 * Note that the stoppers are masqueraded as SCHED_FIFO
+		 * (cf. sched_set_stop_task()), so we can't rely on rt_task().
+		 */
+		if (rq->curr->sched_class != &rt_sched_class)
+			return 0;
+
 		cpu = find_lowest_rq(rq->curr);
 		if (cpu == -1 || cpu == rq->cpu)
 			return 0;
@@ -2057,16 +2079,6 @@ static int push_rt_task(struct rq *rq, bool pull)
 	if (WARN_ON(next_task == rq->curr))
 		return 0;
 
-	/*
-	 * It's possible that the next_task slipped in of
-	 * higher priority than current. If that's the case
-	 * just reschedule current.
-	 */
-	if (unlikely(next_task->prio < rq->curr->prio)) {
-		resched_curr(rq);
-		return 0;
-	}
-
 	/* We might release rq lock */
 	get_task_struct(next_task);
 

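Put together, the retry prologue of push_rt_task() after this commit reads as follows. This is assembled purely from the hunks above; everything past the last context line is elided.

retry:
	/*
	 * It's possible that the next_task slipped in of
	 * higher priority than current. If that's the case
	 * just reschedule current.
	 */
	if (unlikely(next_task->prio < rq->curr->prio)) {
		resched_curr(rq);
		return 0;
	}

	if (is_migration_disabled(next_task)) {
		struct task_struct *push_task = NULL;
		int cpu;

		if (!pull || rq->push_busy)
			return 0;

		/*
		 * Invoking find_lowest_rq() on anything but an RT task doesn't
		 * make sense. Per the above priority check, curr has to
		 * be of higher priority than next_task, so no need to
		 * reschedule when bailing out.
		 *
		 * Note that the stoppers are masqueraded as SCHED_FIFO
		 * (cf. sched_set_stop_task()), so we can't rely on rt_task().
		 */
		if (rq->curr->sched_class != &rt_sched_class)
			return 0;

		cpu = find_lowest_rq(rq->curr);
		if (cpu == -1 || cpu == rq->cpu)
			return 0;
		/* ... */
	}

The priority check now runs before anything can reach find_lowest_rq(): in the reported scenario (rq->curr demoted to CFS, next_task RT), next_task->prio < rq->curr->prio holds, so the rto_push_work irqwork reschedules and bails out instead of feeding a CFS rq->curr into convert_prio(). The sched_class test is the explicit belt-and-braces guard on top of that.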