Commit 52d1552
sched/deadline: Don't count nr_running for dl_server proxy tasks
On CPU offline the kernel stalled with the call trace below:
INFO: task kworker/0:1:11 blocked for more than 120 seconds.
cpuhp held the CPU hotplug lock indefinitely and stalled vmstat_shepherd.
This is because nr_running is counted twice when cpuhp is enqueued, so the
wait condition of cpuhp never becomes true:
  enqueue_task_fair() // pick cpuhp from idle, rq->nr_running = 0
    dl_server_start()
      [...]
      add_nr_running() // rq->nr_running = 1
    add_nr_running() // rq->nr_running = 2
  [switch to cpuhp, waiting on balance_hotplug_wait()]
  rcuwait_wait_event(rq->nr_running == 1 && ...) // failed, rq->nr_running=2
    schedule() // wait again
It doesn't make sense to count the dl_server towards runnable tasks,
since it runs other tasks.
Fixes: 63ba842 ("sched/deadline: Introduce deadline servers")
Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
1 parent 421fc59 commit 52d1552
1 file changed, 6 insertions(+), 2 deletions(-)
[Diff body not recoverable: two hunks, at original lines 1851-1857 and 1861-1867, each replacing one line with three.]
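The double count described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel change itself: the diff body is not reproduced here, and every name prefixed with toy_ (including the is_server flag) is hypothetical. It shows how counting the dl_server proxy leaves nr_running at 2 after a single fair task is enqueued on an idle runqueue, and how skipping the proxy restores the value of 1 that the wait condition expects.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the kernel's struct rq; only the counter is modelled. */
struct toy_rq {
	unsigned int nr_running;
};

/* Toy deadline entity; is_server is a hypothetical flag marking the proxy. */
struct toy_dl_entity {
	bool is_server;
};

/* Behaviour described as buggy: every deadline entity bumps nr_running. */
static void toy_inc_dl_tasks_old(struct toy_rq *rq,
				 const struct toy_dl_entity *se)
{
	(void)se;
	rq->nr_running++;
}

/* Assumed shape of the fix: the dl_server proxy is not counted. */
static void toy_inc_dl_tasks_new(struct toy_rq *rq,
				 const struct toy_dl_entity *se)
{
	if (!se->is_server)
		rq->nr_running++;
}

/*
 * Model enqueueing one fair task (cpuhp) on an idle runqueue, which also
 * starts the dl_server. Returns the resulting nr_running.
 */
unsigned int toy_enqueue_cpuhp(bool fixed)
{
	struct toy_rq rq = { .nr_running = 0 };
	struct toy_dl_entity server = { .is_server = true };

	if (fixed)
		toy_inc_dl_tasks_new(&rq, &server);
	else
		toy_inc_dl_tasks_old(&rq, &server);

	rq.nr_running++;	/* the fair task itself */
	return rq.nr_running;
}

/* Simplified form of the nr_running part of the wait condition. */
bool toy_hotplug_wait_done(unsigned int nr_running)
{
	return nr_running == 1;
}

/* Usage example: the old accounting never satisfies the wait. */
void toy_self_check(void)
{
	assert(!toy_hotplug_wait_done(toy_enqueue_cpuhp(false)));
	assert(toy_hotplug_wait_done(toy_enqueue_cpuhp(true)));
}
```

Calling toy_self_check() from a main() exercises both variants. The model ignores locking, the other terms of the real wait condition, and dequeue accounting, which would need the matching treatment.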