Commit f5aaff7

Author: Peter Zijlstra
sched/core: Dequeue PSI signals for blocked tasks that are delayed
psi_dequeue() for a blocked task expects psi_sched_switch() to clear the TSK_.*RUNNING PSI flags and set the TSK_IOWAIT flags. However, psi_sched_switch() uses "!task_on_rq_queued(prev)" to detect whether the task is blocked or still runnable, and that check is no longer reliable with DELAY_DEQUEUE, since a blocking task can be left queued on the runqueue.

This can lead to PSI splats similar to:

    psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=0 set=4

when the task is requeued, since the TSK_RUNNING flag was not cleared when the task was blocked.

Explicitly communicate that the task was blocked to psi_sched_switch() even if it was delayed and is still on the runqueue.

[ prateek: Broke off the relevant part from [1], commit message ]

Fixes: 152e11f ("sched/fair: Implement delayed dequeue")
Closes: https://lore.kernel.org/lkml/[email protected]/
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Not-yet-signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: K Prateek Nayak <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Johannes Weiner <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/ [1]
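The failure mode described in the commit message can be sketched with a small user-space toy model. Everything here is hypothetical and for illustration only: `struct toy_task` and the `toy_*` helpers are not kernel code, and the real PSI accounting is considerably more involved. The value of `TSK_RUNNING` is chosen to match the `psi_flags=4` value in the splat quoted above. The model compares the old detection (`!task_on_rq_queued(prev)`) with the new one (an explicit `block` flag) over one block, switch, and requeue cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: hypothetical types and helpers, not kernel code. */
#define TSK_RUNNING (1u << 2) /* value 4, matching psi_flags=4 in the splat */

struct toy_task {
	bool on_rq;             /* task is still queued on the runqueue */
	bool sched_delayed;     /* dequeue was deferred (DELAY_DEQUEUE) */
	unsigned int psi_flags; /* PSI state bits tracked for the task */
};

/* Block a task. With delayed dequeue it stays queued. Returns true: it blocked. */
static bool toy_block_task(struct toy_task *t, bool delay_dequeue)
{
	if (delay_dequeue)
		t->sched_delayed = true; /* left on the runqueue for now */
	else
		t->on_rq = false;        /* dequeued immediately */
	return true;
}

/* Switch-time accounting: TSK_RUNNING is cleared only if told the task sleeps. */
static void toy_psi_sched_switch(struct toy_task *prev, bool sleep)
{
	if (sleep)
		prev->psi_flags &= ~TSK_RUNNING;
}

/* Requeue-time accounting: setting an already-set flag is the inconsistency. */
static bool toy_psi_enqueue(struct toy_task *t)
{
	if (t->psi_flags & TSK_RUNNING) {
		printf("psi: inconsistent task state! psi_flags=%u clear=0 set=%u\n",
		       t->psi_flags, TSK_RUNNING);
		return false;
	}
	t->psi_flags |= TSK_RUNNING;
	return true;
}

/*
 * Run one block -> switch -> requeue cycle.
 * use_block_flag selects the new detection (explicit flag) over the old
 * one (!on_rq). Returns true if the PSI state stayed consistent.
 */
bool toy_cycle(bool delay_dequeue, bool use_block_flag)
{
	struct toy_task t = { .on_rq = true, .psi_flags = TSK_RUNNING };
	bool block = toy_block_task(&t, delay_dequeue);

	/* A delayed task must still be queued; that is what fools the old check. */
	assert(!delay_dequeue || t.on_rq);

	bool sleep = use_block_flag ? block : !t.on_rq;
	toy_psi_sched_switch(&t, sleep);

	/* Wakeup: the task becomes runnable again and is requeued. */
	t.on_rq = true;
	t.sched_delayed = false;
	return toy_psi_enqueue(&t);
}
```

In this model, `toy_cycle(true, false)` reports the inconsistency because the delayed task still looks queued at switch time, while `toy_cycle(true, true)` stays consistent because the blocked state is passed explicitly.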
1 parent 98442f0 commit f5aaff7

1 file changed, +3 -1 lines changed

kernel/sched/core.c

Lines changed: 3 additions & 1 deletion
@@ -6537,6 +6537,7 @@ static void __sched notrace __schedule(int sched_mode)
 	 * as a preemption by schedule_debug() and RCU.
 	 */
 	bool preempt = sched_mode > SM_NONE;
+	bool block = false;
 	unsigned long *switch_count;
 	unsigned long prev_state;
 	struct rq_flags rf;
@@ -6622,6 +6623,7 @@ static void __sched notrace __schedule(int sched_mode)
 			 * After this, schedule() must not care about p->state any more.
 			 */
 			block_task(rq, prev, flags);
+			block = true;
 		}
 		switch_count = &prev->nvcsw;
 	}
@@ -6667,7 +6669,7 @@ static void __sched notrace __schedule(int sched_mode)
 
 		migrate_disable_switch(rq, prev);
 		psi_account_irqtime(rq, prev, next);
-		psi_sched_switch(prev, next, !task_on_rq_queued(prev));
+		psi_sched_switch(prev, next, block);
 
 		trace_sched_switch(preempt, prev, next, prev_state);
 