Commit c650812
sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug
Since sched_delayed tasks remain queued even after blocking, the load
balancer can migrate them between runqueues while PSI considers them
to be asleep. As a result, PSI misreads the migration requeue followed
by a wakeup as a double queue:

  psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=. set=4

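The warning can be pictured with a small user-space model. This is a sketch under assumptions, not the kernel code: the flag values are chosen so that TSK_RUNNING equals 4 to match psi_flags=4 / set=4 in the message, and `model_task` and `model_psi_task_change()` are hypothetical names.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed flag values for this sketch; TSK_RUNNING == 4 matches the log. */
enum {
	TSK_IOWAIT   = 1 << 0,
	TSK_MEMSTALL = 1 << 1,
	TSK_RUNNING  = 1 << 2,
};

/* Hypothetical stand-in for the per-task PSI bookkeeping. */
struct model_task {
	unsigned int psi_flags;
};

/*
 * Apply a state change. Returns false when the request is inconsistent:
 * it sets a bit that is already set, or clears a bit that is not set.
 */
static bool model_psi_task_change(struct model_task *t,
				  unsigned int clear, unsigned int set)
{
	bool ok = !(t->psi_flags & set) &&
		  (t->psi_flags & clear) == clear;

	if (!ok)
		fprintf(stderr,
			"model: inconsistent task state! psi_flags=%x clear=%x set=%x\n",
			t->psi_flags, clear, set);

	/* The change is applied either way in this model. */
	t->psi_flags &= ~clear;
	t->psi_flags |= set;
	return ok;
}
```

In this model, a migration requeue that is treated as a wakeup sets TSK_RUNNING on a task PSI still considers asleep. The real wakeup that follows then tries to set TSK_RUNNING a second time, and the check fails.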
First, call psi_enqueue() after p->sched_class->enqueue_task(). A
wakeup will clear p->se.sched_delayed while a migration will not, so
PSI can use that flag to tell them apart.

Then teach PSI to migrate any "sleep" state when delayed-dequeue tasks
are being migrated.

Delayed-dequeue tasks can be revived by ttwu_runnable(), which will
call down with a new ENQUEUE_DELAYED flag. Rather than further
complicating the wakeup conditional in enqueue_task(), identify
migration contexts and default to wakeup handling for all other cases.
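
The classification described above can be sketched as a user-space model. It is an illustration under assumptions, not the patched kernel function: `model_task`, `model_change`, `model_psi_enqueue()` and `MODEL_ENQUEUE_MIGRATED` are hypothetical names, and the flag values are picked for the sketch. It assumes the function runs after the scheduling class has enqueued the task, so a wakeup has already cleared the delayed flag.

```c
#include <stdbool.h>

/* Assumed flag values for this sketch only. */
enum {
	TSK_IOWAIT   = 1 << 0,
	TSK_MEMSTALL = 1 << 1,
	TSK_RUNNING  = 1 << 2,
};

/* Hypothetical enqueue flag; the real ENQUEUE_* values differ. */
enum { MODEL_ENQUEUE_MIGRATED = 1 << 0 };

struct model_task {
	bool sched_delayed;	/* still queued although it has blocked */
	bool in_iowait;
	bool in_memstall;
};

/* PSI state bits to clear and to set for one enqueue. */
struct model_change {
	unsigned int clear;
	unsigned int set;
};

static struct model_change model_psi_enqueue(const struct model_task *p,
					     int flags)
{
	struct model_change c = { 0, 0 };

	if (p->sched_delayed) {
		/*
		 * Still delayed after the class enqueue: this is a migration
		 * of a task PSI considers asleep. Carry only its sleep state.
		 */
		if (p->in_memstall)
			c.set |= TSK_MEMSTALL;
		if (p->in_iowait)
			c.set |= TSK_IOWAIT;
	} else if (flags & MODEL_ENQUEUE_MIGRATED) {
		/* Migration of a runnable task. */
		c.set = TSK_RUNNING;
		if (p->in_memstall)
			c.set |= TSK_MEMSTALL;
	} else {
		/* Everything else defaults to wakeup handling. */
		if (p->in_iowait)
			c.clear |= TSK_IOWAIT;
		c.set = TSK_RUNNING;
	}
	return c;
}
```

Only the two migration cases are identified explicitly. A revived delayed-dequeue task arrives with the delayed flag already cleared and falls through to the wakeup branch without needing a dedicated condition.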

Beyond the warning in dmesg, the task state corruption causes a
permanent CPU pressure indication, which disrupts workload and machine
health monitoring.
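
One way to picture why the indication never clears is a toy counter model. It rests on two assumptions that this message does not state: that CPU pressure is reported whenever more tasks are counted runnable than are executing, and that the double queue leaves the runnable count one too high. All names below are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-CPU counters for this sketch. */
struct model_cpu {
	unsigned int nr_running;	/* tasks counted as runnable */
	unsigned int nr_oncpu;		/* tasks actually executing */
};

/* Assumed rule: pressure while runnable tasks outnumber executing ones. */
static bool model_cpu_pressure(const struct model_cpu *c)
{
	return c->nr_running > c->nr_oncpu;
}

/* A task is counted as runnable. */
static void model_set_running(struct model_cpu *c)
{
	c->nr_running++;
}

/* A task stops being counted as runnable. */
static void model_clear_running(struct model_cpu *c)
{
	if (c->nr_running)
		c->nr_running--;
}
```

If a task is counted twice on the way in but only once on the way out, the counter stays at one on an otherwise idle CPU, and the model reports pressure until something resets it.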
Debugged-by-and-original-fix-by: K Prateek Nayak <[email protected]>
Fixes: 152e11f ("sched/fair: Implement delayed dequeue")
Closes: https://lore.kernel.org/lkml/[email protected]/
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Johannes Weiner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: K Prateek Nayak <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

1 parent f5aaff7 commit c650812
2 files changed, 39 insertions(+), 21 deletions(-)