@@ -174,6 +174,173 @@ running one, no real task switch occurs but interrupts are disabled nonetheless:
174174 | | irq_entry
175175 +---------------+ irq_enable
176176
177+ Monitor nrp
178+ -----------
179+
180+ The need resched preempts (nrp) monitor ensures that preemption requires
181+ ``need_resched``. Only kernel preemption is considered, since preemption
182+ while returning to userspace is, for this monitor, indistinguishable from
183+ ``sched_switch_yield`` (described in the sssw monitor).
184+ A kernel preemption occurs whenever ``__schedule`` is called with the preemption
185+ flag set to true (e.g. from preempt_enable or when exiting from interrupts). This
186+ type of preemption occurs after the need for ``rescheduling`` has been set.
187+ This does not hold for the *lazy* variant of the flag, which causes only
188+ userspace preemption.
189+ A ``schedule_entry_preempt`` may or may not involve a task switch. In the latter
190+ case, a task goes through the scheduler from a preemption context but is
191+ picked as the next task to run. Since the scheduler runs, this clears the need
192+ to reschedule. The ``any_thread_running`` state does not imply that the monitored
193+ task is not running, as this monitor does not track the outcome of scheduling.
194+
195+ In theory, a preemption can only occur after the ``need_resched`` flag is set. In
196+ practice, however, it is possible to see a preemption where the flag is not
197+ set. This can happen in one specific condition::
198+
199+ need_resched
200+ preempt_schedule()
201+ preempt_schedule_irq()
202+ __schedule()
203+ !need_resched
204+ __schedule()
205+
206+ In the situation above, standard preemption starts (e.g. from preempt_enable
207+ when the flag is set), an interrupt occurs before scheduling and, on its exit
208+ path, it schedules, which clears the ``need_resched`` flag.
209+ When the preempted task runs again, the standard preemption started earlier
210+ resumes, although the flag is no longer set. The monitor considers this a
211+ nested preemption (``nested_preempt`` state), which allows another preemption
212+ without re-setting the flag. This condition relaxes the monitor constraints and
213+ may yield false negatives (i.e. accept cases that are not real nested
214+ preemptions), but makes the monitor more robust and able to validate other scenarios.
215+ For simplicity, the monitor starts in ``preempt_irq``, although no interrupt
216+ occurred, as the situation above is hard to pinpoint::
217+
218+ schedule_entry
219+ irq_entry #===========================================#
220+ +-------------------------- H H
221+ | H H
222+ +-------------------------> H any_thread_running H
223+ H H
224+ +-------------------------> H H
225+ | #===========================================#
226+ | schedule_entry | ^
227+ | schedule_entry_preempt | sched_need_resched | schedule_entry
228+ | | schedule_entry_preempt
229+ | v |
230+ | +----------------------+ |
231+ | +--- | | |
232+ | sched_need_resched | | rescheduling | -+
233+ | +--> | |
234+ | +----------------------+
235+ | | irq_entry
236+ | v
237+ | +----------------------+
238+ | | | ---+
239+ | ---> | | | sched_need_resched
240+ | | preempt_irq | | irq_entry
241+ | | | <--+
242+ | | | <--+
243+ | +----------------------+ |
244+ | | schedule_entry | sched_need_resched
245+ | | schedule_entry_preempt |
246+ | v |
247+ | +-----------------------+ |
248+ +-------------------------- | nested_preempt | --+
249+ +-----------------------+
250+ ^ irq_entry |
251+ +-------------------+
252+
253+ Due to how the ``need_resched`` flag on the preemption count works on arm64,
254+ this monitor is unstable on that architecture, as it often records preemption
255+ when the flag is not set, even in the presence of the workaround above.
256+ For the time being, the monitor is disabled by default on arm64.
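
As a reading aid, the nrp diagram above can be transcribed into a transition
table. The following is a minimal Python sketch, not the kernel implementation:
the names ``NRP_TRANSITIONS`` and ``run_nrp`` are hypothetical, and any event
without an arrow in the diagram is assumed to be a violation.

```python
# Transition table transcribed from the nrp diagram (illustrative only).
SCHED = ("schedule_entry", "schedule_entry_preempt")

NRP_TRANSITIONS = {
    "any_thread_running": {
        # No schedule_entry_preempt here: preemption requires need_resched.
        "schedule_entry": "any_thread_running",
        "irq_entry": "any_thread_running",
        "sched_need_resched": "rescheduling",
    },
    "rescheduling": {
        "sched_need_resched": "rescheduling",
        "irq_entry": "preempt_irq",
        **{ev: "any_thread_running" for ev in SCHED},
    },
    "preempt_irq": {
        "sched_need_resched": "preempt_irq",
        "irq_entry": "preempt_irq",
        **{ev: "nested_preempt" for ev in SCHED},
    },
    "nested_preempt": {
        "irq_entry": "nested_preempt",
        "sched_need_resched": "preempt_irq",
        **{ev: "any_thread_running" for ev in SCHED},
    },
}


def run_nrp(events, state="preempt_irq"):
    """Replay events from the initial state named in the text.

    Returns the final state, or None as soon as an event has no
    transition from the current state (a violation in this sketch).
    """
    for event in events:
        state = NRP_TRANSITIONS[state].get(event)
        if state is None:
            return None
    return state
```

In this sketch, a ``schedule_entry_preempt`` seen in ``any_thread_running``
returns ``None``, while the same event after ``sched_need_resched`` is accepted.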
257+
258+ Monitor sssw
259+ ------------
260+
261+ The set state sleep and wakeup (sssw) monitor ensures that ``set_state`` to
262+ sleepable leads to sleeping and that sleeping tasks require a wakeup. It includes
263+ the following types of switch:
264+
265+ * ``switch_suspend``:
266+   a task puts itself to sleep. This can happen only after explicitly setting
267+   the task to ``sleepable``. After a task is suspended, it needs to be woken up
268+   (``waking`` state) before being switched in again.
269+   Setting the task's state to ``sleepable`` can be reverted before switching if it
270+   is woken up or set to ``runnable``.
271+ * ``switch_blocking``:
272+   a special case of a ``switch_suspend`` where the task is waiting on a
273+   sleeping RT lock (``PREEMPT_RT`` only). It is common to see wakeup and set
274+   state events racing with each other, which leads the model to perceive this
275+   type of switch when the task is not set to sleepable. This is a limitation of
276+   the model on SMP systems, and workarounds may slow down the system.
277+ * ``switch_preempt``:
278+   a task switch as a result of kernel preemption (``schedule_entry_preempt`` in
279+   the nrp model).
280+ * ``switch_yield``:
281+   a task explicitly calls the scheduler or is preempted while returning to
282+   userspace. It can happen after a ``yield`` system call, from the idle task or
283+   if the ``need_resched`` flag is set. By definition, a task cannot yield while
284+   ``sleepable`` as that would be a suspension. A special case of a yield occurs
285+   when a task in ``TASK_INTERRUPTIBLE`` calls the scheduler while a signal is
286+   pending. The task doesn't go through the usual blocking/waking and is set
287+   back to runnable. The resulting switch (if any) looks like a yield to the
288+   ``signal_wakeup`` state and is followed by the signal delivery. From this
289+   state, the monitor expects a signal even if it sees a wakeup event, although
290+   this is not strictly necessary, to rule out false negatives.
291+
292+ This monitor doesn't include a running state: ``sleepable`` and ``runnable``
293+ refer only to the task's desired state, and the task itself could be scheduled out
294+ (e.g. due to preemption). However, it does include the event
295+ ``sched_switch_in`` to represent when a task is allowed to become running. This
296+ can also be triggered by preemption, but cannot occur after the task got to
297+ ``sleeping`` and before a ``wakeup`` occurs::
298+
299+ +--------------------------------------------------------------------------+
300+ | |
301+ | |
302+ | switch_suspend | |
303+ | switch_blocking | |
304+ v v |
305+ +----------+ #==========================# set_state_runnable |
306+ | | H H wakeup |
307+ | | H H switch_in |
308+ | | H H switch_yield |
309+ | sleeping | H H switch_preempt |
310+ | | H H signal_deliver |
311+ | | switch_ H H ------+ |
312+ | | _blocking H runnable H | |
313+ | | <----------- H H <-----+ |
314+ +----------+ H H |
315+ | wakeup H H |
316+ +---------------------> H H |
317+ H H |
318+ +---------> H H |
319+ | #==========================# |
320+ | | ^ |
321+ | | | set_state_runnable |
322+ | | | wakeup |
323+ | set_state_sleepable | +------------------------+
324+ | v | |
325+ | +--------------------------+ set_state_sleepable
326+ | | | switch_in
327+ | | | switch_preempt
328+ signal_deliver | sleepable | signal_deliver
329+ | | | ------+
330+ | | | |
331+ | | | <-----+
332+ | +--------------------------+
333+ | | ^
334+ | switch_yield | set_state_sleepable
335+ | v |
336+ | +---------------+ |
337+ +---------- | signal_wakeup | -+
338+ +---------------+
339+ ^ | switch_in
340+ | | switch_preempt
341+ | | switch_yield
342+ +-----------+ wakeup
343+
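
The sssw diagram above can be checked against an event trace in the same
spirit. This is a rough Python sketch under stated assumptions, not the
kernel's generated automaton: ``SSSW_TRANSITIONS`` and ``first_violation`` are
hypothetical names, event names are copied from the diagram labels, and the
start state is assumed to be ``runnable``.

```python
# Transition table transcribed from the sssw diagram (illustrative only).
SSSW_TRANSITIONS = {
    "runnable": {
        "set_state_runnable": "runnable",
        "wakeup": "runnable",
        "switch_in": "runnable",
        "switch_yield": "runnable",
        "switch_preempt": "runnable",
        "signal_deliver": "runnable",
        "set_state_sleepable": "sleepable",
        "switch_blocking": "sleeping",
    },
    "sleepable": {
        "set_state_sleepable": "sleepable",
        "switch_in": "sleepable",
        "switch_preempt": "sleepable",
        "signal_deliver": "sleepable",
        "set_state_runnable": "runnable",
        "wakeup": "runnable",
        "switch_suspend": "sleeping",
        "switch_blocking": "sleeping",
        "switch_yield": "signal_wakeup",
    },
    "sleeping": {
        # A sleeping task can only leave this state through a wakeup.
        "wakeup": "runnable",
    },
    "signal_wakeup": {
        "switch_in": "signal_wakeup",
        "switch_preempt": "signal_wakeup",
        "switch_yield": "signal_wakeup",
        "wakeup": "signal_wakeup",
        "set_state_sleepable": "sleepable",
        "signal_deliver": "runnable",
    },
}


def first_violation(events, start="runnable"):
    """Return the index of the first event the diagram does not allow.

    Returns None when the whole trace is accepted. The default start
    state is an assumption of this sketch.
    """
    state = start
    for index, event in enumerate(events):
        nxt = SSSW_TRANSITIONS[state].get(event)
        if nxt is None:
            return index
        state = nxt
    return None
```

For example, a ``switch_in`` that follows ``switch_suspend`` without an
intervening ``wakeup`` is reported as the first violation, matching the
constraint described before the diagram.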
177344References
178345----------
179346