Commit b5e20a9

kkdwvd authored and Kernel Patches Daemon committed
rqspinlock: Precede non-head waiter queueing with AA check
While previous commits sufficiently address the deadlocks, there are still scenarios where queueing of waiters in NMIs can exacerbate the possibility of timeouts. Consider the case below:

  CPU 0                                CPU 1

  <NMI>
  res_spin_lock(A)
    -> becomes non-head waiter
  </NMI>
  lock owner in CS or
  pending waiter spinning
                                       res_spin_lock(A)
                                         -> head waiter spinning on
                                            owner/pending bits

In such a scenario, the non-head waiter in the NMI on CPU 0 will not poll for deadlocks or time out, since it will simply queue behind the previous waiter (the head on CPU 1); it will also not enter the trylock fallback, since no rqspinlock queue waiter is active on CPU 0. The transaction initiated by the head waiter on CPU 1 will eventually time out, signalling the NMI and ending the cyclic dependency, but it will cost 250 ms.

Instead, the NMI on CPU 0 could simply check for the presence of an AA deadlock and only proceed with queueing on success. Add such a check right before any form of queueing is initiated.

The reason the AA deadlock check is not used in conjunction with in_nmi() is that a similar case could occur due to a reentrant path in the owner's critical section, and unconditionally checking for AA before entering the queueing path avoids expensive timeouts. Non-NMI reentrancy only happens at controlled points in the slow path (with specific tracepoints which do not impede the forward progress of a waiter loop), or in the owner's CS, while NMIs can land anywhere.

While this check is only needed for non-head waiter queueing, checking whether we are the head or not is racy without xchg_tail, and after that point we are already queued; hence, for simplicity, we must invoke the check unconditionally.

Note that a more contrived case could still be constructed by using two locks and interrupting the progress of the respective owners with non-head waiters of the other lock, in an ABBA fashion, which would still not be covered by the current set of checks and conditions. It would still lead to a timeout, though, and not a deadlock. An ABBA check cannot happen optimistically before the queueing, since it can be racy; it needs to happen continuously during the waiting period, which would then require an unlinking step for queued NMI/reentrant waiters. This is beyond the scope of this patch.

Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]>
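To make the AA check concrete, below is a minimal, self-contained C sketch of the idea: each CPU keeps a small table of locks it is currently acquiring or holding, and a reentrant attempt on the same lock from that CPU (e.g. from an NMI) fails with -EDEADLK instead of queueing behind the waiter it interrupted. All names here (held_push, held_pop, aa_check, HELD_MAX, struct fake_lock) are hypothetical and only illustrate the concept; the kernel's actual check_deadlock_AA() and its per-CPU bookkeeping in kernel/bpf/rqspinlock.c differ in detail.

```c
/*
 * Illustrative sketch only; names and layout are hypothetical, not the
 * kernel's actual rqspinlock implementation.
 */
#include <stdio.h>
#include <errno.h>

#define HELD_MAX 32

struct fake_lock { int id; };

/* Per-CPU acquisition table, modeled here as a single global. */
static struct {
	int cnt;
	struct fake_lock *locks[HELD_MAX];
} held;

/* Record that this context has started acquiring @lock. */
static int held_push(struct fake_lock *lock)
{
	if (held.cnt >= HELD_MAX)
		return -1;	/* table full; the sketch simply refuses */
	held.locks[held.cnt++] = lock;
	return 0;
}

static void held_pop(void)
{
	held.cnt--;
}

/*
 * AA check: scan every earlier in-flight acquisition on this CPU, i.e.
 * every entry except the one just pushed for the current attempt. A
 * match means we interrupted ourselves while acquiring or holding
 * @lock, so queueing could only end in a timeout; report -EDEADLK.
 */
static int aa_check(struct fake_lock *lock)
{
	for (int i = 0; i < held.cnt - 1; i++) {
		if (held.locks[i] == lock)
			return -EDEADLK;
	}
	return 0;
}

int main(void)
{
	struct fake_lock A = { .id = 0 };

	held_push(&A);		/* normal acquisition attempt */
	printf("first attempt:  %d\n", aa_check(&A));	/* prints 0 */

	held_push(&A);		/* simulated NMI re-attempt on the same CPU */
	printf("nested attempt: %d\n", aa_check(&A));	/* prints -EDEADLK (-35 on Linux) */

	held_pop();
	held_pop();
	return 0;
}
```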
1 parent 048f45d commit b5e20a9

File tree

1 file changed: +13 -0 lines changed


kernel/bpf/rqspinlock.c

Lines changed: 13 additions & 0 deletions
@@ -437,6 +437,19 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 	 * queuing.
 	 */
 queue:
+	/*
+	 * Do not queue if we're a waiter and someone is attempting this lock on
+	 * the same CPU. In case of NMIs, this prevents long timeouts where we
+	 * interrupt the pending waiter, and the owner, that will eventually
+	 * signal the head of our queue, both of which are logically but not
+	 * physically part of the queue, hence outside the scope of the idx > 0
+	 * check above for the trylock fallback.
+	 */
+	if (check_deadlock_AA(lock)) {
+		ret = -EDEADLK;
+		goto err_release_entry;
+	}
+
 	lockevent_inc(lock_slowpath);
 	/* Deadlock detection entry already held after failing fast path. */
 	node = this_cpu_ptr(&rqnodes[0].mcs);
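For completeness, the -EDEADLK produced by this path (like the existing timeout error) surfaces as the return value of the resilient lock API, so callers must check it and skip the critical section on failure. Below is a hedged caller-side sketch, reusing the res_spin_lock()/res_spin_unlock() naming from the scenario in the commit message; the wrapper API itself is not part of this diff, and update_counter() is made up for illustration.

```c
/*
 * Hedged caller-side sketch. res_spin_lock()/res_spin_unlock() follow the
 * commit message's usage; update_counter() is hypothetical. The key point:
 * acquisition can fail (e.g. -EDEADLK from the AA check above, or a
 * timeout), and the caller must not enter the critical section then.
 */
static int update_counter(rqspinlock_t *lock, int *counter)
{
	int ret;

	ret = res_spin_lock(lock);
	if (ret)		/* lock not acquired; do not touch *counter */
		return ret;

	(*counter)++;		/* critical section */

	res_spin_unlock(lock);
	return 0;
}
```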
