@@ -13,6 +13,7 @@ The kernel provides a variety of locking primitives which can be divided
into two categories:

- Sleeping locks
+ - CPU local locks
- Spinning locks

This document conceptually describes these lock types and provides rules
@@ -44,9 +45,23 @@ Sleeping lock types:

On PREEMPT_RT kernels, these lock types are converted to sleeping locks:

+ - local_lock
- spinlock_t
- rwlock_t

+
+ CPU local locks
+ ---------------
+
+ - local_lock
+
+ On non-PREEMPT_RT kernels, local_lock functions are wrappers around
+ preemption and interrupt disabling primitives. Contrary to other locking
+ mechanisms, disabling preemption or interrupts is a purely CPU-local form
+ of concurrency control and is not suited for inter-CPU concurrency
+ control.
+
+
Spinning locks
--------------

@@ -67,6 +82,7 @@ can have suffixes which apply further protections:
_irqsave/restore() Save and disable / restore interrupt disabled state
=================== ====================================================

+
Owner semantics
===============

@@ -139,6 +155,56 @@ implementation, thus changing the fairness:
writer from starving readers.


+ local_lock
+ ==========
+
+ local_lock provides a named scope to critical sections which are protected
+ by disabling preemption or interrupts.
+
+ On non-PREEMPT_RT kernels, local_lock operations map to the preemption and
+ interrupt disabling and enabling primitives:
+
+ ===============================  ======================
+ local_lock(&llock)               preempt_disable()
+ local_unlock(&llock)             preempt_enable()
+ local_lock_irq(&llock)           local_irq_disable()
+ local_unlock_irq(&llock)         local_irq_enable()
+ local_lock_irqsave(&llock)       local_irq_save()
+ local_unlock_irqrestore(&llock)  local_irq_restore()
+ ===============================  ======================
+
+ The named scope of local_lock has two advantages over the regular
+ primitives:
+
+ - The lock name allows static analysis and also clearly documents the
+ protection scope, while the regular primitives are scopeless and opaque.
+
+ - If lockdep is enabled, the local_lock gains a lockmap which allows
+ validating the correctness of the protection. This can detect cases where
+ e.g. a function using preempt_disable() as the protection mechanism is
+ invoked from interrupt or soft-interrupt context. Aside from that,
+ lockdep_assert_held(&llock) works as with any other locking primitive.
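+
+ As an illustration, a minimal usage sketch, assuming the local_lock_t type
+ and the INIT_LOCAL_LOCK() initializer from <linux/local_lock.h>; the data
+ structure and function names are made up::
+
+  #include <linux/local_lock.h>
+  #include <linux/percpu.h>
+
+  struct foo_pcpu {
+    local_lock_t lock;
+    unsigned long count;
+  };
+
+  static DEFINE_PER_CPU(struct foo_pcpu, foo_pcpu) = {
+    .lock = INIT_LOCAL_LOCK(lock),
+  };
+
+  void foo_count(void)
+  {
+    /* Named scope: protects foo_pcpu.count of the current CPU */
+    local_lock(&foo_pcpu.lock);
+    this_cpu_inc(foo_pcpu.count);
+    local_unlock(&foo_pcpu.lock);
+  }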
+
+ local_lock and PREEMPT_RT
+ -------------------------
+
+ PREEMPT_RT kernels map local_lock to a per-CPU spinlock_t, thus changing
+ semantics:
+
+ - All spinlock_t changes also apply to local_lock.
+
+ local_lock usage
+ ----------------
+
+ local_lock should be used in situations where disabling preemption or
+ interrupts is the appropriate form of concurrency control to protect
+ per-CPU data structures on a non-PREEMPT_RT kernel.
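+
+ As an illustration, an open coded preempt_disable() based protection of
+ per-CPU data would be substituted like this (the per-CPU variable counters
+ and the lock name counters_lock are made up)::
+
+  preempt_disable();  -> local_lock(&counters_lock);
+  this_cpu_inc(counters);
+  preempt_enable();   -> local_unlock(&counters_lock);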
+
+ local_lock is not suitable for protecting against preemption or interrupts
+ on a PREEMPT_RT kernel due to the PREEMPT_RT-specific spinlock_t semantics.
+
+

raw_spinlock_t and spinlock_t
=============================

@@ -258,10 +324,82 @@ implementation, thus changing semantics:
PREEMPT_RT caveats
==================

+ local_lock on RT
+ ----------------
+
+ The mapping of local_lock to spinlock_t on PREEMPT_RT kernels has a few
+ implications. For example, on a non-PREEMPT_RT kernel the following code
+ sequence works as expected::
+
+  local_lock_irq(&local_lock);
+  raw_spin_lock(&lock);
+
+ and is fully equivalent to::
+
+  raw_spin_lock_irq(&lock);
+
+ On a PREEMPT_RT kernel this code sequence breaks because local_lock_irq()
+ is mapped to a per-CPU spinlock_t which neither disables interrupts nor
+ preemption. The following code sequence works correctly on both
+ PREEMPT_RT and non-PREEMPT_RT kernels::
+
+  local_lock_irq(&local_lock);
+  spin_lock(&lock);
+
+ Another caveat with local locks is that each local_lock has a specific
+ protection scope. So the following substitution is wrong::
+
+  func1()
+  {
+    local_irq_save(flags); -> local_lock_irqsave(&local_lock_1, flags);
+    func3();
+    local_irq_restore(flags); -> local_lock_irqrestore(&local_lock_1, flags);
+  }
+
+  func2()
+  {
+    local_irq_save(flags); -> local_lock_irqsave(&local_lock_2, flags);
+    func3();
+    local_irq_restore(flags); -> local_lock_irqrestore(&local_lock_2, flags);
+  }
+
+  func3()
+  {
+    lockdep_assert_irqs_disabled();
+    access_protected_data();
+  }
+
+ On a non-PREEMPT_RT kernel this works correctly, but on a PREEMPT_RT kernel
+ local_lock_1 and local_lock_2 are distinct and cannot serialize the callers
+ of func3(). Also the lockdep assert will trigger on a PREEMPT_RT kernel
+ because local_lock_irqsave() does not disable interrupts due to the
+ PREEMPT_RT-specific semantics of spinlock_t. The correct substitution is::
+
+  func1()
+  {
+    local_irq_save(flags); -> local_lock_irqsave(&local_lock, flags);
+    func3();
+    local_irq_restore(flags); -> local_lock_irqrestore(&local_lock, flags);
+  }
+
+  func2()
+  {
+    local_irq_save(flags); -> local_lock_irqsave(&local_lock, flags);
+    func3();
+    local_irq_restore(flags); -> local_lock_irqrestore(&local_lock, flags);
+  }
+
+  func3()
+  {
+    lockdep_assert_held(&local_lock);
+    access_protected_data();
+  }
+
+

spinlock_t and rwlock_t
-----------------------

- These changes in spinlock_t and rwlock_t semantics on PREEMPT_RT kernels
+ The changes in spinlock_t and rwlock_t semantics on PREEMPT_RT kernels
have a few implications. For example, on a non-PREEMPT_RT kernel the
following code sequence works as expected::

@@ -282,9 +420,61 @@ local_lock mechanism. Acquiring the local_lock pins the task to a CPU,
allowing things like per-CPU interrupt disabled locks to be acquired.
However, this approach should be used only where absolutely necessary.

+ A typical scenario is protection of per-CPU variables in thread context::

- raw_spinlock_t
- --------------
+  struct foo *p = get_cpu_ptr(&var1);
+
+  spin_lock(&p->lock);
+  p->count += this_cpu_read(var2);
+
+ This is correct code on a non-PREEMPT_RT kernel, but on a PREEMPT_RT kernel
+ this breaks. The PREEMPT_RT-specific change of spinlock_t semantics does
+ not allow acquiring p->lock because get_cpu_ptr() implicitly disables
+ preemption. The following substitution works on both kernels::
+
+  struct foo *p;
+
+  migrate_disable();
+  p = this_cpu_ptr(&var1);
+  spin_lock(&p->lock);
+  p->count += this_cpu_read(var2);
+
+ On a non-PREEMPT_RT kernel migrate_disable() maps to preempt_disable(),
+ which makes the above code fully equivalent to the get_cpu_ptr() version.
+ On a PREEMPT_RT kernel migrate_disable() ensures that the task is pinned
+ on the current CPU, which in turn guarantees that the per-CPU accesses to
+ var1 and var2 stay on the same CPU.
+
+ The migrate_disable() substitution is not valid for the following
+ scenario::
+
+  func()
+  {
+    struct foo *p;
+
+    migrate_disable();
+    p = this_cpu_ptr(&var1);
+    p->val = func2();
+    migrate_enable();
+  }
+
+ While correct on a non-PREEMPT_RT kernel, this breaks on PREEMPT_RT because
+ here migrate_disable() does not protect against reentrancy from a
+ preempting task. A correct substitution for this case is::
+
+  func()
+  {
+    struct foo *p;
+
+    local_lock(&foo_lock);
+    p = this_cpu_ptr(&var1);
+    p->val = func2();
+    local_unlock(&foo_lock);
+  }
+
+ On a non-PREEMPT_RT kernel this protects against reentrancy by disabling
+ preemption. On a PREEMPT_RT kernel this is achieved by acquiring the
+ underlying per-CPU spinlock.
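+
+ For completeness: foo_lock in the sketch above would be a per-CPU
+ local_lock which, assuming the local_lock_t type and the INIT_LOCAL_LOCK()
+ initializer, could be declared as, e.g.::
+
+  static DEFINE_PER_CPU(local_lock_t, foo_lock) = INIT_LOCAL_LOCK(foo_lock);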
+
+
+ raw_spinlock_t on RT
+ --------------------

Acquiring a raw_spinlock_t disables preemption and possibly also
interrupts, so the critical section must avoid acquiring a regular
@@ -325,22 +515,25 @@ Lock type nesting rules

The most basic rules are:

- Lock types of the same lock category (sleeping, spinning) can nest
- arbitrarily as long as they respect the general lock ordering rules to
- prevent deadlocks.
+ - Lock types of the same lock category (sleeping, CPU local, spinning)
+ can nest arbitrarily as long as they respect the general lock ordering
+ rules to prevent deadlocks.
+
+ - Sleeping lock types cannot nest inside CPU local and spinning lock types.

- Sleeping lock types cannot nest inside spinning lock types.
+ - CPU local and spinning lock types can nest inside sleeping lock types.

- Spinning lock types can nest inside sleeping lock types.
+ - Spinning lock types can nest inside all lock types.

These constraints apply both in PREEMPT_RT and otherwise.

The fact that PREEMPT_RT changes the lock category of spinlock_t and
- rwlock_t from spinning to sleeping means that they cannot be acquired while
- holding a raw spinlock. This results in the following nesting ordering:
+ rwlock_t from spinning to sleeping and substitutes local_lock with a
+ per-CPU spinlock_t means that they cannot be acquired while holding a raw
+ spinlock. This results in the following nesting ordering:

1) Sleeping locks
- 2) spinlock_t and rwlock_t
+ 2) spinlock_t, rwlock_t, local_lock
3) raw_spinlock_t and bit spinlocks
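+
+ As an illustration, a nesting which respects this ordering and is therefore
+ valid on both PREEMPT_RT and non-PREEMPT_RT kernels (the lock variables m,
+ s and r are made up)::
+
+  mutex_lock(&m);      /* 1) sleeping lock */
+  spin_lock(&s);       /* 2) spinlock_t */
+  raw_spin_lock(&r);   /* 3) raw_spinlock_t */
+  /* critical section */
+  raw_spin_unlock(&r);
+  spin_unlock(&s);
+  mutex_unlock(&m);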

Lockdep will complain if these constraints are violated, both in