Skip to content

Commit ebd50e2

Browse files
AlanSternpaulmckrcu
authored andcommitted
tools: memory-model: Add rmw-sequences to the LKMM
Viktor (as relayed by Jonas) has pointed out a weakness in the Linux Kernel Memory Model. Namely, the memory ordering properties of atomic operations are not monotonic: An atomic op with full-barrier semantics does not always provide ordering as strong as one with release-barrier semantics. The following litmus test illustrates the problem: -------------------------------------------------- C atomics-not-monotonic {} P0(int *x, atomic_t *y) { WRITE_ONCE(*x, 1); smp_wmb(); atomic_set(y, 1); } P1(atomic_t *y) { int r1; r1 = atomic_inc_return(y); } P2(int *x, atomic_t *y) { int r2; int r3; r2 = atomic_read(y); smp_rmb(); r3 = READ_ONCE(*x); } exists (2:r2=2 /\ 2:r3=0) -------------------------------------------------- The litmus test is allowed as shown with atomic_inc_return(), which has full-barrier semantics. But if the operation is changed to atomic_inc_return_release(), which only has release-barrier semantics, the litmus test is forbidden. Clearly this violates monotonicity. The reason is because the LKMM treats full-barrier atomic ops as if they were written: mb(); load(); store(); mb(); (where the load() and store() are the two parts of an atomic RMW op), whereas it treats release-barrier atomic ops as if they were written: load(); release_barrier(); store(); The difference is that here the release barrier orders the load part of the atomic op before the store part with A-cumulativity, whereas the mb()'s above do not. This means that release-barrier atomics can effectively extend the cumul-fence relation but full-barrier atomics cannot. To resolve this problem we introduce the rmw-sequence relation, representing an arbitrarily long sequence of atomic RMW operations in which each operation reads from the previous one, and explicitly allow it to extend cumul-fence. This modification of the memory model is sound; it holds for PPC because of B-cumulativity, it holds for TSO and ARM64 because of other-multicopy atomicity, and we can assume that atomic ops on all other architectures will be implemented so as to make it hold for them. For similar reasons we also allow rmw-sequence to extend the w-post-bounded relation, which is analogous to cumul-fence in some ways. Reported-by: Viktor Vafeiadis <[email protected]> Signed-off-by: Alan Stern <[email protected]> Reviewed-by: Jonas Oberhauser <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
1 parent 289e1c8 commit ebd50e2

File tree

2 files changed

+33
-2
lines changed

2 files changed

+33
-2
lines changed

tools/memory-model/Documentation/explanation.txt

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1007,6 +1007,36 @@ order. Equivalently,
10071007
where the rmw relation links the read and write events making up each
10081008
atomic update. This is what the LKMM's "atomic" axiom says.
10091009

1010+
Atomic rmw updates play one more role in the LKMM: They can form "rmw
1011+
sequences". An rmw sequence is simply a bunch of atomic updates where
1012+
each update reads from the previous one. Written using events, it
1013+
looks like this:
1014+
1015+
Z0 ->rf Y1 ->rmw Z1 ->rf ... ->rf Yn ->rmw Zn,
1016+
1017+
where Z0 is some store event and n can be any number (even 0, in the
1018+
degenerate case). We write this relation as: Z0 ->rmw-sequence Zn.
1019+
Note that this implies Z0 and Zn are stores to the same variable.
1020+
1021+
Rmw sequences have a special property in the LKMM: They can extend the
1022+
cumul-fence relation. That is, if we have:
1023+
1024+
U ->cumul-fence X -> rmw-sequence Y
1025+
1026+
then also U ->cumul-fence Y. Thinking about this in terms of the
1027+
operational model, U ->cumul-fence X says that the store U propagates
1028+
to each CPU before the store X does. Then the fact that X and Y are
1029+
linked by an rmw sequence means that U also propagates to each CPU
1030+
before Y does. In an analogous way, rmw sequences can also extend
1031+
the w-post-bounded relation defined below in the PLAIN ACCESSES AND
1032+
DATA RACES section.
1033+
1034+
(The notion of rmw sequences in the LKMM is similar to, but not quite
1035+
the same as, that of release sequences in the C11 memory model. They
1036+
were added to the LKMM to fix an obscure bug; without them, atomic
1037+
updates with full-barrier semantics did not always guarantee ordering
1038+
at least as strong as atomic updates with release-barrier semantics.)
1039+
10101040

10111041
THE PRESERVED PROGRAM ORDER RELATION: ppo
10121042
-----------------------------------------

tools/memory-model/linux-kernel.cat

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -74,8 +74,9 @@ let ppo = to-r | to-w | fence | (po-unlock-lock-po & int)
7474

7575
(* Propagation: Ordering from release operations and strong fences. *)
7676
let A-cumul(r) = (rfe ; [Marked])? ; r
77+
let rmw-sequence = (rf ; rmw)*
7778
let cumul-fence = [Marked] ; (A-cumul(strong-fence | po-rel) | wmb |
78-
po-unlock-lock-po) ; [Marked]
79+
po-unlock-lock-po) ; [Marked] ; rmw-sequence
7980
let prop = [Marked] ; (overwrite & ext)? ; cumul-fence* ;
8081
[Marked] ; rfe? ; [Marked]
8182

@@ -174,7 +175,7 @@ let vis = cumul-fence* ; rfe? ; [Marked] ;
174175
let w-pre-bounded = [Marked] ; (addr | fence)?
175176
let r-pre-bounded = [Marked] ; (addr | nonrw-fence |
176177
([R4rmb] ; fencerel(Rmb) ; [~Noreturn]))?
177-
let w-post-bounded = fence? ; [Marked]
178+
let w-post-bounded = fence? ; [Marked] ; rmw-sequence
178179
let r-post-bounded = (nonrw-fence | ([~Noreturn] ; fencerel(Rmb) ; [R4rmb]))? ;
179180
[Marked]
180181

0 commit comments

Comments
 (0)