Commit a68415c

Author: Ingo Molnar
Merge branch 'lkmm' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into locking/core
Pull v5.9 LKMM changes from Paul E. McKenney.

Mostly documentation changes, but also some new litmus tests for atomic ops.

Signed-off-by: Ingo Molnar <[email protected]>
2 parents 63722bb + 5ef0a07 commit a68415c

11 files changed, 285 additions, 58 deletions

Documentation/atomic_t.txt

Lines changed: 12 additions & 12 deletions
@@ -85,21 +85,21 @@ smp_store_release() respectively. Therefore, if you find yourself only using
 the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all
 and are doing it wrong.

-A subtle detail of atomic_set{}() is that it should be observable to the RMW
-ops. That is:
+A note for the implementation of atomic_set{}() is that it must not break the
+atomicity of the RMW ops. That is:

-  C atomic-set
+  C Atomic-RMW-ops-are-atomic-WRT-atomic_set

   {
-    atomic_set(v, 1);
+    atomic_t v = ATOMIC_INIT(1);
   }

-  P1(atomic_t *v)
+  P0(atomic_t *v)
   {
-    atomic_add_unless(v, 1, 0);
+    (void)atomic_add_unless(v, 1, 0);
   }

-  P2(atomic_t *v)
+  P1(atomic_t *v)
   {
     atomic_set(v, 0);
   }
@@ -233,34 +233,34 @@ as well. Similarly, something like:
 is an ACQUIRE pattern (though very much not typical), but again the barrier is
 strictly stronger than ACQUIRE. As illustrated:

-  C strong-acquire
+  C Atomic-RMW+mb__after_atomic-is-stronger-than-acquire

   {
   }

-  P1(int *x, atomic_t *y)
+  P0(int *x, atomic_t *y)
   {
     r0 = READ_ONCE(*x);
     smp_rmb();
     r1 = atomic_read(y);
   }

-  P2(int *x, atomic_t *y)
+  P1(int *x, atomic_t *y)
   {
     atomic_inc(y);
     smp_mb__after_atomic();
     WRITE_ONCE(*x, 1);
   }

   exists
-  (r0=1 /\ r1=0)
+  (0:r0=1 /\ 0:r1=0)

 This should not happen; but a hypothetical atomic_inc_acquire() --
 (void)atomic_fetch_inc_acquire() for instance -- would allow the outcome,
 because it would not order the W part of the RMW against the following
 WRITE_ONCE. Thus:

-  P1 P2
+  P0 P1

   t = LL.acq *y (0)
   t++;

Documentation/litmus-tests/README

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+============
+LITMUS TESTS
+============
+
+Each subdirectory contains litmus tests that are typical to describe the
+semantics of respective kernel APIs.
+For more information about how to "run" a litmus test or how to generate
+a kernel test module based on a litmus test, please see
+tools/memory-model/README.
+
+
+atomic (/atomic derectory)
+--------------------------
+
+Atomic-RMW+mb__after_atomic-is-stronger-than-acquire.litmus
+    Test that an atomic RMW followed by a smp_mb__after_atomic() is
+    stronger than a normal acquire: both the read and write parts of
+    the RMW are ordered before the subsequential memory accesses.
+
+Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus
+    Test that atomic_set() cannot break the atomicity of atomic RMWs.
+    NOTE: Require herd7 7.56 or later which supports "(void)expr".
+
+
+RCU (/rcu directory)
+--------------------
+
+MP+onceassign+derefonce.litmus (under tools/memory-model/litmus-tests/)
+    Demonstrates the use of rcu_assign_pointer() and rcu_dereference() to
+    ensure that an RCU reader will not see pre-initialization garbage.
+
+RCU+sync+read.litmus
+RCU+sync+free.litmus
+    Both the above litmus tests demonstrate the RCU grace period guarantee
+    that an RCU read-side critical section can never span a grace period.

Documentation/litmus-tests/atomic/Atomic-RMW+mb__after_atomic-is-stronger-than-acquire.litmus

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+C Atomic-RMW+mb__after_atomic-is-stronger-than-acquire
+
+(*
+ * Result: Never
+ *
+ * Test that an atomic RMW followed by a smp_mb__after_atomic() is
+ * stronger than a normal acquire: both the read and write parts of
+ * the RMW are ordered before the subsequential memory accesses.
+ *)
+
+{
+}
+
+P0(int *x, atomic_t *y)
+{
+	int r0;
+	int r1;
+
+	r0 = READ_ONCE(*x);
+	smp_rmb();
+	r1 = atomic_read(y);
+}
+
+P1(int *x, atomic_t *y)
+{
+	atomic_inc(y);
+	smp_mb__after_atomic();
+	WRITE_ONCE(*x, 1);
+}
+
+exists
+(0:r0=1 /\ 0:r1=0)

Documentation/litmus-tests/atomic/Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+C Atomic-RMW-ops-are-atomic-WRT-atomic_set
+
+(*
+ * Result: Never
+ *
+ * Test that atomic_set() cannot break the atomicity of atomic RMWs.
+ * NOTE: This requires herd7 7.56 or later which supports "(void)expr".
+ *)
+
+{
+	atomic_t v = ATOMIC_INIT(1);
+}
+
+P0(atomic_t *v)
+{
+	(void)atomic_add_unless(v, 1, 0);
+}
+
+P1(atomic_t *v)
+{
+	atomic_set(v, 0);
+}
+
+exists
+(v=2)

Documentation/litmus-tests/rcu/RCU+sync+free.litmus

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+C RCU+sync+free
+
+(*
+ * Result: Never
+ *
+ * This litmus test demonstrates that an RCU reader can never see a write that
+ * follows a grace period, if it did not see writes that precede that grace
+ * period.
+ *
+ * This is a typical pattern of RCU usage, where the write before the grace
+ * period assigns a pointer, and the writes following the grace period destroy
+ * the object that the pointer used to point to.
+ *
+ * This is one implication of the RCU grace-period guarantee, which says (among
+ * other things) that an RCU read-side critical section cannot span a grace period.
+ *)
+
+{
+	int x = 1;
+	int *y = &x;
+	int z = 1;
+}
+
+P0(int *x, int *z, int **y)
+{
+	int *r0;
+	int r1;
+
+	rcu_read_lock();
+	r0 = rcu_dereference(*y);
+	r1 = READ_ONCE(*r0);
+	rcu_read_unlock();
+}
+
+P1(int *x, int *z, int **y)
+{
+	rcu_assign_pointer(*y, z);
+	synchronize_rcu();
+	WRITE_ONCE(*x, 0);
+}
+
+exists (0:r0=x /\ 0:r1=0)

Documentation/litmus-tests/rcu/RCU+sync+read.litmus

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+C RCU+sync+read
+
+(*
+ * Result: Never
+ *
+ * This litmus test demonstrates that after a grace period, an RCU updater always
+ * sees all stores done in prior RCU read-side critical sections. Such
+ * read-side critical sections would have ended before the grace period ended.
+ *
+ * This is one implication of the RCU grace-period guarantee, which says (among
+ * other things) that an RCU read-side critical section cannot span a grace period.
+ *)
+
+{
+	int x = 0;
+	int y = 0;
+}
+
+P0(int *x, int *y)
+{
+	rcu_read_lock();
+	WRITE_ONCE(*x, 1);
+	WRITE_ONCE(*y, 1);
+	rcu_read_unlock();
+}
+
+P1(int *x, int *y)
+{
+	int r0;
+	int r1;
+
+	r0 = READ_ONCE(*x);
+	synchronize_rcu();
+	r1 = READ_ONCE(*y);
+}
+
+exists (1:r0=1 /\ 1:r1=0)

MAINTAINERS

Lines changed: 2 additions & 0 deletions
@@ -9972,6 +9972,7 @@ M: Luc Maranget <[email protected]>
 M: "Paul E. McKenney" <[email protected]>
 R: Akira Yokosawa <[email protected]>
 R: Daniel Lustig <[email protected]>
+R: Joel Fernandes <[email protected]>
 [...]
 [...]
 S: Supported
@@ -9980,6 +9981,7 @@ F: Documentation/atomic_bitops.txt
 F: Documentation/atomic_t.txt
 F: Documentation/core-api/atomic_ops.rst
 F: Documentation/core-api/refcount-vs-atomic.rst
+F: Documentation/litmus-tests/
 F: Documentation/memory-barriers.txt
 F: tools/memory-model/

tools/memory-model/Documentation/explanation.txt

Lines changed: 45 additions & 38 deletions
@@ -1987,28 +1987,36 @@ outcome undefined.

 In technical terms, the compiler is allowed to assume that when the
 program executes, there will not be any data races. A "data race"
-occurs when two conflicting memory accesses execute concurrently;
-two memory accesses "conflict" if:
+occurs when there are two memory accesses such that:

-	they access the same location,
+	1. they access the same location,

-	they occur on different CPUs (or in different threads on the
-	same CPU),
+	2. at least one of them is a store,

-	at least one of them is a plain access,
+	3. at least one of them is plain,

-	and at least one of them is a store.
+	4. they occur on different CPUs (or in different threads on the
+	   same CPU), and

-The LKMM tries to determine whether a program contains two conflicting
-accesses which may execute concurrently; if it does then the LKMM says
-there is a potential data race and makes no predictions about the
-program's outcome.
+	5. they execute concurrently.

-Determining whether two accesses conflict is easy; you can see that
-all the concepts involved in the definition above are already part of
-the memory model. The hard part is telling whether they may execute
-concurrently. The LKMM takes a conservative attitude, assuming that
-accesses may be concurrent unless it can prove they cannot.
+In the literature, two accesses are said to "conflict" if they satisfy
+1 and 2 above. We'll go a little farther and say that two accesses
+are "race candidates" if they satisfy 1 - 4. Thus, whether or not two
+race candidates actually do race in a given execution depends on
+whether they are concurrent.
+
+The LKMM tries to determine whether a program contains race candidates
+which may execute concurrently; if it does then the LKMM says there is
+a potential data race and makes no predictions about the program's
+outcome.
+
+Determining whether two accesses are race candidates is easy; you can
+see that all the concepts involved in the definition above are already
+part of the memory model. The hard part is telling whether they may
+execute concurrently. The LKMM takes a conservative attitude,
+assuming that accesses may be concurrent unless it can prove they
+are not.

 If two memory accesses aren't concurrent then one must execute before
 the other. Therefore the LKMM decides two accesses aren't concurrent
@@ -2171,8 +2179,8 @@ again, now using plain accesses for buf:
 	}

 This program does not contain a data race. Although the U and V
-accesses conflict, the LKMM can prove they are not concurrent as
-follows:
+accesses are race candidates, the LKMM can prove they are not
+concurrent as follows:

 	The smp_wmb() fence in P0 is both a compiler barrier and a
 	cumul-fence. It guarantees that no matter what hash of
@@ -2326,12 +2334,11 @@ could now perform the load of x before the load of ptr (there might be
 a control dependency but no address dependency at the machine level).

 Finally, it turns out there is a situation in which a plain write does
-not need to be w-post-bounded: when it is separated from the
-conflicting access by a fence. At first glance this may seem
-impossible. After all, to be conflicting the second access has to be
-on a different CPU from the first, and fences don't link events on
-different CPUs. Well, normal fences don't -- but rcu-fence can!
-Here's an example:
+not need to be w-post-bounded: when it is separated from the other
+race-candidate access by a fence. At first glance this may seem
+impossible. After all, to be race candidates the two accesses must
+be on different CPUs, and fences don't link events on different CPUs.
+Well, normal fences don't -- but rcu-fence can! Here's an example:

 	int x, y;

@@ -2367,7 +2374,7 @@ concurrent and there is no race, even though P1's plain store to y
 isn't w-post-bounded by any marked accesses.

 Putting all this material together yields the following picture. For
-two conflicting stores W and W', where W ->co W', the LKMM says the
+race-candidate stores W and W', where W ->co W', the LKMM says the
 stores don't race if W can be linked to W' by a

 	w-post-bounded ; vis ; w-pre-bounded
@@ -2380,8 +2387,8 @@ sequence, and if W' is plain then they also have to be linked by a

 	w-post-bounded ; vis ; r-pre-bounded

-sequence. For a conflicting load R and store W, the LKMM says the two
-accesses don't race if R can be linked to W by an
+sequence. For race-candidate load R and store W, the LKMM says the
+two accesses don't race if R can be linked to W by an

 	r-post-bounded ; xb* ; w-pre-bounded

@@ -2413,20 +2420,20 @@ is, the rules governing the memory subsystem's choice of a store to
 satisfy a load request and its determination of where a store will
 fall in the coherence order):

-	If R and W conflict and it is possible to link R to W by one
-	of the xb* sequences listed above, then W ->rfe R is not
-	allowed (i.e., a load cannot read from a store that it
+	If R and W are race candidates and it is possible to link R to
+	W by one of the xb* sequences listed above, then W ->rfe R is
+	not allowed (i.e., a load cannot read from a store that it
 	executes before, even if one or both is plain).

-	If W and R conflict and it is possible to link W to R by one
-	of the vis sequences listed above, then R ->fre W is not
-	allowed (i.e., if a store is visible to a load then the load
-	must read from that store or one coherence-after it).
+	If W and R are race candidates and it is possible to link W to
+	R by one of the vis sequences listed above, then R ->fre W is
+	not allowed (i.e., if a store is visible to a load then the
+	load must read from that store or one coherence-after it).

-	If W and W' conflict and it is possible to link W to W' by one
-	of the vis sequences listed above, then W' ->co W is not
-	allowed (i.e., if one store is visible to a second then the
-	second must come after the first in the coherence order).
+	If W and W' are race candidates and it is possible to link W
+	to W' by one of the vis sequences listed above, then W' ->co W
+	is not allowed (i.e., if one store is visible to a second then
+	the second must come after the first in the coherence order).

 This is the extent to which the LKMM deals with plain accesses.
 Perhaps it could say more (for example, plain accesses might
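
The explanation.txt change above separates "conflicting" accesses (conditions 1 and 2) from "race candidates" (conditions 1 - 4). As a rough illustration only, the two definitions can be sketched as C predicates. The struct access type, its fields, and both function names are hypothetical and are not part of the LKMM, herd7, or the kernel. Condition 5 (concurrency) is deliberately left out, because that is the part the LKMM has to prove or disprove per execution.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a single memory access; this type does not
 * exist in the kernel or in the LKMM tooling.
 */
struct access {
	uintptr_t location;	/* address being accessed */
	int ctx;		/* CPU, or thread on a CPU, issuing the access */
	bool is_store;		/* true for a store, false for a load */
	bool is_plain;		/* true if not marked (no READ_ONCE() etc.) */
};

/* Conditions 1 and 2: same location, at least one store. */
static bool accesses_conflict(const struct access *a, const struct access *b)
{
	return a->location == b->location && (a->is_store || b->is_store);
}

/*
 * Conditions 1 - 4: conflicting, at least one plain access, and issued
 * from different CPUs or threads.
 */
static bool race_candidates(const struct access *a, const struct access *b)
{
	return accesses_conflict(a, b) &&
	       (a->is_plain || b->is_plain) &&
	       a->ctx != b->ctx;
}
```

Under this sketch, a plain store and a marked load of the same location from different contexts are race candidates, while two marked accesses, or two accesses from the same context, are not.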
