Skip to content

Commit e79e99c

Browse files
iohk-bors[bot]rjmhcoot
authored
Merge #3603
3603: IOSimPOR - race exploration with partial order reduction r=coot a=coot Subsumes #3463. This branch backports `IOSim` fixes to `IOSimPOR`, introduces various minor modifications most of which were discussed in #3463. Each patch can be reviewed seprately. - Add IOSimPOR, an IO simulator that can explore race conditions - Adapt one of the governor tests to use IOSimPOR - See no evil, hear no evil, speak no evil - io-sim-por: share a thunk in prop_shrinkCarefully - io-sim-por: introduced InternalTypes module - io-sim-por: fixed a typo - io-sim-por: refactor happensBeforeStep - io-sim-por: stylistic changes - io-sim-por: removed an if clause - io-sim-por: refactored advanceControl - io-sim-por: renamed lubVClock as leastUpperBoundVClock - io-sim-por: renamed ThreadId constructors - io-sim-por: introduced insertThread - ouroboros-network: nightly cabal flag - io-sim-por: added haddocks & notes - io-sim-por: shuffle module structure - ouroboros-network: added a todo - io-sim-por: backport CancelTimeout handling - io-sim: backport masking state fix - io-sim-por: backport logging of MaskingState Co-authored-by: John <[email protected]> Co-authored-by: Marcin Szamotulski <[email protected]>
2 parents 6899613 + 09db4ce commit e79e99c

File tree

13 files changed

+3541
-700
lines changed

13 files changed

+3541
-700
lines changed

io-classes/io-classes.cabal

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,7 @@ library
4343
Control.Monad.Class.MonadThrow
4444
Control.Monad.Class.MonadTime
4545
Control.Monad.Class.MonadTimer
46+
Control.Monad.Class.MonadTest
4647
default-language: Haskell2010
4748
other-extensions: CPP
4849
TypeFamilies
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
module Control.Monad.Class.MonadTest ( MonadTest (..) ) where
2+
3+
import Control.Monad.Reader
4+
5+
class Monad m => MonadTest m where
6+
exploreRaces :: m ()
7+
exploreRaces = return ()
8+
9+
instance MonadTest IO
10+
11+
instance MonadTest m => MonadTest (ReaderT e m) where
12+
exploreRaces = lift exploreRaces
13+

io-sim/how-to-use-IOSimPOR.md

Lines changed: 322 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,322 @@
1+
How to use Control.Monad.IOSimPOR
2+
=================================
3+
4+
IOSimPOR is a version of IOSim that supports finding race conditions, by
5+
6+
- allowing many alternative schedules to be explored, and
7+
8+
- searching through schedules in an order which (empirically) is
9+
likely to play well with shrinking; when a test case that provokes
10+
a race condition is shrunk to a smaller test case that suffers from
11+
the same race condition, then it is likely that IOSimPOR will
12+
actually provoke the race condition in the smaller test too.
13+
14+
To do so, IOSimPOR detects potential racing steps when a test is run,
15+
and then re-runs the same test many times with one or more pairs of
16+
racing steps reversed. IOSimPOR runs each equivalent schedule at most
17+
once.
18+
19+
IOSimPOR will usually not explore all possible schedules, because
20+
there are too many for this to be feasible (even infinitely many, in
21+
some cases). It prioritises races that occur earlier in a test
22+
execution, and prioritises making a small number of race reversals
23+
over making many.
24+
25+
It can test non-terminating programs with an infinite trace; in such
26+
cases it only reverse races in the part of the trace that the property
27+
under test actually depends on. Thus properties that select, for
28+
example, the first simulated 24 hours of trace events, can be tested
29+
using IOSimPOR; only races that occur in the first 24 hours will be
30+
reversed.
31+
32+
IOSimPOR assumes that execution is much faster than the passage of
33+
time, so steps that occur at different simulated times are considered
34+
not to race, and will not be reversed.
35+
36+
Exploring all possible schedules is possible with the right
37+
parameters, but not usually advisable, because it is so very very
38+
expensive. When IOSimPOR is used in a QuickCheck property, then many
39+
test cases will not even have a possibility of provoking a fault,
40+
whatever the schedule; it usually makes more sense to explore a
41+
limited number of schedules each, for a large number of cases, than to
42+
explore a small number of cases very very thoroughly.
43+
44+
Do not be surprised that tests run slowly. Each "test" is running (by
45+
default) 100 different schedules; each run is more costly than a run
46+
using vanilla IOSim, because race conditions and data dependencies
47+
need to be tracked.
48+
49+
The IOSimPOR API
50+
----------------
51+
52+
The common way to use IOSim is to call
53+
54+
```hs
55+
runSimTrace :: forall a. (forall s. IOSim s a) -> Trace a
56+
```
57+
58+
which returns a trace, and then express the property to test as a
59+
predicate on the trace. Since IOSimPOR explores many traces, then the
60+
property to test is given as a continuation instead, which is called
61+
for each trace generated. The common way to use IOSimPOR is thus
62+
63+
```hs
64+
exploreSimTrace :: forall a test. (Testable test, Show a) =>
65+
(ExplorationOptions->ExplorationOptions) ->
66+
(forall s. IOSim s a) ->
67+
(Maybe (Trace a) -> Trace a -> test) ->
68+
Property
69+
```
70+
71+
The continuation is the last parameter, with type
72+
73+
```hs
74+
Maybe (Trace a) -> Trace a -> test
75+
```
76+
77+
Here the second parameter of the continuation is the trace that the
78+
property should check. The first parameter simply provides information
79+
that may be useful for debugging---if present, it is a similar trace
80+
that passed the same test, from which the failing trace was derived by
81+
reversing one race.
82+
83+
IOSimPOR Options
84+
----------------
85+
86+
`exploreSimTrace` can be controlled by a number of options, which are
87+
obtained by passing the function passed as the first parameter to a
88+
default set of options. Passing the identity function id as the first
89+
parameter uses the defaults.
90+
91+
The options are:
92+
93+
- the schedule bound, default 100, set by withScheduleBound. This is
94+
an upper bound on the number of schedules with race reversals that
95+
will be explored; a bound of zero means that the default schedule
96+
will be explored, but no others. Setting the bound to zero makes
97+
IOSimPOR behave rather like IOSim, in that only one schedule is
98+
explored, but (a) IOSimPOR is considerably slower, because it still
99+
collects information on potential races, and (b) the IOSimPOR
100+
schedule is different (based on priorities, in contrast to IOSim's
101+
round-robin), and plays better with shrinking.
102+
103+
- the branching factor, default 3, set by withBranching. This is the
104+
number of alternative schedules that IOSimPOR tries to run, per
105+
race reversal. With the default parameters, IOSimPOR will try to
106+
reverse the first 33 (100 `div` 3) races discovered using the
107+
default schedule, then (if 33 or more races are discovered), for
108+
each such reversed race, will run the reversal and try to reverse
109+
two more races in the resulting schedule. A high branching factor
110+
will explore more combinations of reversing fewer races, within the
111+
overall schedule bound. A branching factor of one will explore only
112+
schedules resulting from a single race reversal (unless there are
113+
fewer races available to be reversed than the schedule bound).
114+
115+
- the step time limit in microseconds, default 100000 (100ms), set by
116+
withStepTimeLimit. A simulation step consists of pure computation
117+
followed by an IO action of some sort; we expect in most cases that
118+
the pure computation will be fast. However, a bug might cause the
119+
pure computation to fall into an infinite loop. This can be
120+
detected by applying a time limit to the overall test, for example
121+
using QuickCheck's function within, but if such a time limit is
122+
exceeded, then no information about the trace-so-far is
123+
preserved. Instead IOSimPOR can apply a time limit to each step; if
124+
the time limit is exceeded, then the trace up to that point is
125+
passed to the continuation, terminated by a TraceLoop
126+
constructor. The time limit should be large enough that the
127+
intended pure computations always finish well within that
128+
time. Note that garbage collection pauses are not included in the
129+
time limit (which would otherwise need to be considerably longer
130+
than 100ms to accommodate them), provided the -T flag is supplied to
131+
the run-time system.
132+
133+
- the schedule control to replay, default Nothing, set by
134+
withReplay. When a test fails, exploreSimTrace displays the
135+
schedule control which provoked it. The schedule control specifies
136+
how to modify the default schedule to reproduce the failure; by
137+
saving the schedule control and passing it back to exploreSimTrace
138+
for replay, then it is possible to rerun a failed test with
139+
additional diagnostics, without repeating the search for a schedule
140+
that makes it fail.
141+
142+
IOSimPOR Statistics
143+
-------------------
144+
145+
When a tested property passes, IOSimPOR displays some statistics about
146+
the schedules that were actually run. An example of this output is:
147+
148+
```
149+
Branching factor (5865 in total):
150+
45.51% 1
151+
43.00% 0
152+
6.58% 2
153+
2.23% 3
154+
0.85% 4
155+
0.46% 30-39
156+
0.38% 5
157+
0.32% 6
158+
0.24% 10-19
159+
0.19% 20-29
160+
0.15% 8
161+
0.07% 7
162+
0.02% 9
163+
164+
Modified schedules explored (100 in total):
165+
22% 100-199
166+
22% 90-99
167+
18% 0
168+
7% 30-39
169+
4% 4
170+
4% 60-69
171+
4% 80-89
172+
3% 20-29
173+
3% 3
174+
3% 50-59
175+
3% 70-79
176+
2% 10-19
177+
2% 2
178+
2% 40-49
179+
1% 1
180+
181+
Race reversals per schedule (5865 in total):
182+
50.04% 1
183+
45.05% 2
184+
3.10% 3
185+
1.71% 0
186+
0.10% 4
187+
```
188+
189+
The first table shows us the branching factor at each point where
190+
IOSimPOR explored alternative schedules: in this case, in 43% of cases
191+
we were at a leaf of the exploration, so no alternative schedules were
192+
explored; 45% of the time we explored one alternative schedule; in a
193+
few cases larger numbers of schedules were explored.
194+
195+
The second table shows us how many schedules we explored for each
196+
test; in this case we used the default bound of 100 alternative
197+
schedules, and we reached that bound in 22% of tests--in the other
198+
tests, there were fewer than 100 possible schedules to explore. In 18%
199+
of the tests, there were no alternative schedules to explore---since
200+
these tests were randomly generated, then these 18% were probably
201+
rather trivial tests.
202+
203+
The final table shows us how many races were reversed in each
204+
alternative schedule; in this case most alternative schedules just
205+
reversed one or two races, but we did reach a maximum of four in a
206+
small number of tests.
207+
208+
IOSimPOR Scheduling
209+
-------------------
210+
211+
IOSimPOR distinguishes between non-racy threads and system threads; test
212+
threads are intended to be part of the test itself, whereas system threads are
213+
part of the system under test. The main thread in a test is always a non-racy
214+
thread; non-racy threads can fork system threads, but not vice versa. Test
215+
threads always take priority over system threads, and IOSimPOR does not
216+
attempt to reverse races involving test threads. So when writing a test, one
217+
can assume that whatever the test threads do takes place first (at each
218+
simulated time point), and then the system threads run and any races between
219+
them are explored. Likewise, if a blocked non-racy thread responds to an event
220+
in a system thread, then the test thread's response will take place before the
221+
system threads continue execution. Distinguishing non-racy threads from system
222+
threads drastically reduces the set of races that IOSimPOR needs to consider.
223+
224+
To start a system thread, a non-racy thread may call
225+
226+
```hs
227+
exploreRaces :: MonadTest m => m ()
228+
```
229+
230+
All threads forked (using forkIO) after such a call will be system
231+
threads. If there is no call to exploreRaces in a test, then all the
232+
threads forked will be non-racy threads, and IOSimPOR will not reverse any
233+
races---so it is necessary to include a call to exploreRaces somewhere
234+
in a test.
235+
236+
IOSimPOR ThreadIds contain a list of small integers; the main thread
237+
contains the list `[]`, its children are numbered `[1]`, `[2]`, `[3]` and so
238+
on, then threads forked by `[2]` are numbered `[2,1]`, `[2,2]`, `[2,3]` and so
239+
on. The id of a forked thread always extends its parent's id, and successively
240+
forked children of the same thread are numbered 1, 2, 3 and so on. Thread Ids
241+
give us useful information for debugging; it is easy to see which thread forked
242+
which other, and relatively easy to map thread ids in debugging output to
243+
threads in the source code.
244+
245+
Except when reversing races, IOSimPOR schedules threads using
246+
priorities. Non-racy threads take priority over system threads, but
247+
otherwise priorities are determined by the thread ids: the thread with
248+
the lexicographically greatest thread Id has the highest priority. As
249+
a result, threads behave in a very predictable manner: newly forked
250+
threads take priority over their parent thread, and children forked
251+
later take priority over children forked earlier from the same parent
252+
thread. Knowing these priorities helps in understanding traces.
253+
254+
When IOSimPOR reverses races, this is done by using a 'schedule
255+
control'. The schedule control is displayed in the test output when a
256+
test fails. Here is an example:
257+
258+
```
259+
Schedule control:
260+
ControlAwait [ScheduleMod (ThreadId [4],0)
261+
ControlDefault
262+
[(ThreadId [1],0),
263+
(ThreadId [1],1),
264+
(ThreadId [2],0),
265+
(ThreadId [2],1),
266+
(ThreadId [3],0)]]
267+
268+
ThreadId [4] delayed at time Time 0s
269+
until after:
270+
ThreadId [2]
271+
ThreadId [3]
272+
```
273+
274+
The schedule control consists of one or more schedule modifications,
275+
each of which specifies that an event should be delayed until after a
276+
sequence of other events. The events consist of a thread id and a
277+
per-thread event number. The interpretation of the schedule
278+
modification above is that when the default scheduler would have
279+
performed the first event in thread `[4]` (that is, (`ThreadId [4],0`)),
280+
then it should be delayed, and the list of events in the third field
281+
of the schedule modification should be performed first. In this case
282+
there is only one schedule modification, and so only one event is
283+
delayed, but in general a schedule control may delay a series of
284+
events. A failing test can be replayed by using withReplay to supply
285+
the schedule control as an option to exploreSimTrace.
286+
287+
The final part of the output tells us how the schedule in the failing
288+
test differed from the schedule in the most similar passing test:
289+
namely, thread `[4]` was delayed (at simulated time 0) until after the
290+
events in threads `[2]` and `[3]`. Since the schedule control delays
291+
thread `[4]` until after steps in threads `[1]`, `[2]` and `[3]`, then we can
292+
conclude that the last passing test already delayed thread `[4]` until
293+
after the events in thread `[1]`.
294+
295+
Potential IOSimPOR Errors
296+
-------------------------
297+
298+
Most reported errors are a result of faults in the code under test,
299+
but if you see an error of the following form:
300+
301+
```
302+
Exception:
303+
Can't follow ControlFollow [(ThreadId [5],0)] []
304+
tid: ThreadId [5]
305+
tstep: 0
306+
runqueue: [ThreadId [3],ThreadId [2],ThreadId [1]]
307+
```
308+
309+
then the probable cause is a bug in IOSimPOR itself. The message
310+
indicates that IOSimPOR scheduler is trying to follow a schedule
311+
modification that specifies that thread `[5]` should run next, but this
312+
is impossible because thread `[5]` is not in the runqueue (the list of
313+
runnable threads). If you supplied a schedule control explicitly,
314+
using withReplay, then you may perhaps have supplied a schedule
315+
control that does not match the version of the code you are running:
316+
in this case the exception is your fault. But if this message appears
317+
while you are exploring races, then it indicates a problem in
318+
IOSimPOR's dependency analysis: IOSimPOR has constructed a schedule as
319+
a result of race reversal that tries to run thread `[5]` at this point,
320+
because the dependency analysis indicates that thread `[5]` ought to be
321+
runnable---but it is not. Best to consult Quviq at this point.
322+

0 commit comments

Comments
 (0)