dev/vignettes/_mirai.Rmd
+11 -2 (11 additions & 2 deletions)
Original file line number
Diff line number
Diff line change
@@ -187,8 +187,7 @@ m3[]
187
187
188
188
m3$data$stack.trace
189
189
```
190
-
Elements of the original error condition are also accessible via `$` on the error object.
191
-
For example, additional metadata recorded by `rlang::abort()` is preserved:
190
+
A 'miraiError' inherits from the original condition classes and hence can be caught or re-thrown. The elements of the original error condition are also accessible via `$` on the error object. Additional metadata recorded by `rlang::abort()` is preserved:
192
191
```{r}
193
192
#| label: metaexample
194
193
f <- function(x) if (x > 0) stop("positive")
@@ -608,3 +607,13 @@ The daemons settings are saved under the named profile.
608
607
To create a 'mirai' task using a specific compute profile, specify the `.compute` argument to `mirai()`, which uses the 'default' compute profile if this is `NULL`.
609
608
610
609
Similarly, functions such as `status()`, `launch_local()` or `launch_remote()` should be specified with the desired `.compute` argument.
610
+
611
+
### 9. Random Number Generation
612
+
613
+
mirai employs L'Ecuyer-CMRG streams for random number generation. This is a widely-adopted, statistically-sound method deemed safe for parallel computation, and the same as that employed by base R's own parallel package.
614
+
615
+
Streams essentially cut into the RNG's period (a very long sequence of pseudo-random numbers) at intervals so far apart that they do not in practice overlap. This ensures that statistical results obtained from parallel computations remain correct and valid. The method of generating streams is recursive.
616
+
617
+
By default (when the `seed` argument to `daemons()` is `NULL`) mirai initiates a new stream for each daemon launched, in the same manner as base R. This guarantees that the results are statistically-sound, although it does not guarantee numerical reproducibility between parallel runs. Firstly, using a different number of workers would cause mirai to be sent to different workers. Secondly, when using dispatcher, mirai are sent dynamically to the next available daemon, and this is not guaranteed to be the same each time.
618
+
619
+
Supplying an explicit integer `seed` to `daemons()` turns on reproducible RNG. Instead of initiating a new stream for each daemon, a stream is now initiated for each `mirai()`. This is slightly computationally wasteful (although the effect on performance is negligible), but it does guarantee the same results across runs, regardless of the number of daemons used.
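The stream mechanism described in this section can be illustrated with base R's parallel package, which the text says uses the same method. This is only a minimal sketch of the concept, not mirai's internal implementation: `draw()` is a hypothetical helper introduced for illustration, and the seed value `123` is arbitrary.

```r
library(parallel)

# Switch to the L'Ecuyer-CMRG generator and fix a starting point.
RNGkind("L'Ecuyer-CMRG")
set.seed(123)

# Each stream is derived from the previous one (the recursion noted above).
s1 <- get(".Random.seed", envir = globalenv())
s2 <- nextRNGStream(s1)
s3 <- nextRNGStream(s2)

# Hypothetical helper: draw n uniforms starting from a given stream seed.
draw <- function(seed, n = 3) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(n)
}

a <- draw(s1)
b <- draw(s2)

identical(draw(s1), a)  # restarting the same stream repeats its draws
identical(a, b)         # separate streams produce different draws
```

Because every draw is determined by the stream it starts from, results are reproducible only if each task is always paired with the same stream. This is why assigning one stream per `mirai()` rather than one per daemon removes the dependence on which daemon happens to run a task.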
dev/vignettes/_v06-questions.Rmd
+3 -5 (3 additions & 5 deletions)
@@ -36,9 +36,8 @@ On the other hand, if your code previously used the `globals` argument to supply
36
36
Note that this would only work in the case of a named list and not the other forms that `globals` can take.
37
37
38
38
Regardless of using a `mirai()` or `future_promise()`, we recommend that you pass globals explicitly in production code.
39
-
This is as globals detection is never 100% perfect, and there is always some element of guesswork.
40
-
Edge cases can lead to unpredictable failures or silently incorrect results.
41
-
Explicit passing of variables allows for transparent and reliable behaviour, that remains completely robust over time.
39
+
This is because globals detection is never 100% perfect and always involves some element of guesswork, with edge cases leading to unpredictable results.
40
+
Explicit passing of variables allows for transparent and reliable behaviour, remaining robust over time.
42
41
43
42
**Capture globals using `environment()`:**
44
43
@@ -104,8 +103,7 @@ The random seed is not reset after each mirai call to ensure that however many r
104
103
105
104
Hence normally, the random seed should be set once on the host process when daemons are created, rather than in each daemon.
106
105
107
-
If it is required to set the seed in each daemon, this should be done using an independent method and set each time random draws are required.
108
-
Another option would be to set the random seed within a local execution scope to prevent the global random seed on each daemon from being affected.
106
+
For numerical reproducibility, set the `seed` argument to `daemons()` (see the Random Number Generation section of the reference vignette for further details).
109
107
110
108
### 3. Accessing package functions during development
By running the above two calculations in parallel, they take roughly half the time as running sequentially (minus a relatively inconsequential parallelization overhead).
107
107
@@ -165,9 +165,9 @@ for (i in 1:10) {
165
165
#> iteration 3 successful
166
166
#> iteration 4 successful
167
167
#> iteration 5 successful
168
+
#> Error: random error
168
169
#> iteration 6 successful
169
170
#> iteration 7 successful
170
-
#> Error: random error
171
171
#> iteration 8 successful
172
172
#> iteration 9 successful
173
173
#> iteration 10 successful
@@ -211,8 +211,7 @@ m3$data$stack.trace
211
211
#> [[2]]
212
212
#> f(1)
213
213
```
214
-
Elements of the original error condition are also accessible via `$` on the error object.
215
-
For example, additional metadata recorded by `rlang::abort()` is preserved:
214
+
A 'miraiError' inherits from the original condition classes and hence can be caught or re-thrown. The elements of the original error condition are also accessible via `$` on the error object. Additional metadata recorded by `rlang::abort()` is preserved:
216
215
217
216
```r
218
217
f<-function(x) if (x>0) stop("positive")
@@ -287,7 +286,7 @@ status()
287
286
#> [1] 6
288
287
#>
289
288
#> $daemons
290
-
#> [1] "abstract://b1379cc31bd3ab70ab8177ce"
289
+
#> [1] "ipc:///tmp/130164c8ebee629bd7eab602"
291
290
#>
292
291
#> $mirai
293
292
#> awaiting executing completed
@@ -325,7 +324,7 @@ status()
325
324
#> [1] 6
326
325
#>
327
326
#> $daemons
328
-
#> [1] "abstract://8eea003087abef8dd2dbcca0"
327
+
#> [1] "ipc:///tmp/2ef3f6b12e08ecf8ef29e17f"
329
328
```
330
329
331
330
#### Everywhere
@@ -417,7 +416,7 @@ status()
417
416
#> [1] 0
418
417
#>
419
418
#> $daemons
420
-
#> [1] "tcp://192.168.1.71:39247"
419
+
#> [1] "tcp://10.246.62.139:53122"
421
420
#>
422
421
#> $mirai
423
422
#> awaiting executing completed
@@ -592,7 +591,7 @@ The printed return values may then be copy / pasted directly to a remote machine
@@ -690,3 +689,13 @@ The daemons settings are saved under the named profile.
690
689
To create a 'mirai' task using a specific compute profile, specify the `.compute` argument to `mirai()`, which uses the 'default' compute profile if this is `NULL`.
691
690
692
691
Similarly, functions such as `status()`, `launch_local()` or `launch_remote()` should be specified with the desired `.compute` argument.
692
+
693
+
### 9. Random Number Generation
694
+
695
+
mirai employs L'Ecuyer-CMRG streams for random number generation. This is a widely-adopted, statistically-sound method deemed safe for parallel computation, and the same as that employed by base R's own parallel package.
696
+
697
+
Streams essentially cut into the RNG's period (a very long sequence of pseudo-random numbers) at intervals so far apart that they do not in practice overlap. This ensures that statistical results obtained from parallel computations remain correct and valid. The method of generating streams is recursive.
698
+
699
+
By default (when the `seed` argument to `daemons()` is `NULL`) mirai initiates a new stream for each daemon launched, in the same manner as base R. This guarantees that the results are statistically-sound, although it does not guarantee numerical reproducibility between parallel runs. Firstly, using a different number of workers would cause mirai to be sent to different workers. Secondly, when using dispatcher, mirai are sent dynamically to the next available daemon, and this is not guaranteed to be the same each time.
700
+
701
+
Supplying an explicit integer `seed` to `daemons()` turns on reproducible RNG. Instead of initiating a new stream for each daemon, a stream is now initiated for each `mirai()`. This is slightly computationally wasteful (although the effect on performance is negligible), but it does guarantee the same results across runs, regardless of the number of daemons used.
vignettes/v06-questions.Rmd
+5 -7 (5 additions & 7 deletions)
@@ -29,9 +29,8 @@ On the other hand, if your code previously used the `globals` argument to supply
29
29
Note that this would only work in the case of a named list and not the other forms that `globals` can take.
30
30
31
31
Regardless of using a `mirai()` or `future_promise()`, we recommend that you pass globals explicitly in production code.
32
-
This is as globals detection is never 100% perfect, and there is always some element of guesswork.
33
-
Edge cases can lead to unpredictable failures or silently incorrect results.
34
-
Explicit passing of variables allows for transparent and reliable behaviour, that remains completely robust over time.
32
+
This is because globals detection is never 100% perfect and always involves some element of guesswork, with edge cases leading to unpredictable results.
33
+
Explicit passing of variables allows for transparent and reliable behaviour, remaining robust over time.
35
34
36
35
**Capture globals using `environment()`:**
37
36
@@ -79,10 +78,10 @@ vec2 <- 4:6
79
78
# Returns different values: good
80
79
mirai_map(list(vec, vec2), \(x) rnorm(x))[]
81
80
#> [[1]]
82
-
#> [1] 0.2112876 0.9041800 0.7834014
81
+
#> [1] 0.38714685 0.09582403 0.85062845
83
82
#>
84
83
#> [[2]]
85
-
#> [1] -0.3150949 -1.5628536 -0.3860887
84
+
#> [1] 0.3188942 0.2086956 0.5288199
86
85
87
86
# Set the seed in the function
88
87
mirai_map(list(vec, vec2), \(x) {
@@ -113,8 +112,7 @@ The random seed is not reset after each mirai call to ensure that however many r
113
112
114
113
Hence normally, the random seed should be set once on the host process when daemons are created, rather than in each daemon.
115
114
116
-
If it is required to set the seed in each daemon, this should be done using an independent method and set each time random draws are required.
117
-
Another option would be to set the random seed within a local execution scope to prevent the global random seed on each daemon from being affected.
115
+
For numerical reproducibility, set the `seed` argument to `daemons()` (see the Random Number Generation section of the reference vignette for further details).
118
116
119
117
### 3. Accessing package functions during development