forked from brianhigh/envh543
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathex0618prob2d.Rmd
More file actions
759 lines (579 loc) · 27.4 KB
/
ex0618prob2d.Rmd
File metadata and controls
759 lines (579 loc) · 27.4 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
---
title: "2-D Monte Carlo simulation of microbial exposure"
author: "Jane Pouzou and Brian High"
date: ''
output:
html_document:
toc: true
theme: united
keep_md: yes
self-contained: yes
---
## Introduction
This document offers three 2-D
[Monte Carlo](https://en.wikipedia.org/wiki/Monte_Carlo_method) probabilistic
solutions in R for the daily microbial exposure from drinking water consumption,
swimming in surface water and shellfish consumption for
[Example 6.18](images/ex0618.png) from pages
[215-216](https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781118910030.ch6#page=57) of:
[Quantitative Microbial Risk Assessment, 2nd Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1118145291,subjectCd-CH20.html)
by Charles N. Haas, Joan B. Rose, and Charles P. Gerba. (Wiley, 2014).
This is the copyright statement for the book:
> © Haas, Charles N.; Rose, Joan B.; Gerba, Charles P., Jun 02, 2014,
> Quantitative Microbial Risk Assessment Wiley, Somerset, ISBN: 9781118910528
The data for this example comes from this book, but the R code presented below
is an original work, released under a
[Creative Commons Attribution-ShareAlike 4.0 International License]( https://creativecommons.org/licenses/by-sa/4.0/).
Details may be found at the end of this document.
## Three Solutions
The first solution is coded manually with functionality built into "base" R.
The second and third solutions use the _mc2d_ package.
## Set global options
```{r}
# Set display options for use with the print() function.
options(digits = 5)
# Set knitr options for use when rendering this document.
library("knitr")
opts_chunk$set(cache=TRUE, message=FALSE)
```
## Load packages
Make sure the _mc2d_ package is installed before trying to use it. If we just
tried to load it with `library()` or `require()` before it had been installed,
our program would end with errors. (That would not be very friendly.)
```{r}
# Load one or more packages into memory, installing as needed.
load.pkgs <- function(pkgs, repos = "http://cran.r-project.org") {
result <- sapply(pkgs, function(pkg) {
if (!suppressWarnings(require(pkg, character.only = TRUE))) {
install.packages(pkg, quiet = TRUE, repos = repos)
library(pkg, character.only = TRUE, quietly = TRUE)}})
}
suppressMessages(load.pkgs(c("mc2d"))) # Or just use: library(mc2d)
```
## Perform a 2-D Monte Carlo simulation
### Define variables
1. Define the deterministic factors from the microbial exposure example.
2. Define the _seed_ used for random sampling to enable reproducible results.
This variable will be used later to set the seed using the `set.seed()`
function before random sampling.
3. Define the number of random samples to be drawn from probability
distrubutions in the two dimensions (variability and uncertainty) of the
simulation.
```{r}
# Values from Example 6.18 from Quantitative Microbial Risk Assessment,
# 2nd Edition by Charles N. Haas, Joan B. Rose, and Charles P. Gerba.
# (Wiley, 2014), pp. 215-216.
sf.viral.load <- 1 # Shellfish viral loading (viruses/g)
dw.viral.load <- 0.001 # Drinking water viral loading (viruses/L)
sf.cons.g <- 0.135 # Shellfish consumption (viruses/day)
sw.viral.load <- 0.1 # Swimming in viral loading (viruses/L)
sw.frequency <- 7 # Swimming frequency (swims/year)
# Define an integer to use when setting the seed of the random number generator.
seed <- 1
# Define the number of samples for the variability and uncertainty dimensions.
nsv <- 5000
nsu <- 250
```
### Sample from distributions reflecting uncertainty
In our previous [1-D Monte Carlo simulation](ex0618prob.md), we used a point
estimate for the ingestion rate of surface water while swimming. Now, for our
2-D simulation, we will represent the uncertainty around this estimate with a
normal distribution.
For the uncertainty dimension, generate 250 random samples from a truncated
normal distribution to estimate ingestion rate (IR) in mL of surface water while
swimming. We want to make sure our values are positive, so we will draw from
a normal distribution which is truncated at lower truncation point of 0.
```{r}
# Define a function to randomly draw from a truncated normal distribution.
# From: Author = "Dason"; URL = http://stackoverflow.com/questions/19343133/
rtnorm <- function(n, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
qnorm(runif(n, pnorm(lower, mean, sd), pnorm(upper, mean, sd)), mean, sd)
}
# The point estimate of 50 mL/hr for ingestion of surface water while swimming
# is taken from the QMRA textbook (Haas et al. 2014), page 215. We set it here
# as the mean. The standard deviation value of 45 used here is fictitious.
set.seed(seed)
sw.d.IR <- rtnorm(nsu, mean = 50, sd = 45, lower = 0)
```
Plot the kernel density estimates for surface water ingestion rate.
```{r kernel-density-plot}
plot(density(sw.d.IR))
```
### Define exposure risk function
Define a risk function to calculate estimated microbial exposure.
```{r}
Risk.fcn <- function(sf.vl, sf.cons.g, dw.cons.L, dw.vl,
sw.vl, sw.daily.IR, sw.duration, sw.frequency) {
((sf.vl * sf.cons.g) + (dw.cons.L * dw.vl) +
((sw.vl * (sw.daily.IR * sw.duration * sw.frequency)) / 365 / 1000))
}
```
### Run a 2-D MC simulation
Run the simulation using a nested loop structure of 5000 iterations (in the
variability dimension) performed by the `sapply()` function for each of 250
iterations (in the uncertainty dimension) of a `for()` loop.
Specifically, within the `for()` loop:
- Sample from the distributions reflecting variability:
1. Drinking water consumption in liters/day (log-normal distribution)
2. Swimming duration in hours (discrete distribution)
- Compute 5000 daily exposure estimates and store as a vector in a matrix.
The results of each iteration will be accumulated in a 5000 row by 250 column
matrix.
```{r}
# Define an empty matrix to hold the simulation results.
Risk.mat <- matrix(as.numeric(NA), nrow = nsv, ncol = nsu)
# Loop through the uncertainty dimension.
for (j in 1:nsu) {
# 1. Generate 5000 random samples from a log-normal distribution to estimate
# exposure from consumption of drinking water (ml/day). Divide by 1000
# mL/L to get consumption in liters/day. Values for meanlog and sdlog
# are from the QMRA textbook (Haas et al. 2014), page 216, Table 6.30.
set.seed(seed)
dw.cons.L <- rlnorm(n = nsv, meanlog = 7.49, sdlog = 0.407) / 1000
# 2. Sample 5000 times from a discrete distribution of swim duration with
# assigned probabilities of each outcome. These values are hypothetical
# and are not found in the text, but are defined here to provide an
# example of sampling from a discrete distribution.
set.seed(seed)
swim.duration <- sample(x = c(0.5, 1, 2, 2.6), size = nsv,
replace = TRUE, prob = c(0.1, 0.1, 0.2, 0.6))
# Loop through the variability dimension.
# Compute 5000 daily simulations and store as a vector in a matrix.
Risk.mat[, j] <- sapply(1:nsv, function(i)
# Define a function to calculate esimated microbial exposure risk.
Risk.fcn(dw.cons.L = dw.cons.L[i],
sw.duration = swim.duration[i],
sf.vl = sf.viral.load,
dw.vl = dw.viral.load,
sf.cons.g = sf.cons.g,
sw.vl = sw.viral.load,
sw.daily.IR = sw.d.IR[j],
sw.frequency = sw.frequency))
}
```
### Summarize results
Since this is a 2-D simulation, we can report the uncertainty in our point
estimate of the microbial exposure risk.
We will use the 50th percentile (median) of the means of each iteration in the
uncertainty dimension as our point estimate. The 2.5th and 97.5th percentiles
are used to establish a 95% confidence interval (CI95). We will also report the
mean of the means.
```{r}
# Report the mean and median of the means with a 95% confidence interval (CI95).
mean.risk <- sapply(1:nsu, function(j) mean(Risk.mat[, j]))
mean(mean.risk)
quantile(mean.risk, probs = seq(0, 1, 0.025))[c("2.5%", "50%", "97.5%")]
```
Next, we plot the empirical cumulative distribution function (ECDF) of the risk
estimate.
Find the median for each row of the risk matrix using the
`rowMedians()` function (from the _miscTools_ package), calculate the ECDF with
the `ecdf()` function (from the _stats_ package), and then plot with the
`plot()` function (from the _graphics_ package).
```{r ecdf-plot-risk-mat-basic}
load.pkgs(c("miscTools"))
plot(ecdf(rowMedians(Risk.mat)))
```
We can plot more quantiles, but it takes a little more work. We will need to
store the quantiles for each row in a data frame. We begin constructing our
plot with the ECDF of the median (50% percentile) using `plot()` and `ecdf()`.
Then we add a line for the ECDF of each of the other quantiles using
`mapply()` and `lines()`.
```{r ecdf-plot-risk-mat-quantiles}
# Construct a data frame of quantiles and another data frame for their ECDFs.
probs <- c(0.025, 0.25, 0.50, 0.75, 0.975)
quant <- as.data.frame(t(apply(Risk.mat, 1, quantile, probs = probs)))
ecdfs <- sapply(names(quant), function(q) ecdf(quant[[q]]))
# Plot the ECDF of the median first to create the border, scales and labels.
plot(ecdfs[['50%']], main = '')
# Plot a line for each of the quantiles, using shades of gray for line colors.
grays <- c('gray75', 'gray35', 'black', 'gray35', 'gray75')
lines <- mapply(function(e, g) lines(e, col = g), ecdfs, grays)
```
Alternatively, we can produce a similar plot with _ggplot2_. First we need to
reshape the data frame with `melt()` (from the _reshape_ package), then we can
plot with `ggplot()`.
```{r ecdf-plot-risk-mat-ggplot2}
load.pkgs(c("reshape", "ggplot2"))
# Reshape the "wide" format of the data frame to a "long" format with `melt()`.
quant.melt <- suppressMessages(melt(quant))
names(quant.melt) <- c('q', 'x')
# Calculate ECDFs and plot lines for them, using our custom gray palette.
ggplot(quant.melt, aes(x = x)) + theme_bw() + theme(legend.position = 'none') +
geom_hline(yintercept = c(0, 1), linetype = "dashed", color = 'gray') +
stat_ecdf(aes(group = q, color = q)) + xlab('x') + ylab('Fn(x)') +
scale_colour_manual(values = grays)
```
We can also use the `Ecdf()` plotting function from the _Hmisc_ package.
```{r ecdf-plot-risk-mat-hmisc-ecdf}
load.pkgs(c("Hmisc"))
Ecdf(x = quant.melt$x, group = quant.melt$q, col = grays,
label.curves = FALSE, xlab = 'x', ylab = 'Fn(x)', subtitles = FALSE)
abline(h = 0:1, col = "gray", lty = 2)
```
Finally, we can produce this same sort of plot by using the `plot.mcnode()`
function of the [mc2d](https://cran.r-project.org/web/packages/mc2d/index.html)
package. With this, we may easily compare with plots from later examples. This
function requires us to convert our matrix to a "mcnode", which we will do
using the `mcdata()` function. This plotting approach requires far less coding
than the two previous plots, but does not offer quite as much control over how
the plot is displayed.
```{r ecdf-plot-risk-mat}
# Plot the empirical cumulative distribution using the mc2d package.
expo.mc <- mcdata(Risk.mat, type = 'VU', nsv = nsv, nsu = nsu)
plot(expo.mc) # This actually calls plot.mcnode().
```
## Repeat the simulation with mc2d
We will demonstrate two alternative ways to run 2-dimensional Monte Carlo
simulation using the
[mc2d](https://cran.r-project.org/web/packages/mc2d/index.html) package.
For our first alternative, we will use
[mcmodel](http://www.inside-r.org/packages/cran/mc2d/docs/mcmodel) and
[evalmcmod](http://www.inside-r.org/packages/cran/mc2d/docs/evalmcmod) from
the [mc2d](https://cran.r-project.org/web/packages/mc2d/index.html) package.
This is a simple way to implement the 2-D simulation because we just need to
define the model with `mcmodel()` and evaluate it with `evalmcmod()`.
### Load packages
We did this earlier in the document, but will add it again here as a reminder of
what packages are needed for this example.
```{r warning=FALSE}
# Load packages.
library(mc2d)
```
### Define variables
Set the number of simulations for the variability and uncertainty dimensions.
```{r message=FALSE}
ndvar(5000) # Variability
ndunc(250) # Uncertainty
```
The _mc2d_ functions will set `nsv` to `ndvar()` and `nsu` to `ndunc()` by
default. Alternatively, we could supply these values to the _mc2d_ model
functions when we call them.
Define the variables we will use to set the `seed` for random sampling and the
number of `digits` for `print()` statements.
```{r}
seed <- 1
digits <- 5
```
We will use this variable to explicitly set the seed with the various
_mc2d_ functions. Another approach would be to only set it through the
model evaluation functions, or not at all. Since we want to do our best
to provide the most reproducible results, we set the seed explicitly.
### Define exposure model
Within the [mcmodel](http://www.inside-r.org/packages/cran/mc2d/docs/mcmodel)
function, use [mcstoc](http://www.inside-r.org/packages/cran/mc2d/docs/mcstoc)
to define "mc nodes" for each component of the model.
For each stochastic variable, use the `mcstoc()` function to create a
"mcnode", supply a probability function, the node type as "V" for variability
or "U" for uncertainty, and any additional parameters to be passed to the
probability function.
For deterministic variables, create nodes with the `mcdata()` function
using the "0" type.
We will model the deterministic factors as uniform probablity distributions.
The last statement makes an "mc" object using the
[mc](http://www.inside-r.org/packages/cran/mc2d/docs/mc) function.
```{r}
# Define an exposure model for evaluation by evalmcmod().
expo.mod1 <- mcmodel({
# Values from Example 6.18 from Quantitative Microbial Risk Assessment,
# 2nd Edition by Charles N. Haas, Joan B. Rose, and Charles P. Gerba.
# (Wiley, 2014), pp. 215-216. Other fictitious values are noted below.
# Shellfish viral loading (viruses/g):
sf.vl <- mcdata(1, type = "0")
# Shellfish consumption (g/day):
sf.cons.g <- mcdata(0.135, type = "0")
# Drinking water viral loading (viruses/L):
dw.vl <- mcdata(0.001, type = "0")
# Drinking water consumption (L/day):
dw.cons.L <- mcstoc(rlnorm, type = "V", seed = seed,
meanlog = 7.49, sdlog = 0.407) / 1000
# Swimming in surface water viral loading (viruses/L):
sw.vl <- mcdata(0.1, type = "0")
# Swimming daily ingestion rate (mL/hour): fictitious sd = 45
sw.daily.IR <- mcstoc(rnorm, type = "U", seed = seed,
mean = 50, sd = 45, rtrunc = TRUE, linf = 0)
# Swimming duration (hours): fictitious discrete distribution
sw.duration <- mcstoc(rempiricalD, type = "V", seed = seed,
values = c(0.5, 1, 2, 2.6),
prob = c(0.1, 0.1, 0.2, 0.6))
# Swimming frequency (swims/year):
sw.frequency <- mcdata(7, type = "0")
# Estimate the exposure using the 0, V and U nodes to create a VU node.
expo.mc1 <- (sf.vl * sf.cons.g) + (dw.vl * dw.cons.L) +
((sw.vl * (sw.daily.IR * sw.duration * sw.frequency)) / 365 / 1000)
# Build a mc model from all of the mcnode objects.
mc(sf.vl, sf.cons.g, dw.vl, dw.cons.L, sw.vl, sw.daily.IR,
sw.duration, sw.frequency, expo.mc1)
})
```
### Evaluate the model
Evaluate the model with 5000 iterations in the variability dimension and 250
iterations in the uncertainty dimension, as set previously.
```{r}
expo.ev1 <- evalmcmod(expo.mod1, seed = seed)
print(expo.ev1, digits = digits)
```
### Summarize results
Print a summary and a plot of the evaluation results (`expo.ev1`).
```{r results-ev1}
# Summarize the results.
summary(expo.ev1)
# Plot the results.
plot(expo.ev1)
```
Report the mean and median of the means with a 95% confidence interval (CI95).
```{r}
mean.risk1 <- sapply(1:ndunc(), function(j) mean(expo.ev1$expo.mc1[, j, ]))
mean(mean.risk1)
quantile(mean.risk1, probs = seq(0, 1, 0.025))[c("2.5%", "50%", "97.5%")]
```
Plot the empirical cumulative distribution function (ecdf) of the exposure model
(`expo.mc1`) estimates.
```{r plot-mc1}
# Generate an "ecdf" plot. This actually calls plot.mcnode().
plot(expo.ev1$expo.mc1)
```
As before, we can also plot with other functions like `ggplot()`.
```{r plot-mc1-ggplot}
# Create the risk matrix from the mcnode.
Risk.mat <- expo.ev1$expo.mc1[,,]
# Calculate quantiles.
probs <- c(0.025, 0.25, 0.50, 0.75, 0.975)
quant <- as.data.frame(t(apply(Risk.mat, 1, quantile, probs = probs)))
# Define a palette of shades of gray for plotting quantiles.
grays <- c('gray75', 'gray35', 'black', 'gray35', 'gray75')
# Reshape from "wide" to "long" format.
quant.melt <- suppressMessages(melt(quant))
names(quant.melt) <- c('q', 'x')
# Calculate ECDFs and plot lines for them, using our custom gray palette.
ggplot(quant.melt, aes(x = x)) + theme_bw() + theme(legend.position = 'none') +
geom_hline(yintercept = c(0, 1), linetype = "dashed", color = 'gray') +
stat_ecdf(aes(group = q, color = q)) + xlab('x') + ylab('Fn(x)') +
scale_colour_manual(values = grays)
```
## Compare manual and mc2d simulations
How do our results from mc2d compare with the "manual" method we tried first?
You can check the means, medians, etc. that were reported above, and you should
find that they match. How is this possible with probabalistic methods? You might
think that with more decimal places we might see a difference, but rounding of
the printed values is not the explanation.
We used _pseudo-random_ number generation by setting the same _seed_ before
random sampling from our model's probability distributions. So, this means that
all of the random input data should match. Since we used the same mathematical
model, the same parameters, and the same data, all of the simulated estimates
should match, too. But how will we check all 250 * 5000 = 1,250,000 numbers?
We can compare with the `identical()` function, which checks all values to
all stored decimal places, not just the ones displayed with `print()`. A result
of `TRUE` means they match.
```{r}
identical(expo.mc, expo.ev1$expo.mc1) # Should be "TRUE" if same seed was used.
```
If the results had not been identical, you could view the differences with:
```{r}
# Print results to 20 decimal places to help spot even small differences.
differences <- which(expo.mc != expo.ev1$expo.mc1)
sum(differences)
if (sum(differences) > 0) {
print(head(expo.mc[differences]), digits = 20)
print(head(expo.ev1$expo.mc1[differences]), digits = 20)
}
```
## Repeat 2-D simulation again with an `mccut` loop
This time we will use another
[pair of functions](http://www.inside-r.org/packages/cran/mc2d/docs/mccut)
from the [mc2d](https://cran.r-project.org/web/packages/mc2d/index.html)
package, `mcmodelcut()` and `evalmccut()`, to get a different
style of summary output. They implement the uncertainty dimension of the
2-D simulation with a processing _loop_.
By using these [mccut](http://www.inside-r.org/packages/cran/mc2d/docs/mccut)
functions, we will also be able to conserve system memory, at the cost of
taking more time to perform the simulation. This approach allows the evaluation
of very high dimensional models on relatively modest computer systems.
### Load packages
We did this earlier in the document, but will add it again here as a reminder of
what packages are needed for this example.
```{r warning=FALSE}
# Load packages.
library(mc2d)
```
Set the number of simulations for the variability and uncertainty dimensions.
Again, we did this before, but want to include it with this example, explicitly.
```{r message=FALSE}
ndvar(5000) # Variability
ndunc(250) # Uncertainty
```
### Define variables
Define the variables we will use to set the `seed` for random sampling and the
number of `digits` for `print()` statements.
```{r}
seed <- 1
digits <- 5
```
### Define exposure model
The `mcmodelcut()` function expects its input in three blocks:
```
mcmodelcut({
# Block 1: Evaluate all of the 0, V and U mc node objects.
{
# This block is evaluated once before the first loop (step 1).
}
# Block 2: Evaluate all of the VU nodes. Last statement makes an mc object.
{
# This block is evaluated using nsu = 1 (step 2).
}
# Block 3: Build a list of statistics refering to the mc object.
{
# This block is repeated nsu times (step 3).
}
})
```
... where `nsu` is the number of simulations for uncertainty used in the
evaluation.
```{r}
# Build a mcmodelcut object in three blocks for evaluation by evalmccut().
expo.mcmcut <- mcmodelcut({
# Block 1: Evaluate all of the 0, V and U mc node objects.
{
# Values from Example 6.18 from Quantitative Microbial Risk Assessment,
# 2nd Edition by Charles N. Haas, Joan B. Rose, and Charles P. Gerba.
# (Wiley, 2014), pp. 215-216. Other fictitious values are noted below.
# Shellfish viral loading (viruses/g):
sf.vl <- mcdata(1, type = "0")
# Shellfish consumption (g/day):
sf.cons.g <- mcdata(0.135, type = "0")
# Drinking water viral loading (viruses/L):
dw.vl <- mcdata(0.001, type = "0")
# Drinking water consumption (L/day):
dw.cons.L <- mcstoc(rlnorm, type = "V", seed = seed,
meanlog = 7.49, sdlog = 0.407) / 1000
# Swimming in surface water viral loading (viruses/L):
sw.vl <- mcdata(0.1, type = "0")
# Swimming daily ingestion rate (mL/hour): fictitious sd = 45
sw.daily.IR <- mcstoc(rnorm, type = "U", seed = seed,
mean = 50, sd = 45, rtrunc = TRUE, linf = 0)
# Swimming duration (hours): fictitious discrete distribution
sw.duration <- mcstoc(rempiricalD, type = "V", seed = seed,
values = c(0.5, 1, 2, 2.6),
prob = c(0.1, 0.1, 0.2, 0.6))
# Swimming frequency (swims/year):
sw.frequency <- mcdata(7, type = "0")
}
# Block2: Evaluate all of the VU nodes. Last statement makes an mc object.
{
# Estimate the exposure using the 0, V and U nodes to create a VU node.
expo.mc2 <- (sf.vl * sf.cons.g) + (dw.vl * dw.cons.L) +
((sw.vl * (sw.daily.IR * sw.duration * sw.frequency)) / 365 / 1000)
# Build a mc model from all of the mcnode objects.
expo.mod2 <- mc(sf.vl, sf.cons.g, dw.vl, dw.cons.L,
sw.vl, sw.daily.IR, sw.duration, sw.frequency, expo.mc2)
}
# Block 3: Build a list of statistics refering to the mc object.
{
list(sum = summary(expo.mod2), plot = plot(expo.mod2, draw = FALSE))
}
})
```
### Evaluate the model
Evaluate the model with 5000 iterations in the variability dimension and
250 iterations in the uncertainty dimesion. Save the evaluation results as
`expo.ev2`.
Since the `evalmccut()` function produces a lot of text output that we do not
want in our report, we capture the text output with the `capture.output()`
function and print a summary when finished.
```{r}
capture.output(expo.ev2 <- evalmccut(expo.mcmcut, seed = seed))
```
### Summarize results
Print the accumulated statistics with `summary()` and `plot()`.
```{r results-ev2}
# Print a summary
summary(expo.ev2)
# Plot the mccut object. This actually calls plot.mccut().
plot(expo.ev2)
```
Report the mean and median of the means with a 95% confidence interval (CI95).
```{r}
mean.risk2 <- expo.ev2$sum$expo.mc2[, , "mean"]
mean(mean.risk2)
quantile(mean.risk2, probs = seq(0, 1, 0.025))[c("2.5%", "50%", "97.5%")]
```
We would like to plot the empirical cumulative distribution function
of the risk estimate, as we did before. But this example only provides us with
summarized data. This gives us the 1001 quantiles to one decimal place for each
of the nsu=250 uncertainty estimates. So, we will feed this into the `mcdata()`
function and plot the resulting `mcnode` object to approximate the ecdf plots
we have produced for the previous 2-D Monte Carlo examples in this document,
just for comparison. You will note that we transpose the dataset to get the
array dimensions in the order expected by `mcdata()` and we set `nsv='1001'`
to match the dimensions of this dataset.
```{r plot-mc2d}
expo.q <- expo.ev2$plot$expo.mc2 # q = Quantiles
expo.qt <- aperm(expo.q, c(3, 2, 1)) # t = Transposed
expo.mc2d <- mcdata(expo.qt, type='VU', nsv='1001', nsu='250')
plot(expo.mc2d)
```
## A look inside an `mcnode` object
You may have wondered, "What is an `mcnode` anyway?" It is an object of class
`mcnode` containing an array and a list of attributes. We can make one manually
and compare it with one made with `mcdata()`. Let's try this with our transposed
quantile array.
First, let's examine the structure of the array object with the `class`
and `str()` functions.
```{r}
class(expo.qt)
str(expo.qt)
```
So, this is a three-dimensional numerical array with three a list of
three `dimnames` as its only attribute.
To make this into a `mcnode` object, we assign the `mcnode` class and two
attributes (`type` and `outm`). We also remove the `dimnames` list attribute.
```{r}
class(expo.qt) <- 'mcnode'
attr(expo.qt, 'dimnames') <- NULL
attr(expo.qt, 'type') <- 'VU'
attr(expo.qt, 'outm') <- 'each'
```
The same "ecdf" plot that we just made previously can now be created by
plotting the new `mcnode` object made from the transposed quantile array.
```{r plot-expo-qt}
plot(expo.qt)
```
The two plots are identical because the two objects from which they were made
are identical.
```{r}
class(expo.mc2d)
str(expo.mc2d)
class(expo.qt)
str(expo.qt)
```
Our procedure for manually creating a `mcnode` object actually produced an
object exactly identical in every way to the one made with `mcdata()`. We can
verify this with the `identical()` function.
```{r}
identical(expo.mc2d, expo.qt)
```
We do not recommend creating objects manually like this, but gaining a little
understanding of what objects are made of helps to dispel some of the mystery.
## License
Except where noted otherwise, this work is licensed under a
[Creative Commons Attribution-ShareAlike 4.0 International License]( https://creativecommons.org/licenses/by-sa/4.0/).
### You are free to:
* Share — copy and redistribute the material in any medium or format
* Adapt — remix, transform, and build upon the material for any purpose, even
commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.
### Under the following terms:
* Attribution — You must give appropriate credit, provide a link to the license,
and indicate if changes were made. You may do so in any reasonable manner, but
not in any way that suggests the licensor endorses you or your use.
* ShareAlike — If you remix, transform, or build upon the material, you must
distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological
measures that legally restrict others from doing anything the license permits.
### Notices:
* You do not have to comply with the license for elements of the material in the
public domain or where your use is permitted by an applicable exception or
limitation.
* No warranties are given. The license may not give you all of the permissions
necessary for your intended use. For example, other rights such as publicity,
privacy, or moral rights may limit how you use the material.