touchups

philchalmers · philchalmers · commit 2d23601b8e8a · 2026-03-02T09:16:28.000-05:00
diff --git a/vignettes/Catch_errors.Rmd b/vignettes/Catch_errors.Rmd
@@ -148,7 +148,7 @@ runSimulation(Design, replications = 100, load_seed=picked_seed, debug='analyse-
 
 The `.Random.seed` state will be loaded at this exact state, and will always be repeated at this state as well (in case `c` is typed in the debugger, or somehow the error is harder to find while walking through the debug mode). Hence, users must type `Q` to exit the debugger after they have better understood the nature of the error message first-hand, after which the `load_seed` input should be removed. 
 
-# Converting warings to errors explicitly via `manageWarnings()`
+# Converting warnings to errors explicitly via `manageWarnings()`
 
 ```{r include=FALSE}
 knitr::opts_chunk$set(
diff --git a/vignettes/Fixed_obj_fun.Rmd b/vignettes/Fixed_obj_fun.Rmd
@@ -157,4 +157,4 @@ res <- runSimulation(Design, replications, verbose=FALSE, fixed_objects=fixed_ob
                      parallel = TRUE)
 ```
 
-Again, remember that this is only required for R **objects**, NOT for user-defined or third-party functions!
+Again, remember that this is only required for **R objects that are NOT user-defined functions or third-party functions**!
diff --git a/vignettes/HPC-computing.Rmd b/vignettes/HPC-computing.Rmd
@@ -13,8 +13,10 @@ output:
       smooth_scroll: false
 vignette: >
   %\VignetteIndexEntry{HPC cluster array jobs (e.g., via Slurm)}
-  %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
+  %\VignetteEngine{knitr::rmarkdown}
+editor_options: 
+  chunk_output_type: console
 ---
 
 ```{r nomessages, echo = FALSE}
@@ -121,6 +123,7 @@ While generally effective at distributing the computational load, there are a fe
 5) Submitting independent multiple batches to the cluster makes it more difficult to guarantee the quality of the random numbers
     - Setting the `seed` for each condition ensures that within each `design` condition the random numbers are high quality; however, there is no guarantee that repeated use of `set.seed()` will result in high-quality random numbers (see next section for example)
     - Hence, repeated job submissions of this type, even with unique seeds per condition, may not generate high quality numbers if repeated too many times (alternative is to isolate each `design` row and submit each row as a unique job, which is demonstrated near the end of this vignette)
+    - Note that workarounds exist in more recent `runSimulation()` implementations when combined with `expandDesign()` (see below); however, given the nature of the complete job submission, these remain less ideal
 6) Finally, and perhaps most problematic in simulation experiment applications, schedulers frequently cap the maximum number of resources that can be requested (e.g., 256 GB of RAM, 200 CPUs), which limits the application of large RAM and CPU jobs
     - Note that to avoid wasting time by swapping/paging, schedulers will never allocate jobs whose memory requirements exceed the amount of available memory 
     
@@ -166,7 +169,8 @@ print(Design300, show.IDs = TRUE)
 rep_target <- 10000
 
 # replications per row in Design300
-replications <- rep(rep_target  / rc, nrow(Design300))
+replications <- expandReplications(rep_target, repeat_conditions = rc)
+replications
 ```
 The above approach assumes that each `design` condition is equally balanced in terms of computing time and resources. If this is not the case (e.g., the last condition contains notably higher computing times than the first two conditions), then `repeat_conditions` can be specified as a vector instead, such as `repeat_conditions = c(100, 100, 1000)`, in which case the last condition would be associated with 10 replications per distributed node instead of 100.
 
@@ -176,8 +180,9 @@ DesignUnbalanced <- expandDesign(Design, repeat_conditions = rc)
 DesignUnbalanced
 
 rep_target <- 10000
-replicationsUnbalanced <- rep(rep_target / rc, times = rc)
+replicationsUnbalanced <- expandReplications(rep_target, rc)
 head(replicationsUnbalanced)
+tail(replicationsUnbalanced)
 table(replicationsUnbalanced)
 ```
 
@@ -298,7 +303,7 @@ rc <- 100
 Design300 <- expandDesign(Design, repeat_conditions = rc)
 
 rep_target <- 10000
-replications <- rep(rep_target / rc, nrow(Design300))
+replications <- expandReplications(rep_target, repeat_conditions = rc)
 
 # genSeeds() # do this once on the main node/home computer, and store the number!
 iseed <- 1276149341
diff --git a/vignettes/SimDesign-intro.Rmd b/vignettes/SimDesign-intro.Rmd
@@ -214,7 +214,18 @@ As can be seen from the printed results from the `res` object, each result from
 respective condition, meta-statistics have been properly named, and three additional columns have been appended
 to the results: `REPLICATIONS`, which indicates how many times the conditions were performed, `SIM_TIME`, indicating
 the time (in seconds) it took to completely finish the respective conditions, and `SEED`, which indicates the random seeds used by `SimDesign` for each condition (for reproducibility). A call to `View()` in the 
-R console may also be a nice way to sift through the `res` object.
+R console may also be a nice way to sift through the `res` object, while for the 
+`results` object one can use functions like `descript()` and verbs from the `dplyr` package to get a better understanding of the distributions.
+
+```{r}
+# summary statistics for complete results
+descript(results)
+
+# conditional summary statistics using dplyr verbs
+results |> dplyr::group_by(sample_size, distribution) |> 
+    descript()
+```
+
 
 ## Interpreting the results