Description
ragg::agg_record() and ragg::agg_capture() advance R’s global RNG state by calling sample() to generate a device name. As a result, .Random.seed changes during graphics device initialization.
This arguably breaks reproducibility guarantees when these functions are used as infrastructure, for example by IRkernel in Jupyter notebooks.
Minimal example
In a Jupyter / IRkernel context (but reproducible via direct calls):
set.seed(1)
s1 <- .Random.seed
# Device initialization triggers RNG use
ragg::agg_record()
dev.off()
s2 <- .Random.seed
identical(s1, s2)
# FALSE
The RNG advance occurs here:
paste0("agg_record_", sample(.Machine$integer.max, 1))
and here:
paste0('agg_capture_', sample(.Machine$integer.max, 1))
on lines 430 and 507 of agg_dev.R (and potentially elsewhere?)
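One possible direction, sketched here only as an illustration and not as ragg's actual code, is to save and restore .Random.seed around the name generation so the caller's RNG stream is left untouched. The helper name with_preserved_seed() is hypothetical; it uses base R only.

```r
# Hypothetical helper (not part of ragg): evaluate `expr`, then put the
# global RNG state back to what it was before the call.
with_preserved_seed <- function(expr) {
  genv <- globalenv()
  if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
    old_seed <- get(".Random.seed", envir = genv, inherits = FALSE)
    # Restore the saved state when this function exits
    on.exit(assign(".Random.seed", old_seed, envir = genv), add = TRUE)
  } else {
    # No seed existed yet: remove any seed that `expr` creates
    on.exit(
      if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
        rm(list = ".Random.seed", envir = genv)
      },
      add = TRUE
    )
  }
  expr  # forced here, after the on.exit handlers are registered
}

# Usage: the same naming expression as in agg_dev.R, wrapped in the helper
set.seed(1)
s1 <- .Random.seed
device_name <- with_preserved_seed(
  paste0("agg_record_", sample(.Machine$integer.max, 1))
)
s2 <- .Random.seed
identical(s1, s2)
```

With this wrapper the device name is still random, but the state seen by user code should be the same before and after the call. Whether this fits ragg's design is for the maintainers to judge.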
Unintuitive behavior
- agg_record() is now used implicitly as a recording device (for example by IRkernel / evaluate in Jupyter notebooks)
- This causes .Random.seed to advance between notebook cells, even for deterministic code
- Running the same code via Rscript does not exhibit this behavior
- This creates unintuitive differences between running code in a notebook versus converting the notebook to a script and running it, due solely to differences in RNG state
It is unclear whether this is an intentional design decision, and whether users should expect code to behave differently in a Jupyter notebook versus a script. However, this makes the placement of cell breaks in a notebook affect RNG state, which is quite unintuitive.
For example, a single cell:
set.seed(1234)
runif(1)
behaves differently than two sequential cells:
# cell 1
set.seed(1234)
# cell 2
runif(1)
This is likely confusing for typical users.
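The cell-boundary effect can be approximated without Jupyter or ragg. The sketch below assumes that the hidden device setup between cells has the same RNG side effect as one sample() call; simulate_device_init() is a hypothetical stand-in, not an IRkernel or ragg function.

```r
# Hypothetical stand-in for the device setup that runs between cells.
# Assumption: its only RNG side effect is a single sample() call.
simulate_device_init <- function() {
  invisible(paste0("agg_record_", sample(.Machine$integer.max, 1)))
}

# Single cell: seed and draw with nothing in between
set.seed(1234)
single_cell <- runif(1)

# Two cells: the simulated device setup runs between seed and draw
set.seed(1234)
simulate_device_init()
two_cells <- runif(1)

# The two draws come from different points in the RNG stream
identical(single_cell, two_cells)
```

In a plain Rscript run there is no intermediate sample() call, which is consistent with the notebook-versus-script difference described above.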
Environment
- ragg: 1.5.0
- IRkernel: 1.3.2
- R: 4.5.2