What is the correct way to handle options across workers when the option changes during .onLoad #786
Would appreciate any suggestions here! I have a package that requires a consistent temp directory across workers. This temp directory is created with tempdir() and stored as an option during .onLoad, but the option is not set on the future workers. So I was wondering what the best practice would be in this case?

My current solution is simply to pass the value as a function argument, which works fine, but I don't really want to expose it to users as an editable argument. See below for a very basic reprex that illustrates my issue and current solution. Thank you!

library(future.apply)
#> Loading required package: future
future::plan("multisession")
# My package sets an option for a tempdirectory during .onLoad
# it requires a consistent tempdir() across all future workers.
options("my_pkg_tempdir" = tempdir())
# obviously works fine in current session
getOption("my_pkg_tempdir")
#> [1] "/tmp/RtmpU1DWXB"

# example function that shows the issue:
function_requiring_an_option <- function(x) {
  paste(x, getOption("my_pkg_tempdir"))
}
future_lapply(
  "this should be a directory:",
  function_requiring_an_option
)
#> [[1]]
#> [1] "this should be a directory: "

# my current solution:
function_requiring_an_option2 <- function(x, o) {
  paste(x, o)
}
future_mapply(
  function_requiring_an_option2,
  "this should be a directory:",
  getOption("my_pkg_tempdir") # but I'd rather the user couldn't access this...
)
#> this should be a directory:
#> "this should be a directory: /tmp/RtmpU1DWXB"

Created on 2025-05-13 with reprex v2.1.1
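One way to keep the value out of the user-facing signature, while still passing it explicitly as in the current solution, is to read the option once in the calling session inside a package-level wrapper and set it again on the worker before the real work starts. The sketch below is only an illustration under that assumption: `run_with_pkg_tempdir()` is a hypothetical helper name, not part of future.apply, and it relies on `pkg_tmp` and `FUN` being picked up as globals of the anonymous function.

```r
library(future.apply)
plan(multisession, workers = 2)

# Stand-in for what the package would do in .onLoad
options(my_pkg_tempdir = tempdir())
expected_dir <- getOption("my_pkg_tempdir")

function_requiring_an_option <- function(x) {
  paste(x, getOption("my_pkg_tempdir"))
}

# Hypothetical package-level wrapper: users never see the option as an argument
run_with_pkg_tempdir <- function(X, FUN) {
  pkg_tmp <- getOption("my_pkg_tempdir")  # read in the calling session
  future_lapply(X, function(x) {
    # 'pkg_tmp' travels to the worker as a global of this closure
    options(my_pkg_tempdir = pkg_tmp)
    FUN(x)
  })
}

res <- run_with_pkg_tempdir(
  "this should be a directory:",
  function_requiring_an_option
)
print(res)

plan(sequential)  # shut down the background workers
```

The user only ever calls the wrapper, so the directory is not an editable argument, but every parallel entry point in the package has to go through a wrapper like this one.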
This is an interesting use case and I think you distilled the need quite well: "How can we pass on settings to parallel workers without having to instruct the end-user to do so?"

Right now I don't think there's another way than having the user call an "init" function in the future iterations. However, it is worth thinking about how this can be improved. I'm going to think out loud here ...

I don't think you're asking for additional arguments in the Future API (future(), future_map(), ...) to declare which settings to pass down. For example, I already have proposal #480 on how to explicitly pass R options and environment variables along with a future. But that would still require …

Now, the other day, independently of this, I was toying with the idea of a mechanism for registering "prologue" expressions that are prepended to every future. In your case that could work like:

options(my_pkg_tempdir = tempdir())
getOption("my_pkg_tempdir")
#> [1] "/tmp/alice/RtmpcGyypE"
## Append a prologue expression
future_prologue(bquote({
  ## Pass down R option 'my_pkg_tempdir' from parent to child
  options(my_pkg_tempdir = .(getOption("my_pkg_tempdir")))
}))

Futureverse would then inject this at the top of the future expression, as if you had manually written:

f <- future({
  ## Future prologue expressions
  options(my_pkg_tempdir = "/tmp/alice/RtmpcGyypE")
  ## Future expressions
  ...
})

If this already existed, your package could register the above future prologue expression on load. I guess there should also be a way to unregister it, e.g. when a package is unloaded. That said, I'm not sure that's the best approach to achieve this. At the least, the idea needs time to be digested and to mature.

Another, related idea might be to have custom hook functions that packages (e.g. your package) can register, e.g.

setHook("future.prologue", function() {
  options(my_pkg_tempdir = "/tmp/alice/RtmpcGyypE")
}, action = "append")

This would be all that you as a package developer would have to know, and the end-user would not have to be involved at all. Futureverse would then collect any hook functions and make sure they are called, on the worker, prior to evaluating the future expression. Gist:

future_expression <- bquote({
  ## getHook() returns a list of functions, so call each one in turn
  for (hook in .(getHook("future.prologue"))) hook()
  .(future_expression)
})

The advantage of this approach is that the hook mechanism already exists in R, so it doesn't add another function to the Future API and it's "only" a matter of coming up with a standard for hook names and for how and where they should be used. Ideally that could become a standard that all parallel frameworks could use.
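To make the hook idea concrete, here is a minimal base-R sketch of both sides. It rests on several assumptions: "future.prologue" is only the hook name floated in this discussion and nothing is assumed to call it today, `make_prologue()` and `with_prologue()` are hypothetical helpers, and the "worker" is simulated with a local `eval()` rather than a real parallel process.

```r
# Hypothetical hook name from the discussion; no framework is assumed to use it
hook_name <- "future.prologue"
setHook(hook_name, NULL, action = "replace")  # start from a clean slate

# Package side (e.g. in .onLoad): capture the directory in a closure
make_prologue <- function(dir) {
  force(dir)
  function() options(my_pkg_tempdir = dir)
}
pkg_tmp <- tempdir()
setHook(hook_name, make_prologue(pkg_tmp), action = "append")

# Framework side (hypothetical): prepend all registered hooks to an expression
with_prologue <- function(expr) {
  hooks <- getHook(hook_name)  # a list of functions, possibly empty
  bquote({
    for (hook in .(hooks)) hook()
    .(expr)
  })
}

# Simulate a worker where the option is not set
options(my_pkg_tempdir = NULL)
wrapped <- with_prologue(quote(paste("dir:", getOption("my_pkg_tempdir"))))
result <- eval(wrapped, envir = new.env())
print(result)

# Unregister, e.g. when the package is unloaded
setHook(hook_name, NULL, action = "replace")
```

Because the hook functions are embedded in the expression itself, they would be serialized along with it; the closure carries the parent's directory path, so nothing has to be looked up on the worker.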
I don't think you're asking for additional arguments in the Future API (future(), future_map(), ...) to declare which settings to pass down. For example, I already have proposal #480 on how to explicitly pass R options and environment variables along with a future. But that would still require …