furrr::future_map_df, max.concurrent.jobs for torque cluster
#744
Hi,

    result <- future_map_dfr(chunks, ~fitfunction(.x, ...))

The batchtools option of max.concurrent.jobs appears to address this issue, but I can't seem to set it via plan or config file, and haven't found any examples.

Attempts:

    max.concurrent.jobs <- 200

Is that the right format?

Adding registry to the plan call:

    plan(batchtools_torque, workers=10000, template="batchtools.pbs.tmpl",
         resources=list(nodes="1:ppn=4", walltime="2:00:00", pmem="6G"),
         registry=list(max.concurrent.jobs = 30L))

I can see the submitJobs code is doing:

So the registry looks like it is the right place to put this, but what is the right way of doing this from

I appreciate any advice
Replies: 1 comment
I don't know whether that's correct or if it'll work. I simply don't know enough of the ins and outs of batchtools to say something useful. I think it's best to ask on https://github.com/mlr-org/batchtools/issues to see how it should work and if it works.

BTW, there's also mlr-org/batchtools#235, which says "batchtools will try to submit any jobs passed to submitJobs() as fast as possible (but sequentially)". That one is from 2019, so it could be that it is no longer valid and should be closed.

An alternative, second best, approach is to limit the total number of workers that the future ecosystem sees:

With this approach,

Now, if you don't want to chunk up

    result <- future_map_dfr(chunks, ~fitfunction(.x, ...), .options = furrr_options(chunk_size = 1L))

That will resolve each iterator one-by-one in a separate future, which means one job per iteration to the job scheduler. Because we set

FWIW, using the new futurize package (2026-01-22), you can do:

    plan(batchtools_torque, workers=200, template="batchtools.pbs.tmpl", ...)
    result <- purrr::map_dfr(chunks, ~fitfunction(.x, ...)) |> futurize(chunk_size = 1L)

to achieve the exact same thing.
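To make the job arithmetic concrete, here is a small base-R sketch of how many futures (and so how many scheduler jobs) the two chunking modes would create. It is not furrr's actual implementation: it assumes furrr's default scheduling of roughly one chunk per worker when chunk_size is not set, and the helper name count_futures is hypothetical.

```r
# Hypothetical helper, for illustration only: estimate the number of futures
# (one scheduler job each) created for `n_elements` inputs.
# Assumption: without chunk_size, elements are split into one chunk per worker.
count_futures <- function(n_elements, workers, chunk_size = NULL) {
  if (is.null(chunk_size)) {
    # Default chunking: at most `workers` chunks, each holding many elements.
    return(min(n_elements, workers))
  }
  # Explicit chunk_size: about n_elements / chunk_size futures in total.
  ceiling(n_elements / chunk_size)
}

# 10000 elements, plan(..., workers = 200), default chunking:
count_futures(10000, workers = 200)                  # 200

# Same plan, but chunk_size = 1L as in the call above:
count_futures(10000, workers = 200, chunk_size = 1L) # 10000
```

Under these assumptions, chunk_size = 1L turns 10000 elements into 10000 separate jobs, while the workers setting is what should keep the number of jobs active at any one time near 200. Whether your scheduler sees exactly that limit is worth checking with qstat on a small run first.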