When using Dagger (v0.19.2) with SlurmClusterManager, the default scope does not distribute tasks across workers. All tasks execute on the master process (worker 1), while pmap distributes correctly.
```julia
using Distributed, SlurmClusterManager
addprocs(SlurmManager())

using Dagger

@everywhere function task(x)
    sleep(1)
    return myid()
end

n = 16  # number of tasks to distribute

# Test 1: default scope
t1 = @elapsed results = fetch.([Dagger.@spawn task(i) for i in 1:n])

# Test 2: explicit workers scope
t2 = @elapsed results_scoped = Dagger.with_options(scope=Dagger.scope(workers=workers())) do
    fetch.([Dagger.@spawn task(i) for i in 1:n])
end

# Test 3: pmap for comparison
t3 = @elapsed results_pmap = pmap(task, 1:n)

println("Available workers: $(workers())")
println("\nDefault: time=$(round(t1,digits=1))s, workers used: $results")
println("Scoped: time=$(round(t2,digits=1))s, workers used: $results_scoped")
println("pmap (for comparison): time=$(round(t3,digits=1))s, workers used: $results_pmap")

exit(0)
```

On Slurm, run `sbatch mwe.sh`, where the script is:
```bash
#!/bin/bash
#SBATCH --job-name=dagger_mwe
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --time=00:10:00
#SBATCH --output=%j.out

module load julia/1.12.0
export JULIA_DEPOT_PATH=$HOME/.julia

julia mwe.jl
```

I would expect the scope in the second example to be the default.
Output:

```
Available workers: [2, 3, 4, 5, 6, 7, 8, 9]
Default: time=11.4s, workers used: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Scoped: time=7.1s, workers used: [9, 2, 5, 8, 6, 8, 4, 7, 8, 9, 3, 3, 3, 2, 5, 7]
pmap (for comparison): time=2.9s, workers used: [2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 5, 4, 9, 7, 8, 6]
```
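For completeness, the same effect as `with_options` can be achieved by attaching the scope to each task individually. This is a sketch assuming Dagger's per-task option syntax (`Dagger.@spawn scope=... f(x)`); the `addprocs(2)` call stands in for `addprocs(SlurmManager())` so it can run locally, and it has not been tested under Slurm here:

```julia
using Distributed
addprocs(2)  # stand-in for addprocs(SlurmManager()) on Slurm

using Dagger
@everywhere using Dagger
@everywhere task(x) = (sleep(0.1); myid())

n = 8
# Pin the scope on each task instead of wrapping the whole block in with_options.
s = Dagger.scope(workers=workers())
results = fetch.([Dagger.@spawn scope=s task(i) for i in 1:n])
println(results)
```

Either way, the point of the report stands: without an explicit scope, everything runs on worker 1.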