MPI error with kernel_launching.jl
#3981
francispoulin
started this conversation in
Experimental features
Replies: 1 comment 6 replies
-
Weird, I ran it on main and I don't see any such problem:
(base) simonesilvestri@Simones-MacBook-Pro Oceananigans.jl % mpiexecjl -np 4 julia --project validation/distributed_simulations/distributed_nonhydrostatic_turbulence.jl
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
[ Info: MPI has not been initialized, so we are calling MPI.Init().
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [3.14159, 4.71239) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [4.71239, 6.28319) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [2.41353e-18, 1.5708) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
grid = 64×256×1 RectilinearGrid{Float64, Oceananigans.Grids.FullyConnected, Periodic, Flat} on Distributed{CPU} with 3×3×0 halo
├── FullyConnected x ∈ [1.5708, 3.14159) regularly spaced with Δx=0.0245437
├── Periodic y ∈ [-1.91418e-18, 6.28319) regularly spaced with Δy=0.0245437
└── Flat z
[ Info: Initializing simulation...
[ Info: Initializing simulation...
[ Info: Initializing simulation...
[ Info: Initializing simulation...
[ Info: Iteration: 0, time: 0 seconds
[ Info: Rank 1: max|ζ|: 7.58e+01, max(e): 2.33e-01
[ Info: Rank 3: max|ζ|: 7.49e+01, max(e): 2.17e-01
[ Info: Rank 0: max|ζ|: 7.60e+01, max(e): 2.39e-01
[ Info: Rank 2: max|ζ|: 7.54e+01, max(e): 2.52e-01
[ Info: ... simulation initialization complete (14.299 seconds)
[ Info: ... simulation initialization complete (14.139 seconds)
[ Info: Executing initial time step...
[ Info: Executing initial time step...
[ Info: ... simulation initialization complete (14.232 seconds)
[ Info: Executing initial time step...
[ Info: ... simulation initialization complete (14.290 seconds)
[ Info: Executing initial time step...
[ Info: ... initial time step complete (5.912 seconds).
[ Info: ... initial time step complete (5.913 seconds).
[ Info: ... initial time step complete (5.913 seconds).
[ Info: ... initial time step complete (5.883 seconds).
[ Info: Iteration: 10, time: 100.000 ms
[ Info: Rank 2: max|ζ|: 4.50e+01, max(e): 9.66e-02
[ Info: Rank 3: max|ζ|: 4.45e+01, max(e): 9.25e-02
[ Info: Rank 0: max|ζ|: 4.28e+01, max(e): 9.77e-02
[ Info: Rank 1: max|ζ|: 4.48e+01, max(e): 9.63e-02
[ Info: Iteration: 20, time: 190.000 ms
[ Info: Rank 3: max|ζ|: 3.31e+01, max(e): 7.02e-02
[ Info: Rank 1: max|ζ|: 3.33e+01, max(e): 7.47e-02
[ Info: Rank 2: max|ζ|: 3.47e+01, max(e): 6.17e-02
[ Info: Rank 0: max|ζ|: 3.41e+01, max(e): 7.20e-02
[ Info: Iteration: 30, time: 280.000 ms
...
[ Info: Simulation is stopping after running for 47.977 seconds.
[ Info: Simulation is stopping after running for 47.941 seconds.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Simulation is stopping after running for 47.910 seconds.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Simulation is stopping after running for 47.821 seconds.
[ Info: Model iteration 1000 equals or exceeds stop iteration 1000.
[ Info: Iteration: 1000, time: 9.170 seconds
[ Info: Rank 0: max|ζ|: 3.39e+00, max(e): 8.75e-03
[ Info: Rank 1: max|ζ|: 3.44e+00, max(e): 7.93e-03
[ Info: Rank 2: max|ζ|: 3.63e+00, max(e): 9.52e-03
[ Info: Rank 3: max|ζ|: 3.65e+00, max(e): 7.15e-03
(base) simonesilvestri@Simones-MacBook-Pro Oceananigans.jl %
Is your MPI configured correctly? Are you maybe using an old version of Oceananigans?
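One way to answer both questions at once is a small check script, sketched below. It is not part of Oceananigans. The file name `check_mpi.jl` is hypothetical, and the sketch assumes MPI.jl and Oceananigans are installed in the active project. It would be launched the same way as the validation script, for example `mpiexecjl -np 4 julia --project check_mpi.jl`.

```julia
# check_mpi.jl (hypothetical file name): a quick sanity check of the MPI setup
using MPI
using Pkg

MPI.Init()
comm   = MPI.COMM_WORLD
rank   = MPI.Comm_rank(comm)   # zero-based index of this process
nranks = MPI.Comm_size(comm)   # total number of processes in the communicator

if rank == 0
    # Report which MPI binary and library MPI.jl is using
    MPI.versioninfo()
    # Report the Oceananigans version resolved in the active project
    Pkg.status("Oceananigans")
end

# Wait for rank 0 to finish reporting; output order across ranks may still vary
MPI.Barrier(comm)
println("rank $rank of $nranks is alive")
```

With `-np 4`, each process should report a distinct rank out of 4. If every process instead reports `rank 0 of 1`, the launcher and the MPI library that MPI.jl loads probably do not match, which usually points to a configuration problem rather than a bug in the script being run.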
-
We (@jakob-braga and @francispoulin) are trying to run distributed_nonhydrostatic_turbulence.jl and are getting an error. We have tried two different servers, and on both the error comes from kernel_launching.jl. @glwagner @simone-silvestri?