Building reproducible Oceananigans setups with random initial conditions and FFTW #2790
Replies: 36 comments
-
I will try doing some digging. A first thing I notice (not sure that this will solve the problem) is that, looking at your script, it seems you are using a pretty outdated version of Oceananigans (something like 0.71 or lower). I would try running with an updated versions and see if the problem persists |
Beta Was this translation helpful? Give feedback.
-
Sure, but how can I do that? Is there something on the script to change the Oceananigans version, or did you see that through the options in the script? My apologies if this question is basic, but I am a new user. |
Beta Was this translation helpful? Give feedback.
-
No problem, you have a line in the script that would not work with the current Oceananigans version: the keyword arguments of To update Oceananigans, you can type
this should give you version 0.75.3
If you are using a GPU, I would remain on 0.75.3 at the moment because in 0.77.5 there are problems with |
Beta Was this translation helpful? Give feedback.
-
[17:11]fspereira@ch-fe2[/lustre/scratch5/fspereira/OCEANANIGANS/test/case1]# julia julia> import Pkg julia> Pkg.status("Oceananigans") julia> Pkg.update("Oceananigans") julia> Pkg.status("Oceananigans") It does not update. Do I need a different version of Julia? If I remember correctly, the website says that 1.6.7 is ok |
Beta Was this translation helpful? Give feedback.
-
It should be ok with Julia 1.6, try removing it with
if that doesn't work you can always force a version with
|
Beta Was this translation helpful? Give feedback.
-
Thank you @simone-silvestri. We are using LESbrary.jl for calculating turbulent statistics as shown in the script (lines 288-300) shared by @fspereira1. Even with the latest version of Oceananigans.jl Is there any other package for the estimation of turbulent statistics (second and third-order statistics along with sub-grid-scale fluxes) with the latest version of Oceananigans.jl instead of calculating within the script (commented lines 260-284)? |
Beta Was this translation helpful? Give feedback.
-
Ah I see. Well from what I see everything in the TurbulentStatistic.jl module is compatible with new versions of Oceananigans (except maybe GPU usage). Since that is what you are using, you can maybe use it locally? This is just a quick fix to try out the new Oceananigans.
|
Beta Was this translation helpful? Give feedback.
-
Hi All, I rerun the simulations using the newest version of the code, julia> julia> import Pkg julia> Pkg.status("Oceananigans") and a script based on the one available on oceananigans webpage (I only changed the grid size, constant, and set the random seed. I also tried without these changes): https://github.com/CliMA/Oceananigans.jl/blob/main/examples/ocean_wind_mixing_and_convection.jl Unfortunately, the new code/script led to the same reproducibility problem. I ran 4 simulations using the same script (attached) and obtained 4 different average ww profiles. Any ideas or suggestions? |
Beta Was this translation helpful? Give feedback.
-
Thanks for the hassle of rerunning on the new version. Are the initial conditions exactly the same? (I guess yes since you fix the random seed, but it's good to have the certainty) |
Beta Was this translation helpful? Give feedback.
-
@simone-silvestri, yes, the initial conditions are exactly the same as in the script. |
Beta Was this translation helpful? Give feedback.
-
I suspect the issue is your use of function-based boundary conditions. If you initialize with a function, we launch a kernel that computes the function at every grid cell. However, I don't think that the ordering of the loop is guaranteed to be deterministic. If so, this non-deterministic loop ordering is a feature of KernelAbstractions. Whether or not this is a CPU vs GPU issue, I'm not sure. But I don't think that we get deterministic thread ordering on the GPU without extra effort. To obtain determininstic initial conditions, you could try using arrays to set the initial conditions instead. Your script contains these lines: rng = MersenneTwister(1234)
Random.seed!(rng, 1414)
Ξ(z) = randn(rng) * z / model.grid.Lz * (1 + z / model.grid.Lz)
Tᵢ(x, y, z) = 20 + dTdz * z + dTdz * model.grid.Lz * 1e-6 * Ξ(z)
uᵢ(x, y, z) = sqrt(abs(Qᵘ)) * 1e-3 * Ξ(z)
set!(model, u=uᵢ, w=uᵢ, T=Tᵢ, S=35) You can try replacing these lines with something like Random.seed!(1414)
T = model.tracers.T
u, v, w = model.velocities
x, y, z = nodes(T, reshape=true)
Lz = model.grid.Lz
shape = @. z / Lz * (1 + z / Lz)
ΞT = randn(size(T)...) *. shape
Ξu = randn(size(u)...) *. shape
Ξw = randn(size(w)...) *. shape
Tᵢ = @. 20 + dTdz * z + dTdz * Lz * 1e-6 * ΞT
uᵢ = @. sqrt(abs(Qᵘ)) * 1e-3 * Ξu
wᵢ = @. sqrt(abs(Qᵘ)) * 1e-3 * Ξw
set!(model, u=uᵢ, w=wᵢ, T=Tᵢ, S=35) I'd be curious to know if this works. Here are a few more tips and best practices for raising issues here:
|
Beta Was this translation helpful? Give feedback.
-
Thank you for all your comments. I will try those lines
correct? I am getting this error message:
Line 197 corresponds to the line above. I removed the |
Beta Was this translation helpful? Give feedback.
-
Oh sorry that was a typo (incorrect Julia syntax). It should be ΞT = randn(size(T)...) .* shape |
Beta Was this translation helpful? Give feedback.
-
You can probably also omit the |
Beta Was this translation helpful? Give feedback.
-
Thank you. I will try it and let you know if it fixes the reproducibility problem. |
Beta Was this translation helpful? Give feedback.
-
Thank you for your answer. |
Beta Was this translation helpful? Give feedback.
-
Okay, apologies. I just didn't quite understand what you meant when you said they seem identical. Typically we would just write something like Here's a bit more background on the reproducibility tests we currently have: We have "regression tests" that test to ensure that output from a certain simulation remains identical across PRs, including tests that involve LES closures. These tests involve ~10 time steps. We conclude that results are "identical" when every grid point is within Many of our other tests also implicitly rely on reproducibility. I think, therefore, that we do have reproduciblity in many cases. However, it is quite possible that your case exposes some particular feature that leads to a loss of reprodicibility. I think perhaps the next step in order to make progress is to code up a "minimal working example" (often called an MWE), which involves relentlessly simplifying the examle until we isolate the essential complication that leads to a failure of the test. With that knowledge in hand, we can dig deeper to find the underlying cause (and hopefully fix it). Often, the process of simplying a script in order to isolate the MWE also produces some insight about the issue (and potentially about the test). |
Beta Was this translation helpful? Give feedback.
-
My bad. I was not clear. Yes, T1 - T2 = 0 |
Beta Was this translation helpful? Give feedback.
-
As pointed out by @kburns, if you're interested in bit reproducibility you may also need to set the FFTW plan (by default, FFTW is not reproducible even on identical architectures). I'm not sure this is your issue (I have doubts...) but if you want to be thorough you may want to check this. To set the plan you have to build the pressure solver manually with something like: using Oceananigans.Solvers: FFTBasedPoissonSolver
using FFTW
pressure_solver = FFTBasedPoissonSolver(grid, planner_flag=FFTW.ESTIMATE)
model = NonhydrostaticModel(; grid, pressure_solver, other_kwargs...) I'd be surprised if "round-off errors" accumulate enough to cause the differences you're seeing (but it does seem at least possible, especially for very long runs at relatively coarser resolutions where slight differences in the eddy diffusivity might lead to slightly different turbulent trajectories). Hope that helps. |
Beta Was this translation helpful? Give feedback.
-
Thank you for the suggestion @glwagner I installed the FFT, and I can
but when I try
I get julia> pressure_solver = FFTBasedPoissonSolver(grid, planner_flag=FFTW.ESTIMATE) What should I use? |
Beta Was this translation helpful? Give feedback.
-
Ah sorry. I think you should use pressure_solver = FFTBasedPoissonSolver(grid, FFTW.ESTIMATE) PS try triple backticks (```) rather than single backticks (`) for formatting blocks of code / error messages. |
Beta Was this translation helpful? Give feedback.
-
Thanks, that worked. However, now I get this
|
Beta Was this translation helpful? Give feedback.
-
Oh, is your grid vertically stretched? In that case you need to use pressure_solver = FourierTridiagonalPoissonSolver(grid, FFTW.ESTIMATE) |
Beta Was this translation helpful? Give feedback.
-
Yes, it is. I ran ten simulations (same .jl script) setting Tmax=4h, and I got the same profiles. It seems the reproducibility problem is solved. I'll run the simulations for four days (our goal) and let you know what happens. Thank you for your help. If this works, I can share the final script. |
Beta Was this translation helpful? Give feedback.
-
Great! (Thanks @kburns) |
Beta Was this translation helpful? Give feedback.
-
Feel free to close this issue @fspereira1 if you believe it is resolved. |
Beta Was this translation helpful? Give feedback.
-
@glwagner and @kburns, thank you. Since it is a different topic, I created a new issue https://github.com/CliMA/Oceananigans.jl/issues/2787 but I wonder how I could run this case on parallel (512x512x512 grid). I noticed you just replied. |
Beta Was this translation helpful? Give feedback.
-
It might make sense to convert this to a discussion and change the title to "Building reproducible LES setups". The info here could be useful for future Oceananigans users that would like to build reproducible setups (thanks for your efforts in this department @fspereira1). Of note, the lessons learned here are mostly about achieving reproducibility with Julia and FFTW (the lessons are not Oceananigans specific, and are applicable to other Julia applications). And to summarize the important points:
|
Beta Was this translation helpful? Give feedback.
-
@glwagner This Makes sense to me. And thank you for your help. |
Beta Was this translation helpful? Give feedback.
-
Perhaps start a git repo and post a link to it? It's best to include the julia environment you're using with the file (otherwise it will go stale as Oceananigans evolves). If you want to just post the file then you can use a gist, or copy/paste the code here if its short. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Dear Oceananigans team,
We are trying to use Oceananigans to create reference LES solutions for multiple canonical flows. We use a cubical domain and random perturbations to trigger the flow. During our validation tests, we noticed that we could not reproduce the results, i.e., running the same .jl script (same initial flow conditions) leads to different averaged solutions (see attached picture). We ran more than 16 simulations and never obtained the same solution. We tried to set the seed of the random perturbations constant, but this did not solve the problem. Do you observe this problem, and could you help us run reproducible simulations so other users can obtain the same solutions? We attached the .jl file we are using to define the simulations.
Best regards,
Filipe Pereira
Luke van Roekel
Amrapalli Garanaik
Brodie Pearson
c16_128_128m(1).jl.zip
Beta Was this translation helpful? Give feedback.
All reactions