@glwagner
Member

No description provided.

@navidcy
Member

navidcy commented Oct 29, 2025

What might this error be implying?
https://buildkite.com/clima/oceananigans/builds/26463#019a30e0-cece-41d7-acb3-7f2ef858751b/17-419

Is there an Adapt missing or something?

@glwagner
Member Author

@ali-ramadhan we're getting an error here that I can't reproduce on tartarus. Can you reproduce it on Nautilus? Do you have an inkling of what might be going on?

@ali-ramadhan
Member

@glwagner Attempting to reproduce on Nautilus now.

@ali-ramadhan
Member

ali-ramadhan commented Oct 29, 2025

I just ran (on this branch)

TEST_FILE="test_time_stepping.jl" CUDA_VISIBLE_DEVICES=3 julia +1.10.10 -O0 --project

on Nautilus (same V100 GPU but not in Docker) and locally (RTX 4090 GPU) and it passed on both.

So I'm wondering if it's a random/intermittent issue, although the test group keeps failing... I can try restarting the Docker container.

If it's consistently failing then it seems like something is different between using the GPU inside a Docker container and outside 🤔
