Number of threads per block exceeds kernel limit on some GPUs for some setups #4034

ali-ramadhan (Maintainer) · 2025-01-08T21:39:24Z

Has anyone else encountered errors of this kind when running simulations? It seems to be GPU-specific: the same setup works on an RTX 4090 and an A100 but fails on a V100 or an H100. It may also be correlated with the complexity of the simulation, setup, or type.

ERROR: LoadError: Number of threads per block exceeds kernel limit (896 > 768).
...
caused by: CUDA error: too many resources requested for launch (code 701, ERROR_LAUNCH_OUT_OF_RESOURCES)

The stacktraces seem to point to a call to maximum on a Field (in the simulation's progress-monitoring callbacks, I think), but because CUDA.jl launches kernels asynchronously I don't know whether that's the actual kernel at fault.

I've been struggling to get an MWE and want to dedicate some time to chasing this down, but I don't have a concrete issue yet, so I thought I'd open a discussion to see if anyone else has experienced this.

glwagner (Maintainer) · 2025-01-08T21:46:39Z

Yes, I have seen this. Specifically for maximum when using an immersed boundary, I think.

3 replies

glwagner Jan 8, 2025
Maintainer

The stacktrace points to CUDA.mapreducedim right?

ali-ramadhan Jan 8, 2025
Maintainer Author

Yup that's right! I didn't think about it, but yeah I've only seen it with immersed grids I believe (a good hint!).

glwagner Jan 8, 2025
Maintainer

Makes me wonder if #3794 would help at all. Could also be something to raise on CUDA.jl once we can pinpoint the issue. I don't really understand the error.
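For context, here is a rough sketch of one common way a per-kernel limit such as 768 can fall below the usual hardware cap of 1024 threads per block: every thread in a block needs its own registers, so a kernel that compiles to more registers per thread can fit fewer threads. This is only an illustrative model, not a diagnosis of this particular failure. The function names are hypothetical, the register counts (72 and 85) are made up to reproduce the numbers in the error message, and real GPUs allocate registers with a coarser granularity than assumed here.

```python
# Simplified model of a per-kernel "threads per block" limit.
# Assumptions (not taken from the thread): 65536 registers available to a
# block, warps of 32 threads, and registers allocated exactly per thread.
WARP_SIZE = 32
REGISTERS_PER_BLOCK = 65536
HARDWARE_MAX_THREADS = 1024


def kernel_thread_limit(registers_per_thread):
    """Largest whole-warp thread count whose registers fit in one block."""
    warps = REGISTERS_PER_BLOCK // (registers_per_thread * WARP_SIZE)
    return min(warps * WARP_SIZE, HARDWARE_MAX_THREADS)


def check_launch(threads, registers_per_thread):
    """Raise if the requested block size exceeds the modeled kernel limit."""
    limit = kernel_thread_limit(registers_per_thread)
    if threads > limit:
        raise ValueError(
            f"Number of threads per block exceeds kernel limit ({threads} > {limit})."
        )
    return limit


if __name__ == "__main__":
    # Hypothetical register counts for the same kernel on two architectures.
    print(kernel_thread_limit(72))  # 896 threads fit
    print(kernel_thread_limit(85))  # only 768 threads fit
    check_launch(896, 72)           # fine
    try:
        check_launch(896, 85)       # same block size, higher register use
    except ValueError as err:
        print(err)
```

In this model the same block size of 896 threads is accepted when the kernel uses 72 registers per thread and rejected when it uses 85, which would be consistent with a failure that appears only on some GPUs or only for more complex setups, if the compiled kernel's resource usage differs between them.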

glwagner (Maintainer) · 2025-01-08T22:58:28Z

@NoraLoose this is the "cryptic error" I was referring to re: reductions in ClimaOcean.

0 replies
