Clarification on GPU execution semantics in AMReX #4850
chaitanya2596 started this conversation in General
Hi AMReX team,
I’m trying to understand the default execution and data movement semantics when AMReX is built and run in GPU mode. I’d appreciate clarification on the following points:
1. When using MFIter with ParallelFor on GPU, are GPU kernels launched per box, or does AMReX ever combine multiple boxes into a single kernel launch?
2. Are GPU kernels launched by ParallelFor asynchronous with respect to the host by default, or are there implicit synchronization points users should be aware of?
3. Besides explicit calls to Gpu::synchronize(), are there implicit synchronizations (e.g., at MFIter boundaries, at the end of FillBoundary(), or before returning to host-only code)?
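For concreteness, here is a minimal sketch of the pattern I'm asking about. It assumes a standard AMReX build with GPU support enabled. The helper name scale_in_place, the 32-cell domain, and the 16-cell maximum grid size are made up purely for illustration and are not taken from any real application.

```cpp
#include <AMReX.H>
#include <AMReX_Gpu.H>
#include <AMReX_MultiFab.H>
#include <AMReX_Print.H>

// Hypothetical helper (illustration only): multiply every valid cell by `factor`.
void scale_in_place (amrex::MultiFab& mf, amrex::Real factor)
{
    for (amrex::MFIter mfi(mf); mfi.isValid(); ++mfi) {
        const amrex::Box& bx = mfi.validbox();
        amrex::Array4<amrex::Real> const& a = mf.array(mfi);

        // As written, ParallelFor is called once per box. Whether each call
        // becomes its own kernel launch, or launches can be combined across
        // boxes, is the first thing I'm unsure about.
        amrex::ParallelFor(bx,
            [=] AMREX_GPU_DEVICE (int i, int j, int k) noexcept
            {
                a(i,j,k) *= factor;
            });

        // Has the kernel finished here, or has control simply returned
        // to the host while the kernel is still running?
    }

    // Explicit device-wide synchronization. Whether this is redundant after
    // the MFIter loop ends is exactly the implicit-synchronization question.
    amrex::Gpu::synchronize();
}

int main (int argc, char* argv[])
{
    amrex::Initialize(argc, argv);
    {
        // Assumed toy setup: a 32-cell domain chopped into boxes of at most 16 cells per side.
        amrex::Box domain(amrex::IntVect(0), amrex::IntVect(31));
        amrex::BoxArray ba(domain);
        ba.maxSize(16);
        amrex::DistributionMapping dm(ba);

        amrex::MultiFab mf(ba, dm, 1, 0);  // 1 component, 0 ghost cells
        mf.setVal(1.0);

        scale_in_place(mf, 2.0);

        // Host-side read of device data after the loop.
        amrex::Print() << "max = " << mf.max(0) << "\n";
    }
    amrex::Finalize();
    return 0;
}
```

The host-side mf.max(0) call after the loop is the kind of place where I want to be sure the device work has completed.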
Any pointers to relevant documentation or source locations would be very helpful.
Thanks in advance!