Conversation
|
Note that we already have a similar data container called Gpu::Buffer. |
Ah, that is cool, I forgot about that one. After checking again, we could extend/rewrite |
|
We could make Gpu::Buffer templated on the container type. |
|
Yes, good idea. Let's finish the tests and impl. for this one and then we can investigate if/how we merge them? |
|
/run-hpsf-gitlab-ci |
|
GitLab CI has started at https://gitlab.spack.io/amrex/amrex/-/pipelines/1498416. |
|
GitLab CI 1498416 finished with status: failed. See details at https://gitlab.spack.io/amrex/amrex/-/pipelines/1498416. |
Fails only on H100; unrelated.
|
/run-hpsf-gitlab-ci |
|
GitLab CI has started at https://gitlab.spack.io/amrex/amrex/-/pipelines/1498432. |
|
GitLab CI 1498432 finished with status: failed. See details at https://gitlab.spack.io/amrex/amrex/-/pipelines/1498432. |
|
/run-hpsf-gitlab-ci |
|
GitLab CI has started at https://gitlab.spack.io/amrex/amrex/-/pipelines/1500854. |
|
/run-hpsf-gitlab-ci |
|
GitLab CI has started at https://gitlab.spack.io/amrex/amrex/-/pipelines/1501106. |
|
@WeiqunZhang @atmyers @AlexanderSinn I am very happy with this design now, thank you for your feedback! The easiest way to review this is to first look at the user-facing docs, then the test cases I added, then the actual class. Let me know what you think :) |
 */
void to_device (bool force=false) {
#ifdef AMREX_USE_GPU
    if (status() != Status::up_to_date || force) {
Suggested change:
-    if (status() != Status::up_to_date || force) {
+    if (status() == Status::host_dirty || force) {
If the status is device_dirty, then calling to_device would just overwrite the recent changes on the device side. That would not happen when compiling for CPU, where the vectors are shared.
I chose that intentionally: if you call to_device() and the device does not hold what the host has (for any reason), the device gets overwritten.
When compiling for CPU, host and device are in sync by definition/API contract, because they use the exact same memory. The status is always Status::up_to_date in a CPU build.
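To make the two guard conditions discussed above concrete, here is a minimal, self-contained sketch. It is not the class from this PR: `HostDevicePair`, `Policy`, and the `*_for_write` accessors are hypothetical names, and the "device" is simulated with a second `std::vector`. Only `to_device`, `force`, `status()`, and the three `Status` values are taken from the thread.

```cpp
#include <vector>

// The three states named in the review thread.
enum class Status { up_to_date, host_dirty, device_dirty };

// Hypothetical switch between the two guards under discussion.
enum class Policy {
    overwrite_unless_up_to_date,  // status() != Status::up_to_date || force
    copy_only_if_host_dirty       // status() == Status::host_dirty || force
};

// Hypothetical stand-in for the helper; the device side is a plain vector here.
class HostDevicePair {
public:
    explicit HostDevicePair (Policy policy) : m_policy(policy) {}

    // Handing out a writable reference marks that side as the newer one.
    std::vector<int>& host_for_write () { m_status = Status::host_dirty; return m_host; }
    std::vector<int>& device_for_write () { m_status = Status::device_dirty; return m_device; }

    std::vector<int> const& host () const { return m_host; }
    std::vector<int> const& device () const { return m_device; }
    Status status () const { return m_status; }

    // Copy host -> device if the guard (or force) says so.
    // Returns true if a copy was performed.
    bool to_device (bool force=false) {
        bool const guard = (m_policy == Policy::overwrite_unless_up_to_date)
            ? (m_status != Status::up_to_date)
            : (m_status == Status::host_dirty);
        if (!(guard || force)) { return false; }
        m_device = m_host;  // stands in for a host-to-device copy
        m_status = Status::up_to_date;
        return true;
    }

private:
    Policy m_policy;
    Status m_status = Status::up_to_date;
    std::vector<int> m_host;
    std::vector<int> m_device;
};
```

With the first policy, a `to_device()` call in the `device_dirty` state discards the device-side changes; with the second, the call is a no-op unless `force` is set. In a CPU build, where both sides share the same memory, the status never leaves `up_to_date`, so the two guards behave identically.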
|
GitLab CI 1501106 finished with status: success. See details at https://gitlab.spack.io/amrex/amrex/-/pipelines/1501106. |
|
/run-hpsf-gitlab-ci |
|
GitLab CI has started at https://gitlab.spack.io/amrex/amrex/-/pipelines/1501542. |
|
GitLab CI 1501542 finished with status: failed. See details at https://gitlab.spack.io/amrex/amrex/-/pipelines/1501542. |
|
Frank issue: the CUDA GPU was busy. The other GPUs and the previous run passed, so this is good (CUDA also passed on the local GPU in my laptop). |
Summary
This adds a helper class for synchronizing a pair of host & device vectors (w/o using managed memory).
In BLAST codes, I find that this synchronization causes significant boilerplate in physics-focused parts of the code; this little helper class cuts down on lines and mental bookkeeping.
Additionally, especially in interactive pyAMReX and ImpactX workflows, this construct is a building block for user-friendly and even multi-simulation-spanning user data (e.g., ImpactX element data): users can define their inputs even before AMReX is initialized and reuse their input classes across many thousands of simulations, e.g., in optimization loops, even with AMReX device arenas being shut down/recreated in between.
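One way to picture the "define inputs before initialization, reuse across arena restarts" idea is the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `ReusableInput`, `release_device`, `device_data`, and `uploads` are hypothetical names, and the device copy is simulated with a `std::optional<std::vector<double>>` rather than real device memory.

```cpp
#include <optional>
#include <utility>
#include <vector>

// Hypothetical sketch: the host copy lives as long as the object,
// while the device copy exists only while a device arena is available.
class ReusableInput {
public:
    // Can be constructed before any device runtime is initialized,
    // because only host memory is touched here.
    explicit ReusableInput (std::vector<double> values) : m_host(std::move(values)) {}

    // Called when the (simulated) device arena is shut down.
    void release_device () { m_device.reset(); }

    bool has_device_copy () const { return m_device.has_value(); }

    // Lazily (re)create the device copy on first use after construction
    // or after release_device().
    std::vector<double> const& device_data () {
        if (!m_device) {
            m_device = m_host;  // stands in for allocate + host-to-device copy
            ++m_uploads;
        }
        return *m_device;
    }

    // Number of host-to-device uploads performed so far.
    int uploads () const { return m_uploads; }

private:
    std::vector<double> m_host;
    std::optional<std::vector<double>> m_device;
    int m_uploads = 0;
};
```

In an optimization loop, the same object would be kept alive across runs: each run calls `device_data()` once the device is available, and `release_device()` before the arena is torn down, so the user-provided host data never has to be re-entered.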
Additional background
The unit test shows the most common usage patterns & needs.
BLAST-ImpactX/impactx#1368
Checklist
The proposed changes: