
Implement pure-SoA HDF5 particle IO #5204

Merged
atmyers merged 4 commits into AMReX-Codes:development from atmyers:pure_soa_hdf5 on Mar 25, 2026

@atmyers
Member

@atmyers atmyers commented Mar 16, 2026

This closes #5152. The situation was actually a bit worse than that issue indicated: HDF5 particle IO wasn't implemented for pure SoA at all. This PR implements it.

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • changes answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • include documentation in the code and/or rst files, if appropriate

@atmyers atmyers changed the title from "[WIP Implement pure-SoA HDF5 particle IO" to "[WIP] Implement pure-SoA HDF5 particle IO" Mar 16, 2026
@atmyers atmyers changed the title from "[WIP] Implement pure-SoA HDF5 particle IO" to "Implement pure-SoA HDF5 particle IO" Mar 16, 2026
@atmyers atmyers requested review from WeiqunZhang and ax3l March 16, 2026 21:11
@atmyers atmyers merged commit 48babf5 into AMReX-Codes:development Mar 25, 2026
74 checks passed
@ax3l
Member

ax3l commented Mar 29, 2026

@atmyers @WeiqunZhang I see test failures in the HPSF CI on H100:

The following tests FAILED:
93 - Particles_CheckpointRestartDualGridHDF5SOA_2d (Failed)
94 - Particles_CheckpointRestartDualGridHDF5SOA_3d (Failed)
96 - Particles_CheckpointRestartDualGridSOA_2d (Failed)
97 - Particles_CheckpointRestartDualGridSOA_3d (Failed)

Could this be from this PR?

@ax3l ax3l mentioned this pull request Mar 30, 2026
5 tasks
@WeiqunZhang
Member

The failure can probably be reproduced locally. The issue is probably a single-precision build trying to read double-precision data. If we cannot fix these tests quickly on Monday 3/30, maybe we can disable them for single precision so that we don't release 26.04 with known CTest failures.

@WeiqunZhang
Member

$ cd Tests/Particles/CheckpointRestartDualGridSOA/
$ make -j6 PRECISION=FLOAT USE_SINGLE_PRECISION_PARTICLES=TRUE
$ mpiexec -n 2 ./main3d.gnu.FLOAT.TPROF.MPI.ex
Initializing AMReX (26.03-107-g112b2d362859)...
MPI initialized with 2 MPI processes
MPI initialized with thread support level 0
AMReX (26.03-107-g112b2d362859) initialized
Test 1: restart with fewer levels than checkpoint
  PASSED
Test 2: restart with more levels than checkpoint
1::Assertion `sm_orig == sm_new' failed, file "main.cpp", line 133, Msg: Int component sum mismatch after restart (comp 0) !!!
SIGABRT
0::Assertion `sm_orig == sm_new' failed, file "main.cpp", line 133, Msg: Int component sum mismatch after restart (comp 0) !!!
SIGABRT
See Backtrace.1 file for details
See Backtrace.0 file for details

@atmyers

@WeiqunZhang
Member

I think the issue is casting int to float for reduction.
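To illustrate the suspected failure mode in isolation, here is a minimal sketch. It assumes the integer component is cast to a `float`-valued type before being summed and that the summation order differs between the original and restarted particle layouts. The function names `sum_as_float` and `sum_as_long` are hypothetical and are not taken from the test's `main.cpp`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: sum integer data after casting each value to float,
// mimicking a reduction performed in a Real type that is float.
float sum_as_float(const std::vector<int>& v) {
    float acc = 0.0f;
    for (int x : v) {
        acc += static_cast<float>(x);  // rounds once the sum exceeds 2^24
    }
    return acc;
}

// Hypothetical helper: sum the same data in a 64-bit integer accumulator,
// which is exact and independent of summation order (absent overflow).
std::int64_t sum_as_long(const std::vector<int>& v) {
    std::int64_t acc = 0;
    for (int x : v) {
        acc += static_cast<std::int64_t>(x);
    }
    return acc;
}
```

With IEEE single precision, the inputs `{16777216, 1, 1}` and `{1, 1, 16777216}` contain the same values but can produce different `float` sums, because integers above 2^24 are no longer all representable. The 64-bit integer sums agree regardless of order, so an exact `==` comparison is only safe on the integer path.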

@WeiqunZhang
Member

#5255

ax3l pushed a commit that referenced this pull request Mar 30, 2026
For floating-point number comparison, use amrex::almostEqual instead of
==.

For int data, cast them to amrex::Long instead of amrex::Real (which
might be float) before reduction.

Follow-up to #5204
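
As a rough illustration of why a tolerance-based comparison is preferred over `==` for floating-point sums, the sketch below uses a hypothetical helper, `nearly_equal`. It is written in the style of the common epsilon-scaled comparison and is not presented as the implementation of `amrex::almostEqual`; the default `ulp = 2` is likewise an assumption.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (an assumption, not AMReX code): treat x and y as equal
// when their difference is within a few units of machine epsilon, scaled by
// their magnitude, or when the difference is smaller than the smallest
// normalized value of T.
template <typename T>
bool nearly_equal(T x, T y, int ulp = 2) {
    const T diff = std::abs(x - y);
    return diff <= std::numeric_limits<T>::epsilon() * std::abs(x + y) * static_cast<T>(ulp)
        || diff < std::numeric_limits<T>::min();
}
```

For example, in double precision `0.1 + 0.2` and `0.3` differ in the last bit, so `==` reports them as unequal while `nearly_equal` accepts them. Values that differ meaningfully, such as `1.0` and `1.1`, are still rejected.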
Development

Successfully merging this pull request may close these issues.

HDF5 CheckpointHDF5 does not skip position components for pure-SoA particles
