HDF5 interface and support for stochastic reading #362

Merged
mmcleod89 merged 41 commits into development from cg_hdf5_interface
Feb 10, 2025

@20DM 20DM commented Oct 22, 2024

First stab at an HDF5 interface + some helper functions to support stochastic reading of the file.

Several read modes are supported; they are documented in the HDF5 part of the tests in mpi_read_measurements.cc:

  • the full data stream in the HDF5 file can be read by each rank
  • the full data stream in the HDF5 file can be divided by the number of MPI ranks into even slices, with each rank only reading its designated slice
  • the root rank can read the full data stream in the HDF5 file, and distribute it to the other ranks
  • for data streams so large that even the evenly split slices do not fit in memory, a stochastic reading method is provided that reads a small chunk ("minibatch") of the slice. The starting position of the chunk is drawn uniformly at random across the slice length, wrapping around to the start of the slice if the chunk would run past its end
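
To make the slicing and wrap-around concrete, here is a minimal sketch of the index arithmetic only. It is not the code from this PR: `even_slice`, `Segment`, `minibatch_segments` and `random_minibatch` are hypothetical names, and the way leftover elements are spread over the first ranks is an assumption. No HDF5 or MPI calls are made; each returned segment is simply a contiguous (offset, count) range that a reader could then fetch from the file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: half-open range [begin, end) owned by `rank` when
// `total` elements are split as evenly as possible over `n_ranks`.
// Assumption: the first (total % n_ranks) ranks receive one extra element.
std::pair<std::size_t, std::size_t> even_slice(std::size_t total,
                                               std::size_t rank,
                                               std::size_t n_ranks) {
  assert(n_ranks > 0 && rank < n_ranks);
  const std::size_t base = total / n_ranks;
  const std::size_t rem = total % n_ranks;
  const std::size_t begin = rank * base + std::min(rank, rem);
  const std::size_t len = base + (rank < rem ? 1 : 0);
  return {begin, begin + len};
}

// A contiguous range in file coordinates.
struct Segment {
  std::size_t offset;
  std::size_t count;
};

// Hypothetical helper: segments covering a minibatch of `batch` elements that
// starts `start` elements into the slice. If the batch would run past the end
// of the slice, the remainder wraps around to the beginning of the slice.
std::vector<Segment> minibatch_segments(std::size_t slice_begin,
                                        std::size_t slice_len,
                                        std::size_t batch,
                                        std::size_t start) {
  assert(slice_len > 0 && batch <= slice_len && start < slice_len);
  const std::size_t first = std::min(batch, slice_len - start);
  std::vector<Segment> segs;
  segs.push_back({slice_begin + start, first});
  if (first < batch) {
    segs.push_back({slice_begin, batch - first});  // wrapped part
  }
  return segs;
}

// Draw the starting position uniformly over the slice length.
template <class Rng>
std::vector<Segment> random_minibatch(std::size_t slice_begin,
                                      std::size_t slice_len,
                                      std::size_t batch,
                                      Rng& rng) {
  std::uniform_int_distribution<std::size_t> pick(0, slice_len - 1);
  return minibatch_segments(slice_begin, slice_len, batch, pick(rng));
}
```

With 10 elements over 3 ranks, this sketch gives rank 1 the range [4, 7). A minibatch of 4 starting 8 elements into a 10-element slice yields two segments: the last 2 elements of the slice, then its first 2. Returning at most two contiguous segments keeps each read a simple range, which is one plausible way to handle the wrap without copying the slice.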

@20DM 20DM changed the title Draft: HDF5 interface HDF5 interface and support for stochastic reading Nov 22, 2024
@20DM 20DM requested a review from mmcleod89 November 22, 2024 13:48
@mmcleod89 mmcleod89 merged commit 7fb88d3 into development Feb 10, 2025
14 checks passed