how to load dataset only once on the same machine? #8112

Since your data is stored in one single binary file, it won't be possible to reduce the memory footprint. Each DDP process is independent of the others, and there is no shared memory between them. You will have to save each dataset sample individually so that each process can access just a subset of these samples through the dataloader and sampler.
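For illustration, here is a rough sketch of what "save each sample individually" could look like. It assumes the binary file is a flat sequence of fixed-size float32 records, which may not match your actual layout. The names `split_into_sample_files`, `PerSampleDataset`, and `SAMPLE_LEN` are made up for this example and are not part of PyTorch or Lightning.

```python
import os
from array import array

SAMPLE_LEN = 4  # assumption: every sample is 4 float32 values


def split_into_sample_files(binary_path, out_dir, sample_len=SAMPLE_LEN):
    """One-time preprocessing: write each fixed-size record to its own file."""
    os.makedirs(out_dir, exist_ok=True)
    record_bytes = sample_len * array("f").itemsize
    count = 0
    with open(binary_path, "rb") as src:
        while True:
            chunk = src.read(record_bytes)
            if len(chunk) < record_bytes:
                break  # end of file (a trailing partial record is ignored)
            with open(os.path.join(out_dir, f"{count:08d}.bin"), "wb") as dst:
                dst.write(chunk)
            count += 1
    return count


class PerSampleDataset:
    """Map-style dataset: only the requested sample is read from disk."""

    def __init__(self, sample_dir):
        # Only file paths are held in memory, not the sample data.
        self.paths = sorted(
            os.path.join(sample_dir, name)
            for name in os.listdir(sample_dir)
            if name.endswith(".bin")
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        values = array("f")
        with open(self.paths[index], "rb") as f:
            values.frombytes(f.read())
        return list(values)
```

An object like this exposes `__len__` and `__getitem__`, so it should be usable wherever a map-style dataset is expected, for example `DataLoader(dataset, sampler=DistributedSampler(dataset))`. In practice you would probably subclass `torch.utils.data.Dataset` and return tensors. With a distributed sampler, each process only requests its own share of the indices, so it only reads those files.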

Answer selected by yllgl
3 participants