how to load dataset only once on the same machine? #8112
-
My dataset is large, with a total CPU memory usage of 20 GB. I train on 2 nodes with 8 GPUs, and I use Slurm to launch the job. But I found that each process consumes 20 GB of memory, which is equivalent to 80 GB per node. That's not what I want: I want a node to consume only 20 GB in total. Is there a way to do that?

```python
import numpy as np
import torch
import pytorch_lightning as pl
from pytorch_lightning import LightningDataModule
from pytorch_lightning.loggers import TensorBoardLogger
from torch.utils.data import DataLoader, TensorDataset, random_split


class DataModule(LightningDataModule):
    def __init__(self):
        super().__init__()
        self.batch_size = 1
        # Each array has shape (7000, 1, 512, 512) and is loaded fully into RAM.
        self.CT_dataset = torch.from_numpy(np.load("./CT_dataset.npy")).float()
        self.MR_dataset = torch.from_numpy(np.load("./MR_dataset.npy")).float()
        self.train_dataset, self.test_dataset = random_split(
            TensorDataset(self.MR_dataset, self.CT_dataset),
            [len(self.CT_dataset) - 100, 100],
        )

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=self.batch_size)

    def test_dataloader(self):
        return DataLoader(self.test_dataset, batch_size=self.batch_size)


model = CycleGAN()  # my LightningModule, defined elsewhere
ds = DataModule()
logger = TensorBoardLogger(save_dir="./run")
trainer = pl.Trainer(
    max_epochs=1,
    fast_dev_run=False,
    profiler="pytorch",
    overfit_batches=8,
    gpus=4,
    logger=logger,
    accelerator="ddp",
    num_nodes=2,
    auto_scale_batch_size="power",
    weights_summary="full",
)
trainer.fit(model, ds)
trainer.test(model, datamodule=ds)
```

My code will raise
-
Since your data is in one single binary file, it won't be possible to reduce the memory footprint this way. Each DDP process is independent from the others; there is no shared memory. You will have to save each dataset sample individually, so that each process can access only a subset of these samples through the dataloader and sampler.
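
A minimal sketch of that approach, assuming the array shapes from the question. The names `export_samples` and `PairedSliceDataset`, and the per-sample directory layout, are made up for illustration; they are not part of the original thread:

```python
import os
import numpy as np
import torch
from torch.utils.data import Dataset


def export_samples(npy_path, out_dir):
    """One-time conversion: write each sample of a big array to its own .npy file."""
    os.makedirs(out_dir, exist_ok=True)
    # Memory-map the source file so the conversion itself never loads all 20 GB.
    data = np.load(npy_path, mmap_mode="r")
    for i in range(len(data)):
        np.save(os.path.join(out_dir, f"{i}.npy"), data[i])


class PairedSliceDataset(Dataset):
    """Loads one (MR, CT) pair from disk per __getitem__ call, instead of
    keeping the full arrays resident in every process."""

    def __init__(self, mr_dir, ct_dir, length):
        self.mr_dir = mr_dir
        self.ct_dir = ct_dir
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        mr = torch.from_numpy(np.load(os.path.join(self.mr_dir, f"{idx}.npy"))).float()
        ct = torch.from_numpy(np.load(os.path.join(self.ct_dir, f"{idx}.npy"))).float()
        return mr, ct
```

You would run `export_samples` once per modality before training, then build the `train_dataset`/`test_dataset` in your `DataModule` from a `PairedSliceDataset` instead of the in-memory `TensorDataset`. With `accelerator='ddp'`, Lightning inserts a `DistributedSampler` automatically, so each process only ever reads its own shard of indices, and resident memory per process stays roughly at what the dataloader workers are currently prefetching rather than the full 20 GB.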