dataloaders and ddp #15696
Unanswered
mosheliv asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment
-
Just in case someone is looking for this: the process id is held in the RANK or NODE_RANK environment variable. The GPUs seem to be assigned to processes in PCI id (device id) order.
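For illustration, a minimal sketch of reading those variables inside a worker process; LOCAL_RANK (also set by the torch.distributed launchers) is the per-node index that usually maps to the GPU:

```python
import os

# Each DDP worker can identify itself from environment variables set by
# the launcher: RANK is the global process id, LOCAL_RANK the id within
# the current node, NODE_RANK the id of the machine itself.
global_rank = int(os.environ.get("RANK", 0))
local_rank = int(os.environ.get("LOCAL_RANK", 0))
node_rank = int(os.environ.get("NODE_RANK", 0))

print(f"node {node_rank}, local rank {local_rank}, global rank {global_rank}")
```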
-
Hi all,
From the documentation, it looks like under DDP every GPU only sees a subset of the dataset. How is this implemented?
My code looks approximately like this:
Is this the correct way of doing it? Will every process get a different, mutually exclusive subset of the dataset's indices?
Currently, it seems not to work this way.
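For context on the first question, the splitting is normally done by torch.utils.data.DistributedSampler: when the Trainer runs under a DDP strategy it wraps the dataloader's dataset in one (unless you supply your own sampler), so each rank iterates a strided, mutually exclusive slice of the indices. A standalone sketch of what that sampler produces, using a toy dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(20).float())

# Simulate a 2-GPU job: each rank receives a strided, mutually
# exclusive set of indices, which is why the subsets never overlap.
for rank in range(2):
    sampler = DistributedSampler(dataset, num_replicas=2, rank=rank, shuffle=False)
    loader = DataLoader(dataset, batch_size=5, sampler=sampler)
    print(f"rank {rank} indices: {list(sampler)}")
    print(f"rank {rank} first batch: {next(iter(loader))[0].tolist()}")
# rank 0 indices: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
# rank 1 indices: [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
```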
Also, suppose I would like to change this behaviour and load different batch sizes onto different GPUs; how would I go about it? If I set, for example, the batch size to 10 and use the first 6 samples in the batch when self.device==0 and the last 4 when self.device==1, will this work?
Probably not, if the indices are mutually exclusive between the GPUs, so how do I do that?
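One way that behaviour could be achieved, sketched under assumptions rather than as a confirmed recipe: disable Lightning's automatic sampler replacement so every rank iterates the same batches, then slice each batch by self.global_rank (the rank id Lightning exposes on the module; self.device is a torch.device, not a rank id). UnevenBatchModel and its SPLITS mapping are hypothetical names:

```python
import torch
import pytorch_lightning as pl

class UnevenBatchModel(pl.LightningModule):
    # Hypothetical split of a shared batch of 10: rank 0 takes the
    # first 6 samples, rank 1 the last 4.
    SPLITS = {0: slice(0, 6), 1: slice(6, 10)}

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        part = self.SPLITS[self.global_rank]
        return torch.nn.functional.mse_loss(self.layer(x[part]), y[part])

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

# With sampler replacement turned off, every rank sees the full dataset
# and identical batches, so only the per-rank slices differ. (In
# Lightning 2.x the flag is named use_distributed_sampler.)
trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp",
                     replace_sampler_ddp=False)
```

Two caveats: identical batches across ranks require the dataloader's shuffling to be seeded the same way on every process (e.g. pl.seed_everything), and DDP averages gradients across processes, so the rank with 6 samples and the rank with 4 contribute equally to the averaged gradient unless the losses are reweighted.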
Sorry for the slightly messy question; I hope it's clear enough. This is confusing and not entirely clear to me.