RuntimeError: Expected all tensors to be on the same device
#13667
-
I am working on my first Lightning project and having an issue when I attempt to train on GPUs. When I train on the CPU using the accelerator='cpu' argument, training and validation run with no problem. My workstation has two GPUs, so I set accelerator='gpu', devices=2, and strategy='dp' (I've also tried 'ddp' with the same result). The data is provided by a LightningDataModule that wraps a custom Torch Dataset, and the Dataset reads from a Pandas DataFrame. The DataFrame contains file paths to NumPy files, which are being loaded as follows:
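(The original loading snippet was not preserved in this thread. Below is a minimal sketch of the setup described above; the class, column, and file names are hypothetical.)

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset

class NumpyFileDataset(Dataset):
    """Loads samples from .npy files whose paths are listed in a DataFrame."""

    def __init__(self, df: pd.DataFrame):
        # Assumed columns: 'path' (to a .npy file) and 'label'.
        self.df = df

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        x = torch.from_numpy(np.load(row["path"])).float()
        y = torch.tensor(row["label"])
        return x, y
```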
However, when I try to switch to GPUs for training, I receive the RuntimeError from the title ("Expected all tensors to be on the same device") with strategy='dp', and a similar error with strategy='ddp'. Any help would be appreciated.
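For reference, the Trainer configurations described above look like this (Lightning 1.x API, where strategy='dp' was still available):

```python
import pytorch_lightning as pl

# Works: model and data both stay on the CPU.
trainer = pl.Trainer(accelerator="cpu")

# Both of these fail with the device-mismatch RuntimeError:
trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="dp")   # DataParallel
trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp")  # DistributedDataParallel
```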
Replies: 1 comment
-
I figured it out. I had a mistake in my model construction: I was storing submodules in a plain Python list rather than in a PyTorch container module.
I changed:

```python
self.fc = []
```

to

```python
self.fc = nn.Sequential()
```
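This fix works because nn.Module only tracks submodules assigned through module containers (nn.Sequential, nn.ModuleList, or direct attributes). Layers stored in a plain Python list are invisible to .to(device), so when Lightning moves the model to the GPU those layers' weights stay on the CPU, producing exactly this device-mismatch error. A minimal sketch illustrating the difference (class names are illustrative):

```python
import torch
import torch.nn as nn

class Broken(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: nn.Module does NOT register these layers,
        # so .to(device) leaves their weights on the CPU.
        self.fc = [nn.Linear(8, 8), nn.Linear(8, 1)]

class Fixed(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Sequential registers the layers as submodules,
        # so .to(device) moves their weights along with the model.
        self.fc = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 1))

    def forward(self, x):
        return self.fc(x)

if torch.cuda.is_available():
    broken, fixed = Broken().to("cuda"), Fixed().to("cuda")
    print(broken.fc[0].weight.device)       # cpu  -> device mismatch at runtime
    print(fixed.fc[0].weight.device)        # cuda:0
    print(list(broken.parameters()))        # []   -> optimizer would also miss them
```

Note the second symptom shown in the sketch: unregistered layers are also missing from model.parameters(), so even on CPU the optimizer would silently skip training them.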