MONAI data loader stops after a few iterations, NVIDIA RTX 4090 GPU #7358
Unanswered
YazdanSalimi
asked this question in
Q&A
Replies: 3 comments
-
As an update: if I change torch.squeeze(batch_data["output_image"], dim=-1).to(device) to batch_data["output_image"], the error does not happen. Is the problem related to torch.squeeze?
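Note that the replacement described above removes two things at once: the torch.squeeze call and the .to(device) copy. A minimal sketch of what each step does on its own is below. The tensor shapes are hypothetical and chosen only for illustration, and plain tensors are used here rather than whatever type the MONAI loader actually returns.

```python
import torch

# Hypothetical shapes, chosen only for illustration.
thin = torch.zeros(2, 1, 64, 64, 1)   # trailing axis has size 1
thick = torch.zeros(2, 1, 64, 64, 8)  # trailing axis has size 8

# torch.squeeze(dim=-1) removes the last axis only when its size is 1;
# otherwise it returns the tensor with its shape unchanged.
thin_out = torch.squeeze(thin, dim=-1)
thick_out = torch.squeeze(thick, dim=-1)
print(tuple(thin_out.shape))   # (2, 1, 64, 64)
print(tuple(thick_out.shape))  # (2, 1, 64, 64, 8)

# Fall back to the CPU so the sketch also runs on a machine without CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def squeeze_then_move(t, device):
    """Run the two steps as separate statements so a stall can be attributed."""
    squeezed = torch.squeeze(t, dim=-1)  # step 1: shape change only
    moved = squeezed.to(device)          # step 2: copy to the target device
    return squeezed, moved


squeezed, moved = squeeze_then_move(thin, device)
print(moved.device.type)
```

Running the two steps separately inside the real loop (squeeze only, then copy only) may help show whether the stop is tied to the squeeze or to the transfer to the GPU.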
-
Hi @YazdanSalimi, thanks for your interest here.
-
Thank you for your response,
-
I define data_loader_train using this transform:
transform = monai.transforms.Compose(
    [
        monai.transforms.LoadImaged(keys=["input_image", "output_image"],
                                    ensure_channel_first=True, image_only=False),
        monai.transforms.Spacingd(keys=["input_image", "output_image"], pixdim=pixdim,
                                  mode=[input_interpolation, output_interpolation]),
        monai.transforms.EnsureTyped(keys=["input_image", "output_image"]),
        monai.transforms.Orientationd(keys=["input_image", "output_image"], axcodes=orientation),
        monai.transforms.CropForegroundd(keys=["input_image", "output_image"],
                                         source_key="input_image", margin=ImageCropForeGroundMargin),
    ]
)
I then try to move each batch to the GPU in a loop to test the data loader, as below:
for repeated_evaluation in range(data_validity_epochs):
    for batch_data in tqdm(data_loader_train):
        input_image, output_image = (torch.squeeze(batch_data["input_image"], dim=-1).to(device),
                                     torch.squeeze(batch_data["output_image"], dim=-1).to(device))
After a few epochs (sometimes 100, sometimes only five), this loop stops without any error message.
I have tried it with CUDA 12.1 and 11.8, and with different numbers of workers from 0 to 10. I am using Windows with a Core i9-13900KF CPU and an RTX 4090 GPU. The problem is still there.
The exact same code and data work fine on another machine with an RTX 3080 GPU and a Core i7-12700KF CPU.
Could you please help me to solve it?
Thanks
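One way to get more information out of a loop that stops silently is Python's standard faulthandler module, which can write the tracebacks of all threads to a file when an iteration takes too long or the interpreter crashes. The sketch below is only a suggestion under stated assumptions: the helper name watched, the log file name, and the timeout are hypothetical, and a plain range stands in for data_loader_train so that the example runs on its own. It only observes the main process, so a failure inside a worker process or inside the GPU driver may not show up in the log.

```python
import faulthandler
import os
import tempfile


def watched(iterable, timeout_s, log_file):
    """Yield items from iterable; dump all thread tracebacks to log_file
    if fetching an item plus processing it takes longer than timeout_s."""
    try:
        faulthandler.dump_traceback_later(timeout_s, file=log_file)
        for item in iterable:
            yield item  # the caller's loop body runs here
            # Calling dump_traceback_later again replaces the pending timer,
            # so every iteration gets a fresh timeout.
            faulthandler.dump_traceback_later(timeout_s, file=log_file)
    finally:
        faulthandler.cancel_dump_traceback_later()


# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "loader_watchdog.log")

with open(log_path, "w") as log_file:
    # Also record a traceback if the main process dies from a fatal error.
    faulthandler.enable(file=log_file)
    items = []
    # range(5) stands in for data_loader_train in this self-contained sketch.
    for batch_data in watched(range(5), timeout_s=10.0, log_file=log_file):
        items.append(batch_data)
    faulthandler.disable()

print(items)  # [0, 1, 2, 3, 4]
```

In the real test loop, data_loader_train would replace range(5), and the timeout would need to be longer than the slowest normal iteration. If the log stays empty when the loop stops, that suggests the stop happens outside the main Python process.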