Data augmentation causes 'DataLoader worker (pid(s) 2642711) exited unexpectedly' #2404
Unanswered
dianemarquette
asked this question in
Q&A
Replies: 4 comments 14 replies
-
Hi @dianemarquette, thanks for your interest here.
13 replies
-
I have the same error.
1 reply
-
The script is a routine data loader and training loop; a Compose of transforms is all I used.
My computer has 64 GB of RAM and an RTX 3090.
I updated to MONAI 0.9.1 and deleted most of the transforms, keeping only LoadImaged, EnsureChannelFirstd, RandCropByLabelClassesd, Orientationd, and NormalizeIntensityd.
It works, but the validation Dice is very low, only 0.7-0.8.
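The thread does not establish why the shorter pipeline works. One possibility, offered only as a guess, is memory: in the posted script the random crop comes after the affine and elastic transforms, so those run on whole volumes in every worker. The sketch below is plain arithmetic for float32 arrays; the volume shape, patch size, worker count, and the "image + label + 3-channel sampling grid" breakdown are all assumptions chosen for illustration, not measurements from this thread.

```python
from math import prod

BYTES_PER_FLOAT32 = 4


def array_mib(shape, channels=1):
    """Size in MiB of a float32 array with the given spatial shape and channel count."""
    return prod(shape) * channels * BYTES_PER_FLOAT32 / 2**20


def per_worker_mib(shape):
    """Image + label + a 3-channel sampling grid, all float32 (an assumed breakdown)."""
    return array_mib(shape) * 2 + array_mib(shape, channels=3)


full_volume = (512, 512, 300)  # hypothetical scan size
patch = (96, 96, 96)           # hypothetical patch size
num_workers = 8                # hypothetical worker count

print(f"one full volume:       {array_mib(full_volume):.1f} MiB")
print(f"per worker (full):     {per_worker_mib(full_volume):.1f} MiB")
print(f"all workers (full):    {per_worker_mib(full_volume) * num_workers:.1f} MiB")
print(f"per worker (patch):    {per_worker_mib(patch):.3f} MiB")
```

If the numbers for a real dataset come out large, cropping before the heavy spatial augmentations or lowering the worker count are two things worth trying.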
------------------ Original Message ------------------
From: "904604254" ***@***.***>;
Sent: Thursday, September 8, 2022, 9:32 AM
***@***.******@***.***>;
***@***.***>;
Subject: Re: [Project-MONAI/MONAI] Data augmentation causes 'DataLoader worker (pid(s) 2642711) exited unexpectedly' (Discussion #2404)
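# Imports the snippet relies on (assumed; the original post omitted them).
# `data_dir`, `opt`, `device`, and `build_UNETR` are defined elsewhere in the poster's script.
import glob
import os
from time import time

import numpy as np
import torch
import monai
from monai.data import DataLoader, list_data_collate
from monai.metrics import DiceMetric
from monai.transforms import (
    AddChanneld, AsDiscrete, Compose, EnsureType, KeepLargestConnectedComponent,
    LoadImaged, Rand3DElasticd, RandAdjustContrastd, RandAffined, RandFlipd,
    RandGaussianNoised, RandGaussianSmoothd, RandSpatialCropd, SaveImage,
    SpatialPadd, ToTensord,
)
from monai.utils import set_determinism
from torch.utils.tensorboard import SummaryWriter
from tqdm import tqdm

train_images = sorted(
    glob.glob(os.path.join(data_dir, "imagesTr", "*.nii.gz")))
train_labels = sorted(
    glob.glob(os.path.join(data_dir, "labelsTr", "*.nii.gz")))
data_dicts = [
    {"image": image_name, "label": label_name}
    for image_name, label_name in zip(train_images, train_labels)
]
# Hold out the last 10 cases for validation. The original post used data_dicts[:-7]
# for training, which left 3 cases in both sets.
train_dicts, val_dicts = data_dicts[:-10], data_dicts[-10:]
set_determinism(seed=666)
print('Number of training images per epoch:', len(train_dicts))
print('Number of validation images per epoch:', len(val_dicts))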
# Creation of data directories for data_loader
# Transforms for training and validation
train_transforms = [
    LoadImaged(keys=['image', 'label']),
    AddChanneld(keys=['image', 'label']),
    # CropForegroundd(keys=['image', 'label'], source_key='image'),  # crop foreground
    # Spacingd(
    #     keys=["image", "label"],
    #     pixdim=(1.5, 1.5, 2.0),
    #     mode=("bilinear", "nearest"),
    # ),
    RandFlipd(keys=['image', 'label'], prob=0.15, spatial_axis=1),
    RandFlipd(keys=['image', 'label'], prob=0.15, spatial_axis=0),
    RandFlipd(keys=['image', 'label'], prob=0.15, spatial_axis=2),
    RandAffined(keys=['image', 'label'], mode=('bilinear', 'nearest'), prob=0.1,
                rotate_range=(np.pi / 36, np.pi / 36, np.pi * 2), padding_mode="zeros"),
    RandAffined(keys=['image', 'label'], mode=('bilinear', 'nearest'), prob=0.1,
                rotate_range=(np.pi / 36, np.pi / 2, np.pi / 36), padding_mode="zeros"),
    RandAffined(keys=['image', 'label'], mode=('bilinear', 'nearest'), prob=0.1,
                rotate_range=(np.pi / 2, np.pi / 36, np.pi / 36), padding_mode="zeros"),
    Rand3DElasticd(keys=['image', 'label'], mode=('bilinear', 'nearest'), prob=0.1,
                   sigma_range=(5, 8), magnitude_range=(100, 200), scale_range=(0.15, 0.15, 0.15),
                   padding_mode="zeros"),
    RandGaussianSmoothd(keys=["image"], sigma_x=(0.5, 1.15), sigma_y=(0.5, 1.15), sigma_z=(0.5, 1.15), prob=0.1),
    RandAdjustContrastd(keys=['image'], gamma=(0.5, 2.5), prob=0.1),
    # Note: np.random.uniform is evaluated once here, when the transform is constructed,
    # so mean and std stay fixed for the whole run rather than varying per sample.
    RandGaussianNoised(keys=['image'], prob=0.1, mean=np.random.uniform(0, 0.5), std=np.random.uniform(0, 1)),
    SpatialPadd(keys=['image', 'label'], spatial_size=opt.patch_size, method='end'),  # pad if the image is smaller than the patch
    RandSpatialCropd(keys=['image', 'label'], roi_size=opt.patch_size, random_size=False),
    ToTensord(keys=['image', 'label'])
]
val_transforms = [
    LoadImaged(keys=['image', 'label']),
    AddChanneld(keys=['image', 'label']),
    # CropForegroundd(keys=['image', 'label'], source_key='image'),  # crop foreground
    # Spacingd(
    #     keys=["image", "label"],
    #     pixdim=(1.5, 1.5, 2.0),
    #     mode=("bilinear", "nearest"),
    # ),
    SpatialPadd(keys=['image', 'label'], spatial_size=opt.patch_size, method='end'),  # pad if the image is smaller than the patch
    ToTensord(keys=['image', 'label'])
]
train_transforms = Compose(train_transforms)
val_transforms = Compose(val_transforms)

# Datasets and loaders
train_ds = monai.data.Dataset(data=train_dicts, transform=train_transforms)
train_loader = DataLoader(train_ds, batch_size=opt.batch_size, shuffle=True, collate_fn=list_data_collate, num_workers=opt.workers, pin_memory=True)
val_ds = monai.data.Dataset(data=val_dicts, transform=val_transforms)
val_loader = DataLoader(val_ds, batch_size=1, num_workers=opt.workers, collate_fn=list_data_collate, pin_memory=True)

# Post-processing and output writers
post_pred = Compose([EnsureType(), AsDiscrete(argmax=True, to_onehot=6), KeepLargestConnectedComponent(is_onehot=True, applied_labels=[2, 4, 5])])
post_label = Compose([EnsureType(), AsDiscrete(to_onehot=6)])
saver_ori = SaveImage(output_dir=opt.output_folder, output_ext=".nii.gz", output_postfix="ori", print_log=True)
saver_gt = SaveImage(output_dir=opt.output_folder, output_ext=".nii.gz", output_postfix="gt", print_log=True)
saver_seg = SaveImage(output_dir=opt.output_folder, output_ext=".nii.gz", output_postfix="seg", print_log=True)

net = build_UNETR()  # UNETR
print('net == unetr')
net.to(device)

# Loss, metrics, and optimizer
loss_function = monai.losses.DiceCELoss(to_onehot_y=True, softmax=True)
torch.backends.cudnn.benchmark = opt.benchmark  # for accelerating convolutions
dice_metric = DiceMetric(include_background=True, reduction="mean", get_not_nans=False)  # mean Dice calculation
dice_metric_batch = DiceMetric(include_background=True, reduction="mean_batch")
scaler = torch.cuda.amp.GradScaler()
optim = torch.optim.AdamW(net.parameters(), lr=1e-4, weight_decay=1e-5)

# Bookkeeping
val_interval = 1
best_metric = -1
best_metric_epoch = -1
epoch_loss_values = []
metric_values = []
metric_values_tb = []
metric_values_tc = []
metric_values_fb = []
metric_values_lfc = []
metric_values_rfc = []
train_time = time()
writer = SummaryWriter()  # store in runs/***
for epoch in range(opt.epochs):
    print("-" * 20)
    epoch_start = time()
    net.train()
    epoch_loss = 0
    step = 0
    pbar = tqdm(train_loader, dynamic_ncols=True)
    for batch_data in pbar:
        pbar.set_description(desc='Train_Epoch : {}/{}'.format(epoch + 1, opt.epochs))
        step += 1
        inputs, labels = (
            batch_data["image"].to(device),
            batch_data["label"].to(device),
        )
        with torch.cuda.amp.autocast():
            logit_map = net(inputs)
            loss = loss_function(logit_map, labels)
        # outputs = net(inputs)
        # loss = loss_function(outputs, labels)
        scaler.scale(loss).backward()
        epoch_loss += loss.item()
        scaler.unscale_(optim)
        scaler.step(optim)
        scaler.update()
        optim.zero_grad()
    epoch_loss /= step
    epoch_loss_values.append(epoch_loss)
    print(f"epoch {epoch + 1} average loss: {epoch_loss:.4f} ")
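Holding out the last few cases with negative slices keeps training and validation disjoint only when both slices cut at the same index. Mixing offsets, for example `[:-7]` with `[-10:]`, puts three cases in both sets. The helper below is a minimal sketch of a split that is disjoint by construction; `split_train_val` and the file names are hypothetical, made up for illustration.

```python
def split_train_val(items, n_val):
    """Hold out the last n_val items for validation; everything before them is training."""
    if not 0 < n_val < len(items):
        raise ValueError("n_val must be between 1 and len(items) - 1")
    train, val = items[:-n_val], items[-n_val:]
    # Both slices cut at the same index, so every item lands in exactly one set.
    assert len(train) + len(val) == len(items)
    return train, val


cases = [f"case_{i:02d}.nii.gz" for i in range(20)]  # hypothetical file names

# Mismatched offsets: the two slices share some cases.
mixed_overlap = sorted(set(cases[:-7]) & set(cases[-10:]))
print("shared with mismatched offsets:", mixed_overlap)

# Matching offsets through the helper: no shared cases.
train, val = split_train_val(cases, n_val=10)
print("shared with the helper:", sorted(set(train) & set(val)))
```

The same helper works on a list of `{"image": ..., "label": ...}` dictionaries, since it only slices and counts.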
------------------ Original Message ------------------
From: "Richard ***@***.***>;
Sent: Thursday, September 1, 2022, 12:15 AM
To: ***@***.***>;
Cc: ***@***.***>; ***@***.***>;
Subject: Re: [Project-MONAI/MONAI] Data augmentation causes 'DataLoader worker (pid(s) 2642711) exited unexpectedly' (Discussion #2404)
Please provide an example that demonstrates your problem. Thanks.
-
Hi!
I used MONAI's 3D spleen segmentation tutorial to train a model with my own dataset. However, I would like to leverage data augmentation to reduce overfitting.
When I define my preprocessing transforms for my training dataset like in the tutorial:
everything runs smoothly.
However, when I include additional transforms for data augmentation:
I get the following error:
I've tried to fix it, but I'm a bit stuck. Any guidance or tips would be greatly appreciated :).
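A "worker exited unexpectedly" message says little about which transform is at fault. A common first step, sketched here under assumptions, is to set `num_workers=0` and push a single sample through the pipeline one transform at a time in the main process, so that any Python exception names the failing step. The helper `run_step_by_step` and the two stand-in transforms are hypothetical; with MONAI, the list held in a `Compose` object's `transforms` attribute could be passed instead.

```python
from typing import Any, Callable, Sequence


def run_step_by_step(sample: Any, transforms: Sequence[Callable[[Any], Any]]) -> Any:
    """Apply each transform in order; on failure, report its position and name."""
    for index, transform in enumerate(transforms):
        # Functions carry __name__; transform objects fall back to their class name.
        name = getattr(transform, "__name__", type(transform).__name__)
        try:
            sample = transform(sample)
        except Exception as exc:
            raise RuntimeError(f"transform #{index} ({name}) failed: {exc!r}") from exc
    return sample


# Hypothetical stand-ins for real transforms.
def add_channel(sample):
    return {key: [value] for key, value in sample.items()}


def broken_augmentation(sample):
    raise ValueError("shape mismatch")


try:
    run_step_by_step({"image": [1, 2], "label": [0, 1]}, [add_channel, broken_augmentation])
except RuntimeError as err:
    message = str(err)
    print(message)
```

If every transform passes in the main process and the error only appears with several workers, the cause may lie outside Python (for example a worker killed by the operating system), which this check cannot detect.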