Diverging Dice Metric in Validation Phase #6931
-
Hello, I'm a novice in AI and image segmentation. I have used the UNet pipeline provided in the tutorials for my training phase. Based on the mean dice metric, the network achieved a value of about 0.68 in a 50-epoch training run with a dataset of 160 training images. For the validation phase, I expected the UNet to reach a dice value similar to the one from the training phase. However, the network diverges and reaches only about 0.06. For the validation phase, I used the following code:

```python
if len(devices) > 1:
    model = torch.nn.DataParallel(model, device_ids=devices)

post_label = Compose([AsDiscrete(to_onehot=3)])

model.eval()
with torch.no_grad():
    for val_data in val_loader:
        val_inputs, val_labels = val_data["image"].to(device), val_data["label"].to(device)
        # define sliding window size and batch size for windows inference
        roi_size = (96, 96, 96)
        sw_batch_size = 4
        val_outputs = sliding_window_inference(val_inputs, roi_size, sw_batch_size, model)
        val_outputs = [post_trans(i) for i in decollate_batch(val_outputs)]
        val_labels = [post_label(i) for i in decollate_batch(val_labels)]
        # compute metric for current iteration
        dice_metric(y_pred=val_outputs, y=val_labels)
        for val_output_dice in val_outputs:
            saver(val_output_dice)
        IoU_metric(y_pred=val_outputs, y=val_labels)
        # for val_output_IoU in val_outputs:
        #     saver(val_output_IoU)

    # aggregate the final mean dice result
    print("evaluation dice metric:", dice_metric.aggregate().item())
    print("evaluation IoU metric:", IoU_metric.aggregate().item())
    # reset the status
    dice_metric.reset()
    IoU_metric.reset()
```
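For context, `dice_metric`, `IoU_metric`, `post_trans`, and `saver` are not defined in the snippet above; the following is a minimal sketch of how such objects are typically set up in the MONAI tutorials. The exact classes and arguments here are assumptions, not my actual definitions.

```python
# Assumed setup (not my actual code): typical metric / post-processing / saver
# construction following the MONAI tutorials.
from monai.data import decollate_batch
from monai.metrics import DiceMetric, MeanIoU
from monai.transforms import Activations, AsDiscrete, Compose, SaveImage

# cumulative metrics, aggregated once per validation run
dice_metric = DiceMetric(include_background=False, reduction="mean")
IoU_metric = MeanIoU(include_background=False, reduction="mean")

# binary-style post-processing (sigmoid + 0.5 threshold) applied to the predictions
post_trans = Compose([Activations(sigmoid=True), AsDiscrete(threshold=0.5)])

# writes each decollated prediction to disk
saver = SaveImage(output_dir="./output", output_ext=".nii.gz", output_postfix="seg")
```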
In the end, I got the low evaluation dice described above (about 0.06). So I thought maybe the network was overfitting to the training data and couldn't segment new data well. In the next training runs I reduced the epochs to 30, 20, 15, and 10; while the training dice in each run stayed below 0.67 and dropped as the number of epochs decreased, I would always get the exact same evaluation dice value in the validation phase.
So now I was sure that the training phase was not overfitting. Thus, I replaced the images used in the validation phase with the ones the network was actually trained on. The result was, again, the same. To make sure that nothing was wrong in the validation phase, I added ...
```python
dice_metric(y_pred=val_outputs, y=val_labels)
for val_output_dice in val_outputs:
    saver(val_output_dice)
print("evaluation dice metric:", dice_metric.aggregate().item())
```
... so after each segmentation in the validation phase, I get to see the dice score of that individual segmentation (not sure it's the correct way though). Again, I get the same dice values between 0.052 and 0.062, most of them being 0.060. Here's a bar plot of the outputted dice values:

Any help with solving this issue would be very much appreciated! Thanks in advance,
-
Hi @Kiarashdnsh, thanks for your interest here.
Hi @Kiarashdnsh, I think you can replace all `post_trans` with `post_pred`. For more segmentation details, you can also refer to https://github.com/Project-MONAI/tutorials/tree/main/3d_segmentation.
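For reference, a minimal sketch of what that pair of post-processing transforms usually looks like in the tutorials for a 3-class one-hot setup like yours (`to_onehot=3` is taken from your `post_label` line; the rest is an assumption):

```python
from monai.transforms import AsDiscrete, Compose

# take the argmax over the class channel, then one-hot encode so the
# prediction matches the one-hot ground truth produced by post_label
post_pred = Compose([AsDiscrete(argmax=True, to_onehot=3)])
post_label = Compose([AsDiscrete(to_onehot=3)])
```

If your `post_trans` was a sigmoid + threshold transform meant for binary segmentation, the 3-channel logits are never argmax-ed, which would explain the consistently low dice against the one-hot labels.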
Thanks.