Hi, thanks for your work.
I am trying to reproduce the paper's results on the GCG task with the GranDf dataset, following the documentation.
However, I encountered the following error:
Epoch: [0][ 1/500] Loss 6.0447 (6.6629) CeLoss 2.9219 (3.6617) MaskBCELoss 2.6241 (2.5692) MaskDICELoss 0.4987 (0.4320) MaskLoss 3.1229 (3.0012)
Epoch: [0][ 2/500] Loss 4.8270 (5.0576) CeLoss 2.5938 (3.2406) MaskBCELoss 1.7405 (1.4278) MaskDICELoss 0.4927 (0.3891) MaskLoss 2.2332 (1.8170)
Epoch: [0][ 3/500] Loss 5.5492 (5.2785) CeLoss 3.6094 (3.3023) MaskBCELoss 1.6736 (1.5641) MaskDICELoss 0.2661 (0.4121) MaskLoss 1.9398 (1.9761)
Epoch: [0][ 4/500] Loss 6.2472 (5.2980) CeLoss 3.7344 (3.2023) MaskBCELoss 2.0935 (1.6982) MaskDICELoss 0.4194 (0.3975) MaskLoss 2.5129 (2.0957)
Epoch: [0][ 5/500] Loss 5.3745 (4.9203) CeLoss 3.0312 (3.2773) MaskBCELoss 1.8710 (1.2579) MaskDICELoss 0.4723 (0.3851) MaskLoss 2.3432 (1.6429)
Traceback (most recent call last):
File "/mnt/nasv3_2/zhangxinliang/LMMs/groundingLMM/train.py", line 673, in <module>
main(args)
File "/mnt/nasv3_2/zhangxinliang/LMMs/groundingLMM/train.py", line 467, in main
dataset_iters = train(
File "/mnt/nasv3_2/zhangxinliang/LMMs/groundingLMM/train.py", line 557, in train
output_dict = model(**data_batch)
File "/home/zxl/anaconda3/envs/glamm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zxl/anaconda3/envs/glamm/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/home/zxl/anaconda3/envs/glamm/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1829, in forward
loss = self.module(*inputs, **kwargs)
File "/home/zxl/anaconda3/envs/glamm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/nasv3_2/zhangxinliang/LMMs/groundingLMM/model/GLaMM.py", line 131, in forward
return super().forward(**kwargs) if "past_key_values" in kwargs else self.model_forward(**kwargs)
File "/mnt/nasv3_2/zhangxinliang/LMMs/groundingLMM/model/GLaMM.py", line 167, in model_forward
return self._calculate_losses(pred_masks, masks_list, output)
File "/mnt/nasv3_2/zhangxinliang/LMMs/groundingLMM/model/GLaMM.py", line 248, in _calculate_losses
loss_components = self._compute_loss_components(pred_masks, masks_list, output)
File "/mnt/nasv3_2/zhangxinliang/LMMs/groundingLMM/model/GLaMM.py", line 267, in _compute_loss_components
assert gt_mask.shape[0] == pred_mask.shape[
AssertionError: Shape mismatch: gt_mask torch.Size([2, 640, 480]), pred_mask torch.Size([3, 640, 480])
This error usually appears only after training has run for several iterations.
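For what it's worth, here is a minimal sketch of the kind of check I could add right before the failing assert to see which batch item diverges (the pred_masks / masks_list names follow the traceback; everything else is my assumption about their layout):

# Hedged diagnostic sketch: assumes pred_masks and masks_list are parallel
# lists with one (num_masks, H, W) tensor per image, as the assert implies.
def report_mask_count_mismatch(pred_masks, masks_list):
    for i, (pred_mask, gt_mask) in enumerate(zip(pred_masks, masks_list)):
        if pred_mask.shape[0] != gt_mask.shape[0]:
            print(f"batch item {i}: pred has {pred_mask.shape[0]} masks, "
                  f"gt has {gt_mask.shape[0]} masks")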
My shell script is:
export CUDA_VISIBLE_DEVICES=4,5 # Adjust based on your GPU setup
# Environment variable settings (optional, based on your requirements)
export CUDA_LAUNCH_BLOCKING=1
# Setting a dynamic master port (optional)
export MASTER_PORT=$(shuf -i 2000-65000 -n 1)
# Path to the checkpoint and output directory (modify according to your setup)
export CKPT_PATH="./ramdisk/GLaMM-GranD-Pretrained"
export OUTPUT_DIR_PATH='output/myoffical_finetune_glamm_gcg'
deepspeed --master_port $MASTER_PORT train.py \
--version $CKPT_PATH \
--dataset_dir ./Dataset_LLM/ \
--vision_pretrained ./checkpoints/sam_vit_h_4b8939.pth \
--exp_name $OUTPUT_DIR_PATH \
--lora_r 8 \
--lr 3e-4 \
--pretrained \
--use_segm_data \
--seg_dataset "RefCoco_GCG||PSG_GCG||Flickr_GCG||GranDf_GCG" \
--segm_sample_rates "3,3,3,1" \
--val_dataset "FlickrGCGVal|RefCocoGCGVal|PsgGCGVal" \
--epochs 10 \
--steps_per_epoch 500 \
    --mask_validation
I also found that this error does not happen if only GranDf_GCG is used, but it does happen with the other three sub-datasets.
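If it helps narrow things down, a hypothetical offline check along these lines (the names and data layout are my assumptions, not the repo's actual API) could verify that each GCG sample carries one segmentation token per ground-truth mask, since a per-sample mismatch there would match the count drift in the assert:

# Hypothetical sketch: assumes each sample provides its answer text and a
# ground-truth mask tensor of shape (num_masks, H, W), and that the model
# predicts one mask per "[SEG]" token in the answer (LISA/GLaMM-style).
def find_mismatched_samples(samples, seg_token="[SEG]"):
    bad = []
    for idx, (answer_text, gt_masks) in enumerate(samples):
        if answer_text.count(seg_token) != gt_masks.shape[0]:
            bad.append(idx)
    return bad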