Nan losses with samll batch size #1775

hugopi · 2023-05-11T10:47:16Z

hugopi
May 11, 2023

Dear kerasCV team,

I am writing again to you as I am facing a problem using retinaNet Model. It concern the losses.

When I try to train the model with small batches of data (8) all losses turns NaN. I had to increase the batch size to 32 and add global_clipnorm=10 before the model copute correct losses. But When I put some more samples, 32 samples per batche reproduce NaN and I had to increase batch size until 64. And so on if i add more samples.

I want to reduce the batch size in order to finetune the backbone. (batch of 64 make my GPU OOM)

I appreciate any help you could provide in resolving this issue.

Nice new presentation for the new release by the way ! New tutorials are great !

Best regards,
Hugo

Training with small batch size :

Epoch 1/100
483/483 [==============================] - 60s 89ms/step - loss: nan - box_loss: nan - classification_loss: nan - percent_boxes_matched_with_anchor: 0.0018 - val_loss: nan - val_box_loss: nan - val_classification_loss: nan - val_percent_boxes_matched_with_anchor: 0.0000e+00 - lr: 0.0050
Epoch 2/100
483/483 [==============================] - 26s 55ms/step - loss: nan - box_loss: nan - classification_loss: nan - percent_boxes_matched_with_anchor: 0.0018 - val_loss: nan - val_box_loss: nan - val_classification_loss: nan - val_percent_boxes_matched_with_anchor: 0.0000e+00 - lr: 0.0050

Training with batch size = 64:

Epoch 1/100
7/7 [==============================] - 48s 4s/step - loss: 2.0109 - box_loss: 0.5836 - classification_loss: 1.4273 - percent_boxes_matched_with_anchor: 0.8930 - val_loss: 2.1265 - val_box_loss: 0.6971 - val_classification_loss: 1.4294 - val_percent_boxes_matched_with_anchor: 0.8565 - lr: 0.0050
Epoch 2/100
7/7 [==============================] - 10s 1s/step - loss: 1.9306 - box_loss: 0.4844 - classification_loss: 1.4462 - percent_boxes_matched_with_anchor: 0.8930 - val_loss: 2.0358 - val_box_loss: 0.7192 - val_classification_loss: 1.3166 - val_percent_boxes_matched_with_anchor: 0.8565 - lr: 0.0050
Epoch 3/100
7/7 [==============================] - 11s 2s/step - loss: 1.7796 - box_loss: 0.4412 - classification_loss: 1.3384 - percent_boxes_matched_with_anchor: 0.8930 - val_loss: 1.9590 - val_box_loss: 0.6382 - val_classification_loss: 1.3208 - val_percent_boxes_matched_with_anchor: 0.8565 - lr: 0.0050

(The val_percent_boxes_matched_with_anchor value is remaining constant, is it normal ? )

My images are a stack of 3 radar band gama corrected to put them between 0 and 1. I give you some of my inputs here :

>>> train_dataset 

<PrefetchDataset element_spec={'images': TensorSpec(shape=(64, 512, 512, 3), dtype=tf.float32, name=None), 'bounding_boxes': {'boxes': RaggedTensorSpec(TensorShape([64, None, 4]), tf.float32, 1, tf.int64), 'classes': RaggedTensorSpec(TensorShape([64, None]), tf.float32, 1, tf.int64)}}>

>>>next(iter(train_dataset))
{'images': <tf.Tensor: shape=(64, 512, 512, 3), dtype=float32, numpy=
array([[[[0.6547775 , 0.5771363 , 0.6547775 ],
         [0.6334593 , 0.5771363 , 0.6334593 ],
         [0.6126148 , 0.5771363 , 0.6126148 ],
         ...,
         [0.6291439 , 0.5771363 , 0.6291439 ],
         [0.6316387 , 0.5771363 , 0.6316387 ],
         [0.6373456 , 0.5771363 , 0.6373456 ]],
        [[0.6552488 , 0.5771363 , 0.6552488 ],
         [0.6371163 , 0.5771363 , 0.6371163 ],
         [0.6201546 , 0.5771363 , 0.6201546 ],
         ...,
         [0.62082434, 0.5771363 , 0.62082434],
         [0.6426415 , 0.5771363 , 0.6426415 ],
         [0.6585576 , 0.5771363 , 0.6585576 ]],
        [[0.65501314, 0.5771363 , 0.65501314],
         [0.64148647, 0.5771363 , 0.64148647],
         [0.6291439 , 0.5771363 , 0.6291439 ],
         ...,
         [0.6157083 , 0.5771363 , 0.6157083 ],
         [0.6472823 , 0.5771363 , 0.6472823 ],
         [0.6678632 , 0.5771363 , 0.6678632 ]],
        ...,
        [[0.5913914 , 0.5771363 , 0.5913914 ],
         [0.59011614, 0.5771363 , 0.59011614],
         [0.593096  , 0.5771363 , 0.593096  ],
         ...,
         [0.6232862 , 0.5771363 , 0.6232862 ],
         [0.5952337 , 0.5771363 , 0.5952337 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        [[0.6075664 , 0.5771363 , 0.6075664 ],
         [0.6168169 , 0.5771363 , 0.6168169 ],
         [0.63005   , 0.5771363 , 0.63005   ],
         ...,
         [0.6172609 , 0.5771363 , 0.6172609 ],
         [0.59288263, 0.5771363 , 0.59288263],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        [[0.60931766, 0.5771363 , 0.60931766],
         [0.62418383, 0.5771363 , 0.62418383],
         [0.6428727 , 0.5771363 , 0.6428727 ],
         ...,
         [0.6108541 , 0.5771363 , 0.6108541 ],
         [0.59459156, 0.5771363 , 0.59459156],
         [0.579387  , 0.5771363 , 0.579387  ]]],
       [[[0.6790082 , 0.5771363 , 0.6790082 ],
         [0.69595015, 0.5771363 , 0.69595015],
         [0.7169168 , 0.5771363 , 0.7169168 ],
         ...,
         [0.84056306, 0.72993004, 0.84056306],
         [0.8299702 , 0.73493683, 0.8299702 ],
         [0.817082  , 0.70173293, 0.817082  ]],
        [[0.68440396, 0.5771363 , 0.68440396],
         [0.6974541 , 0.5771363 , 0.6974541 ],
         [0.7149849 , 0.5771363 , 0.7149849 ],
         ...,
         [0.8352123 , 0.73573047, 0.8352123 ],
         [0.8222426 , 0.73931265, 0.8222426 ],
         [0.8226865 , 0.7082005 , 0.8226865 ]],
        [[0.6898426 , 0.5771363 , 0.6898426 ],
         [0.69896126, 0.5771363 , 0.69896126],
         [0.7131865 , 0.5771363 , 0.7131865 ],
         ...,
         [0.8281805 , 0.750569  , 0.8281805 ],
         [0.8129767 , 0.75029904, 0.8129767 ],
         [0.8304182 , 0.72300386, 0.8304182 ]],
        ...,
        [[0.668825  , 0.5771363 , 0.668825  ],
         [0.6951994 , 0.5771363 , 0.6951994 ],
         [0.7207962 , 0.5771363 , 0.7207962 ],
         ...,
         [0.7149849 , 0.5802214 , 0.7149849 ],
         [0.7164011 , 0.5771363 , 0.7164011 ],
         [0.71113676, 0.5771363 , 0.71113676]],
        [[0.6873651 , 0.5771363 , 0.6873651 ],
         [0.71126467, 0.5771363 , 0.71126467],
         [0.7338799 , 0.5771363 , 0.7338799 ],
         ...,
         [0.72482705, 0.5771363 , 0.72482705],
         [0.7292738 , 0.5771363 , 0.7292738 ],
         [0.72888035, 0.5771363 , 0.72888035]],
        [[0.693326  , 0.5771363 , 0.693326  ],
         [0.7143421 , 0.5771363 , 0.7143421 ],
         [0.73401195, 0.5771363 , 0.73401195],
         ...,
         [0.734144  , 0.5771363 , 0.734144  ],
         [0.74104357, 0.5771363 , 0.74104357],
         [0.74478555, 0.5771363 , 0.74478555]]],
       [[[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.58294153, 0.5771363 , 0.58294153],
         [0.5804302 , 0.5771363 , 0.5804302 ],
         [0.5835711 , 0.5771363 , 0.5835711 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5926694 , 0.5771363 , 0.5926694 ],
         [0.5918171 , 0.5771363 , 0.5918171 ],
         [0.60082775, 0.5771363 , 0.60082775]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.6045138 , 0.5771363 , 0.6045138 ],
         [0.604079  , 0.5771363 , 0.604079  ],
         [0.61837226, 0.5771363 , 0.61837226]],
        ...,
        [[0.5800127 , 0.5771363 , 0.5800127 ],
         [0.59160423, 0.5771363 , 0.59160423],
         [0.6201546 , 0.5771363 , 0.6201546 ],
         ...,
         [0.62891763, 0.5771363 , 0.62891763],
         [0.6226138 , 0.5771363 , 0.6226138 ],
         [0.6001796 , 0.5771363 , 0.6001796 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.57813764, 0.5771363 , 0.57813764],
         [0.6117338 , 0.5771363 , 0.6117338 ],
         ...,
         [0.6345999 , 0.5771363 , 0.6345999 ],
         [0.61615145, 0.5771363 , 0.61615145],
         [0.6071294 , 0.5771363 , 0.6071294 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5835711 , 0.5771363 , 0.5835711 ],
         [0.60647446, 0.5771363 , 0.60647446],
         ...,
         [0.6115138 , 0.5771363 , 0.6115138 ],
         [0.6071294 , 0.5771363 , 0.6071294 ],
         [0.6139387 , 0.5771363 , 0.6139387 ]]],
       ...,
       [[[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5812661 , 0.5771363 , 0.5812661 ],
         [0.5863069 , 0.5771363 , 0.5863069 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5779297 , 0.5771363 , 0.5779297 ]],
        [[0.5812661 , 0.5771363 , 0.5812661 ],
         [0.586096  , 0.5771363 , 0.586096  ],
         [0.5960909 , 0.5771363 , 0.5960909 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5918171 , 0.5771363 , 0.5918171 ]],
        ...,
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]]],
       [[[0.7531387 , 0.5771363 , 0.7531387 ],
         [0.74733555, 0.5771363 , 0.74733555],
         [0.76971173, 0.5771363 , 0.76971173],
         ...,
         [0.71126467, 0.5771363 , 0.71126467],
         [0.73931265, 0.5771363 , 0.73931265],
         [0.7451876 , 0.5771363 , 0.7451876 ]],
        [[0.7406438 , 0.5771363 , 0.7406438 ],
         [0.7391797 , 0.5771363 , 0.7391797 ],
         [0.75762296, 0.5771363 , 0.75762296],
         ...,
         [0.709348  , 0.5771363 , 0.709348  ],
         [0.7405105 , 0.5771363 , 0.7405105 ],
         [0.75735044, 0.5771363 , 0.75735044]],
        [[0.7218343 , 0.5771363 , 0.7218343 ],
         [0.7338799 , 0.5771363 , 0.7338799 ],
         [0.757078  , 0.5771363 , 0.757078  ],
         ...,
         [0.7018592 , 0.5771363 , 0.7018592 ],
         [0.73573047, 0.5771363 , 0.73573047],
         [0.75517374, 0.5771363 , 0.75517374]],
        ...,
        [[0.7167878 , 0.5771363 , 0.7167878 ],
         [0.72691625, 0.5771363 , 0.72691625],
         [0.73931265, 0.5771363 , 0.73931265],
         ...,
         [0.7450535 , 0.5771363 , 0.7450535 ],
         [0.7531387 , 0.5771363 , 0.7531387 ],
         [0.7478735 , 0.5771363 , 0.7478735 ]],
        [[0.7051496 , 0.5771363 , 0.7051496 ],
         [0.7119047 , 0.5771363 , 0.7119047 ],
         [0.7192419 , 0.5771363 , 0.7192419 ],
         ...,
         [0.7361276 , 0.5771363 , 0.7361276 ],
         [0.7482772 , 0.5771363 , 0.7482772 ],
         [0.748008  , 0.5771363 , 0.748008  ]],
        [[0.6888505 , 0.5771363 , 0.6888505 ],
         [0.6908361 , 0.5771363 , 0.6908361 ],
         [0.69682705, 0.5771363 , 0.69682705],
         ...,
         [0.70476913, 0.5771363 , 0.70476913],
         [0.727047  , 0.5771363 , 0.727047  ],
         [0.72914267, 0.5771363 , 0.72914267]]],
       [[[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        [[0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         ...,
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ],
         [0.5771363 , 0.5771363 , 0.5771363 ]],
        ...,
        [[0.84419996, 0.69645107, 0.84419996],
         [0.84884447, 0.6924535 , 0.84884447],
         [0.8599876 , 0.69896126, 0.8599876 ],
         ...,
         [0.74733555, 0.6908361 , 0.74733555],
         [0.721315  , 0.6736549 , 0.721315  ],
         [0.72391486, 0.66522527, 0.72391486]],
        [[0.84131944, 0.6960753 , 0.84131944],
         [0.86277676, 0.7052764 , 0.86277676],
         [0.8861352 , 0.72665477, 0.8861352 ],
         ...,
         [0.74478555, 0.71408516, 0.74478555],
         [0.7153709 , 0.6935755 , 0.7153709 ],
         [0.7087103 , 0.65926874, 0.7087103 ]],
        [[0.8275848 , 0.6790082 , 0.8275848 ],
         [0.86425245, 0.69009084, 0.86425245],
         [0.8842244 , 0.70947564, 0.8842244 ],
         ...,
         [0.7592602 , 0.71988916, 0.7592602 ],
         [0.73256093, 0.691831  , 0.73256093],
         [0.6997161 , 0.6428727 , 0.6997161 ]]]], dtype=float32)>, 'bounding_boxes': {'boxes': <tf.RaggedTensor [[[484.0, 293.0, 12.0, 29.0]], [[359.0, 76.0, 10.0, 13.0]],
 [[197.0, 238.0, 10.0, 10.0]], [[43.0, 91.0, 10.0, 10.0],
                                [349.0, 323.0, 10.0, 10.0]],
 [[276.0, 345.0, 8.0, 10.0]], [[474.0, 82.0, 10.0, 10.0]],
 [[137.0, 202.0, 6.0, 6.0],
  [137.0, 192.0, 2.0, 6.0],
  [468.0, 143.0, 12.0, 12.0]], [[281.0, 188.0, 8.0, 12.0],
                                [211.0, 359.0, 8.0, 14.0],
                                [324.0, 340.0, 22.0, 20.0]],
 [[10.0, 302.0, 22.0, 8.0],
  [151.0, 469.0, 16.0, 11.0],
  [79.0, 292.0, 10.0, 14.0],
  [180.0, 397.0, 12.0, 12.0]], [[168.0, 6.0, 10.0, 10.0]],
 [[185.0, 190.0, 10.0, 10.0]], [[387.0, 25.0, 6.0, 10.0],
                                [424.0, 152.0, 6.0, 10.0],
                                [484.0, 112.0, 20.0, 24.0]],
 [[434.0, 119.0, 10.0, 10.0]], [[304.0, 438.0, 20.0, 18.0],
                                [2.0, 361.0, 16.0, 22.0],
                                [181.0, 148.0, 18.0, 16.0],
                                [457.0, 83.0, 28.0, 10.0],
                                [76.0, 269.0, 20.0, 22.0],
                                [354.0, 130.0, 16.0, 30.0],
                                [52.0, 464.0, 10.0, 18.0],
                                [457.0, 15.0, 14.0, 10.0],
                                [50.0, 100.0, 20.0, 24.0]] ,
 [[287.0, 201.0, 10.0, 10.0]], [[377.0, 22.0, 4.0, 6.0],
                                [403.0, 20.0, 4.0, 10.0],
                                [482.0, 42.0, 4.0, 2.0],
                                [446.0, 64.0, 4.0, 4.0]] ,
 [[92.0, 283.0, 10.0, 10.0]], [[38.0, 8.0, 10.0, 10.0]],
 [[320.0, 324.0, 8.0, 8.0]], [[457.0, 435.0, 10.0, 10.0]],
 [[258.0, 329.0, 10.0, 10.0]], [[388.0, 88.0, 4.0, 6.0],
                                [372.0, 92.0, 4.0, 6.0]],
 [[118.0, 446.0, 10.0, 10.0]], [[20.0, 418.0, 12.0, 12.0]],
 [[83.0, 372.0, 10.0, 10.0]], [[161.0, 129.0, 8.0, 10.0],
                               [179.0, 51.0, 8.0, 6.0],
                               [179.0, 14.0, 6.0, 6.0],
                               [234.0, 32.0, 6.0, 6.0],
                               [258.0, 22.0, 6.0, 6.0],
                               [246.0, 26.0, 4.0, 4.0],
                               [268.0, 22.0, 8.0, 8.0]]  ,
 [[181.0, 301.0, 10.0, 10.0]], [[477.0, 113.0, 4.0, 8.0]],
 [[141.0, 412.0, 4.0, 16.0]], [[65.0, 431.0, 8.0, 10.0],
                               [65.0, 413.0, 14.0, 6.0],
                               [158.0, 290.0, 4.0, 4.0],
                               [117.0, 280.0, 8.0, 10.0]],
 [[16.0, 222.0, 6.0, 10.0],
  [26.0, 165.0, 8.0, 10.0],
  [2.0, 308.0, 6.0, 6.0],
  [26.0, 141.0, 8.0, 8.0],
  [5.0, 15.0, 6.0, 18.0],
  [121.0, 167.0, 8.0, 18.0]], [[339.0, 18.0, 10.0, 10.0]],
 [[380.0, 410.0, 10.0, 10.0]], [[368.0, 244.0, 8.0, 12.0],
                                [410.0, 236.0, 10.0, 16.0],
                                [333.0, 170.0, 10.0, 16.0],
                                [238.0, 114.0, 10.0, 13.0],
                                [295.0, 97.0, 8.0, 10.0],
                                [390.0, 379.0, 6.0, 10.0]] ,
 [[432.0, 153.0, 10.0, 10.0]], [[15.0, 312.0, 8.0, 8.0]],
 [[123.0, 438.0, 10.0, 10.0]], [[469.0, 240.0, 26.0, 17.0],
                                [351.0, 461.0, 22.0, 22.0]],
 [[391.0, 297.0, 2.0, 6.0]], [[88.0, 25.0, 10.0, 10.0]],
 [[29.0, 480.0, 6.0, 10.0]], [[457.0, 169.0, 10.0, 10.0]],
 [[485.0, 431.0, 16.0, 18.0],
  [476.0, 274.0, 6.0, 10.0],
  [84.0, 464.0, 6.0, 4.0],
  [5.0, 430.0, 4.0, 9.0]]    , [[442.0, 392.0, 10.0, 10.0],
                                [481.0, 345.0, 10.0, 10.0],
                                [411.0, 210.0, 10.0, 10.0],
                                [317.0, 443.0, 10.0, 10.0],
                                [417.0, 20.0, 10.0, 10.0],
                                [299.0, 80.0, 10.0, 10.0],
                                [489.0, 141.0, 10.0, 10.0],
                                [389.0, 465.0, 10.0, 10.0]],
 [[46.0, 475.0, 10.0, 10.0]], [[467.0, 498.0, 10.0, 16.0],
                               [97.0, 116.0, 22.0, 8.0],
                               [69.0, 410.0, 4.0, 6.0],
                               [103.0, 297.0, 6.0, 6.0],
                               [121.0, 279.0, 6.0, 8.0],
                               [242.0, 214.0, 12.0, 14.0]],
 [[172.0, 154.0, 10.0, 10.0]], [[28.0, 259.0, 10.0, 10.0]],
 [[24.0, 269.0, 14.0, 14.0],
  [342.0, 457.0, 8.0, 10.0],
  [183.0, 347.0, 8.0, 6.0]] , [[141.0, 266.0, 6.0, 11.0]],
 [[441.0, 152.0, 10.0, 10.0]], [[180.0, 490.0, 10.0, 10.0]],
 [[25.0, 33.0, 8.0, 12.0],
  [438.0, 166.0, 8.0, 10.0],
  [361.0, 13.0, 18.0, 14.0],
  [462.0, 313.0, 18.0, 16.0],
  [440.0, 146.0, 20.0, 10.0],
  [220.0, 363.0, 28.0, 31.0],
  [279.0, 188.0, 30.0, 34.0],
  [156.0, 73.0, 25.0, 20.0]] , [[343.0, 272.0, 22.0, 28.0],
                                [250.0, 286.0, 6.0, 6.0]]  ,
 [[455.0, 393.0, 10.0, 10.0]], [[137.0, 190.0, 10.0, 10.0]],
 [[93.0, 320.0, 5.0, 8.0],
  [196.0, 425.0, 8.0, 8.0],
  [186.0, 254.0, 6.0, 6.0],
  [107.0, 286.0, 8.0, 12.0],
  [220.0, 475.0, 6.0, 8.0]] , [[134.0, 201.0, 10.0, 10.0],
                               [121.0, 435.0, 10.0, 10.0],
                               [11.0, 295.0, 10.0, 10.0]] ,
 [[98.0, 278.0, 2.0, 4.0]], [[413.0, 8.0, 6.0, 8.0],
                             [49.0, 430.0, 6.0, 14.0]],
 [[167.0, 361.0, 10.0, 10.0]], [[440.0, 260.0, 10.0, 10.0]],
 [[303.0, 120.0, 22.0, 30.0],
  [507.0, 428.0, 4.0, 8.0]]  , [[109.0, 34.0, 10.0, 10.0]]]>, 'classes': <tf.RaggedTensor [[1.0], [0.0], [1.0], [1.0, 1.0], [1.0], [1.0], [1.0, 1.0, 1.0],
 [1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0], [1.0], [1.0], [0.0, 0.0, 0.0],
 [1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0],
 [0.0, 0.0, 1.0, 1.0], [1.0], [1.0], [1.0], [1.0], [1.0], [1.0, 1.0],
 [1.0], [0.0], [1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0], [1.0],
 [1.0], [1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [1.0], [0.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [1.0], [1.0], [1.0], [1.0, 1.0], [1.0],
 [1.0], [1.0], [1.0], [0.0, 0.0, 1.0, 1.0],
 [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0],
 [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0], [1.0], [0.0, 0.0, 0.0], [1.0],
 [1.0], [1.0], [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 1.0], [1.0],
 [0.0], [1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0], [1.0, 1.0],
 [1.0], [1.0], [1.0, 0.0], [1.0]]>}}

I create my dataset from a generator (I am not using directly the generator because I do not succed to make it works) :

def dataset_from_generator(generator):
    """
    If you do not want to use a generator this method will give you a classic dataset
    :return: tensorflow batched dataset
    """
    print("\nDATASET")
    list_image_tensor = []
    list_box_tensor = []
    list_class_tensor = []

    for batch in generator:
        list_image_tensor.append(batch[0])
        batch_boxes = batch[1]['boxes']
        split_boxes = tf.split(batch_boxes, batch_boxes.shape[0], axis=0)

        batch_classes = batch[1]['classes']
        split_classes = tf.split(batch_classes, batch_classes.shape[0], axis=0)

        for i in range(len(split_boxes)):
            boxes = tf.squeeze(split_boxes[i], axis=0)
            list_box_tensor.append(boxes.numpy())

            classes = tf.squeeze(split_classes[i], axis=0)
            list_class_tensor.append(classes.numpy())

    # tensor
    stack_image_tensor = tf.concat(list_image_tensor, axis=0)
    stack_label_tensor = {'boxes': tf.ragged.constant(list_box_tensor, ragged_rank=1),
                          'classes': tf.ragged.constant(list_class_tensor, ragged_rank=1)}

    # tensor dataset
    tensor_dataset = tf.data.Dataset.from_tensor_slices(
        {"images": stack_image_tensor, "bounding_boxes": stack_label_tensor})

    print(f"-- split  : {generator.split}")
    print(f"-- number of samples  : {len(tensor_dataset)}")

    tensor_dataset = tensor_dataset.batch(generator.batch_size, drop_remainder=True)

    tensor_dataset = tensor_dataset.prefetch(2)

    # return tensor_dataset
    return tensor_dataset

The getitem of the generator :

    def __getitem__(self, idx):
        """Gets batch at position `index`."""
        list_image_tensor = []
        list_box_tensor = []
        list_class_tensor = []

        batch_patch_paths = self.patch_paths[idx * self.batch_size: (idx + 1) * self.batch_size]
        for patch_path in batch_patch_paths:
            # read dataframe of the chip (containing one or multiple detections
            dataframe = pd.read_csv(os.path.join(patch_path.path, 'annotations.csv'))
            # transform dataframe information in keras inputs
            keras_input = self.data_to_keras_input(dataframe)

            # get each detection's image, box and class
            list_image_tensor.append(keras_input['images'])

            boxes = tf.squeeze(keras_input['bounding_boxes']['boxes'], axis=0)
            list_box_tensor.append(boxes.numpy())

            classes = tf.squeeze(keras_input['bounding_boxes']['classes'], axis=0)
            list_class_tensor.append(classes.numpy())

        # stack detections in one tensor
        stack_image_tensor = tf.concat(list_image_tensor, axis=0)
        stack_label_tensor = {'boxes': tf.ragged.constant(list_box_tensor, ragged_rank=1),
                              'classes': tf.ragged.constant(list_class_tensor, ragged_rank=1)}

        return stack_image_tensor, stack_label_tensor
       ```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Nan losses with samll batch size #1775

Uh oh!

{{title}}

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

Nan losses with samll batch size #1775

Uh oh!

hugopi May 11, 2023

Replies: 0 comments

hugopi
May 11, 2023