
Commit 0c16964

WIP added yield mid-training + docs + fixes and cleanup
1 parent 733ad9d commit 0c16964

6 files changed (+46, -32 lines)

docs/res/guides/training_module_guide.rst

Lines changed: 21 additions & 15 deletions
@@ -23,9 +23,12 @@ TRAILMAP An emulation of the `TRAILMAP project on GitHub`_ using `3DUne
 .. _3DUnet for Pytorch: https://github.com/wolny/pytorch-3dunet
 
 .. important::
-   The machine learning models used by this program require all images of a dataset to be of the same size.
-   Please ensure that all the images you are loading are of the **same size**, or to use the **"extract patches" (in augmentation tab)** with an appropriately small size
-   to ensure all images being used by the model are of a workable size.
+   | The machine learning models used by this program require all images of a dataset to be of the same size.
+   | Please ensure that all the images you are loading are of the **same size**, or use the **"extract patches" (in augmentation tab)** option with an appropriately small size to ensure all images being used by the model are of a workable size.
+
+.. important::
+   | **All image sizes used should be as close to a power of two as possible, if not exactly a power of two.**
+   | Images are automatically padded; a 64-pixel cube will be used as is, but a 65-pixel cube will be padded up to 128 pixels, resulting in much higher memory use.
 
 The training module is comprised of several tabs.
 
@@ -41,36 +44,39 @@ The training module is comprised of several tabs.
 * Whether to use images "as is" (**requires all images to be of the same size and cubic**) or extract patches.
 
 * If you're extracting patches :
-  * The size of patches to be extracted (ideally, please use a value **close to a power of two**, such as 120 or 60.
-  * The number of samples to extract from each of your image to ensure correct size and perform data augmentation. A larger number will likely mean better performances, but longer training and larger memory usage.
+
+  * The size of patches to be extracted (ideally, please use a value **close to a power of two**, such as 120 or 60, to ensure a correct size).
+  * The number of samples to extract from each of your images. A larger number will likely mean better performance, but longer training and larger memory usage.
+
 * Whether to perform data augmentation or not (elastic deforms, intensity shifts. random flipping,etc). A rule of thumb for augmentation is :
+
   * If you're using the patch extraction method, enable it if you are using more than 10 samples per image with at least 5 images
   * If you have a large dataset and are not using patches extraction, enable it.
 
 
 3) The third contains training related parameters :
 
-* The model to use for training (see table above)
-* The loss function used for training (see table below)
-* The batch size (larger means quicker training and possibly better performance but increased memory usage)
-* The number of epochs (a possibility is to start with 60 epochs, and decrease or increase depending on performance.)
-* The epoch interval for validation (for example, if set to two, the module will use the validation dataset to evaluate the model with the dice metric every two epochs.)
+* The **model** to use for training (see table above)
+* The **loss function** used for training (see table below)
+* The **batch size** (larger means quicker training and possibly better performance but increased memory usage)
+* The **number of epochs** (a possibility is to start with 60 epochs, and decrease or increase depending on performance.)
+* The **epoch interval** for validation (for example, if set to two, the module will use the validation dataset to evaluate the model with the dice metric every two epochs.)
 
-If the dice metric is better on that validation interval, the model weights will be saved in the results folder.
+.. note::
+   If the dice metric is better on a given validation interval, the model weights will be saved in the results folder.
 
 The available loss functions are :
 
-======================== ====================================================
+======================== ================================================================================================
 Function                 Reference
-======================== ====================================================
+======================== ================================================================================================
 Dice loss                `Dice Loss from MONAI`_ with ``sigmoid=true``
 Focal loss               `Focal Loss from MONAI`_
 Dice-Focal loss          `Dice-focal Loss from MONAI`_ with ``sigmoid=true`` and ``lambda_dice = 0.5``
 Generalized Dice loss    `Generalized dice Loss from MONAI`_ with ``sigmoid=true``
 Dice-CE loss             `Dice-CE Loss from MONAI`_ with ``sigmoid=true``
 Tversky loss             `Tversky Loss from MONAI`_ with ``sigmoid=true``
-======================== ====================================================
-
+======================== ================================================================================================
 .. _Dice Loss from MONAI: https://docs.monai.io/en/stable/losses.html#diceloss
 .. _Focal Loss from MONAI: https://docs.monai.io/en/stable/losses.html#focalloss
 .. _Dice-focal Loss from MONAI: https://docs.monai.io/en/stable/losses.html#dicefocalloss
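
For reference, the loss functions in the table above map onto MONAI loss classes. A minimal sketch of how they could be instantiated with the parameters listed in the table (illustrative only; this is not the plugin's construction code, and anything not listed is left at MONAI defaults):

    # Sketch: the losses from the table above, instantiated via MONAI.
    # Parameter choices mirror the table; everything else uses MONAI defaults.
    from monai.losses import (
        DiceCELoss,
        DiceFocalLoss,
        DiceLoss,
        FocalLoss,
        GeneralizedDiceLoss,
        TverskyLoss,
    )

    loss_functions = {
        "Dice loss": DiceLoss(sigmoid=True),
        "Focal loss": FocalLoss(),
        "Dice-Focal loss": DiceFocalLoss(sigmoid=True, lambda_dice=0.5),
        "Generalized Dice loss": GeneralizedDiceLoss(sigmoid=True),
        "Dice-CE loss": DiceCELoss(sigmoid=True),
        "Tversky loss": TverskyLoss(sigmoid=True),
    }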

src/napari_cellseg3d/launch_review.py

Lines changed: 11 additions & 11 deletions
@@ -67,19 +67,19 @@ def launch_review(
 
 
     """
-    global slicer  # Todo : is this okay ? ask Max
-    global z_pos
-    global view1
-    global layer
-    global images_original
-    global base_label
+    # global slicer # Todo : is this okay ? ask Max. seems to work without, keep an eye on it
+    # global z_pos
+    # global view1
+    # global layer
+    # global images_original
+    # global base_label
     images_original = original
     base_label = base
-    try:
-        del view1
-        del layer
-    except NameError:
-        pass
+    # try:
+    #     del view1
+    #     del layer
+    # except NameError:
+    #     pass
 
     view1 = viewer
     view1.add_image(
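
The commented-out globals suggest the review state no longer needs to live at module level. One hypothetical way to make that explicit (not part of this commit; all names below are illustrative) is to keep the viewer and its layers in a small session object:

    # Hypothetical sketch only: holding review state in an object instead of
    # module-level globals. ReviewSession and the layer names are illustrative.
    from dataclasses import dataclass
    from typing import Any

    @dataclass
    class ReviewSession:
        viewer: Any       # the napari viewer the review runs in
        image_layer: Any  # layer holding the original images
        label_layer: Any  # layer holding the labels under review

    def launch_review_sketch(viewer, original, base) -> ReviewSession:
        image_layer = viewer.add_image(original, name="original")
        label_layer = viewer.add_labels(base, name="labels")
        return ReviewSession(viewer, image_layer, label_layer)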

src/napari_cellseg3d/model_workers.py

Lines changed: 2 additions & 0 deletions
@@ -757,6 +757,7 @@ def train(self):
                     f"* {step}/{len(train_ds) // train_loader.batch_size}, "
                     f"Train loss: {loss.detach().item():.4f}"
                 )
+                yield {"plot": False, "weights": model.state_dict()}
 
             epoch_loss /= step
             epoch_loss_values.append(epoch_loss)
@@ -804,6 +805,7 @@ def train(self):
                 val_metric_values.append(metric)
 
                 train_report = {
+                    "plot": True,
                     "epoch": epoch,
                     "losses": epoch_loss_values,
                     "val_metrics": val_metric_values,

src/napari_cellseg3d/plugin_model_training.py

Lines changed: 5 additions & 4 deletions
@@ -873,10 +873,11 @@ def on_yield(data, widget):
         # print(
         #     f"\nCatching results : for epoch {data['epoch']}, loss is {data['losses']} and validation is {data['val_metrics']}"
         # )
-        widget.progress.setValue(
-            100 * (data["epoch"] + 1) // widget.max_epochs
-        )
-        widget.update_loss_plot(data["losses"], data["val_metrics"])
+        if data["plot"]:
+            widget.progress.setValue(
+                100 * (data["epoch"] + 1) // widget.max_epochs
+            )
+            widget.update_loss_plot(data["losses"], data["val_metrics"])
 
         if widget.stop_requested:
             torch.save(
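
on_yield now only advances the progress bar and refreshes the plot when the worker asks for it; the plot=False yields just carry weights. The hunk does not show how the worker is connected, but with napari's threading helpers the wiring typically looks roughly like this (a sketch; every name other than thread_worker is illustrative):

    # Sketch of wiring a yielding training worker to a GUI callback with
    # napari's threading helpers. Only thread_worker is a real napari API here.
    from napari.qt.threading import thread_worker

    @thread_worker
    def train_sketch():
        for epoch in range(3):
            yield {"plot": False, "weights": None}  # per-step yield
            yield {                                  # validation-time yield
                "plot": True,
                "epoch": epoch,
                "losses": [0.5, 0.4, 0.3][: epoch + 1],
                "val_metrics": [0.8],
            }

    def on_yield_sketch(data, widget):
        if data["plot"]:
            print(f"epoch {data['epoch']}: update plot with {data['losses']}")

    def start_training_sketch(widget=None):
        worker = train_sketch()  # calling the decorated generator returns a worker
        worker.yielded.connect(lambda data: on_yield_sketch(data, widget))
        worker.start()           # runs train_sketch() in a background thread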

src/napari_cellseg3d/plugin_review.py

Lines changed: 1 addition & 1 deletion
@@ -231,7 +231,7 @@ def run_review(self):
             warnings.warn(
                 "Opening several loader sessions in one window is not supported; opening in new window"
             )
-            self._viewer.remove_from_viewer()
+            self._viewer.close()
         else:
             viewer = self._viewer
             print("new sess")

src/napari_cellseg3d/utils.py

Lines changed: 6 additions & 1 deletion
@@ -149,9 +149,14 @@ def get_padding_dim(image_shape, anisotropy_factor=None):
             # problems with zero divs avoided via params for spinboxes
             size = int(size / anisotropy_factor[i])
         while pad < size:
+
+            if size - pad < 30:
+                warnings.warn(f"Your value is close to a lower power of two; you might want to choose slightly smaller"
+                              f" sizes and/or crop your images down to {pad}")
+
             pad = 2**n
             n += 1
-        if pad >= 1024:
+        if pad >= 256:
             warnings.warn(
                 "Warning : a very large dimension for automatic padding has been computed.\n"
                 "Ensure your images are of an appropriate size and/or that you have enough memory."
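
For context, the padding helper rounds each dimension up to the next power of two, and this commit adds a warning when a size sits just above a power of two, while lowering the "very large padding" threshold from 1024 to 256. A self-contained sketch of that logic, using the same thresholds as the hunk above (the real get_padding_dim also handles the anisotropy_factor argument and other details):

    # Self-contained sketch of the next-power-of-two padding logic shown above.
    # Thresholds (30 px proximity, 256 px "large" padding) mirror this commit.
    import warnings

    def get_padding_dim_sketch(image_shape):
        padding = []
        for size in image_shape:
            pad, n = 1, 1
            while pad < size:
                if size - pad < 30:
                    warnings.warn(
                        f"Your value is close to a lower power of two; you might want "
                        f"to choose slightly smaller sizes and/or crop your images down to {pad}"
                    )
                pad = 2**n
                n += 1
            if pad >= 256:
                warnings.warn(
                    "A very large dimension for automatic padding has been computed.\n"
                    "Ensure your images are of an appropriate size and/or that you have enough memory."
                )
            padding.append(pad)
        return padding

    print(get_padding_dim_sketch((64, 65, 200)))  # -> [64, 128, 256], with warnings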