
Commit 0cec782

feat: updated and improved augmentation part, added youtube video in conclusions
1 parent 7096620 commit 0cec782

5 files changed: +29 additions, -6 deletions

mask_detection_training/augment_train/augment_train.md

Lines changed: 21 additions & 3 deletions
@@ -61,7 +61,19 @@ So, for this model, since I will use 640 pixels, we will just create a first ver
## Task 2: Augment Dataset

In this part, we're going to augment our dataset.

Image augmentation is a process through which you create new images based on existing images in your project training set. It's an effective way to boost model performance. By creating augmented images and adding them to your dataset, you can help your model learn to better identify classes, particularly in conditions that may not be well represented in your dataset.

To decide which augmentations to apply and how to configure them, we should ask ourselves the following:

_What types of augmentations will generate data that is beneficial for our use case?_

For example, aerial images might be taken in the early morning when the sun is rising, during the day when the sky is clear, on a cloudy day, or in the early evening. At these times there will be different levels of brightness in the sky, and thus in the images, so modifying the brightness of images can be a **great** augmentation for this use case.

If we see a decrease in our model's performance with this augmentation, we can always roll it back by reverting to an earlier version of our dataset.
Now that we have some knowledge of the set of checkpoints and training parameters we can specify, I'm going to focus on a parameter that is **specifically created** for data augmentation: _`--hyp`_.

This option allows us to specify a custom YAML file that will hold the values for all hyperparameters of our Computer Vision model.

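For reference, here's an abridged sketch of what such a file can look like. The keys and default values below come from YOLOv5's stock hyperparameter files; in practice, you would copy one of the full hyperparameter YAML files that ship with the repository (YOLOv5 expects the complete set of keys) and edit only the values you care about:

```
<copy>
# custom_hyp.yaml -- abridged; values are YOLOv5's defaults, shown as starting points
lr0: 0.01        # initial learning rate (0.01 for SGD, 0.001 for Adam)
hsv_h: 0.015     # HSV hue augmentation (fraction)
hsv_s: 0.7       # HSV saturation augmentation (fraction)
hsv_v: 0.4       # HSV value (brightness) augmentation (fraction)
degrees: 0.0     # random rotation (+/- degrees)
translate: 0.1   # random translation (+/- fraction)
scale: 0.5       # random scaling (+/- gain)
shear: 0.0       # random shear (+/- degrees)
flipud: 0.0      # probability of flipping upside down
fliplr: 0.5      # probability of flipping left-right
mosaic: 1.0      # probability of mosaic augmentation
mixup: 0.0       # probability of mixup augmentation
</copy>
```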
@@ -81,17 +93,23 @@ Here are all available augmentations:

The most notable ones are:

- _`lr0`_: initial learning rate. If you want to use the SGD optimizer, set this option to `0.01`. If you want to use Adam, set it to `0.001`.
- _`hsv_h`_, _`hsv_s`_, _`hsv_v`_: these allow us to control HSV modifications to the image. We can change either the **H**ue, **S**aturation, or **V**alue of the image. You can effectively change the brightness of a picture by modifying the _`hsv_v`_ parameter, since the value channel carries the image's intensity information.
- _`degrees`_: it will rotate the image and let the model learn how to detect objects at different camera orientations.
- _`translate`_: translating the image will displace it to the right or to the left.
- _`scale`_: it will resize selected images (a percentage gain or loss in size).
- _`shear`_: it will create new images from a new viewing perspective by randomly distorting the image across its horizontal axis, a bit like opening a door in real life. RoboFlow also supports vertical shear.
- _`flipud`_, _`fliplr`_: they will simply take an image and flip it either "upside down" or "left to right", generating mirrored copies of the image. This will teach the model how to detect objects from different camera angles. Note that _`flipud`_ works in very limited scenarios (mostly satellite imagery), while _`fliplr`_ is better suited for ground-level pictures of any sort (which covers 99% of Computer Vision models nowadays).
- _`mosaic`_: this will take four images from the dataset and create a mosaic. This is particularly useful when we want to teach the model to detect smaller-than-usual objects, as each detection from the mosaic will be "harder" for the model: each object we want to predict will be represented by fewer pixels.
- _`mixup`_: I have found this augmentation method particularly useful when training **classification** models. It will mix two images, one with more transparency and one with less, and let the model learn the differences between two _problematic_ classes.

Once we create a separate YAML file for our custom augmentation, we can use it in training as a parameter by setting the _`--hyp`_ option. We'll see how to do that right below.

RoboFlow also supports additional augmentations. Here's a figure showing what's available:

![augmentations offered by RoboFlow](./images/roboflow_augmentations.png)

If you're particularly interested in performing additional advanced types of augmentations, check out [this video from Jacob Solawetz](https://www.youtube.com/watch?v=r-QBawf9Eoc), which illustrates even more ways you can use augmentation, like object occlusion, to improve your dataset.
## Task 3: Train Model

Now that we have our hyperparameters and checkpoint chosen, we just need to run the following commands. To execute training, we first navigate to YOLOv5's cloned repository path:
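
A sketch of what this can look like, assuming the repository was cloned into the home directory, a dataset file named `mask_dataset.yaml`, the `yolov5s` checkpoint, and the `custom_hyp.yaml` hyperparameter file sketched earlier (all of these names are illustrative):

```
<copy>
cd ~/yolov5
~/anaconda3/bin/python train.py --img 640 --batch 16 --epochs 100 --data ./data/mask_dataset.yaml --weights yolov5s.pt --hyp ./custom_hyp.yaml
</copy>
```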

mask_detection_training/inference/infer.md renamed to mask_detection_training/infer/infer.md

Lines changed: 8 additions & 3 deletions
@@ -45,15 +45,14 @@ Each parameter represents the following:
* Individual Video, in which case it will perform inference frame-by-frame and merge the result into a final video file.
* Individual Image

For example, let us execute:
```
<copy>
~/anaconda3/bin/python detect.py --weights="./models/mask_model/weights/best.pt" --img 640 --conf 0.4 --source="./videos/my_video.mp4"
</copy>
```

## Task 2: Custom Inference with Python (Advanced)
For this method, we're going to use **PyTorch** as the supporting framework. We need PyTorch to load the model, obtain results that make sense, and return these results.
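
The workshop's full example goes deeper, but a minimal sketch of the idea looks like this (the weights path reuses the earlier `detect.py` example; the test image path is a placeholder):

```
<copy>
import torch

# Load our custom-trained weights through PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='./models/mask_model/weights/best.pt')

# Run inference on an image (a file path, URL, or numpy array all work)
results = model('./images/test_image.jpg')

# One row per detection: bounding box coordinates, confidence, and class
print(results.pandas().xyxy[0])
</copy>
```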
@@ -74,7 +73,13 @@ Or even expand the functionality, with things like counting objects, combining s
## Task 3: Conclusions

We have arrived at the end of this workshop.

In my case, I processed this example video with our newly trained model, and it produced the following results:

[Watch the video](youtube:LPRrbPiZ2X8)

By this point, you should already be able to:

&check; Use OCI to help you train your own Computer Vision models.
