## Task 2: Augment Dataset

In this part, we're going to augment our dataset.

Image augmentation is a process through which you create new images based on existing images in your project training set. It's an effective way to boost model performance. By creating augmented images and adding them to your dataset, you can help your model learn to better identify classes, particularly in conditions that may not be well represented in your dataset.

To decide which augmentations to apply and how they should be configured, we should ask ourselves the following:

_What types of augmentations will generate data that is beneficial for our use case?_

For example, in the case of aerial images, they might be taken in the early morning when the sun is rising, during the day when the sky is clear, during a cloudy day, and in the early evening. During these times, there will be different levels of brightness in the sky, and thus in the images. Modifying the brightness of images can therefore be considered a **great** augmentation for this example.

If we see a decrease in performance from our model with this augmentation, we can always roll the augmentation back by reverting to an earlier version of our dataset.

Now that we have some knowledge of the set of checkpoints and training parameters we can specify, I'm going to focus on a parameter that is **specifically created** for data augmentation: _`--hyp`_.

This option allows us to specify a custom YAML file that will hold the values for all hyperparameters of our Computer Vision model.

Here are all available augmentations:
The most notable ones are:

- _`lr0`_: the initial learning rate. If you want to use the SGD optimizer, set this option to `0.01`. If you want to use ADAM, set it to `0.001`.
- _`hsv_h`_, _`hsv_s`_, _`hsv_v`_: allow us to control HSV modifications to the image, changing its **H**ue, **S**aturation, or **V**alue. You can effectively change the brightness of a picture by modifying the _`hsv_v`_ parameter, which carries image information about intensity.
- _`degrees`_: rotates the image and lets the model learn how to detect objects at different camera orientations.
- _`translate`_: translates the image, displacing it to the right or to the left.
- _`scale`_: resizes selected images (by a percentage gain or loss).
- _`shear`_: creates new images from a new viewing perspective by randomly distorting an image across its horizontal or vertical axis. The changing axis here is horizontal and works like opening a door in real life; RoboFlow also supports vertical shear.
- _`flipud`_, _`fliplr`_: simply take an image and flip it either "upside down" or "left to right", generating exact copies of the image in reverse. This teaches the model how to detect objects from different camera angles. Note that _`flipud`_ works in very limited scenarios, mostly with satellite imagery, while _`fliplr`_ is better suited for ground-level pictures of any sort (which covers 99% of Computer Vision models nowadays).
- _`mosaic`_: takes four images from the dataset and combines them into a mosaic. This is particularly useful when we want to teach the model to detect smaller-than-usual objects, as each detection from the mosaic is "harder" for the model: each object we want to predict is represented by fewer pixels.
- _`mixup`_: I have found this augmentation method particularly useful when training **classification** models. It mixes two images, one with more transparency and one with less, and lets the model learn the differences between two _problematic_ classes.

Once we create a separate YAML file for our custom augmentation, we can use it in training as a parameter by setting the _`--hyp`_ option. We'll see how to do that right below.
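To make this concrete, here is a minimal sketch of what such a custom hyperparameter file could look like. The keys are the YOLOv5 hyperparameter names discussed above, but the file name (`hyp.custom.yaml`) and the values are illustrative assumptions, not recommendations. Also note that YOLOv5's `train.py` expects the full set of hyperparameter keys to be present, so in practice you would copy one of the default hyperparameter files shipped with the repository and only edit the values you care about:

```yaml
# hyp.custom.yaml -- illustrative sketch only: start from a default
# hyperparameter file in the yolov5 repository and edit it, since train.py
# expects every hyperparameter key (not just the augmentation ones).
lr0: 0.01        # initial learning rate (0.01 for SGD, 0.001 for ADAM)
hsv_h: 0.015     # HSV hue augmentation (fraction)
hsv_s: 0.7       # HSV saturation augmentation (fraction)
hsv_v: 0.4       # HSV value augmentation -- this is the brightness knob
degrees: 10.0    # rotate images by up to +/- 10 degrees
translate: 0.1   # translate images by up to +/- 10% of their size
scale: 0.5       # scale images by up to +/- 50% gain
shear: 2.0       # shear images by up to +/- 2 degrees
flipud: 0.0      # probability of an upside-down flip (rarely useful on the ground)
fliplr: 0.5      # probability of a left-to-right flip
mosaic: 1.0      # probability of combining four images into a mosaic
mixup: 0.1       # probability of mixing two images together
```

Setting an augmentation's value to `0.0` disables it, which makes it easy to test one augmentation at a time and roll it back if performance drops.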
RoboFlow also supports more augmentations. Here's a figure with their available augmentations:



If you're particularly interested in performing additional advanced types of augmentations, check out [this video from Jacob Solawetz](https://www.youtube.com/watch?v=r-QBawf9Eoc) illustrating even more ways you can use augmentation, like object occlusion, to improve your dataset.
## Task 3: Train Model

Now that we have our hyperparameters and checkpoint chosen, we just need to run the following commands. To execute training, we first navigate to YOLOv5's cloned repository path:
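As a sketch of what that looks like end to end (the clone location, dataset YAML path, and file names below are assumptions from my setup; adjust them to yours):

```bash
# Assumed locations: repository cloned to ~/yolov5, a dataset YAML exported
# from RoboFlow, and the custom hyperparameter file from the previous task.
cd ~/yolov5

# --img matches the 640-pixel size chosen earlier, --weights picks the
# pretrained checkpoint to fine-tune, and --hyp points at our custom YAML.
python train.py --img 640 --batch 16 --epochs 100 \
    --data path/to/dataset.yaml --weights yolov5s.pt --hyp hyp.custom.yaml
```

By default, YOLOv5 saves the run's weights, plots, and logs under `runs/train/`, with the best checkpoint stored as `best.pt`.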
## Task 2: Custom Inference with Python (Advanced)

For this method, we're going to use **PyTorch** as the supporting framework. We need PyTorch to load the model, obtain results that make sense, and return these results.
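As a minimal sketch of this approach (the checkpoint path and the test image name are placeholder assumptions): YOLOv5 models can be loaded through PyTorch Hub and called directly on an image.

```python
import torch

# Load our custom-trained checkpoint through PyTorch Hub; this pulls the
# ultralytics/yolov5 code on first use. 'runs/train/exp/weights/best.pt' is
# a placeholder for wherever your training run saved its best weights.
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='runs/train/exp/weights/best.pt')

# Run inference on one image; the input can be a file path, a URL, a PIL
# image, or a numpy array.
results = model('test_image.jpg')

# One row per detection: bounding-box corners, confidence, and class.
print(results.pandas().xyxy[0])

# Render the image with the predicted boxes drawn on top.
results.show()
```

From here you can post-process the detections however you like, for example counting objects per class by grouping the resulting dataframe.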
## Task 3: Conclusions

We have arrived at the end of this workshop.

In my case, I processed this example video against our newly-trained model, and it produced the following results:

[Watch the video](youtube:LPRrbPiZ2X8)

By this point, you should already be able to:

✓ Use OCI to help you train your own Computer Vision models.