We're going to use a **platform image** from Oracle called **OCI DSVM**. This image contains several tools for data exploration, analysis, modeling, and development. It also includes a Jupyter Notebook, a conda environment ready to use, and several more things (like Christmas for a Data practitioner).
We can find the platform image by selecting the *Marketplace* button:
Network settings for the Virtual Machine are standard. Just make sure to create a new VCN and a new subnet, so that we avoid any networking conflicts with other OCI projects you may have.
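If you prefer the command line over the console, the OCI CLI can create both resources. This is only a sketch under assumptions: the compartment OCID, display names, and CIDR ranges below are placeholders, not values from this guide.

```bash
# Create a dedicated VCN for this workshop (placeholder OCID and CIDR)
oci network vcn create \
    --compartment-id <compartment-ocid> \
    --display-name mask-detection-vcn \
    --cidr-block 10.0.0.0/16

# Create a subnet inside it (use the VCN OCID returned by the command above)
oci network subnet create \
    --compartment-id <compartment-ocid> \
    --vcn-id <vcn-ocid> \
    --display-name mask-detection-subnet \
    --cidr-block 10.0.0.0/24
```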
Once we have the IP address, and with our public-private key pair saved (which is what we will use to authenticate to the machine), let's connect through SSH.
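From a Linux or macOS terminal, the connection looks something like this; a minimal sketch, where the key path and IP are placeholders and `opc` is the default user on Oracle Linux images:

```bash
# Connect using the private key saved when creating the instance
ssh -i ~/.ssh/my_oci_key opc@<instance-public-ip>
```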
Now, just click on "Quick Connect" and connect:
> **Note**: we will connect to our VM and start training / augmenting our data with open-source repositories.
## Task 3: Clone Open-Source Repositories
Once we have connected to our instance, let's download two repositories: YOLOv5 and YOLOv8. You're free to choose either one of them to train and augment our computer vision models, but this guide will show you how to proceed with YOLOv5.
> **Note**: `git` is another tool that's already installed in the custom image we used to spin up our instance. YOLOv8 can also be installed directly from pip. More information [at this link](https://github.com/ultralytics/ultralytics#documentation).
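For reference, cloning both repositories is standard `git` usage (the ultralytics repository is where YOLOv8 lives):

```bash
# YOLOv5 -- the repository this guide follows
git clone https://github.com/ultralytics/yolov5.git

# YOLOv8 -- can be cloned, or installed directly from pip
git clone https://github.com/ultralytics/ultralytics.git
# alternatively: pip install ultralytics
```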
## Task 4: Transfer Dataset
Now that we're connected to the machine, let's move the files from our computer to our OCI Compute Instance.
### For Linux & macOS Users
We can use the _`scp`_ tool to help us transfer files through SSH:
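A minimal sketch, using the example values from the note below; your key file, dataset path, and IP will differ:

```bash
# -r copies the dataset directory recursively into the instance's home directory
scp -i <private-key-file> -r ./dataset opc@192.168.0.1:/home/opc/
```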
> **Note**: in this case, my OCI Compute Instance IP is 192.168.0.1. `opc` is the default username for Oracle Linux distributions, like the one we are using here. The private key is the same one we used to connect through SSH in the previous task.
### For Windows Users
Use the integrated MobaXterm FTP explorer to transfer files, dragging them from our computer and dropping them into MobaXterm's explorer panel.
## Task 5: Install Python Dependencies
Once we have the repositories ready, we need to install dependencies that will allow us to run YOLO code:
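For YOLOv5, a minimal sketch, assuming the clone from Task 3 (the repository ships its own `requirements.txt`):

```bash
cd yolov5
# Installs PyTorch, OpenCV, and the other packages YOLOv5 needs
pip install -r requirements.txt
```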
Now that we have cloned our repositories, uploaded our dataset, and have our machine and conda environment ready, we're virtually ready to start training. You may now [proceed to the next lab](#next).
## Acknowledgements
* **Author** - Nacho Martinez, Data Science Advocate @ Oracle DevRel
> **Note**: as you can see, the little girl in the second row and third column is wearing the mask with her nose showing, which is *incorrect*. We want our custom model to detect cases like these, which are also the hardest to represent: there are a lot of pictures of people with and without masks on the Internet, but not as many of people wearing masks incorrectly, which causes our dataset to be imbalanced.
## Task 1: Final Result
You may now [proceed to the next lab](#next).
## Acknowledgements
* **Author** - Nacho Martinez, Data Science Advocate @ Oracle DevRel
Estimated Time: 40 minutes
## Overview
We are going to use **RoboFlow** as our data platform for this workshop. The good thing about RoboFlow is that it eases the process of labeling and extracting data - in some ways, it works as a data store of hundreds, even thousands of projects like mine, with thousands of images each; and all of these projects are **public** (meaning, we can use someone else's data to help us solve the problem).
Oh, and RoboFlow is free to use for public projects - I've never run into storage-space limits for a project. Shoutout to the RoboFlow team for their continued support.
If we've done everything correctly, we should have downloaded all images from the dataset.
## Task 3: Manipulating Datasets
First, a quick intro on why we have all images from each dataset in one of these three aforementioned directories, and what each of these directories represents:
- **Train**: The first type of dataset in machine learning is called the *training* dataset. This dataset is extremely important because it is used to train the model by adjusting the weights and biases of neural networks to produce accurate answers based on the inputs provided. If the training dataset is flawed or incomplete, it is very difficult to develop a good working model.
Let's open one of the datasets.
The file that holds all links and values is called `data.yaml`, with a structure like this:
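The original screenshot isn't reproduced here; as a rough sketch, a YOLOv5-style `data.yaml` typically looks like the following, where the paths and class names are illustrative (borrowing the `PasMasque` label from the note below):

```yaml
# Illustrative sketch -- your paths, class count, and names will differ
train: /home/opc/dataset/train/images
val: /home/opc/dataset/valid/images
test: /home/opc/dataset/test/images

nc: 3                                     # number of classes
names: ['Masque', 'Incorrect', 'PasMasque']  # one entry per class; order matters
```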
> **Note**: _`nc`_ represents the number of classes, with each class name in the _`names`_ list. It also contains a path to each dataset directory (it's recommended to modify these to be absolute paths, not relative ones). Also, if label names are weird or hard to understand (like numbers), you can check what each label means by visually inspecting the dataset. For example, I looked at some pictures and made sure that the _`PasMasque`_ class actually represented a *lack* of a mask, and that the other classes were also represented by correct, meaningful labels.
We need to modify this YAML file to include the names of the classes that we want, making sure that the order of the labels is also preserved.
Then, I click on my Mask Detection Placement model.
Finally, let's upload some new images to include in the model. I will upload my images by importing a YouTube video of myself, but you can use any pictures you have on your phone or computer; just make sure that you get a healthy ratio of images with different mask-wearing states (correct, incorrect, no mask at all).
We need to be mindful of which **sampling rate** to choose: if we select a sampling rate that's too high, it will cause the dataset to have very similar images, as they will be taken almost one after the other. If the sampling rate is too low, we won't get enough images from the video.
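RoboFlow's importer handles this sampling for us, but if you'd rather extract frames locally before uploading, a tool like `ffmpeg` can sample at a chosen rate. A sketch, where the video filename is a placeholder:

```bash
mkdir -p frames
# fps=1 keeps one frame per second; raising it yields more, but more similar, images
ffmpeg -i mask_video.mp4 -vf fps=1 frames/frame_%04d.jpg
```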
Then, we get the selected frames from the video.
The last thing to do now is to **annotate** these images. We will annotate using bounding boxes (see an explanation below of why this is a good annotation method for our problem).
We go to the Annotate section in the toolbar:
We repeat this process for every image. Then, we'll choose into which dataset these images will go:
> **Note**: I recommend 80%-10%-10% for training-validation-testing for most cases.
We can now proceed to augment our dataset and generate a new version.
### Different Annotation Strategies
Depending on the type of problem, you will need to have a different annotation technique. The three most common ones are:
- Bounding Boxes: they are rectangles that surround an object and specify its position. This method is perfect for our mask placement model.
- Polygons: this method takes more time than bounding boxes, but increases performance (accuracy), as the model will be trained on more tightly constrained data. You can still annotate an image with a traditional bounding box instead of a polygon, which takes less time for annotators but gives up some of that added performance. Thus, if you have the resources and you have decided polygon annotation is helpful, it is worth going the extra mile.
- [Smart Polygons](https://blog.roboflow.com/automated-polygon-labeling-computer-vision/): RoboFlow simplifies this process with its own Smart Annotation, which will detect an object and try to draw its edges interactively.
Here are three examples, one for each type of annotation:
Since I had some free training credits, I decided to spend one of them to see how the model would perform.
I recommend starting training from a public checkpoint, like one trained on the COCO dataset with a 46.7% mAP. That model has previously been trained to detect objects from the real world, and even though it doesn't recognize mask placement, it will serve as a starting point for the neural network: despite not knowing what a 'COVID-19 mask' is, it has learned to detect other things, like **edges** of objects, shapes, and forms. This means that the model _knows_ about the real world, even if its knowledge is limited. So, let us try this model first.
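RoboFlow applies this checkpoint choice in its UI; if you later train YOLOv5 yourself on the VM, the same transfer-learning idea is expressed by passing COCO-pretrained weights. A minimal sketch, where the flag values are common defaults rather than settings prescribed by this guide:

```bash
# From inside the yolov5 repository; yolov5s.pt is a COCO-pretrained checkpoint
python train.py --img 640 --batch 16 --epochs 100 \
    --data /home/opc/dataset/data.yaml --weights yolov5s.pt
```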
## Task 7: Conclusions
After the training process is complete (which can take up to 24 hours), we can see the following:
In more detail, we get average precision broken down by validation and testing sets.
> **Note**: since the validation set had fewer pictures than the test set and also shows lower precision, this leads me to believe that the lower precision is caused by having too few pictures, not by the model being inaccurate on detections. We will fix this in the next article, where we will make a more balanced split for our dataset.
Also, note that -- across the validation and test sets -- the "incorrect" label has a constant precision of **49%**. This makes sense, as it's the hardest class of the three to predict: it's very easy to see the difference between someone with or without a mask, but incorrectly-placed masks are harder to detect in some pictures, even for humans. As great, new professionals in Computer Vision, we will take note of this and find a way to improve that precision in the future, taking special care with our underperforming class.
## Acknowledgements
* **Author** - Nacho Martinez, Data Science Advocate @ Oracle DevRel