articles/machine-learning/how-to-create-labeling-projects.md
Lines changed: 4 additions & 4 deletions
@@ -135,8 +135,6 @@ For bounding boxes, important questions include:
The **ML assisted labeling** page lets you trigger automatic machine learning models to accelerate the labeling task. At the beginning of your labeling project, the images are shuffled into a random order to reduce potential bias. However, any biases that are present in the dataset will be reflected in the trained model. For example, if 80% of your images are of a single class, then approximately 80% of the data used to train the model will be of that class. This training does not include active learning.

-This feature is available for image classification (multi-class or multi-label) tasks.
-
Select *Enable ML assisted labeling* and specify a GPU to enable assisted labeling, which consists of two phases:
* Clustering
* Prelabeling
@@ -147,13 +145,15 @@ Since the final labels still rely on input from the labeler, this technology is
### Clustering

-After a certain number of labels are submitted, the machine learning model starts to group together similar images. These similar images are presented to the labelers on the same screen to speed up manual tagging. Clustering is especially useful when the labeler is viewing a grid of 4, 6, or 9 images.
+After a certain number of labels are submitted, the machine learning model for image classification starts to group together similar images. These similar images are presented to the labelers on the same screen to speed up manual tagging. Clustering is especially useful when the labeler is viewing a grid of 4, 6, or 9 images.

Once a machine learning model has been trained on your manually labeled data, the model is truncated to its last fully connected layer. Unlabeled images are then passed through the truncated model in a process commonly known as "embedding" or "featurization." This embeds each image in a high-dimensional space defined by that layer. Images that are nearest neighbors in this space are grouped together for clustering.
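
The nearest-neighbor grouping described above can be illustrated with a minimal sketch. It assumes the embeddings have already been computed, and it uses a simple greedy strategy with Euclidean distance. The helper `group_into_pages`, the toy 2-D vectors, and the greedy strategy itself are all hypothetical; the service's actual grouping algorithm is not described in this article.

```python
import math


def group_into_pages(embeddings, page_size):
    """Greedily group image indices so each page holds similar images.

    Hypothetical helper for illustration only; not the labeling
    service's actual algorithm.
    """
    remaining = list(range(len(embeddings)))
    pages = []
    while remaining:
        seed = remaining[0]
        # Order the unassigned images by Euclidean distance to the seed.
        # sorted() is stable, so the seed (distance 0, first in the list)
        # always leads its own page.
        by_distance = sorted(
            remaining,
            key=lambda i: math.dist(embeddings[i], embeddings[seed]),
        )
        page = by_distance[:page_size]
        pages.append(page)
        chosen = set(page)
        remaining = [i for i in remaining if i not in chosen]
    return pages


# Toy 2-D "embeddings": two well-separated groups of three images each.
embeddings = [
    (0.0, 0.0), (10.0, 10.0), (0.1, 0.0),
    (10.0, 10.1), (0.0, 0.2), (10.2, 10.0),
]
pages = group_into_pages(embeddings, page_size=3)
print(pages)
```

With a page size of 3, each page in this toy example contains the images from one of the two groups, which mirrors how similar images would share a labeling screen.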

+The clustering phase does not appear for object detection models.
+
### Prelabeling

-After more image labels are submitted, a classification model is used to predict image tags. The labeler now sees pages that contain predicted labels already present on each image. The task is then to review these labels and correct any mis-labeled images before submitting the page.
+After enough image labels are submitted, a classification model is used to predict image tags, or an object detection model is used to predict bounding boxes. The labeler now sees pages that contain predicted labels already present on each image. For object detection, predicted boxes are also shown. The task is then to review these predictions and correct any mislabeled images before submitting the page.
Once a machine learning model has been trained on your manually labeled data, the model is evaluated on a test set of manually labeled images to determine its accuracy at a variety of confidence thresholds. This evaluation is used to determine a confidence threshold above which the model is accurate enough to show prelabels. The model is then run on unlabeled data, and images with predictions more confident than this threshold are prelabeled.
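
The threshold selection described above can be sketched as follows. The sketch assumes that "accurate enough" means reaching a target accuracy on the test set, and it picks the lowest candidate threshold that does. The function `pick_threshold`, the candidate thresholds, the 0.9 target, and the sample data are all hypothetical; the article does not state the values or the selection rule the service uses.

```python
def pick_threshold(test_results, candidates, target_accuracy):
    """Return the lowest candidate threshold whose retained predictions
    reach the target accuracy, or None if no candidate qualifies.

    test_results: list of (confidence, is_correct) pairs from the
    manually labeled test set. Hypothetical helper for illustration.
    """
    for threshold in sorted(candidates):
        # Keep only predictions more confident than the threshold.
        kept = [correct for conf, correct in test_results if conf > threshold]
        if kept and sum(kept) / len(kept) >= target_accuracy:
            return threshold
    return None


# Made-up test-set results: (model confidence, prediction was correct).
test_results = [
    (0.95, True), (0.90, True), (0.85, True), (0.70, False),
    (0.65, True), (0.55, False), (0.40, False), (0.30, True),
]
threshold = pick_threshold(test_results, candidates=[0.5, 0.6, 0.8],
                           target_accuracy=0.9)

# Made-up predictions on unlabeled images: (file name, label, confidence).
predictions = [
    ("img_a.jpg", "cat", 0.92),
    ("img_b.jpg", "dog", 0.61),
    ("img_c.jpg", "cat", 0.83),
]
# Only sufficiently confident predictions are shown as prelabels.
prelabels = {
    name: label
    for name, label, conf in predictions
    if threshold is not None and conf > threshold
}
print(threshold, prelabels)
```

Images that fall below the threshold, such as `img_b.jpg` here, would simply be left for manual labeling.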