articles/machine-learning/how-to-auto-train-image-models.md
+2 -33 (2 additions & 33 deletions)
@@ -8,7 +8,7 @@ ms.author: swatig
ms.service: machine-learning
ms.subservice: automl
ms.topic: how-to
-ms.date: 10/06/2021
+ms.date: 01/18/2022

# Customer intent: I'm a data scientist with ML knowledge in the computer vision space, looking to build ML models using image data in Azure Machine Learning with full control of the model algorithm, hyperparameters, and training and deployment environments.
Automated ML does not impose any constraints on training or validation data size for computer vision tasks. Maximum dataset size is limited only by the storage layer behind the dataset (i.e., blob store). There is no minimum number of images or labels. However, we recommend starting with a minimum of 10-15 samples per label to ensure the output model is sufficiently trained. The higher the total number of labels/classes, the more samples you need per label.
-
-
Training data is required and is passed in using the `training_data` parameter. You can optionally specify another TabularDataset as a validation dataset for your model with the `validation_data` parameter of the AutoMLImageConfig. If no validation dataset is specified, 20% of your training data is used for validation by default, unless you pass a `split_ratio` argument with a different value.
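As a rough illustration of the default behavior described above, the following sketch holds out a fraction of the training records for validation. It is not the AutoML implementation: the helper name `split_train_validation`, the shuffling strategy, and the `seed` parameter are assumptions made for this example. Only the 20% default and the `split_ratio` name come from the text.

```python
import random


def split_train_validation(records, split_ratio=0.2, seed=0):
    """Hold out `split_ratio` of `records` for validation (hypothetical helper)."""
    if not 0.0 < split_ratio < 1.0:
        raise ValueError("split_ratio must be strictly between 0 and 1")
    shuffled = list(records)
    # Assumption: a seeded shuffle; the service may select validation rows differently.
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * split_ratio))
    # The first n_validation shuffled records become validation data, the rest train.
    return shuffled[n_validation:], shuffled[:n_validation]


train, validation = split_train_validation(range(100))
print(len(train), len(validation))
```

Passing a different `split_ratio` (for example `0.1`) changes only the size of the held-out portion.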
With support for computer vision tasks, you can control the model algorithm and sweep hyperparameters. These model algorithms and hyperparameters are passed in as the parameter space for the sweep.
-The model algorithm is required and is passed in via `model_name` parameter. You can either specify a single `model_name` or choose between multiple. In addition to controlling the model algorithm, you can also tune hyperparameters used for model training. While many of the hyperparameters exposed are model-agnostic, there are instances where hyperparameters are task-specific or model-specific.
+The model algorithm is required and is passed in via `model_name` parameter. You can either specify a single `model_name` or choose between multiple. In addition to controlling the model algorithm, you can also tune hyperparameters used for model training. While many of the hyperparameters exposed are model-agnostic, there are instances where hyperparameters are task-specific or model-specific. [Learn more about the available hyperparameters for these instances]().
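To make the idea of a sweep parameter space concrete, here is a minimal plain-Python sketch that enumerates every combination of candidate values. It does not use the Azure Machine Learning SDK, which has its own way of expressing a parameter space. The helper `expand_grid`, the `learning_rate` key, and the specific candidate values are assumptions chosen for illustration. Only `model_name` is taken from the text.

```python
from itertools import product

# Hypothetical parameter space: each key maps to the candidate values to sweep.
# `model_name` is the parameter named in the text; the model names and learning
# rates listed here are illustrative assumptions, not recommendations.
parameter_space = {
    "model_name": ["seresnext", "vitb16r224"],
    "learning_rate": [0.001, 0.01],
}


def expand_grid(space):
    """Return every combination of the candidate values as a list of dicts."""
    names = list(space)
    combinations = product(*(space[name] for name in names))
    return [dict(zip(names, values)) for values in combinations]


configurations = expand_grid(parameter_space)
for config in configurations:
    print(config)
```

Specifying a single algorithm corresponds to a `model_name` list with one entry, which reduces the grid to the remaining hyperparameters.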
The following table summarizes hyperparameters for image classification (multi-class and multi-label) tasks.
-
-| Parameter name | Description | Default |
-| ------------- |-------------|-----|
-|`weighted_loss`| 0 for no weighted loss.<br>1 for weighted loss with sqrt(class_weights). <br> 2 for weighted loss with class_weights. <br> Must be 0, 1, or 2. | 0 |
-|`valid_resize_size`| Image size to which to resize before cropping for validation dataset. <br> Must be a positive integer. <br> <br> *Notes: <li> `seresnext` doesn't take an arbitrary size. <li> Training run may get into CUDA OOM if the size is too big*. | 256 |
-|`valid_crop_size`| Image crop size that's input to your neural network for validation dataset. <br> Must be a positive integer. <br> <br> *Notes: <li> `seresnext` doesn't take an arbitrary size. <li> *ViT-variants* should have the same `valid_crop_size` and `train_crop_size`. <li> Training run may get into CUDA OOM if the size is too big*. | 224 |
-|`train_crop_size`| Image crop size that's input to your neural network for train dataset. <br> Must be a positive integer. <br> <br> *Notes: <li> `seresnext` doesn't take an arbitrary size. <li> *ViT-variants* should have the same `valid_crop_size` and `train_crop_size`. <li> Training run may get into CUDA OOM if the size is too big*. | 224 |
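The constraints in the classification table above can be checked before a run is submitted. The following is a sketch under stated assumptions, not part of the SDK: the helper `validate_classification_params` is hypothetical, and it assumes that ViT-variant model names start with `vit`. The allowed values and the 224 defaults are taken from the table.

```python
VIT_PREFIX = "vit"  # assumption: ViT-variant model names start with "vit"


def _is_positive_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def validate_classification_params(params, model_name=""):
    """Return a list of problems found in `params` (empty if none)."""
    problems = []
    if params.get("weighted_loss", 0) not in (0, 1, 2):
        problems.append("weighted_loss must be 0, 1, or 2")
    for name in ("valid_resize_size", "valid_crop_size", "train_crop_size"):
        value = params.get(name)
        if value is not None and not _is_positive_int(value):
            problems.append(f"{name} must be a positive integer")
    if model_name.startswith(VIT_PREFIX):
        # Defaults of 224 come from the table; unset values fall back to them.
        valid_crop = params.get("valid_crop_size", 224)
        train_crop = params.get("train_crop_size", 224)
        if valid_crop != train_crop:
            problems.append("ViT variants need valid_crop_size == train_crop_size")
    return problems


print(validate_classification_params({"weighted_loss": 3}))
```

The sketch does not check the `seresnext` size restriction or GPU memory limits, because the table does not say which sizes are acceptable.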
-
-The following hyperparameters are for object detection and instance segmentation tasks.
-
-> [!WARNING]
-> These parameters are not supported with the `yolov5` algorithm.
-
-| Parameter name | Description | Default |
-| ------------- |-------------|-----|
-|`validation_metric_type`| Metric computation method to use for validation metrics. <br> Must be `none`, `coco`, `voc`, or `coco_voc`. |`voc`|
-|`min_size`| Minimum size of the image to be rescaled before feeding it to the backbone. <br> Must be a positive integer. <br> <br> *Note: training run may get into CUDA OOM if the size is too big*.| 600 |
-|`max_size`| Maximum size of the image to be rescaled before feeding it to the backbone. <br> Must be a positive integer.<br> <br> *Note: training run may get into CUDA OOM if the size is too big*. | 1333 |
-|`box_score_thresh`| During inference, only return proposals with a classification score greater than `box_score_thresh`. <br> Must be a float in the range [0, 1].| 0.3 |
-|`box_nms_thresh`| Non-maximum suppression (NMS) threshold for the prediction head. Used during inference. <br>Must be a float in the range [0, 1]. | 0.5 |
-|`box_detections_per_img`| Maximum number of detections per image, for all classes. <br> Must be a positive integer.| 100 |
-|`tile_grid_size`| The grid size to use for tiling each image. <br>*Note: tile_grid_size must not be None to enable [small object detection](how-to-use-automl-small-object-detect.md) logic*<br> A tuple of two integers passed as a string. Example: --tile_grid_size "(3, 2)" | No Default |
-|`tile_overlap_ratio`| Overlap ratio between adjacent tiles in each dimension. <br> Must be a float in the range [0, 1). | 0.25 |
-|`tile_predictions_nms_thresh`| The IoU threshold to use to perform NMS while merging predictions from tiles and image. Used in validation/inference. <br> Must be a float in the range [0, 1]. | 0.25 |
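Because `tile_grid_size` is passed as a string and several thresholds have bounded ranges, it can help to validate values locally first. The sketch below is one possible approach and is not taken from the SDK: the helpers `parse_tile_grid_size` and `validate_detection_params` are hypothetical names. The ranges and allowed values follow the table above.

```python
import ast


def parse_tile_grid_size(text):
    """Parse a string such as "(3, 2)" into a tuple of two integers."""
    value = ast.literal_eval(text)  # safely evaluates Python literals only
    is_pair_of_ints = (
        isinstance(value, tuple)
        and len(value) == 2
        and all(isinstance(v, int) and not isinstance(v, bool) for v in value)
    )
    if not is_pair_of_ints:
        raise ValueError(f"expected a tuple of two integers, got {text!r}")
    return value


def validate_detection_params(params):
    """Return a list of problems found in `params` (empty if none)."""
    problems = []
    metric = params.get("validation_metric_type", "voc")
    if metric not in ("none", "coco", "voc", "coco_voc"):
        problems.append("validation_metric_type must be none, coco, voc, or coco_voc")
    # These thresholds use the closed range [0, 1].
    for name in ("box_score_thresh", "box_nms_thresh", "tile_predictions_nms_thresh"):
        value = params.get(name)
        if value is not None and not 0 <= value <= 1:
            problems.append(f"{name} must be in [0, 1]")
    # The overlap ratio uses the half-open range [0, 1).
    overlap = params.get("tile_overlap_ratio")
    if overlap is not None and not 0 <= overlap < 1:
        problems.append("tile_overlap_ratio must be in [0, 1)")
    return problems


print(parse_tile_grid_size("(3, 2)"))
print(validate_detection_params({"tile_overlap_ratio": 1.0}))
```

Note the different upper bounds: a `tile_overlap_ratio` of exactly 1 is rejected, while a threshold of exactly 1 is accepted.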