|
| 1 | +--- |
| 2 | +title: Hyperparameters for AutoML computer vision tasks |
| 3 | +titleSuffix: Azure Machine Learning |
| 4 | +description: Learn which hyperparameters are available for computer vision tasks with automated ML. |
| 5 | +services: machine-learning |
| 6 | +ms.service: machine-learning |
| 7 | +ms.subservice: automl |
| 8 | +ms.topic: reference |
| 9 | +ms.reviewer: nibaccam |
| 10 | +author: swatig007 |
| 11 | +ms.author: swatig |
| 12 | +ms.date: 01/18/2022 |
| 13 | +ms.custom: |
| 14 | +--- |
| 15 | + |
| 16 | +# Hyperparameters for computer vision tasks in automated machine learning |
| 17 | + |
| 18 | +Learn which hyperparameters are available specifically for computer vision tasks in automated ML experiments. |
| 19 | + |
| 20 | +With support for computer vision tasks, you can control the model algorithm and sweep hyperparameters. These model algorithms and hyperparameters are passed in as the parameter space for the sweep. While many of the hyperparameters exposed are model-agnostic, there are instances where hyperparameters are task-specific or model-specific. |
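As a rough illustration of what "parameter space" means here, the sketch below models it as a plain Python dictionary and draws one configuration at random. The dictionary layout, the `model_name` key, and the `sample_configuration` helper are assumptions made for this example; they are not the Azure ML SDK's parameter-space syntax.

```python
import random

# Hypothetical stand-in for a sweep parameter space: each key is a
# hyperparameter name, each value a list of candidate settings.
# NOTE: the "model_name" key and this dict layout are assumptions,
# not the SDK's actual syntax.
parameter_space = {
    "model_name": ["vits16r224", "vitb16r224"],
    "learning_rate": [0.001, 0.005, 0.01],
    "optimizer": ["sgd", "adam", "adamw"],
    "number_of_epochs": [15, 30],
}


def sample_configuration(space, rng):
    """Pick one candidate value per hyperparameter (simple random sampling)."""
    return {name: rng.choice(values) for name, values in space.items()}


rng = random.Random(0)  # fixed seed so repeated runs draw the same sample
config = sample_configuration(parameter_space, rng)
print(config)
```

Each sampled configuration corresponds to one training run in the sweep; the tables in this article list the names and valid ranges you can place in such a space.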
| 21 | + |
| 22 | +## Model-agnostic hyperparameters |
| 23 | + |
| 24 | +The following table describes the hyperparameters that are model-agnostic. |
| 25 | + |
| 26 | +| Parameter name | Description | Default| |
| 27 | +| ------------ | ------------- | ------------ | |
| 28 | +| `number_of_epochs` | Number of training epochs. <br>Must be a positive integer. | 15 <br> (except `yolov5`: 30) | |
| 29 | +| `training_batch_size` | Training batch size.<br> Must be a positive integer. | Multi-class/multi-label: 78 <br>(except *vit-variants*: <br> `vits16r224`: 128 <br>`vitb16r224`: 48 <br>`vitl16r224`: 10)<br><br>Object detection: 2 <br>(except `yolov5`: 16) <br><br> Instance segmentation: 2 <br> <br> *Note: The defaults are the largest batch sizes that can be used with 12 GiB of GPU memory*.| |
| 30 | +| `validation_batch_size` | Validation batch size.<br> Must be a positive integer. | Multi-class/multi-label: 78 <br>(except *vit-variants*: <br> `vits16r224`: 128 <br>`vitb16r224`: 48 <br>`vitl16r224`: 10)<br><br>Object detection: 1 <br>(except `yolov5`: 16) <br><br> Instance segmentation: 1 <br> <br> *Note: The defaults are the largest batch sizes that can be used with 12 GiB of GPU memory*.| |
| 31 | +| `grad_accumulation_step` | Number of steps over which to accumulate gradients. Gradient accumulation means running `grad_accumulation_step` steps without updating the model weights while accumulating the gradients of those steps, and then using the accumulated gradients to compute the weight updates. <br> Must be a positive integer. | 1 | |
| 32 | +| `early_stopping` | Enable early stopping logic during training. <br> Must be 0 or 1.| 1 | |
| 33 | +| `early_stopping_patience` | Minimum number of epochs or validation evaluations with<br>no primary metric improvement before the run is stopped.<br> Must be a positive integer. | 5 | |
| 34 | +| `early_stopping_delay` | Minimum number of epochs or validation evaluations to wait<br>before primary metric improvement is tracked for early stopping.<br> Must be a positive integer. | 5 | |
| 35 | +| `learning_rate` | Initial learning rate. <br>Must be a float in the range [0, 1]. | Multi-class: 0.01 <br>(except *vit-variants*: <br> `vits16r224`: 0.0125<br>`vitb16r224`: 0.0125<br>`vitl16r224`: 0.001) <br><br> Multi-label: 0.035 <br>(except *vit-variants*:<br>`vits16r224`: 0.025<br>`vitb16r224`: 0.025 <br>`vitl16r224`: 0.002) <br><br> Object detection: 0.005 <br>(except `yolov5`: 0.01) <br><br> Instance segmentation: 0.005 | |
| 36 | +| `lr_scheduler` | Type of learning rate scheduler. <br> Must be `warmup_cosine` or `step`. | `warmup_cosine` | |
| 37 | +| `step_lr_gamma` | Value of gamma when learning rate scheduler is `step`.<br> Must be a float in the range [0, 1]. | 0.5 | |
| 38 | +| `step_lr_step_size` | Value of step size when learning rate scheduler is `step`.<br> Must be a positive integer. | 5 | |
| 39 | +| `warmup_cosine_lr_cycles` | Value of cosine cycle when learning rate scheduler is `warmup_cosine`. <br> Must be a float in the range [0, 1]. | 0.45 | |
| 40 | +| `warmup_cosine_lr_warmup_epochs` | Value of warmup epochs when learning rate scheduler is `warmup_cosine`. <br> Must be a positive integer. | 2 | |
| 41 | +| `optimizer` | Type of optimizer. <br> Must be `sgd`, `adam`, or `adamw`. | `sgd` | |
| 42 | +| `momentum` | Value of momentum when optimizer is `sgd`. <br> Must be a float in the range [0, 1]. | 0.9 | |
| 43 | +| `weight_decay` | Value of weight decay when optimizer is `sgd`, `adam`, or `adamw`. <br> Must be a float in the range [0, 1]. | 1e-4 | |
| 44 | +|`nesterov`| Enable `nesterov` when optimizer is `sgd`. <br> Must be 0 or 1.| 1 | |
| 45 | +|`beta1` | Value of `beta1` when optimizer is `adam` or `adamw`. <br> Must be a float in the range [0, 1]. | 0.9 | |
| 46 | +|`beta2` | Value of `beta2` when optimizer is `adam` or `adamw`.<br> Must be a float in the range [0, 1]. | 0.999 | |
| 47 | +|`amsgrad` | Enable `amsgrad` when optimizer is `adam` or `adamw`.<br> Must be 0 or 1. | 0 | |
| 48 | +|`evaluation_frequency`| Frequency at which to evaluate the validation dataset to get metric scores. <br> Must be a positive integer. | 1 | |
| 49 | +|`split_ratio`| If validation data is not defined, this specifies the split ratio for splitting train data into random train and validation subsets. <br> Must be a float in the range [0, 1].| 0.2 | |
| 50 | +|`checkpoint_frequency`| Frequency to store model checkpoints. <br> Must be a positive integer. | Checkpoint at epoch with best primary metric on validation.| |
| 51 | +|`checkpoint_run_id`| The run ID of the experiment that has a pretrained checkpoint for incremental training.| no default | |
| 52 | +|`checkpoint_dataset_id`| FileDataset ID containing pretrained checkpoint(s) for incremental training. Make sure to pass `checkpoint_filename` along with `checkpoint_dataset_id`.| no default | |
| 53 | +|`checkpoint_filename`| The pretrained checkpoint filename in FileDataset for incremental training. Make sure to pass `checkpoint_dataset_id` along with `checkpoint_filename`.| no default | |
| 54 | +|`layers_to_freeze`| Number of layers to freeze for your model. For instance, passing 2 as the value for `seresnext` freezes `layer0` and `layer1`, as listed in the supported model layer info below. <br> Must be a positive integer. <br><br>`'resnet': [('conv1.', 'bn1.'), 'layer1.', 'layer2.', 'layer3.', 'layer4.'],`<br>`'mobilenetv2': ['features.0.', 'features.1.', 'features.2.', 'features.3.', 'features.4.', 'features.5.', 'features.6.', 'features.7.', 'features.8.', 'features.9.', 'features.10.', 'features.11.', 'features.12.', 'features.13.', 'features.14.', 'features.15.', 'features.16.', 'features.17.', 'features.18.'],`<br>`'seresnext': ['layer0.', 'layer1.', 'layer2.', 'layer3.', 'layer4.'],`<br>`'vit': ['patch_embed', 'blocks.0.', 'blocks.1.', 'blocks.2.', 'blocks.3.', 'blocks.4.', 'blocks.5.', 'blocks.6.','blocks.7.', 'blocks.8.', 'blocks.9.', 'blocks.10.', 'blocks.11.'],`<br>`'yolov5_backbone': ['model.0.', 'model.1.', 'model.2.', 'model.3.', 'model.4.','model.5.', 'model.6.', 'model.7.', 'model.8.', 'model.9.'],`<br>`'resnet_backbone': ['backbone.body.conv1.', 'backbone.body.layer1.', 'backbone.body.layer2.','backbone.body.layer3.', 'backbone.body.layer4.']` | no default | |
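The `layers_to_freeze` row can be read as "take the first N entries of the model's layer list". The sketch below shows that reading for two of the listed model families. The helper names `prefixes_to_freeze` and `is_frozen` are hypothetical, and treating the `('conv1.', 'bn1.')` tuple for `resnet` as a single layer is an assumption drawn from how the list is written, not confirmed behavior.

```python
# Layer name prefixes copied from the table row for two model families.
SUPPORTED_LAYERS = {
    "resnet": [("conv1.", "bn1."), "layer1.", "layer2.", "layer3.", "layer4."],
    "seresnext": ["layer0.", "layer1.", "layer2.", "layer3.", "layer4."],
}


def prefixes_to_freeze(model_family, layers_to_freeze):
    """Return the parameter-name prefixes covered by the first N layers."""
    if layers_to_freeze < 1:
        raise ValueError("layers_to_freeze must be a positive integer")
    prefixes = []
    for group in SUPPORTED_LAYERS[model_family][:layers_to_freeze]:
        # Assumption: a tuple entry counts as one layer with several prefixes.
        if isinstance(group, tuple):
            prefixes.extend(group)
        else:
            prefixes.append(group)
    return prefixes


def is_frozen(parameter_name, prefixes):
    """True if a parameter name starts with any of the frozen prefixes."""
    return parameter_name.startswith(tuple(prefixes))


print(prefixes_to_freeze("seresnext", 2))
print(prefixes_to_freeze("resnet", 1))
```

The first call mirrors the example in the table: passing 2 for `seresnext` selects `layer0.` and `layer1.`.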
| 55 | + |
| 56 | +## Image classification (multi-class and multi-label) specific hyperparameters |
| 57 | + |
| 58 | +The following table summarizes hyperparameters for image classification (multi-class and multi-label) tasks. |
| 59 | + |
| 60 | +| Parameter name | Description | Default | |
| 61 | +| ------------- |-------------|-----| |
| 62 | +| `weighted_loss` | 0 for no weighted loss.<br>1 for weighted loss with sqrt(class_weights). <br> 2 for weighted loss with class_weights. <br> Must be 0, 1, or 2. | 0 | |
| 63 | +| `valid_resize_size` | Image size to which to resize before cropping for validation dataset. <br> Must be a positive integer. <br> <br> *Notes: <li> `seresnext` doesn't take an arbitrary size. <li> The training run may hit a CUDA out-of-memory (OOM) error if the size is too big*. | 256 | |
| 64 | +| `valid_crop_size` | Image crop size that's input to your neural network for validation dataset. <br> Must be a positive integer. <br> <br> *Notes: <li> `seresnext` doesn't take an arbitrary size. <li> *ViT-variants* should have the same `valid_crop_size` and `train_crop_size`. <li> The training run may hit a CUDA out-of-memory (OOM) error if the size is too big*. | 224 | |
| 65 | +| `train_crop_size` | Image crop size that's input to your neural network for train dataset. <br> Must be a positive integer. <br> <br> *Notes: <li> `seresnext` doesn't take an arbitrary size. <li> *ViT-variants* should have the same `valid_crop_size` and `train_crop_size`. <li> The training run may hit a CUDA out-of-memory (OOM) error if the size is too big*. | 224 | |
| 66 | + |
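The three `weighted_loss` modes can be sketched as a small mapping from a list of class weights to the per-class weights applied in the loss. The `loss_weights` helper is hypothetical, and how AutoML derives `class_weights` in the first place isn't described in the table, so the example simply takes them as input. Returning all ones for mode 0 is used here as an equivalent of "no weighted loss".

```python
import math


def loss_weights(class_weights, weighted_loss=0):
    """Map class weights to per-class loss weights for each weighted_loss mode."""
    if weighted_loss == 0:
        # No weighting: every class contributes equally.
        return [1.0] * len(class_weights)
    if weighted_loss == 1:
        # Weighted loss with sqrt(class_weights).
        return [math.sqrt(w) for w in class_weights]
    if weighted_loss == 2:
        # Weighted loss with class_weights as given.
        return list(class_weights)
    raise ValueError("weighted_loss must be 0, 1, or 2")


example_weights = [4.0, 1.0, 0.25]  # made-up values for illustration
for mode in (0, 1, 2):
    print(mode, loss_weights(example_weights, mode))
```

Mode 1 dampens the spread between large and small class weights, which can be a gentler choice than mode 2 when classes are heavily imbalanced.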
| 67 | +## Object detection and instance segmentation task-specific hyperparameters |
| 68 | + |
| 69 | +The following hyperparameters are for object detection and instance segmentation tasks. |
| 70 | + |
| 71 | +> [!WARNING] |
| 72 | +> These parameters are not supported with the `yolov5` algorithm. |
| 73 | + |
| 74 | +| Parameter name | Description | Default | |
| 75 | +| ------------- |-------------|-----| |
| 76 | +| `validation_metric_type` | Metric computation method to use for validation metrics. <br> Must be `none`, `coco`, `voc`, or `coco_voc`. | `voc` | |
| 77 | +| `min_size` | Minimum size of the image to be rescaled before feeding it to the backbone. <br> Must be a positive integer. <br> <br> *Note: the training run may hit a CUDA out-of-memory (OOM) error if the size is too big*.| 600 | |
| 78 | +| `max_size` | Maximum size of the image to be rescaled before feeding it to the backbone. <br> Must be a positive integer.<br> <br> *Note: the training run may hit a CUDA out-of-memory (OOM) error if the size is too big*. | 1333 | |
| 79 | +| `box_score_thresh` | During inference, only return proposals with a classification score greater than `box_score_thresh`. <br> Must be a float in the range [0, 1].| 0.3 | |
| 80 | +| `box_nms_thresh` | Non-maximum suppression (NMS) threshold for the prediction head. Used during inference. <br>Must be a float in the range [0, 1]. | 0.5 | |
| 81 | +| `box_detections_per_img` | Maximum number of detections per image, for all classes. <br> Must be a positive integer.| 100 | |
| 82 | +| `tile_grid_size` | The grid size to use for tiling each image. <br>*Note: `tile_grid_size` must not be None to enable [small object detection](how-to-use-automl-small-object-detect.md) logic*<br> A tuple of two integers passed as a string. Example: `--tile_grid_size "(3, 2)"` | no default | |
| 83 | +| `tile_overlap_ratio` | Overlap ratio between adjacent tiles in each dimension. <br> Must be a float in the range [0, 1). | 0.25 | |
| 84 | +| `tile_predictions_nms_thresh` | The IoU threshold to use to perform NMS while merging predictions from tiles and image. Used in validation/inference. <br> Must be a float in the range [0, 1]. | 0.25 | |
| 85 | + |
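The three inference-time parameters `box_score_thresh`, `box_nms_thresh`, and `box_detections_per_img` are typically applied in sequence: drop low-scoring boxes, suppress overlapping ones, then cap the count. The sketch below illustrates that sequence in plain Python. It is a simplified, class-agnostic version written for this example (detection models usually run NMS per class), and the `iou` and `postprocess` helpers are hypothetical, not AutoML's implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(detections, box_score_thresh=0.3, box_nms_thresh=0.5,
                box_detections_per_img=100):
    """Filter (box, score) pairs by score, apply greedy NMS, cap the count."""
    # 1. Keep only detections scoring above box_score_thresh, best first.
    candidates = sorted(
        (d for d in detections if d[1] > box_score_thresh),
        key=lambda d: d[1],
        reverse=True,
    )
    # 2. Greedy NMS: keep a box only if it doesn't overlap a kept box too much.
    kept = []
    for box, score in candidates:
        if all(iou(box, kept_box) <= box_nms_thresh for kept_box, _ in kept):
            kept.append((box, score))
    # 3. Cap the number of detections returned for the image.
    return kept[:box_detections_per_img]


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 10, 10), 0.8),    # overlaps the first box heavily (IoU 0.81)
    ((20, 20, 30, 30), 0.6),
    ((0, 0, 5, 5), 0.2),      # below the default score threshold
]
print(postprocess(detections))
```

With the default thresholds, the second box is suppressed by the first and the fourth is dropped for its low score, leaving two detections.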
| 86 | +## Next steps |
| 87 | + |
| 88 | +* Learn how to [Set up AutoML to train computer vision models with Python (preview)](how-to-auto-train-image-models.md). |
| 89 | + |
| 90 | +* [Tutorial: Train an object detection model (preview) with AutoML and Python](tutorial-auto-train-image-models.md). |