Commit 7cd8db8

Merge pull request #208574 from sharma-riti/master
vision docs update
2 parents c0c6367 + cb61c8c commit 7cd8db8

3 files changed: +113 -28 lines changed


articles/machine-learning/how-to-auto-train-image-models.md

Lines changed: 53 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -202,9 +202,9 @@ Automated ML doesn't impose any constraints on training or validation data size
202202

203203
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
204204

205-
Training data is a required parameter and is passed in using the `training` key of the data section. You can optionally specify another MLtable as a validation data with the `validation` key. If no validation data is specified, 20% of your training data will be used for validation by default, unless you pass `validation_data_size` argument with a different value.
205+
Training data is a required parameter and is passed in using the `training_data` key. You can optionally specify another MLTable as validation data with the `validation_data` key. If no validation data is specified, 20% of your training data is used for validation by default, unless you pass the `validation_data_size` argument with a different value.
206206

207-
Target column name is a required parameter and used as target for supervised ML task. It's passed in using the `target_column_name` key in the data section. For example,
207+
The target column name is a required parameter and is used as the target for the supervised ML task. It's passed in using the `target_column_name` key. For example,
208208

209209
```yaml
210210
target_column_name: label
@@ -303,23 +303,23 @@ Before doing a large sweep to search for the optimal models and hyperparameters,
303303

304304
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
305305

306-
If you wish to use the default hyperparameter values for a given algorithm (say yolov5), you can specify it using model_name key in image_model section. For example,
306+
If you wish to use the default hyperparameter values for a given algorithm (say, yolov5), you can specify it using the `model_name` key in the `training_parameters` section. For example,
307307

308308
```yaml
309-
image_model:
310-
model_name: "yolov5"
309+
training_parameters:
310+
model_name: yolov5
311311
```
312312
# [Python SDK](#tab/python)
313313

314314
[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]
315315

316-
If you wish to use the default hyperparameter values for a given algorithm (say yolov5), you can specify it using model_name parameter in set_image_model method of the task specific `automl` job. For example,
316+
If you wish to use the default hyperparameter values for a given algorithm (say, yolov5), you can specify it using the `model_name` parameter of the `set_training_parameters` method of the task-specific `automl` job. For example,
317317

318318
```python
319-
image_object_detection_job.set_image_model(model_name="yolov5")
319+
image_object_detection_job.set_training_parameters(model_name="yolov5")
320320
```
321321
---
322-
Once you've built a baseline model, you might want to optimize model performance in order to sweep over the model algorithm and hyperparameter space. You can use the following sample config to sweep over the hyperparameters for each algorithm, choosing from a range of values for learning_rate, optimizer, lr_scheduler, etc., to generate a model with the optimal primary metric. If hyperparameter values aren't specified, then default values are used for the specified algorithm.
322+
Once you've built a baseline model, you might want to optimize model performance by sweeping over the model algorithm and hyperparameter space. You can use the following sample config to [sweep over the hyperparameters](./how-to-auto-train-image-models.md#sweeping-hyperparameters-for-your-model) for each algorithm, choosing from a range of values for `learning_rate`, `optimizer`, `lr_scheduler`, etc., to generate a model with the optimal primary metric. If hyperparameter values aren't specified, the default values for the specified algorithm are used.
323323

324324
### Primary metric
325325

@@ -355,6 +355,46 @@ limits:
355355
When training computer vision models, model performance depends heavily on the hyperparameter values selected. Often, you might want to tune the hyperparameters to get optimal performance.
356356
With support for computer vision tasks in automated ML, you can sweep hyperparameters to find the optimal settings for your model. This feature applies the hyperparameter tuning capabilities in Azure Machine Learning. [Learn how to tune hyperparameters](how-to-tune-hyperparameters.md).
357357

358+
# [Azure CLI](#tab/cli)
359+
360+
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
361+
362+
```yaml
363+
search_space:
364+
- model_name:
365+
type: choice
366+
values: [yolov5]
367+
learning_rate:
368+
type: uniform
369+
min_value: 0.0001
370+
max_value: 0.01
371+
model_size:
372+
type: choice
373+
values: [small, medium]
374+
375+
- model_name:
376+
type: choice
377+
values: [fasterrcnn_resnet50_fpn]
378+
learning_rate:
379+
type: uniform
380+
min_value: 0.0001
381+
max_value: 0.001
382+
optimizer:
383+
type: choice
384+
values: [sgd, adam, adamw]
385+
min_size:
386+
type: choice
387+
values: [600, 800]
388+
```
389+
390+
# [Python SDK](#tab/python)
391+
392+
[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]
393+
394+
[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=search-space-settings)]
395+
396+
---
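To make the structure of the `search_space` YAML above concrete, here is a minimal Python sketch of how a random sampler might draw one trial from it: pick one subspace, then sample each hyperparameter in that subspace. The dictionary layout and the helper names `sample_value` and `sample_trial` are hypothetical illustrations, not part of the Azure ML SDK, and the actual sweep behavior depends on the sampling algorithm you configure.

```python
import random

# Hypothetical in-memory mirror of the `search_space` YAML above.
# This layout is an illustration only, not an Azure ML data structure.
SEARCH_SPACE = [
    {
        "model_name": {"type": "choice", "values": ["yolov5"]},
        "learning_rate": {"type": "uniform", "min_value": 0.0001, "max_value": 0.01},
        "model_size": {"type": "choice", "values": ["small", "medium"]},
    },
    {
        "model_name": {"type": "choice", "values": ["fasterrcnn_resnet50_fpn"]},
        "learning_rate": {"type": "uniform", "min_value": 0.0001, "max_value": 0.001},
        "optimizer": {"type": "choice", "values": ["sgd", "adam", "adamw"]},
        "min_size": {"type": "choice", "values": [600, 800]},
    },
]


def sample_value(spec, rng):
    """Draw one value from a single `choice` or `uniform` entry."""
    if spec["type"] == "choice":
        return rng.choice(spec["values"])
    if spec["type"] == "uniform":
        return rng.uniform(spec["min_value"], spec["max_value"])
    raise ValueError(f"unsupported distribution type: {spec['type']}")


def sample_trial(search_space, rng):
    """Pick one subspace, then sample every hyperparameter inside it."""
    subspace = rng.choice(search_space)
    return {name: sample_value(spec, rng) for name, spec in subspace.items()}


rng = random.Random(0)
trial = sample_trial(SEARCH_SPACE, rng)
print(trial)
```

Because each list item is its own subspace, a sampled trial only ever contains hyperparameters that belong to the chosen model: a yolov5 trial has `model_size` but never `optimizer` or `min_size`.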
397+
358398
### Define the parameter search space
359399

360400
You can define the model algorithms and hyperparameters to sweep in the parameter space.
@@ -437,7 +477,7 @@ You can pass fixed settings or parameters that don't change during the parameter
437477
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
438478

439479
```yaml
440-
image_model:
480+
training_parameters:
441481
early_stopping: True
442482
evaluation_frequency: 1
443483
```
@@ -466,7 +506,7 @@ You can pass the run ID that you want to load the checkpoint from.
466506
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
467507

468508
```yaml
469-
image_model:
509+
training_parameters:
470510
checkpoint_run_id : "target_checkpoint_run_id"
471511
```
472512

@@ -496,7 +536,7 @@ mlflow_parent_run = mlflow_client.get_run(automl_job.name)
496536
target_checkpoint_run_id = mlflow_parent_run.data.tags["automl_best_child_run_id"]
497537
```
498538

499-
To pass a checkpoint via the run ID, you need to use the `checkpoint_run_id` parameter in `set_image_model` function.
539+
To pass a checkpoint via the run ID, use the `checkpoint_run_id` parameter of the `set_training_parameters` function.
500540

501541
```python
502542
image_object_detection_job = automl.image_object_detection(
@@ -509,7 +549,7 @@ image_object_detection_job = automl.image_object_detection(
509549
tags={"my_custom_tag": "My custom value"},
510550
)
511551

512-
image_object_detection_job.set_image_model(checkpoint_run_id=target_checkpoint_run_id)
552+
image_object_detection_job.set_training_parameters(checkpoint_run_id=target_checkpoint_run_id)
513553

514554
automl_image_job_incremental = ml_client.jobs.create_or_update(
515555
image_object_detection_job
@@ -722,7 +762,7 @@ this is how your review page looks like. we can select instance type, instance c
722762

723763
### Update inference settings
724764

725-
In the previous step, we downloaded a file `mlflow-model/artifacts/settings.json` from the best model. which can be used to update the inference settings before registering the model. Although its's recommended to use the same parameters as training for best performance.
765+
In the previous step, we downloaded the file `mlflow-model/artifacts/settings.json` from the best model, which can be used to update the inference settings before registering the model. However, it's recommended to use the same parameters as in training for best performance.
726766

727767
Each of the tasks (and some models) has a set of parameters. By default, we use the same values for the parameters that were used during the training and validation. Depending on the behavior that we need when using the model for inference, we can change these parameters. Below you can find a list of parameters for each task type and model.
728768

articles/machine-learning/how-to-prepare-datasets-for-automl-images.md

Lines changed: 36 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ In this article, you learn how to prepare image data for training computer visio
2727
To generate models for computer vision tasks with automated machine learning, you need to bring labeled image data as input for model training in the form of an `MLTable`.
2828

2929
You can create an `MLTable` from labeled training data in JSONL format.
30-
If your labeled training data is in a different format (like, pascal VOC or COCO), you can use a conversion script to first convert it to JSONL, and then create an `MLTable`. Alternatively, you can use Azure Machine Learning's [data labeling tool](how-to-create-image-labeling-projects.md) to manually label images, and export the labeled data to use for training your AutoML model.
30+
If your labeled training data is in a different format (like Pascal VOC or COCO), you can use a [conversion script](https://github.com/Azure/azureml-examples/blob/main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/coco2jsonl.py) to first convert it to JSONL, and then create an `MLTable`. Alternatively, you can use Azure Machine Learning's [data labeling tool](how-to-create-image-labeling-projects.md) to manually label images, and export the labeled data to use for training your AutoML model.
3131

3232
## Prerequisites
3333

@@ -36,7 +36,7 @@ If your labeled training data is in a different format (like, pascal VOC or COCO
3636
## Get labeled data
3737
In order to train computer vision models using AutoML, you need to first get labeled training data. The images need to be uploaded to the cloud and label annotations need to be in JSONL format. You can either use the Azure ML Data Labeling tool to label your data or you could start with pre-labeled image data.
3838

39-
### Using Azure ML Data Labeling tool to label your training data
39+
## Using Azure ML Data Labeling tool to label your training data
4040
If you don't have pre-labeled data, you can use Azure Machine Learning's [data labeling tool](how-to-create-image-labeling-projects.md) to manually label images. This tool automatically generates the data required for training in the accepted format.
4141

4242
It helps to create, manage, and monitor data labeling tasks for
@@ -45,9 +45,37 @@ It helps to create, manage, and monitor data labeling tasks for
4545
+ Object detection (bounding box)
4646
+ Instance segmentation (polygon)
4747

48-
If you already have a data labeling project and you want to use that data, you can [export your labeled data as an Azure ML Dataset](how-to-create-image-labeling-projects.md#export-the-labels). You can then access the exported dataset under the 'Datasets' tab in Azure ML Studio, and download the underlying JSONL file from the Dataset details page under Data sources. The downloaded JSONL file can then be used to create an `MLTable` that can be used by automated ML for training computer vision models.
48+
If you already have a data labeling project and you want to use that data, you can [export your labeled data as an Azure ML Dataset](how-to-create-image-labeling-projects.md#export-the-labels) and then access the dataset under the 'Datasets' tab in Azure ML Studio. This exported dataset can then be passed as an input using the `azureml:<tabulardataset_name>:<version>` format. Here is an example of how to pass an existing dataset as input for training computer vision models.
4949

50-
### Using pre-labeled training data
50+
# [Azure CLI](#tab/cli)
51+
52+
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
53+
54+
```yaml
55+
training_data:
56+
path: azureml:odFridgeObjectsTrainingDataset:1
57+
type: mltable
58+
mode: direct
59+
```
60+
61+
# [Python SDK](#tab/python)
62+
63+
[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]
64+
65+
```python
66+
from azure.ai.ml.constants import AssetTypes, InputOutputModes
67+
from azure.ai.ml import Input
68+
69+
# Training MLTable with v1 TabularDataset
70+
my_training_data_input = Input(
71+
type=AssetTypes.MLTABLE, path="azureml:odFridgeObjectsTrainingDataset:1",
72+
mode=InputOutputModes.DIRECT
73+
)
74+
```
75+
---
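The `azureml:<tabulardataset_name>:<version>` format used in both tabs can be checked before you submit a job. The following is a minimal sketch under the assumption that the version is everything after the last colon; `AssetReference` and `parse_asset_reference` are hypothetical helpers written for illustration, not functions from the Azure ML SDK.

```python
from typing import NamedTuple


class AssetReference(NamedTuple):
    """Hypothetical container for the two parts of an asset reference."""

    name: str
    version: str


def parse_asset_reference(path: str) -> AssetReference:
    """Split an `azureml:<name>:<version>` string into its parts."""
    prefix = "azureml:"
    if not path.startswith(prefix):
        raise ValueError(f"expected a path starting with {prefix!r}: {path!r}")
    # Split on the last colon so the version is whatever follows it.
    name, sep, version = path[len(prefix):].rpartition(":")
    if not sep or not name or not version:
        raise ValueError(f"expected azureml:<name>:<version>, got {path!r}")
    return AssetReference(name, version)


ref = parse_asset_reference("azureml:odFridgeObjectsTrainingDataset:1")
print(ref.name, ref.version)
```

A reference without a version, such as `azureml:odFridgeObjectsTrainingDataset`, raises `ValueError` in this sketch, which surfaces a malformed path early.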
76+
77+
78+
## Using pre-labeled training data
5179
If you have previously labeled data that you would like to use to train your model, you will first need to upload the images to the default Azure Blob Storage of your Azure ML Workspace and register it as a data asset.
5280

5381
# [Azure CLI](#tab/cli)
@@ -78,18 +106,18 @@ az ml data create -f [PATH_TO_YML_FILE] --workspace-name [YOUR_AZURE_WORKSPACE]
78106

79107
Next, you will need to get the label annotations in JSONL format. The schema of labeled data depends on the computer vision task at hand. Refer to [schemas for JSONL files for AutoML computer vision experiments](reference-automl-images-schema.md) to learn more about the required JSONL schema for each task type.
80108

81-
If your training data is in a different format (like, pascal VOC or COCO), [helper scripts](https://github.com/Azure/azureml-examples/blob/main/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to convert the data to JSONL are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/sdk-preview/sdk/jobs/automl-standalone-jobs).
109+
If your training data is in a different format (like Pascal VOC or COCO), [helper scripts](https://github.com/Azure/azureml-examples/blob/main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/coco2jsonl.py) to convert the data to JSONL are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/main/sdk/jobs/automl-standalone-jobs).
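To illustrate the kind of conversion such a helper script performs, here is a minimal sketch that turns one COCO bounding box (`[x, y, width, height]` in pixels) into normalized corner coordinates and writes one JSONL line. This is not the linked `coco2jsonl.py` script. The field names (`image_url`, `label`, `topX`, `topY`, `bottomX`, `bottomY`, `isCrowd`) are assumptions based on the object detection schema in the linked reference and should be verified there, which also lists further fields. The image path is a hypothetical placeholder.

```python
import json


def coco_box_to_label(bbox, image_width, image_height, class_name, is_crowd=0):
    """Convert a COCO box [x, y, width, height] in pixels to normalized corners."""
    x, y, box_width, box_height = bbox
    return {
        "label": class_name,
        "topX": x / image_width,
        "topY": y / image_height,
        "bottomX": (x + box_width) / image_width,
        "bottomY": (y + box_height) / image_height,
        "isCrowd": is_crowd,
    }


def build_jsonl_line(image_url, labels):
    """Serialize one image and its labels as a single JSONL line."""
    return json.dumps({"image_url": image_url, "label": labels})


# Hypothetical image path and annotation, for illustration only.
label = coco_box_to_label([50, 100, 200, 100], 400, 200, "can")
line = build_jsonl_line("path/to/image_01.jpg", [label])
print(line)
```

Each image becomes exactly one line in the output file, so a full conversion would loop over the COCO `images` list and collect the matching annotations for each image.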
82110

83-
## Create MLTable
111+
### Create MLTable
84112

85113
Once you have your labeled data in JSONL format, you can use it to create an `MLTable` as shown below. `MLTable` packages your data into a consumable object for training.
86114

87115
:::code language="yaml" source="~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/data/training-mltable-folder/MLTable":::
88116

89-
You can then pass in the `MLTable` as a data input for your AutoML training job.
117+
You can then pass in the `MLTable` as a [data input for your AutoML training job](./how-to-auto-train-image-models.md#consume-data).
90118

91119
## Next steps
92120

93121
* [Train computer vision models with automated machine learning](how-to-auto-train-image-models.md).
94122
* [Train a small object detection model with automated machine learning](how-to-use-automl-small-object-detect.md).
95-
* [Tutorial: Train an object detection model (preview) with AutoML and Python](tutorial-auto-train-image-models.md).
123+
* [Tutorial: Train an object detection model (preview) with AutoML and Python](tutorial-auto-train-image-models.md).

articles/machine-learning/tutorial-auto-train-image-models.md

Lines changed: 24 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -351,13 +351,30 @@ sweep:
351351

352352
```yaml
353353
search_space:
354-
- model_name: "yolov5"
355-
learning_rate: "uniform(0.0001, 0.01)"
356-
model_size: "choice('small', 'medium')"
357-
- model_name: "fasterrcnn_resnet50_fpn"
358-
learning_rate: "uniform(0.0001, 0.001)"
359-
optimizer: "choice('sgd', 'adam', 'adamw')"
360-
min_size: "choice(600, 800)"
354+
- model_name:
355+
type: choice
356+
values: [yolov5]
357+
learning_rate:
358+
type: uniform
359+
min_value: 0.0001
360+
max_value: 0.01
361+
model_size:
362+
type: choice
363+
values: [small, medium]
364+
365+
- model_name:
366+
type: choice
367+
values: [fasterrcnn_resnet50_fpn]
368+
learning_rate:
369+
type: uniform
370+
min_value: 0.0001
371+
max_value: 0.001
372+
optimizer:
373+
type: choice
374+
values: [sgd, adam, adamw]
375+
min_size:
376+
type: choice
377+
values: [600, 800]
361378
```
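The hunk above replaces string expressions such as `"uniform(0.0001, 0.01)"` with structured `type`/`values` entries. If you have older configs in the string form, the mapping can be sketched as follows. `to_structured` is a hypothetical helper written for this illustration, it only handles the `choice` and `uniform` expressions that appear in this diff, and it is not an Azure ML migration tool.

```python
import ast
import re

# Matches the two legacy expression kinds that appear in the removed lines.
_EXPRESSION = re.compile(r"^(choice|uniform)\((.*)\)$")


def to_structured(value):
    """Rewrite a legacy string expression as a structured search-space entry."""
    match = _EXPRESSION.match(value.strip())
    if match is None:
        # A bare value such as "yolov5" becomes a single-option choice.
        return {"type": "choice", "values": [value]}
    kind, raw_args = match.groups()
    # Parse the argument list safely as a Python tuple literal.
    args = list(ast.literal_eval(f"({raw_args},)"))
    if kind == "choice":
        return {"type": "choice", "values": args}
    min_value, max_value = args
    return {"type": "uniform", "min_value": min_value, "max_value": max_value}


legacy = {
    "model_name": "yolov5",
    "learning_rate": "uniform(0.0001, 0.01)",
    "model_size": "choice('small', 'medium')",
}
structured = {name: to_structured(value) for name, value in legacy.items()}
print(structured)
```

Applied to the removed yolov5 entry, this produces the same structure as the added lines: a single-value `choice` for `model_name`, a `uniform` range for `learning_rate`, and a two-value `choice` for `model_size`.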
362379

363380
# [Python SDK](#tab/python)
