
Commit 859a010

Author: shubham soni (committed)
Commit message: run to trials,jobs
1 parent fa20333 commit 859a010

File tree

3 files changed: +37 / -37 lines changed


articles/machine-learning/how-to-auto-train-image-models.md

Lines changed: 29 additions & 29 deletions
@@ -23,7 +23,7 @@ ms.date: 07/13/2022
In this article, you learn how to train computer vision models on image data with automated ML, using the Azure Machine Learning CLI extension v2 or the Azure Machine Learning Python SDK v2.

-Automated ML supports model training for computer vision tasks like image classification, object detection, and instance segmentation. Authoring AutoML models for computer vision tasks is currently supported via the Azure Machine Learning Python SDK. The resulting experimentation runs, models, and outputs are accessible from the Azure Machine Learning studio UI. [Learn more about automated ml for computer vision tasks on image data](concept-automated-ml.md).
+Automated ML supports model training for computer vision tasks like image classification, object detection, and instance segmentation. Authoring AutoML models for computer vision tasks is currently supported via the Azure Machine Learning Python SDK. The resulting experimentation trials, models, and outputs are accessible from the Azure Machine Learning studio UI. [Learn more about automated ML for computer vision tasks on image data](concept-automated-ml.md).

## Prerequisites

@@ -102,7 +102,7 @@ In order to generate computer vision models, you need to bring labeled image dat
If your training data is in a different format (like Pascal VOC or COCO), you can apply the helper scripts included with the sample notebooks to convert the data to JSONL. Learn more about how to [prepare data for computer vision tasks with automated ML](how-to-prepare-datasets-for-automl-images.md).

> [!Note]
-> The training data needs to have at least 10 images in order to be able to submit an AutoML run.
+> The training data needs to have at least 10 images in order to be able to submit an AutoML job.

> [!Warning]
> For this capability, creating an `MLTable` from data in JSONL format is supported using the SDK and CLI only. Creating the `MLTable` via the UI is not supported at this time.
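For orientation, here's a minimal Python SDK v2 sketch of wiring up the JSONL-backed `MLTable` folders as job inputs; the folder paths below are placeholders, not paths from this article:

```python
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

# Placeholder folders: each contains an MLTable file pointing at the JSONL annotations.
my_training_data_input = Input(
    type=AssetTypes.MLTABLE,
    path="./data/training-mltable-folder",
)
my_validation_data_input = Input(
    type=AssetTypes.MLTABLE,
    path="./data/validation-mltable-folder",
)
```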
@@ -277,27 +277,27 @@ image_object_detection_job = automl.image_object_detection(
## Configure experiments

-For computer vision tasks, you can launch either [individual runs](#individual-runs), [manual sweeps](#manually-sweeping-model-hyperparameters) or [automatic sweeps](#automatically-sweeping-model-hyperparameters-automode). We recommend starting with an automatic sweep to get a first baseline model. Then, you can try out individual runs with certain models and hyperparameter configurations. Finally, with manual sweeps you can explore multiple hyperparameter values near the more promising models and hyperparameter configurations. This three step workflow (automatic sweep, individual runs, manual sweeps) avoids searching the entirety of the hyperparameter space, which grows exponentially in the number of hyperparameters.
+For computer vision tasks, you can launch [individual trials](#individual-trials), [manual sweeps](#manually-sweeping-model-hyperparameters), or [automatic sweeps](#automatically-sweeping-model-hyperparameters-automode). We recommend starting with an automatic sweep to get a first baseline model. Then, you can try out individual trials with certain models and hyperparameter configurations. Finally, with manual sweeps you can explore multiple hyperparameter values near the more promising models and hyperparameter configurations. This three-step workflow (automatic sweep, individual trials, manual sweeps) avoids searching the entirety of the hyperparameter space, which grows exponentially in the number of hyperparameters.

Automatic sweeps can yield competitive results for many datasets. Additionally, they don't require advanced knowledge of model architectures, they take hyperparameter correlations into account, and they work seamlessly across different hardware setups. All these reasons make them a strong option for the early stage of your experimentation process.

### Primary metric

An AutoML training job uses a primary metric for model optimization and hyperparameter tuning. The primary metric depends on the task type as shown below; other primary metric values are currently not supported.

-* Accuracy for image classification
-* Intersection over union for image classification multilabel
-* Mean average precision for image object detection
-* Mean average precision for image instance segmentation
+* [Accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html) for image classification
+* [Intersection over union](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_score.html#sklearn.metrics.jaccard_score) for image classification multilabel
+* [Mean average precision](how-to-understand-automated-ml.md#object-detection-and-instance-segmentation-metrics) for image object detection
+* [Mean average precision](how-to-understand-automated-ml.md#object-detection-and-instance-segmentation-metrics) for image instance segmentation
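As a hedged Python SDK v2 sketch, the primary metric can be set explicitly on the job factory. The compute, experiment, and data input names below are placeholders, and the `ObjectDetectionPrimaryMetrics` import is an assumption about the SDK surface rather than a snippet from this article:

```python
from azure.ai.ml import automl
from azure.ai.ml.automl import ObjectDetectionPrimaryMetrics

# Placeholders: compute_name, exp_name, and the MLTable inputs are assumed to be defined earlier.
image_object_detection_job = automl.image_object_detection(
    compute=compute_name,
    experiment_name=exp_name,
    training_data=my_training_data_input,
    validation_data=my_validation_data_input,
    target_column_name="label",
    primary_metric=ObjectDetectionPrimaryMetrics.MEAN_AVERAGE_PRECISION,
)
```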

### Job limits

You can control the resources spent on your AutoML Image training job by specifying the `timeout_minutes`, `max_trials`, and `max_concurrent_trials` for the job in limit settings, as described in the following example.

Parameter | Detail
-----|----
-`max_trials` | Parameter for maximum number of configurations to sweep. Must be an integer between 1 and 1000. When exploring just the default hyperparameters for a given model architecture, set this parameter to 1. The default value is 1.
-`max_concurrent_trials`| Maximum number of runs that can run concurrently. If specified, must be an integer between 1 and 100. The default value is 1. <br><br> **NOTE:** <li> The number of concurrent runs is gated on the resources available in the specified compute target. Ensure that the compute target has the available resources for the desired concurrency. <li> `max_concurrent_trials` is capped at `max_trials` internally. For example, if user sets `max_concurrent_trials=4`, `max_trials=2`, values would be internally updated as `max_concurrent_trials=2`, `max_trials=2`.
+`max_trials` | Parameter for the maximum number of trials to sweep. Must be an integer between 1 and 1000. When exploring just the default hyperparameters for a given model architecture, set this parameter to 1. The default value is 1.
+`max_concurrent_trials`| Maximum number of trials that can run concurrently. If specified, must be an integer between 1 and 100. The default value is 1. <br><br> **NOTE:** <li> The number of concurrent trials is gated on the resources available in the specified compute target. Ensure that the compute target has the available resources for the desired concurrency. <li> `max_concurrent_trials` is capped at `max_trials` internally. For example, if a user sets `max_concurrent_trials=4`, `max_trials=2`, the values would be internally updated as `max_concurrent_trials=2`, `max_trials=2`.
`timeout_minutes`| The amount of time in minutes before the experiment terminates. If not specified, the default experiment timeout is seven days (maximum 60 days).
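In the Python SDK v2, these limits map onto `set_limits`; a minimal sketch with illustrative values, complementing the tabbed examples that follow:

```python
# Illustrative values only; tune them to your dataset and compute.
image_object_detection_job.set_limits(
    max_trials=10,
    max_concurrent_trials=2,
    timeout_minutes=60 * 12,  # 12 hours
)
```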

# [Azure CLI](#tab/cli)
@@ -324,7 +324,7 @@ limits:
> [!IMPORTANT]
> This feature is currently in public preview. This preview version is provided without a service-level agreement. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).

-It is generally hard to predict the best model architecture and hyperparameters for a dataset. Also, in some cases the human time allocated to tuning hyperparameters may be limited. For computer vision tasks, you can specify a number of runs and the system will automatically determine the region of the hyperparameter space to sweep. You do not have to define a hyperparameter search space, a sampling method or an early termination policy.
+It is generally hard to predict the best model architecture and hyperparameters for a dataset. Also, in some cases the human time allocated to tuning hyperparameters may be limited. For computer vision tasks, you can specify a number of trials and the system will automatically determine the region of the hyperparameter space to sweep. You do not have to define a hyperparameter search space, a sampling method or an early termination policy.

#### Triggering AutoMode

@@ -349,15 +349,15 @@ image_object_detection_job.set_limits(max_trials=10, max_concurrent_trials=2)
```
---

-A number of runs between 10 and 20 will likely work well on many datasets. The [time budget](#job-limits) for the AutoML job can still be set, but we recommend doing this only if each trial may take a long time.
+A number of trials between 10 and 20 will likely work well on many datasets. The [time budget](#job-limits) for the AutoML job can still be set, but we recommend doing this only if each trial may take a long time.

> [!Warning]
> Launching automatic sweeps via the UI is not supported at this time.

-### Individual runs
+### Individual trials

-In individual runs, you directly control the model architecture and hyperparameters. The model architecture is passed via the `model_name` parameter.
+In individual trials, you directly control the model architecture and hyperparameters. The model architecture is passed via the `model_name` parameter.
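For example, a minimal Python SDK v2 sketch of an individual trial that fixes the architecture and overrides a couple of hyperparameters; the values are illustrative, not recommendations:

```python
# One trial with a fixed architecture; hyperparameter values are examples only.
image_object_detection_job.set_training_parameters(
    model_name="yolov5",
    learning_rate=0.01,
    number_of_epochs=15,
)
```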

#### Supported model architectures
363363

@@ -441,7 +441,7 @@ search_space:
You can define the model architectures and hyperparameters to sweep in the parameter space. You can either specify a single model architecture or multiple ones.

-* See [Individual runs](#individual-runs) for the list of supported model architectures for each task type.
+* See [Individual trials](#individual-trials) for the list of supported model architectures for each task type.
* See [Hyperparameters for computer vision tasks](reference-automl-images-hyperparameters.md) for the hyperparameters for each computer vision task type.
* See [details on supported distributions for discrete and continuous hyperparameters](how-to-tune-hyperparameters.md#define-the-search-space).
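Putting those pieces together, here's a hedged Python SDK v2 sketch of a two-architecture search space. The `SearchSpace` import path, hyperparameter names, and ranges are assumptions for illustration, not values taken from this article:

```python
from azure.ai.ml.automl import SearchSpace  # assumed import path
from azure.ai.ml.sweep import Choice, Uniform

# Two illustrative subspaces, one per model architecture.
image_object_detection_job.extend_search_space(
    [
        SearchSpace(
            model_name=Choice(["yolov5"]),
            learning_rate=Uniform(0.0001, 0.01),
            model_size=Choice(["small", "medium"]),
        ),
        SearchSpace(
            model_name=Choice(["fasterrcnn_resnet50_fpn"]),
            learning_rate=Uniform(0.0001, 0.001),
            optimizer=Choice(["sgd", "adam", "adamw"]),
        ),
    ]
)
```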

@@ -460,7 +460,7 @@ When sweeping hyperparameters, you need to specify the sampling method to use fo
#### Early termination policies

-You can automatically end poorly performing runs with an early termination policy. Early termination improves computational efficiency, saving compute resources that would have been otherwise spent on less promising configurations. Automated ML for images supports the following early termination policies using the `early_termination` parameter. If no termination policy is specified, all configurations are run to completion.
+You can automatically end poorly performing trials with an early termination policy. Early termination improves computational efficiency, saving compute resources that would otherwise have been spent on less promising trials. Automated ML for images supports the following early termination policies using the `early_termination` parameter. If no termination policy is specified, all trials are run to completion.
| Early termination policy | AutoML Job syntax |
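As one hedged example, a bandit policy can be attached when configuring the sweep in the Python SDK v2; the parameter values below are illustrative:

```python
from azure.ai.ml.sweep import BanditPolicy

# End trials whose primary metric falls outside the slack of the best trial so far.
image_object_detection_job.set_sweep(
    sampling_algorithm="random",
    early_termination=BanditPolicy(
        evaluation_interval=2,
        slack_factor=0.2,
        delay_evaluation=6,
    ),
)
```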
@@ -607,12 +607,12 @@ In our experiments, we found that these augmentations help the model to generali
## Incremental training (optional)

-Once the training run is done, you have the option to further train the model by loading the trained model checkpoint. You can either use the same dataset or a different one for incremental training.
+Once the training job is done, you have the option to further train the model by loading the trained model checkpoint. You can either use the same dataset or a different one for incremental training.

-### Pass the checkpoint via run ID
+### Pass the checkpoint via job ID

-You can pass the run ID that you want to load the checkpoint from.
+You can pass the job ID that you want to load the checkpoint from.

# [Azure CLI](#tab/cli)

@@ -628,10 +628,10 @@ training_parameters:
[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]

-To find the run ID from the desired model, you can use the following code.
+To find the job ID from the desired model, you can use the following code.

```python
-# find a run id to get a model checkpoint from
+# find a job id to get a model checkpoint from
import mlflow

# Obtain the tracking URL from MLClient
@@ -645,11 +645,11 @@ from mlflow.tracking.client import MlflowClient
mlflow_client = MlflowClient()
mlflow_parent_run = mlflow_client.get_run(automl_job.name)

-# Fetch the id of the best automl child run.
+# Fetch the id of the best automl child trial.
target_checkpoint_run_id = mlflow_parent_run.data.tags["automl_best_child_run_id"]
```

-To pass a checkpoint via the run ID, you need to use the `checkpoint_run_id` parameter in `set_training_parameters` function.
+To pass a checkpoint via the job ID, you need to use the `checkpoint_run_id` parameter in the `set_training_parameters` function.

```python
image_object_detection_job = automl.image_object_detection(
@@ -697,18 +697,18 @@ When you've configured your AutoML Job to the desired settings, you can submit t
## Outputs and evaluation metrics

-The automated ML training runs generates output model files, evaluation metrics, logs and deployment artifacts like the scoring file and the environment file which can be viewed from the outputs and logs and metrics tab of the child runs.
+The automated ML training jobs generate output model files, evaluation metrics, logs, and deployment artifacts like the scoring file and the environment file, which can be viewed from the outputs and logs tab and the metrics tab of the child jobs.
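Beyond the studio UI, the metrics can also be read programmatically with MLflow, reusing the pattern shown earlier in this article; this sketch assumes the tracking URI is already set and `automl_job` is the submitted job:

```python
from mlflow.tracking.client import MlflowClient

mlflow_client = MlflowClient()

# The parent job carries a tag that points at the best child trial.
mlflow_parent_run = mlflow_client.get_run(automl_job.name)
best_child_run_id = mlflow_parent_run.data.tags["automl_best_child_run_id"]

# Evaluation metrics logged by the best trial.
best_metrics = mlflow_client.get_run(best_child_run_id).data.metrics
print(best_metrics)
```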

> [!TIP]
-> Check how to navigate to the run results from the [View run results](how-to-understand-automated-ml.md#view-job-results) section.
+> Check how to navigate to the job results from the [View job results](how-to-understand-automated-ml.md#view-job-results) section.

-For definitions and examples of the performance charts and metrics provided for each run, see [Evaluate automated machine learning experiment results](how-to-understand-automated-ml.md#metrics-for-image-models-preview).
+For definitions and examples of the performance charts and metrics provided for each job, see [Evaluate automated machine learning experiment results](how-to-understand-automated-ml.md#metrics-for-image-models-preview).

## Register and deploy model

-Once the run completes, you can register the model that was created from the best run (configuration that resulted in the best primary metric). You can either register the model after downloading or by specifying the azureml path with corresponding jobid. Note: If you want to change the inference settings that are described below you need to download the model and change settings.json and register using the updated model folder.
+Once the job completes, you can register the model that was created from the best trial (the configuration that resulted in the best primary metric). You can register the model either after downloading it or by specifying the azureml path with the corresponding job ID. Note: If you want to change the inference settings that are described below, you need to download the model, change settings.json, and register the model using the updated model folder.
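As a rough Python SDK v2 sketch of the azureml-path option (the artifact path, model name, and `ml_client` here are assumptions for illustration; the sections below show the documented steps):

```python
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

# best_child_run_id is assumed to come from the MLflow lookup shown earlier;
# the artifact subpath may differ depending on the task and SDK version.
model = Model(
    path=f"azureml://jobs/{best_child_run_id}/outputs/artifacts/outputs/mlflow-model/",
    name="od-fridge-items-model",  # illustrative name
    type=AssetTypes.MLFLOW_MODEL,
)
registered_model = ml_client.models.create_or_update(model)
```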

-### Get the best run
+### Get the best trial

# [Azure CLI](#tab/cli)

@@ -864,7 +864,7 @@ az ml online-endpoint update --name 'od-fridge-items-endpoint' --traffic 'od-fri
Alternatively, you can deploy the model from the [Azure Machine Learning studio UI](https://ml.azure.com/).
-Navigate to the model you wish to deploy in the **Models** tab of the automated ML run and select on **Deploy** and select **Deploy to real-time endpoint** .
+Navigate to the model you wish to deploy in the **Models** tab of the automated ML job, select **Deploy**, and then select **Deploy to real-time endpoint**.

![Screenshot of what the Deployment page looks like after selecting the Deploy option.](./media/how-to-auto-train-image-models/deploy-end-point.png)

@@ -1102,7 +1102,7 @@ image_object_detection_job = automl.image_object_detection(
### Streaming image files from storage

-By default, all image files are downloaded to disk prior to model training. If the size of the image files is greater than available disk space, the run will fail. Instead of downloading all images to disk, you can select to stream image files from Azure storage as they're needed during training. Image files are streamed from Azure storage directly to system memory, bypassing disk. At the same time, as many files as possible from storage are cached on disk to minimize the number of requests to storage.
+By default, all image files are downloaded to disk prior to model training. If the size of the image files is greater than the available disk space, the job will fail. Instead of downloading all images to disk, you can choose to stream image files from Azure storage as they're needed during training. Image files are streamed from Azure storage directly to system memory, bypassing disk. At the same time, as many files as possible from storage are cached on disk to minimize the number of requests to storage.

> [!NOTE]
> If streaming is enabled, ensure the Azure storage account is located in the same region as compute to minimize cost and latency.

articles/machine-learning/how-to-inference-onnx-automl-image-models.md

Lines changed: 1 addition & 1 deletion
@@ -1144,7 +1144,7 @@ for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))
```

-For multi-class and multi-label classification, you can follow the same steps mentioned earlier for all the supported architectures in AutoML.
+For multi-class and multi-label classification, you can follow the same steps mentioned earlier for all the supported model architectures in AutoML.
# [Object detection with Faster R-CNN or RetinaNet](#tab/object-detect-cnn)
