
Commit 3f76f32

Merge pull request #175475 from nibaccam/automl-test-valid
AutoML | Test data + runs
2 parents: 10db765 + 05b3f06

7 files changed: +225 -67 lines

articles/machine-learning/concept-automated-ml.md

Lines changed: 60 additions & 51 deletions
@@ -18,6 +18,8 @@ Automated machine learning, also referred to as automated ML or AutoML, is the p
 
 Traditional machine learning model development is resource-intensive, requiring significant domain knowledge and time to produce and compare dozens of models. With automated machine learning, you can greatly reduce the time it takes to get production-ready ML models.
 
+<a name="parity"></a>
+
 ## Ways to use AutoML in Azure Machine Learning
 
 Azure Machine Learning offers the following two experiences for working with automated ML. See the following sections to understand [feature availability in each experience](#parity).
@@ -28,10 +30,6 @@ Azure Machine Learning offers the following two experiences for working with aut
 * [Tutorial: Create a classification model with automated ML in Azure Machine Learning](tutorial-first-experiment-automated-ml.md)
 * [Tutorial: Forecast demand with automated machine learning](tutorial-automated-ml-forecast.md)
 
-<a name="parity"></a>
-
-## AutoML settings and configuration
-
 ### Experiment settings
 
 The following settings allow you to configure your automated ML experiment.
@@ -65,7 +63,7 @@ These settings can be applied to the best model as a result of your automated ML
 |**Enable voting ensemble & stack ensemble models**|||
 |**Show best model based on non-primary metric**|||
 |**Enable/disable ONNX model compatibility**|||
-|**Test the model** || |
+|**Test the model** || ✓ (preview)|
 
 ### Run control settings
 
@@ -183,9 +181,65 @@ You can also inspect the logged run information, which [contains metrics](how-to
 
 While model building is automated, you can also [learn how important or relevant features are](how-to-configure-auto-train.md#explain) to the generated models.
 
-
 > [!VIDEO https://www.microsoft.com/videoplayer/embed/RE2Xc9t]
 
+<a name="local-remote"></a>
+
+## Guidance on local vs. remote managed ML compute targets
+
+The web interface for automated ML always uses a remote [compute target](concept-compute-target.md). But when you use the Python SDK, you choose either a local compute or a remote compute target for automated ML training.
+
+* **Local compute**: Training occurs on your local laptop or VM compute.
+* **Remote compute**: Training occurs on Machine Learning compute clusters.
+
+### Choose compute target
+Consider these factors when choosing your compute target:
+
+* **Choose a local compute**: If your scenario involves initial explorations or demos using small data and short training runs (that is, seconds or a couple of minutes per child run), training on your local computer might be a better choice. There is no setup time; the infrastructure resources (your PC or VM) are directly available.
+* **Choose a remote ML compute cluster**: If you train with larger datasets, as in production scenarios that create models requiring longer training runs, a remote compute cluster provides much better end-to-end time performance because `AutoML` parallelizes training runs across the cluster's nodes. On a remote compute, the start-up time for the internal infrastructure adds around 1.5 minutes per child run, plus additional minutes for the cluster infrastructure if the VMs aren't yet up and running. (See the sketch after this list.)
+
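For illustration, here's a minimal sketch of how this choice surfaces in the SDK. It assumes a workspace config file on disk, a placeholder `TabularDataset` named `train_data`, and an existing compute cluster named `cpu-cluster` (both names are placeholders):

```python
from azureml.core import Workspace
from azureml.core.compute import ComputeTarget
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()

# Local compute: omit compute_target, and training runs on the machine
# where the experiment is submitted.
local_config = AutoMLConfig(task='classification',
                            training_data=train_data,
                            label_column_name='label')

# Remote compute: point compute_target at an existing Machine Learning
# compute cluster so AutoML can parallelize child runs across its nodes.
cpu_cluster = ComputeTarget(workspace=ws, name='cpu-cluster')
remote_config = AutoMLConfig(task='classification',
                             training_data=train_data,
                             label_column_name='label',
                             compute_target=cpu_cluster)
```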
+### Pros and cons
+Consider these pros and cons when choosing to use local vs. remote.
+
+| | Pros (Advantages) | Cons (Handicaps) |
+|---------|---------|---------|
+|**Local compute target** | <li> No environment start-up time | <li> Subset of features <li> Can't parallelize runs <li> Worse for large data <li> No data streaming while training <li> No DNN-based featurization <li> Python SDK only |
+|**Remote ML compute clusters**| <li> Full set of features <li> Parallelize child runs <li> Large data support <li> DNN-based featurization <li> Dynamic scalability of compute cluster on demand <li> No-code experience (web UI) also available | <li> Start-up time for cluster nodes <li> Start-up time for each child run |
+
+### Feature availability
+
+More features are available when you use the remote compute, as shown in the table below.
+
+| Feature | Remote | Local |
+|------------------------------------------------------------|--------|-------|
+| Data streaming (large data support, up to 100 GB) | ✓ | |
+| DNN-BERT-based text featurization and training | ✓ | |
+| Out-of-the-box GPU support (training and inference) | ✓ | |
+| Image classification and labeling support | ✓ | |
+| Auto-ARIMA, Prophet, and ForecastTCN models for forecasting | ✓ | |
+| Multiple runs/iterations in parallel | ✓ | |
+| Create models with interpretability in AutoML studio web experience UI | ✓ | |
+| Feature engineering customization in studio web experience UI | ✓ | |
+| Azure ML hyperparameter tuning | ✓ | |
+| Azure ML Pipeline workflow support | ✓ | |
+| Continue a run | ✓ | |
+| Forecasting | ✓ | ✓ |
+| Create and run experiments in notebooks | ✓ | ✓ |
+| Register and visualize experiment's info and metrics in UI | ✓ | ✓ |
+| Data guardrails | ✓ | ✓ |
+
+## Training, validation, and test data
+
+With automated ML, you provide the **training data** to train ML models, and you can specify what type of model validation to perform. Automated ML performs model validation as part of training. That is, automated ML uses **validation data** to tune model hyperparameters, based on the applied algorithm, to find the combination that best fits the training data. However, the same validation data is used for each iteration of tuning, which introduces model evaluation bias since the model continues to improve and fit to the validation data.
+
+To help confirm that such bias isn't applied to the final recommended model, automated ML supports the use of **test data** to evaluate the final model that automated ML recommends at the end of your experiment. When you provide test data as part of your AutoML experiment configuration, this recommended model is tested by default at the end of your experiment (preview).
+
+>[!IMPORTANT]
+> Testing your models with a test dataset to evaluate generated models is a preview feature. This capability is an [experimental](/python/api/overview/azure/ml/#stable-vs-experimental) preview feature, and may change at any time.
+
+Learn how to [configure AutoML experiments to use test data (preview) with the SDK](how-to-configure-cross-validation-data-splits.md#provide-test-data-preview) or with the [Azure Machine Learning studio](how-to-use-automated-ml-for-ml-models.md#create-and-run-experiment).
+
+You can also [test any existing automated ML model (preview)](how-to-configure-auto-train.md#test-existing-automated-ml-model), including models from child runs, by providing your own test data or by setting aside a portion of your training data.
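As a minimal sketch of that second option (a preview API; `train_data` is a placeholder `TabularDataset`), setting aside a portion of the training data as a test set looks like this in the SDK:

```python
from azureml.train.automl import AutoMLConfig

# test_size holds out a fraction of the training data (here 20%) that
# automated ML uses to test the recommended model at the end of the
# experiment (preview).
automl_config = AutoMLConfig(task='regression',
                             training_data=train_data,
                             label_column_name='label',
                             test_size=0.2)
```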
 
 ## Feature engineering
 
@@ -234,51 +288,6 @@ The [Caruana ensemble selection algorithm](http://www.niculescu-mizil.org/papers
 
 See the [how-to](how-to-configure-auto-train.md#ensemble) for changing default ensemble settings in automated machine learning.
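For example, a minimal sketch of overriding those defaults in the SDK. The flags are `AutoMLConfig` parameters; `train_data` is a placeholder:

```python
from azureml.train.automl import AutoMLConfig

# Ensembling is on by default; either ensemble type can be disabled explicitly.
automl_config = AutoMLConfig(task='classification',
                             training_data=train_data,
                             label_column_name='label',
                             enable_voting_ensemble=False,
                             enable_stack_ensemble=False)
```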

-## <a name="local-remote"></a>Guidance on local vs. remote managed ML compute targets
-
-The web interface for automated ML always uses a remote [compute target](concept-compute-target.md). But when you use the Python SDK, you will choose either a local compute or a remote compute target for automated ML training.
-
-* **Local compute**: Training occurs on your local laptop or VM compute.
-* **Remote compute**: Training occurs on Machine Learning compute clusters.
-
-### Choose compute target
-Consider these factors when choosing your compute target:
-
-* **Choose a local compute**: If your scenario is about initial explorations or demos using small data and short trains (i.e. seconds or a couple of minutes per child run), training on your local computer might be a better choice. There is no setup time, the infrastructure resources (your PC or VM) are directly available.
-* **Choose a remote ML compute cluster**: If you are training with larger datasets like in production training creating models which need longer trains, remote compute will provide much better end-to-end time performance because `AutoML` will parallelize trains across the cluster's nodes. On a remote compute, the start-up time for the internal infrastructure will add around 1.5 minutes per child run, plus additional minutes for the cluster infrastructure if the VMs are not yet up and running.
-
-### Pros and cons
-Consider these pros and cons when choosing to use local vs. remote.
-
-| | Pros (Advantages) |Cons (Handicaps) |
-|---------|---------|---------|
-|**Local compute target** | <li> No environment start-up time | <li> Subset of features<li> Can't parallelize runs <li> Worse for large data. <li>No data streaming while training <li> No DNN-based featurization <li> Python SDK only |
-|**Remote ML compute clusters**| <li> Full set of features <li> Parallelize child runs <li> Large data support<li> DNN-based featurization <li> Dynamic scalability of compute cluster on demand <li> No-code experience (web UI) also available | <li> Start-up time for cluster nodes <li> Start-up time for each child run |
-
-### Feature availability
-
-More features are available when you use the remote compute, as shown in the table below.
-
-| Feature | Remote | Local |
-|------------------------------------------------------------|--------|-------|
-| Data streaming (Large data support, up to 100 GB) | ✓ | |
-| DNN-BERT-based text featurization and training | ✓ | |
-| Out-of-the-box GPU support (training and inference) | ✓ | |
-| Image classification (preview) and labeling support | ✓ | |
-| Auto-ARIMA, Prophet and ForecastTCN models for forecasting | ✓ | |
-| Multiple runs/iterations in parallel | ✓ | |
-| Create models with interpretability in AutoML studio web experience UI | ✓ | |
-| Feature engineering customization in studio web experience UI | ✓ | |
-| Azure ML hyperparameter tuning | ✓ | |
-| Azure ML Pipeline workflow support | ✓ | |
-| Continue a run | ✓ | |
-| Forecasting | ✓ | ✓ |
-| Computer vision (preview) | ✓ | |
-| Create and run experiments in notebooks | ✓ | ✓ |
-| Register and visualize experiment's info and metrics in UI | ✓ | ✓ |
-| Data guardrails | ✓ | ✓ |
-
-
 <a name="use-with-onnx"></a>
 
 ## AutoML & ONNX

articles/machine-learning/how-to-configure-auto-train.md

Lines changed: 62 additions & 4 deletions
@@ -8,7 +8,7 @@ ms.reviewer: nibaccam
 services: machine-learning
 ms.service: machine-learning
 ms.subservice: automl
-ms.date: 10/21/2021
+ms.date: 11/15/2021
 ms.topic: how-to
 ms.custom: devx-track-python,contperf-fy21q1, automl, contperf-fy21q4, FY21Q4-aml-seo-hack, contperf-fy22q1
 ---
@@ -86,7 +86,7 @@ dataset = Dataset.Tabular.from_delimited_files(data)
 
 ## Training, validation, and test data
 
-You can specify separate **training data and validation data sets** directly in the `AutoMLConfig` constructor. Learn more about [how to configure data splits and cross validation](how-to-configure-cross-validation-data-splits.md) for your AutoML experiments.
+You can specify separate **training data and validation data sets** directly in the `AutoMLConfig` constructor. Learn more about [how to configure training, validation, cross validation, and test data](how-to-configure-cross-validation-data-splits.md) for your AutoML experiments.
 
 If you do not explicitly specify a `validation_data` or `n_cross_validations` parameter, automated ML applies default techniques to determine how validation is performed. This determination depends on the number of rows in the dataset assigned to your `training_data` parameter.
 
@@ -95,7 +95,15 @@ If you do not explicitly specify a `validation_data` or `n_cross_validation` par
 |**Larger&nbsp;than&nbsp;20,000&nbsp;rows**| Train/validation data split is applied. The default is to take 10% of the initial training data set as the validation set. In turn, that validation set is used for metrics calculation.
 |**Smaller&nbsp;than&nbsp;20,000&nbsp;rows**| Cross-validation approach is applied. The default number of folds depends on the number of rows. <br> **If the dataset is less than 1,000 rows**, 10 folds are used. <br> **If the rows are between 1,000 and 20,000**, then three folds are used.
 
-At this time, you need to provide your own **test data** for model evaluation. For a code example of bringing your own test data for model evaluation see the **Test** section of [this Jupyter notebook](https://github.com/Azure/azureml-examples/blob/main/python-sdk/tutorials/automl-with-azureml/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb).
+
+> [!TIP]
+> You can upload **test data (preview)** to evaluate models that automated ML generated for you. These features are [experimental](/python/api/overview/azure/ml/#stable-vs-experimental) preview capabilities, and may change at any time.
+> Learn how to:
+> * [Pass in test data to your AutoMLConfig object](how-to-configure-cross-validation-data-splits.md#provide-test-data-preview).
+> * [Test the models automated ML generated for your experiment](#test-models-preview).
+>
+> If you prefer a no-code experience, see [step 11 in Set up AutoML with the studio UI](how-to-use-automated-ml-for-ml-models.md#create-and-run-experiment).
+
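As an illustration of these options, here's a minimal sketch that sets validation explicitly instead of relying on the defaults above. It assumes placeholder `TabularDataset`s named `train_data` and `valid_data`, each with a `label` column:

```python
from azureml.train.automl import AutoMLConfig

# Explicit cross-validation: use five folds rather than the row-count defaults.
automl_config = AutoMLConfig(task='classification',
                             training_data=train_data,
                             label_column_name='label',
                             n_cross_validations=5)

# Or supply a dedicated validation set instead:
automl_config = AutoMLConfig(task='classification',
                             training_data=train_data,
                             validation_data=valid_data,
                             label_column_name='label')
```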

 ### Large data
 
@@ -509,9 +517,59 @@ RunDetails(run).show()
 
 ![Jupyter notebook widget for Automated Machine Learning](./media/how-to-configure-auto-train/azure-machine-learning-auto-ml-widget.png)
 
+## Test models (preview)
+
+>[!IMPORTANT]
+> Testing your models with a test dataset to evaluate automated ML generated models is a preview feature. This capability is an [experimental](/python/api/overview/azure/ml/#stable-vs-experimental) preview feature, and may change at any time.
+
+Passing the `test_data` or `test_size` parameter into the `AutoMLConfig` automatically triggers a remote test run that uses the provided test data to evaluate the best model that automated ML recommends. This remote test run occurs at the end of the experiment, once the best model is determined. See how to [pass test data into your `AutoMLConfig`](how-to-configure-cross-validation-data-splits.md#provide-test-data-preview).
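For example, a minimal sketch (a preview API; `train_data`, `test_dataset`, and `cpu_cluster` are placeholders):

```python
from azureml.train.automl import AutoMLConfig

# Supplying test_data (or test_size) schedules the remote test run that
# evaluates the recommended model once the experiment completes (preview).
automl_config = AutoMLConfig(task='classification',
                             training_data=train_data,
                             test_data=test_dataset,
                             label_column_name='label',
                             compute_target=cpu_cluster)
```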
+
+### Get test run results
+
+You can get the predictions and metrics from the remote test run from the [Azure Machine Learning studio](how-to-use-automated-ml-for-ml-models.md#view-remote-test-run-results-preview) or with the following code.
+
+```python
+import pandas as pd
+from azureml.core import Dataset
+
+# Assumes an authenticated Workspace named `workspace` and a completed
+# AutoMLRun named `remote_run`, as created earlier in this article.
+best_run, fitted_model = remote_run.get_output()
+test_run = next(best_run.get_children(type='automl.model_test'))
+test_run.wait_for_completion(show_output=False, wait_post_processing=True)
+
+# Get test metrics
+test_run_metrics = test_run.get_metrics()
+for name, value in test_run_metrics.items():
+    print(f"{name}: {value}")
+
+# Get test predictions as a Dataset
+test_run_details = test_run.get_details()
+dataset_id = test_run_details['outputDatasets'][0]['identifier']['savedId']
+test_run_predictions = Dataset.get_by_id(workspace, dataset_id)
+predictions_df = test_run_predictions.to_pandas_dataframe()
+
+# Alternatively, the test predictions can be retrieved via the run outputs.
+test_run.download_file("predictions/predictions.csv")
+predictions_df = pd.read_csv("predictions.csv")
+```
+
+### Test existing automated ML model
+
+To test other existing automated ML models (the best run or a child run), use [`ModelProxy()`](/python/api/azureml-train-automl-client/azureml.train.automl.model_proxy.modelproxy) to test a model after the main AutoML run has completed. `ModelProxy()` returns the predictions and metrics directly and does not require further processing to retrieve the outputs.
+
+> [!NOTE]
+> ModelProxy is an [experimental](/python/api/overview/azure/ml/#stable-vs-experimental) preview class, and may change at any time.
+
+The following code demonstrates how to test a model from any run by using the [ModelProxy.test()](/python/api/azureml-train-automl-client/azureml.train.automl.model_proxy.modelproxy#test-test-data--azureml-data-abstract-dataset-abstractdataset--include-predictions-only--bool---false-----typing-tuple-azureml-data-abstract-dataset-abstractdataset--typing-dict-str--typing-any--) method. In the test() method, use the `include_predictions_only` parameter to specify whether you want only the predictions of the test run.
+
+```python
+from azureml.train.automl.model_proxy import ModelProxy
+
+model_proxy = ModelProxy(child_run=my_run, compute_target=cpu_cluster)
+predictions, metrics = model_proxy.test(test_data, include_predictions_only=True)
+```
 ## Register and deploy models
 
-You can register a model, so you can come back to it for later use.
+After you test a model and confirm you want to use it in production, you can register it for later use.
 
 To register a model from an automated ML run, use the [`register_model()`](/python/api/azureml-train-automl-client/azureml.train.automl.run.automlrun#register-model-model-name-none--description-none--tags-none--iteration-none--metric-none-) method.
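For example, a minimal sketch (`remote_run` is the completed AutoMLRun from earlier; the model name and description are placeholders):

```python
# Register the best model from the run for later deployment.
model = remote_run.register_model(model_name='automl-best-model',
                                  description='AutoML model evaluated on held-out test data')
print(model.name, model.version)
```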
