Commit 38bfc55

Merge pull request #103159 from nibaccam/automl-new-ui
AutoML | UI how-to and tutorial updates
2 parents fe56010 + 55136c9 commit 38bfc55

4 files changed, +47 -40 lines changed

articles/machine-learning/concept-automated-ml.md

Lines changed: 4 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -93,11 +93,12 @@ In every automated machine learning experiment, your data is automatically scale
9393

9494
### Advanced preprocessing: optional featurization
9595

96-
Additional advanced preprocessing and featurization are also available, such as data guardrails, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#preprocess). Enable this setting with:
96+
Additional advanced preprocessing and featurization are also available, such as data guardrails, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#featurization).
97+
Enable this setting with:
9798

98-
+ Azure Machine Learning studio: Selecting the **View featurization settings** in the **Configuration Run** section [with these steps](how-to-create-portal-experiments.md).
99+
+ Azure Machine Learning studio: Enable **Automatic featurization** in the **View additional configuration** section [with these steps](how-to-create-portal-experiments.md#create-and-run-experiment).
99100

100-
+ Python SDK: Specifying `"feauturization": auto' / 'off' / FeaturizationConfig` for the [`AutoMLConfig` class](/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig).
101+
+ Python SDK: Specifying `"featurization": 'auto' / 'off' / FeaturizationConfig` for the [`AutoMLConfig` class](/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig).
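As a rough illustration of the three accepted values in the Python SDK bullet above, the sketch below builds the keyword arguments one might pass on to `AutoMLConfig`. It deliberately does not import the Azure ML SDK: `CustomFeaturization` is a hypothetical stand-in for the SDK's `FeaturizationConfig` object, and `build_automl_settings` is an illustrative helper, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class CustomFeaturization:
    """Hypothetical stand-in for the SDK's FeaturizationConfig object."""
    blocked_transformers: list = field(default_factory=list)


def build_automl_settings(task, featurization="auto"):
    """Return keyword arguments for an AutoML run, validating `featurization`.

    Accepted values mirror the docs: 'auto', 'off', or a config object.
    """
    if isinstance(featurization, str):
        if featurization not in ("auto", "off"):
            raise ValueError("featurization must be 'auto', 'off', or a config object")
    elif not isinstance(featurization, CustomFeaturization):
        raise TypeError("unsupported featurization value")
    return {"task": task, "featurization": featurization}


settings = build_automl_settings("classification", featurization="auto")
print(settings)
```

Note that the customized option is an object rather than a quoted string, which is why the helper checks its type instead of comparing text.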
101102

102103
## Prevent over-fitting
103104

articles/machine-learning/how-to-configure-auto-train.md

Lines changed: 11 additions & 5 deletions
@@ -184,14 +184,20 @@ Learn about the specific definitions of these metrics in [Understand automated m
184184

185185
### Data featurization
186186

187-
In every automated machine learning experiment, your data is [automatically scaled and normalized](concept-automated-ml.md#preprocess) to help *certain* algorithms that are sensitive to features that are on different scales. However, you can also enable additional featurization, such as missing values imputation, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#preprocess).
187+
In every automated machine learning experiment, your data is [automatically scaled and normalized](concept-automated-ml.md#preprocess) to help *certain* algorithms that are sensitive to features that are on different scales. However, you can also enable additional featurization, such as missing values imputation, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#featurization).
188188

189-
To enable this featurization, specify `"featurization": 'auto'` for the [`AutoMLConfig` class](https://docs.microsoft.com/python/api/azureml-train-automl/azureml.train.automl.automlconfig?view=azure-ml-py).
189+
When configuring your experiments, you can enable the advanced setting `featurization`. The following table shows the accepted settings for featurization in the [`AutoMLConfig` class](https://docs.microsoft.com/python/api/azureml-train-automl/azureml.train.automl.automlconfig?view=azure-ml-py).
190+
191+
|Featurization Configuration | Description |
192+
| ------------- | ------------- |
193+
|`"featurization":` `'FeaturizationConfig'`| Indicates customized featurization step should be used. [Learn how to customize featurization](how-to-configure-auto-train.md#customize-feature-engineering).|
194+
|`"featurization": 'off'`| Indicates featurization step should not be done automatically.|
195+
|`"featurization": 'auto'`| Indicates that as part of preprocessing, [data guardrails and featurization steps](how-to-create-portal-experiments.md#advanced-featurization-options) are performed automatically.|
190196

191197
> [!NOTE]
192-
> Automated machine learning pre-processing steps (feature normalization, handling missing data,
198+
> Automated machine learning featurization steps (feature normalization, handling missing data,
193199
> converting text to numeric, etc.) become part of the underlying model. When using the model for
194-
> predictions, the same pre-processing steps applied during training are applied to
200+
> predictions, the same featurization steps applied during training are applied to
195201
> your input data automatically.
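The note above says that featurization steps become part of the model and are reapplied at prediction time. The toy class below sketches that idea in plain Python, under the assumption of a single numeric feature and a simple threshold rule. `ScaledThresholdModel` is a made-up name for illustration and is not how automated ML is implemented internally.

```python
from statistics import mean, pstdev


class ScaledThresholdModel:
    """Toy model that stores its normalization statistics alongside its rule."""

    def fit(self, values, labels):
        # Learn the featurization parameters from the training data only.
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0
        scaled = [self._featurize(v) for v in values]
        # Decision threshold: midpoint between the two class means (scaled).
        pos = [s for s, y in zip(scaled, labels) if y == 1]
        neg = [s for s, y in zip(scaled, labels) if y == 0]
        self.threshold_ = (mean(pos) + mean(neg)) / 2
        return self

    def _featurize(self, value):
        # The same transformation is used in fit() and predict().
        return (value - self.mean_) / self.std_

    def predict(self, values):
        # Callers pass raw inputs; scaling happens inside the model.
        return [1 if self._featurize(v) > self.threshold_ else 0 for v in values]


model = ScaledThresholdModel().fit([10, 12, 30, 34], [0, 0, 1, 1])
print(model.predict([11, 33]))  # raw, unscaled inputs
```

Because the scaling statistics live on the fitted object, callers never need to repeat the preprocessing themselves.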
196202

197203
### Time Series Forecasting
@@ -406,7 +412,7 @@ Use these 2 APIs on the first step of fitted model to understand more. See [thi
406412
|Transformations|List of transformations applied to input features to generate engineered features.|
407413

408414
### Customize feature engineering
409-
To customize feature engineering, specify `"feauturization":FeaturizationConfig`.
415+
To customize feature engineering, specify `"featurization": FeaturizationConfig`.
410416

411417
Supported customization includes:
412418

articles/machine-learning/how-to-create-portal-experiments.md

Lines changed: 19 additions & 23 deletions
@@ -10,7 +10,7 @@ ms.author: nibaccam
1010
author: tsikiksr
1111
manager: cgronlun
1212
ms.reviewer: nibaccam
13-
ms.date: 11/04/2019
13+
ms.date: 02/04/2020
1414

1515
---
1616

@@ -43,7 +43,7 @@ Otherwise, you'll see a list of your recent automated machine learning experimen
4343

4444
## Create and run experiment
4545

46-
1. Select **+ Create Experiment** and populate the form.
46+
1. Select **+ New automated ML run** and populate the form.
4747

4848
1. Select a dataset from your storage container, or create a new dataset. Datasets can be created from local files, web urls, datastores, or Azure open datasets.
4949

@@ -109,16 +109,19 @@ Otherwise, you'll see a list of your recent automated machine learning experimen
109109

110110
1. Select forecast horizon: Indicate how many time units (minutes/hours/days/weeks/months/years) the model will be able to predict into the future. The further the model is required to predict into the future, the less accurate it will become. [Learn more about forecasting and forecast horizon](how-to-auto-train-forecast.md).
111111

112-
1. (Optional) Addition configurations: additional settings you can use to better control the training job. Otherwise, defaults are applied based on experiment selection and data.
112+
1. (Optional) View additional configuration settings: additional settings you can use to better control the training job. Otherwise, defaults are applied based on experiment selection and data.
113113

114114
Additional configurations|Description
115115
------|------
116116
Primary metric| Main metric used for scoring your model. [Learn more about model metrics](how-to-configure-auto-train.md#explore-model-metrics).
117-
Automatic featurization| Select to enable or disable the preprocessing done by automated machine learning. Preprocessing includes automatic data cleansing, preparing, and transformation to generate synthetic features. [Learn more about preprocessing](#preprocess).
117+
Automatic featurization| Select to enable or disable the preprocessing done by automated machine learning. Preprocessing includes automatic data cleansing, preparing, and transformation to generate synthetic features. Not supported for the time series forecasting task type. [Learn more about preprocessing](#featurization).
118+
Explain best model | Select to show or hide explainability for the recommended best model.
118119
Blocked algorithm| Select algorithms you want to exclude from the training job.
119120
Exit criterion| When any of these criteria are met, the training job is stopped. <br> *Training job time (hours)*: How long to allow the training job to run. <br> *Metric score threshold*: Minimum metric score for all pipelines. This ensures that if you have a defined target metric you want to reach, you do not spend more time on the training job than necessary.
120121
Validation| Select one of the cross validation options to use in the training job. [Learn more about cross validation](how-to-configure-auto-train.md).
121-
Concurrency| *Max concurrent iterations*: Maximum number of pipelines (iterations) to test in the training job. The job will not run more than the specified number of iterations. <br> *Max cores per iteration*: Select the multi-core limits you would like to use when using multi-core compute.
122+
Concurrency| *Max concurrent iterations*: Maximum number of pipelines (iterations) to test in the training job. The job will not run more than the specified number of iterations.
123+
124+
1. (Optional) View featurization settings: if you choose to enable **Automatic featurization** in the **Additional configuration settings** form, this form is where you specify which columns to perform those featurizations on, and select which statistical value to use for missing value imputations.
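The exit criteria in the table above can be pictured as two independent stop conditions on a training loop. The following is a minimal sketch under that reading, assuming that reaching the metric score threshold ends the job. `run_training`, its parameters, and the candidate pipelines are hypothetical names for illustration, not studio or SDK APIs.

```python
import time


def run_training(pipelines, max_hours=None, score_threshold=None, clock=time.monotonic):
    """Evaluate pipelines in order until an exit criterion is met.

    `pipelines` is a list of (name, train_fn) pairs where train_fn() returns a score.
    Returns the (name, score) results collected before stopping.
    """
    start = clock()
    results = []
    for name, train_fn in pipelines:
        # Exit criterion 1: the training-time budget is used up.
        if max_hours is not None and (clock() - start) >= max_hours * 3600:
            break
        score = train_fn()
        results.append((name, score))
        # Exit criterion 2: a pipeline reached the target metric score.
        if score_threshold is not None and score >= score_threshold:
            break
    return results


candidates = [("a", lambda: 0.71), ("b", lambda: 0.93), ("c", lambda: 0.95)]
print(run_training(candidates, max_hours=1, score_threshold=0.9))
```

The `clock` parameter exists only so the time budget can be exercised without waiting; in normal use the default monotonic clock is enough.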
122125

123126
<a name="profile"></a>
124127

@@ -147,17 +150,13 @@ Skewness| Measure of how different this column's data is from a normal distribut
147150
Kurtosis| Measure of how heavily tailed this column's data is compared to a normal distribution.
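For readers who want to see what the skewness and kurtosis rows measure, here is a small sketch using the standard moment-based definitions. The source does not say which exact formulas the studio's data profile uses, so treat the population-moment versions and the function name `skewness_and_kurtosis` as assumptions.

```python
from statistics import mean


def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis) using population moments."""
    n = len(values)
    mu = mean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    m4 = sum((x - mu) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; moments are undefined")
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurt


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]
print(skewness_and_kurtosis(symmetric))
print(skewness_and_kurtosis(right_tailed))
```

A symmetric column gives a skewness of zero, while a column with a long right tail gives a positive value.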
148151

149152

150-
<a name="preprocess"></a>
153+
<a name="featurization"></a>
151154

152155
## Advanced featurization options
153156

154-
When configuring your experiments, you can enable the advanced setting `feauturization`.
157+
Automated machine learning offers preprocessing and data guardrails automatically, to help you identify and manage potential issues with your data.
155158

156-
|Featurization Configuration | Description |
157-
| ------------- | ------------- |
158-
|"feauturization" = 'FeaturizationConfig'| Indicates customized featurization step should be used. [Learn how to customize featurization](how-to-configure-auto-train.md#customize-feature-engineering).|
159-
|"feauturization" = 'off'| Indicates featurization step should not be done automatically.|
160-
|"feauturization" = 'auto'| Indicates that as part of preprocessing the following data guardrails and featurization steps are performed automatically.|
159+
### Preprocessing
161160

162161
|Preprocessing&nbsp;steps| Description |
163162
| ------------- | ------------- |
@@ -173,7 +172,7 @@ When configuring your experiments, you can enable the advanced setting `feauturi
173172

174173
### Data guardrails
175174

176-
Automated machine learning offers data guardrails to help you identify potential issues with your data (e.g., missing values, class imbalance) and help take corrective actions for improved results. There are many best practices that are available and can be applied to achieve reliable results.
175+
Data guardrails are applied automatically to help you identify potential issues with your data (e.g., missing values, class imbalance) and help take corrective actions for improved results. There are many best practices that are available and can be applied to achieve reliable results.
177176

178177
The following table describes the currently supported data guardrails, and the associated statuses that users may come across when submitting their experiment.
179178

@@ -187,14 +186,11 @@ Time-series data consistency|**Passed** <br><br><br><br> **Fixed** |<br> The sel
187186

188187
## Run experiment and view results
189188

190-
Select **Start** to run your experiment. The experiment preparing process can take up to 10 minutes. Training jobs can take an additional 2-3 minutes more for each pipeline to finish running.
189+
Select **Finish** to run your experiment. The experiment preparation process can take up to 10 minutes. Training jobs can take an additional 2-3 minutes for each pipeline to finish running.
191190

192191
### View experiment details
193192

194-
>[!NOTE]
195-
> Select **Refresh** periodically to view the status of the run.
196-
197-
The **Run Detail** screen opens to the **Details** tab. This screen shows you a summary of the experiment run including the **Run status**.
193+
The **Run Detail** screen opens to the **Details** tab. This screen shows you a summary of the experiment run including a status bar at the top next to the run number.
198194

199195
The **Models** tab contains a list of the models created ordered by the metric score. By default, the model that scores the highest based on the chosen metric is at the top of the list. As the training job tries out more models, they are added to the list. Use this to get a quick comparison of the metrics for the models produced so far.
200196

@@ -214,18 +210,18 @@ Automated ML helps you with deploying the model without writing code:
214210

215211
1. You have a couple of options for deployment.
216212

217-
+ Option 1: To deploy the best model (according to the metric criteria you defined), select Deploy Best Model from the Details tab.
213+
+ Option 1: To deploy the best model (according to the metric criteria you defined), select the **Deploy best model** button on the **Details** tab.
218214

219-
+ Option 2: To deploy a specific model iteration from this experiment, drill down on the model to open its Model details tab and select Deploy Model.
215+
+ Option 2: To deploy a specific model iteration from this experiment, drill down on the model to open its **Model details** tab and select **Deploy model**.
220216

221-
1. Populate the **Deploy Model** pane.
217+
1. Populate the **Deploy model** pane.
222218

223219
Field| Value
224220
----|----
225221
Name| Enter a unique name for your deployment.
226222
Description| Enter a description to better identify what this deployment is for.
227223
Compute type| Select the type of endpoint you want to deploy: *Azure Kubernetes Service (AKS)* or *Azure Container Instance (ACI)*.
228-
Name| *Applies to AKS only:* Select the name of the AKS cluster you wish to deploy to.
224+
Compute name| *Applies to AKS only:* Select the name of the AKS cluster you wish to deploy to.
229225
Enable authentication | Select to allow for token-based or key-based authentication.
230226
Use custom deployment assets| Enable this feature if you want to upload your own scoring script and environment file. [Learn more about scoring scripts](how-to-deploy-and-where.md#script).
231227

@@ -240,7 +236,7 @@ Now you have an operational web service to generate predictions! You can test th
240236

241237
## Next steps
242238

243-
* Try the end to end [tutorial for creating your first automated ML experiment with Azure Machine Learning](tutorial-first-experiment-automated-ml.md).
239+
* Try the end-to-end [tutorial for creating your first automated ML experiment with Azure Machine Learning studio](tutorial-first-experiment-automated-ml.md).
244240
* [Learn more about automated machine learning](concept-automated-ml.md) and Azure Machine Learning.
245241
* [Understand automated machine learning results](how-to-understand-automated-ml.md).
246242
* [Learn how to consume a web service](https://docs.microsoft.com/azure/machine-learning/how-to-consume-web-service).

articles/machine-learning/tutorial-first-experiment-automated-ml.md

Lines changed: 13 additions & 9 deletions
@@ -9,7 +9,7 @@ ms.topic: tutorial
99
ms.author: tzvikei
1010
author: tsikiksr
1111
ms.reviewer: nibaccam
12-
ms.date: 11/04/2019
12+
ms.date: 02/04/2020
1313

1414
# Customer intent: As a non-coding data scientist, I want to use automated machine learning techniques so that I can build a classification model.
1515
---
@@ -66,12 +66,16 @@ You complete the following experiment set-up and run steps in Azure Machine Lear
6666

6767
1. Create a new dataset by selecting **From local files** from the **+Create dataset** drop-down.
6868

69+
1. On the **Basic info** form, give your dataset a name and provide an optional description. Automated ML in Azure Machine Learning studio currently only supports tabular datasets, so the dataset type should default to Tabular.
70+
71+
1. Select **Next** on the bottom left
72+
73+
1. On the **Datastore and file selection** form, select the default datastore that was automatically set up during your workspace creation, **workspaceblobstore (Azure Blob Storage)**. This is where you'll upload your data file to make it available to your workspace.
74+
6975
1. Select **Browse**.
7076

7177
1. Choose the **bankmarketing_train.csv** file on your local computer. This is the file you downloaded as a [prerequisite](https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_train.csv).
7278

73-
1. Select **Tabular** as your dataset type.
74-
7579
1. Give your dataset a unique name and provide an optional description.
7680

7781
1. Select **Next** on the bottom left, to upload it to the default container that was automatically set up during your workspace creation.
@@ -133,18 +137,18 @@ You complete the following experiment set-up and run steps in Azure Machine Lear
133137
Blocked algorithms | Algorithms you want to exclude from the training job| None
134138
Exit criterion| If a criterion is met, the training job is stopped. |Training&nbsp;job&nbsp;time (hours): 1 <br> Metric&nbsp;score&nbsp;threshold: None
135139
Validation | Choose a cross-validation type and number of tests.|Validation type:<br>&nbsp;k-fold&nbsp;cross-validation <br> <br> Number of validations: 2
136-
Concurrency| The maximum number of parallel iterations executed and cores used per iteration| Max&nbsp;concurrent&nbsp;iterations: 5<br> Max&nbsp;cores&nbsp;per&nbsp;iteration: None
140+
Concurrency| The maximum number of parallel iterations executed| Max&nbsp;concurrent&nbsp;iterations: 5
137141

138142
Select **Save**.
139143

140-
1. Select **Finish** to run the experiment. The **Run Detail** screen opens with the **Run status** as the experiment preparation begins.
144+
1. Select **Finish** to run the experiment. The **Run Detail** screen opens with the **Run status** at the top as the experiment preparation begins.
141145

142146
>[!IMPORTANT]
143147
> Preparation of the experiment run takes **10-15 minutes**.
144148
> Once running, it takes **2-3 minutes more for each iteration**.
145149
> Select **Refresh** periodically to see the status of the run as the experiment progresses.
146150
>
147-
> In production, you'd likely walk away for a bit. But for this tutorial, we suggest you start exploring the tested algorithms on the Models tab as they complete while the others are still running.
151+
> In production, you'd likely walk away for a bit. But for this tutorial, we suggest you start exploring the tested algorithms on the **Models** tab as they complete while the others are still running.
148152
149153
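The tutorial's validation setting (k-fold cross-validation with 2 validations) splits the rows into folds and holds each fold out once. The sketch below shows one simple way to generate such splits, assuming contiguous folds with no shuffling. `k_fold_indices` is an illustrative helper, and the studio's actual splitting strategy is not described in the source.

```python
def k_fold_indices(n_rows, n_folds):
    """Yield (train_indices, validation_indices) for each fold, without shuffling."""
    if not 2 <= n_folds <= n_rows:
        raise ValueError("n_folds must be between 2 and n_rows")
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_rows, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, validation
        start += size


for train, validation in k_fold_indices(n_rows=5, n_folds=2):
    print(train, validation)
```

Every row lands in exactly one validation fold, so each candidate model is scored on data it was not trained on.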
## Explore models
150154

@@ -162,11 +166,11 @@ Automated machine learning in Azure Machine Learning studio allows you to deploy
162166

163167
For this experiment, deployment to a web service means that the financial institution now has an iterative and scalable web solution for identifying potential fixed term deposit customers.
164168

165-
Once the run is complete, navigate back to the **Run Detail** page and select the **Models** tab. Select **Refresh**.
169+
Once the run is complete, navigate back to the **Run Detail** page and select the **Models** tab.
166170

167171
In this experiment context, **VotingEnsemble** is considered the best model, based on the **AUC_weighted** metric. We deploy this model, but be advised, deployment takes about 20 minutes to complete. The deployment process entails several steps including registering the model, generating resources, and configuring them for the web service.
168172

169-
1. Select the **Deploy Best Model** button in the bottom-left corner.
173+
1. Select the **Deploy best model** button in the bottom-left corner.
170174

171175
1. Populate the **Deploy a model** pane as follows:
172176

@@ -214,7 +218,7 @@ In this automated machine learning tutorial, you used Azure Machine Learning stu
214218
> [!div class="nextstepaction"]
215219
> [Consume a web service](how-to-consume-web-service.md#consume-the-service-from-power-bi)
216220
217-
+ Learn more about [preprocessing](how-to-create-portal-experiments.md#preprocess).
221+
+ Learn more about [featurization](how-to-create-portal-experiments.md#featurization).
218222
+ Learn more about [data profiling](how-to-create-portal-experiments.md#profile).
219223
+ Learn more about [automated machine learning](concept-automated-ml.md).
220224
+ For more information on classification metrics and charts, see the [Understand automated machine learning results](how-to-understand-automated-ml.md#classification) article.
