Merge pull request #103159 from nibaccam/automl-new-ui

PRMerger18 · web-flow · commit 38bfc5592106 · 2020-02-05T12:50:02.000-08:00
AutoML | UI how-to and tutorial updates
diff --git a/articles/machine-learning/concept-automated-ml.md b/articles/machine-learning/concept-automated-ml.md
@@ -93,11 +93,12 @@ In every automated machine learning experiment, your data is automatically scale
 
 ### Advanced preprocessing: optional featurization
 
-Additional advanced preprocessing and featurization are also available, such as data guardrails, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#preprocess). Enable this setting with:
+Additional advanced preprocessing and featurization are also available, such as data guardrails, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#featurization). 
+Enable this setting with:
 
-+ Azure Machine Learning studio: Selecting the **View featurization settings** in the **Configuration Run** section [with these steps](how-to-create-portal-experiments.md).
++ Azure Machine Learning studio: Enable **Automatic featurization** in the **View additional configuration** section [with these steps](how-to-create-portal-experiments.md#create-and-run-experiment).
 
-+ Python SDK: Specifying `"feauturization": auto' / 'off' / FeaturizationConfig` for the [`AutoMLConfig` class](/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig).
++ Python SDK: Specify `"featurization": 'auto' / 'off' / FeaturizationConfig` for the [`AutoMLConfig` class](/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig).
 
 ## Prevent over-fitting
 
diff --git a/articles/machine-learning/how-to-configure-auto-train.md b/articles/machine-learning/how-to-configure-auto-train.md
@@ -184,14 +184,20 @@ Learn about the specific definitions of these metrics in [Understand automated m
 
 ### Data featurization
 
-In every automated machine learning experiment, your data is [automatically scaled and normalized](concept-automated-ml.md#preprocess) to help *certain* algorithms that are sensitive to features that are on different scales.  However, you can also enable additional featurization, such as missing values imputation, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#preprocess).
+In every automated machine learning experiment, your data is [automatically scaled and normalized](concept-automated-ml.md#preprocess) to help *certain* algorithms that are sensitive to features that are on different scales.  However, you can also enable additional featurization, such as missing values imputation, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#featurization).
 
-To enable this featurization, specify `"featurization": 'auto'` for the [`AutoMLConfig` class](https://docs.microsoft.com/python/api/azureml-train-automl/azureml.train.automl.automlconfig?view=azure-ml-py).
+When configuring your experiments, you can enable the advanced setting `featurization`. The following table shows the accepted settings for featurization in the [`AutoMLConfig` class](https://docs.microsoft.com/python/api/azureml-train-automl/azureml.train.automl.automlconfig?view=azure-ml-py).
+
+|Featurization Configuration | Description |
+| ------------- | ------------- |
+|`"featurization":`&nbsp;`FeaturizationConfig`| Indicates that a customized featurization step should be used. [Learn how to customize featurization](how-to-configure-auto-train.md#customize-feature-engineering).|
+|`"featurization": 'off'`| Indicates that the featurization step should not be done automatically.|
+|`"featurization": 'auto'`| Indicates that as part of preprocessing, [data guardrails and featurization steps](how-to-create-portal-experiments.md#advanced-featurization-options) are performed automatically.|
 
 > [!NOTE]
-> Automated machine learning pre-processing steps (feature normalization, handling missing data,
+> Automated machine learning featurization steps (feature normalization, handling missing data,
 > converting text to numeric, etc.) become part of the underlying model. When using the model for
-> predictions, the same pre-processing steps applied during training are applied to
+> predictions, the same featurization steps applied during training are applied to
 > your input data automatically.
 
 ### Time Series Forecasting
@@ -406,7 +412,7 @@ Use these 2 APIs on the first step of fitted model to understand more.  See [thi
    |Transformations|List of transformations applied to input features to generate engineered features.|
    
 ### Customize feature engineering
-To customize feature engineering, specify `"feauturization":FeaturizationConfig`.
+To customize feature engineering, specify `"featurization": FeaturizationConfig`.
 
 Supported customization includes:
 
diff --git a/articles/machine-learning/how-to-create-portal-experiments.md b/articles/machine-learning/how-to-create-portal-experiments.md
@@ -10,7 +10,7 @@ ms.author: nibaccam
 author: tsikiksr
 manager: cgronlun
 ms.reviewer: nibaccam
-ms.date: 11/04/2019
+ms.date: 02/04/2020
 
 ---
 
@@ -43,7 +43,7 @@ Otherwise, you'll see a list of your recent automated machine learning experimen
 
 ## Create and run experiment
 
-1. Select **+ Create Experiment** and populate the form.
+1. Select **+ New automated ML run** and populate the form.
 
 1. Select a dataset from your storage container, or create a new dataset. Datasets can be created from local files, web urls, datastores, or Azure open datasets. 
 
@@ -109,16 +109,19 @@ Otherwise, you'll see a list of your recent automated machine learning experimen
 
         1. Select forecast horizon: Indicate how many time units (minutes/hours/days/weeks/months/years) will the model be able to predict to the future. The further the model is required to predict into the future, the less accurate it will become. [Learn more about forecasting and forecast horizon](how-to-auto-train-forecast.md).
 
-1. (Optional) Addition configurations: additional settings you can use to better control the training job. Otherwise, defaults are applied based on experiment selection and data. 
+1. (Optional) View additional configuration settings: additional settings you can use to better control the training job. Otherwise, defaults are applied based on experiment selection and data. 
 
     Additional configurations|Description
     ------|------
     Primary metric| Main metric used for scoring your model. [Learn more about model metrics](how-to-configure-auto-train.md#explore-model-metrics).
-    Automatic featurization| Select to enable or disable the preprocessing done by automated machine learning. Preprocessing includes automatic data cleansing, preparing, and transformation to generate synthetic features. [Learn more about preprocessing](#preprocess).
+    Automatic featurization| Select to enable or disable the preprocessing done by automated machine learning. Preprocessing includes automatic data cleansing, preparing, and transformation to generate synthetic features. Not supported for the time series forecasting task type. [Learn more about preprocessing](#featurization). 
+    Explain best model | Select to enable or disable showing explainability for the recommended best model.
     Blocked algorithm| Select algorithms you want to exclude from the training job.
     Exit criterion| When any of these criteria are met, the training job is stopped. <br> *Training job time (hours)*: How long to allow the training job to run. <br> *Metric score threshold*:  Minimum metric score for all pipelines. This ensures that if you have a defined target metric you want to reach, you do not spend more time on the training job than necessary.
     Validation| Select one of the cross validation options to use in the training job. [Learn more about cross validation](how-to-configure-auto-train.md).
-    Concurrency| *Max concurrent iterations*: Maximum number of pipelines (iterations) to test in the training job. The job will not run more than the specified number of iterations. <br> *Max cores per iteration*: Select the multi-core limits you would like to use when using multi-core compute.
+    Concurrency| *Max concurrent iterations*: Maximum number of pipelines (iterations) to test in the training job. The job will not run more than the specified number of iterations.
+
+1. (Optional) View featurization settings: if you enable **Automatic featurization** in the **Additional configuration settings** form, use this form to specify which columns to featurize and which statistical value to use for missing value imputation.
 
 <a name="profile"></a>
 
@@ -147,17 +150,13 @@ Skewness| Measure of how different this column's data is from a normal distribut
 Kurtosis| Measure of how heavily tailed this column's data is compared to a normal distribution.
 
 
-<a name="preprocess"></a>
+<a name="featurization"></a>
 
 ## Advanced featurization options
 
-When configuring your experiments, you can enable the advanced setting `feauturization`. 
+Automated machine learning applies preprocessing and data guardrails automatically to help you identify and manage potential issues with your data. 
 
-|Featurization Configuration | Description |
-| ------------- | ------------- |
-|"feauturization" = 'FeaturizationConfig'| Indicates customized featurization step should be used. [Learn how to customize featurization](how-to-configure-auto-train.md#customize-feature-engineering).|
-|"feauturization" = 'off'| Indicates featurization step should not be done automatically.|
-|"feauturization" = 'auto'| Indicates that as part of preprocessing the following data guardrails and featurization steps are performed automatically.|
+### Preprocessing
 
 |Preprocessing&nbsp;steps| Description |
 | ------------- | ------------- |
@@ -173,7 +172,7 @@ When configuring your experiments, you can enable the advanced setting `feauturi
 
 ### Data guardrails
 
-Automated machine learning offers data guardrails to help you identify potential issues with your data (e.g., missing values, class imbalance) and help take corrective actions for improved results. There are many best practices that are available and can be applied to achieve reliable results. 
+Data guardrails are applied automatically to help you identify potential issues with your data (e.g., missing values, class imbalance) and help take corrective actions for improved results. There are many best practices that are available and can be applied to achieve reliable results. 
 
 The following table describes the currently supported data guardrails, and the associated statuses that users may come across when submitting their experiment.
 
@@ -187,14 +186,11 @@ Time-series data consistency|**Passed** <br><br><br><br> **Fixed** |<br> The sel
 
 ## Run experiment and view results
 
-Select **Start** to run your experiment. The experiment preparing process can take up to 10 minutes. Training jobs can take an additional 2-3 minutes more for each pipeline to finish running.
+Select **Finish** to run your experiment. The experiment preparation process can take up to 10 minutes. Training jobs can take an additional 2-3 minutes for each pipeline to finish running.
 
 ### View experiment details
 
->[!NOTE]
-> Select **Refresh** periodically to view the status of the run. 
-
-The **Run Detail** screen opens to the **Details** tab. This screen shows you a summary of the experiment run including the **Run status**. 
+The **Run Detail** screen opens to the **Details** tab. This screen shows you a summary of the experiment run including a status bar at the top next to the run number. 
 
 The **Models** tab contains a list of the models created ordered by the metric score. By default, the model that scores the highest based on the chosen metric is at the top of the list. As the training job tries out more models, they are added to the list. Use this to get a quick comparison of the metrics for the models produced so far.
 
@@ -214,18 +210,18 @@ Automated ML helps you with deploying the model without writing code:
 
 1. You have a couple options for deployment. 
 
-    + Option 1: To deploy the best model (according to the metric criteria you defined), select Deploy Best Model from the Details tab.
+    + Option 1: To deploy the best model (according to the metric criteria you defined), select the **Deploy best model** button on the **Details** tab.
 
-    + Option 2: To deploy a specific model iteration from this experiment, drill down on the model to open its Model details tab and select Deploy Model.
+    + Option 2: To deploy a specific model iteration from this experiment, drill down on the model to open its **Model details** tab and select **Deploy model**.
 
-1. Populate the **Deploy Model** pane.
+1. Populate the **Deploy model** pane.
 
     Field| Value
     ----|----
     Name| Enter a unique name for your deployment.
     Description| Enter a description to better identify what this deployment is for.
     Compute type| Select the type of endpoint you want to deploy: *Azure Kubernetes Service (AKS)* or *Azure Container Instance (ACI)*.
-    Name| *Applies to AKS only:* Select the name of the AKS cluster you wish to deploy to.
+    Compute name| *Applies to AKS only:* Select the name of the AKS cluster you wish to deploy to.
     Enable authentication | Select to allow for token-based or key-based authentication.
     Use custom deployment assets| Enable this feature if you want to upload your own scoring script and environment file. [Learn more about scoring scripts](how-to-deploy-and-where.md#script).
 
@@ -240,7 +236,7 @@ Now you have an operational web service to generate predictions! You can test th
 
 ## Next steps
 
-* Try the end to end [tutorial for creating your first automated ML experiment with Azure Machine Learning](tutorial-first-experiment-automated-ml.md). 
+* Try the end-to-end [tutorial for creating your first automated ML experiment with Azure Machine Learning studio](tutorial-first-experiment-automated-ml.md). 
 * [Learn more about automated machine learning](concept-automated-ml.md) and Azure Machine Learning.
 * [Understand automated machine learning results](how-to-understand-automated-ml.md).
 * [Learn how to consume a web service](https://docs.microsoft.com/azure/machine-learning/how-to-consume-web-service).
diff --git a/articles/machine-learning/tutorial-first-experiment-automated-ml.md b/articles/machine-learning/tutorial-first-experiment-automated-ml.md
@@ -9,7 +9,7 @@ ms.topic: tutorial
 ms.author: tzvikei
 author: tsikiksr
 ms.reviewer: nibaccam
-ms.date: 11/04/2019
+ms.date: 02/04/2020
 
 # Customer intent: As a non-coding data scientist, I want to use automated machine learning techniques so that I can build a classification model.
 ---
@@ -66,12 +66,16 @@ You complete the following experiment set-up and run steps in Azure Machine Lear
 
 1. Create a new dataset by selecting **From local files** from the  **+Create dataset** drop-down. 
 
+    1. On the **Basic info** form, give your dataset a name and provide an optional description. Automated ML in Azure Machine Learning studio currently only supports tabular datasets, so the dataset type should default to Tabular.
+
+    1. Select **Next** on the bottom left.
+
+    1. On the **Datastore and file selection** form, select the default datastore that was automatically set up during your workspace creation, **workspaceblobstore (Azure Blob Storage)**. This is where you'll upload your data file to make it available to your workspace.
+
     1. Select **Browse**.
     
     1. Choose the **bankmarketing_train.csv** file on your local computer. This is the file you downloaded as a [prerequisite](https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_train.csv).
 
-    1. Select **Tabular** as your dataset type. 
-
     1. Give your dataset a unique name and provide an optional description. 
 
     1. Select **Next** on the bottom left,  to  upload it to the default container that was automatically set up during your workspace creation.  
@@ -133,18 +137,18 @@ You complete the following experiment set-up and run steps in Azure Machine Lear
         Blocked algorithms | Algorithms you want to exclude from the training job| None
         Exit criterion| If a criteria is met, the training job is stopped. |Training&nbsp;job&nbsp;time (hours): 1 <br> Metric&nbsp;score&nbsp;threshold: None
         Validation | Choose a cross-validation type and number of tests.|Validation type:<br>&nbsp;k-fold&nbsp;cross-validation <br> <br> Number of validations: 2
-        Concurrency| The maximum number of parallel iterations executed and cores used per iteration| Max&nbsp;concurrent&nbsp;iterations: 5<br> Max&nbsp;cores&nbsp;per&nbsp;iteration: None
+        Concurrency| The maximum number of parallel iterations executed| Max&nbsp;concurrent&nbsp;iterations: 5
         
         Select **Save**.
 
-1. Select **Finish** to run the experiment. The **Run Detail**  screen opens with the **Run status** as the experiment preparation begins.
+1. Select **Finish** to run the experiment. The **Run Detail**  screen opens with the **Run status** at the top as the experiment preparation begins.
 
 >[!IMPORTANT]
 > Preparation takes **10-15 minutes** to prepare the experiment run.
 > Once running, it takes **2-3 minutes more for each iteration**.  
 > Select **Refresh** periodically to see the status of the run as the experiment progresses.
 >
-> In production, you'd likely walk away for a bit. But for this tutorial, we suggest you start exploring the tested algorithms on the Models tab as they complete while the others are still running. 
+> In production, you'd likely walk away for a bit. But for this tutorial, we suggest you start exploring the tested algorithms on the **Models** tab as they complete while the others are still running. 
 
 ##  Explore models
 
@@ -162,11 +166,11 @@ Automated machine learning in Azure Machine Learning studio allows you to deploy
 
 For this experiment, deployment to a web service means that the financial institution now has an iterative and scalable web solution for identifying potential fixed term deposit customers. 
 
-Once the run is complete, navigate back to the **Run Detail** page and select the **Models** tab. Select **Refresh**. 
+Once the run is complete, navigate back to the **Run Detail** page and select the **Models** tab.
 
 In this experiment context, **VotingEnsemble** is considered the best model, based on the **AUC_weighted** metric.  We deploy this model, but be advised, deployment takes about 20 minutes to complete. The deployment process entails several steps including registering the model, generating resources, and configuring them for the web service.
 
-1. Select the **Deploy Best Model** button in the bottom-left corner.
+1. Select the **Deploy best model** button in the bottom-left corner.
 
 1. Populate the **Deploy a model** pane as follows:
 
@@ -214,7 +218,7 @@ In this automated machine learning tutorial, you used Azure Machine Learning stu
 > [!div class="nextstepaction"]
 > [Consume a web service](how-to-consume-web-service.md#consume-the-service-from-power-bi)
 
-+ Learn more about [preprocessing](how-to-create-portal-experiments.md#preprocess).
++ Learn more about [featurization](how-to-create-portal-experiments.md#featurization).
 + Learn more about [data profiling](how-to-create-portal-experiments.md#profile).
 + Learn more about [automated machine learning](concept-automated-ml.md).
 + For more information on classification metrics and charts, see the [Understand automated machine learning results](how-to-understand-automated-ml.md#classification) article.