Minor edits.

TimShererWithAquent · TimShererWithAquent · commit 6987ec3650fd · 2024-09-27T08:49:42.000-07:00
diff --git a/articles/machine-learning/how-to-configure-auto-train.md b/articles/machine-learning/how-to-configure-auto-train.md
@@ -19,7 +19,7 @@ show_latex: true
 
 [!INCLUDE [dev v2](includes/machine-learning-dev-v2.md)]
 
-In this guide, learn how to set up an automated machine learning (AutoML) training job with the [Azure Machine Learning Python SDK v2](/python/api/overview/azure/ml/intro). AutoML picks an algorithm and hyperparameters for you and generates a model ready for deployment. This guide provides details of the various options that you can use to configure automated ML experiments.
+In this article, learn how to set up an automated machine learning (AutoML) training job with the [Azure Machine Learning Python SDK v2](/python/api/overview/azure/ml/intro). Automated ML picks an algorithm and hyperparameters for you and generates a model ready for deployment. This article provides details of the various options that you can use to configure automated machine learning experiments.
 
 If you prefer a no-code experience, you can also [Set up no-code Automated ML training for tabular data with the studio UI](how-to-use-automated-ml-for-ml-models.md).
 
@@ -35,7 +35,7 @@ To use the **SDK** information, install the Azure Machine Learning [SDK v2 for P
 To install the SDK, you can either:
 
 - Create a compute instance, which already has the latest Azure Machine Learning Python SDK and is configured for ML workflows. For more information, see [Create an Azure Machine Learning compute instance](how-to-create-compute-instance.md).
-- Install the SDK on your local machine
+- Install the SDK on your local machine.
 
 # [Azure CLI](#tab/azurecli)
 
@@ -73,7 +73,7 @@ except Exception as ex:
 
 # [Azure CLI](#tab/azurecli)
 
-In the CLI, begin by logging into your Azure account. If you account is associated with multiple subscriptions, you need to [set the subscription](/cli/azure/manage-azure-subscriptions-azure-cli#change-the-active-subscription).
+In the CLI, begin by signing in to your Azure account. If your account is associated with multiple subscriptions, you need to [set the subscription](/cli/azure/manage-azure-subscriptions-azure-cli#change-the-active-subscription).
 
 ```azurecli
 az login
@@ -98,7 +98,7 @@ In order to provide training data in SDK v2, you need to upload it into the clou
 Requirements for loading data into an MLTable:
 
 - Data must be in tabular form.
-- The value to predict, target column, must be in the data.
+- The value to predict, *target column*, must be in the data.
 
 Training data must be accessible from the remote compute. Automated ML v2 (Python SDK and CLI/YAML) accepts MLTable data assets (v2). For backwards compatibility, it also supports registered Tabular Datasets from v1 through the same input dataset properties. We recommend that you use MLTable, available in v2. In this example, the data is stored at the local path, *./train_data/bank_marketing_train_data.csv*.
 
@@ -121,7 +121,7 @@ This code creates a new file, *./train_data/MLTable*, which contains the file fo
 
 # [Azure CLI](#tab/azurecli)
 
-The following YAML code is the definition of a MLTable that is placed in a local folder or a remote folder in the cloud, along with the data file, which is a *.csv* or Parquet file. In this case, write the YAML text to the local file, *./train_data/MLTable*.
+The following YAML code is the definition of an MLTable that is placed in a local folder or a remote folder in the cloud, along with the data file. The data file is a *.csv* or Parquet file. In this case, write the YAML text to the local file, *./train_data/MLTable*.
 
 ```yml
 $schema: https://azuremlschemas.azureedge.net/latest/MLTable.schema.json
@@ -144,20 +144,20 @@ For more information on MLTable, see [Working with tables in Azure Machine Learn
 
 You can specify separate *training data and validation data sets*. Training data must be provided to the `training_data` parameter in the factory function of your automated machine learning job.
 
-If you don't explicitly specify a `validation_data` or `n_cross_validation` parameter, AutoML applies default techniques to determine how validation is performed. This determination depends on the number of rows in the dataset assigned to your `training_data` parameter.
+If you don't explicitly specify a `validation_data` or `n_cross_validation` parameter, Automated ML applies default techniques to determine how validation is performed. This determination depends on the number of rows in the dataset assigned to your `training_data` parameter.
 
 | Training&nbsp;data&nbsp;size | Validation technique |
 |:---|:-----|
-| **Larger&nbsp;than&nbsp;20,000&nbsp;rows** | Train/validation data split is applied. The default is to take 10% of the initial training data set as the validation set. In turn, that validation set is used for metrics calculation. |
+| **Larger&nbsp;than&nbsp;20,000&nbsp;rows** | Training and validation data split is applied. The default is to take 10% of the initial training data set as the validation set. In turn, that validation set is used for metrics calculation. |
 | **Smaller&nbsp;than&nbsp;or&nbsp;equal&nbsp;to&nbsp;20,000&nbsp;rows** | Cross-validation approach is applied. The default number of folds depends on the number of rows. <br> **If the dataset has fewer than 1,000 rows**, ten folds are used. <br> **If the dataset has between 1,000 and 20,000 rows, inclusive**, three folds are used. |
 
 ## Compute to run experiment
 
-Automated machine learning jobs with the Python SDK v2 (or CLI v2) are currently only supported on Azure Machine Learning remote compute (cluster or compute instance). For more information about creating compute with the Python SDKv2 or CLIv2, see [Train models with Azure Machine Learning CLI, SDK, and REST API](./how-to-train-model.md).
+Automated machine learning jobs with the Python SDK v2 (or CLI v2) are currently supported only on Azure Machine Learning remote compute, either a compute cluster or a compute instance. For more information about creating compute with the Python SDK v2 or CLI v2, see [Train models with Azure Machine Learning CLI, SDK, and REST API](./how-to-train-model.md).
 
 ## Configure your experiment settings
 
-There are several options that you can use to configure your automated ML experiment. These configuration parameters are set in your task method. You can also set job training settings and [exit criteria](#exit-criteria) with the `training` and `limits` settings.
+There are several options that you can use to configure your automated machine learning experiment. These configuration parameters are set in your task method. You can also set job training settings and [exit criteria](#exit-criteria) with the `training` and `limits` settings.
 
 The following example shows the required parameters for a classification task that specifies accuracy as the [primary metric](#primary-metric) and five cross-validation folds.
 
@@ -239,9 +239,9 @@ training:
 
 ### Select your machine learning task type
 
-Before you can submit your automated machine learning job, determine the kind of machine learning problem that you want to solve. This problem determines which function your job uses and what model algorithms it applies.
+Before you can submit your Automated ML job, determine the kind of machine learning problem that you want to solve. This problem determines which function your job uses and what model algorithms it applies.
 
-AutoML supports different task types:
+Automated ML supports different task types:
 
 - Tabular data based tasks
 
@@ -309,7 +309,7 @@ To learn about the specific definitions of these metrics, see [Evaluate automate
 
 #### Metrics for classification multi-class scenarios
 
-These metrics apply for all classification scenarios, including tabular data, images or computer-vision, and NLP-Text.
+These metrics apply for all classification scenarios, including tabular data, images or computer-vision, and natural language processing text (NLP-Text).
 
 Threshold-dependent metrics, like `accuracy`, `recall_score_weighted`, `norm_macro_recall`, and `precision_score_weighted` might not optimize as well for datasets that are small, have large class skew (class imbalance), or when the expected metric value is very close to 0.0 or 1.0. In those cases, `AUC_weighted` can be a better choice for the primary metric. After automated machine learning completes, you can choose the winning model based on the metric best suited to your business needs.
 
@@ -342,7 +342,7 @@ The main difference between `r2_score` and `normalized_root_mean_squared_error`
 
 If the rank, instead of the exact value, is of interest, `spearman_correlation` can be a better choice. It measures the rank correlation between real values and predictions.
 
-AutoML doesn't currently support any primary metrics that measure *relative* difference between predictions and observations. The metrics `r2_score`, `normalized_mean_absolute_error`, and `normalized_root_mean_squared_error` are all measures of absolute difference. For example, if a prediction differs from an observation by 10 units, these metrics compute the same value if the observation is 20 units or 20,000 units. In contrast, a percentage difference, which is a relative measure, gives errors of 50% and 0.05%, respectively. To optimize for relative difference, you can run AutoML with a supported primary metric and then select the model with the best `mean_absolute_percentage_error` or `root_mean_squared_log_error`. These metrics are undefined when any observation values are zero, so they might not always be good choices.
+Automated ML doesn't currently support any primary metrics that measure *relative* difference between predictions and observations. The metrics `r2_score`, `normalized_mean_absolute_error`, and `normalized_root_mean_squared_error` are all measures of absolute difference. For example, if a prediction differs from an observation by 10 units, these metrics compute the same value if the observation is 20 units or 20,000 units. In contrast, a percentage difference, which is a relative measure, gives errors of 50% and 0.05%, respectively. To optimize for relative difference, you can run Automated ML with a supported primary metric and then select the model with the best `mean_absolute_percentage_error` or `root_mean_squared_log_error`. These metrics are undefined when any observation values are zero, so they might not always be good choices.
 
 | Metric | Example use cases |
 |:------ |:------- |
@@ -382,7 +382,7 @@ The following table shows the accepted settings for featurization.
 
 | Featurization Configuration | Description |
 |:------------- |:------------- |
-| `"mode": 'auto'` | Indicates that, as part of preprocessing, [data guardrails and featurization steps](./v1/how-to-configure-auto-features.md#featurization) are performed automatically. **Default setting**. |
+| `"mode": 'auto'` | Indicates that, as part of preprocessing, [data guardrails and featurization steps](./v1/how-to-configure-auto-features.md#featurization) are performed automatically. This value is the default setting. |
 | `"mode": 'off'` | Indicates featurization step shouldn't be done automatically. |
 | `"mode":`&nbsp;`'custom'` | Indicates customized featurization step should be used. |
 
@@ -506,23 +506,23 @@ az ml job show -n $run_id --web
 
 ### Multiple child runs on clusters
 
-AutoML experiment child runs can be performed on a cluster that is already running another experiment. However, the timing depends on how many nodes the cluster has, and if those nodes are available to run a different experiment.
+Automated ML experiment child runs can be performed on a cluster that is already running another experiment. However, the timing depends on how many nodes the cluster has, and if those nodes are available to run a different experiment.
 
-Each node in the cluster acts as an individual virtual machine (VM) that can accomplish a single training run. For automated ML, this fact means a child run. If all the nodes are busy, a new experiment is queued. If there are free nodes, the new experiment runs child runs in parallel in the available nodes or virtual machines.
+Each node in the cluster acts as an individual virtual machine (VM) that can accomplish a single training run. For Automated ML, a training run is a child run. If all the nodes are busy, a new experiment is queued. If there are free nodes, the new experiment runs child runs in parallel on the available nodes.
 
 To help manage child runs and when they can be performed, we recommend that you create a dedicated cluster per experiment, and match the number of `max_concurrent_iterations` of your experiment to the number of nodes in the cluster. This way, you use all the nodes of the cluster at the same time with the number of concurrent child runs and iterations that you want.
 
 Configure `max_concurrent_iterations` in the `limits` configuration. If it isn't configured, then by default only one concurrent child run/iteration is allowed per experiment. For a compute instance, `max_concurrent_trials` can be set to be the same as the number of cores on the compute instance virtual machine.
 
 ## Explore models and metrics
 
-AutoML offers options for you to monitor and evaluate your training results.
+Automated ML offers options for you to monitor and evaluate your training results.
 
 - For definitions and examples of the performance charts and metrics provided for each run, see [Evaluate automated machine learning experiment results](how-to-understand-automated-ml.md).
 
 - To get a featurization summary and understand what features were added to a particular model, see [Featurization transparency](./v1/how-to-configure-auto-features.md#featurization-transparency).
 
-From Azure Machine Learning UI at the model's page, you can also view the hyper-parameters used when you train a particular model and also view and customize the internal model's training code used.
+From the model's page in the Azure Machine Learning UI, you can also view the hyperparameters used to train a particular model, and view and customize the model's internal training code.
 
 ## Register and deploy models
 
@@ -533,9 +533,9 @@ After you test a model and confirm you want to use it in production, you can reg
 
 ## Use AutoML in pipelines
 
-To use AutoML in your machine learning operations workflows, you can add AutoML Job steps to your [Azure Machine Learning Pipelines](./how-to-create-component-pipeline-python.md). This approach allows you to automate your entire workflow by hooking up your data preparation scripts to AutoML. Then register and validate the resulting best model.
+To use Automated ML in your machine learning operations workflows, you can add AutoML Job steps to your [Azure Machine Learning Pipelines](./how-to-create-component-pipeline-python.md). This approach allows you to automate your entire workflow by hooking up your data preparation scripts to Automated ML. Then register and validate the resulting best model.
 
-This code is a [sample pipeline](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1h_automl_in_pipeline/automl-classification-bankmarketing-in-pipeline) with an AutoML classification component and a command component that shows the resulting output. The code references the inputs (training and validation data) and the outputs (best model) in different steps.
+This code is a [sample pipeline](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1h_automl_in_pipeline/automl-classification-bankmarketing-in-pipeline) with an Automated ML classification component and a command component that shows the resulting output. The code references the inputs (training and validation data) and the outputs (best model) in different steps.
 
 # [Python SDK](#tab/python)
 
@@ -594,7 +594,7 @@ returned_pipeline_job
 
 ```
 
-For more examples on how to include AutoML in your pipelines, see the [examples repository](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1h_automl_in_pipeline/).
+For more examples on how to include Automated ML in your pipelines, see the [examples repository](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1h_automl_in_pipeline/).
 
 # [Azure CLI](#tab/azurecli)
 
@@ -659,17 +659,17 @@ Now, you launch the pipeline run using the following command. The pipeline confi
 
 ## Use AutoML at scale: distributed training
 
-For large data scenarios, AutoML supports distributed training for a limited set of models:
+For large data scenarios, Automated ML supports distributed training for a limited set of models:
 
 | Distributed algorithm | Supported tasks | Data size limit (approximate) |
 |:--|:--|:--  |
-|[LightGBM](https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html) | Classification, regression | 1 TB |
-|[TCNForecaster](concept-automl-forecasting-deep-learning.md#introduction-to-tcnforecaster) | Forecasting | 200 GB |
+| [LightGBM](https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html) | Classification, regression | 1 TB |
+| [TCNForecaster](concept-automl-forecasting-deep-learning.md#introduction-to-tcnforecaster) | Forecasting | 200 GB |
 
 Distributed training algorithms automatically partition and distribute your data across multiple compute nodes for model training.
 
 > [!NOTE]
-> Cross-validation, ensemble models, ONNX support, and code generation are not currently supported in the distributed training mode. Also, AutoML can make choices such as restricting available featurizers and sub-sampling data used for validation, explainability, and model evaluation.
+> Cross-validation, ensemble models, ONNX support, and code generation are not currently supported in the distributed training mode. Also, Automated ML can make choices such as restricting available featurizers and sub-sampling data used for validation, explainability, and model evaluation.
 
 ### Distributed training for classification and regression
 
@@ -726,7 +726,7 @@ To learn how distributed training works for forecasting tasks, see [forecasting
 |:-- |:--|
 | training_mode | Indicates training mode; `distributed` or `non_distributed`. Defaults to `non_distributed`. |
 | enable_dnn_training | Flag to enable deep neural network models. |
-| max_concurrent_trials | This is the maximum number of trial models to train in parallel. Defaults to 1. |
+| max_concurrent_trials | This value is the maximum number of trial models to train in parallel. Defaults to 1. |
 | max_nodes | The total number of nodes to use for training. This setting must be greater than or equal to 2. For forecasting tasks, each trial model is trained using $\text{max}\left(2, \text{floor}( \text{max\_nodes} / \text{max\_concurrent\_trials}) \right)$ nodes. |
 
 The following code sample shows an example of these settings for a forecasting job: