articles/machine-learning/concept-automl-forecasting-evaluation.md (3 additions, 3 deletions)
@@ -75,7 +75,7 @@ Evaluation is the process of generating predictions on a test set held-out from
The following diagram shows a simple example with three forecasting windows:
-:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-eval-diagram.png" alt-text="Diagram demonstrating a rolling forecast on a test set.":::
+:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-evaluation-diagram.png" alt-text="Diagram demonstrating a rolling forecast on a test set.":::
The diagram illustrates three rolling evaluation parameters:
@@ -85,7 +85,7 @@ The diagram illustrates three rolling evaluation parameters:
Importantly, the context advances along with the forecasting window. This means that actual values from the test set are used to make forecasts when they fall within the current context window. The latest date of actual values used for a given forecast window is called the **origin time** of the window. The following table shows an example output from the three-window rolling forecast with a horizon of three days and a step size of one day:
-:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-eval-table.png" alt-text="Example output table from a rolling forecast.":::
+:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-evaluation-table.png" alt-text="Example output table from a rolling forecast.":::
With a table like this, we can visualize the forecasts vs. the actuals and compute desired evaluation metrics. AutoML pipelines can generate rolling forecasts on a test set with an [inference component](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines).
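As a rough sketch of the windowing just described (the dates, column handling, and variable names here are invented for illustration and aren't AutoML's output schema), a three-day horizon with a one-day step produces windows whose origin times advance through the test set:

```python
import pandas as pd

# Illustrative only: enumerate rolling-forecast windows over a small test set.
# Horizon = 3 days, step size = 1 day, three windows, as in the diagram above.
test_dates = pd.date_range("2023-01-01", periods=5, freq="D")
last_training_date = pd.Timestamp("2022-12-31")
horizon, step, n_windows = 3, 1, 3

for window in range(n_windows):
    # Origin time = latest date whose actual value is available to the forecaster.
    origin = last_training_date if window == 0 else test_dates[window * step - 1]
    forecast_dates = test_dates[window * step : window * step + horizon]
    print(f"window {window + 1}: origin={origin.date()}, forecasts={list(forecast_dates.date)}")
```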
@@ -99,7 +99,7 @@ The choice of evaluation summary or metric is usually driven by the specific bus
* Plots of observed target values vs. forecasted values to check that certain dynamics of the data are captured by the model,
* MAPE (mean absolute percentage error) between actual and forecasted values,
* RMSE (root mean squared error), possibly with a normalization, between actual and forecasted values,
-* MAE (mean absolute error), possibly with a normalization, between actual and forecasted values.
+* MAE (mean absolute error), possibly with a normalization, between actual and forecasted values.
There are many other possibilities, depending on the business scenario. You may need to create your own post-processing utilities for computing evaluation metrics from inference results or rolling forecasts. For more information on metrics, see our [regression and forecasting metrics](how-to-understand-automated-ml.md#regressionforecasting-metrics) article section.
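For example, a minimal post-processing sketch along these lines might compute MAPE, RMSE, and MAE from a forecast-versus-actual table; the `actual` and `forecast` column names are assumptions for illustration, not a documented schema:

```python
import numpy as np
import pandas as pd

def evaluation_summary(table: pd.DataFrame) -> dict:
    """Compute MAPE, RMSE, and MAE from assumed 'actual' and 'forecast' columns."""
    error = table["forecast"] - table["actual"]
    return {
        "MAPE": float(np.mean(np.abs(error / table["actual"])) * 100),  # assumes no zero actuals
        "RMSE": float(np.sqrt(np.mean(error ** 2))),
        "MAE": float(np.mean(np.abs(error))),
    }

# Made-up values purely to exercise the function.
table = pd.DataFrame({"actual": [10.0, 12.0, 9.0], "forecast": [11.0, 10.5, 9.5]})
print(evaluation_summary(table))
```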
articles/machine-learning/how-to-auto-train-forecast.md (2 additions, 2 deletions)
@@ -1043,7 +1043,7 @@ The many models training component accepts a YAML format configuration file of A
Parameter|Description
--|--
-| **partition_column_names** | Column names in the data that, when grouped, define the data partitions. Many models launches an independent training job on each partition.
+| **partition_column_names** | Column names in the data that, when grouped, define the data partitions. The many models training component launches an independent training job on each partition.
| **allow_multi_partitions** | An optional flag that allows training one model per partition when each partition contains more than one unique time series. The default value is False.
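To make the partitioning idea concrete, a rough pandas illustration (invented column names; this only mimics the concept, not the component's implementation) would group the data on the partition columns and treat each group as a separate training input:

```python
import pandas as pd

# Hypothetical sales data; 'store' and 'brand' play the role of partition_column_names.
data = pd.DataFrame({
    "store":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "brand":    ["x", "x", "y", "y", "x", "x", "y", "y"],
    "quantity": [10, 11, 4, 5, 7, 8, 3, 2],
})

partition_column_names = ["store", "brand"]
for keys, partition in data.groupby(partition_column_names):
    # Conceptually, the many models component would launch one AutoML training job per partition.
    print(f"partition {keys}: {len(partition)} rows")
```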
The following sample provides a configuration template:
@@ -1367,7 +1367,7 @@ Parameter|Description
**forecast_level** | The level of the hierarchy to retrieve forecasts for
**allocation_method** | Allocation method to use when forecasts are disaggregated. Valid values are `"proportions_of_historical_average"` and `"average_historical_proportions"`.
**max_nodes** | Number of compute nodes to use in the training job
-**max_concurrency_per_node** | Number of AutoML processes to run on each node. Hence, the total concurrency of a HTS job is `max_nodes * max_concurrency_per_node`.
+**max_concurrency_per_node** | Number of AutoML processes to run on each node. Hence, the total concurrency of an HTS job is `max_nodes * max_concurrency_per_node`.
**parallel_step_timeout_in_seconds** | Many models component timeout given in number of seconds.
**forecast_mode** | Inference mode for model evaluation. Valid values are `"recursive"` and "`rolling`". See the [model evaluation article](concept-automl-forecasting-evaluation.md) for more information.
**forecast_step** | Step size for rolling forecast. See the [model evaluation article](concept-automl-forecasting-evaluation.md) for more information.
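The two allocation methods are named after the standard top-down disaggregation proportions from hierarchical forecasting. A hedged sketch with toy numbers, assuming those standard definitions (the component's exact computation may differ), shows how they can give different splits:

```python
import numpy as np

# Toy history: rows are time periods, columns are two leaf series under one parent.
history = np.array([
    [10.0, 30.0],
    [20.0, 20.0],
    [60.0, 10.0],
])

# average_historical_proportions: average each period's share of the period total.
per_period_share = history / history.sum(axis=1, keepdims=True)
avg_hist_props = per_period_share.mean(axis=0)                           # ~[0.54, 0.46]

# proportions_of_historical_average: each series' historical mean over the total mean.
props_of_hist_avg = history.mean(axis=0) / history.mean(axis=0).sum()    # [0.6, 0.4]

parent_forecast = 100.0
print("average_historical_proportions:", parent_forecast * avg_hist_props)
print("proportions_of_historical_average:", parent_forecast * props_of_hist_avg)
```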
articles/machine-learning/how-to-configure-auto-train.md (7 additions, 7 deletions)
@@ -352,7 +352,7 @@ The recommendations are similar to those noted for regression scenarios.
### Data featurization
-In every automated ML experiment, your data is automatically transformed to numbers and vectors of numbers plus (i.e. converting text to numeric) also scaled and normalized to help *certain* algorithms that are sensitive to features that are on different scales. This data transformation, scaling and normalization is referred to as featurization.
+In every automated ML experiment, your data is automatically transformed to numbers and vectors of numbers and also scaled and normalized to help algorithms that are sensitive to features that are on different scales. These data transformations are called _featurization_.
> [!NOTE]
> Automated machine learning featurization steps (feature normalization, handling missing data, converting text to numeric, etc.) become part of the underlying model. When using the model for predictions, the same featurization steps applied during training are applied to your input data automatically.
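Conceptually, this is like bundling preprocessing and the estimator into one fitted object, so prediction reuses the transforms learned at training time. The scikit-learn sketch below is only an analogy with invented column names, not AutoML's internal implementation:

```python
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# Featurization analogue: impute and scale numeric columns, convert a text column to numeric features.
featurize = ColumnTransformer([
    ("numeric", make_pipeline(SimpleImputer(strategy="median"), StandardScaler()), ["age", "income"]),
    ("text", TfidfVectorizer(), "comment"),
])

# Because featurization is part of the fitted pipeline, calling predict() on new data
# automatically applies the same transformations that were learned during training.
model = Pipeline([("featurize", featurize), ("classifier", LogisticRegression())])
```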
@@ -482,11 +482,11 @@ az ml job show -n $run_id --web
---
-### Multiple child runs on clusters
+### Multiple child runs on a cluster
Automated ML experiment child runs can be performed on a cluster that is already running another experiment. However, the timing depends on how many nodes the cluster has, and if those nodes are available to run a different experiment.
-Each node in the cluster acts as an individual virtual machine (VM) that can accomplish a single training run; for automated ML this means a child run. If all the nodes are busy, the new experiment is queued. But if there are free nodes, the new experiment will run automated ML child runs in parallel in the available nodes/VMs.
+Each node in the cluster acts as an individual virtual machine (VM) that can accomplish a single training run; for automated ML this means a child run. If all the nodes are busy, a new experiment is queued. But if there are free nodes, the new experiment will run automated ML child runs in parallel in the available nodes/VMs.
To help manage child runs and when they can be performed, we recommend you create a dedicated cluster per experiment, and match the number of `max_concurrent_iterations` of your experiment to the number of nodes in the cluster. This way, you use all the nodes of the cluster at the same time with the number of concurrent child runs/iterations you want.
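As a hedged illustration with the v2 Python SDK (`azure-ai-ml`), assuming a four-node cluster named `cpu-cluster-4nodes` and placeholder data inputs, the concurrency limit is set to match the node count; here `max_concurrent_trials` serves as the concurrency limit referred to above as `max_concurrent_iterations`:

```python
from azure.ai.ml import Input, automl

# Placeholder names and paths; substitute your own compute target and MLTable data.
classification_job = automl.classification(
    compute="cpu-cluster-4nodes",
    experiment_name="my-concurrency-example",
    training_data=Input(type="mltable", path="./training-data"),
    target_column_name="label",
    primary_metric="accuracy",
)

# Four nodes in the cluster, so allow up to four concurrent child runs (trials).
classification_job.set_limits(max_trials=20, max_concurrent_trials=4)
```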
@@ -658,7 +658,7 @@ Property | Description
training_mode | Indicates training mode; `distributed` or `non_distributed`. Defaults to `non_distributed`.
max_nodes | The number of nodes to use for training by each AutoML trial. This setting must be greater than or equal to 4.
-The following code samples shows an example of these settings for a classification job:
+The following code sample shows an example of these settings for a classification job:
# [Python SDK](#tab/python)
@@ -698,16 +698,16 @@ limits:
### Distributed training for forecasting
-To learn how distributed training works for forecasting tasks, see our [forecasting at scale](concept-automl-forecasting-at-scale.md#distributed-dnn-training) article. To use distributed training for forecasting, you need to set set the `training_mode`, `enable_dnn_training`, `max_nodes`, and optionally the `max_concurrent_trials` properties of the job object.
+To learn how distributed training works for forecasting tasks, see our [forecasting at scale](concept-automl-forecasting-at-scale.md#distributed-dnn-training) article. To use distributed training for forecasting, you need to set the `training_mode`, `enable_dnn_training`, `max_nodes`, and optionally the `max_concurrent_trials` properties of the job object.
Property | Description
-- | --
training_mode | Indicates training mode; `distributed` or `non_distributed`. Defaults to `non_distributed`.
enable_dnn_training | Flag to enable deep neural network models.
max_concurrent_trials | This is the maximum number of trial models to train in parallel. Defaults to 1.
-max_nodes | The total number of nodes to use for training. This setting must be greater than or equal to 2. For forecasting, each trial model is trained using $\text{max}\left(2, \text{floor}( \text{max\_nodes} / \text{max\_concurrent\_trials}) \right)$ nodes.
+max_nodes | The total number of nodes to use for training. This setting must be greater than or equal to 2. For forecasting tasks, each trial model is trained using $\text{max}\left(2, \text{floor}( \text{max\_nodes} / \text{max\_concurrent\_trials}) \right)$ nodes.
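As a worked example with assumed values: for a `max_nodes` of 8 and a `max_concurrent_trials` of 3, each forecasting trial would train on $\text{max}\left(2, \text{floor}(8 / 3)\right) = 2$ nodes.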
-The following code samples shows an example of these settings for a forecasting job:
+The following code sample shows an example of these settings for a forecasting job:
articles/machine-learning/how-to-understand-automated-ml.md (3 additions, 6 deletions)
@@ -88,10 +88,7 @@ weighted_accuracy|Weighted accuracy is accuracy where each sample is weighted by
### Binary vs. multiclass classification metrics
-Automated ML automatically detects if the data is binary and also allows users to activate binary classification metrics even if the data is multiclass by specifying a `true` class. Multiclass classification metrics is reported no matter if a dataset has two classes or more than two classes. Binary classification metrics is only reported when the data is binary, or the users activate the option.
-
-> [!Note]
-> When a binary classification task is detected, we use `numpy.unique` to find the set of labels and the later label will be used as the `true` class. Since there is a sorting procedure in `numpy.unique`, the choice of `true` class will be stable.
+Automated ML automatically detects if the data is binary and also allows users to activate binary classification metrics even if the data is multiclass by specifying a `true` class. Multiclass classification metrics are reported if a dataset has two or more classes. Binary classification metrics are reported only when the data is binary.
Note, multiclass classification metrics are intended for multiclass classification. When applied to a binary dataset, these metrics don't treat any class as the `true` class, as you might expect. Metrics that are clearly meant for multiclass are suffixed with `micro`, `macro`, or `weighted`. Examples include `average_precision_score`, `f1_score`, `precision_score`, `recall_score`, and `AUC`. For example, instead of calculating recall as `tp / (tp + fn)`, the multiclass averaged recall (`micro`, `macro`, or `weighted`) averages over both classes of a binary classification dataset. This is equivalent to calculating the recall for the `true` class and the `false` class separately, and then taking the average of the two.
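A small scikit-learn sketch of that distinction (purely illustrative; AutoML computes its metrics internally):

```python
from sklearn.metrics import recall_score

# A binary dataset where the model misses one positive and no negatives.
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0]

# Binary recall looks only at the positive ("true") class: tp / (tp + fn) = 3/4.
print(recall_score(y_true, y_pred))                    # 0.75

# Macro-averaged recall averages the recall of both classes: (3/4 + 3/3) / 2.
print(recall_score(y_true, y_pred, average="macro"))   # 0.875
```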
@@ -336,13 +333,13 @@ The Azure Machine Learning Responsible AI dashboard provides a single interface
* Machine learning interpretability
* Error analysis
-While model evaluation metrics and charts are good for measuring the general quality of a model, operations such as inspecting you model’s fairness, viewing its explanations (also known as which dataset features a model used to make its predictions), inspecting its errors (what are the blindspots of the model) are essential when practicing responsible AI. That's why automated ML provides a Responsible AI dashboard to help you observe a variety of insights for your model. See how to view the Responsible AI dashboard in the [Azure Machine Learning studio.](how-to-use-automated-ml-for-ml-models.md#responsible-ai-dashboard-preview)
+While model evaluation metrics and charts are good for measuring the general quality of a model, operations such as inspecting the model’s fairness, viewing its explanations (also known as which dataset features a model used to make its predictions), inspecting its errors and potential blind spots are essential when practicing responsible AI. That's why automated ML provides a Responsible AI dashboard to help you observe a variety of insights for your model. See how to view the Responsible AI dashboard in the [Azure Machine Learning studio.](how-to-use-automated-ml-for-ml-models.md#responsible-ai-dashboard-preview)
See how you can generate this [dashboard via the UI or the SDK.](how-to-responsible-ai-insights-sdk-cli.md)
## Model explanations and feature importances
-While model evaluation metrics and charts are good for measuring the general quality of a model, inspecting which dataset features a model used to make its predictions is essential when practicing responsible AI. That's why automated ML provides a model explanations dashboard to measure and report the relative contributions of dataset features. See how to [view the explanations dashboard in the Azure Machine Learning studio](how-to-use-automated-ml-for-ml-models.md#responsible-ai-dashboard-preview).
+While model evaluation metrics and charts are good for measuring the general quality of a model, inspecting which dataset features a model uses to make predictions is essential when practicing responsible AI. That's why automated ML provides a model explanations dashboard to measure and report the relative contributions of dataset features. See how to [view the explanations dashboard in the Azure Machine Learning studio](how-to-use-automated-ml-for-ml-models.md#responsible-ai-dashboard-preview).
> [!NOTE]
> Interpretability, best model explanation, is not available for automated ML forecasting experiments that recommend the following algorithms as the best model or ensemble: