
Commit a2e8acd

Merge pull request #103931 from likebupt/docsbugbash-update
fix some typo or description errors
2 parents: 7132b06 + 2393b60

12 files changed (+51, -62 lines)

articles/machine-learning/algorithm-module-reference/apply-transformation.md

Lines changed: 5 additions & 5 deletions

@@ -7,9 +7,9 @@ ms.service: machine-learning
 ms.subservice: core
 ms.topic: reference

- author: xiaoharper
- ms.author: zhanxia
- ms.date: 10/22/2019
+ author: likebupt
+ ms.author: keli19
+ ms.date: 02/11/2020
 ---

 # Apply Transformation module
@@ -28,9 +28,9 @@ Azure Machine Learning provides support for creating and then applying many diff

 ## How to use Apply Transformation

- 1. Add the **Apply Transformation** module to your pipeline. You can find this module under **Machine Learning**, in the **Score** category.
+ 1. Add the **Apply Transformation** module to your pipeline. You can find this module in the **Model Scoring & Evaluation** category.

- 2. Locate an existing transformation to use as an input. Previously saved transformations can be found in the **Transforms** group in the left navigation pane.
+ 2. Locate an existing transformation to use as an input. Previously saved transformations can be found in the **My Datasets** group under **Datasets** category in the left module tree.
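
The pattern this module implements — fit a transformation once, then reuse it unchanged on new data — is easiest to see in code. A minimal sketch using scikit-learn's fit/transform convention (an assumption for illustration; the designer module itself is configured in the studio UI, not in code):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Fit the transformation (here, standardization) on the training data once.
X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
scaler = StandardScaler().fit(X_train)

# Apply the already-fitted transformation to new data without refitting,
# which is what connecting a saved transformation to Apply Transformation does.
X_new = np.array([[1.5, 250.0]])
print(scaler.transform(X_new))
```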

articles/machine-learning/algorithm-module-reference/clean-missing-data.md

Lines changed: 10 additions & 13 deletions

@@ -7,9 +7,9 @@ ms.service: machine-learning
 ms.subservice: core
 ms.topic: reference

- author: xiaoharper
- ms.author: zhanxia
- ms.date: 10/22/2019
+ author: likebupt
+ ms.author: keli19
+ ms.date: 02/11/2020
 ---

 # Clean Missing Data module
@@ -20,7 +20,7 @@ Use this module to remove, replace, or infer missing values.

 Data scientists often check data for missing values and then perform various operations to fix the data or insert new values. The goal of such cleaning operations is to prevent problems caused by missing data that can arise when training a model.

- This module supports multiple type of operations for "cleaning" missing values, including:
+ This module supports multiple types of operations for "cleaning" missing values, including:

 + Replacing missing values with a placeholder, mean, or other value
 + Completely removing rows and columns that have missing values
@@ -33,11 +33,11 @@ This module also outputs a definition of the transformation used to clean the mi

 ## How to use Clean Missing Data

- This module lets you define a cleaning operation. You can also save the cleaning operation so that you can apply it later to new data. See the following links for a description of how to create and save a cleaning process:
+ This module lets you define a cleaning operation. You can also save the cleaning operation so that you can apply it later to new data. See the following sections of how to create and save a cleaning process:

- + To replace missing values
+ + [To replace missing values](#replace-missing-values)

- + To apply a cleaning transformation to new data
+ + [To apply a cleaning transformation to new data](#apply-a-saved-cleaning-operation-to-new-data)

 > [!IMPORTANT]
 > The cleaning method that you use for handling missing values can dramatically affect your results. We recommend that you experiment with different methods. Consider both the justification for use of a particular method, and the quality of the results.
@@ -52,12 +52,9 @@ Each time that you apply the [Clean Missing Data](./clean-missing-data.md) modu

 For example, to check for missing values in all numeric columns:

- 1. Open the Column Selector, and select **WITH RULES**.
- 2. For **BEGIN WITH**, select **NO COLUMNS**.
+ 1. Select the **Clean Missing Data** module, and click on **Edit column** in the right panel of the module.

-    You can also start with ALL COLUMNS and then exclude columns. Initially, rules are not shown if you first click **ALL COLUMNS**, but you can click **NO COLUMNS** and then click **ALL COLUMNS** again to start with all columns and then filter out (exclude) columns based on the name, data type, or columns index.
-
- 3. For **Include**, select **Column type** from the dropdown list, and then select **Numeric**, or a more specific numeric type.
+ 3. For **Include**, select **Column types** from the dropdown list, and then select **Numeric**.

 Any cleaning or replacement method that you choose must be applicable to **all** columns in the selection. If the data in any column is incompatible with the specified operation, the module returns an error and stops the pipeline.

@@ -105,7 +102,7 @@ Each time that you apply the [Clean Missing Data](./clean-missing-data.md) modu

 6. The option **Replacement value** is available if you have selected the option, **Custom substitution value**. Type a new value to use as the replacement value for all missing values in the column.

-    Note that you can use this option only in columns that have the Integer, Double, Boolean, or Date data types. For date columns, the replacement value can also be entered as the number of 100-nanosecond ticks since 1/1/0001 12:00 A.M.
+    Note that you can use this option only in columns that have the Integer, Double, Boolean, or String.

 7. **Generate missing value indicator column**: Select this option if you want to output some indication of whether the values in the column met the criteria for missing value cleaning. This option is particularly useful when you are setting up a new cleaning operation and want to make sure it works as designed.
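
The replacement options described in this diff — substitute a custom value or the column mean, optionally emitting an indicator column — map onto familiar imputation code. A rough scikit-learn sketch of the same idea, not the module's own implementation:

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 7.0],
              [np.nan, 5.0],
              [3.0, np.nan]])

# Replace missing values with the column mean; add_indicator=True appends
# 0/1 columns, similar to "Generate missing value indicator column".
imputer = SimpleImputer(strategy="mean", add_indicator=True)
print(imputer.fit_transform(X))
```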

articles/machine-learning/algorithm-module-reference/cross-validate-model.md

Lines changed: 3 additions & 5 deletions

@@ -9,7 +9,7 @@ ms.topic: reference

 author: likebupt
 ms.author: keli19
- ms.date: 10/10/2019
+ ms.date: 02/11/2020
 ---
 # Cross Validate Model

@@ -57,22 +57,20 @@ In this scenario, you both train and test the model by using Cross Validate Mode

 2. Connect the output of any classification or regression model.

-    For example, if you're using **Two Class Bayes Point Machine** for classification, configure the model with the parameters that you want. Then, drag a connector from the **Untrained model** port of the classifier to the matching port of Cross Validate Model.
+    For example, if you're using **Two Class Boosted Decision Tree** for classification, configure the model with the parameters that you want. Then, drag a connector from the **Untrained model** port of the classifier to the matching port of Cross Validate Model.

    > [!TIP]
    > You don't have to train the model, because Cross-Validate Model automatically trains the model as part of evaluation.
 3. On the **Dataset** port of Cross Validate Model, connect any labeled training dataset.

- 4. In the **Properties** pane of Cross Validate Model, select **Launch column selector**. Choose the single column that contains the class label, or the predictable value.
+ 4. In the right panel of Cross Validate Model, click **Edit column**. Select the single column that contains the class label, or the predictable value.

 5. Set a value for the **Random seed** parameter if you want to repeat the results of cross-validation across successive runs on the same data.

 6. Run the pipeline.

 7. See the [Results](#results) section for a description of the reports.

-    To get a copy of the model for reuse later, switch to the **Outputs** tab in the right panel of the module that contains the algorithm (for example, the **Two Class Bayes Point Machine**). Then select the **Register dataset** icon to save a copy of the trained model in the module tree.
-
 ## Results

 After all iterations are complete, Cross Validate Model creates scores for the entire dataset. It also creates performance metrics that you can use to assess the quality of the model.
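
Outside the designer, the same workflow — hand an untrained model and a labeled dataset to a cross-validation routine and read back per-fold metrics — looks roughly like this in scikit-learn (an illustrative stand-in, with GradientBoostingClassifier playing the role of Two Class Boosted Decision Tree):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_validate

X, y = load_breast_cancer(return_X_y=True)

# The model is passed in untrained; cross_validate fits it on each fold,
# just as Cross Validate Model trains the connected model during evaluation.
model = GradientBoostingClassifier(random_state=42)  # fixed seed for repeatable results
scores = cross_validate(model, X, y, cv=10, scoring=["accuracy", "roc_auc"])
print(scores["test_accuracy"].mean(), scores["test_roc_auc"].mean())
```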

articles/machine-learning/algorithm-module-reference/edit-metadata.md

Lines changed: 5 additions & 5 deletions

@@ -7,9 +7,9 @@ ms.service: machine-learning
 ms.subservice: core
 ms.topic: reference

- author: xiaoharper
- ms.author: zhanxia
- ms.date: 10/22/2019
+ author: likebupt
+ ms.author: keli19
+ ms.date: 02/11/2020
 ---

 # Edit Metadata module
@@ -35,9 +35,9 @@ Typical metadata changes might include:

 ## Configure Edit Metadata

- 1. In Azure Machine Learning, add the Edit Metadata module to your pipeline and connect the dataset you want to update. You can find the dataset under **Data Transformation** in the **Manipulate** category.
+ 1. In Azure Machine Learning designer, add the Edit Metadata module to your pipeline and connect the dataset you want to update. You can find the module in the **Data Transformation** category.

- 1. Select **Launch the column selector** and choose the column or set of columns to work with. You can choose columns individually by name or index, or you can choose a group of columns by type.
+ 1. Click **Edit column** in the right panel of the module and choose the column or set of columns to work with. You can choose columns individually by name or index, or you can choose a group of columns by type.

 1. Select the **Data type** option if you need to assign a different data type to the selected columns. You might need to change the data type for certain operations. For example, if your source dataset has numbers handled as text, you must change them to a numeric data type before using math operations.
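
The "numbers handled as text" case in that last step is the classic motivation for changing a column's data type. Outside the designer, the equivalent fix in pandas (hypothetical column name) is a one-liner:

```python
import pandas as pd

df = pd.DataFrame({"price": ["13495", "16500", "18920"]})  # numbers stored as text
print(df["price"].dtype)   # object

# Convert to a numeric type before doing math on the column,
# analogous to setting "Data type" in Edit Metadata.
df["price"] = pd.to_numeric(df["price"])
print(df["price"].mean())  # 16305.0
```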

articles/machine-learning/algorithm-module-reference/evaluate-model.md

Lines changed: 8 additions & 8 deletions

@@ -7,9 +7,9 @@ ms.service: machine-learning
 ms.subservice: core
 ms.topic: reference

- author: xiaoharper
- ms.author: zhanxia
- ms.date: 11/19/2019
+ author: likebupt
+ ms.author: keli19
+ ms.date: 02/11/2020
 ---

 # Evaluate Model module
@@ -75,10 +75,10 @@ Because this is a clustering model, the evaluation results are different than if

 This section describes the metrics returned for the specific types of models supported for use with **Evaluate Model**:

- + [classification models](#bkmk_classification)
- + [regression models](#bkmk_regression)
+ + [classification models](#metrics-for-classification-models)
+ + [regression models](#metrics-for-regression-models)

- ### <a name="bkmk_classification"></a> Metrics for classification models
+ ### Metrics for classification models

 The following metrics are reported when evaluating classification models. If you compare models, they are ranked by the metric you select for evaluation.

@@ -96,7 +96,7 @@ The following metrics are reported when evaluating classification models. If you

 - **Training log loss** is a single score that represents the advantage of the classifier over a random prediction. The log loss measures the uncertainty of your model by comparing the probabilities it outputs to the known values (ground truth) in the labels. You want to minimize log loss for the model as a whole.

- ## <a name="bkmk_regression"></a> Metrics for regression models
+ ### Metrics for regression models

 The metrics returned for regression models are designed to estimate the amount of error. A model is considered to fit the data well if the difference between observed and predicted values is small. However, looking at the pattern of the residuals (the difference between any one predicted point and its corresponding actual value) can tell you a lot about potential bias in the model.

@@ -110,7 +110,7 @@ The metrics returned for regression models are designed to estimate the amount o

 - **Relative squared error (RSE)** similarly normalizes the total squared error of the predicted values by dividing by the total squared error of the actual values.

- - **Mean Zero One Error (MZOE)** indicates whether the prediction was correct or not. In other words: `ZeroOneLoss(x,y) = 1` when `x!=y`; otherwise `0`.
+

 - **Coefficient of determination**, often referred to as R<sup>2</sup>, represents the predictive power of the model as a value between 0 and 1. Zero means the model is random (explains nothing); 1 means there is a perfect fit. However, caution should be used in interpreting R<sup>2</sup> values, as low values can be entirely normal and high values can be suspect.

articles/machine-learning/algorithm-module-reference/score-model.md

Lines changed: 4 additions & 4 deletions

@@ -7,9 +7,9 @@ ms.service: machine-learning
 ms.subservice: core
 ms.topic: reference

- author: xiaoharper
- ms.author: zhanxia
- ms.date: 10/22/2019
+ author: likebupt
+ ms.author: keli19
+ ms.date: 02/11/2020
 ---

 # Score Model module
@@ -39,7 +39,7 @@ The score, or predicted value, can be in many different formats, depending on th

 - For classification models, [Score Model](./score-model.md) outputs a predicted value for the class, as well as the probability of the predicted value.
 - For regression models, [Score Model](./score-model.md) generates just the predicted numeric value.
- - For image classification models, the score might be the class of object in the image, or a Boolean indicating whether a particular feature was found.
+

 ## Publish scores as a web service
articles/machine-learning/algorithm-module-reference/train-model.md

Lines changed: 5 additions & 5 deletions

@@ -7,9 +7,9 @@ ms.service: machine-learning
 ms.subservice: core
 ms.topic: reference

- author: xiaoharper
- ms.author: zhanxia
- ms.date: 10/22/2019
+ author: likebupt
+ ms.author: keli19
+ ms.date: 02/11/2020
 ---

 # Train Model module
@@ -34,7 +34,7 @@ In Azure Machine Learning, creating and using a machine learning model is typica

 3. After training is completed, use the trained model with one of the [scoring modules](./score-model.md), to make predictions on new data.

- ## How to use **Train Model**
+ ## How to use Train Model

 1. In Azure Machine Learning, configure a classification model or regression model.

@@ -44,7 +44,7 @@ In Azure Machine Learning, creating and using a machine learning model is typica

 The training dataset must contain a label column. Any rows without labels are ignored.

- 4. For **Label column**, click **Launch column selector**, and choose a single column that contains outcomes the model can use for training.
+ 4. For **Label column**, click **Edit column** in the right panel of module, and choose a single column that contains outcomes the model can use for training.

 - For classification problems, the label column must contain either **categorical** values or **discrete** values. Some examples might be a yes/no rating, a disease classification code or name, or an income group. If you pick a noncategorical column, the module will return an error during training.

articles/machine-learning/algorithm-module-reference/tune-model-hyperparameters.md

Lines changed: 2 additions & 8 deletions

@@ -9,7 +9,7 @@ ms.topic: reference

 author: likebupt
 ms.author: keli19
- ms.date: 10/16/2019
+ ms.date: 02/11/2020
 ---
 # Tune Model Hyperparameters

@@ -38,17 +38,13 @@ This section describes how to perform a basic parameter sweep, which trains a mo

 2. Connect an untrained model to the leftmost input.

- 3. Set the **Create trainer mode** option to **Parameter Range**. Use **Range Builder** to specify a range of values to use in the parameter sweep.

-    Almost all the classification and regression modules support an integrated parameter sweep. For learners that don't support configuring a parameter range, you can test only the available parameter values.

-    You can manually set the value for one or more parameters, and then sweep over the remaining parameters. This might save some time.

 4. Add the dataset that you want to use for training, and connect it to the middle input of Tune Model Hyperparameters.

    Optionally, if you have a tagged dataset, you can connect it to the rightmost input port (**Optional validation dataset**). This lets you measure accuracy while training and tuning.

- 5. In the **Properties** pane of Tune Model Hyperparameters, choose a value for **Parameter sweeping mode**. This option controls how the parameters are selected.
+ 5. In the right panel of Tune Model Hyperparameters, choose a value for **Parameter sweeping mode**. This option controls how the parameters are selected.

    - **Entire grid**: When you select this option, the module loops over a grid predefined by the system, to try different combinations and identify the best learner. This option is useful when you don't know what the best parameter settings might be and want to try all possible combinations of values.

@@ -60,8 +56,6 @@ This section describes how to perform a basic parameter sweep, which trains a mo

 1. **Maximum number of runs on random sweep**: If you choose a random sweep, you can specify how many times the model should be trained, by using a random combination of parameter values.

- 2. **Maximum number of runs on random grid**: This option also controls the number of iterations over a random sampling of parameter values, but the values are not generated randomly from the specified range. Instead, the module creates a matrix of all possible combinations of parameter values. It then takes a random sampling over the matrix. This method is more efficient and less prone to regional oversampling or undersampling.

 8. For **Ranking**, choose a single metric to use for ranking the models.

 When you run a parameter sweep, the module calculates all applicable metrics for the model type and returns them in the **Sweep results** report. The module uses separate metrics for regression and classification models.
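
The two sweep modes line up with the two standard search strategies: exhaustive grid search and randomized search with a capped number of runs. A hedged scikit-learn sketch (the parameter grid is illustrative, not the module's own ranges):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)
params = {"n_estimators": [50, 100, 200], "learning_rate": [0.05, 0.1, 0.2]}

# "Entire grid": loop over every combination in the predefined grid.
grid = GridSearchCV(GradientBoostingClassifier(), params, scoring="roc_auc", cv=5)
grid.fit(X, y)

# "Random sweep": cap the number of training runs (here, 5 random combinations),
# matching "Maximum number of runs on random sweep".
rand = RandomizedSearchCV(GradientBoostingClassifier(), params, n_iter=5,
                          scoring="roc_auc", cv=5, random_state=0)
rand.fit(X, y)

print(grid.best_params_, rand.best_params_)  # ranked by the chosen metric
```
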
articles/machine-learning/how-to-designer-import-data.md

Lines changed: 1 addition & 1 deletion

@@ -41,7 +41,7 @@ Your registered datasets can be found in the module palette, under **Datasets**

 ![Screenshot showing location of saved datasets in the designer palette](media/how-to-designer-import-data/use-datasets-designer.png)

- Any [file dataset](how-to-create-register-datasets.md#dataset-types) registered to your machine learning workspace will appear in the module palette. You aren't limited to using datasets created in the designer.
+

 > [!NOTE]
 > The designer currently only supports processing [tabular datasets](how-to-create-register-datasets.md#dataset-types). If you want to use [file datasets](how-to-create-register-datasets.md#dataset-types), use the Azure Machine Learning SDK available for Python and R.
articles/machine-learning/how-to-designer-sample-regression-automobile-price-basic.md

Lines changed: 2 additions & 2 deletions

@@ -9,7 +9,7 @@ ms.topic: sample
 author: likebupt
 ms.author: keli19
 ms.reviewer: peterlu
- ms.date: 12/25/2019
+ ms.date: 02/11/2020
 ---
 # Use regression to predict car prices with Azure Machine Learning designer

@@ -55,7 +55,7 @@ Use the **Select Columns in Dataset** module to exclude normalized-losses that h

 Machine learning problems vary. Common machine learning tasks include classification, clustering, regression, and recommender systems, each of which might require a different algorithm. Your choice of algorithm often depends on the requirements of the use case. After you pick an algorithm, you need to tune its parameters to train a more accurate model. You then need to evaluate all models based on metrics like accuracy, intelligibility, and efficiency.

- Since the goal of this sample is to predict automobile prices, and because the label column (price) contains real numbers, a regression model is a good choice. Considering that the number of features is relatively small (less than 100) and these features aren't sparse, the decision boundary is likely to be nonlinear. So we use **Decision Forest Regression** for this pipeline.
+ Since the goal of this sample is to predict automobile prices, and because the label column (price) is continuous data, a regression model can be a good choice. We use **Linear Regression** for this pipeline.

 Use the **Split Data** module to randomly divide the input data so that the training dataset contains 70% of the original data and the testing dataset contains 30% of the original data.
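
The pipeline this sample builds — drop the mostly-missing normalized-losses column, split 70/30, fit linear regression on the continuous price label — can be sketched outside the designer as well. File and column names below are placeholders for the automobile dataset, and the remaining features are assumed numeric (categorical columns would need encoding first):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("automobile-prices.csv")             # placeholder file name
X = df.drop(columns=["price", "normalized-losses"])   # label and mostly-missing column removed
y = df["price"]                                       # continuous label -> regression

# 70% training / 30% testing, mirroring the Split Data module settings.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print(model.score(X_test, y_test))  # R², the coefficient of determination
```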
