articles/machine-learning/component-reference/component-reference.md (5 additions, 6 deletions)
@@ -1,6 +1,6 @@
---
title: "Algorithm & component reference"
- description: Learn about the Azure Machine Learning designer components you can use to create your own machine learning projects.
+ description: Learn about the Azure Machine Learning designer components that you can use to create your own machine learning projects.
titleSuffix: Azure Machine Learning
services: machine-learning
ms.service: machine-learning
@@ -31,7 +31,7 @@ For help with choosing algorithms, see
| --- | --- | --- |
| Data Input and Output | Move data from cloud sources into your pipeline. Write your results or intermediate data to Azure Storage, SQL Database, or Hive, while running a pipeline, or use cloud storage to exchange data between pipelines. |[Enter Data Manually](enter-data-manually.md) <br/> [Export Data](export-data.md) <br/> [Import Data](import-data.md)|
| Data Transformation | Operations on data that are unique to machine learning, such as normalizing or binning data, dimensionality reduction, and converting data among various file formats.| [Add Columns](add-columns.md) <br/> [Add Rows](add-rows.md) <br/> [Apply Math Operation](apply-math-operation.md) <br/> [Apply SQL Transformation](apply-sql-transformation.md) <br/> [Clean Missing Data](clean-missing-data.md) <br/> [Clip Values](clip-values.md) <br/> [Convert to CSV](convert-to-csv.md) <br/> [Convert to Dataset](convert-to-dataset.md) <br/> [Convert to Indicator Values](convert-to-indicator-values.md) <br/> [Edit Metadata](edit-metadata.md) <br/> [Group Data into Bins](group-data-into-bins.md) <br/> [Join Data](join-data.md) <br/> [Normalize Data](normalize-data.md) <br/> [Partition and Sample](partition-and-sample.md) <br/> [Remove Duplicate Rows](remove-duplicate-rows.md) <br/> [SMOTE](smote.md) <br/> [Select Columns Transform](select-columns-transform.md) <br/> [Select Columns in Dataset](select-columns-in-dataset.md) <br/> [Split Data](split-data.md) |
- | Feature Selection | Select a subset of relevant, useful features to use in building an analytical model. |[Filter Based Feature Selection](filter-based-feature-selection.md) <br/> [Permutation Feature Importance](permutation-feature-importance.md)|
+ | Feature Selection | Select a subset of relevant, useful features to use to build an analytical model. |[Filter Based Feature Selection](filter-based-feature-selection.md) <br/> [Permutation Feature Importance](permutation-feature-importance.md)|
| Statistical Functions | Provide a wide variety of statistical methods related to data science. |[Summarize Data](summarize-data.md)|

## Machine learning algorithms
@@ -40,7 +40,7 @@ For help with choosing algorithms, see
- Learn about the [web service components](web-service-input-output.md) which are necessary for real-time inference in Azure Machine Learning designer.
+ Learn about the [web service components](web-service-input-output.md), which are necessary for real-time inference in Azure Machine Learning designer.

## Error messages

- Learn about the [error messages and exception codes](designer-error-codes.md) you might encounter using components in Azure Machine Learning designer.
+ Learn about the [error messages and exception codes](designer-error-codes.md) that you might encounter when you use components in Azure Machine Learning designer.
articles/machine-learning/component-reference/import-data.md (14 additions, 13 deletions)
@@ -1,7 +1,7 @@
---
title: "Import Data: Component Reference"
titleSuffix: Azure Machine Learning
- description: Learn how to use the Import Data component in Azure Machine Learning to load data into a machine learning pipeline from existing cloud data services.
+ description: Learn how to use the Import Data component in Azure Machine Learning to load data into a machine learning pipeline from existing cloud data services.
services: machine-learning
ms.service: machine-learning
ms.subservice: core
@@ -33,7 +33,7 @@ The **Import Data** component support read data from following sources:
- Azure SQL Database
- Azure PostgreSQL

- Before using cloud storage, you need to register a datastore in your Azure Machine Learning workspace first. For more information, see [How to Access Data](../how-to-access-data.md).
+ Before you use cloud storage, you must first register a datastore in your Azure Machine Learning workspace. For more information, see [How to Access Data](../how-to-access-data.md).

After you define the data you want and connect to the source, **[Import Data](./import-data.md)** infers the data type of each column based on the values it contains, and loads the data into your designer pipeline. The output of **Import Data** is a dataset that can be used with any designer pipeline.
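The paragraph above notes that **Import Data** infers each column's data type from the values it contains. As a rough illustration of that kind of inference (a toy sketch, not the component's actual, undocumented rules), type inference over string-valued columns might look like this:

```python
# Toy sketch of per-column type inference, loosely analogous to what a
# tabular import step does. Illustrative only; this is not the Import
# Data component's real algorithm or API.

def infer_column_type(values):
    """Return 'int', 'float', or 'string' for a list of raw string values."""
    def parses_as(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if parses_as(int):
        return "int"
    if parses_as(float):
        return "float"
    return "string"

print(infer_column_type(["3", "14", "159"]))  # int
print(infer_column_type(["1.5", "2"]))        # float
print(infer_column_type(["a", "2"]))          # string
```

The try-the-narrowest-type-first ordering (int before float before string) mirrors how most tabular importers settle on the most specific type that fits every value in the column.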
@@ -51,8 +51,9 @@ If your source data changes, you can refresh the dataset and add new data by rer
1. Select **Data source**, and choose the data source type. It could be HTTP or datastore.

- If you choose datastore, you can select existing datastores that already registered to your Azure Machine Learning workspace or create a new datastore. Then define the path of data to import in the datastore. You can easily browse the path by click **Browse Path**
- 
+ If you choose datastore, you can select existing datastores that are already registered to your Azure Machine Learning workspace, or create a new datastore. Then define the path of the data to import in the datastore. You can browse to the path by selecting **Browse Path**.
+
+ :::image type="content" source="media/module/import-data-path.png" alt-text="Screenshot shows the Browse path link which opens the Path selection dialog box." lightbox="media/module/import-data-path.png":::

> [!NOTE]
> **Import Data** component is for **Tabular** data only.
@@ -64,13 +65,13 @@ If your source data changes, you can refresh the dataset and add new data by rer
1. Select the preview schema to filter the columns you want to include. You can also define advanced settings like Delimiter in Parsing options.

   :::image type="content" source="media/module/import-data.png" alt-text="Screenshot of the schema preview with Column 3, 4, 5 and 6 selected.":::

- 1. The checkbox, **Regenerate output**, decides whether to execute the component to regenerate output at running time.
+ 1. The **Regenerate output** checkbox decides whether to run the component again at run time to regenerate its output.

-    It's by default unselected, which means if the component has been executed with the same parameters previously, the system will reuse the output from last run to reduce run time.
+    By default, it's unselected. If the component previously ran with the same parameters, the system reuses the output from the last run to reduce run time.

-    If it is selected, the system will execute the component again to regenerate output. So select this option when underlying data in storage is updated, it can help to get the latest data.
+    If it's selected, the system runs the component again to regenerate the output. Select this option when the underlying data in storage is updated, so that you get the latest data.

1. Submit the pipeline.
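The reuse behavior described in this hunk (skip re-execution when a component already ran with the same parameters, unless regeneration is forced) is essentially memoization keyed on the component's parameters. A minimal sketch, with hypothetical names that are not part of any Azure Machine Learning API:

```python
# Minimal sketch of reuse-vs-regenerate semantics. run_component and
# regenerate_output are illustrative names, not real Azure ML API.
import json

_cache = {}

def run_component(params, execute, regenerate_output=False):
    key = json.dumps(params, sort_keys=True)  # stable key from parameters
    if not regenerate_output and key in _cache:
        return _cache[key]                    # reuse the last run's output
    result = execute(params)                  # actually run the component
    _cache[key] = result
    return result

calls = []
def execute(p):
    calls.append(p)
    return sum(p["values"])

run_component({"values": [1, 2]}, execute)                            # runs
run_component({"values": [1, 2]}, execute)                            # reused
run_component({"values": [1, 2]}, execute, regenerate_output=True)    # runs again
print(len(calls))  # 2
```

The sketch shows why forcing regeneration matters: a parameter-keyed cache can't see that the underlying data in storage changed, so only an explicit regenerate picks up the latest data.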
@@ -85,20 +86,20 @@ If your source data changes, you can refresh the dataset and add new data by rer
When import completes, right-click the output dataset and select **Visualize** to see if the data was imported successfully.

- If you want to save the data for reuse, rather than importing a new set of data each time the pipeline is run, select the **Register dataset** icon under the **Outputs+logs** tab in the right panel of the component. Choose a name for the dataset. The saved dataset preserves the data at the time of saving, the dataset is not updated when the pipeline is rerun, even if the dataset in the pipeline changes. This can be useful for taking snapshots of data.
+ If you want to save the data for reuse, rather than import a new set of data each time the pipeline runs, select the **Register dataset** icon under the **Outputs+logs** tab in the right panel of the component. Choose a name for the dataset. The saved dataset preserves the data at the time of saving. The dataset isn't updated when the pipeline reruns, even if the dataset in the pipeline changes. This behavior can be useful for taking snapshots of data.
- After importing the data, it might need some additional preparations for modeling and analysis:
+ After you import the data, it might need some additional preparation for modeling and analysis:

- - Use [Edit Metadata](./edit-metadata.md) to change column names, to handle a column as a different data type, or to indicate that some columns are labels or features.
+ - Use [Edit Metadata](./edit-metadata.md) to change column names, handle a column as a different data type, or indicate that some columns are labels or features.

- Use [Select Columns in Dataset](./select-columns-in-dataset.md) to select a subset of columns to transform or use in modeling. The transformed or removed columns can easily be rejoined to the original dataset by using the [Add Columns](./add-columns.md) component.

- Use [Partition and Sample](./partition-and-sample.md) to divide the dataset, perform sampling, or get the top n rows.
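The preparation steps listed in the bullets above have rough pandas analogues, shown here only to suggest the kinds of operations each component performs; these are not the designer components themselves:

```python
# Illustrative pandas analogues of the preparation components above.
# The designer components are drag-and-drop; this is only a conceptual
# mapping, not their implementation.
import pandas as pd

df = pd.DataFrame({"c1": ["a", "b", "c"], "c2": ["1", "2", "3"]})

# Edit Metadata: rename a column and treat it as a different data type.
df = df.rename(columns={"c2": "label"})
df["label"] = df["label"].astype(int)

# Select Columns in Dataset: keep a subset of columns.
subset = df[["label"]]

# Partition and Sample: get the top n rows.
top2 = df.head(2)

print(list(df.columns), len(top2))  # ['c1', 'label'] 2
```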
## Limitations

- Due to datstore access limitation, if your inference pipeline contains **Import Data** component, it will be auto-removed when deploy to real-time endpoint.
+ Due to a datastore access limitation, if your inference pipeline contains the **Import Data** component, the component is automatically removed when the pipeline is deployed to a real-time endpoint.

## Next steps

- See the [set of components available](component-reference.md) to Azure Machine Learning.
+ See the [set of components available](component-reference.md) to Azure Machine Learning.