Commit add1973

committed
edit pass: automobile-price-train-score-and-deploy
1 parent 3a947bc commit add1973

File tree

1 file changed

+51
-51
lines changed

articles/machine-learning/service/tutorial-designer-automobile-price-train-score.md

Lines changed: 51 additions & 51 deletions
@@ -1,7 +1,7 @@
11
---
22
title: 'Tutorial: Predict automobile price with the designer'
33
titleSuffix: Azure Machine Learning
4-
description: Learn how to train, score, and deploy a machine learning model using a drag and drop interface. This tutorial is part one of a two-part series on predicting automobile prices using linear regression.
4+
description: Learn how to train, score, and deploy a machine learning model by using a drag-and-drop interface. This tutorial is part one of a two-part series on predicting automobile prices by using linear regression.
55

66
author: peterclu
77
ms.author: peterlu
@@ -17,63 +17,63 @@ ms.date: 11/04/2019
1717

1818
In this two-part tutorial, you learn how to use the Azure Machine Learning designer to develop and deploy a predictive analytics solution that predicts the price of any car.
1919

20-
In part one, you set up your environment, drag-and-drop modules onto an interactive canvas, and connect them together to create an Azure Machine Learning pipeline.
20+
In part one, you set up your environment, drag modules onto an interactive canvas, and connect them together to create an Azure Machine Learning pipeline.
2121

22-
In part one of the tutorial you learn how to:
22+
In part one of the tutorial, you'll learn how to:
2323

2424
> [!div class="checklist"]
25-
> * Create a new pipeline
26-
> * Import data
27-
> * Prepare data
28-
> * Train a machine learning model
29-
> * Evaluate a machine learning model
25+
> * Create a new pipeline.
26+
> * Import data.
27+
> * Prepare data.
28+
> * Train a machine learning model.
29+
> * Evaluate a machine learning model.
3030
31-
In [part two](tutorial-designer-automobile-price-deploy.md) of the tutorial, you learn how to deploy your predictive model as a real-time inferencing endpoint to predict the price of any car based on technical specifications you send it.
31+
In [part two](tutorial-designer-automobile-price-deploy.md) of the tutorial, you'll learn how to deploy your predictive model as a real-time inferencing endpoint to predict the price of any car based on technical specifications you send it.
3232

33-
> [!Note]
33+
> [!NOTE]
3434
>A completed version of this tutorial is available as a sample pipeline.
3535
>
36-
>To find it, go to the **designer in your workspace**. In the **New pipeline** section, select **Sample 1 - Regression: Automobile Price Prediction(Basic)**.
36+
>To find it, go to the designer in your workspace. In the **New pipeline** section, select **Sample 1 - Regression: Automobile Price Prediction(Basic)**.
3737
3838
## Create a new pipeline
3939

4040
Azure Machine Learning pipelines organize multiple, dependent machine learning and data processing steps into a single resource. Pipelines help you organize, manage, and reuse complex machine learning workflows across projects and users. To create an Azure Machine Learning pipeline, you need an Azure Machine Learning workspace. In this section, you learn how to create both these resources.
4141

4242
### Create a new workspace
4343

44-
If you have an Azure Machine Learning workspace with an **Enterprise edition**, [skip to the next section](#create-the-pipeline).
44+
If you have an Azure Machine Learning workspace with an Enterprise edition, [skip to the next section](#create-the-pipeline).
4545

4646
[!INCLUDE [aml-create-portal](../../../includes/aml-create-in-portal-enterprise.md)]
4747

4848
### Create the pipeline
4949

50-
1. Sign into [ml.azure.com](https://ml.azure.com) and select the workspace you want to work with.
50+
1. Sign in to [ml.azure.com](https://ml.azure.com), and select the workspace you want to work with.
5151

5252
1. Select **Designer**.
5353

5454
![Screenshot of the visual workspace showing how to access the designer](./media/ui-tutorial-automobile-price-train-score/launch-visual-interface.png)
5555

5656
1. Select **Easy-to-use prebuilt modules**.
5757

58-
1. Select the default pipeline name, **"Pipeline-Created-on ..."** at the top of the canvas, and rename it to something meaningful. For example, **"Automobile price prediction"**. The name doesn't need to be unique.
58+
1. Select the default pipeline name **Pipeline-Created-on** at the top of the canvas. Rename it to something meaningful. An example is *Automobile price prediction*. The name doesn't need to be unique.
5959

6060
## Import data
6161

6262
There are several sample datasets included in the designer for you to experiment with. For this tutorial, use **Automobile price data (Raw)**.
6363

64-
1. To the left of the pipeline canvas is a palette of datasets and modules. Select **Datasets** then view the **Samples** section to view the available sample datasets.
64+
1. To the left of the pipeline canvas is a palette of datasets and modules. Select **Datasets**, and then view the **Samples** section to view the available sample datasets.
6565

66-
1. Select the dataset, **Automobile price data (Raw)**, and drag it onto the canvas.
66+
1. Select the dataset **Automobile price data (Raw)**, and drag it onto the canvas.
6767

6868
![Drag data to canvas](./media/ui-tutorial-automobile-price-train-score/drag-data.gif)
6969

7070
### Visualize the data
7171

72-
You can visualize the data to understand the dataset you will be using.
72+
You can visualize the data to understand the dataset that you'll use.
7373

7474
1. Select the **Automobile price data (Raw)** module.
7575

76-
1. In the **Properties** pane to the right of the canvas, select **Outputs**.
76+
1. In the properties pane to the right of the canvas, select **Outputs**.
7777

7878
1. Select the graph icon to visualize the data.
7979

@@ -85,17 +85,17 @@ You can visualize the data to understand the dataset you will be using.
8585

8686
## Prepare data
8787

88-
Datasets typically require some preprocessing before analysis. You might have noticed some missing values when inspect the dataset. These missing values need to be cleaned so that the model can analyze the data correctly.
88+
Datasets typically require some preprocessing before analysis. You might have noticed some missing values when you inspected the dataset. These missing values must be cleaned so that the model can analyze the data correctly.
8989

9090
### Remove a column
9191

92-
When you train a model, you have to do something about the data that's missing. In this dataset, the **normalized-losses** column is missing many values, so you'll exclude that column from the model altogether.
92+
When you train a model, you have to do something about the data that's missing. In this dataset, the **normalized-losses** column is missing many values, so you exclude that column from the model altogether.
9393

94-
1. Enter **Select** in the Search box at the top of the palette to find the **Select Columns in Dataset** module.
94+
1. Enter **Select** in the search box at the top of the palette to find the **Select Columns in Dataset** module.
9595

96-
1. Click and drag the **Select Columns in Dataset** module onto the canvas. Drop the module below the dataset module.
96+
1. Drag the **Select Columns in Dataset** module onto the canvas. Drop the module below the dataset module.
9797

98-
1. Connect the **Automobile price data (Raw)** dataset to the **Select Columns in Dataset**. Drag from the dataset's output port, which is the small circle at the bottom of the dataset on the canvas, to the input port of **Select Columns in Dataset**, which is the small circle at the top of the module.
98+
1. Connect the **Automobile price data (Raw)** dataset to the **Select Columns in Dataset** module. Drag from the dataset's output port, which is the small circle at the bottom of the dataset on the canvas, to the input port of **Select Columns in Dataset**, which is the small circle at the top of the module.
9999

100100
> [!TIP]
101101
> You create a flow of data through your pipeline when you connect the output port of one module to an input port of another.
@@ -105,13 +105,13 @@ When you train a model, you have to do something about the data that's missing.
105105

106106
1. Select the **Select Columns in Dataset** module.
107107

108-
1. In the **Properties** pane to the right of the canvas, select **Parameters** > **Edit column**.
108+
1. In the properties pane to the right of the canvas, select **Parameters** > **Edit column**.
109109

110110
1. Select the **+** to add a new rule.
111111

112112
1. From the drop-down menu, select **Exclude** and **Column names**.
113113

114-
1. Enter **normalized-losses** into the text box.
114+
1. Enter *normalized-losses* in the text box.
115115

116116
1. In the lower right, select **Save** to close the column selector.
117117

@@ -121,50 +121,50 @@ When you train a model, you have to do something about the data that's missing.
121121

122122
1. Select the **Select Columns in Dataset** module.
123123

124-
1. In the **Properties** pane, select **Parameters** > **Comment** and enter "Exclude normalized losses.".
124+
1. In the properties pane, select **Parameters** > **Comment** and enter *Exclude normalized losses*.
125125

126126
### Clean missing data
127127

128-
Your dataset still has missing values after removing the **normalized-losses** column. You can remove the remaining missing data using the **Clean Missing Data** module.
128+
Your dataset still has missing values after you remove the **normalized-losses** column. You can remove the remaining missing data by using the **Clean Missing Data** module.
129129

130130
> [!TIP]
131131
> Cleaning the missing values from input data is a prerequisite for using most of the modules in the designer.
132132
133-
1. Enter **Clean** in the Search box to find the **Clean Missing Data** module.
133+
1. Enter **Clean** in the search box to find the **Clean Missing Data** module.
134134

135-
1. Drag the **Clean Missing Data** module to the pipeline canvas and connect it to the **Select Columns in Dataset** module.
135+
1. Drag the **Clean Missing Data** module to the pipeline canvas. Connect it to the **Select Columns in Dataset** module.
136136

137-
1. In the Properties pane, select **Remove entire row** under **Cleaning mode**.
137+
1. In the properties pane, select **Remove entire row** under **Cleaning mode**.
138138

139-
1. In the Properties pane **Comment** box, enter "Remove missing value rows."
139+
1. In the properties pane **Comment** box, enter *Remove missing value rows*.
140140

141141
Your pipeline should now look something like this:
142142

143-
![select-column](./media/ui-tutorial-automobile-price-train-score/pipeline-clean.png)
143+
![Select-column](./media/ui-tutorial-automobile-price-train-score/pipeline-clean.png)
144144

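For readers who want to see the two preparation steps outside the designer, the following is a minimal Python sketch of the same idea: exclude a column, then remove every row that still has a missing value. The sample rows, the `horsepower` values, and the helper names `exclude_columns` and `remove_rows_with_missing` are hypothetical and exist only for illustration. They are not part of the designer or of the tutorial's dataset.

```python
# Hypothetical miniature of the automobile dataset; None marks a missing value.
rows = [
    {"make": "audi", "normalized-losses": 164, "horsepower": 102, "price": 13950},
    {"make": "bmw", "normalized-losses": None, "horsepower": 101, "price": 16430},
    {"make": "dodge", "normalized-losses": 118, "horsepower": None, "price": 5572},
    {"make": "mazda", "normalized-losses": None, "horsepower": 68, "price": 5195},
]


def exclude_columns(rows, names):
    """Return copies of the rows without the named columns."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


def remove_rows_with_missing(rows):
    """Keep only rows in which every remaining value is present."""
    return [row for row in rows if all(v is not None for v in row.values())]


# Step 1: drop the sparsely populated column.
selected = exclude_columns(rows, {"normalized-losses"})
# Step 2: drop any row that still contains a missing value.
cleaned = remove_rows_with_missing(selected)

print(len(rows), len(selected), len(cleaned))
```

The order matters in this sketch. Excluding the sparse column first keeps the `bmw` and `mazda` rows, which would otherwise be discarded when rows with missing values are removed.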
145145
## Train a machine learning model
146146

147147
Now that the data is processed, you can train a predictive model.
148148

149149
### Select an algorithm
150150

151-
**Classification** and **regression** are two types of supervised machine learning algorithms. **Classification** predicts an answer from a defined set of categories, such as a color (red, blue, or green). **Regression** is used to predict a number.
151+
*Classification* and *regression* are two types of supervised machine learning algorithms. Classification predicts an answer from a defined set of categories, such as a color like red, blue, or green. Regression is used to predict a number.
152152

153-
Since you want to predict price, which is a number, you can use a regression algorithm. For this example, you'll use a linear regression model.
153+
Because you want to predict price, which is a number, you can use a regression algorithm. For this example, you use a linear regression model.
154154

155155
### Split the data
156156

157157
Split your data into two separate datasets for training the model and testing it.
158158

159-
1. Enter **split data** in the search box to find the **Split Data** module and connect it to the left port of the **Clean Missing Data** module.
159+
1. Enter **split data** in the search box to find the **Split Data** module. Connect it to the left port of the **Clean Missing Data** module.
160160

161161
1. Select the **Split Data** module.
162162

163-
1. In the Properties pane, set the **Fraction of rows in the first output dataset** to 0.7.
163+
1. In the properties pane, set the **Fraction of rows in the first output dataset** to 0.7.
164164

165-
This splits 70 percent of the data to train the model and 30 percent for testing it.
165+
This option splits 70 percent of the data to train the model and 30 percent for testing it.
166166

167-
1. In the Properties **Comment** box, enter "Split the dataset into training set (0.7) and test set (0.3)."
167+
1. In the properties pane **Comment** box, enter *Split the dataset into training set (0.7) and test set (0.3)*.
168168

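The 0.7 fraction can be illustrated with a short Python sketch. It assumes a shuffled split with a fixed seed. Whether and how the **Split Data** module randomizes rows is a module setting that this tutorial doesn't cover, so treat the shuffling here as an assumption. The function name `split_rows` and its parameters are hypothetical.

```python
import random


def split_rows(rows, fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into two lists.

    The first list receives `fraction` of the rows (rounded to the
    nearest whole row); the second list receives the remainder.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(round(len(shuffled) * fraction))
    return shuffled[:cut], shuffled[cut:]


# Ten placeholder row IDs stand in for the cleaned dataset.
train, test = split_rows(range(10), fraction=0.7)
print(len(train), len(test))
```

Every row lands in exactly one of the two outputs, so the test set contains only rows that the model never saw during training.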
169169
### Train the model
170170

@@ -174,9 +174,9 @@ Train the model by giving it a set of data that includes the price. The model sc
174174

175175
1. Expand **Machine Learning Algorithms**.
176176

177-
This displays several categories of modules that you can use to initialize learning algorithms.
177+
This option displays several categories of modules that you can use to initialize learning algorithms.
178178

179-
1. Select **Regression** > **Linear Regression** and drag it to the pipeline canvas.
179+
1. Select **Regression** > **Linear Regression**, and drag it to the pipeline canvas.
180180

181181
1. Find and drag the **Train Model** module to the pipeline canvas.
182182

@@ -188,25 +188,25 @@ Train the model by giving it a set of data that includes the price. The model sc
188188

189189
1. Select the **Train Model** module.
190190

191-
1. In the Properties pane, select **Edit column** selector.
191+
1. In the properties pane, select **Edit column** selector.
192192

193-
1. In the **Label column** dialog, expand the drop-down menu and select **Column names**.
193+
1. In the **Label column** dialog box, expand the drop-down menu and select **Column names**.
194194

195-
1. In the text box, enter **price**. Price is the value that your model is going to predict.
195+
1. In the text box, enter *price*. Price is the value that your model is going to predict.
196196

197197
Your pipeline should look like this:
198198

199199
![Screenshot showing the correct configuration of the pipeline after adding the Train Model module.](./media/ui-tutorial-automobile-price-train-score/pipeline-train-graph.png)
200200

201201
## Evaluate a machine learning model
202202

203-
After training your model using 70 percent of the data, you can use it to score the other 30 percent to see how well your model functions.
203+
After you train your model by using 70 percent of the data, you can use it to score the other 30 percent to see how well your model functions.
204204

205-
1. Enter **score model** in the search box to find the **Score Model** module and drag the module to the pipeline canvas.
205+
1. Enter *score model* in the search box to find the **Score Model** module. Drag the module to the pipeline canvas.
206206

207207
1. Connect the output of the **Train Model** module to the left input port of **Score Model**. Connect the test data output (right port) of the **Split Data** module to the right input port of **Score Model**.
208208

209-
1. Enter **evaluate** in the search box to find the **Evaluate Model** and drag the module to the pipeline canvas.
209+
1. Enter *evaluate* in the search box to find the **Evaluate Model** module. Drag the module to the pipeline canvas.
210210

211211
1. Connect the output of the **Score Model** module to the left input of **Evaluate Model**.
212212

@@ -224,23 +224,23 @@ After the run completes, you can view the results of the pipeline run.
224224

225225
1. Select the **Score Model** module to view its output.
226226

227-
1. In the **Properties** pane, select **Outputs** > **Visualize**.
227+
1. In the properties pane, select **Outputs** > **Visualize**.
228228

229229
Here you can see the predicted prices and the actual prices from the testing data.
230230

231-
![Screenshot of the output visualization highlighting the "Scored Label" column](./media/ui-tutorial-automobile-price-train-score/score-result.png)
231+
![Screenshot of the output visualization highlighting the Scored Label column](./media/ui-tutorial-automobile-price-train-score/score-result.png)
232232

233233
1. Select the **Evaluate Model** module to view its output.
234234

235-
1. In the **Properties** pane, select **Output** > **Visualize**.
235+
1. In the properties pane, select **Output** > **Visualize**.
236236

237237
The following statistics are shown for your model:
238238

239-
* **Mean Absolute Error (MAE)**: The average of absolute errors (an error is the difference between the predicted value and the actual value).
239+
* **Mean Absolute Error (MAE)**: The average of absolute errors. An error is the difference between the predicted value and the actual value.
240240
* **Root Mean Squared Error (RMSE)**: The square root of the average of squared errors of predictions made on the test dataset.
241241
* **Relative Absolute Error**: The average of absolute errors relative to the absolute difference between actual values and the average of all actual values.
242242
* **Relative Squared Error**: The average of squared errors relative to the squared difference between the actual values and the average of all actual values.
243-
* **Coefficient of Determination**: Also known as the R squared value, this is a statistical metric indicating how well a model fits the data.
243+
* **Coefficient of Determination**: Also known as the R squared value, this statistical metric indicates how well a model fits the data.
244244

245245
For each of the error statistics, smaller is better. A smaller value indicates that the predictions are closer to the actual values. For the coefficient of determination, the closer its value is to one (1.0), the better the predictions.
246246

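The five statistics can be computed by hand to make their definitions concrete. The following Python sketch uses the conventional textbook formulas. It is an assumption that these match what the **Evaluate Model** module computes internally. The function name `regression_metrics` and the three sample prices are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Compute common regression error statistics from paired values."""
    n = len(actual)
    mean_actual = sum(actual) / n

    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]

    # Deviations of the actual values from their own mean. These act as
    # the baseline of a model that always predicts the average price.
    abs_dev = sum(abs(a - mean_actual) for a in actual)
    sq_dev = sum((a - mean_actual) ** 2 for a in actual)

    rse = sum(sq_err) / sq_dev
    return {
        "mae": sum(abs_err) / n,               # Mean Absolute Error
        "rmse": math.sqrt(sum(sq_err) / n),    # Root Mean Squared Error
        "rae": sum(abs_err) / abs_dev,         # Relative Absolute Error
        "rse": rse,                            # Relative Squared Error
        "r2": 1 - rse,                         # Coefficient of Determination
    }


# Hypothetical actual and predicted prices for three cars.
metrics = regression_metrics([10000, 15000, 20000], [11000, 14000, 20500])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this formulation the coefficient of determination is one minus the relative squared error, which is why smaller error statistics and a value closer to 1.0 point in the same direction.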