articles/machine-learning/algorithm-module-reference/create-python-model.md
Lines changed: 2 additions & 2 deletions
@@ -11,11 +11,11 @@ author: likebupt
 ms.author: keli19
 ms.date: 11/19/2019
 ---
-# Create Python Model
+# Create Python Model module
 
 This article describes a module in Azure Machine Learning designer (preview).
 
-Learn how to use the Create Python Model module to create an untrained model from a Python script. You can base the model on any learner that is included in a Python package in the Azure Machine Learning designer environment.
+Learn how to use the Create Python Model module to create an untrained model from a Python script. You can base the model on any learner that's included in a Python package in the Azure Machine Learning designer environment.
 
 After you create the model, you can use [Train Model](train-model.md) to train the model on a dataset, like any other learner in Azure Machine Learning. The trained model can be passed to [Score Model](score-model.md) to make predictions. You can then save the trained model and publish the scoring workflow as a web service.
articles/machine-learning/algorithm-module-reference/enter-data-manually.md
Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ This module can be helpful in scenarios such as:
 
 - **ARFF**: Paste in an existing ARFF format file. If you're typing values directly, be sure to add the optional header and required attribute fields at the beginning of the data.
 
-For example, the following header and attribute rows could be added to a simple list. The column heading would be `SampleText`.
+For example, the following header and attribute rows can be added to a simple list. The column heading would be `SampleText`.
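The rows that this hunk refers to fall outside the diff context. As a rough illustration only, the sketch below builds ARFF text for a single string column named `SampleText`. The function name `to_arff`, the `relation` argument, and the comment text are hypothetical, and whether the module expects an `@RELATION` line is an assumption taken from the general ARFF format rather than from this article.

```python
def to_arff(column_name, values, relation="sample"):
    """Return ARFF text for one string column (illustrative sketch only)."""

    def quote(text):
        # ARFF string values are quoted; escape backslashes and single quotes.
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

    lines = [
        "% Optional header comment (hypothetical text)",
        f"@RELATION {relation}",             # assumption: general ARFF convention
        f"@ATTRIBUTE {column_name} STRING",  # attribute row that names the column
        "@DATA",
    ]
    lines.extend(quote(value) for value in values)
    return "\n".join(lines)


arff_text = to_arff("SampleText", ["This is good", "It's bad"])
print(arff_text)
```

The printed text can be compared against the header and attribute rows shown in the published article before it is pasted into the module.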
articles/machine-learning/algorithm-module-reference/partition-and-sample.md
Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ Use this option when you want to divide the dataset into subsets of the data. Th
 
 1. **Specify the partitioner method**: Indicate how you want data to be apportioned to each partition, by using these options:
 
-- **Partition evenly**: Use this option to place an equal number of rows in each partition. To specify the number of output partitions, type a whole number in the **Specify number of folds to split evenly into** box.
+- **Partition evenly**: Use this option to place an equal number of rows in each partition. To specify the number of output partitions, enter a whole number in the **Specify number of folds to split evenly into** box.
 
 - **Partition with customized proportions**: Use this option to specify the size of each partition as a comma-separated list.
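The two partitioner options in this hunk can be sketched in plain Python. This is a minimal illustration, not the module's implementation: the function names are hypothetical, and both the rounding rule and the choice to place leftover rows in a final extra partition are assumptions.

```python
def partition_evenly(rows, folds):
    """Split rows into `folds` partitions whose sizes differ by at most one."""
    if folds < 1:
        raise ValueError("folds must be a positive whole number")
    base, extra = divmod(len(rows), folds)
    parts, start = [], 0
    for i in range(folds):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first partitions
        parts.append(rows[start:start + size])
        start += size
    return parts


def partition_by_proportions(rows, proportions_text):
    """Split rows by a comma-separated list of fractions such as '0.2, 0.3'."""
    proportions = [float(p) for p in proportions_text.split(",")]
    if any(p <= 0 for p in proportions) or sum(proportions) > 1 + 1e-9:
        raise ValueError("proportions must be positive and sum to at most 1")
    parts, start, cumulative = [], 0, 0.0
    for p in proportions:
        cumulative += p
        end = min(len(rows), round(cumulative * len(rows)))
        parts.append(rows[start:end])
        start = end
    if start < len(rows):
        parts.append(rows[start:])  # assumption: leftover rows form one more partition
    return parts


rows = list(range(10))
print(partition_evenly(rows, 3))
print(partition_by_proportions(rows, "0.2, 0.3"))
```

Working from cumulative boundaries instead of rounding each partition size separately keeps the partitions contiguous and avoids losing or duplicating rows.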
articles/machine-learning/algorithm-module-reference/split-data.md
Lines changed: 6 additions & 6 deletions
@@ -17,7 +17,7 @@ This article describes a module in Azure Machine Learning designer (preview).
 
 Use the Split Data module to divide a dataset into two distinct sets.
 
-This module is particularly useful when you need to separate data into training and testing sets. You can also customize the way that data is divided. Some options support randomization of data. Others are tailored for a certain data type or model type.
+This module is useful when you need to separate data into training and testing sets. You can also customize the way that data is divided. Some options support randomization of data. Others are tailored for a certain data type or model type.
 
 ## Configure the module
 
@@ -37,25 +37,25 @@ This module is particularly useful when you need to separate data into training
 
 For example, if you're analyzing sentiment, you can check for the presence of a particular product name in a text field. You can then divide the dataset into rows with the target product name and rows without the target product name.
 
-- **Relative expression split**: Use this option whenever you want to apply a condition to a number column. The number can be a date/time field, a column that contains age or dollar amounts, or even a percentage. For example, you might want to divide your dataset depending on the cost of the items, group people by age ranges, or separate data by a calendar date.
+- **Relative expression split**: Use this option whenever you want to apply a condition to a number column. The number can be a date/time field, a column that contains age or dollar amounts, or even a percentage. For example, you might want to divide your dataset based on the cost of the items, group people by age ranges, or separate data by a calendar date.
 
 ### Split rows
 
 1. Add the [Split Data](./split-data.md) module to your pipeline in the designer, and connect the dataset that you want to split.
 
 1. For **Splitting mode**, select **Split rows**.
 
-1. **Fraction of rows in the first output dataset**: Use this option to determine how many rows will go into the first (left side) output. All other rows will go to the second (right side) output.
+1. **Fraction of rows in the first output dataset**: Use this option to determine how many rows will go into the first (left side) output. All other rows will go into the second (right side) output.
 
 The ratio represents the percentage of rows sent to the first output dataset, so you must enter a decimal number between 0 and 1.
 
-For example, if you enter 0.75 as the value, the dataset will be split 75/25. In this split, 75 percent of the rows will be sent to the first output dataset. The remaining 25 percent will be sent to the second output dataset.
+For example, if you enter **0.75** as the value, the dataset will be split 75/25. In this split, 75 percent of the rows will be sent to the first output dataset. The remaining 25 percent will be sent to the second output dataset.
 
 1. Select the **Randomized split** option if you want to randomize selection of data into the two groups. This is the preferred option when you're creating training and test datasets.
 
 1. **Random Seed**: Enter a non-negative integer value to start the pseudorandom sequence of instances to be used. This default seed is used in all modules that generate random numbers.
 
-Specifying a seed makes the results generally reproducible. If you need to repeat the results of a split operation, you should specify a seed for the random number generator. Otherwise the random seed is set by default to **0**, which means the initial seed value is obtained from the system clock. As a result, the distribution of data might be slightly different each time you perform a split.
+Specifying a seed makes the results reproducible. If you need to repeat the results of a split operation, you should specify a seed for the random number generator. Otherwise the random seed is set by default to **0**, which means the initial seed value is obtained from the system clock. As a result, the distribution of data might be slightly different each time you perform a split.
 
 1. **Stratified split**: Set this option to **True** to ensure that the two output datasets contain a representative sample of the values in the *strata column* or *stratification key column*.
 
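The **Split rows** options in this hunk (fraction, randomized split, random seed, stratified split) can be approximated with the Python standard library. This is a sketch under stated assumptions, not the module's code: `split_rows` and `strata_key` are hypothetical names, the rounding of the row count is assumed, and a seed of 0 is mapped to an unseeded generator only to mirror the behavior described in the text.

```python
import random
from collections import defaultdict


def split_rows(rows, fraction, randomized=True, seed=0, strata_key=None):
    """Return (first, second) outputs; `fraction` of the rows go to the first."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be a decimal number between 0 and 1")
    # Assumption: seed 0 means "not reproducible", so fall back to an unseeded generator.
    rng = random.Random(seed if seed else None)

    if strata_key is None:
        groups = [list(range(len(rows)))]
    else:
        # Group row indexes by stratum so each stratum is split with the same fraction.
        by_stratum = defaultdict(list)
        for index, row in enumerate(rows):
            by_stratum[strata_key(row)].append(index)
        groups = list(by_stratum.values())

    first_indexes = set()
    for group in groups:
        indexes = list(group)
        if randomized:
            rng.shuffle(indexes)
        first_indexes.update(indexes[:round(fraction * len(indexes))])

    # Both outputs keep the original row order.
    first = [row for i, row in enumerate(rows) if i in first_indexes]
    second = [row for i, row in enumerate(rows) if i not in first_indexes]
    return first, second


train, test = split_rows(list(range(100)), 0.75, seed=42)
print(len(train), len(test))
```

Calling the function twice with the same nonzero seed returns the same two outputs, which is the reproducibility that the seed paragraph describes.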
@@ -111,7 +111,7 @@ The first result dataset contains all rows where the index column begins with on
 - The expression can reference a maximum of one column name.
 - Use the ampersand character, `&`, for the AND operation. Use the pipe character, `|`, for the OR operation.
 - The following operators are supported: `<`, `>`, `<=`, `>=`, `==`, `!=`.
-- You cannot group operations by using `(` and `)`.
+- You can't group operations by using `(` and `)`.
 
 For **String column**:
 - The following operators are supported: `==`, `!=`.
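The numeric-column rules listed in this hunk can be illustrated with a small evaluator. The expression syntax used here (`Age >= 18 & Age < 65`) is a simplified, hypothetical form and not necessarily the module's exact grammar. The sketch also assumes that `&` binds more tightly than `|`, and it requires exactly one column name, where the documented rule is a maximum of one.

```python
import operator
import re

OPERATORS = {
    "<": operator.lt, ">": operator.gt, "<=": operator.le,
    ">=": operator.ge, "==": operator.eq, "!=": operator.ne,
}
# One condition: column name, comparison operator, numeric literal.
CONDITION = re.compile(r"\s*(\w+)\s*(<=|>=|==|!=|<|>)\s*(-?\d+(?:\.\d+)?)\s*")


def compile_expression(expression):
    """Turn a relative expression into a predicate over dict-like rows."""
    if "(" in expression or ")" in expression:
        raise ValueError("grouping with ( and ) is not supported")
    columns, or_terms = set(), []
    for or_part in expression.split("|"):      # OR has the lowest precedence (assumed)
        and_terms = []
        for condition in or_part.split("&"):   # AND binds tighter (assumed)
            match = CONDITION.fullmatch(condition)
            if match is None:
                raise ValueError(f"cannot parse condition: {condition!r}")
            name, symbol, number = match.groups()
            columns.add(name)
            and_terms.append((OPERATORS[symbol], float(number)))
        or_terms.append(and_terms)
    if len(columns) != 1:
        raise ValueError("the expression must reference exactly one column")
    column = columns.pop()

    def predicate(row):
        value = row[column]
        return any(all(op(value, n) for op, n in terms) for terms in or_terms)

    return predicate


is_working_age = compile_expression("Age >= 18 & Age < 65")
people = [{"Age": 12}, {"Age": 30}, {"Age": 70}]
matching = [p for p in people if is_working_age(p)]
others = [p for p in people if not is_working_age(p)]
print(matching, others)
```

Rows for which the predicate is true would correspond to the first output dataset, and the remaining rows to the second.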