articles/machine-learning/algorithm-module-reference/split-data.md

This module is particularly useful when you need to separate data into training and testing sets.
Based on the regular expression you provide, the dataset is divided into two sets of rows: rows with values that match the expression and all remaining rows.
The following examples demonstrate how to divide a dataset using the **Regular Expression** option.
### Single whole word
This example puts into the first dataset all rows that contain the text `Gryphon` in the column `Text`, and puts other rows into the second output of **Split Data**:
```text
\"Text" Gryphon
```
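As an illustration only (not the module's implementation), the split that this expression performs can be sketched in plain Python with the standard `re` module; the sample rows below are hypothetical:

```python
import re

# Hypothetical sample rows with a "Text" column.
rows = [
    {"Text": "The Gryphon sat up and rubbed its eyes"},
    {"Text": "Alice did not much like keeping so close to her"},
    {"Text": "said the Gryphon, half to itself"},
]

pattern = re.compile(r"Gryphon")

# Rows whose Text value matches go to the first output;
# all remaining rows go to the second output.
first_output = [r for r in rows if pattern.search(r["Text"])]
second_output = [r for r in rows if not pattern.search(r["Text"])]
```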
### Substring
This example applies the regular expression to the second column of the dataset, denoted here by the index value of 1. The match is case-sensitive.
```text
(\1) ^[a-f]
```
The first result dataset contains all rows where the index column begins with one of these characters: `a`, `b`, `c`, `d`, `e`, `f`. All other rows are directed to the second output.
## Relative expression split
1. Add the [Split Data](./split-data.md) module to your pipeline, and connect the dataset you want to split as its input.
The expression divides the dataset into two sets of rows: rows with values that meet the condition, and all remaining rows.
The following examples demonstrate how to divide a dataset using the **Relative Expression** option in the **Split Data** module:
### Using calendar year
A common scenario is to divide a dataset by years. The following expression selects all rows where the values in the column `Year` are greater than `2010`.
```text
\"Year" > 2010
```
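As a rough Python sketch of how this relative expression partitions rows (an illustration only, not the module's implementation; the sample values are hypothetical):

```python
# Hypothetical rows with a "Year" column.
rows = [{"Year": 2008}, {"Year": 2010}, {"Year": 2011}, {"Year": 2015}]

# \"Year" > 2010: matching rows go to the first output,
# all remaining rows go to the second output.
first_output = [r for r in rows if r["Year"] > 2010]
second_output = [r for r in rows if not (r["Year"] > 2010)]
```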
The date expression must account for all date parts that are included in the data column, and the format of dates in the data column must be consistent.
For example, in a date column using the format `mmddyyyy`, the expression should be something like this:
```text
\"Date" > 1/1/2010
```
### Using column indices
The following expression demonstrates how you can use the column index to select all rows whose value in the first column of the dataset is less than or equal to 30, but not equal to 20.
```text
(\0)<=30 & !=20
```
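A Python sketch of the same compound condition (illustration only; the sample rows are hypothetical), where `\0` corresponds to column index 0, the first column:

```python
# Hypothetical rows; the relative expression's \0 refers to column index 0.
rows = [[10, "a"], [20, "b"], [25, "c"], [30, "d"], [40, "e"]]

# (\0)<=30 & !=20: first-column value <= 30 AND not equal to 20.
first_output = [r for r in rows if r[0] <= 30 and r[0] != 20]
second_output = [r for r in rows if not (r[0] <= 30 and r[0] != 20)]
```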
## Next steps
See the [set of modules available](module-reference.md) to Azure Machine Learning.
articles/machine-learning/how-to-retrain-designer.md

Use the following steps to submit a pipeline endpoint run from the designer:
1. Select the pipeline you want to run.
1. Select **Submit**.
1. In the setup dialog, you can specify a new input data path value, which points to your new dataset.

### Submit runs with code
You can find the REST endpoint of a published pipeline in the overview panel. By calling the endpoint, you can retrain the published pipeline.
To make a REST call, you will need an OAuth 2.0 bearer-type authentication header. See the following [tutorial section](tutorial-pipeline-batch-scoring-classification.md#publish-and-run-from-a-rest-endpoint) for more detail on setting up authentication to your workspace and making a parameterized REST call.
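As a minimal sketch of such a call using only the Python standard library (the endpoint URL, token, experiment name, and parameter name below are all placeholders you must replace with your own values; the JSON body shape follows the linked tutorial):

```python
import json
import urllib.request

# Placeholders: copy the endpoint URL from the pipeline's overview panel
# and supply a valid OAuth 2.0 bearer token for your workspace.
rest_endpoint = "https://<your-pipeline-rest-endpoint>"
auth_token = "<your-oauth2-bearer-token>"

# Hypothetical experiment and parameter names; use the names your pipeline defines.
body = json.dumps({
    "ExperimentName": "retrain-from-rest",
    "ParameterAssignments": {"input_data_path": "<new-dataset-path>"},
}).encode("utf-8")

request = urllib.request.Request(
    rest_endpoint,
    data=body,
    method="POST",
    headers={
        "Authorization": f"Bearer {auth_token}",
        "Content-Type": "application/json",
    },
)
# With real values in place, send the request:
# response = urllib.request.urlopen(request)
# print(json.load(response))
```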