articles/machine-learning/how-to-create-image-labeling-projects.md
+27 -17 (27 additions & 17 deletions)
@@ -8,7 +8,7 @@ ms.reviewer: sgilley
 ms.service: machine-learning
 ms.subservice: mldata
 ms.topic: how-to
-ms.date: 10/21/2021
+ms.date: 02/03/2023
 ms.custom: data4ml, ignite-fall-2021, ignite-2022
 ---
 
@@ -72,32 +72,39 @@ In many cases, it's fine to just upload local files. But [Azure Storage Explorer
 
 To create a dataset from data that you've already stored in Azure Blob storage:
 
-1. Select **Create a dataset**> **From datastore**.
-1. Assign a **Name** to your dataset.
+1. Select **+ Create** .
+1. Assign a **Name** to your dataset, and optionally a description.
 1. **Dataset type** is set to file, only file dataset types are supported for images.
-1. Select the datastore.
+1. Select **Next**.
+1. Select **From Azure storage**, then **Next**.
+1. Select the datastore, then select **Next**.
 1. If your data is in a subfolder within your blob storage, choose **Browse** to select the path.
     * Append "/**" to the path to include all the files in subfolders of the selected path.
     * Append "**/*.*" to include all the data in the current container and its subfolders.
-1. (Optional) Provide a description for your dataset.
-1. Select **Next**.
-1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Select **Create**.
+1. Now select the data asset you just created.
 
 ### Create a dataset from uploaded data
 
 To directly upload your data:
 
-1. Select **Create a dataset** > **From local files**.
-1. Assign a **Name** to your dataset.
+1. Select **+ Create**.
+1. Assign a **Name** to your dataset, and optionally a description.
 1. **Dataset type** is set to file, only file dataset types are supported for images.
-1. (Optional) Provide a description for your dataset.
 1. Select **Next**.
-1. (Optional) Select or create a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
-1. Select **Browse** to select the local files or folder(s) to upload.
+1. Select **From local files**, then select **Next**.
+1. (Optional) Select a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
+1. Select **Next**.
+1. Select **Upload > Upload files** or **Upload > Upload folder** to select the local files or folder(s) to upload.
+1. In the browser window, find your files or folder, then select **Open**.
+1. Continue using **Upload** until you have specified all your files/folders.
+1. Check the box **Overwrite if already exists** if you wish. Verify the list of files/folders.
 1. Select **Next**.
 1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
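
Note on the two path suffixes in the "from Azure storage" steps of this hunk ("/**" and "**/*.*"): the sketch below shows how patterns of that shape behave under Python's standard `glob` module on a local folder. It is only an analogy. It assumes the datastore path patterns follow similar recursive-glob semantics, which the diff does not confirm, and the folder layout and the helper names `list_files` and `build_sample_tree` are hypothetical.

```python
import glob
import os
import tempfile


def list_files(root, pattern):
    """Return sorted relative paths of the files under root matching pattern."""
    matches = glob.glob(os.path.join(root, pattern), recursive=True)
    return sorted(
        os.path.relpath(m, root).replace(os.sep, "/")
        for m in matches
        if os.path.isfile(m)  # drop directories, keep files only
    )


def build_sample_tree(root):
    """Create a small, hypothetical folder layout to run the patterns against."""
    for rel in ("top.jpg", "images/a.jpg", "images/cats/b.jpg", "docs/readme.txt"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write("")


with tempfile.TemporaryDirectory() as root:
    build_sample_tree(root)
    # Selected path "images" with "/**" appended: files in it and its subfolders.
    under_images = list_files(root, "images/**")
    # "**/*.*" from the top level: every file with a dot in its name, at any depth.
    everything = list_files(root, "**/*.*")
```

In this local analogy, `under_images` contains only the files below `images`, while `everything` also picks up `top.jpg` and `docs/readme.txt`. Whether the studio resolves the patterns identically should be checked against the service documentation.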
-The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
+The progress charts shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
+
+Below the charts is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.
+
+You also see a distribution of labelers and how many items they've labeled.
 
-The middle section shows the queue of tasks yet to be assigned. When ML assisted labeling is off, this section shows the number of manual tasks to be assigned. When ML assisted labeling is on, this section will also show:
+Finally, in the middle section, there is a table showing a queue of tasks yet to be assigned. When ML assisted labeling is off, this section shows the number of manual tasks to be assigned. When ML assisted labeling is on, this section will also show:
 
 * Tasks containing clustered items in the queue
 * Tasks containing prelabeled items in the queue
 
-Additionally, when ML assisted labeling is enabled, a small progress bar shows when the next training run will occur. The Experiments sections give links for each of the machine learning runs.
+Additionally, when ML assisted labeling is enabled, scroll down to see the ML assisted labeling status. The Jobs sections give links for each of the machine learning runs.
 
 * Training - trains a model to predict the labels
 * Validation - determines whether this model's prediction will be used for pre-labeling the items
 * Inference - prediction run for new items
 * Featurization - clusters items (only for image classification projects)
 
-On the right side is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.
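
Note on the label-distribution paragraph in this hunk: the point that the total number of labels can exceed the total number of items in multi-label projects can be illustrated with a small count. This is a minimal sketch on made-up data. The item names, label names, and the `label_distribution` helper are hypothetical and are not taken from the labeling service or its export format.

```python
from collections import Counter

# Hypothetical completed items in a multi-label project: item name -> labels.
labeled_items = {
    "img_001.jpg": ["cat", "outdoor"],
    "img_002.jpg": ["dog"],
    "img_003.jpg": ["cat", "dog", "indoor"],
}


def label_distribution(items):
    """Count how often each label appears across all completed items."""
    return Counter(label for labels in items.values() for label in labels)


distribution = label_distribution(labeled_items)
total_labels = sum(distribution.values())  # one count per (item, label) pair
total_items = len(labeled_items)           # one count per item
```

Here three items carry six labels in total, so a per-label chart sums to more than the item count. In a single-label project the two totals would match.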
articles/machine-learning/how-to-create-text-labeling-projects.md
+28 -24 (28 additions & 24 deletions)
@@ -8,7 +8,7 @@ ms.reviewer: sgilley
 ms.service: machine-learning
 ms.subservice: mldata
 ms.topic: how-to
-ms.date: 09/29/2022
+ms.date: 02/03/2023
 ms.custom: data4ml, ignite-fall-2021
 ---
 
@@ -72,39 +72,41 @@ In many cases, it's fine to just upload local files. But [Azure Storage Explorer
 
 To create a dataset from data that you've already stored in Azure Blob storage:
 
-1. Select **Create a dataset**> **From datastore**.
-1. Assign a **Name** to your dataset.
+1. Select **+ Create** .
+1. Assign a **Name** to your dataset, and optionally a description.
 1. Choose the **Dataset type**:
     * Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response.
     * Select **File** if you're using separate .txt files for each response.
-1. (Optional) Provide a description for your dataset.
 1. Select **Next**.
-1. Select the datastore.
+1. Select **From Azure storage**, then **Next**.
+1. Select the datastore, then select **Next**.
 1. If your data is in a subfolder within your blob storage, choose **Browse** to select the path.
     * Append "/**" to the path to include all the files in subfolders of the selected path.
     * Append "**/*.*" to include all the data in the current container and its subfolders.
-1. Select **Next**.
-1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Select **Create**.
+1. Now select the data asset you just created.
 
 ### Create a dataset from uploaded data
 
 To directly upload your data:
 
-1. Select **Create a dataset** > **From local files**.
-1. Assign a **Name** to your dataset.
-1. Choose the **Dataset type**.
-    * Select **Tabular** if you're using a .csv or .tsv file, where each row is a response.
+1. Select **+ Create**.
+1. Assign a **Name** to your dataset, and optionally a description.
+1. Choose the **Dataset type**:
+    * Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response.
     * Select **File** if you're using separate .txt files for each response.
-1. (Optional) Provide a description of your dataset.
-1. Select **Next**
-1. (Optional) Select or create a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
-1. Select **Upload** to select the local file(s) or folder(s) to upload.
 1. Select **Next**.
-1. If uploading .csv or .tsv files:
-    * Confirm the settings and preview, select **Next**.
-    * Include all columns of text you'd like the labeler to see when classifying that row. If you'll be using ML assisted labeling, adding numeric columns may degrade the ML assist model.
-    * Select **Next**.
-1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Select **From local files**, then select **Next**.
+1. (Optional) Select a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
+1. Select **Next**.
+1. Select **Upload > Upload files** or **Upload > Upload folder** to select the local files or folder(s) to upload.
+1. In the browser window, find your files or folder, then select **Open**.
+1. Continue using **Upload** until you have specified all your files/folders.
+1. Check the box **Overwrite if already exists** if you wish. Verify the list of files/folders.
+1. Select **Next**.
+1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Now select the data asset you just created.
+
 
 
 ## Configure incremental refresh
@@ -174,15 +176,17 @@ The **Dashboard** tab shows the progress of the labeling task.
 
 :::image type="content" source="./media/how-to-create-text-labeling-projects/text-labeling-dashboard.png" alt-text="Text data labeling dashboard":::
 
+The progress charts shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
 
-The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
+Below the charts is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.
 
-The middle section shows the queue of tasks yet to be assigned. If ML-assisted labeling is on, you'll also see the number of pre-labeled items.
+You also see a distribution of labelers and how many items they've labeled.
 
+Finally, in the middle section, there is a table showing a queue of tasks yet to be assigned. When ML assisted labeling is off, this section shows the number of manual tasks to be assigned.
 
-On the right side is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.
+Additionally, when ML assisted labeling is enabled, scroll down to see the ML assisted labeling status. The Jobs sections give links for each of the machine learning runs.
 
-### Data tab
+### Data
 
 On the **Data** tab, you can see your dataset and review labeled data. Scroll through the labeled data to see the labels. If you see incorrectly labeled data, select it and choose **Reject**, which will remove the labels and put the data back into the unlabeled queue.