Commit 81cca36

Merge pull request #226280 from sdgilley/sdg-updates
Freshness pass - create image and text labeling articles
2 parents cc2b3e6 + ac3fb00 commit 81cca36

6 files changed: +55 -41 lines changed

articles/machine-learning/how-to-create-image-labeling-projects.md

Lines changed: 27 additions & 17 deletions
@@ -8,7 +8,7 @@ ms.reviewer: sgilley
 ms.service: machine-learning
 ms.subservice: mldata
 ms.topic: how-to
-ms.date: 10/21/2021
+ms.date: 02/03/2023
 ms.custom: data4ml, ignite-fall-2021, ignite-2022
 ---

@@ -72,32 +72,39 @@ In many cases, it's fine to just upload local files. But [Azure Storage Explorer

 To create a dataset from data that you've already stored in Azure Blob storage:

-1. Select **Create a dataset** > **From datastore**.
-1. Assign a **Name** to your dataset.
+1. Select **+ Create** .
+1. Assign a **Name** to your dataset, and optionally a description.
 1. **Dataset type** is set to file, only file dataset types are supported for images.
-1. Select the datastore.
+1. Select **Next**.
+1. Select **From Azure storage**, then **Next**.
+1. Select the datastore, then select **Next**.
 1. If your data is in a subfolder within your blob storage, choose **Browse** to select the path.
 * Append "/**" to the path to include all the files in subfolders of the selected path.
 * Append "**/*.*" to include all the data in the current container and its subfolders.
-1. (Optional) Provide a description for your dataset.
-1. Select **Next**.
-1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Select **Create**.
+1. Now select the data asset you just created.

 ### Create a dataset from uploaded data

 To directly upload your data:

-1. Select **Create a dataset** > **From local files**.
-1. Assign a **Name** to your dataset.
+1. Select **+ Create**.
+1. Assign a **Name** to your dataset, and optionally a description.
 1. **Dataset type** is set to file, only file dataset types are supported for images.
-1. (Optional) Provide a description for your dataset.
 1. Select **Next**.
-1. (Optional) Select or create a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
-1. Select **Browse** to select the local files or folder(s) to upload.
+1. Select **From local files**, then select **Next**.
+1. (Optional) Select a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
+1. Select **Next**.
+1. Select **Upload > Upload files** or **Upload > Upload folder** to select the local files or folder(s) to upload.
+1. In the browser window, find your files or folder, then select **Open**.
+1. Continue using **Upload** until you have specified all your files/folders.
+1. Check the box **Overwrite if already exists** if you wish. Verify the list of files/folders.
 1. Select **Next**.
 1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Now select the data asset you just created.

-## <a name="incremental-refresh"> </a> Configure incremental refresh
+
+## Configure incremental refresh

 [!INCLUDE [refresh](../../includes/machine-learning-data-labeling-refresh.md)]

@@ -180,21 +187,24 @@ The **Dashboard** tab shows the progress of the labeling task.

 :::image type="content" source="./media/how-to-create-labeling-projects/labeling-dashboard.png" alt-text="Data labeling dashboard":::

-The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
+The progress charts shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
+
+Below the charts is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.
+
+You also see a distribution of labelers and how many items they've labeled.

-The middle section shows the queue of tasks yet to be assigned. When ML assisted labeling is off, this section shows the number of manual tasks to be assigned. When ML assisted labeling is on, this section will also show:
+Finally, in the middle section, there is a table showing a queue of tasks yet to be assigned. When ML assisted labeling is off, this section shows the number of manual tasks to be assigned. When ML assisted labeling is on, this section will also show:

 * Tasks containing clustered items in the queue
 * Tasks containing prelabeled items in the queue

-Additionally, when ML assisted labeling is enabled, a small progress bar shows when the next training run will occur. The Experiments sections give links for each of the machine learning runs.
+Additionally, when ML assisted labeling is enabled, scroll down to see the ML assisted labeling status. The Jobs sections give links for each of the machine learning runs.

 * Training - trains a model to predict the labels
 * Validation - determines whether this model's prediction will be used for pre-labeling the items
 * Inference - prediction run for new items
 * Featurization - clusters items (only for image classification projects)

-On the right side is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.

 ### Data tab

articles/machine-learning/how-to-create-text-labeling-projects.md

Lines changed: 28 additions & 24 deletions
@@ -8,7 +8,7 @@ ms.reviewer: sgilley
 ms.service: machine-learning
 ms.subservice: mldata
 ms.topic: how-to
-ms.date: 09/29/2022
+ms.date: 02/03/2023
 ms.custom: data4ml, ignite-fall-2021
 ---

@@ -72,39 +72,41 @@ In many cases, it's fine to just upload local files. But [Azure Storage Explorer

 To create a dataset from data that you've already stored in Azure Blob storage:

-1. Select **Create a dataset** > **From datastore**.
-1. Assign a **Name** to your dataset.
+1. Select **+ Create** .
+1. Assign a **Name** to your dataset, and optionally a description.
 1. Choose the **Dataset type**:
 * Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response.
 * Select **File** if you're using separate .txt files for each response.
-1. (Optional) Provide a description for your dataset.
 1. Select **Next**.
-1. Select the datastore.
+1. Select **From Azure storage**, then **Next**.
+1. Select the datastore, then select **Next**.
 1. If your data is in a subfolder within your blob storage, choose **Browse** to select the path.
 * Append "/**" to the path to include all the files in subfolders of the selected path.
 * Append "**/*.*" to include all the data in the current container and its subfolders.
-1. Select **Next**.
-1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Select **Create**.
+1. Now select the data asset you just created.

 ### Create a dataset from uploaded data

 To directly upload your data:

-1. Select **Create a dataset** > **From local files**.
-1. Assign a **Name** to your dataset.
-1. Choose the **Dataset type**.
-* Select **Tabular** if you're using a .csv or .tsv file, where each row is a response.
+1. Select **+ Create**.
+1. Assign a **Name** to your dataset, and optionally a description.
+1. Choose the **Dataset type**:
+* Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response.
 * Select **File** if you're using separate .txt files for each response.
-1. (Optional) Provide a description of your dataset.
-1. Select **Next**
-1. (Optional) Select or create a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
-1. Select **Upload** to select the local file(s) or folder(s) to upload.
 1. Select **Next**.
-1. If uploading .csv or .tsv files:
-* Confirm the settings and preview, select **Next**.
-* Include all columns of text you'd like the labeler to see when classifying that row. If you'll be using ML assisted labeling, adding numeric columns may degrade the ML assist model.
-* Select **Next**.
-1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Select **From local files**, then select **Next**.
+1. (Optional) Select a datastore. Or keep the default to upload to the default blob store ("workspaceblobstore") of your Machine Learning workspace.
+1. Select **Next**.
+1. Select **Upload > Upload files** or **Upload > Upload folder** to select the local files or folder(s) to upload.
+1. In the browser window, find your files or folder, then select **Open**.
+1. Continue using **Upload** until you have specified all your files/folders.
+1. Check the box **Overwrite if already exists** if you wish. Verify the list of files/folders.
+1. Select **Next**.
+1. Confirm the details. Select **Back** to modify the settings or **Create** to create the dataset.
+1. Now select the data asset you just created.
+


 ## Configure incremental refresh
@@ -174,15 +176,17 @@ The **Dashboard** tab shows the progress of the labeling task.

 :::image type="content" source="./media/how-to-create-text-labeling-projects/text-labeling-dashboard.png" alt-text="Text data labeling dashboard":::

+The progress charts shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.

-The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
+Below the charts is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.

-The middle section shows the queue of tasks yet to be assigned. If ML-assisted labeling is on, you'll also see the number of pre-labeled items.
+You also see a distribution of labelers and how many items they've labeled.

+Finally, in the middle section, there is a table showing a queue of tasks yet to be assigned. When ML assisted labeling is off, this section shows the number of manual tasks to be assigned.

-On the right side is a distribution of the labels for those tasks that are complete. Remember that in some project types, an item can have multiple labels, in which case the total number of labels can be greater than the total number items.
+Additionally, when ML assisted labeling is enabled, scroll down to see the ML assisted labeling status. The Jobs sections give links for each of the machine learning runs.

-### Data tab
+### Data

 On the **Data** tab, you can see your dataset and review labeled data. Scroll through the labeled data to see the labels. If you see incorrectly labeled data, select it and choose **Reject**, which will remove the labels and put the data back into the unlabeled queue.
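
Reviewer note: both articles in this diff state that in multi-label projects the total number of labels can exceed the total number of items. The sketch below is only an illustration of that arithmetic. The item names, label names, and the `label_distribution` helper are hypothetical and are not part of Azure Machine Learning or of this commit.

```python
from collections import Counter

# Hypothetical labeled items: each item maps to the list of labels it received.
# In a multi-label project, one item can carry more than one label.
labeled_items = {
    "img_001.jpg": ["cat"],
    "img_002.jpg": ["cat", "dog"],
    "img_003.jpg": ["dog", "outdoor"],
}


def label_distribution(items):
    """Count how many times each label was applied across all items."""
    return Counter(label for labels in items.values() for label in labels)


distribution = label_distribution(labeled_items)
total_labels = sum(distribution.values())  # one count per applied label
total_items = len(labeled_items)           # one count per labeled item

print(f"items: {total_items}, labels: {total_labels}")
```

Here three items carry five labels in total, which is why a label distribution chart can sum to more than the item count shown in a progress chart.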
