Commit a15d0ac

Update label-tool.md
1 parent 66a0fbd commit a15d0ac

File tree: 1 file changed (+19, -25 lines)


articles/applied-ai-services/form-recognizer/label-tool.md

Lines changed: 19 additions & 25 deletions
@@ -1,13 +1,13 @@
---
title: "How-to: Analyze documents, Label forms, train a model, and analyze forms with Form Recognizer"
titleSuffix: Azure Applied AI Services
-description: In this how-to, you'll use the Form Recognizer sample tool to analyze documents, invoices, receipts etc. Label and create a custom model to extract text, tables, selection marks, structure and key-value pairs from documents.
+description: How to use the Form Recognizer sample tool to analyze documents, invoices, receipts etc. Label and create a custom model to extract text, tables, selection marks, structure and key-value pairs from documents.
author: laujan
manager: nitinme
ms.service: applied-ai-services
ms.subservice: forms-recognizer
ms.topic: how-to
-ms.date: 11/02/2021
+ms.date: 06/23/2022
ms.author: lajanuar
ms.custom: cog-serv-seo-aug-2020, ignite-fall-2021
keywords: document processing
@@ -18,23 +18,17 @@ keywords: document processing
<!-- markdownlint-disable MD034 -->
# Train a custom model using the Sample Labeling tool

->[!TIP]
-> * For an enhanced experience and advanced model quality, try the [Form Recognizer v3.0 Studio (preview)](https://formrecognizer.appliedai.azure.com/studio).
-> * The v3.0 Studio supports any model trained with v2.1 labeled data.
-> * You can refer to the API migration guide for detailed information about migrating from v2.1 to v3.0.
-> * *See* our [**REST API**](quickstarts/try-v3-rest-api.md) or [**C#**](quickstarts/try-v3-csharp-sdk.md), [**Java**](quickstarts/try-v3-java-sdk.md), [**JavaScript**](quickstarts/try-v3-javascript-sdk.md), or [Python](quickstarts/try-v3-python-sdk) SDK quickstarts to get started with the V3.0 preview.
-
-In this article, you'll use the Form Recognizer REST API with the Sample Labeling tool to train a custom model with manually labeled data.
+In this article, you'll use the Form Recognizer REST API with the Sample Labeling tool to train a custom model with manually labeled data.

> [!VIDEO https://docs.microsoft.com/Shows/Docs-Azure/Azure-Form-Recognizer/player]

## Prerequisites

-To complete this quickstart, you must have:
+You'll need the following resources to complete this project:

* Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services)
* Once you have your Azure subscription, <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer" title="Create a Form Recognizer resource" target="_blank">create a Form Recognizer resource </a> in the Azure portal to get your key and endpoint. After it deploys, select **Go to resource**.
-* You will need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart.
+* You'll need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart.
* You can use the free pricing tier (`F0`) to try the service, and upgrade later to a paid tier for production.
* A set of at least six forms of the same type. You'll use this data to train the model and test a form. You can use a [sample data set](https://go.microsoft.com/fwlink/?linkid=2090451) (download and extract *sample_data.zip*) for this quickstart. Upload the training files to the root of a blob storage container in a standard-performance-tier Azure Storage account.

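The bullets above mention pasting the key and endpoint into code. As a minimal, hedged sketch of that wiring with the Python `azure-ai-formrecognizer` package (the 3.1.x SDK generation targets the v2.1 API used in this article; the endpoint, key, and package version shown are illustrative placeholders, not values from this page):

```python
# pip install azure-ai-formrecognizer==3.1.2   # SDK generation that targets the v2.1 API (assumed version)
from azure.ai.formrecognizer import FormRecognizerClient, FormTrainingClient
from azure.core.credentials import AzureKeyCredential

# Placeholder values -- copy these from your Form Recognizer resource in the Azure portal.
endpoint = "https://<your-resource-name>.cognitiveservices.azure.com/"
key = "<your-key>"

credential = AzureKeyCredential(key)
training_client = FormTrainingClient(endpoint, credential)   # trains custom models
form_client = FormRecognizerClient(endpoint, credential)     # analyzes forms with a trained model
```

The training client drives the training step later in this article; the recognizer client is used once you have a trained model ID.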
@@ -49,7 +43,7 @@ Try out the [**Form Recognizer Sample Labeling tool**](https://fott-2-1.azureweb
> [!div class="nextstepaction"]
> [Try Prebuilt Models](https://fott-2-1.azurewebsites.net/)

-You will need an Azure subscription ([create one for free](https://azure.microsoft.com/free/cognitive-services)) and a [Form Recognizer resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer) endpoint and key to try out the Form Recognizer service.
+You'll need an Azure subscription ([create one for free](https://azure.microsoft.com/free/cognitive-services)) and a [Form Recognizer resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer) endpoint and key to try out the Form Recognizer service.

## Set up the Sample Labeling tool

@@ -150,15 +144,15 @@ When you create or open a project, the main tag editor window opens. The tag edi
* The main editor pane that allows you to apply tags.
* The tags editor pane that allows users to modify, lock, reorder, and delete tags.

-### Identify text and tables
+### Identify text and tables

Select **Run Layout on unvisited documents** on the left pane to get the text and table layout information for each document. The labeling tool will draw bounding boxes around each text element.

-The labeling tool will also show which tables have been automatically extracted. Select the table/grid icon on the left hand of the document to see the extracted table. In this quickstart, because the table content is automatically extracted, we will not be labeling the table content, but rather rely on the automated extraction.
+The labeling tool will also show which tables have been automatically extracted. Select the table/grid icon on the left hand of the document to see the extracted table. In this quickstart, because the table content is automatically extracted, we won't be labeling the table content, but rather rely on the automated extraction.

:::image type="content" source="media/label-tool/table-extraction.png" alt-text="Table visualization in Sample Labeling tool.":::

-In v2.1, if your training document does not have a value filled in, you can draw a box where the value should be. Use **Draw region** on the upper left corner of the window to make the region taggable.
+In v2.1, if your training document doesn't have a value filled in, you can draw a box where the value should be. Use **Draw region** on the upper left corner of the window to make the region taggable.

### Apply labels to text

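**Run Layout on unvisited documents** runs the service's layout analysis over each file. If you want to reproduce the same text and table extraction programmatically, here's a hedged sketch using the Python SDK's `begin_recognize_content` call; the file name, endpoint, and key are placeholders, not values from this article:

```python
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

form_client = FormRecognizerClient("https://<your-resource-name>.cognitiveservices.azure.com/",
                                   AzureKeyCredential("<your-key>"))

# Layout analysis: extracts lines, tables, and selection marks with bounding boxes.
with open("sample_form.pdf", "rb") as fd:          # placeholder file name
    poller = form_client.begin_recognize_content(fd)
pages = poller.result()

for page in pages:
    print(f"Page {page.page_number}: {len(page.lines)} lines, {len(page.tables)} tables")
    for table in page.tables:
        for cell in table.cells:
            print(f"  cell[{cell.row_index}][{cell.column_index}] = {cell.text!r}")
```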
@@ -201,16 +195,16 @@ The following value types and variations are currently supported:

* `number`
  * default, `currency`
-    * Formatted as a Floating point value.
-    * Example:1234.98 on the document will be formatted into 1234.98 on the output
+    * Formatted as a Floating point value.
+    * Example: 1234.98 on the document will be formatted into 1234.98 on the output

* `date`
  * default, `dmy`, `mdy`, `ymd`

* `time`
* `integer`
-    * Formatted as a Integer value.
-    * Example:1234.98 on the document will be formatted into 123498 on the output
+    * Formatted as an integer value.
+    * Example: 1234.98 on the document will be formatted into 123498 on the output.
* `selectionMark`

> [!NOTE]
@@ -242,11 +236,11 @@ The following value types and variations are currently supported:

### Label tables (v2.1 only)

-At times, your data might lend itself better to being labeled as a table rather than key-value pairs. In this case, you can create a table tag by clicking on "Add a new table tag," specify whether the table will have a fixed number of rows or variable number of rows depending on the document, and define the schema.
+At times, your data might lend itself better to being labeled as a table rather than key-value pairs. In this case, you can create a table tag by selecting **Add a new table tag**. Specify whether the table will have a fixed number of rows or variable number of rows depending on the document and define the schema.

:::image type="content" source="media/label-tool/table-tag.png" alt-text="Configuring a table tag.":::

-Once you have defined your table tag, tag the cell values.
+Once you've defined your table tag, tag the cell values.

:::image type="content" source="media/table-labeling.png" alt-text="Labeling a table.":::

@@ -255,7 +249,7 @@ Once you have defined your table tag, tag the cell values.
Choose the Train icon on the left pane to open the Training page. Then select the **Train** button to begin training the model. Once the training process completes, you'll see the following information:

* **Model ID** - The ID of the model that was created and trained. Each training call creates a new model with its own ID. Copy this string to a secure location; you'll need it if you want to do prediction calls through the [REST API](./quickstarts/try-sdk-rest-api.md?pivots=programming-language-rest-api&tabs=preview%2cv2-1) or [client library guide](./quickstarts/try-sdk-rest-api.md).
-* **Average Accuracy** - The model's average accuracy. You can improve model accuracy by labeling additional forms and retraining to create a new model. We recommend starting by labeling five forms and adding more forms as needed.
+* **Average Accuracy** - The model's average accuracy. You can improve model accuracy by adding and labeling more forms, then retraining to create a new model. We recommend starting by labeling five forms and adding more forms as needed.
* The list of tags, and the estimated accuracy per tag.

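The **Train** button issues the v2.1 train-with-labels operation against your storage container. A hedged sketch of the equivalent call with the Python SDK, printing the model ID and the estimated per-tag accuracy described above (the SAS URL, endpoint, and key are placeholders, not values from this article):

```python
from azure.ai.formrecognizer import FormTrainingClient
from azure.core.credentials import AzureKeyCredential

training_client = FormTrainingClient("https://<your-resource-name>.cognitiveservices.azure.com/",
                                     AzureKeyCredential("<your-key>"))

# Placeholder SAS URL for the blob container that holds your forms
# plus the label files written by the Sample Labeling tool.
container_sas_url = "<blob-container-shared-access-signature-url>"

poller = training_client.begin_training(container_sas_url, use_training_labels=True)
model = poller.result()

print("Model ID:", model.model_id)            # keep this for prediction calls
for submodel in model.submodels:
    for name, field in submodel.fields.items():
        print(f"  tag {name}: estimated accuracy {field.accuracy}")
```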
@@ -286,7 +280,7 @@ Select the Analyze (light bulb) icon on the left to test your model. Select sour

## Improve results

-Depending on the reported accuracy, you may want to do further training to improve the model. After you've done a prediction, examine the confidence values for each of the applied tags. If the average accuracy training value was high, but the confidence scores are low (or the results are inaccurate), you should add the prediction file to the training set, label it, and train again.
+Depending on the reported accuracy, you may want to do further training to improve the model. After you've done a prediction, examine the confidence values for each of the applied tags. If the average accuracy training value is high, but the confidence scores are low (or the results are inaccurate), add the prediction file to the training set, label it, and train again.

The reported average accuracy, confidence scores, and actual accuracy can be inconsistent when the analyzed documents differ from documents used in training. Keep in mind that some documents look similar when viewed by people but can look distinct to the AI model. For example, you might train with a form type that has two variations, where the training set consists of 20% variation A and 80% variation B. During prediction, the confidence scores for documents of variation A are likely to be lower.

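To inspect those confidence values outside the tool, here's a hedged sketch of a prediction call with the Python SDK that prints the value and confidence for each applied tag; the model ID, file name, endpoint, and key are placeholders:

```python
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

form_client = FormRecognizerClient("https://<your-resource-name>.cognitiveservices.azure.com/",
                                   AzureKeyCredential("<your-key>"))

model_id = "<model-id-from-training>"          # placeholder
with open("test_form.pdf", "rb") as fd:        # placeholder test document
    poller = form_client.begin_recognize_custom_forms(model_id=model_id, form=fd)

for form in poller.result():
    for name, field in form.fields.items():
        # Low confidence on a high-accuracy model is the signal to label this file and retrain.
        print(f"{name}: value={field.value!r}, confidence={field.confidence}")
```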
@@ -300,11 +294,11 @@ Go to your project settings page (slider icon) and take note of the security tok

### Restore project credentials

-When you want to resume your project, you first need to create a connection to the same blob storage container. To do so, repeat the steps above. Then, go to the application settings page (gear icon) and see if your project's security token is there. If it isn't, add a new security token and copy over your token name and key from the previous step. Select **Save** to retain your settings..
+When you want to resume your project, you first need to create a connection to the same blob storage container. To do so, repeat the steps above. Then, go to the application settings page (gear icon) and see if your project's security token is there. If it isn't, add a new security token and copy over your token name and key from the previous step. Select **Save** to retain your settings.

### Resume a project

-Finally, go to the main page (house icon) and select **Open Cloud Project**. Then select the blob storage connection, and select your project's **.fott** file. The application will load all of the project's settings because it has the security token.
+Finally, go to the main page (house icon) and select **Open Cloud Project**. Then select the blob storage connection, and select your project's `.fott` file. The application will load all of the project's settings because it has the security token.

## Next steps