Commit 1b1f469

Merge pull request #202692 from laujan/patch-101
Update try-sample-label-tool.md
2 parents 1b80b07 + 4118005 commit 1b1f469

File tree: 4 files changed, +43 −24 lines


articles/applied-ai-services/form-recognizer/deploy-label-tool.md

Lines changed: 7 additions & 0 deletions
@@ -13,6 +13,13 @@ ms.author: lajanuar
 
 # Deploy the Sample Labeling tool
 
+>[!TIP]
+>
+> * For an enhanced experience and advanced model quality, try the [Form Recognizer v3.0 Studio (preview)](https://formrecognizer.appliedai.azure.com/studio).
+> * The v3.0 Studio supports any model trained with v2.1 labeled data.
+> * You can refer to the API migration guide for detailed information about migrating from v2.1 to v3.0.
+> * *See* our [**REST API**](quickstarts/try-v3-rest-api.md) or [**C#**](quickstarts/try-v3-csharp-sdk.md), [**Java**](quickstarts/try-v3-java-sdk.md), [**JavaScript**](quickstarts/try-v3-javascript-sdk.md), or [Python](quickstarts/try-v3-python-sdk.md) SDK quickstarts to get started with the V3.0 preview.
+
 > [!NOTE]
 > The [cloud hosted](https://fott-2-1.azurewebsites.net/) labeling tool is available at [https://fott-2-1.azurewebsites.net/](https://fott-2-1.azurewebsites.net/). Follow the steps in this document only if you want to deploy the sample labeling tool for yourself.

articles/applied-ai-services/form-recognizer/label-tool.md

Lines changed: 19 additions & 19 deletions
@@ -1,13 +1,13 @@
 ---
 title: "How-to: Analyze documents, Label forms, train a model, and analyze forms with Form Recognizer"
 titleSuffix: Azure Applied AI Services
-description: In this how-to, you'll use the Form Recognizer sample tool to analyze documents, invoices, receipts etc. Label and create a custom model to extract text, tables, selection marks, structure and key-value pairs from documents.
+description: How to use the Form Recognizer sample tool to analyze documents, invoices, receipts etc. Label and create a custom model to extract text, tables, selection marks, structure and key-value pairs from documents.
 author: laujan
 manager: nitinme
 ms.service: applied-ai-services
 ms.subservice: forms-recognizer
 ms.topic: how-to
-ms.date: 11/02/2021
+ms.date: 06/23/2022
 ms.author: lajanuar
 ms.custom: cog-serv-seo-aug-2020, ignite-fall-2021
 keywords: document processing
@@ -18,17 +18,17 @@ keywords: document processing
 <!-- markdownlint-disable MD034 -->
 # Train a custom model using the Sample Labeling tool
 
-In this article, you'll use the Form Recognizer REST API with the Sample Labeling tool to train a custom model with manually labeled data.
+In this article, you'll use the Form Recognizer REST API with the Sample Labeling tool to train a custom model with manually labeled data.
 
 > [!VIDEO https://docs.microsoft.com/Shows/Docs-Azure/Azure-Form-Recognizer/player]
 
 ## Prerequisites
 
-To complete this quickstart, you must have:
+You'll need the following resources to complete this project:
 
 * Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services)
 * Once you have your Azure subscription, <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer" title="Create a Form Recognizer resource" target="_blank">create a Form Recognizer resource </a> in the Azure portal to get your key and endpoint. After it deploys, select **Go to resource**.
-* You will need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart.
+* You'll need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart.
 * You can use the free pricing tier (`F0`) to try the service, and upgrade later to a paid tier for production.
 * A set of at least six forms of the same type. You'll use this data to train the model and test a form. You can use a [sample data set](https://go.microsoft.com/fwlink/?linkid=2090451) (download and extract *sample_data.zip*) for this quickstart. Upload the training files to the root of a blob storage container in a standard-performance-tier Azure Storage account.
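The article being updated here drives training through the Sample Labeling tool UI, but the same v2.1 training operation can be invoked directly over REST. A minimal sketch, assuming hypothetical placeholder values for the endpoint, key, and blob-container SAS URL (replace with your own):

```python
import json
import time
import urllib.request

# Hypothetical placeholders -- substitute your own resource values.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-key>"
SAS_URL = "<sas-url-of-training-container>"


def build_train_body(source_sas_url: str, use_label_file: bool = True) -> bytes:
    """JSON body for the v2.1 Train Custom Model request."""
    return json.dumps({"source": source_sas_url, "useLabelFile": use_label_file}).encode()


def train_custom_model() -> str:
    """Kick off training, then poll the returned model URL until training finishes."""
    request = urllib.request.Request(
        f"{ENDPOINT}/formrecognizer/v2.1/custom/models",
        data=build_train_body(SAS_URL),
        headers={"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        model_url = response.headers["Location"]  # URL of the newly created model

    while True:
        poll = urllib.request.Request(model_url, headers={"Ocp-Apim-Subscription-Key": KEY})
        with urllib.request.urlopen(poll) as response:
            info = json.load(response)["modelInfo"]
        if info["status"] != "creating":  # "ready" or "invalid"
            return info["modelId"]
        time.sleep(5)

# With real credentials you would call: model_id = train_custom_model()
```

This mirrors the flow the SDK quickstarts wrap for you; only the request body and polling loop are shown here.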

@@ -43,7 +43,7 @@ Try out the [**Form Recognizer Sample Labeling tool**](https://fott-2-1.azureweb
 > [!div class="nextstepaction"]
 > [Try Prebuilt Models](https://fott-2-1.azurewebsites.net/)
 
-You will need an Azure subscription ([create one for free](https://azure.microsoft.com/free/cognitive-services)) and a [Form Recognizer resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer) endpoint and key to try out the Form Recognizer service.
+You'll need an Azure subscription ([create one for free](https://azure.microsoft.com/free/cognitive-services)) and a [Form Recognizer resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer) endpoint and key to try out the Form Recognizer service.
 
 ## Set up the Sample Labeling tool

@@ -144,15 +144,15 @@ When you create or open a project, the main tag editor window opens. The tag edi
 * The main editor pane that allows you to apply tags.
 * The tags editor pane that allows users to modify, lock, reorder, and delete tags.
 
-### Identify text and tables
+### Identify text and tables
 
 Select **Run Layout on unvisited documents** on the left pane to get the text and table layout information for each document. The labeling tool will draw bounding boxes around each text element.
 
-The labeling tool will also show which tables have been automatically extracted. Select the table/grid icon on the left hand of the document to see the extracted table. In this quickstart, because the table content is automatically extracted, we will not be labeling the table content, but rather rely on the automated extraction.
+The labeling tool will also show which tables have been automatically extracted. Select the table/grid icon on the left hand of the document to see the extracted table. In this quickstart, because the table content is automatically extracted, we won't be labeling the table content, but rather rely on the automated extraction.
 
 :::image type="content" source="media/label-tool/table-extraction.png" alt-text="Table visualization in Sample Labeling tool.":::
 
-In v2.1, if your training document does not have a value filled in, you can draw a box where the value should be. Use **Draw region** on the upper left corner of the window to make the region taggable.
+In v2.1, if your training document doesn't have a value filled in, you can draw a box where the value should be. Use **Draw region** on the upper left corner of the window to make the region taggable.
 
 ### Apply labels to text

@@ -195,16 +195,16 @@ The following value types and variations are currently supported:
 
 * `number`
     * default, `currency`
-        * Formatted as a Floating point value.
-        * Example:1234.98 on the document will be formatted into 1234.98 on the output
+        * Formatted as a Floating point value.
+        * Example: 1234.98 on the document will be formatted into 1234.98 on the output
 
 * `date`
     * default, `dmy`, `mdy`, `ymd`
 
 * `time`
 * `integer`
-    * Formatted as a Integer value.
-    * Example:1234.98 on the document will be formatted into 123498 on the output
+    * Formatted as an integer value.
+    * Example: 1234.98 on the document will be formatted into 123498 on the output.
 * `selectionMark`
 
 > [!NOTE]
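The `number`/`integer` examples in the hunk above can be mimicked with a toy post-processing function. This is only an illustration of the stated formatting rules, not the service's actual implementation:

```python
import re


def format_value(raw: str, value_type: str):
    """Toy illustration of the value-type formatting rules described above."""
    if value_type == "number":
        # number -> floating point value: "1234.98" stays 1234.98
        return float(re.sub(r"[^0-9.\-]", "", raw))
    if value_type == "integer":
        # integer -> digits only: "1234.98" becomes 123498
        return int(re.sub(r"[^0-9\-]", "", raw))
    return raw


print(format_value("1234.98", "number"))   # -> 1234.98
print(format_value("1234.98", "integer"))  # -> 123498
```

Note how the same text on the document yields different output values depending on the tag's value type, which is why choosing `number` vs. `integer` matters when labeling.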
@@ -236,11 +236,11 @@ The following value types and variations are currently supported:
 
 ### Label tables (v2.1 only)
 
-At times, your data might lend itself better to being labeled as a table rather than key-value pairs. In this case, you can create a table tag by clicking on "Add a new table tag," specify whether the table will have a fixed number of rows or variable number of rows depending on the document, and define the schema.
+At times, your data might lend itself better to being labeled as a table rather than key-value pairs. In this case, you can create a table tag by selecting **Add a new table tag**. Specify whether the table will have a fixed number of rows or variable number of rows depending on the document and define the schema.
 
 :::image type="content" source="media/label-tool/table-tag.png" alt-text="Configuring a table tag.":::
 
-Once you have defined your table tag, tag the cell values.
+Once you've defined your table tag, tag the cell values.
 
 :::image type="content" source="media/table-labeling.png" alt-text="Labeling a table.":::

@@ -249,7 +249,7 @@ Once you have defined your table tag, tag the cell values.
 
 Choose the Train icon on the left pane to open the Training page. Then select the **Train** button to begin training the model. Once the training process completes, you'll see the following information:
 
 * **Model ID** - The ID of the model that was created and trained. Each training call creates a new model with its own ID. Copy this string to a secure location; you'll need it if you want to do prediction calls through the [REST API](./quickstarts/try-sdk-rest-api.md?pivots=programming-language-rest-api&tabs=preview%2cv2-1) or [client library guide](./quickstarts/try-sdk-rest-api.md).
-* **Average Accuracy** - The model's average accuracy. You can improve model accuracy by labeling additional forms and retraining to create a new model. We recommend starting by labeling five forms and adding more forms as needed.
+* **Average Accuracy** - The model's average accuracy. You can improve model accuracy by adding and labeling more forms, then retraining to create a new model. We recommend starting by labeling five forms and adding more forms as needed.
 * The list of tags, and the estimated accuracy per tag.
 
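Once you have the Model ID, prediction calls return a per-field confidence value in the JSON response. A sketch of pulling those out, using a trimmed, hypothetical v2.1 response shape (field names and values here are invented for illustration):

```python
def field_confidences(analyze_result: dict) -> dict:
    """Collect each labeled field's confidence from a v2.1 analyze response
    (the documentResults section of the JSON output)."""
    confidences = {}
    for doc in analyze_result.get("analyzeResult", {}).get("documentResults", []):
        for name, field in (doc.get("fields") or {}).items():
            if field is not None:
                confidences[name] = field.get("confidence")
    return confidences


# A trimmed, hypothetical response for illustration:
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "documentResults": [
            {"fields": {"Total": {"text": "$123.45", "confidence": 0.97},
                        "Date": {"text": "06/23/2022", "confidence": 0.88}}}
        ]
    },
}
print(field_confidences(sample))  # -> {'Total': 0.97, 'Date': 0.88}
```

Inspecting these confidences is exactly the check the "Improve results" section below recommends when deciding whether to label more forms and retrain.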
@@ -280,7 +280,7 @@ Select the Analyze (light bulb) icon on the left to test your model. Select sour
 
 ## Improve results
 
-Depending on the reported accuracy, you may want to do further training to improve the model. After you've done a prediction, examine the confidence values for each of the applied tags. If the average accuracy training value was high, but the confidence scores are low (or the results are inaccurate), you should add the prediction file to the training set, label it, and train again.
+Depending on the reported accuracy, you may want to do further training to improve the model. After you've done a prediction, examine the confidence values for each of the applied tags. If the average accuracy training value is high, but the confidence scores are low (or the results are inaccurate), add the prediction file to the training set, label it, and train again.
 
 The reported average accuracy, confidence scores, and actual accuracy can be inconsistent when the analyzed documents differ from documents used in training. Keep in mind that some documents look similar when viewed by people but can look distinct to the AI model. For example, you might train with a form type that has two variations, where the training set consists of 20% variation A and 80% variation B. During prediction, the confidence scores for documents of variation A are likely to be lower.
 
@@ -294,11 +294,11 @@ Go to your project settings page (slider icon) and take note of the security tok
 
 ### Restore project credentials
 
-When you want to resume your project, you first need to create a connection to the same blob storage container. To do so, repeat the steps above. Then, go to the application settings page (gear icon) and see if your project's security token is there. If it isn't, add a new security token and copy over your token name and key from the previous step. Select **Save** to retain your settings..
+When you want to resume your project, you first need to create a connection to the same blob storage container. To do so, repeat the steps above. Then, go to the application settings page (gear icon) and see if your project's security token is there. If it isn't, add a new security token and copy over your token name and key from the previous step. Select **Save** to retain your settings.
 
 ### Resume a project
 
-Finally, go to the main page (house icon) and select **Open Cloud Project**. Then select the blob storage connection, and select your project's **.fott** file. The application will load all of the project's settings because it has the security token.
+Finally, go to the main page (house icon) and select **Open Cloud Project**. Then select the blob storage connection, and select your project's `.fott` file. The application will load all of the project's settings because it has the security token.
 
 ## Next steps
articles/applied-ai-services/form-recognizer/quickstarts/try-sample-label-tool.md

Lines changed: 9 additions & 4 deletions
@@ -19,7 +19,12 @@ keywords: document processing
 <!-- markdownlint-disable MD029 -->
 # Get started with the Form Recognizer Sample Labeling tool
 
-Azure Form Recognizer is a cloud-based Azure Applied AI Service that uses machine-learning models to extract key-value pairs, text, and tables from your documents. You can use Form Recognizer to automate your data processing in applications and workflows, enhance data-driven strategies, and enrich document search capabilities.
+>[!TIP]
+>
+> * For an enhanced experience and advanced model quality, try the [Form Recognizer v3.0 Studio (preview)](https://formrecognizer.appliedai.azure.com/studio).
+> * The v3.0 Studio supports any model trained with v2.1 labeled data.
+> * You can refer to the API migration guide for detailed information about migrating from v2.1 to v3.0.
+> * *See* our [**REST API**](try-v3-rest-api.md) or [**C#**](try-v3-csharp-sdk.md), [**Java**](try-v3-java-sdk.md), [**JavaScript**](try-v3-javascript-sdk.md), or [Python](try-v3-python-sdk.md) SDK quickstarts to get started with the V3.0 preview.
 
 The Form Recognizer Sample Labeling tool is an open source tool that enables you to test the latest features of Azure Form Recognizer and Optical Character Recognition (OCR) services:

@@ -126,13 +131,13 @@ Train a custom model to analyze and extract data from forms and documents specif
 
 ### Prerequisites for training a custom form model
 
-* An Azure Storage blob container that contains a set of training data. Make sure all the training documents are of the same format. If you have forms in multiple formats, organize them into subfolders based on common format. For this project, you can use our [sample data set](https://github.com/Azure-Samples/cognitive-services-REST-api-samples/blob/master/curl/form-recognizer/sample_data_without_labels.zip).
+* An Azure Storage blob container that contains a set of training data. Make sure all the training documents are of the same format. If you have forms in multiple formats, organize them into subfolders based on common format. For this project, you can use our [sample data set](https://github.com/Azure-Samples/cognitive-services-REST-api-samples/blob/master/curl/form-recognizer/sample_data_without_labels.zip). If you don't know how to create an Azure storage account with a container, follow the [Azure Storage quickstart for Azure portal](../../../storage/blobs/storage-quickstart-blobs-portal.md).
 
 * Configure CORS
 
-[CORS (Cross Origin Resource Sharing)](/rest/api/storageservices/cross-origin-resource-sharing--cors--support-for-the-azure-storage-services) needs to be configured on your Azure storage account for it to be accessible from the Form Recognizer Studio. To configure CORS in the Azure portal, you'll need access to the CORS blade of your storage account.
+[CORS (Cross Origin Resource Sharing)](/rest/api/storageservices/cross-origin-resource-sharing--cors--support-for-the-azure-storage-services) needs to be configured on your Azure storage account for it to be accessible from the Form Recognizer Studio. To configure CORS in the Azure portal, you'll need access to the CORS tab of your storage account.
 
-1. Select the CORS blade for the storage account.
+1. Select the CORS tab for the storage account.
 
    :::image type="content" source="../media/quickstarts/cors-setting-menu.png" alt-text="Screenshot of the CORS setting menu in the Azure portal.":::
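Beyond the portal steps above, the same CORS rule can be applied programmatically. A sketch using the `azure-storage-blob` v12 SDK, assumed installed; the connection string is a placeholder, the origin shown is this article's hosted labeling tool URL, and the parameter names follow the v12 `CorsRule` model:

```python
# Sketch only: assumes `pip install azure-storage-blob` (v12) and a real
# storage connection string in place of the placeholder below.
from azure.storage.blob import BlobServiceClient, CorsRule

client = BlobServiceClient.from_connection_string("<storage-connection-string>")

# Allow the hosted Sample Labeling tool origin to read and write blobs.
rule = CorsRule(
    allowed_origins=["https://fott-2-1.azurewebsites.net"],
    allowed_methods=["GET", "POST", "PUT", "OPTIONS"],
    allowed_headers=["*"],
    exposed_headers=["*"],
    max_age_in_seconds=200,
)
client.set_service_properties(cors=[rule])
```

Either route (portal or SDK) produces the same stored CORS rule on the blob service; use whichever fits your workflow.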

articles/applied-ai-services/form-recognizer/supervised-table-tags.md

Lines changed: 8 additions & 1 deletion
@@ -15,14 +15,21 @@ ms.custom: ignite-fall-2021
 
 # Use table tags to train your custom template model
 
+>[!TIP]
+>
+> * For an enhanced experience and advanced model quality, try the [Form Recognizer v3.0 Studio (preview)](https://formrecognizer.appliedai.azure.com/studio).
+> * The v3.0 Studio supports any model trained with v2.1 labeled data.
+> * You can refer to the API migration guide for detailed information about migrating from v2.1 to v3.0.
+> * *See* our [**REST API**](quickstarts/try-v3-rest-api.md) or [**C#**](quickstarts/try-v3-csharp-sdk.md), [**Java**](quickstarts/try-v3-java-sdk.md), [**JavaScript**](quickstarts/try-v3-javascript-sdk.md), or [Python](quickstarts/try-v3-python-sdk.md) SDK quickstarts to get started with the V3.0 preview.
+
 In this article, you'll learn how to train your custom template model with table tags (labels). Some scenarios require more complex labeling than simply aligning key-value pairs. Such scenarios include extracting information from forms with complex hierarchical structures or encountering items that aren't automatically detected and extracted by the service. In these cases, you can use table tags to train your custom template model.
 
 ## When should I use table tags?
 
 Here are some examples of when using table tags would be appropriate:
 
 - There's data that you wish to extract presented as tables in your forms, and the structure of the tables is meaningful. For instance, each row of the table represents one item and each column of the row represents a specific feature of that item. In this case, you could use a table tag where a column represents features and a row represents information about each feature.
-- There's data you wish to extract that is not presented in specific form fields but semantically, the data could fit in a two-dimensional grid. For instance, your form has a list of people, and includes, a first name, a last name, and an email address. You would like to extract this information. In this case, you could use a table tag with first name, last name, and email address as columns and each row is populated with information about a person from your list.
+- There's data you wish to extract that isn't presented in specific form fields but semantically, the data could fit in a two-dimensional grid. For instance, your form has a list of people and includes a first name, a last name, and an email address. You would like to extract this information. In this case, you could use a table tag with first name, last name, and email address as columns and each row is populated with information about a person from your list.
 
 > [!NOTE]
 > Form Recognizer automatically finds and extracts all tables in your documents whether the tables are tagged or not. Therefore, you don't have to label every table from your form with a table tag, and your table tags don't have to replicate the structure of every table found in your form. Tables extracted automatically by Form Recognizer will be included in the pageResults section of the JSON output.
