Commit d2dd084

Merge pull request #188899 from laujan/vinod-post-release-updates
Vinod post release updates
2 parents 300cb4d + f57e139 commit d2dd084

15 files changed

+158
-32
lines changed

articles/applied-ai-services/form-recognizer/concept-custom-neural.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -119,7 +119,12 @@ https://{endpoint}/formrecognizer/documentModels:build?api-version=2022-01-30-pr
119119
* Train a custom model:
120120

121121
> [!div class="nextstepaction"]
122-
> [Form Recognizer quickstart](quickstarts/try-v3-form-recognizer-studio.md#custom-models)
122+
> [How to train a model](how-to-guides/build-custom-model-v3.md)
123+
124+
* Learn more about custom template models:
125+
126+
> [!div class="nextstepaction"]
127+
> [Custom template models](concept-custom-template.md)
123128
124129
* View the REST API:
125130

articles/applied-ai-services/form-recognizer/concept-custom-template.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -59,10 +59,10 @@ https://{endpoint}/formrecognizer/documentModels:build?api-version=2022-01-30-pr
5959

6060
## Next steps
6161

62-
* Train a custom template model:
62+
* Train a custom model:
6363

6464
> [!div class="nextstepaction"]
65-
> [Form Recognizer quickstart](quickstarts/try-sdk-rest-api.md)
65+
> [How to train a model](how-to-guides/build-custom-model-v3.md)
6666
6767
* Learn more about custom neural models:
6868

articles/applied-ai-services/form-recognizer/concept-custom.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -19,11 +19,11 @@ To create a custom model, you label a dataset of documents with the values you w
1919

2020
## Custom model types
2121

22-
Custom models can be one of two types, [**custom template**](concept-custom-template.md ) or [**custom neural**](concept-custom-neural.md) models. The labeling and training process for both models is identical, but the models differ as follows:
22+
Custom models can be one of two types: [**custom template**](concept-custom-template.md) (also called custom form) models or [**custom neural**](concept-custom-neural.md) (also called custom document) models. The labeling and training process for both models is identical, but the models differ as follows:
2323

2424
### Custom template model
2525

26-
The custom template model relies on a consistent visual template to extract the labeled data. The accuracy of your model is affected by variances in the visual structure of your documents. Questionnaires or application forms are examples of consistent visual templates.Your training set will consist of structured documents where the formatting and layout are static and constant from one document instance to the next. Custom template models support key-value pairs, selection marks, tables, signature fields and regions and can be trained on documents in any of the [supported languages](language-support.md). For more information, *see* [custom template models](concept-custom-template.md ).
26+
The custom template or custom form model relies on a consistent visual template to extract the labeled data. The accuracy of your model is affected by variances in the visual structure of your documents. Structured forms such as questionnaires or applications are examples of consistent visual templates. Your training set will consist of structured documents where the formatting and layout are static and constant from one document instance to the next. Custom template models support key-value pairs, selection marks, tables, signature fields and regions and can be trained on documents in any of the [supported languages](language-support.md). For more information, *see* [custom template models](concept-custom-template.md ).
2727

2828
> [!TIP]
2929
>
@@ -33,7 +33,7 @@ Custom models can be one of two types, [**custom template**](concept-custom-temp
3333
3434
### Custom neural model
3535

36-
The custom neural model is a deep learning model type relies on a base model trained on a large collection of labeled documents using key-value pairs. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents. When choosing between the two model types, start with a neural model if it meets your functional needs. See [neural models](concept-custom-neural.md) to learn more about custom document models.
36+
The custom neural (custom document) model is a deep learning model type that relies on a base model trained on a large collection of documents. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents. When choosing between the two model types, start with a neural model if it meets your functional needs. See [neural models](concept-custom-neural.md) to learn more about custom document models.
3737

3838
## Model features
3939

@@ -51,7 +51,7 @@ The following tools are supported by Form Recognizer v3.0:
5151

5252
| Feature | Resources |
5353
|----------|-------------|
54-
|Custom model| <ul><li>[Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio/customform/projects)</li><li>[REST API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-1/operations/AnalyzeDocument)</li><li>[C# SDK](quickstarts/try-v3-csharp-sdk.md)</li><li>[Python SDK](quickstarts/try-v3-python-sdk.md)</li></ul>|
54+
|Custom model| <ul><li>[Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio/customform/projects)</li><li>[REST API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-2/operations/AnalyzeDocument)</li><li>[C# SDK](quickstarts/try-v3-csharp-sdk.md)</li><li>[Python SDK](quickstarts/try-v3-python-sdk.md)</li></ul>|
5555

5656
### Try Form Recognizer
5757

Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
---
2+
title: "Train a custom model in the Form Recognizer Studio"
3+
titleSuffix: Azure Applied AI Services
4+
description: Learn how to build, label, and train a custom model in the Form Recognizer Studio.
5+
author: laujan
6+
manager: nitinme
7+
ms.service: applied-ai-services
8+
ms.subservice: forms-recognizer
9+
ms.topic: how-to
10+
ms.date: 02/16/2022
11+
ms.author: vikurpad
12+
---
13+
14+
# Build your training data set for a custom model
15+
16+
Form Recognizer models require as few as five training documents, so if you have at least five documents you can start training a custom model. You can train either a [custom template model (custom form)](../concept-custom-template.md) or a [custom neural model (custom document)](../concept-custom-neural.md). The training process is identical for both models, and this document walks you through it.
17+
18+
## Custom model input requirements
19+
20+
First, make sure your training data set follows the input requirements for Form Recognizer.
21+
22+
[!INCLUDE [input requirements](../includes/input-requirements.md)]
23+
24+
## Training data tips
25+
26+
Follow these tips to further optimize your data set for training:
27+
28+
* If possible, use text-based PDF documents instead of image-based documents. Scanned PDFs are handled as images.
29+
* For forms with input fields, use examples that have all of the fields completed.
30+
* Use forms with different values in each field.
31+
* If your form images are of lower quality, use a larger data set (10-15 images, for example).
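The tips above can be turned into a quick local check before you upload anything. The following is a minimal sketch, not part of Form Recognizer: the function name `review_training_set` is hypothetical, and the list of image extensions is an assumption standing in for the formats listed in the input requirements.

```python
from pathlib import PurePosixPath

# Assumption: these extensions stand in for the image formats named in the
# input requirements, which are not reproduced in this article.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}
MIN_DOCUMENTS = 5          # "as few as five training documents"
LOW_QUALITY_MINIMUM = 10   # lower bound of the suggested 10-15 images


def review_training_set(file_names, low_quality=False):
    """Return a list of warnings for a candidate training set (hypothetical helper)."""
    warnings = []
    suffixes = [PurePosixPath(name).suffix.lower() for name in file_names]
    # Only PDFs and known image types count as training documents.
    documents = [s for s in suffixes if s == ".pdf" or s in IMAGE_EXTENSIONS]

    required = LOW_QUALITY_MINIMUM if low_quality else MIN_DOCUMENTS
    if len(documents) < required:
        warnings.append(f"found {len(documents)} documents, want at least {required}")

    # Scanned PDFs are also handled as images, but that can't be detected by name.
    image_count = sum(1 for s in documents if s in IMAGE_EXTENSIONS)
    if image_count:
        warnings.append(
            f"{image_count} image-based documents; prefer text-based PDFs if possible"
        )
    return warnings
```

An empty list means the file names pass these basic checks. It says nothing about whether the fields are completed or whether their values vary between documents, so you still need to review those by eye.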
32+
33+
## Upload your training data
34+
35+
When you've put together the set of forms or documents that you'll use for training, upload it to an Azure blob storage container. If you don't know how to create an Azure storage account with a container, follow the [Azure Storage quickstart for Azure portal](../../../storage/blobs/storage-quickstart-blobs-portal.md). You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
36+
37+
## Create a project in the Form Recognizer Studio
38+
39+
The Form Recognizer Studio orchestrates all the API calls required to create the files that complete your dataset and to train your model.
40+
41+
1. Start by navigating to the [Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio). If this is your first time using the Studio, you'll need to [initialize it for use](../quickstarts/try-v3-form-recognizer-studio.md). Follow the [additional prerequisite for custom projects](../quickstarts/try-v3-form-recognizer-studio.md#additional-prerequisites-for-custom-projects) to configure the Studio to access your training dataset.
42+
43+
1. In the Studio, select the **Custom models** tile. On the custom models page, select the **Create a project** button.
44+
45+
:::image type="content" source="../media/how-to/studio-create-project.png" alt-text="Screenshot: Create a project in the Form Recognizer Studio.":::
46+
47+
1. On the create project dialog, provide a name for your project, optionally a description, and select continue.
48+
49+
1. On the next step in the workflow, choose or create a Form Recognizer resource before you select continue.
50+
51+
> [!IMPORTANT]
52+
> Custom neural models are only available in a few regions. If you plan on training a neural model, select or create a resource in one of [these supported regions](https://aka.ms/fr-neural#l#supported-regions).
53+
54+
:::image type="content" source="../media/how-to/studio-select-resource.png" alt-text="Screenshot: Select the Form Recognizer resource.":::
55+
56+
1. Next, select the storage account where you uploaded the dataset you wish to use to train your custom model. The **Folder path** should be empty if your training documents are in the root of the container. If your documents are in a sub-folder, enter the relative path from the container root in the **Folder path** field. Once your storage account is configured, select continue.
57+
58+
:::image type="content" source="../media/how-to/studio-select-storage.png" alt-text="Screenshot: Select the storage account.":::
59+
60+
1. Finally, review your project settings and select **Create Project** to create a new project. You should now be in the labeling window and see the files in your dataset listed.
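The **Folder path** rule from the storage step can be sketched as a small helper that derives the value from the blob names in your container. This is only an illustration: `folder_path_for` is a hypothetical name, and the sketch assumes all training documents for a project live in a single folder.

```python
import posixpath


def folder_path_for(blob_names):
    """Return the value to enter in the Folder path field (hypothetical helper).

    Returns "" when the documents sit in the container root, otherwise the
    folder path relative to the container root.
    """
    # Blob names use "/" as the separator regardless of the local OS.
    folders = {posixpath.dirname(name.lstrip("/")) for name in blob_names}
    if len(folders) != 1:
        # Assumption: one project reads its training documents from one folder.
        raise ValueError(f"expected a single folder, found: {sorted(folders)}")
    return folders.pop()
```

For example, blobs named `a.pdf` and `b.pdf` give an empty folder path, while `invoices/train/a.pdf` and `invoices/train/b.pdf` give `invoices/train`.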
61+
62+
## Label your data
63+
64+
In your project, your first task is to label your dataset with the fields you wish to extract.
65+
66+
You'll see the files you uploaded to storage on the left of your screen, with the first file ready to be labeled.
67+
68+
1. To start labeling your dataset, create your first field by selecting the plus (➕) button on the top-right of the screen to select a field type.
69+
70+
:::image type="content" source="../media/how-to/studio-create-label.png" alt-text="Screenshot: Create a label.":::
71+
72+
1. Enter a name for the field.
73+
74+
1. To assign a value to the field, simply choose a word or words in the document and select the field in either the dropdown or the field list on the right navigation bar. You'll see the labeled value below the field name in the list of fields.
75+
76+
1. Repeat this process for all the fields you wish to label for your dataset.
77+
78+
1. Label the remaining documents in your dataset by selecting each document in the document list and selecting the text to be labeled.
79+
80+
You now have all the documents in your dataset labeled. If you look at the storage account, you'll find a *.labels.json* file and a *.ocr.json* file for each document in your training dataset, plus one additional *fields.json* file. Together, these files are the training dataset that is submitted to train the model.
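If you want to confirm that labeling produced a complete set of files, you can list the blobs in your training folder and run a check like the one below. It's a minimal sketch under two assumptions: the function `missing_label_files` is hypothetical, and the generated files are assumed to be named by appending `.labels.json` and `.ocr.json` to the full document name (for example, `invoice1.pdf.labels.json`).

```python
def missing_label_files(blob_names):
    """List the expected labeling files that are absent (hypothetical helper).

    `blob_names` are names relative to the training folder.
    """
    names = set(blob_names)
    sidecar_suffixes = (".labels.json", ".ocr.json")

    # Everything that isn't fields.json or a generated file is a document.
    documents = sorted(
        name for name in names
        if name != "fields.json" and not name.endswith(sidecar_suffixes)
    )

    missing = [] if "fields.json" in names else ["fields.json"]
    for document in documents:
        for suffix in sidecar_suffixes:
            # Assumption: generated files append the suffix to the document name.
            if document + suffix not in names:
                missing.append(document + suffix)
    return missing
```

An empty result suggests every document has both files and that *fields.json* is present. A non-empty result names the files to look for, which usually means a document hasn't been opened or labeled yet.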
81+
82+
## Train your model
83+
84+
With your dataset labeled, you're now ready to train your model. Select the train button in the upper-right corner.
85+
86+
1. On the train model dialog, provide a unique model ID and, optionally, a description.
87+
88+
1. For the build mode, select the type of model you want to train. Learn more about the [model types and capabilities](../concept-custom.md).
89+
90+
:::image type="content" source="../media/how-to/studio-train-model.png" alt-text="Screenshot: Train model dialog":::
91+
92+
1. Select **Train** to initiate the training process.
93+
94+
1. Template models train in a few minutes. Neural models can take up to 30 minutes to train.
95+
96+
1. Navigate to the *Models* menu to view the status of the train operation.
97+
98+
## Test the model
99+
100+
Once the model training is complete, you can test your model by selecting the model on the models list page.
101+
102+
1. Select the model, and then select the **Test** button.
103+
104+
1. Select the `+ Add` button to select a file to test the model.
105+
106+
1. With a file selected, choose the **Analyze** button to test the model.
107+
108+
1. The model results are displayed in the main window and the fields extracted are listed in the right navigation bar.
109+
110+
1. Validate your model by evaluating the results for each field.
111+
112+
1. The right navigation bar also has the sample code to invoke your model and the JSON results from the API.
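When you validate results field by field, it can help to list the fields whose confidence falls below a bar you choose. The sketch below works on a JSON result that has already been parsed into a Python dictionary. The shape it expects (a `documents` list whose items carry a `fields` mapping with a `confidence` per field) is an assumption about the result JSON, and both the `low_confidence_fields` name and the 0.8 threshold are illustrative only.

```python
def low_confidence_fields(analyze_result, threshold=0.8):
    """Map field names to confidence for fields below `threshold` (hypothetical helper)."""
    flagged = {}
    for document in analyze_result.get("documents", []):
        for name, field in document.get("fields", {}).items():
            confidence = field.get("confidence")
            # Treat a missing confidence as something to review.
            if confidence is None or confidence < threshold:
                flagged[name] = confidence
    return flagged


# Made-up sample result, trimmed to the parts this sketch reads.
sample = {
    "documents": [
        {
            "fields": {
                "VendorName": {"content": "Contoso", "confidence": 0.97},
                "Total": {"content": "$110.00", "confidence": 0.42},
            }
        }
    ]
}
print(low_confidence_fields(sample))  # prints {'Total': 0.42}
```

Fields that show up repeatedly across test documents are good candidates for more labeled examples in your training dataset.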
113+
114+
Congratulations, you've trained a custom model in the Form Recognizer Studio! Your model is ready for use with the REST API or the SDK to analyze documents.
115+
116+
## Next steps
117+
118+
> [!div class="nextstepaction"]
119+
> [Learn about custom model types](../concept-custom.md)
120+
121+
> [!div class="nextstepaction"]
122+
> [Learn about accuracy and confidence with custom models](../concept-accuracy-confidence.md)