Commit 4a350d3

Merge pull request #188262 from laujan/vinod-release-preview2-branch
vinod-release-updates
2 parents 6d4f08d + eed7130 commit 4a350d3

42 files changed: +502 -134 lines changed

articles/applied-ai-services/form-recognizer/api-v2-0/includes/csharp-v3-0-0.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -382,7 +382,7 @@ Submodel Form Type: form-63c013e3-1cab-43eb-84b0-f4b20cb9214c
382382

383383
## Analyze forms with a custom model
384384

385-
This section demonstrates how to extract key/value information and other content from your custom form types, using models you trained with your own forms.
385+
This section demonstrates how to extract key/value information and other content from your custom template types, using models you trained with your own forms.
386386

387387
> [!IMPORTANT]
388388
> In order to implement this scenario, you must have already trained a model so you can pass its ID into the method below.

articles/applied-ai-services/form-recognizer/api-v2-0/includes/java-v3-0-0.md

Lines changed: 2 additions & 2 deletions
@@ -294,7 +294,7 @@ The model found field 'field-6' with label: VAT ID
294294

295295
## Analyze forms with a custom model
296296

297-
This section demonstrates how to extract key/value information and other content from your custom form types, using models you trained with your own forms.
297+
This section demonstrates how to extract key/value information and other content from your custom template types, using models you trained with your own forms.
298298

299299
> [!IMPORTANT]
300300
> In order to implement this scenario, you must have already trained a model so you can pass its ID into the method below. See the [Train a model](#train-a-model-without-labels) section.
@@ -314,7 +314,7 @@ The returned value is a collection of **RecognizedForm** objects: one for each p
314314

315315
```console
316316
Analyze PDF form...
317-
----------- Recognized custom form info for page 0 -----------
317+
----------- Recognized custom template info for page 0 -----------
318318
Form type: form-0
319319
Field 'field-0' has label 'Address:' with a confidence score of 0.91.
320320
Field 'field-1' has label 'Invoice For:' with a confidence score of 1.00.

articles/applied-ai-services/form-recognizer/api-v2-0/includes/javascript-v3-0-0.md

Lines changed: 1 addition & 1 deletion
@@ -260,7 +260,7 @@ Document errors: undefined
260260

261261
## Analyze forms with a custom model
262262

263-
This section demonstrates how to extract key/value information and other content from your custom form types, using models you trained with your own forms.
263+
This section demonstrates how to extract key/value information and other content from your custom template types, using models you trained with your own forms.
264264

265265
> [!IMPORTANT]
266266
> In order to implement this scenario, you must have already trained a model so you can pass its ID into the method below. See the [Train a model](#train-a-model-without-labels) section.

articles/applied-ai-services/form-recognizer/api-v2-0/includes/python-v3-0-0.md

Lines changed: 1 addition & 1 deletion
@@ -268,7 +268,7 @@ Document errors: []
268268

269269
## Analyze forms with a custom model
270270

271-
This section demonstrates how to extract key/value information and other content from your custom form types, using models you trained with your own forms.
271+
This section demonstrates how to extract key/value information and other content from your custom template types, using models you trained with your own forms.
272272

273273
> [!IMPORTANT]
274274
> In order to implement this scenario, you must have already trained a model so you can pass its ID into the method below. See the [Train a model](#train-a-model-without-labels) section.

articles/applied-ai-services/form-recognizer/build-training-data-set.md

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ Follow these additional tips to further optimize your data set for training.
3838

3939
## Upload your training data
4040

41-
When you've put together the set of form documents that you'll use for training, you need to upload it to an Azure blob storage container. If you don't know how to create an Azure storage account with a container, following the [Azure Storage quickstart for Azure portal](../../storage/blobs/storage-quickstart-blobs-portal.md). Use the standard performance tier.
41+
When you've put together the set of form documents that you'll use for training, you need to upload it to an Azure blob storage container. If you don't know how to create an Azure storage account with a container, follow the [Azure Storage quickstart for Azure portal](../../storage/blobs/storage-quickstart-blobs-portal.md). Use the standard performance tier.
4242

4343
If you want to use manually labeled data, you'll also have to upload the *.labels.json* and *.ocr.json* files that correspond to your training documents. You can use the [Sample Labeling tool](label-tool.md) (or your own UI) to generate these files.
4444

articles/applied-ai-services/form-recognizer/compose-custom-models.md

Lines changed: 7 additions & 9 deletions
@@ -1,30 +1,28 @@
11
---
2-
title: "How to guide: use custom and composed models"
2+
title: "How to guide: create and compose custom models with Form Recognizer v2.1"
33
titleSuffix: Azure Applied AI Services
4-
description: Learn how to create, use, and manage Form Recognizer custom and composed models
4+
description: Learn how to create, compose, use, and manage custom models with Form Recognizer v2.1
55
author: laujan
66
manager: nitinme
77
ms.service: applied-ai-services
88
ms.subservice: forms-recognizer
99
ms.topic: how-to
10-
ms.date: 11/02/2021
10+
ms.date: 02/12/2022
1111
ms.author: lajanuar
1212
recommendations: false
13-
ms.custom: ignite-fall-2021
1413
---
1514

16-
# Use custom and composed models
15+
# Create and compose custom models
16+
17+
> [!NOTE]
18+
> This how-to guide references Form Recognizer v2.1 (GA).
1719
1820
Form Recognizer uses advanced machine-learning technology to detect and extract information from document images and return the extracted data in a structured JSON output. With Form Recognizer, you can train standalone custom models or combine custom models to create composed models.
1921

2022
* **Custom models**. Form Recognizer custom models enable you to analyze and extract data from forms and documents specific to your business. Custom models are trained for your distinct data and use cases.
2123

2224
* **Composed models**. A composed model is created by taking a collection of custom models and assigning them to a single model that encompasses your form types. When a document is submitted to a composed model, the service performs a classification step to decide which custom model accurately represents the form presented for analysis.
2325

24-
***Model configuration window in Form Recognizer Studio***
25-
26-
:::image type="content" source="media/studio/composed-model.png" alt-text="Screenshot: model configuration window in Form Recognizer Studio.":::
27-
2826
In this article, you'll learn how to create Form Recognizer custom and composed models using our [Form Recognizer Sample Labeling tool](label-tool.md), [REST APIs](quickstarts/client-library.md?branch=main&pivots=programming-language-rest-api#train-a-custom-model), or [client-library SDKs](quickstarts/client-library.md?branch=main&pivots=programming-language-csharp#train-a-custom-model).
2927

3028
## Sample Labeling tool

articles/applied-ai-services/form-recognizer/concept-accuracy-confidence.md

Lines changed: 5 additions & 4 deletions
@@ -1,23 +1,24 @@
11
---
2-
title: Interpret and improve accuracy and confidence scores
2+
title: Interpret and improve model accuracy and analysis confidence scores
33
titleSuffix: Azure Applied AI Services
4-
description: Best practices for how to interpret the accuracy score from the train model operation and the confidence score from analysis operations.
4+
description: Best practices to interpret the accuracy score from the train model operation and the confidence score from analysis operations.
55
author: laujan
66
manager: nitinme
77
ms.service: applied-ai-services
88
ms.subservice: forms-recognizer
99
ms.topic: conceptual
10-
ms.date: 02/03/2022
10+
ms.date: 02/14/2022
1111
ms.author: vikurpad
1212
---
1313

1414
# Interpret and improve accuracy and confidence for custom models
1515

1616
> [!NOTE]
17+
>
1718
> * **Custom models do not provide accuracy scores during training**.
1819
> * Confidence scores for structured fields such as tables are currently unavailable.
1920
20-
Custom models generate an estimated accuracy score when trained. Documents analyzed with a Custom model produce a confidence score for extracted fields. In this document, you'll learn to interpret accuracy and confidence scores and best practices for using those scores to improve accuracy and confidence results.
21+
Custom models generate an estimated accuracy score when trained. Documents analyzed with a custom model produce a confidence score for extracted fields. In this article, you'll learn to interpret accuracy and confidence scores and best practices for using those scores to improve accuracy and confidence results.
2122

2223
## Accuracy scores
2324

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
1+
---
2+
title: Form Recognizer composed models
3+
titleSuffix: Azure Applied AI Services
4+
description: Learn about composed custom models
5+
author: laujan
6+
manager: nitinme
7+
ms.service: applied-ai-services
8+
ms.subservice: forms-recognizer
9+
ms.topic: conceptual
10+
ms.date: 02/13/2022
11+
ms.author: vikurpad
12+
recommendations: false
13+
---
14+
15+
# Composed custom models
16+
17+
**Composed models**. A composed model is created by taking a collection of custom models and assigning them to a single model comprised of your form types. When a document is submitted for analysis to a composed model, the service performs a classification to decide which custom model accurately represents the form presented for analysis.
18+
19+
With composed models, you can assign multiple custom models to a composed model called with a single model ID. It's useful when you've trained several models and want to group them to analyze similar form types. For example, your composed model might include custom models trained to analyze your supply, equipment, and furniture purchase orders. Instead of manually trying to select the appropriate model, you can use a composed model to determine the appropriate custom model for each analysis and extraction.
20+
21+
* ```Custom form``` and ```Custom document``` models can be composed together into a single composed model when they're trained with the same API version or an API version later than ```2021-01-30-preview```. For more information on composing custom template and custom neural models, see [compose model limits](#compose-model-limits).
22+
* With the model compose operation, you can assign up to 100 trained custom models to a single composed model. When you call Analyze with the composed model ID, Form Recognizer will first classify the form you submitted, choose the best matching assigned model, and then return results for that model.
23+
* For **_custom template models_**, the composed model can be created using variations of a custom template or different form types. This operation is useful when incoming forms may belong to one of several templates.
24+
* The response will include a ```docType``` property to indicate which of the composed models was used to analyze the document.
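As a rough illustration of the last bullet, the sketch below reads the ```docType``` of each analyzed document from an analyze response that has already been parsed into a Python dictionary. The response shape (`analyzeResult` containing a `documents` list) and the sample `docType` value are assumptions made for this example; check the API reference for the exact schema returned by your API version.

```python
from typing import Any, Dict, List


def doc_types(analyze_response: Dict[str, Any]) -> List[str]:
    """Return the docType reported for each analyzed document.

    Assumes a response shaped like {"analyzeResult": {"documents": [...]}};
    that shape is an assumption for this sketch, not a confirmed schema.
    """
    result = analyze_response.get("analyzeResult", {})
    return [doc.get("docType", "") for doc in result.get("documents", [])]


# Hypothetical response from a composed model; the docType value is invented.
sample_response = {
    "analyzeResult": {
        "documents": [
            {"docType": "purchase-orders:supply", "fields": {}},
        ]
    }
}

for doc_type in doc_types(sample_response):
    print(f"Analyzed with component model: {doc_type}")
```

Reading ```docType``` this way lets downstream code branch on which component model handled the form, without the caller having to choose a model up front.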
25+
26+
## Compose model limits
27+
28+
> [!NOTE]
29+
> With the addition of the **_custom neural model_**, there are a few limits to the compatibility of models that can be composed together.
30+
31+
### Composed model compatibility
32+
33+
|Custom model type | API Version |Custom form 2021-01-30-preview (v3.0)| Custom document 2021-01-30-preview (v3.0) | Custom form GA version (v2.1) or earlier|
34+
|--|--|--|--|--|
35+
|**Custom template** (updated custom form)| 2021-01-30-preview | ✱|| X |
36+
|**Custom neural**| trained with current API version (2021-01-30-preview) ||| X |
37+
|**Custom form**| Custom form GA version (v2.1) or earlier | X | X||
38+
39+
**Table symbols**: ✔ — supported; **X** — not supported; ✱ — unsupported for this API version, but will be supported in a future API version.
40+
41+
* To compose a model trained with a prior version of the API (2.1 or earlier), train a model with the 3.0 API using the same labeled dataset to ensure that it can be composed with other models.
42+
43+
* Models composed with v2.1 of the API will continue to be supported, requiring no updates.
44+
45+
* The limit for maximum number of custom models that can be composed is 100.
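The 100-model limit can be checked on the client before a compose request is sent. The following is a minimal sketch, assuming a request body with `modelId`, `description`, and `componentModels` fields; those field names and the helper `compose_payload` are illustrative assumptions, so verify them against the compose operation reference before relying on them.

```python
from typing import Dict, List

MAX_COMPONENT_MODELS = 100  # limit stated in this article


def compose_payload(composed_id: str, component_ids: List[str],
                    description: str = "") -> Dict[str, object]:
    """Build a compose request body after checking the component-model limit.

    The field names used here are assumptions for illustration.
    """
    # Drop duplicate IDs while preserving the caller's order.
    unique_ids = list(dict.fromkeys(component_ids))
    if not unique_ids:
        raise ValueError("At least one component model ID is required.")
    if len(unique_ids) > MAX_COMPONENT_MODELS:
        raise ValueError(
            f"{len(unique_ids)} models exceed the limit of {MAX_COMPONENT_MODELS}."
        )
    return {
        "modelId": composed_id,
        "description": description,
        "componentModels": [{"modelId": model_id} for model_id in unique_ids],
    }


# Hypothetical model IDs for supply, equipment, and furniture purchase orders.
payload = compose_payload(
    "purchase-orders", ["po-supply", "po-equipment", "po-furniture"]
)
print(len(payload["componentModels"]))
```

Failing early on the client avoids a round trip for a request the service would reject.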
46+
47+
## Development options
48+
49+
The following resources are supported by Form Recognizer **v3.0** (preview):
50+
51+
| Feature | Resources |
52+
|----------|-------------|
53+
|_**Custom model**_| <ul><li>[Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio/custommodel/projects)</li><li>[REST API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-1/operations/BuildDocumentModel)</li><li>[C# SDK](quickstarts/try-v3-csharp-sdk.md)</li><li>[Java SDK](quickstarts/try-v3-java-sdk.md)</li><li>[JavaScript SDK](quickstarts/try-v3-javascript-sdk.md)</li><li>[Python SDK](quickstarts/try-v3-python-sdk.md)</li></ul>|
54+
| _**Composed model**_| <ul><li>[Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio/custommodel/projects)</li><li>[REST API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-1/operations/ComposeDocumentModel)</li><li>[C# SDK](/dotnet/api/azure.ai.formrecognizer.documentanalysis.documentmodeladministrationclient.startcreatecomposedmodel?view=azure-dotnet-preview&preserve-view=true)</li><li>[Java SDK](/java/api/com.azure.ai.formrecognizer.administration.documentmodeladministrationclient.begincreatecomposedmodel?view=azure-java-preview&preserve-view=true)</li><li>[JavaScript SDK](/javascript/api/@azure/ai-form-recognizer/documentmodeladministrationclient?view=azure-node-preview#@azure-ai-form-recognizer-documentmodeladministrationclient-begincomposemodel&preserve-view=true)</li><li>[Python SDK](/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.formtrainingclient?view=azure-python-preview#azure-ai-formrecognizer-formtrainingclient-begin-create-composed-model&preserve-view=true)</li></ul>|
55+
56+
The following resources are supported by Form Recognizer v2.1:
57+
58+
| Feature | Resources |
59+
|----------|-------------------------|
60+
|_**Custom model**_| <ul><li>[Form Recognizer labeling tool](https://fott-2-1.azurewebsites.net)</li><li>[REST API](quickstarts/try-sdk-rest-api.md?pivots=programming-language-rest-api#analyze-forms-with-a-custom-model)</li><li>[Client library SDK](quickstarts/try-sdk-rest-api.md)</li><li>[Form Recognizer Docker container](containers/form-recognizer-container-install-run.md?tabs=custom#run-the-container-with-the-docker-compose-up-command)</li></ul>|
61+
| _**Composed model**_ |<ul><li>[Form Recognizer labeling tool](https://fott-2-1.azurewebsites.net/)</li><li>[REST API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/Compose)</li><li>[C# SDK](/dotnet/api/azure.ai.formrecognizer.training.createcomposedmodeloperation?view=azure-dotnet&preserve-view=true)</li><li>[Java SDK](/java/api/com.azure.ai.formrecognizer.models.createcomposedmodeloptions?view=azure-java-stable&preserve-view=true)</li><li>[JavaScript SDK](/javascript/api/@azure/ai-form-recognizer/begincreatecomposedmodeloptions?view=azure-node-latest&preserve-view=true)</li><li>[Python SDK](/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.formtrainingclient?view=azure-python#azure-ai-formrecognizer-formtrainingclient-begin-create-composed-model&preserve-view=true)</li></ul>|
62+
63+
64+
## Next steps
65+
66+
Learn to create and compose custom models:
67+
68+
> [!div class="nextstepaction"]
69+
> [**Form Recognizer v2.1 (GA)**](compose-custom-models.md)
Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
1+
---
2+
title: Form Recognizer custom neural model
3+
titleSuffix: Azure Applied AI Services
4+
description: Learn about the custom neural model type, its features, and how to train a model with high accuracy to extract data from structured and unstructured documents
5+
author: laujan
6+
manager: nitinme
7+
ms.service: applied-ai-services
8+
ms.subservice: forms-recognizer
9+
ms.topic: conceptual
10+
ms.date: 02/13/2022
11+
ms.author: vikurpad
12+
ms.custom: references_regions
13+
recommendations: false
14+
---
15+
16+
# Form Recognizer custom neural model
17+
18+
Custom neural models (also called custom document models) are deep-learned models that combine layout and language features to accurately extract labeled fields from documents. The base custom neural model is trained on various document types, which makes it suitable to be trained for extracting fields from structured, semi-structured, and unstructured documents. The table below lists common document types for each category:
19+
20+
|Documents | Examples |
21+
|---|--|
22+
|structured| surveys, questionnaires|
23+
|semi-structured | invoices, purchase orders |
24+
|unstructured | contracts, letters|
25+
26+
Custom document models share the same labeling format and strategy as custom template models. Currently, custom neural models support only a subset of the field types supported by custom template models.
27+
28+
## Model capabilities
29+
30+
Custom document models currently support only key-value pairs and selection marks. Future releases will include support for structured fields (tables) and signatures.
31+
32+
| Form fields | Selection marks | Tables | Signature | Region |
33+
|--|--|--|--|--|
34+
| Supported| Supported | Unsupported | Unsupported | Unsupported |
35+
36+
## Supported regions
37+
38+
In public preview, custom neural models can be trained only in the following Azure regions:
39+
40+
* AustraliaEast
41+
* BrazilSouth
42+
* CanadaCentral
43+
* CentralIndia
44+
* CentralUS
45+
* EastUS
46+
* EastUS2
47+
* FranceCentral
48+
* JapanEast
49+
* JioIndiaWest
50+
* KoreaCentral
51+
* NorthEurope
52+
* SouthCentralUS
53+
* SoutheastAsia
54+
* UKSouth
55+
* WestEurope
56+
* WestUS
57+
* WestUS2
58+
* WestUS3
59+
60+
You can copy a model trained in one of the regions listed above to any other region for use.
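A simple way to act on this list is to check a resource's region before starting a build. The sketch below hard-codes the regions listed above; the helper name `can_train_neural_model` and the normalization of region names (ignoring case and spaces) are assumptions made for this example.

```python
# Regions listed in this article as supporting custom neural model training.
SUPPORTED_TRAINING_REGIONS = frozenset(
    region.lower()
    for region in [
        "AustraliaEast", "BrazilSouth", "CanadaCentral", "CentralIndia",
        "CentralUS", "EastUS", "EastUS2", "FranceCentral", "JapanEast",
        "JioIndiaWest", "KoreaCentral", "NorthEurope", "SouthCentralUS",
        "SoutheastAsia", "UKSouth", "WestEurope", "WestUS", "WestUS2",
        "WestUS3",
    ]
)


def can_train_neural_model(region: str) -> bool:
    """Return True if the region appears in the list above.

    Comparison ignores case and spaces, so "West US 2" matches "WestUS2".
    """
    return region.replace(" ", "").lower() in SUPPORTED_TRAINING_REGIONS


print(can_train_neural_model("West US 2"))
```

If the check fails, one option is to train in a listed region and then copy the model to the region where it's needed.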
61+
62+
## Best practices
63+
64+
Custom document models differ from custom template models in a few different ways.
65+
66+
### Dealing with variations
67+
68+
Custom document models can generalize across different formats of a single document type. As a best practice, create a single model for all variations of a document type. Add at least five labeled samples for each of the different variations to the training dataset.
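The five-samples-per-variation guideline can be checked before training. The following sketch assumes you keep your own mapping from each labeled file to the variation it represents; that mapping, the file names, and the helper `underrepresented_variations` are hypothetical and not part of the service.

```python
from collections import Counter
from typing import Dict

MIN_SAMPLES_PER_VARIATION = 5  # guideline stated above


def underrepresented_variations(variation_by_file: Dict[str, str]) -> Dict[str, int]:
    """Return each variation that has fewer labeled samples than recommended."""
    counts = Counter(variation_by_file.values())
    return {
        variation: count
        for variation, count in counts.items()
        if count < MIN_SAMPLES_PER_VARIATION
    }


# Hypothetical training set: five samples of layout A, two of layout B.
training_set = {f"invoice-a-{i}.pdf": "layout-a" for i in range(5)}
training_set.update({f"invoice-b-{i}.pdf": "layout-b" for i in range(2)})

print(underrepresented_variations(training_set))
```

Any variation the helper reports is a candidate for more labeled samples before the single model is built.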
69+
70+
### Field naming
71+
72+
When you label the data, giving each field a name that's relevant to its value improves the accuracy of the key-value pairs extracted. For example, for a field value containing the supplier ID, consider naming the field "supplier_id". Field names should be in the language of the document.
73+
74+
### Labeling contiguous values
75+
76+
Value tokens/words of one field must be either:
77+
78+
* A consecutive sequence in natural reading order, without interleaving with other fields
79+
* In a region that doesn't cover any other fields
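The first condition can be checked mechanically if each token has a position in reading order. The sketch below is only an illustration: it assumes you have such integer positions for the tokens labeled as one field, which is a hypothetical representation and not a format defined by the labeling tools. It doesn't cover the second, region-based condition.

```python
from typing import Iterable


def is_consecutive_sequence(token_positions: Iterable[int]) -> bool:
    """Return True if the positions form one unbroken run in reading order.

    A gap means tokens from other content sit between this field's tokens.
    """
    ordered = sorted(token_positions)
    return all(later - earlier == 1 for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical positions: tokens 12-14 are one field value; 20 is far away.
print(is_consecutive_sequence([12, 13, 14]))
print(is_consecutive_sequence([12, 13, 20]))
```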
80+
81+
### Representative data
82+
83+
Values in training cases should be diverse and representative. For example, if a field is named "date", values for this field should be dates. Synthetic values, such as a random string, can affect model performance.
84+
85+
86+
## Current Limitations
87+
88+
* The model doesn't recognize values split across page boundaries.
89+
* Custom document models are trained only in English, and model performance will be lower for documents in other languages.
90+
* If a dataset labeled for custom template models is used to train a custom neural model, the unsupported field types are ignored.
91+
* Custom document models are limited to 10 build operations per month. Open a support request if you need the limit increased.
92+
93+
## Training a model
94+
95+
Custom document models are only available in the [v3 API](v3-migration-guide.md).
96+
97+
| Document Type | REST API | SDK | Label and Test Models|
98+
|--|--|--|--|
99+
| Custom document | [Form Recognizer 3.0 (preview)](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-1/operations/AnalyzeDocument)| [Form Recognizer Preview SDK](quickstarts/try-v3-python-sdk.md)| [Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio)|
100+
101+
The build operation that trains a model supports a new ```buildMode``` property. To train a custom neural model, set ```buildMode``` to ```neural```.
102+
103+
```REST
104+
https://{endpoint}/formrecognizer/documentModels:build?api-version=2022-01-30-preview
105+
106+
{
107+
"modelId": "string",
108+
"description": "string",
109+
"buildMode": "neural",
110+
"azureBlobSource":
111+
{
112+
"containerUrl": "string",
113+
"prefix": "string"
114+
}
115+
}
116+
```
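The request above can be assembled programmatically. The following is a minimal sketch that only builds the URL and JSON body shown in the example, using the Python standard library; it doesn't send the request or handle authentication. The helper name `neural_build_request`, the sample endpoint, the model ID, and the container URL are all placeholders invented for this illustration.

```python
import json
from typing import Tuple

API_VERSION = "2022-01-30-preview"  # version used in the request example above


def neural_build_request(endpoint: str, model_id: str, container_url: str,
                         prefix: str = "", description: str = "") -> Tuple[str, str]:
    """Return the build URL and a JSON body with buildMode set to "neural"."""
    url = (
        f"https://{endpoint}/formrecognizer/documentModels:build"
        f"?api-version={API_VERSION}"
    )
    body = {
        "modelId": model_id,
        "description": description,
        "buildMode": "neural",
        "azureBlobSource": {
            "containerUrl": container_url,
            "prefix": prefix,
        },
    }
    return url, json.dumps(body)


# Placeholder values; replace them with your own resource and container.
url, raw_body = neural_build_request(
    "example.cognitiveservices.azure.com",
    "my-neural-model",
    "https://example.blob.core.windows.net/training",
)
print(url)
print(raw_body)
```

Keeping the body construction in one function makes it easy to confirm that ```buildMode``` is set before the request is sent with the HTTP client of your choice.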
117+
## Next steps
118+
119+
* Train a custom model:
120+
121+
> [!div class="nextstepaction"]
122+
> [Form Recognizer quickstart](quickstarts/try-v3-form-recognizer-studio.md#custom-models)
123+
124+
* View the labeling guidelines:
125+
126+
> [!div class="nextstepaction"]
127+
> [Form Recognizer API v2.1](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeWithCustomForm)
