Commit 8d67a97

Merge pull request #188646 from laujan/release-preview3-three
update branch
2 parents ea9d811 + 5cfef1f commit 8d67a97

14 files changed, with 79 additions and 104 deletions

articles/applied-ai-services/form-recognizer/compose-custom-models-preview.md

Lines changed: 1 addition & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -71,17 +71,10 @@ to an Azure blob storage container.
7171

7272
If you want to use manually labeled data, you'll also have to upload the *.labels.json* and *.ocr.json* files that correspond to your training documents.
7373

74-
*See* [Train with labels](#train-with-labels)
7574

7675
## Train your custom model
7776

78-
You can [train your model](https://formrecognizer.appliedai.azure.com/studio/custommodel/projects) with or without labeled data sets. Unlabeled datasets rely solely on the [prebuilt-layout model](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-2/operations/AnalyzeDocument) to detect and identify key information without added human input. Labeled datasets also rely on the Layout API, but supplementary human input is included such as your specific labels and field locations. To use both labeled and unlabeled data, start with at least five completed forms of the same type for the labeled training data and then add unlabeled data to the required data set.
79-
80-
### Train without labels
81-
82-
Form Recognizer uses unsupervised learning to understand the layout and relationships between fields and entries in your forms. When you submit your input forms, the algorithm clusters the forms by type, discovers what keys and tables are present, and associates values to keys and entries to tables. Training without labels doesn't require manual data labeling or intensive coding and maintenance, and we recommend you try this method first.
83-
84-
### Train with labels
77+
You [train your model](https://formrecognizer.appliedai.azure.com/studio/custommodel/projects). Labeled datasets rely on the prebuilt-layout API, but supplementary human input is included such as your specific labels and field locations. To use both labeled data, start with at least five completed forms of the same type for the labeled training data and then add unlabeled data to the required data set.
8578

8679
When you train with labeled data, the model uses supervised learning to extract values of interest, using the labeled forms you provide. Labeled data results in better-performing models and can produce models that work with complex forms or forms containing values without keys.
8780

articles/applied-ai-services/form-recognizer/compose-custom-models.md

Lines changed: 1 addition & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -70,15 +70,7 @@ to an Azure blob storage container. If you don't know how to create an Azure sto
7070

7171
## Train your custom model
7272

73-
You can [train your model](./quickstarts/try-sdk-rest-api.md#train-a-custom-model) with or without labeled data sets. Unlabeled datasets rely solely on the [Layout API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeLayoutAsync) to detect and identify key information without added human input. Labeled datasets also rely on the Layout API, but supplementary human input is included such as your specific labels and field locations. To use both labeled and unlabeled data, start with at least five completed forms of the same type for the labeled training data and then add unlabeled data to the required data set.
74-
75-
### Train without labels
76-
77-
Form Recognizer uses unsupervised learning to understand the layout and relationships between fields and entries in your forms. When you submit your input forms, the algorithm clusters the forms by type, discovers what keys and tables are present, and associates values to keys and entries to tables. Training without labels doesn't require manual data labeling or intensive coding and maintenance, and we recommend you try this method first.
78-
79-
See [Build a training data set](./build-training-data-set.md) for tips on how to collect your training documents.
80-
81-
### Train with labels
73+
You [train your model](./quickstarts/try-sdk-rest-api.md#train-a-custom-model) with labeled data sets. Labeled datasets rely on the prebuilt-layout API, but supplementary human input is included such as your specific labels and field locations. Start with at least five completed forms of the same type for your labeled training data.
8274

8375
When you train with labeled data, the model uses supervised learning to extract values of interest, using the labeled forms you provide. Labeled data results in better-performing models and can produce models that work with complex forms or forms containing values without keys.
8476

articles/applied-ai-services/form-recognizer/concept-custom-neural.md

Lines changed: 9 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -15,19 +15,19 @@ recommendations: false
1515

1616
# Form Recognizer custom neural model
1717

18-
Custom document models or neural models are a deep learned model that combines layout and language features to accurately extract labeled fields from documents. The base custom neural model is trained on various document types that makes it suitable to be trained for extracting fields from structured, semi-structured and unstructured documents. The table below lists common document types for each category:
18+
Custom neural models or neural models are a deep learned model that combines layout and language features to accurately extract labeled fields from documents. The base custom neural model is trained on various document types that makes it suitable to be trained for extracting fields from structured, semi-structured and unstructured documents. The table below lists common document types for each category:
1919

2020
|Documents | Examples |
2121
|---|--|
2222
|structured| surveys, questionnaires|
2323
|semi-structured | invoices, purchase orders |
2424
|unstructured | contracts, letters|
2525

26-
Custom document models share the same labeling format and strategy as custom template models. Currently custom neural models only support a subset of the field types supported by custom template models.
26+
Custom neural models share the same labeling format and strategy as custom template models. Currently custom neural models only support a subset of the field types supported by custom template models.
2727

2828
## Model capabilities
2929

30-
Custom document models currently only support key-value pairs and selection marks, future releases will include support for structured fields (tables) and signature.
30+
Custom neural models currently only support key-value pairs and selection marks, future releases will include support for structured fields (tables) and signature.
3131

3232
| Form fields | Selection marks | Tables | Signature | Region |
3333
|--|--|--|--|--|
@@ -61,11 +61,11 @@ You can copy a model trained in one of the regions listed above to any other reg
6161

6262
## Best practices
6363

64-
Custom document models differ from custom template models in a few different ways.
64+
Custom neural models differ from custom template models in a few different ways.
6565

6666
### Dealing with variations
6767

68-
Custom document models can generalize across different formats of a single document type. As a best practice, create a single model for all variations of a document type. Add at least five labeled samples for each of the different variations to the training dataset.
68+
Custom neural models can generalize across different formats of a single document type. As a best practice, create a single model for all variations of a document type. Add at least five labeled samples for each of the different variations to the training dataset.
6969

7070
### Field naming
7171

@@ -86,13 +86,13 @@ Values in training cases should be diverse and representative. For example, if a
8686
## Current Limitations
8787

8888
* The model doesn't recognize values split across page boundaries.
89-
* Custom document models are only trained in English and model performance will be lower for documents in other languages.
89+
* Custom neural models are only trained in English and model performance will be lower for documents in other languages.
9090
* If a dataset labeled for custom template models is used to train a custom neural model, the unsupported field types are ignored.
91-
* Custom document models are limited to 10 build operations per month. Open a support request if you need the limit increased.
91+
* Custom neural models are limited to 10 build operations per month. Open a support request if you need the limit increased.
9292

9393
## Training a model
9494

95-
Custom document models are only available in the [v3 API](v3-migration-guide.md).
95+
Custom neural models are only available in the [v3 API](v3-migration-guide.md).
9696

9797
| Document Type | REST API | SDK | Label and Test Models|
9898
|--|--|--|--|
@@ -121,7 +121,7 @@ https://{endpoint}/formrecognizer/documentModels:build?api-version=2022-01-30-pr
121121
> [!div class="nextstepaction"]
122122
> [Form Recognizer quickstart](quickstarts/try-v3-form-recognizer-studio.md#custom-models)
123123
124-
* View the labeling guidelines:
124+
* View the REST API:
125125

126126
> [!div class="nextstepaction"]
127127
> [Form Recognizer API v3.0](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-2/operations/AnalyzeDocument)

articles/applied-ai-services/form-recognizer/concept-custom-template.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -67,9 +67,9 @@ https://{endpoint}/formrecognizer/documentModels:build?api-version=2022-01-30-pr
6767
* Learn more about custom neural models:
6868

6969
> [!div class="nextstepaction"]
70-
> [Custom document models](concept-custom-neural.md )
70+
> [Custom neural models](concept-custom-neural.md )
7171
72-
* View the labeling guidelines:
72+
* View the REST API:
7373

7474
> [!div class="nextstepaction"]
7575
> [Form Recognizer API v2.1](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeWithCustomForm)

articles/applied-ai-services/form-recognizer/concept-custom.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -33,7 +33,7 @@ Custom models can be one of two types, [**custom template**](concept-custom-temp
3333
3434
### Custom neural model
3535

36-
The custom neural model is a deep learning model type relies on a base model trained on a large collection of labeled documents using key-value pairs. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents.
36+
The custom neural model is a deep learning model type relies on a base model trained on a large collection of labeled documents using key-value pairs. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents. When choosing between the two model types, start with a neural model if it meets your functional needs. See [neural models](concept-custom-neural.md) to learn more about custom document models.
3737

3838
## Model features
3939

articles/applied-ai-services/form-recognizer/concept-general-document.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ recommendations: false
1818
The General document preview model combines powerful Optical Character Recognition (OCR) capabilities with deep learning models to extract key-value pairs, selection marks, and entities from documents. General document is only available with the preview (v3.0) API. For more information on using the preview (v3.0) API, see our [migration guide](v3-migration-guide.md).
1919

2020

21-
The general document API supports most form types and will analyze your documents and extract keys and associated values. It is ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to [training a custom model without labels](compose-custom-models.md#train-without-labels).
21+
The general document API supports most form types and will analyze your documents and extract keys and associated values. It is ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to training a custom model without labels.
2222

2323
> [!NOTE]
2424
> The ```2022-01-30-preview``` update to the general document model adds support for selection marks.

articles/applied-ai-services/form-recognizer/concept-model-overview.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -48,7 +48,7 @@ The Read API analyzes and extracts ext lines, words, their locations, detected l
4848

4949
:::image type="content" source="media/studio/general-document.png" alt-text="Screenshot: Studio general document icon.":::
5050

51-
* The general document API supports most form types and will analyze your documents and associate values to keys and entries to tables that it discovers. It's ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to [training a custom model without labels](compose-custom-models.md#train-without-labels).
51+
* The general document API supports most form types and will analyze your documents and associate values to keys and entries to tables that it discovers. It's ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to training a custom model without labels.
5252

5353
* The general document is a pre-trained model and can be directly invoked via the REST API.
5454

articles/applied-ai-services/form-recognizer/language-support.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -139,7 +139,7 @@ This section lists the supported languages in the latest GA version.
139139
|Irish | `ga` |Zulu | `zu` |
140140
|Italian | `it` |
141141

142-
## Custom document model
142+
## Custom neural model
143143

144144
Language| Locale code |
145145
|:-----|:----:|
-12.8 KB