articles/applied-ai-services/form-recognizer/compose-custom-models-preview.md
Lines changed: 1 addition & 8 deletions
@@ -71,17 +71,10 @@ to an Azure blob storage container.
71
71
72
72
If you want to use manually labeled data, you'll also have to upload the *.labels.json* and *.ocr.json* files that correspond to your training documents.
73
73
74
-
*See*[Train with labels](#train-with-labels)
75
74
76
75
## Train your custom model
77
76
78
-
You can [train your model](https://formrecognizer.appliedai.azure.com/studio/custommodel/projects) with or without labeled data sets. Unlabeled datasets rely solely on the [prebuilt-layout model](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-2/operations/AnalyzeDocument) to detect and identify key information without added human input. Labeled datasets also rely on the Layout API, but supplementary human input is included such as your specific labels and field locations. To use both labeled and unlabeled data, start with at least five completed forms of the same type for the labeled training data and then add unlabeled data to the required data set.
79
-
80
-
### Train without labels
81
-
82
-
Form Recognizer uses unsupervised learning to understand the layout and relationships between fields and entries in your forms. When you submit your input forms, the algorithm clusters the forms by type, discovers what keys and tables are present, and associates values to keys and entries to tables. Training without labels doesn't require manual data labeling or intensive coding and maintenance, and we recommend you try this method first.
83
-
84
-
### Train with labels
77
+
You [train your model](https://formrecognizer.appliedai.azure.com/studio/custommodel/projects) with labeled data sets. Labeled datasets rely on the prebuilt-layout API, supplemented by human input such as your specific labels and field locations. Start with at least five completed forms of the same type for your labeled training data.
85
78
86
79
When you train with labeled data, the model uses supervised learning to extract values of interest, using the labeled forms you provide. Labeled data results in better-performing models and can produce models that work with complex forms or forms containing values without keys.
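The labeled-data requirements described in this file can be checked locally before uploading to blob storage. The following is a minimal sketch, not part of the commit: it assumes each training document sits in one local folder and that its companion files are named by appending `.labels.json` and `.ocr.json` to the full document file name (for example, `form1.pdf.labels.json`). The function names, the folder layout, and the list of document extensions are hypothetical.

```python
from pathlib import Path

# "At least five completed forms of the same type" for labeled training.
MIN_LABELED_DOCS = 5
# Assumed set of document extensions; adjust to the formats you actually use.
DOC_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png", ".tiff", ".bmp"}
# Companion files named in the article for manually labeled data.
COMPANION_SUFFIXES = (".labels.json", ".ocr.json")


def check_labeled_training_set(folder):
    """Return (labeled_docs, missing) for a local training folder.

    labeled_docs: names of documents that have both companion files.
    missing: maps a document name to the companion suffixes it lacks.
    """
    folder = Path(folder)
    labeled, missing = [], {}
    for doc in sorted(folder.iterdir()):
        # Skip subfolders and the companion .json files themselves.
        if not doc.is_file() or doc.suffix.lower() not in DOC_SUFFIXES:
            continue
        absent = [
            suffix for suffix in COMPANION_SUFFIXES
            if not (folder / (doc.name + suffix)).exists()
        ]
        if absent:
            missing[doc.name] = absent
        else:
            labeled.append(doc.name)
    return labeled, missing


def is_ready(folder):
    """True when at least five documents are fully labeled and none lack files."""
    labeled, missing = check_labeled_training_set(folder)
    return len(labeled) >= MIN_LABELED_DOCS and not missing
```

Calling `is_ready("./training-data")` before the upload step gives an early signal that a document is missing its label or OCR file; it doesn't validate the contents of those files.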
articles/applied-ai-services/form-recognizer/compose-custom-models.md
Lines changed: 1 addition & 9 deletions
@@ -70,15 +70,7 @@ to an Azure blob storage container. If you don't know how to create an Azure sto
70
70
71
71
## Train your custom model
72
72
73
-
You can [train your model](./quickstarts/try-sdk-rest-api.md#train-a-custom-model) with or without labeled data sets. Unlabeled datasets rely solely on the [Layout API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeLayoutAsync) to detect and identify key information without added human input. Labeled datasets also rely on the Layout API, but supplementary human input is included such as your specific labels and field locations. To use both labeled and unlabeled data, start with at least five completed forms of the same type for the labeled training data and then add unlabeled data to the required data set.
74
-
75
-
### Train without labels
76
-
77
-
Form Recognizer uses unsupervised learning to understand the layout and relationships between fields and entries in your forms. When you submit your input forms, the algorithm clusters the forms by type, discovers what keys and tables are present, and associates values to keys and entries to tables. Training without labels doesn't require manual data labeling or intensive coding and maintenance, and we recommend you try this method first.
78
-
79
-
See [Build a training data set](./build-training-data-set.md) for tips on how to collect your training documents.
80
-
81
-
### Train with labels
73
+
You [train your model](./quickstarts/try-sdk-rest-api.md#train-a-custom-model) with labeled data sets. Labeled datasets rely on the prebuilt-layout API, but supplementary human input is included such as your specific labels and field locations. Start with at least five completed forms of the same type for your labeled training data.
82
74
83
75
When you train with labeled data, the model uses supervised learning to extract values of interest, using the labeled forms you provide. Labeled data results in better-performing models and can produce models that work with complex forms or forms containing values without keys.
articles/applied-ai-services/form-recognizer/concept-custom-neural.md
Lines changed: 9 additions & 9 deletions
@@ -15,19 +15,19 @@ recommendations: false
15
15
16
16
# Form Recognizer custom neural model
17
17
18
-
Custom document models or neural models are a deep learned model that combines layout and language features to accurately extract labeled fields from documents. The base custom neural model is trained on various document types that makes it suitable to be trained for extracting fields from structured, semi-structured and unstructured documents. The table below lists common document types for each category:
18
+
Custom neural models are deep learned models that combine layout and language features to accurately extract labeled fields from documents. The base custom neural model is trained on various document types, which makes it suitable to be trained for extracting fields from structured, semi-structured, and unstructured documents. The table below lists common document types for each category:
19
19
20
20
|Documents | Examples |
21
21
|---|--|
22
22
|structured| surveys, questionnaires|
23
23
|semi-structured | invoices, purchase orders |
24
24
|unstructured | contracts, letters|
25
25
26
-
Custom document models share the same labeling format and strategy as custom template models. Currently custom neural models only support a subset of the field types supported by custom template models.
26
+
Custom neural models share the same labeling format and strategy as custom template models. Currently custom neural models only support a subset of the field types supported by custom template models.
27
27
28
28
## Model capabilities
29
29
30
-
Custom document models currently only support key-value pairs and selection marks, future releases will include support for structured fields (tables) and signature.
30
+
Custom neural models currently only support key-value pairs and selection marks; future releases will include support for structured fields (tables) and signatures.
31
31
32
32
| Form fields | Selection marks | Tables | Signature | Region |
33
33
|--|--|--|--|--|
@@ -61,11 +61,11 @@ You can copy a model trained in one of the regions listed above to any other reg
61
61
62
62
## Best practices
63
63
64
-
Custom document models differ from custom template models in a few different ways.
64
+
Custom neural models differ from custom template models in a few different ways.
65
65
66
66
### Dealing with variations
67
67
68
-
Custom document models can generalize across different formats of a single document type. As a best practice, create a single model for all variations of a document type. Add at least five labeled samples for each of the different variations to the training dataset.
68
+
Custom neural models can generalize across different formats of a single document type. As a best practice, create a single model for all variations of a document type. Add at least five labeled samples for each of the different variations to the training dataset.
69
69
70
70
### Field naming
71
71
@@ -86,13 +86,13 @@ Values in training cases should be diverse and representative. For example, if a
86
86
## Current Limitations
87
87
88
88
* The model doesn't recognize values split across page boundaries.
89
-
* Custom document models are only trained in English and model performance will be lower for documents in other languages.
89
+
* Custom neural models are only trained in English and model performance will be lower for documents in other languages.
90
90
* If a dataset labeled for custom template models is used to train a custom neural model, the unsupported field types are ignored.
91
-
* Custom document models are limited to 10 build operations per month. Open a support request if you need the limit increased.
91
+
* Custom neural models are limited to 10 build operations per month. Open a support request if you need the limit increased.
92
92
93
93
## Training a model
94
94
95
-
Custom document models are only available in the [v3 API](v3-migration-guide.md).
95
+
Custom neural models are only available in the [v3 API](v3-migration-guide.md).
96
96
97
97
| Document Type | REST API | SDK | Label and Test Models|
articles/applied-ai-services/form-recognizer/concept-custom.md
Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ Custom models can be one of two types, [**custom template**](concept-custom-temp
33
33
34
34
### Custom neural model
35
35
36
-
The custom neural model is a deep learning model type relies on a base model trained on a large collection of labeled documents using key-value pairs. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents.
36
+
The custom neural model is a deep learning model type that relies on a base model trained on a large collection of labeled documents using key-value pairs. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents. When choosing between the two model types, start with a neural model if it meets your functional needs. See [neural models](concept-custom-neural.md) to learn more about custom document models.
articles/applied-ai-services/form-recognizer/concept-general-document.md
Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ recommendations: false
18
18
The General document preview model combines powerful Optical Character Recognition (OCR) capabilities with deep learning models to extract key-value pairs, selection marks, and entities from documents. General document is only available with the preview (v3.0) API. For more information on using the preview (v3.0) API, see our [migration guide](v3-migration-guide.md).
19
19
20
20
21
-
The general document API supports most form types and will analyze your documents and extract keys and associated values. It is ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to [training a custom model without labels](compose-custom-models.md#train-without-labels).
21
+
The general document API supports most form types and will analyze your documents and extract keys and associated values. It is ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to training a custom model without labels.
22
22
23
23
> [!NOTE]
24
24
> The ```2022-01-30-preview``` update to the general document model adds support for selection marks.
articles/applied-ai-services/form-recognizer/concept-model-overview.md
Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ The Read API analyzes and extracts ext lines, words, their locations, detected l
48
48
49
49
:::image type="content" source="media/studio/general-document.png" alt-text="Screenshot: Studio general document icon.":::
50
50
51
-
* The general document API supports most form types and will analyze your documents and associate values to keys and entries to tables that it discovers. It's ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to [training a custom model without labels](compose-custom-models.md#train-without-labels).
51
+
* The general document API supports most form types and will analyze your documents and associate values to keys and entries to tables that it discovers. It's ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to training a custom model without labels.
52
52
53
53
* The general document is a pre-trained model and can be directly invoked via the REST API.