-Document Intelligence supports more sophisticated and modular analysis capabilities. Use the add-on features to extend the results to include more features extracted from your documents. Some add-on features incur an extra cost. These optional features can be enabled and disabled depending on the scenario of the document extraction. To enable a feature simply add the associated feature name to the `features` query string property. You can enable more than one add-on feature on a request by providing a comma seperated list of features. The following add-on capabilities are available for `2023-07-31 (GA)` and later releases.
+Document Intelligence supports more sophisticated and modular analysis capabilities. Use the add-on features to extend the results to include more features extracted from your documents. Some add-on features incur an extra cost. These optional features can be enabled and disabled depending on the scenario of the document extraction. To enable a feature, add the associated feature name to the `features` query string property. You can enable more than one add-on feature on a request by providing a comma-separated list of features. The following add-on capabilities are available for `2023-07-31 (GA)` and later releases.
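As a rough illustration of the `features` query string described in the changed paragraph above, the following sketch joins several add-on feature names into one comma-separated value. The route prefix, model ID, endpoint, and API version are assumptions chosen for the example, not values taken from this commit; check the REST reference for your API version before relying on them.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint, model_id, api_version, features=(),
                      route_prefix="documentintelligence"):
    """Build an analyze URL with an optional comma-separated `features` value.

    `route_prefix` is an assumption: the path segment differs between
    API versions, so it is left configurable here.
    """
    params = {"api-version": api_version}
    if features:
        # Multiple add-on features go into a single comma-separated value.
        params["features"] = ",".join(features)
    base = f"{endpoint.rstrip('/')}/{route_prefix}/documentModels/{model_id}:analyze"
    # safe="," keeps the commas readable instead of percent-encoding them.
    return f"{base}?{urlencode(params, safe=',')}"


url = build_analyze_url(
    "https://example.cognitiveservices.azure.com/",  # hypothetical endpoint
    "prebuilt-layout",
    "2023-10-31-preview",
    features=["languages", "keyValuePairs"],
)
print(url)
```

Omitting `features` leaves the parameter out of the URL entirely, which matches the idea that the add-ons are optional and enabled per request.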
@@ -77,7 +77,7 @@ The following add-on capability is available for `2023-10-31-preview` and later
|Query fields|Add-On*| ✔️|n/a|n/a| n/a|

-Add-On* - Query fields are priced differently than the other addon features. See [pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/) for details.
+Add-On* - Query fields are priced differently than the other add-on features. See [pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/) for details.
-Adding the `languages` feature to the analyze request predicts the detected primary language for each text line along with the `confidence` in the `languages` collection under `analyzeResult`.
+Adding the `languages` feature to the `analyzeResult` request predicts the detected primary language for each text line along with the `confidence` in the `languages` collection under `analyzeResult`.
-In earlier API versions the prebuilt-document model extracted keyvalue pairs from forms and documents. With the addition of the `keyValuePairs` feature to prebuilt-layout, the layout model now produces the same results.
+In earlier API versions, the prebuilt-document model extracted key-value pairs from forms and documents. With the addition of the `keyValuePairs` feature to prebuilt-layout, the layout model now produces the same results.

Key-value pairs are specific spans within the document that identify a label or key and its associated response or value. In a structured form, these pairs could be the label and the value the user entered for that field. In an unstructured document, they could be the date a contract was executed on based on the text in a paragraph. The AI model is trained to extract identifiable keys and values based on a wide variety of document types, formats, and structures.
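To make the key-value pair description above more concrete, here is a minimal sketch that walks a result payload and collects each key with its value. The payload shape (`keyValuePairs`, `key.content`, `value.content`, `confidence`) is an assumption about the response format, and the sample data is invented for illustration.

```python
def extract_key_value_pairs(analyze_result):
    """Return (key, value, confidence) tuples from an analyze result dict.

    Assumes each entry has a `key` with `content` and an optional `value`;
    a key detected in isolation yields a value of None.
    """
    pairs = []
    for kvp in analyze_result.get("keyValuePairs", []):
        key_text = kvp["key"]["content"]
        value_text = (kvp.get("value") or {}).get("content")
        pairs.append((key_text, value_text, kvp.get("confidence")))
    return pairs


# Invented sample payload, not real service output.
sample_result = {
    "keyValuePairs": [
        {"key": {"content": "Invoice date"}, "value": {"content": "01/19/2024"}, "confidence": 0.93},
        {"key": {"content": "PO number"}, "confidence": 0.61},
    ]
}

pairs = extract_key_value_pairs(sample_result)
for key_text, value_text, confidence in pairs:
    print(f"{key_text!r} -> {value_text!r} (confidence {confidence})")
```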
@@ -282,10 +282,9 @@ Keys can also exist in isolation when the model detects that a key exists, with
-Query fields is an add-on capability to extend the schema extracted from any prebuilt model or define a specific key name when the key name is variable. To use query fields, set the features to `queryFields` and provide a comma seperated list of field names in the `queryFields` property.
+Query fields are an add-on capability to extend the schema extracted from any prebuilt model or define a specific key name when the key name is variable. To use query fields, set the features to `queryFields` and provide a comma-separated list of field names in the `queryFields` property.
* Document Intelligence now supports query field extractions. With query field extraction, you can add fields to the extraction process using a query request without the need for added training.
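The query field usage described in the hunk above can be sketched as a pair of query parameters: `features` set to `queryFields`, plus a comma-separated `queryFields` list. The field names and API version below are hypothetical placeholders, and the helper name is made up for this example.

```python
from urllib.parse import urlencode


def query_fields_params(field_names, api_version="2023-10-31-preview"):
    """Build the query string that enables the queryFields add-on.

    `field_names` are the extra fields to extract; the values used in the
    example call are hypothetical.
    """
    if not field_names:
        raise ValueError("at least one field name is required")
    params = {
        "api-version": api_version,
        "features": "queryFields",
        # The requested fields travel as one comma-separated value.
        "queryFields": ",".join(field_names),
    }
    return urlencode(params, safe=",")


query_string = query_fields_params(["Party1", "Party2", "PaymentDate"])
print(query_string)
```

Rejecting an empty list up front is a design choice for the sketch: enabling the feature without naming any fields would request nothing.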
**Composed models**. A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. When a document is submitted for analysis using a composed model, the service performs a classification to decide which custom model best represents the submitted document.
-With composed models, you can assign multiple custom models to a composed model called with a single model ID. It's useful when you've trained several models and want to group them to analyze similar form types. For example, your composed model might include custom models trained to analyze your supply, equipment, and furniture purchase orders. Instead of manually trying to select the appropriate model, you can use a composed model to determine the appropriate custom model for each analysis and extraction.
+With composed models, you can assign multiple custom models to a composed model called with a single model ID. It's useful when you train several models and want to group them to analyze similar form types. For example, your composed model might include custom models trained to analyze your supply, equipment, and furniture purchase orders. Instead of manually trying to select the appropriate model, you can use a composed model to determine the appropriate custom model for each analysis and extraction.

*```Custom form``` and ```Custom template``` models can be composed together into a single composed model.
articles/ai-services/document-intelligence/concept-custom-classifier.md: 8 additions & 8 deletions
@@ -6,7 +6,7 @@ author: vkurpad
manager: nitinme
ms.service: azure-ai-document-intelligence
ms.topic: conceptual
-ms.date: 11/21/2023
+ms.date: 01/19/2024
ms.author: lajanuar
ms.custom:
  - references_regions
@@ -107,23 +107,23 @@ Classification models can now be trained on documents of different languages. Se

* For custom classification model training, the total size of training data is `1GB` with a maximum of 10,000 pages.

-
## Document splitting

-When you have more than one document in a file, the classifier can identify the different document types contained within the input file. The classifier response will contain the page ranges for each of the identified document types contined within a file. This can include multiple instances of the same document type.
+When you have more than one document in a file, the classifier can identify the different document types contained within the input file. The classifier response contains the page ranges for each of the identified document types contained within a file. This response can include multiple instances of the same document type.

::: moniker range=">=doc-intel-4.0.0"
-The analyze operation now includes a `splitMode` property that gives you granular control over the splitting behavior.
-* To trat the entire input file as a single document for classification set the splitMode to `none`. When you do this, the service returns just one class for the entire input file.
-* To classify each page of the input file, set the splitMode to `perPage`. The service will attept to classify each page as an individual document.
-* Set the splitMode to `auto` and the service will identify the documents and associated page ranges.
+The analyze operation now includes a `splitMode` property that gives you granular control over the splitting behavior.
+
+* To treat the entire input file as a single document for classification set the splitMode to `none`. When you do so, the service returns just one class for the entire input file.
+* To classify each page of the input file, set the splitMode to `perPage`. The service attempts to classify each page as an individual document.
+* Set the splitMode to `auto` and the service identifies the documents and associated page ranges.
::: moniker-end

## Best practices

Custom classification models require a minimum of five samples per class to train. If the classes are similar, adding extra training samples improves model accuracy.

-The classifier will attempt to assign each document to one of the classes, if you expect the model will see document types not in the classes that are part of the training dataset, you should plan to set a threshold on the classification score or add a few representative samples of the document types to an ```"other"``` class. Adding an ```"other"``` class will ensure that the documents not needed do not impact your classifier quality.
+The classifier attempts to assign each document to one of the classes, if you expect the model to see document types not in the classes that are part of the training dataset, you should plan to set a threshold on the classification score or add a few representative samples of the document types to an ```"other"``` class. Adding an ```"other"``` class ensures that unneeded documents don't impact your classifier quality.
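The two ideas in the hunk above, page ranges per identified document and a threshold on the classification score, can be sketched together on the client side. The response field names (`documents`, `docType`, `confidence`, `boundingRegions`, `pageNumber`) are assumptions about the payload shape, the `0.8` threshold is an arbitrary illustrative value, and the sample data is invented.

```python
def route_documents(analyze_result, min_confidence=0.8):
    """Map each classified document to a page range, applying a score threshold.

    Documents scoring below `min_confidence` are relabeled "other" so that
    unexpected document types are not forced into a trained class.
    """
    routed = []
    for doc in analyze_result.get("documents", []):
        pages = sorted(r["pageNumber"] for r in doc.get("boundingRegions", []))
        confident = doc["confidence"] >= min_confidence
        routed.append({
            "docType": doc["docType"] if confident else "other",
            "firstPage": pages[0] if pages else None,
            "lastPage": pages[-1] if pages else None,
            "confidence": doc["confidence"],
        })
    return routed


# Invented sample: a three-page file split into two documents.
sample_result = {
    "documents": [
        {"docType": "invoice", "confidence": 0.95,
         "boundingRegions": [{"pageNumber": 1}, {"pageNumber": 2}]},
        {"docType": "receipt", "confidence": 0.42,
         "boundingRegions": [{"pageNumber": 3}]},
    ]
}

routed = route_documents(sample_result)
for item in routed:
    print(item)
```

A client-side threshold like this is an alternative to training an explicit `"other"` class; the source text presents the two as options, and which one fits better depends on how varied the unexpected documents are.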
[Document Intelligence Studio](https://documentintelligence.ai.azure.com/) is an online tool for visually exploring, understanding, and integrating features from the Document Intelligence service into your applications. Use the Document Intelligence Studio to:
+
* Learn more about the different capabilities in Document Intelligence.
* Use your Document Intelligence resource to test models on sample documents or upload your own documents.
* Experiment with different add-on and preview features to adapt the output to your needs.
-* Train custom classifcation models to classify documents.
+* Train custom classification models to classify documents.
* Train custom extraction models to extract fields from documents.
-* Get sample code for the language spcific SDKs to integrate into y9our applications.
+* Get sample code for the language-specific SDKs to integrate into your applications.

Use the [Document Intelligence Studio quickstart](quickstarts/try-document-intelligence-studio.md) to get started analyzing documents with document analysis or prebuilt models. Build custom models and reference the models in your applications using one of the [language specific SDKs](quickstarts/get-started-sdks-rest-api.md?view=doc-intel-3.0.0&preserve-view=true) and other quickstarts.

@@ -46,7 +47,7 @@ The following image shows the landing page for Document Intelligence Studio.

## Getting started

-If this is the first time you are visiting the Studio, follow the [getting started guide](studio-overview.md#get-started-using-document-intelligence-studio) to setup the Studio for use.
+If you're visiting the Studio for the first time, follow the [getting started guide](studio-overview.md#get-started-using-document-intelligence-studio) to set up the Studio for use.
@@ -78,7 +78,7 @@ The following table shows the available models for each current preview and stab
|Query fields|Add-On*| ✔️|n/a|n/a| n/a|

-Add-On* - Query fields are priced differently than the other addon features. See [pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/) for details.
+Add-On* - Query fields are priced differently than the other add-on features. See [pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/) for details.