|**A collection of several models each trained on similar-type documents.**|● Supply purchase orders<br>● Equipment purchase orders<br>● Furniture purchase orders<br> **All composed into a single model**.|[**Composed custom model**](concept-composed-models.md)|
## Custom classification model
| Training set | Example documents | Your best solution |
|--|--|--|
|**At least two different types of documents**. |Forms, letters, or documents |[**Custom classification model**](./concept-custom-classifier.md)|
## Next steps
* [Learn how to process your own forms and documents](quickstarts/try-document-intelligence-studio.md) with the [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio)
Document Intelligence supports more sophisticated and modular analysis capabilities. Use the add-on features to extend the results to include more features extracted from your documents. Some add-on features incur an extra cost. These optional features can be enabled and disabled depending on the scenario of the document extraction. To enable a feature, add the associated feature name to the `features` query string property. You can enable more than one add-on feature on a request by providing a comma-separated list of features. The following add-on capabilities are available for `2023-07-31 (GA)` and later releases.
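As a sketch of how the `features` query string property is assembled, the snippet below builds an analyze request URL with two add-on features. The endpoint, model ID, and exact REST route are illustrative placeholders (the real route depends on your resource and API version); only the comma-separated `features` value is the point being shown.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint, model_id, api_version, features):
    """Build an analyze request URL with a comma-separated features list."""
    query = urlencode(
        {
            "api-version": api_version,
            # Multiple add-on features are enabled with one comma-separated value.
            "features": ",".join(features),
        },
        safe=",",  # keep the commas between feature names unescaped
    )
    return f"{endpoint}/documentModels/{model_id}:analyze?{query}"


# Hypothetical endpoint; substitute your own resource endpoint.
url = build_analyze_url(
    "https://contoso.cognitiveservices.azure.com/formrecognizer",
    "prebuilt-layout",
    "2023-07-31",
    ["ocrHighResolution", "formulas"],
)
```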
> [!NOTE]
>
> Not all add-on capabilities are supported by all models. For more information, *see* [model data extraction](concept-model-overview.md#analysis-features).

The following add-on capabilities are available for `2023-10-31-preview` and later releases:

* [`keyValuePairs`](#key-value-pairs)
* [`queryFields`](#query-fields)

> [!NOTE]
> ***Add-On*** - Query fields are priced differently than the other add-on features. See [pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/) for details.

## High resolution extraction
The task of recognizing small text in large-size documents, like engineering drawings, is a challenge. Often the text is mixed with other graphical elements and has varying fonts, sizes, and orientations. Moreover, the text can be broken into separate parts or connected with other symbols. Document Intelligence now supports extracting content from these types of documents with the `ocr.highResolution` capability. You get improved quality of content extraction from A1/A2/A3 documents by enabling this add-on capability.

## Formula extraction

The `ocr.formula` capability extracts all identified formulas, such as mathematical equations, in the `formulas` collection as a top-level object under `content`. Inside `content`, detected formulas are represented as `:formula:`. Each entry in this collection represents a formula that includes the formula type as `inline` or `display`, and its LaTeX representation as `value`, along with its `polygon` coordinates. Initially, formulas appear at the end of each page.
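To illustrate the relationship between the `:formula:` placeholders in `content` and the entries in the `formulas` collection, here's a minimal sketch that substitutes each formula's LaTeX `value` back into the text, in order. The sample result fragment is a simplified, hypothetical shape based on the description above, not a verbatim service response.

```python
def render_formulas(content, formulas):
    """Replace each :formula: placeholder with the formula's LaTeX value, in order."""
    for formula in formulas:
        content = content.replace(":formula:", formula["value"], 1)
    return content


# Simplified, illustrative fragment of an analyze result.
result = {
    "content": "Einstein's relation :formula: appears inline.",
    "formulas": [
        {"kind": "inline", "value": "E = m c^{2}", "polygon": [1, 2, 3, 4, 5, 6, 7, 8]},
    ],
}

rendered = render_formulas(result["content"], result["formulas"])
```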

## Font property extraction

The `ocr.font` capability extracts all font properties of text extracted in the `styles` collection as a top-level object under `content`. Each style object specifies a single font property, the text span it applies to, and its corresponding confidence score. The existing style property is extended with more font properties such as `similarFontFamily` for the font of the text, `fontStyle` for styles such as italic and normal, `fontWeight` for bold or normal, `color` for color of the text, and `backgroundColor` for color of the text bounding box.

## Barcode property extraction

The `ocr.barcode` capability extracts all identified barcodes in the `barcodes` collection as a top-level object under `content`. Inside `content`, detected barcodes are represented as `:barcode:`. Each entry in this collection represents a barcode and includes the barcode type as `kind` and the embedded barcode content as `value`, along with its `polygon` coordinates. Initially, barcodes appear at the end of each page. The `confidence` is hard-coded as 1.
|`ITF`|:::image type="content" source="media/barcodes/interleaved-two-five.png" alt-text="Screenshot of the interleaved-two-of-five barcode (ITF).":::|
|`Data Matrix`|:::image type="content" source="media/barcodes/datamatrix.gif" alt-text="Screenshot of the Data Matrix.":::|
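A small sketch of consuming the `barcodes` collection described above, pairing each detected barcode's `kind` with its decoded `value`. The sample entries are hypothetical, shaped to match the fields named in the description.

```python
def summarize_barcodes(barcodes):
    """Map each detected barcode entry to a (kind, decoded value) pair."""
    return [(barcode["kind"], barcode["value"]) for barcode in barcodes]


# Illustrative fragment of a barcodes collection; confidence is always 1.
sample = [
    {"kind": "QRCode", "value": "https://example.com", "confidence": 1},
    {"kind": "ITF", "value": "00012345678905", "confidence": 1},
]

kinds = summarize_barcodes(sample)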

## Language detection

Adding the `languages` feature to the `analyzeResult` request predicts the detected primary language for each text line along with the `confidence` in the `languages` collection under `analyzeResult`.
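Because the service reports a locale per text line, a caller often wants a single dominant language for the document. A minimal sketch, weighting each locale by the total span length it covers (the sample `languages` entries are hypothetical, shaped to match the fields described above):

```python
def dominant_locale(languages):
    """Pick the locale covering the most text, weighted by span length."""
    totals = {}
    for entry in languages:
        covered = sum(span["length"] for span in entry["spans"])
        totals[entry["locale"]] = totals.get(entry["locale"], 0) + covered
    return max(totals, key=totals.get)


# Illustrative fragment of the languages collection under analyzeResult.
sample = [
    {"locale": "en", "confidence": 0.95, "spans": [{"offset": 0, "length": 131}]},
    {"locale": "fr", "confidence": 0.80, "spans": [{"offset": 131, "length": 12}]},
]

primary = dominant_locale(sample)
```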
```json
"languages": [
    {
        "locale": "en",
        "spans": [
            {
                "offset": 0,
                "length": 131
            }
        ],
        "confidence": 0.95
    }
]
```

## Key-value pairs

In earlier API versions, the prebuilt-document model extracted key-value pairs from forms and documents. With the addition of the `keyValuePairs` feature to prebuilt-layout, the layout model now produces the same results.
Key-value pairs are specific spans within the document that identify a label or key and its associated response or value. In a structured form, these pairs could be the label and the value the user entered for that field. In an unstructured document, they could be the date a contract was executed, based on the text in a paragraph. The AI model is trained to extract identifiable keys and values based on a wide variety of document types, formats, and structures.
Keys can also exist in isolation when the model detects that a key exists with no associated value, or when processing optional fields. For example, a middle name field can be left blank on a form in some instances. Key-value pairs are spans of text contained in the document. For documents where the same value is described in different ways, for example, customer/user, the associated key is either customer or user (based on context).
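The behavior above, including keys detected with no associated value, can be sketched as a small helper that flattens a key-value pairs collection into a dictionary. The sample response fragment is hypothetical, shaped to match the description; it isn't a verbatim service payload.

```python
def to_field_map(key_value_pairs):
    """Flatten key-value pairs into a dict; keys detected without a value map to None."""
    fields = {}
    for pair in key_value_pairs:
        key = pair["key"]["content"]
        value = pair.get("value")  # may be absent for isolated keys
        fields[key] = value["content"] if value else None
    return fields


# Illustrative fragment: the second key was detected with no value (blank optional field).
sample = [
    {"key": {"content": "First name:"}, "value": {"content": "Avery"}, "confidence": 0.98},
    {"key": {"content": "Middle name:"}, "confidence": 0.91},
]

fields = to_field_map(sample)
```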

## Query fields

Query fields are an add-on capability to extend the schema extracted from any prebuilt model, or to define a specific key name when the key name is variable. To use query fields, set the `features` query string parameter to `queryFields` and provide a comma-separated list of field names in the `queryFields` property.
* Document Intelligence now supports query field extractions. With query field extraction, you can add fields to the extraction process using a query request without the need for added training.
* Use query fields when you need to extend the schema of a prebuilt or custom model or need to extract a few fields with the output of layout.
* In addition to the query fields, the response includes text, tables, selection marks, and other relevant data.
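The two query string properties involved can be sketched as follows; the field names `VendorName` and `TotalTax` are hypothetical examples, and only the shape of the query string is the point being shown.

```python
from urllib.parse import urlencode

# queryFields is enabled via `features` and then given its own
# comma-separated list of field names to extract.
params = {
    "api-version": "2023-10-31-preview",
    "features": "queryFields",
    "queryFields": "VendorName,TotalTax",  # hypothetical field names
}

query = urlencode(params, safe=",")  # keep the commas between field names unescaped
```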
**Composed models**. A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. When a document is submitted for analysis using a composed model, the service performs a classification to decide which custom model best represents the submitted document.
With composed models, you can assign multiple custom models to a composed model called with a single model ID. It's useful when you train several models and want to group them to analyze similar form types. For example, your composed model might include custom models trained to analyze your supply, equipment, and furniture purchase orders. Instead of manually trying to select the appropriate model, you can use a composed model to determine the appropriate custom model for each analysis and extraction.
* `Custom form` and `Custom template` models can be composed together into a single composed model.
### Composed model compatibility
|Custom model type|Models trained with v2.1 and v2.0|Custom template models v3.0|Custom neural models v3.0|Custom neural models v3.1|
|--|--|--|--|--|
|**Models trained with version 2.1 and v2.0**|Supported|Supported|Not Supported|Not Supported|
* To compose a model trained with a prior version of the API (v2.1 or earlier), train a model with the v3.0 API using the same labeled dataset. That addition ensures that the v2.1 model can be composed with other models.