Commit e187520

Merge pull request #263669 from laujan/262602-vinod-doc-cleanup
262602 vinod doc cleanup
2 parents 30e0584 + ac76919 commit e187520

15 files changed: +1378 -1167 lines changed

articles/ai-services/document-intelligence/choose-model-feature.md

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ ms.service: azure-ai-document-intelligence
88
ms.custom:
99
- ignite-2023
1010
ms.topic: overview
11-
ms.date: 11/15/2023
11+
ms.date: 01/19/2024
1212
ms.author: lajanuar
1313
---
1414

@@ -75,6 +75,14 @@ The following decision charts highlight the features of each **Document Intellig
7575
|**Structured, semi-structured, and unstructured documents**.|&#9679; Structured &rightarrow; surveys</br>&#9679; Semi-structured &rightarrow; invoices</br>&#9679; Unstructured &rightarrow; letters| [**Custom neural model**](concept-custom-neural.md)|
7676
|**A collection of several models each trained on similar-type documents.** |&#9679; Supply purchase orders</br>&#9679; Equipment purchase orders</br>&#9679; Furniture purchase orders</br> **All composed into a single model**.| [**Composed custom model**](concept-composed-models.md)|
7777

78+
## Custom classification model
79+
80+
| Training set | Example documents | Your best solution |
81+
| -----------------|--------------|-------------------|
82+
|**At least two different types of documents**. |Forms, letters, or documents | [**Custom classification model**](./concept-custom-classifier.md)|
83+
84+
85+
7886
## Next steps
7987

8088
* [Learn how to process your own forms and documents](quickstarts/try-document-intelligence-studio.md) with the [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio)

articles/ai-services/document-intelligence/concept-add-on-capabilities.md

Lines changed: 109 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ ms.service: azure-ai-document-intelligence
88
ms.custom:
99
- ignite-2023
1010
ms.topic: conceptual
11-
ms.date: 11/21/2023
11+
ms.date: 01/19/2024
1212
ms.author: lajanuar
1313
monikerRange: '>=doc-intel-3.1.0'
1414
---
@@ -34,7 +34,7 @@ monikerRange: '>=doc-intel-3.1.0'
3434

3535
:::moniker range=">=doc-intel-3.1.0"
3636

37-
Document Intelligence supports more sophisticated and modular analysis capabilities. Use the add-on features to extend the results to include more features extracted from your documents. Some add-on features incur an extra cost. These optional features can be enabled and disabled depending on the scenario of the document extraction. The following add-on capabilities are available for `2023-07-31 (GA)` and later releases:
37+
Document Intelligence supports more sophisticated and modular analysis capabilities. Use the add-on features to extend the results to include more features extracted from your documents. Some add-on features incur an extra cost. These optional features can be enabled and disabled depending on the scenario of the document extraction. To enable a feature, add the associated feature name to the `features` query string property. You can enable more than one add-on feature on a request by providing a comma-separated list of features. The following add-on capabilities are available for `2023-07-31 (GA)` and later releases.
3838

3939
* [`ocrHighResolution`](#high-resolution-extraction)
4040

@@ -52,11 +52,12 @@ Document Intelligence supports more sophisticated and modular analysis capabilit
5252

5353
> [!NOTE]
5454
>
55-
> Not all add-on capabilities are supported by all models. For more information, *see* [model data extraction](concept-model-overview.md#model-data-extraction).
55+
> Not all add-on capabilities are supported by all models. For more information, *see* [model data extraction](concept-model-overview.md#analysis-features).
5656
5757
The following add-on capabilities are available for `2023-10-31-preview` and later releases:
5858

5959
* [`keyValuePairs`](#key-value-pairs)
60+
6061
* [`queryFields`](#query-fields)
6162

6263
> [!NOTE]
@@ -65,10 +66,37 @@ The following add-on capability is available for `2023-10-31-preview` and later
6566
6667
::: moniker-end
6768

69+
|Add-on Capability| Add-On/Free|[2023-10-31-preview](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-10-31-preview&preserve-view=true&tabs=HTTP)|[`2023-07-31` (GA)](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-07-31&preserve-view=true&tabs=HTTP)|[`2022-08-31` (GA)](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2022-08-31/operations/AnalyzeDocument)|[v2.1 (GA)](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1/operations/AnalyzeBusinessCardAsync)|
70+
|----------------|-----------|---|--|---|---|
71+
|Font property extraction|Add-On| ✔️| ✔️| n/a| n/a|
72+
|Formula extraction|Add-On| ✔️| ✔️| n/a| n/a|
73+
|High resolution extraction|Add-On| ✔️| ✔️| n/a| n/a|
74+
|Barcode extraction|Free| ✔️| ✔️| n/a| n/a|
75+
|Language detection|Free| ✔️| ✔️| n/a| n/a|
76+
|Key value pairs|Free| ✔️|n/a|n/a| n/a|
77+
|Query fields|Add-On*| ✔️|n/a|n/a| n/a|
78+
79+
80+
Add-On* - Query fields are priced differently than the other add-on features. See [pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/) for details.
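As a rough illustration of the comma-separated `features` query string property, the following Python sketch assembles an analyze URL for more than one add-on feature. The helper name `build_analyze_url` and the resource name `example-resource` are hypothetical; the path segments and `api-version` values mirror the REST examples in this article.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint, features, model_id="prebuilt-layout",
                      api_version="2023-10-31-preview",
                      route="documentintelligence"):
    """Build an analyze URL with optional add-on features.

    `route` is "documentintelligence" for 2023-10-31-preview and
    "formrecognizer" for 2023-07-31, as in the REST examples.
    """
    query = {"api-version": api_version}
    if features:
        # Several add-on features are sent as one comma-separated value.
        query["features"] = ",".join(features)
    # safe="," keeps the commas readable instead of encoding them as %2C.
    query_string = urlencode(query, safe=",")
    return (f"{endpoint.rstrip('/')}/{route}/documentModels/"
            f"{model_id}:analyze?{query_string}")


# Hypothetical resource name, used only for illustration.
url = build_analyze_url(
    "https://example-resource.cognitiveservices.azure.com",
    ["ocrHighResolution", "formulas"],
)
print(url)
```

For the `2023-07-31` release, the same helper could be called with `api_version="2023-07-31"` and `route="formrecognizer"`.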
81+
6882
## High resolution extraction
6983

7084
The task of recognizing small text from large-size documents, like engineering drawings, is a challenge. Often the text is mixed with other graphical elements and has varying fonts, sizes and orientations. Moreover, the text can be broken into separate parts or connected with other symbols. Document Intelligence now supports extracting content from these types of documents with the `ocr.highResolution` capability. You get improved quality of content extraction from A1/A2/A3 documents by enabling this add-on capability.
7185

86+
### REST API
87+
88+
::: moniker range="doc-intel-4.0.0"
89+
```REST
90+
https://{your resource}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2023-10-31-preview&features=ocrHighResolution
91+
```
92+
:::moniker-end
93+
94+
:::moniker range="doc-intel-3.1.0"
95+
```REST
96+
https://{your resource}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=ocrHighResolution
97+
```
98+
:::moniker-end
99+
72100
## Formula extraction
73101

74102
The `ocr.formula` capability extracts all identified formulas, such as mathematical equations, in the `formulas` collection as a top level object under `content`. Inside `content`, detected formulas are represented as `:formula:`. Each entry in this collection represents a formula that includes the formula type as `inline` or `display`, and its LaTeX representation as `value` along with its `polygon` coordinates. Initially, formulas appear at the end of each page.
@@ -101,6 +129,20 @@ The `ocr.formula` capability extracts all identified formulas, such as mathemati
101129
]
102130
```
103131

132+
### REST API
133+
134+
::: moniker range="doc-intel-4.0.0"
135+
```REST
136+
https://{your resource}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2023-10-31-preview&features=formulas
137+
```
138+
:::moniker-end
139+
140+
:::moniker range="doc-intel-3.1.0"
141+
```REST
142+
https://{your resource}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=formulas
143+
```
144+
:::moniker-end
145+
104146
## Font property extraction
105147

106148
The `ocr.font` capability extracts all font properties of text extracted in the `styles` collection as a top-level object under `content`. Each style object specifies a single font property, the text span it applies to, and its corresponding confidence score. The existing style property is extended with more font properties such as `similarFontFamily` for the font of the text, `fontStyle` for styles such as italic and normal, `fontWeight` for bold or normal, `color` for color of the text, and `backgroundColor` for color of the text bounding box.
@@ -141,6 +183,20 @@ The `ocr.font` capability extracts all font properties of text extracted in the
141183
]
142184
```
143185

186+
### REST API
187+
188+
::: moniker range="doc-intel-4.0.0"
189+
```REST
190+
https://{your resource}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2023-10-31-preview&features=styleFont
191+
```
192+
:::moniker-end
193+
194+
:::moniker range="doc-intel-3.1.0"
195+
```REST
196+
https://{your resource}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=styleFont
197+
```
198+
:::moniker-end
199+
144200
## Barcode property extraction
145201

146202
The `ocr.barcode` capability extracts all identified barcodes in the `barcodes` collection as a top-level object under `content`. Inside `content`, detected barcodes are represented as `:barcode:`. Each entry in this collection represents a barcode and includes the barcode type as `kind` and the embedded barcode content as `value` along with its `polygon` coordinates. Initially, barcodes appear at the end of each page. The `confidence` is hard-coded as 1.
@@ -163,9 +219,23 @@ The `ocr.barcode` capability extracts all identified barcodes in the `barcodes`
163219
| `ITF` |:::image type="content" source="media/barcodes/interleaved-two-five.png" alt-text="Screenshot of the interleaved-two-of-five barcode (ITF).":::|
164220
| `Data Matrix` |:::image type="content" source="media/barcodes/datamatrix.gif" alt-text="Screenshot of the Data Matrix.":::|
165221

222+
### REST API
223+
224+
::: moniker range="doc-intel-4.0.0"
225+
```REST
226+
https://{your resource}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2023-10-31-preview&features=barcodes
227+
```
228+
:::moniker-end
229+
230+
:::moniker range="doc-intel-3.1.0"
231+
```REST
232+
https://{your resource}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=barcodes
233+
```
234+
:::moniker-end
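Once a response is available, the barcode entries can be grouped by type. The following Python sketch is a minimal example that relies only on the entry fields named in this section (`kind`, `value`, `polygon`, `confidence`). The helper name and the sample entries are hypothetical, and the `kind` strings are copied from the table above, so the exact strings the service returns may differ.

```python
from collections import defaultdict


def barcode_values_by_kind(barcodes):
    """Group decoded barcode values by their `kind`."""
    grouped = defaultdict(list)
    for barcode in barcodes:
        grouped[barcode["kind"]].append(barcode["value"])
    return dict(grouped)


# Made-up entries for illustration only; not real service output.
sample_barcodes = [
    {"kind": "ITF", "value": "00012345678905",
     "polygon": [0, 0, 1, 0, 1, 1, 0, 1], "confidence": 1},
    {"kind": "ITF", "value": "10012345678902",
     "polygon": [0, 2, 1, 2, 1, 3, 0, 3], "confidence": 1},
    {"kind": "Data Matrix", "value": "example-payload",
     "polygon": [2, 0, 3, 0, 3, 1, 2, 1], "confidence": 1},
]

print(barcode_values_by_kind(sample_barcodes))
```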
235+
166236
## Language detection
167237

168-
It predicts the detected primary language for each text line along with the `confidence` in the `languages` collection under `analyzeResult`.
238+
Adding the `languages` feature to the analyze request predicts the primary language for each text line, along with its `confidence`, in the `languages` collection under `analyzeResult`.
169239

170240
```json
171241
"languages": [
@@ -182,16 +252,40 @@ It predicts the detected primary language for each text line along with the `con
182252
]
183253
```
184254

255+
### REST API
256+
257+
::: moniker range="doc-intel-4.0.0"
258+
```REST
259+
https://{your resource}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2023-10-31-preview&features=languages
260+
```
261+
:::moniker-end
262+
263+
:::moniker range="doc-intel-3.1.0"
264+
```REST
265+
https://{your resource}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=languages
266+
```
267+
:::moniker-end
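The following Python sketch shows one way to keep only the language predictions above a confidence threshold. The JSON sample in this diff is truncated, so the `locale` and `spans` field names are assumptions, as are the helper name and the sample values; only the `languages` collection and the `confidence` field come from the description above.

```python
def confident_languages(analyze_result, min_confidence=0.8):
    """Return (locale, confidence) pairs at or above `min_confidence`."""
    return [
        (entry["locale"], entry["confidence"])
        for entry in analyze_result.get("languages", [])
        if entry["confidence"] >= min_confidence
    ]


# Made-up `analyzeResult` fragment for illustration; field names other than
# `languages` and `confidence` are assumed.
sample_result = {
    "languages": [
        {"spans": [{"offset": 0, "length": 131}],
         "locale": "en", "confidence": 0.95},
        {"spans": [{"offset": 132, "length": 40}],
         "locale": "fr", "confidence": 0.42},
    ]
}

print(confident_languages(sample_result))
```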
268+
185269
:::moniker range="doc-intel-4.0.0"
186270

187271
## Key-value Pairs
188272

273+
In earlier API versions, the prebuilt-document model extracted key-value pairs from forms and documents. With the addition of the `keyValuePairs` feature to prebuilt-layout, the layout model now produces the same results.
274+
189275
Key-value pairs are specific spans within the document that identify a label or key and its associated response or value. In a structured form, these pairs could be the label and the value the user entered for that field. In an unstructured document, they could be the date a contract was executed on based on the text in a paragraph. The AI model is trained to extract identifiable keys and values based on a wide variety of document types, formats, and structures.
190276

191277
Keys can also exist in isolation when the model detects that a key exists, with no associated value or when processing optional fields. For example, a middle name field can be left blank on a form in some instances. Key-value pairs are spans of text contained in the document. For documents where the same value is described in different ways, for example, customer/user, the associated key is either customer or user (based on context).
192278

279+
### REST API
280+
281+
```REST
282+
https://{your resource}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2023-10-31-preview&features=keyValuePairs
283+
```
284+
193285
## Query Fields
194286

287+
Query fields are an add-on capability that extends the schema extracted from any prebuilt model or defines a specific key name when the key name is variable. To use query fields, set the `features` property to `queryFields` and provide a comma-separated list of field names in the `queryFields` property.
288+
195289
* Document Intelligence now supports query field extractions. With query field extraction, you can add fields to the extraction process using a query request without the need for added training.
196290

197291
* Use query fields when you need to extend the schema of a prebuilt or custom model or need to extract a few fields with the output of layout.
@@ -222,10 +316,21 @@ For query field extraction, specify the fields you want to extract and Document
222316

223317
* In addition to the query fields, the response includes text, tables, selection marks, and other relevant data.
224318

319+
### REST API
320+
321+
```REST
322+
https://{your resource}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2023-10-31-preview&features=queryFields&queryFields=TERMS
323+
```
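The following Python sketch prepares, but doesn't send, a query field request like the one above. Several details are assumptions that don't appear in this article: the `Ocp-Apim-Subscription-Key` header, the `urlSource` request body, the helper name, and every sample value. The field name `PARTY` is hypothetical; `TERMS` comes from the REST example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_query_fields_request(endpoint, key, document_url, field_names):
    """Prepare a POST request that asks prebuilt-layout for extra fields."""
    query = urlencode(
        {
            "api-version": "2023-10-31-preview",
            "features": "queryFields",
            # Field names are sent as one comma-separated value.
            "queryFields": ",".join(field_names),
        },
        safe=",",
    )
    url = (f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
           f"prebuilt-layout:analyze?{query}")
    # Assumed request body: a JSON object that points at the document.
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # assumed auth header
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_query_fields_request(
    "https://example-resource.cognitiveservices.azure.com",
    "fake-key",
    "https://example.com/contract.pdf",
    ["TERMS", "PARTY"],
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would submit it; the sketch stops short of that so it runs without a resource or key.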
324+
225325
:::moniker-end
226326

227327
## Next steps
228328

229329
> [!div class="nextstepaction"]
230330
> Learn more:
231331
> [**Read model**](concept-read.md) [**Layout model**](concept-layout.md).
332+
333+
> [!div class="nextstepaction"]
334+
> SDK samples:
335+
> [**python**](/python/api/overview/azure/ai-documentintelligence-readme).
336+

articles/ai-services/document-intelligence/concept-composed-models.md

Lines changed: 9 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ ms.service: azure-ai-document-intelligence
88
ms.custom:
99
- ignite-2023
1010
ms.topic: conceptual
11-
ms.date: 11/21/2023
11+
ms.date: 01/19/2024
1212
ms.author: lajanuar
1313
---
1414

@@ -34,7 +34,7 @@ ms.author: lajanuar
3434

3535
**Composed models**. A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. When a document is submitted for analysis using a composed model, the service performs a classification to decide which custom model best represents the submitted document.
3636

37-
With composed models, you can assign multiple custom models to a composed model called with a single model ID. It's useful when you've trained several models and want to group them to analyze similar form types. For example, your composed model might include custom models trained to analyze your supply, equipment, and furniture purchase orders. Instead of manually trying to select the appropriate model, you can use a composed model to determine the appropriate custom model for each analysis and extraction.
37+
With composed models, you can assign multiple custom models to a composed model called with a single model ID. It's useful when you train several models and want to group them to analyze similar form types. For example, your composed model might include custom models trained to analyze your supply, equipment, and furniture purchase orders. Instead of manually trying to select the appropriate model, you can use a composed model to determine the appropriate custom model for each analysis and extraction.
3838

3939
* ```Custom form``` and ```Custom template``` models can be composed together into a single composed model.
4040

@@ -57,13 +57,15 @@ With the introduction of [**custom classification models**](./concept-custom-cla
5757
5858
### Composed model compatibility
5959

60-
|Custom model type|Models trained with v2.1 and v2.0 | Custom template models v3.0 |Custom neural models v3.0 (preview) |Custom neural models 3.0 (GA)|
60+
|Custom model type|Models trained with v2.1 and v2.0 | Custom template models v3.0 |Custom neural models v3.0|Custom neural models v3.1|
6161
|--|--|--|--|--|
6262
|**Models trained with version 2.1 and v2.0** |Supported|Supported|Not Supported|Not Supported|
63-
|**Custom template models v3.0** |Supported|Supported|Not Supported|NotSupported|
64-
|**Custom template models v3.0 (GA)** |Not Supported|Not Supported|Supported|Not Supported|
65-
|**Custom neural models v3.0 (preview)**|Not Supported|Not Supported|Supported|Not Supported|
66-
|**Custom Neural models v3.0 (GA)**|Not Supported|Not Supported|Not Supported|Supported|
63+
|**Custom template models v3.0** |Supported|Supported|Not Supported|Not Supported|
64+
|**Custom template models v3.0** |Not Supported|Not Supported|Not Supported|Not Supported|
65+
|**Custom template models v3.1** |Not Supported|Not Supported|Not Supported|Not Supported|
66+
|**Custom Neural models v3.0**|Not Supported|Not Supported|Supported|Supported|
67+
|**Custom Neural models v3.1**|Not Supported|Not Supported|Supported|Supported|
68+
6769

6870
* To compose a model trained with a prior version of the API (v2.1 or earlier), train a model with the v3.0 API using the same labeled dataset. That addition ensures that the v2.1 model can be composed with other models.
6971
