@@ -184,15 +184,16 @@ Document Intelligence v2.1 supports the following tools, applications, and libra
The layout model extracts text, selection marks, tables, paragraphs, and paragraph types (`roles`) from your documents.

> [!NOTE]
-> Microsoft Word and HTML file are supported since `2023-10-31-preview`. Compared with PDF and images, below features are not supported:
-> - There are no angle, width/height and unit with each page object.
-> - For each object detected, there is no bounding polygon or bounding region.
-> - Page range (`pages`) is not supported as a parameter.
-> - No `lines` object.
+> Version `2023-10-31-preview` and later support Microsoft Word and HTML files. The following features are not supported:
+>
+> * There are no angle, width/height, or unit values with each page object.
+> * For each object detected, there is no bounding polygon or bounding region.
+> * Page range (`pages`) is not supported as a parameter.
+> * No `lines` object.

### Pages

-The pages collection is a list of pages within the document. For each page, it is represented with the sequential number of the page within the document, the orientation angle, which could indicate if the page has been rotated, the width and height (dimentions in pixels) of the page. The page units in the model output are computed as shown:
+The pages collection is a list of pages within the document. Each page is represented sequentially within the document and includes the orientation angle indicating if the page is rotated and the width and height (dimensions in pixels). The page units in the model output are computed as shown:
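The `pages` collection described here can be consumed with a few lines of code. The following Python sketch walks a result dictionary shaped like the documented JSON output; the sample data is illustrative, not real service output, and for Word/HTML inputs the angle, size, and unit keys may be absent, which is why `.get()` is used.

```python
# Minimal sketch: summarize the `pages` collection from a layout analyze result.
# The `result` dictionary below is illustrative sample data, not real service output.
result = {
    "pages": [
        {"pageNumber": 1, "angle": 0.12, "width": 8.5, "height": 11.0, "unit": "inch"},
        {"pageNumber": 2, "angle": -0.05, "width": 8.5, "height": 11.0, "unit": "inch"},
    ]
}

def summarize_pages(result):
    """Return one summary line per page: number, rotation angle, and dimensions."""
    lines = []
    for page in result.get("pages", []):
        lines.append(
            f"Page {page['pageNumber']}: angle={page.get('angle')}, "
            f"size={page.get('width')}x{page.get('height')} {page.get('unit')}"
        )
    return lines

for line in summarize_pages(result):
    print(line)
```

For Word or HTML input, where angle/size/unit are not reported, the same loop prints `None` for those values instead of failing.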
@@ -271,11 +272,11 @@ The new machine-learning based page object detection extracts logical roles like
```

-### Text, lines and words
+### Text, lines, and words

The document layout model in Document Intelligence extracts print and handwritten style text as `lines` and `words`. The `styles` collection includes any handwritten style for lines if detected along with the spans pointing to the associated text. This feature applies to [supported handwritten languages](language-support.md).

-For Microsoft Word, Excel, PowerPoint, and HTML, Document Intelligence version 2023-10-31-preview the Layout model extracts all embedded text as is. Texts are extrated as words and paragraphs. Embedded images are not supported.
+For Microsoft Word, Excel, PowerPoint, and HTML files, the Document Intelligence version 2023-10-31-preview Layout model extracts all embedded text as is. Text is extracted as words and paragraphs. Embedded images aren't supported.

```json
"words": [
@@ -312,7 +313,8 @@ The response includes classifying whether each text line is of handwriting style
]
}
```

-If you have turned on [font/style addon capability](concept-add-on-capabilities.md#font-property-extraction), you will also get the font/style result as part of the `styles` object.
+If you enable the [font/style addon capability](concept-add-on-capabilities.md#font-property-extraction), you also get the font/style result as part of the `styles` object.
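The `styles` spans point back into the result's `content` string, so handwritten passages can be recovered by slicing. This is a minimal sketch over the documented output shape; the sample dictionary is illustrative, not real service output.

```python
# Sketch: pull handwritten text out of a result using the `styles` spans.
# `analyze_result` is illustrative sample data shaped like the documented output.
analyze_result = {
    "content": "Signed: Jane Doe\nInvoice total: $120.00",
    "styles": [
        {"isHandwritten": True, "confidence": 0.95, "spans": [{"offset": 8, "length": 8}]}
    ],
}

def handwritten_snippets(result, min_confidence=0.8):
    """Return text snippets flagged as handwritten with sufficient confidence."""
    content = result["content"]
    snippets = []
    for style in result.get("styles", []):
        if style.get("isHandwritten") and style.get("confidence", 0) >= min_confidence:
            for span in style["spans"]:
                snippets.append(content[span["offset"]: span["offset"] + span["length"]])
    return snippets

print(handwritten_snippets(analyze_result))  # prints ['Jane Doe']
```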
-Azure AI Document Intelligence supports a wide variety of models that enable you to add intelligent document processing to your apps and flows. You can use a prebuilt domain-specific model or train a custom model tailored to your specific business need and use cases. Document Intelligence can be used with the REST API or Python, C#, Java, and JavaScript SDKs.
+Azure AI Document Intelligence supports a wide variety of models that enable you to add intelligent document processing to your apps and flows. You can use a prebuilt domain-specific model or train a custom model tailored to your specific business need and use cases. Document Intelligence can be used with the REST API or Python, C#, Java, and JavaScript client libraries.

::: moniker-end

## Model overview
@@ -88,7 +88,7 @@ Add-On* - Query fields are priced differently than the other add-on features. Se
|[Read OCR](#read-ocr)| Extract print and handwritten text including words, locations, and detected languages.|
|[Layout analysis](#layout-analysis)| Extract text and document layout elements like tables, selection marks, titles, section headings, and more.|
|**Prebuilt models**||
-|[Health insurance card](#health-insurance-card)| Automate healthcare processes by extracting insurer, member, prescription, group number and other key information from US health insurance cards.|
+|[Health insurance card](#health-insurance-card)| Automate healthcare processes by extracting insurer, member, prescription, group number, and other key information from US health insurance cards.|
|[US Tax document models](#us-tax-documents)| Process US tax forms to extract employee, employer, wage, and other information. |
|[Contract](#contract)| Extract agreement and party details.|
|[Invoice](#invoice)| Automate invoices. |
@@ -97,8 +97,8 @@ Add-On* - Query fields are priced differently than the other add-on features. Se
|[Business card](#business-card)| Scan business cards to extract key fields and data into your applications. |
|**Custom models**||
|[Custom model (overview)](#custom-models)| Extract data from forms and documents specific to your business. Custom models are trained for your distinct data and use cases. |
-|[Custom extraction models](#custom-extraction)|●**Custom template models** use layout cues to extract values from documents and are suitable to extract fields from highly structured documents with defined visual templates.</br>●**Custom neural models** are trained on various document types to extract fields from structured, semi-structured and unstructured documents.|
-| [Custom classification model](#custom-classifier)| The **Custom classification model** can classify each page in an input file to identify the document(s) within and can also identify multiple documents or multiple instances of a single document within an input file.
+|[Custom extraction models](#custom-extraction)|●**Custom template models** use layout cues to extract values from documents and are suitable to extract fields from highly structured documents with defined visual templates.</br>●**Custom neural models** are trained on various document types to extract fields from structured, semi-structured, and unstructured documents.|
+| [Custom classification model](#custom-classifier)| The **Custom classification model** can classify each page in an input file to identify the documents within and can also identify multiple documents or multiple instances of a single document within an input file.
| [Composed models](#composed-models) | Combine several custom models into a single model to automate processing of diverse document types with a single composed model.

For all models, except Business card model, Document Intelligence now supports add-on capabilities to allow for more sophisticated analysis. These optional capabilities can be enabled and disabled depending on the scenario of the document extraction. There are seven add-on capabilities available for the `2023-07-31` (GA) and later API version:
@@ -109,7 +109,7 @@ For all models, except Business card model, Document Intelligence now supports a
-The invoice model automates processing of invoices to extracts customer name, billing address, due date, and amount due, line items and other key data. Currently, the model supports English, Spanish, German, French, Italian, Portuguese, and Dutch invoices.
+The invoice model automates processing of invoices to extract customer name, billing address, due date, amount due, line items, and other key data. Currently, the model supports English, Spanish, German, French, Italian, Portuguese, and Dutch invoices.

***Sample invoice processed using [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio/prebuilt?formType=invoice)***:
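Reading the extracted invoice data back out of a result typically means walking the analyzed document's fields. The sketch below assumes an analyze-result shape with `documents[0].fields`, and the field names (`CustomerName`, `DueDate`, `InvoiceTotal`) are illustrative placeholders; consult the service's published invoice field schema for the authoritative names.

```python
# Sketch: read selected fields from a prebuilt-invoice analyze result.
# The field names and the dictionary shape are assumptions for illustration only.
invoice_result = {
    "documents": [
        {
            "docType": "invoice",
            "fields": {
                "CustomerName": {"content": "Contoso Ltd.", "confidence": 0.98},
                "DueDate": {"content": "2024-01-31", "confidence": 0.95},
                "InvoiceTotal": {"content": "$110.00", "confidence": 0.97},
            },
        }
    ]
}

def extract_fields(result, wanted, min_confidence=0.9):
    """Return {field: content} for wanted fields meeting the confidence bar."""
    fields = result["documents"][0]["fields"]
    return {
        name: field["content"]
        for name, field in fields.items()
        if name in wanted and field.get("confidence", 0) >= min_confidence
    }

print(extract_fields(invoice_result, {"CustomerName", "InvoiceTotal"}))
```

Filtering on the per-field confidence score, as shown, is a common way to decide which values go straight to downstream systems and which need human review.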
@@ -263,7 +263,7 @@ Custom extraction model can be one of two types, **custom template** or **custom
-The custom classification model enables you to identify the document type prior to invoking the extraction model. The classification model is available starting with the `2023-07-31 (GA)` API. Training a custom classification model requires at least two distinct classes and a minimum of five samples per class.
+The custom classification model enables you to identify the document type before invoking the extraction model. The classification model is available starting with the `2023-07-31 (GA)` API. Training a custom classification model requires at least two distinct classes and a minimum of five samples per class.
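Identifying the document type before extraction usually feeds a routing step. Here is a minimal sketch of that pattern: the class names and model IDs are hypothetical placeholders, and the classification result is sample data shaped like a result with per-document `docType` and page lists, not real service output.

```python
# Sketch: route classified documents to per-type extraction models.
# Class names and model IDs below are hypothetical placeholders.
ROUTING = {
    "invoice": "my-invoice-extraction-model",
    "contract": "my-contract-extraction-model",
}

classification_result = {
    "documents": [
        {"docType": "invoice", "pages": [1, 2]},
        {"docType": "contract", "pages": [3]},
    ]
}

def plan_extraction(result, routing):
    """Map each classified document to the extraction model that should process it."""
    plan = []
    for doc in result["documents"]:
        model_id = routing.get(doc["docType"])
        if model_id:  # skip classes with no configured extraction model
            plan.append({"model": model_id, "pages": doc["pages"]})
    return plan

print(plan_extraction(classification_result, ROUTING))
```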
@@ -399,7 +399,7 @@ The business card model analyzes and extracts key information from business card
#### Composed custom model

-A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. You can assign multiple custom models to a composed model called with a single model ID. you can assign up to 100 trained custom models to a single composed model.
+A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. You can assign multiple custom models to a composed model called with a single model ID. You can assign up to 100 trained custom models to a single composed model.

***Composed model dialog window using the [Sample Labeling tool](https://formrecognizer.appliedai.azure.com/studio/customform/projects)***:
@@ -436,15 +436,15 @@ A composed model is created by taking a collection of custom models and assignin
::: moniker range=">=doc-intel-3.0.0"

-* Try processing your own forms and documents with the [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio)
+* Try processing your own forms and documents with the [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio).

* Complete a [Document Intelligence quickstart](quickstarts/get-started-sdks-rest-api.md?view=doc-intel-3.0.0&preserve-view=true) and get started creating a document processing app in the development language of your choice.

::: moniker-end

::: moniker range="doc-intel-2.1.0"

-* Try processing your own forms and documents with the [Document Intelligence Sample Labeling tool](https://fott-2-1.azurewebsites.net/)
+* Try processing your own forms and documents with the [Document Intelligence Sample Labeling tool](https://fott-2-1.azurewebsites.net/).

* Complete a [Document Intelligence quickstart](quickstarts/get-started-sdks-rest-api.md?view=doc-intel-2.1.0&preserve-view=true) and get started creating a document processing app in the development language of your choice.
articles/ai-services/document-intelligence/concept-read.md (11 additions and 10 deletions)
@@ -109,23 +109,24 @@ See our [Language Support—document analysis models](language-support-ocr.md) p
> [!NOTE]
> Microsoft Word and HTML file are supported in v3.1 and later versions. Compared with PDF and images, below features are not supported:
-> - There are no angle, width/height and unit with each page object.
-> - For each object detected, there is no bounding polygon or bounding region.
-> - Page range (`pages`) is not supported as a parameter.
-> - No `lines` object.
+>
+> * There are no angle, width/height, or unit values with each page object.
+> * For each object detected, there is no bounding polygon or bounding region.
+> * Page range (`pages`) is not supported as a parameter.
+> * No `lines` object.

### Pages

-The pages collection is a list of pages within the document. For each page, it is represented with the sequential number of the page within the document, the orientation angle, which could indicate if the page has been rotated, the width and height (dimentions in pixels) of the page. The page units in the model output are computed as shown:
+The pages collection is a list of pages within the document. Each page is represented sequentially within the document and includes the orientation angle indicating if the page is rotated and the width and height (dimensions in pixels). The page units in the model output are computed as shown:
|Images (JPEG/JPG, PNG, BMP, HEIF) | Each image = 1 page unit | Total images |
|PDF | Each page in the PDF = 1 page unit | Total pages in the PDF |
|TIFF | Each image in the TIFF = 1 page unit | Total images in the TIFF |
|Word (DOCX) | Up to 3,000 characters = 1 page unit, embedded or linked images not supported | Total pages of up to 3,000 characters each |
|Excel (XLSX) | Each worksheet = 1 page unit, embedded or linked images not supported | Total worksheets |
|PowerPoint (PPTX) | Each slide = 1 page unit, embedded or linked images not supported | Total slides |
|HTML | Up to 3,000 characters = 1 page unit, embedded or linked images not supported | Total pages of up to 3,000 characters each |
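The per-format rules in this table can be expressed as a small page-unit calculator. This is a sketch under the table's stated rules only (3,000 characters per unit for DOCX/HTML, one unit per page, image, worksheet, or slide); it is not an official billing tool.

```python
import math

# Sketch: compute page units from the documented per-format rules.
def page_units(file_type, *, pages=0, images=0, worksheets=0, slides=0, characters=0):
    """Return page units for one file, following the table's rules."""
    file_type = file_type.lower()
    if file_type in ("jpeg", "jpg", "png", "bmp", "heif"):
        return 1                              # each image file = 1 page unit
    if file_type == "pdf":
        return pages                          # each page in the PDF = 1 page unit
    if file_type == "tiff":
        return images                         # each image in the TIFF = 1 page unit
    if file_type in ("docx", "html"):
        return math.ceil(characters / 3000)   # up to 3,000 characters = 1 page unit
    if file_type == "xlsx":
        return worksheets                     # each worksheet = 1 page unit
    if file_type == "pptx":
        return slides                         # each slide = 1 page unit
    raise ValueError(f"unsupported file type: {file_type}")

print(page_units("docx", characters=7500))  # prints 3
```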
```json
@@ -165,7 +166,7 @@ The Read OCR model in Document Intelligence extracts all identified blocks of te
The Read OCR model extracts print and handwritten style text as `lines` and `words`. The model outputs bounding `polygon` coordinates and `confidence` for the extracted words. The `styles` collection includes any handwritten style for lines if detected along with the spans pointing to the associated text. This feature applies to [supported handwritten languages](language-support.md).

-For Microsoft Word, Excel, PowerPoint, and HTML, Document Intelligence Read model v3.1 and later versions extracts all embedded text as is. Texts are extrated as words and paragraphs. Embedded images are not supported.
+For Microsoft Word, Excel, PowerPoint, and HTML files, the Document Intelligence Read model v3.1 and later versions extract all embedded text as is. Text is extracted as words and paragraphs. Embedded images aren't supported.
```json
@@ -204,7 +205,7 @@ The response includes classifying whether each text line is of handwriting style
}
```

-If you have turned on [font/style addon capability](concept-add-on-capabilities.md#font-property-extraction), you will also get the font/style result as part of the `styles` object.
+If you enable the [font/style addon capability](concept-add-on-capabilities.md#font-property-extraction), you also get the font/style result as part of the `styles` object.
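Because the Read model reports a `confidence` score per extracted word, a common post-processing step is to flag low-confidence words for human review. The sketch below runs over sample data shaped like the documented `pages`/`words` output; it is illustrative, not real service output.

```python
# Sketch: flag low-confidence words from a Read result for review.
# `read_result` is illustrative sample data shaped like the documented output.
read_result = {
    "pages": [
        {
            "pageNumber": 1,
            "words": [
                {"content": "Total:", "confidence": 0.99},
                {"content": "$1,2B0.00", "confidence": 0.41},
            ],
        }
    ]
}

def low_confidence_words(result, threshold=0.8):
    """Return (page_number, word) pairs whose confidence falls below threshold."""
    flagged = []
    for page in result.get("pages", []):
        for word in page.get("words", []):
            if word["confidence"] < threshold:
                flagged.append((page["pageNumber"], word["content"]))
    return flagged

print(low_confidence_words(read_result))  # prints [(1, '$1,2B0.00')]
```

The threshold is a tuning knob: stricter values send more words to review, looser values trust the OCR more.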
@@ -266,7 +266,7 @@ Document Intelligence supports optional features that can be enabled and disable
✓ - Enabled</br>
O - Optional</br>
-\* - Premium features incur extra costs
+\* - Premium features incur extra costs.

## Models and development options
@@ -416,7 +416,7 @@ You can use Document Intelligence to automate document processing in application
| Model ID |Description|Development options |
|----------|--------------|-----------------|
-|**prebuilt-tax.us.1099(Variations)**|Extract information from 1099form variations.|●[**Document Intelligence Studio**](https://formrecognizer.appliedai.azure.com/studio)</br>●[**REST API**](https://westus.dev.cognitive.microsoft.com/docs/services?pattern=intelligence)|
+|**prebuilt-tax.us.1099(Variations)**|Extract information from 1099-form variations.|●[**Document Intelligence Studio**](https://formrecognizer.appliedai.azure.com/studio)</br>●[**REST API**](https://westus.dev.cognitive.microsoft.com/docs/services?pattern=intelligence)|

> [!div class="nextstepaction"]
> [Return to model types](#prebuilt-models)
@@ -550,17 +550,17 @@ Use the links in the table to learn more about each model and browse the API ref
::: moniker range=">=doc-intel-3.0.0"

-*[Choose a Document Intelligence model](choose-model-feature.md)
+* [Choose a Document Intelligence model](choose-model-feature.md).

-* Try processing your own forms and documents with the [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio)
+* Try processing your own forms and documents with the [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio).

* Complete a [Document Intelligence quickstart](quickstarts/get-started-sdks-rest-api.md?view=doc-intel-3.0.0&preserve-view=true) and get started creating a document processing app in the development language of your choice.

::: moniker-end

::: moniker range="doc-intel-2.1.0"

-* Try processing your own forms and documents with the [Document Intelligence Sample Labeling tool](https://fott-2-1.azurewebsites.net/)
+* Try processing your own forms and documents with the [Document Intelligence Sample Labeling tool](https://fott-2-1.azurewebsites.net/).

* Complete a [Document Intelligence quickstart](quickstarts/get-started-sdks-rest-api.md?view=doc-intel-2.1.0&preserve-view=true) and get started creating a document processing app in the development language of your choice.