The layout model extracts text, selection marks, tables, paragraphs, and paragraph types (`roles`) from your documents.
112
+
The layout model extracts text, selection marks, tables, paragraphs, and paragraph types (`roles`) from your documents. The following sections describe the page layout structural elements and how to extract them:
113
+
114
+
* [**Pages**](#pages)
115
+
* [**Paragraphs**](#paragraphs)
116
+
* [**Text, lines, and words**](#text-lines-and-words)
@@ -158,7 +166,7 @@ print(f"Page has width: {page.width} and height: {page.height}, measured with un
158
166
159
167
---
160
168
161
-
### Extract selected pages from documents
169
+
#### Extract selected pages
162
170
163
171
For large multi-page documents, use the `pages` query parameter to indicate specific page numbers or page ranges for text extraction.
164
172
@@ -177,7 +185,7 @@ The Layout model extracts all identified blocks of text in the `paragraphs` coll
177
185
]
178
186
```
179
187
180
-
### Paragraph roles
188
+
#### Paragraph roles
181
189
182
190
The new machine-learning-based page object detection extracts logical roles like titles, section headings, page headers, page footers, and more. The Document Intelligence Layout model assigns certain text blocks in the `paragraphs` collection a specialized role or type predicted by the model. Paragraph roles work best with unstructured documents, where they help you understand the layout of the extracted content for richer semantic analysis. The following paragraph roles are supported:
183
191
@@ -582,7 +590,11 @@ The following illustration shows the typical components in an image of a sample
582
590
583
591
:::image type="content" source="../media/document-layout-example-new.png" alt-text="Illustration of document layout example.":::
584
592
585
-
## Development options
593
+
## Supported languages and locales
594
+
595
+
*See* our [Language Support—document analysis models](../language-support/ocr.md) page for a complete list of supported languages.
596
+
597
+
## Tool and development options
586
598
587
599
:::moniker-end
588
600
@@ -616,16 +628,18 @@ Document Intelligence v2.1 supports the following tools, applications, and libra
* Supported file formats: JPEG, PNG, PDF, and TIFF.
630
644
* Supported number of pages: For PDF and TIFF, up to 2,000 pages are processed. For free tier subscribers, only the first two pages are processed.
631
645
* Supported file size: the file size must be less than 50 MB and dimensions at least 50 x 50 pixels and at most 10,000 x 10,000 pixels.
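The input limits listed above can be sketched as a rough pre-flight check in Python. The function name `check_layout_input`, its parameters, and the extension-to-format mapping are hypothetical and not part of any SDK. The thresholds are copied from the list above, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
import os

# Thresholds taken from the list above; 1 MB = 1024 * 1024 bytes is an assumption.
MAX_FILE_BYTES = 50 * 1024 * 1024
MIN_SIDE_PX = 50
MAX_SIDE_PX = 10_000
# Hypothetical mapping of file extensions to the supported formats (JPEG, PNG, PDF, TIFF).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf", ".tif", ".tiff"}


def check_layout_input(filename, size_bytes, width_px=None, height_px=None):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")
    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file size must be less than 50 MB")
    # Pixel dimensions are only checked when the caller supplies them.
    if width_px is not None and height_px is not None:
        if min(width_px, height_px) < MIN_SIDE_PX:
            problems.append("dimensions must be at least 50 x 50 pixels")
        if max(width_px, height_px) > MAX_SIDE_PX:
            problems.append("dimensions must be at most 10,000 x 10,000 pixels")
    return problems


ok_result = check_layout_input("scan.png", 1024, 800, 600)
bad_result = check_layout_input("scan.bmp", 60 * 1024 * 1024, 20, 20_000)
```

A check like this only catches obvious problems before upload; the service remains the authority on what it accepts.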
@@ -634,7 +648,7 @@ Document Intelligence v2.1 supports the following tools, applications, and libra
634
648
635
649
:::moniker range="<=doc-intel-3.1.0"
636
650
637
-
### Get started with Layout model
651
+
### Get started
638
652
639
653
See how data, including text, tables, table headers, selection marks, and structure information, is extracted from documents using Document Intelligence. You need the following resources:
640
654
@@ -698,10 +712,6 @@ See how data, including text, tables, table headers, selection marks, and struct
698
712
699
713
:::moniker-end
700
714
701
-
## Supported languages and locales
702
-
703
-
*See* our [Language Support—document analysis models](../language-support/ocr.md) page for a complete list of supported languages.
704
-
705
715
:::moniker range="doc-intel-2.1.0"
706
716
707
717
Document Intelligence v2.1 supports the following tools, applications, and libraries:
@@ -714,7 +724,7 @@ Document Intelligence v2.1 supports the following tools, applications, and libra
714
724
715
725
:::moniker range="<=doc-intel-3.1.0"
716
726
717
-
## Data extraction
727
+
## Extract data
718
728
719
729
The layout model extracts text, selection marks, tables, paragraphs, and paragraph types (`roles`) from your documents.
720
730
@@ -726,7 +736,7 @@ The layout model extracts text, selection marks, tables, paragraphs, and paragra
726
736
> * Page range (`pages`) is not supported as a parameter.
727
737
> * No `lines` object.
728
738
729
-
### Pages
739
+
### Page
730
740
731
741
The pages collection is a list of pages within the document. Each page is represented sequentially within the document and includes the orientation angle, which indicates whether the page is rotated, and the width and height (dimensions in pixels). The page units in the model output are computed as shown:
732
742
@@ -804,7 +814,7 @@ for page in result.pages:
804
814
805
815
For large multi-page documents, use the `pages` query parameter to indicate specific page numbers or page ranges for text extraction.
806
816
807
-
### Paragraphs
817
+
### Paragraph
808
818
809
819
The Layout model extracts all identified blocks of text in the `paragraphs` collection as a top-level object under `analyzeResults`. Each entry in this collection represents a text block and includes the extracted text as `content` and the bounding `polygon` coordinates. The `span` information points to the text fragment within the top-level `content` property that contains the full text from the document.
810
820
@@ -819,7 +829,7 @@ The Layout model extracts all identified blocks of text in the `paragraphs` coll
819
829
]
820
830
```
821
831
822
-
### Paragraph roles
832
+
#### Paragraph role
823
833
824
834
The new machine-learning-based page object detection extracts logical roles like titles, section headings, page headers, page footers, and more. The Document Intelligence Layout model assigns certain text blocks in the `paragraphs` collection a specialized role or type predicted by the model. Paragraph roles work best with unstructured documents, where they help you understand the layout of the extracted content for richer semantic analysis. The following paragraph roles are supported:
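As a minimal sketch of how paragraph roles might be consumed, the following Python groups paragraph text by role. It works on a plain dictionary shaped like the JSON response rather than on an SDK object. The helper name `group_paragraphs_by_role`, the sample data, and the `"body"` label for paragraphs without a role are all hypothetical; the role strings in the sample are assumed values for illustration.

```python
from collections import defaultdict


def group_paragraphs_by_role(analyze_result):
    """Group paragraph text by its `role`; paragraphs without one go under "body"."""
    groups = defaultdict(list)
    for paragraph in analyze_result.get("paragraphs", []):
        # "body" is a hypothetical label for paragraphs that carry no role.
        role = paragraph.get("role", "body")
        groups[role].append(paragraph["content"])
    return dict(groups)


# Hypothetical sample data shaped like the `paragraphs` collection.
sample_result = {
    "paragraphs": [
        {"role": "title", "content": "Annual report"},
        {"role": "sectionHeading", "content": "1. Overview"},
        {"content": "Revenue grew during the year."},
        {"role": "pageFooter", "content": "Page 1"},
    ]
}

grouped = group_paragraphs_by_role(sample_result)
```

Grouping by role makes it straightforward to, for example, drop page headers and footers before passing the remaining text to downstream analysis.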
825
835
@@ -852,7 +862,7 @@ The new machine-learning based page object detection extracts logical roles like
852
862
853
863
```
854
864
855
-
### Text, lines, and words
865
+
### Text, line, and word
856
866
857
867
The document layout model in Document Intelligence extracts print and handwritten style text as `lines` and `words`. The `styles` collection includes any handwritten style for lines, if detected, along with the spans pointing to the associated text. This feature applies to [supported handwritten languages](../language-support/prebuilt.md).
858
868
@@ -931,7 +941,7 @@ for line_idx, line in enumerate(page.lines):
931
941
932
942
:::moniker range="<=doc-intel-3.1.0"
933
943
934
-
### Handwritten style for text lines
944
+
### Handwritten style
935
945
936
946
The response includes a classification of whether each text line is in a handwriting style, along with a confidence score. For more information, see [Handwritten language support](../language-support/ocr.md). The following example shows a JSON snippet.
937
947
@@ -951,7 +961,7 @@ The response ../includes classifying whether each text line is of handwriting st
951
961
952
962
If you enable the [font/style addon capability](../concept-add-on-capabilities.md#font-property-extraction), you also get the font/style result as part of the `styles` object.
953
963
954
-
### Selection marks
964
+
### Selection mark
955
965
956
966
The Layout model also extracts selection marks from documents. Extracted selection marks appear within the `pages` collection for each page. They include the bounding `polygon`, `confidence`, and selection `state` (`selected/unselected`). The text representation (that is, `:selected:` and `:unselected:`) is also included as the starting index (`offset`) and `length` that reference the top-level `content` property that contains the full text from the document.
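To illustrate how `offset` and `length` relate a selection mark to the top-level `content`, here is a minimal Python sketch over plain dictionaries. The helper name `selection_mark_text`, the sample text, and the exact field layout (`selectionMarks`, `span`) are assumptions for illustration. The sketch also assumes offsets count Python string characters, which may not hold if the service reports offsets in a different string encoding.

```python
def selection_mark_text(content, mark):
    """Slice the text representation of a selection mark out of the full content."""
    span = mark["span"]
    start = span["offset"]
    return content[start:start + span["length"]]


# Hypothetical sample data: full document text plus one page of selection marks.
content = "Agree :selected: Disagree :unselected:"
page = {
    "selectionMarks": [
        {"state": "selected", "confidence": 0.95, "span": {"offset": 6, "length": 10}},
        {"state": "unselected", "confidence": 0.90, "span": {"offset": 26, "length": 12}},
    ]
}

texts = [selection_mark_text(content, mark) for mark in page["selectionMarks"]]
selected_count = sum(1 for mark in page["selectionMarks"] if mark["state"] == "selected")
```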
957
967
@@ -1017,7 +1027,7 @@ for selection_mark in page.selection_marks:
1017
1027
1018
1028
:::moniker range="<=doc-intel-3.1.0"
1019
1029
1020
-
### Tables
1030
+
### Table
1021
1031
1022
1032
Extracting tables is a key requirement for processing documents containing large volumes of data typically formatted as tables. The Layout model extracts tables in the `pageResults` section of the JSON output. Extracted table information includes the number of columns and rows, row span, and column span. Each cell is output with its bounding polygon, along with information about whether the area is recognized as a `columnHeader`. The model supports extracting tables that are rotated. Each table cell contains the row and column index and bounding polygon coordinates. For the cell text, the model outputs the `span` information containing the starting index (`offset`). The model also outputs the `length` within the top-level content that contains the full text from the document.
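The row and column indexes described above are enough to rebuild a table as a grid. The following Python is a minimal sketch over a plain dictionary. The helper names `table_to_grid` and `header_cells` are hypothetical, and the field names (`rowCount`, `columnCount`, `cells`, `rowIndex`, `columnIndex`, `kind`, `content`) are assumed from the description above rather than confirmed against a specific API version.

```python
def table_to_grid(table):
    """Place each cell's text at its (rowIndex, columnIndex) position in a 2-D list."""
    grid = [["" for _ in range(table["columnCount"])] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        # A spanning cell is written only at its top-left position in this sketch.
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return grid


def header_cells(table):
    """Return the text of cells recognized as column headers."""
    return [cell["content"] for cell in table["cells"] if cell.get("kind") == "columnHeader"]


# Hypothetical sample table with two rows and two columns.
sample_table = {
    "rowCount": 2,
    "columnCount": 2,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "kind": "columnHeader", "content": "Item"},
        {"rowIndex": 0, "columnIndex": 1, "kind": "columnHeader", "content": "Price"},
        {"rowIndex": 1, "columnIndex": 0, "content": "Widget"},
        {"rowIndex": 1, "columnIndex": 1, "content": "4.50"},
    ],
}

grid = table_to_grid(sample_table)
headers = header_cells(sample_table)
```

Cells that the model did not return stay as empty strings, so the grid always has `rowCount` rows of `columnCount` entries.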
1023
1033
@@ -1200,7 +1210,7 @@ Layout API extracts tables in the `pageResults` section of the JSON output. Docu
Layout API also extracts selection marks from documents. Extracted selection marks include the bounding box, confidence, and state (selected/unselected). Selection mark information is extracted in the `readResults` section of the JSON output.