Commit 74955a8

Merge pull request #5457 from HeidiSteen/heidist-work
[azure search] Updates to document intelligence layout skill
2 parents 82a653b + c9da546 commit 74955a8

3 files changed

+40
-34
lines changed

articles/search/cognitive-search-skill-document-intelligence-layout.md

Lines changed: 32 additions & 28 deletions
@@ -11,16 +11,16 @@ ms.custom:
1111
- references_regions
1212
- ignite-2024
1313
ms.topic: reference
14-
ms.date: 05/27/2025
14+
ms.date: 06/10/2025
1515
---
1616

1717
# Document Layout skill
1818

1919
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
2020

21-
The **Document Layout** skill analyzes a document to extract regions of interest and their inter-relationships to produce a syntactical representation of the document in Markdown or Text format. You can use it to extract text and images. Image extraction includes location metadata that preserves image position within the document. Image proximity to related content is better for Retrieval Augmented Generation (RAG) workloads and [multimodal search](multimodal-search-overview.md).
21+
The **Document Layout** skill analyzes a document to detect structure and characteristics, and produces a syntactical representation of the document in Markdown or Text format. You can use it to extract text and images, where image extraction includes location metadata that preserves image position within the document. Image proximity to related content adds value to Retrieval Augmented Generation (RAG) workloads and [multimodal search](multimodal-search-overview.md).
2222

23-
This article is the reference documentation for the Document Layout skill. For usage information, see [Structure-aware chunking and vectorization](search-how-to-semantic-chunking.md).
23+
This article is the reference documentation for the Document Layout skill. For usage information, see [How to chunk and vectorize by document layout](search-how-to-semantic-chunking.md).
2424

2525
It's common to use this skill on content such as PDFs that have structure and images. The following tutorials demonstrate several scenarios:
2626

@@ -34,15 +34,23 @@ It's common to use this skill on content such as PDFs that have structure and im
3434
> This skill is bound to a [billable Azure AI multi-service resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services Standard price](https://azure.microsoft.com/pricing/details/cognitive-services/).
3535
>
3636
37-
## Supported regions
37+
## Limitations
3838

39-
The Document Layout skill calls the [Document Intelligence Public preview version 2024-07-31-preview](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-07-31-preview)&preserve-view=true).
39+
During the public preview, this skill has the following restrictions:
4040

41-
Supported regions vary by modality:
41+
+ The skill isn't suitable for large documents that require more than 5 minutes of processing in the AI Document Intelligence layout model. The skill times out, but charges still apply to the AI Services multi-service resource if it's attached to the skillset for billing purposes. Ensure documents are optimized to stay within processing limits to avoid unnecessary costs.
42+
43+
## Supported regions
44+
45+
The Document Layout skill calls the [Document Intelligence Public preview version 2024-07-31-preview](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-07-31-preview)&preserve-view=true).
4246

43-
+ When you're using AI services keys [to attach your multi-service resource to your skillset](cognitive-search-attach-cognitive-services.md#bill-through-a-resource-key) via the REST API, both your Azure AI Search service and AI multi-service resource must be in the same region. This is only possible in the Azure regions of **East US**, **West Europe**, **North Central US**, **West US 2**. But if you're using a managed identity for [billing through a keyless connection](cognitive-search-attach-cognitive-services.md#bill-through-a-keyless-connection), your Azure AI Search service must be in one of the following regions: **East US**, **West Europe**, **North Central US**, **West US 2**. On the other hand, you can use AI Document Intelligence through an Azure AI multi-service resource in any region where this service is available. See [Product availability by region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/table).
47+
Supported regions vary by modality and how the skill connects to the Document Intelligence layout model.
4448

45-
+ In the [Import and vectorize data wizard](search-import-data-portal.md) in the Azure portal, you can enable document layout detection in the data source connection step. Document layout detection in the portal is available in the following Azure regions: **East US**, **West Europe**, **North Central US**. Create an Azure AI multi-service resource in one of these three regions to get the portal experience.
49+
| Approach | Regions | Requirement |
50+
|----------|---------|-------------|
51+
| [Import and vectorize data wizard](search-import-data-portal.md) | **East US**, **West Europe**, **North Central US** | Create an Azure AI multi-service resource in one of these regions to get the portal experience. |
52+
| Programmatic, using a [keyless connection (preview)](cognitive-search-attach-cognitive-services.md#bill-through-a-keyless-connection) for billing | Varies by resource | Create Azure AI Search in one of these regions: **East US**, **West Europe**, **North Central US**, **West US 2**. <br>Access Document Intelligence through an Azure AI multi-service resource in any region listed in the [Product availability by region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/table) table.|
53+
| Programmatic, using a [multi-service resource API key](cognitive-search-attach-cognitive-services.md#bill-through-a-resource-key) for billing | **East US**, **West Europe**, **North Central US**, **West US 2** | Create your Azure AI Search service and AI multi-service resource in the same region. |
4654

4755
## Supported file formats
4856

@@ -59,9 +67,13 @@ This skill recognizes the following file formats.
5967
+ .PPTX
6068
+ .HTML
6169

70+
## Supported languages
71+
72+
Refer to [Azure AI Document Intelligence layout model supported languages](/azure/ai-services/document-intelligence/language-support/ocr?view=doc-intel-3.1.0&tabs=read-print%2Clayout-print%2Cgeneral#layout&preserve-view=true) for printed text.
73+
6274
## Supported parameters
6375

64-
Several parameters are version-specific. The skills parameter table notes the API version in which a parameter was introduced so that you know whether a version upgrade is required. To use version-specific features such as image and location metadata extraction in [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true), you can use the Azure portal, or target a REST API version, or check an Azure SDK change log to see if it supports the feature.
76+
Several parameters are version-specific. The skills parameter table notes the API version in which a parameter was introduced so that you know how to configure the skill. To use version-specific features such as image and location metadata extraction in [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true), you can use the Azure portal, or target 2025-05-01-preview, or check an Azure SDK change log to see if it supports the new parameters.
6577

6678
The Azure portal supports most preview features and can be used to create or update a skillset. For updates to the Document Layout skill, edit the skillset JSON definition to add new preview parameters.
6779

@@ -75,17 +87,6 @@ Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill
7587
+ Although the maximum file size for analyzing documents is 500 MB for the [Azure AI Document Intelligence paid (S0) tier](https://azure.microsoft.com/pricing/details/cognitive-services/) and 4 MB for the [Azure AI Document Intelligence free (F0) tier](https://azure.microsoft.com/pricing/details/cognitive-services/), indexing is subject to the [indexer limits](search-limits-quotas-capacity.md#indexer-limits) of your search service tier.
7688
+ Image dimensions must be between 50 pixels x 50 pixels and 10,000 pixels x 10,000 pixels.
7789
+ If your PDFs are password-locked, remove the lock before running the indexer.
78-
79-
## Supported languages
80-
81-
Refer to [Azure AI Document Intelligence layout model supported languages](/azure/ai-services/document-intelligence/language-support/ocr?view=doc-intel-3.1.0&tabs=read-print%2Clayout-print%2Cgeneral#layout&preserve-view=true) for printed text.
82-
83-
## Limitations
84-
85-
During the public preview, this skill has the following restrictions:
86-
87-
+ The skill isn't suitable for large documents requiring more than 5 minutes of processing in the AI Document Intelligence layout model. The skill times out, but charges still apply to the AI Services multi-services resource if it attaches to the skillset for billing purposes. Ensure documents are optimized to stay within processing limits to avoid unnecessary costs.
88-
8990

9091
## Skill parameters
9192

@@ -97,13 +98,13 @@ Parameters are case-sensitive.
9798
| `markdownHeaderDepth` | [2024-11-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true) |`h1`, `h2`, `h3`, `h4`, `h5`, `h6(default)` | Only applies if `outputFormat` is set to `markdown`. This parameter describes the deepest nesting level that should be considered. For instance, if the markdownHeaderDepth is `h3`, any sections that are deeper such as `h4`, are rolled into `h3`. |
9899
| `outputFormat` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) |`markdown(default)`, `text` | **New**. Controls the format of the output generated by the skill. |
99100
| `extractionOptions` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) |`["images"]`, `["images", "locationMetadata"]`, `["locationMetadata"]` | **New**. Identify any extra content extracted from the document. Define an array of enums that correspond to the content to be included in the output. For instance, if the `extractionOptions` is `["images", "locationMetadata"]`, the output includes images and location metadata which provides page location information related to where the content was extracted, such as a page number or section. This parameter applies to both output formats. |
100-
| `chunkingProperties` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | See below | **New**. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata. |
101+
| `chunkingProperties` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | See below. | **New**. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata. |
101102

102103
| ChunkingProperties Parameter | Version | Allowed Values | Description |
103104
|--------------------|-------------|-------------|-------------|
104-
| `unit` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) |`Characters`. currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens | Controls the cardinality of the chunk unit. |
105-
| `maximumLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Any integer between 300-50000 | The maximum chunk length in characters as measured by String.Length. |
106-
| `overlapLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Integer. The value needs to be less than the half of the `maximumLength` | The length of overlap provided between two text chunks. |
105+
| `unit` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | `Characters`, currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens. | **New**. Controls the cardinality of the chunk unit. |
106+
| `maximumLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Any integer between 300 and 50000 | **New**. The maximum chunk length in characters as measured by String.Length. |
107+
| `overlapLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Integer. The value must be less than half of `maximumLength`. | **New**. The length of overlap provided between two text chunks. |
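
As a rough illustration of how the chunking parameters in the table above fit together, the following fragment sketches the relevant part of a skill definition. It isn't taken from the committed file. The values `2000` and `200` are arbitrary examples chosen to satisfy the documented constraints (`maximumLength` between 300 and 50000, `overlapLength` less than half of `maximumLength`). The lowercase casing of the `unit` value is an assumption; the table lists it as `Characters`, so verify the exact casing against the REST API reference before use.

```json
{
  "@odata.type": "#Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill",
  "outputFormat": "text",
  "chunkingProperties": {
    "unit": "characters",
    "maximumLength": 2000,
    "overlapLength": 200
  }
}
```

Because `chunkingProperties` only applies when `outputFormat` is `text`, the fragment sets both together. A complete definition also needs the skill's inputs and outputs, which are omitted here.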
107108

108109
## Skill inputs
109110

@@ -144,7 +145,8 @@ The file reference object can be generated in one of following ways:
144145
| `text_sections` | Only applies if `outputFormat` is set to `text`. A collection of text chunk objects, which represent the text within the bounds of a page (factoring in any additional chunking configured), *inclusive* of any section headers themselves. The text chunk object includes `locationMetadata` if applicable.|
145146
| `normalized_images` | Only applies if `outputFormat` is set to `text` and `extractionOptions` includes `images`. A collection of images that were extracted from the document, including `locationMetadata` if applicable.|
146147

147-
## Sample definition for markdown output mode
148+
### Sample definition for markdown output mode
149+
148150
```json
149151
{
150152
"skills": [
@@ -171,7 +173,7 @@ The file reference object can be generated in one of following ways:
171173
}
172174
```
173175

174-
## Sample output for markdown output mode
176+
### Sample output for markdown output mode
175177

176178
```json
177179
{
@@ -204,7 +206,7 @@ The value of the `markdownHeaderDepth` controls the number of keys in the "secti
204206

205207
This example demonstrates how to use the new parameters introduced in the **2025-05-01-preview** to output text content in fixed-sized chunks and extract images along with location metadata from the document.
206208

207-
## Sample definition for text output mode and image and metadata extraction
209+
### Sample definition for text output mode and image and metadata extraction
208210

209211
```json
210212
{
@@ -242,7 +244,7 @@ This example demonstrates how to use the new parameters introduced in the **2025
242244
}
243245
```
244246

245-
## Sample output for text output mode and image and metadata extraction
247+
### Sample output for text output mode and image and metadata extraction
246248

247249
```json
248250
{
@@ -292,7 +294,9 @@ This example demonstrates how to use the new parameters introduced in the **2025
292294
]
293295
}
294296
```
297+
295298
The skill uses [Azure AI Document Intelligence](/azure/ai-services/document-intelligence/overview) to compute locationMetadata. Refer to [Document Intelligence layout model](/azure/ai-services/document-intelligence/concept-layout) for details on how pages and bounding polygon coordinates are defined.
299+
296300
The `imagePath` represents the relative path of a stored image. If the knowledge store file projection is configured in the skillset, this path matches the relative path of the image stored in the knowledge store.
297301

298302
## See also

articles/search/search-get-started-portal-image-search.md

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ For content extraction, you can choose either default extraction via Azure AI Se
4040
| Method | Description |
4141
|--|--|
4242
| Default extraction | Extracts location metadata from PDF images only. Doesn't require another Azure AI resource. |
43-
| Enhanced extraction | Extracts location metadata from text and images for multiple document types. Requires an [Azure AI services multi-service resource](/azure/ai-services/multi-service-resource#azure-ai-multi-services-resource-for-azure-ai-search-skills) <sup>1</sup> in a [supported region](cognitive-search-skill-document-intelligence-layout.md#supported--regions). |
43+
| Enhanced extraction | Extracts location metadata from text and images for multiple document types. Requires an [Azure AI services multi-service resource](/azure/ai-services/multi-service-resource#azure-ai-multi-services-resource-for-azure-ai-search-skills) <sup>1</sup> in a [supported region](cognitive-search-skill-document-intelligence-layout.md#supported-regions). |
4444

4545
<sup>1</sup> For billing purposes, you must [attach your multi-service resource](cognitive-search-attach-cognitive-services.md) to the skillset in your Azure AI Search service. Unless you use a [keyless connection](cognitive-search-attach-cognitive-services.md#bill-through-a-keyless-connection) to create the skillset, both resources must be in the same region.
4646

articles/search/toc.yml

Lines changed: 7 additions & 5 deletions
@@ -204,11 +204,6 @@ items:
204204
href: search-security-overview.md
205205
- name: Secure access to external data
206206
href: search-indexer-securing-resources.md
207-
- name: Security controls by Azure Policy
208-
displayName: regulatory, compliance, standards, domains
209-
href: ./security-controls-policy.md
210-
- name: Security baseline
211-
href: /security/benchmark/azure/baselines/cognitive-search-security-baseline?toc=/azure/search/TOC.json
212207
- name: How-to guides
213208
items:
214209
- name: Service management
@@ -707,6 +702,13 @@ items:
707702
href: ./policy-reference.md
708703
- name: Monitoring data reference
709704
href: monitor-azure-cognitive-search-data-reference.md
705+
- name: Security reference
706+
items:
707+
- name: Security controls by Azure Policy
708+
displayName: regulatory, compliance, standards, domains
709+
href: ./security-controls-policy.md
710+
- name: Security baseline
711+
href: /security/benchmark/azure/baselines/cognitive-search-security-baseline?toc=/azure/search/TOC.json
710712
- name: Skills reference
711713
items:
712714
- name: Overview
