Skip to content

Commit 26d5b74

Browse files
authored
Merge pull request #7227 from MicrosoftDocs/release-sept-2025-search
release-sept-2025-search -> main -- 9/28
2 parents 7d8e2be + 4857f94 commit 26d5b74

File tree

44 files changed

+350
-742
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

44 files changed

+350
-742
lines changed

articles/search/cognitive-search-skill-azure-openai-embedding.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ ms.custom:
99
- ignite-2023
1010
- build-2024
1111
ms.topic: reference
12-
ms.date: 09/12/2025
12+
ms.date: 09/26/2025
1313
---
1414

1515
# Azure OpenAI Embedding skill
@@ -45,7 +45,7 @@ Parameters are case-sensitive.
4545

4646
| Inputs | Description |
4747
|---------------------|-------------|
48-
| `resourceUri` | The URI of the model provider. This parameter only supports URLs with the `openai.azure.com` domain, such as `https://<resourcename>.openai.azure.com`. If your Azure OpenAI endpoint has a URL with the `cognitiveservices.azure.com` domain, such as `https://<resourcename>.cognitiveservices.azure.com`, you must create a [custom subdomain](/azure/ai-services/openai/how-to/use-your-data-securely#enabled-custom-subdomain) with `openai.azure.com` for the Azure OpenAI resource and use `https://<resourcename>.openai.azure.com` instead. This field is required if your Azure OpenAI resource is deployed behind a private endpoint or uses Virtual Network (VNet) integration. |
48+
| `resourceUri` | The URI of the model provider. This parameter only supports URLs with the `openai.azure.com` domain, such as `https://<resourcename>.openai.azure.com`. If your Azure OpenAI endpoint has a URL with the `cognitiveservices.azure.com` domain, such as `https://<resourcename>.cognitiveservices.azure.com`, you must create a [custom subdomain](/azure/ai-services/openai/how-to/use-your-data-securely#enabled-custom-subdomain) with `openai.azure.com` for the Azure OpenAI resource and use `https://<resourcename>.openai.azure.com` instead. This field is required if your Azure OpenAI resource is deployed behind a private endpoint or uses Virtual Network (VNet) integration. [Azure API Management](/azure/api-management/api-management-key-concepts) endpoints are supported with URL `https://<resourcename>.azure-api.net `. Shared private links aren't supported for API Management endpoints.
4949
| `apiKey` | The secret key used to access the model. If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
5050
| `deploymentId` | The name of the deployed Azure OpenAI embedding model. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models.|
5151
| `authIdentity` | A user-managed identity used by the search service for connecting to Azure OpenAI. You can use either a [system or user managed identity](search-how-to-managed-identities.md). To use a system managed identity, leave `apiKey` and `authIdentity` blank. The system-managed identity is used automatically. A managed identity must have [Cognitive Services OpenAI User](/azure/ai-services/openai/how-to/role-based-access-control#azure-openai-roles) permissions to send text to Azure OpenAI. |

articles/search/cognitive-search-skill-document-intelligence-layout.md

Lines changed: 15 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -9,14 +9,12 @@ ms.custom:
99
- references_regions
1010
- ignite-2024
1111
ms.topic: reference
12-
ms.date: 09/19/2025
12+
ms.date: 09/28/2025
1313
ms.update-cycle: 365-days
1414
---
1515

1616
# Document Layout skill
1717

18-
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
19-
2018
The **Document Layout** skill analyzes a document to detect structure and characteristics, and produces a syntactical representation of the document in Markdown or Text format. You can use it to extract text and images, where image extraction includes location metadata that preserves image position within the document. Image proximity to related content is beneficial in Retrieval Augmented Generation (RAG) workloads and [multimodal search](multimodal-search-overview.md) scenarios.
2119

2220
This article is the reference documentation for the Document Layout skill. For usage information, see [How to chunk and vectorize by document layout](search-how-to-semantic-chunking.md).
@@ -37,11 +35,12 @@ This skill is bound to a [billable Azure AI multi-service resource](cognitive-se
3735
This skill has the following limitations:
3836

3937
+ The skill isn't suitable for large documents requiring more than 5 minutes of processing in the AI Document Intelligence layout model. The skill times out, but charges still apply to the AI Services multi-services resource if it attaches to the skillset for billing purposes. Ensure documents are optimized to stay within processing limits to avoid unnecessary costs.
38+
4039
+ Since this skill calls the Azure AI Document Intelligence layout model, all documented [service behaviors for different document types](/azure/ai-services/document-intelligence/prebuilt/layout#pages) for different file types apply to its output. For example, Word (DOCX) and PDF files may produce different results due to differences in how images are handled. If consistent image behavior across DOCX and PDF is required, consider converting documents to PDF or reviewing the [multimodal search documentation](multimodal-search-overview.md) for alternative approaches.
4140

4241
## Supported regions
4342

44-
The Document Layout skill calls the [Document Intelligence Public preview version 2024-07-31-preview](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-07-31-preview)&preserve-view=true).
43+
The Document Layout skill calls the [Document Intelligence 2024-11-30 API](/rest/api/aiservices/operation-groups).
4544

4645
Supported regions vary by modality and how the skill connects to the Document Intelligence layout model.
4746

@@ -70,12 +69,6 @@ This skill recognizes the following file formats.
7069

7170
Refer to [Azure AI Document Intelligence layout model supported languages](/azure/ai-services/document-intelligence/language-support/ocr?view=doc-intel-3.1.0&tabs=read-print%2Clayout-print%2Cgeneral#layout&preserve-view=true) for printed text.
7271

73-
## Supported parameters
74-
75-
Several parameters are version-specific. The skills parameter table notes the API version in which a parameter was introduced so that you know how to configure the skill. To use version-specific features such as image and location metadata extraction in [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true), you can use the Azure portal, or target 2025-05-01-preview, or check an Azure SDK change log to see if it supports the new parameters.
76-
77-
The Azure portal supports most preview features and can be used to create or update a skillset. For updates to the Document Layout skill, edit the skillset JSON definition to add new preview parameters.
78-
7972
## @odata.type
8073

8174
Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill
@@ -89,21 +82,21 @@ Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill
8982

9083
## Skill parameters
9184

92-
Parameters are case-sensitive.
85+
Parameters are case-sensitive. Several parameters were introduced in specific preview versions of the REST API. We recommend using the generally available version (2025-09-01) or the latest preview (2025-08-01-preview) for full access to all parameters.
9386

94-
| Parameter name | Version | Allowed Values | Description |
95-
|--------------------|-------------|-------------|-------------|
96-
| `outputMode` | [2024-11-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true) |`oneToMany` | Controls the cardinality of the output produced by the skill. |
97-
| `markdownHeaderDepth` | [2024-11-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true) |`h1`, `h2`, `h3`, `h4`, `h5`, `h6(default)` | Only applies if `outputFormat` is set to `markdown`. This parameter describes the deepest nesting level that should be considered. For instance, if the markdownHeaderDepth is `h3`, any sections that are deeper such as `h4`, are rolled into `h3`. |
98-
| `outputFormat` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) |`markdown(default)`, `text` | **New**. Controls the format of the output generated by the skill. |
99-
| `extractionOptions` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) |`["images"]`, `["images", "locationMetadata"]`, `["locationMetadata"]` | **New**. Identify any extra content extracted from the document. Define an array of enums that correspond to the content to be included in the output. For instance, if the `extractionOptions` is `["images", "locationMetadata"]`, the output includes images and location metadata which provides page location information related to where the content was extracted, such as a page number or section. This parameter applies to both output formats. |
100-
| `chunkingProperties` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | See below. | **New**. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata. |
87+
| Parameter name | Allowed Values | Description |
88+
|--------------------|----------------|-------------|
89+
| `outputMode` |`oneToMany` | Controls the cardinality of the output produced by the skill. |
90+
| `markdownHeaderDepth` |`h1`, `h2`, `h3`, `h4`, `h5`, `h6(default)` | Only applies if `outputFormat` is set to `markdown`. This parameter describes the deepest nesting level that should be considered. For instance, if the markdownHeaderDepth is `h3`, any sections that are deeper such as `h4`, are rolled into `h3`. |
91+
| `outputFormat` |`markdown(default)`, `text` | **New**. Controls the format of the output generated by the skill. |
92+
| `extractionOptions` |`["images"]`, `["images", "locationMetadata"]`, `["locationMetadata"]` | **New**. Identify any extra content extracted from the document. Define an array of enums that correspond to the content to be included in the output. For instance, if the `extractionOptions` is `["images", "locationMetadata"]`, the output includes images and location metadata which provides page location information related to where the content was extracted, such as a page number or section. This parameter applies to both output formats. |
93+
| `chunkingProperties` | See below. | **New**. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata. |
10194

10295
| ChunkingProperties Parameter | Version | Allowed Values | Description |
10396
|--------------------|-------------|-------------|-------------|
104-
| `unit` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | `Characters`. currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens | **New**. Controls the cardinality of the chunk unit. |
105-
| `maximumLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Any integer between 300-50000 | **New**. The maximum chunk length in characters as measured by String.Length. |
106-
| `overlapLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Integer. The value needs to be less than the half of the `maximumLength` | **New**. The length of overlap provided between two text chunks. |
97+
| `unit` | `Characters`. currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens | **New**. Controls the cardinality of the chunk unit. |
98+
| `maximumLength` | Any integer between 300-50000 | **New**. The maximum chunk length in characters as measured by String.Length. |
99+
| `overlapLength` | Integer. The value needs to be less than the half of the `maximumLength` | **New**. The length of overlap provided between two text chunks. |
107100

108101
## Skill inputs
109102

@@ -203,7 +196,7 @@ The value of the `markdownHeaderDepth` controls the number of keys in the "secti
203196

204197
## Example for text output mode and image and metadata extraction
205198

206-
This example demonstrates how to use the new parameters introduced in the **2025-05-01-preview** to output text content in fixed-sized chunks and extract images along with location metadata from the document.
199+
This example demonstrates how to output text content in fixed-sized chunks and extract images along with location metadata from the document.
207200

208201
### Sample definition for text output mode and image and metadata extraction
209202

articles/search/cognitive-search-skill-genai-prompt.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,7 @@ The GenAI Prompt skill is available in the [latest preview REST API](/rest/api/s
3939

4040
- For image verbalization, the model you use to analyze the image determines what image formats are supported.
4141

42-
- For GPT-5 model, the `temperature` parameter is not supported in the same way as previous models. If defined, it must be set to `1.0`, as other values will result in errors.
42+
- For GPT-5 models, the `temperature` parameter is not supported in the same way as previous models. If defined, it must be set to `1.0`, as other values will result in errors.
4343

4444
- Billing is based on the pricing of the model you use.
4545

articles/search/cognitive-search-skill-image-analysis.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
88
ms.custom:
99
- ignite-2023
1010
ms.topic: reference
11-
ms.date: 07/11/2024
11+
ms.date: 09/17/2025
1212
---
1313

1414
# Image Analysis cognitive skill
@@ -21,7 +21,7 @@ This skill uses the machine learning models provided by [Azure AI Vision](/azure
2121
+ The file size of the image must be less than 4 megabytes (MB)
2222
+ The dimensions of the image must be greater than 50 x 50 pixels
2323

24-
Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
24+
Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in Microsoft OneLake. Images can be standalone files or embedded images in a PDF or other files.
2525

2626
This skill is implemented using the [AI Image Analysis API](/azure/ai-services/computer-vision/overview-image-analysis) version 3.2. If your solution requires calling a newer version of that service API (such as version 4.0), consider implementing through [Web API custom skill](cognitive-search-custom-skill-web-api.md) or use the [ImageAnalysisV4 power skill](https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Vision/ImageAnalysisV4/README.md).
2727

articles/search/cognitive-search-skill-ocr.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
88
ms.custom:
99
- ignite-2023
1010
ms.topic: reference
11-
ms.date: 06/24/2022
11+
ms.date: 09/17/2025
1212
ms.update-cycle: 365-days
1313
---
1414
# OCR cognitive skill
@@ -29,7 +29,7 @@ The **OCR** skill extracts text from image files and embedded images. Supported
2929
+ .BMP
3030
+ .TIFF
3131

32-
Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
32+
Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in Microsoft OneLake. Images can be standalone files or embedded images in a PDF or other files.
3333

3434
> [!NOTE]
3535
> This skill is bound to Azure AI services and requires [a billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services Standard price](https://azure.microsoft.com/pricing/details/cognitive-services/).

0 commit comments

Comments
 (0)