Commit 594db81

Merge pull request #6261 from HeidiSteen/heidist-freshness
Updates for multimodal tutorials and GenAI prompt skill
2 parents 1ec46bd + cd96a8f commit 594db81

14 files changed, +156 -140 lines changed

articles/search/chat-completion-skill-example-usage.md

Lines changed: 10 additions & 10 deletions
@@ -2,11 +2,11 @@
 title: Utilize the content generation capabilities of language models as part of content ingestion pipeline
 titleSuffix: Azure AI Search
 description: Use language models to caption your images and facilitate an image search through your data.
-author: amitkalay
-ms.author: amitkalay
+author: gmndrg
+ms.author: gimondra
 ms.service: azure-ai-search
 ms.topic: how-to
-ms.date: 05/05/2025
+ms.date: 07/28/2025
 ms.custom:
 - devx-track-csharp
 - build-2025
@@ -22,20 +22,20 @@ The GenAI Prompt skill (preview) generates a description of each image in your d

 To work with image content in a skillset, you need:

-+ A supported data source
-+ Files or blobs containing images
-+ Read access on the supported data source. This article uses key-based authentication, but indexers can also connect using the search service identity and Microsoft Entra ID authentication. For role-based access control, assign roles on the data source to allow read access by the service identity. If you're testing on a local development machine, make sure you also have read access on the supported data source.
-+ A search indexer, configured for image actions
-+ A skillset with the new custom genAI prompt skill
-+ A search index with fields to receive the verbalized text output, plus output field mappings in the indexer that establish association
++ A [supported data source](search-indexer-overview.md#supported-data-sources). We recommend Azure Storage.
++ Files or blobs containing images.
++ Read access to the supported data source. This article uses key-based authentication, but indexers can also connect using the search service identity and Microsoft Entra ID authentication. For role-based access control, assign roles on the data source to allow read access by the service identity. If you're testing on a local development machine, make sure you also have read access on the supported data source.
++ A [search indexer](search-how-to-create-indexers.md), configured for image actions.
++ A skillset with the new custom genAI prompt skill.
++ A search index with fields to receive the verbalized text output, plus output field mappings in the indexer that establish association.

 Optionally, you can define projections to accept image-analyzed output into a [knowledge store](knowledge-store-concept-intro.md) for data mining scenarios.

 <a name="get-normalized-images"></a>

 ## Configure indexers for image processing

-After the source files are set up, enable image normalization by setting the `imageAction` parameter in indexer configuration. Image normalization helps make images more uniform for downstream processing. Image normalization includes the following operations:
+After the source files are set up, enable image normalization by setting the `imageAction` parameter in the indexer configuration. Image normalization helps make images more uniform for downstream processing. Image normalization includes the following operations:

 + Large images are resized to a maximum height and width to make them uniform.
 + For images that have metadata that specifies orientation, image rotation is adjusted for vertical loading.
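
Reviewer's illustration (not part of the commit): the hunk above mentions setting `imageAction` in the indexer configuration and adding output field mappings. A minimal sketch of what such an indexer definition might look like follows, built as a Python dictionary. The resource names, the enrichment path `/document/normalized_images/*/caption`, and the target field `image_captions` are hypothetical placeholders; check the `imageAction` value and the overall shape against the indexer REST reference before use.

```python
import json


def build_indexer(name, data_source, index, skillset):
    """Return an indexer definition (as a dict) with image normalization enabled.

    All names passed in are placeholders chosen for this sketch.
    """
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "skillsetName": skillset,
        "parameters": {
            "configuration": {
                # Extract both content and metadata so embedded images are available.
                "dataToExtract": "contentAndMetadata",
                # Turn on image normalization for downstream skills.
                "imageAction": "generateNormalizedImages",
            }
        },
        "outputFieldMappings": [
            {
                # Hypothetical enrichment path and index field, for illustration only.
                "sourceFieldName": "/document/normalized_images/*/caption",
                "targetFieldName": "image_captions",
            }
        ],
    }


indexer = build_indexer("demo-indexer", "demo-datasource", "demo-index", "demo-skillset")
print(json.dumps(indexer, indent=2))
```

The printed JSON is the kind of body you would send when creating or updating the indexer; the output field mapping is what associates the skill output with an index field.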

articles/search/cognitive-search-concept-image-scenarios.md

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ ms.custom:

 Images often contain useful information that's relevant in search scenarios. You can [vectorize images](search-get-started-portal-image-search.md) to represent visual content in your search index. Or, you can use [AI enrichment and skillsets](cognitive-search-concept-intro.md) to create and extract searchable *text* from images, including:

++ [GenAI Prompt](cognitive-search-skill-genai-prompt.md) to pass a prompt to a chat completion skill, requesting a description of image content.
 + [OCR](cognitive-search-skill-ocr.md) for optical character recognition of text and digits
 + [Image Analysis](cognitive-search-skill-image-analysis.md) that describes images through visual features
 + [Custom skills](#passing-images-to-custom-skills) to invoke any external image processing that you want to provide

articles/search/cognitive-search-skill-document-extraction.md

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ The **Document Extraction** skill extracts content from a file within the enrich

 For [vector](vector-search-overview.md) and [multimodal search](multimodal-search-overview.md), Document Extraction combined with the [Text Split skill](cognitive-search-skill-textsplit.md) is more affordable than other [data chunking approaches](vector-search-how-to-chunk-documents.md). The following tutorials demonstrate skill usage for different scenarios:

-+ [Tutorial: Index mixed content using multimodal embeddings and the Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md)
++ [Tutorial: Vectorize images and text](tutorial-document-extraction-multimodal-embeddings.md)

-+ [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-document-extraction-image-verbalization.md)
++ [Tutorial: Verbalize images using generative AI](tutorial-document-extraction-image-verbalization.md)

 > [!NOTE]
 > This skill isn't bound to Azure AI services and has no Azure AI services key requirement.

articles/search/cognitive-search-skill-document-intelligence-layout.md

Lines changed: 6 additions & 7 deletions
@@ -22,16 +22,15 @@ The **Document Layout** skill analyzes a document to detect structure and charac

 This article is the reference documentation for the Document Layout skill. For usage information, see [How to chunk and vectorize by document layout](search-how-to-semantic-chunking.md).

-It's common to use this skill on content such as PDFs that have structure and images. The following tutorials demonstrate several scenarios:
+This skill uses the [Document Intelligence layout model](/azure/ai-services/document-intelligence/concept-layout) provided in [Azure AI Document Intelligence](/azure/ai-services/document-intelligence/overview).

-+ [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md)
++This skill is bound to a [billable Azure AI multi-service resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services Standard price](https://azure.microsoft.com/pricing/details/cognitive-services/).

-+ [Tutorial: Index mixed content using multimodal embeddings and the Document Layout skill](tutorial-document-layout-multimodal-embeddings.md)
-
-> [!NOTE]
-> This skill uses the [Document Intelligence layout model](/azure/ai-services/document-intelligence/concept-layout) provided in [Azure AI Document Intelligence](/azure/ai-services/document-intelligence/overview).
+> [!TIP]
+> It's common to use this skill on content such as PDFs that have structure and images. The following tutorials demonstrate image verbalization with two different data chunking techniques:
 >
-> This skill is bound to a [billable Azure AI multi-service resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services Standard price](https://azure.microsoft.com/pricing/details/cognitive-services/).
+> - [Tutorial: Verbalize images from a structured document layout](tutorial-document-layout-image-verbalization.md)
+> - [Tutorial: Vectorize from a structured document layout](tutorial-document-layout-multimodal-embeddings.md)
 >

 ## Limitations

articles/search/cognitive-search-skill-genai-prompt.md

Lines changed: 15 additions & 7 deletions
@@ -1,29 +1,37 @@
 ---
 title: GenAI Prompt skill (Preview)
 titleSuffix: Azure AI Search
-description: Invokes Chat Completion models from Azure OpenAI or other Azure AI Foundry-hosted models at indexing time.
+description: Invokes chat completion models from Azure OpenAI or other Azure AI Foundry-hosted models to create content at indexing time.
 author: gmndrg
 ms.author: gimondra
 ms.service: azure-ai-search
 ms.custom:
 - build-2025
 ms.topic: reference
-ms.date: 05/27/2025
+ms.date: 07/28/2025
 ---

 # GenAI Prompt skill

 [!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]

-The **GenAI (Generative AI) Prompt** skill executes a *chat completion* request against a Large Language Model (LLM) deployed in Azure AI Foundry or Azure OpenAI in Azure AI Foundry Models.
+The **GenAI (Generative AI) Prompt** skill executes a *chat completion* request against a Large Language Model (LLM) deployed in Azure AI Foundry or Azure OpenAI in Azure AI Foundry Models. Use this capability to create new information that can be indexed and stored as searchable content.

-Use this capability to create new information that can be indexed and stored as searchable content. Examples include verbalize images, summarize larger passages, simplify complex content, or any other task that an LLM can perform. The skill supports text, image, and multimodal content such as a PDF that contains text and images. It's common to use this skill combined with a data chunking skill. The following tutorials demonstrate the image verbalization scenarios with two different data chunking techniques:
+Here are some examples of how the GenAI prompt skill can help you create content:

-- [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md)
+- Verbalize images
+- Summarize large passages of text
+- Simplify complex content
+- Perform any other task that you can articulate in a prompt

-- [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-document-extraction-image-verbalization.md)
+The GenAI Prompt skill is available in the [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true) only. The skill supports text, image, and multimodal content such as a PDF that contains text and images.

-The GenAI Prompt skill is available in the [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true) only.
+> [!TIP]
+> It's common to use this skill combined with a data chunking skill. The following tutorials demonstrate image verbalization with two different data chunking techniques:
+>
+> - [Tutorial: Verbalize images using generative AI](tutorial-document-extraction-image-verbalization.md)
+> - [Tutorial: Verbalize images from a structured document layout](tutorial-document-layout-image-verbalization.md)
+>

 ## Supported models

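
Reviewer's illustration (not part of the commit): because the hunk above states that the skill is available only in the 2025-05-01-preview REST API, any request that creates a skillset containing it needs that `api-version`. The sketch below builds such a request URL. The service and skillset names are hypothetical, and the `https://<service>.search.windows.net/skillsets/<name>` pattern is assumed from the usual Azure AI Search REST conventions rather than taken from this diff.

```python
from urllib.parse import quote, urlencode

# API version named in the article text above.
API_VERSION = "2025-05-01-preview"


def skillset_url(service_name, skillset_name):
    """Build the URL for creating or updating a skillset on the preview API."""
    query = urlencode({"api-version": API_VERSION})
    # Percent-encode the skillset name so it is safe as a single path segment.
    name = quote(skillset_name, safe="")
    return f"https://{service_name}.search.windows.net/skillsets/{name}?{query}"


# Hypothetical service and skillset names.
url = skillset_url("my-search-service", "image-verbalization-ss")
print(url)
```

Keeping the version in one constant makes it easy to update when the skill moves out of preview.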
articles/search/knowledge-store-projection-example-long.md

Lines changed: 11 additions & 8 deletions
@@ -7,22 +7,22 @@ manager: nitinme
 author: HeidiSteen
 ms.author: heidist
 ms.service: azure-ai-search
-ms.topic: conceptual
-ms.date: 06/17/2025
+ms.topic: concept-article
+ms.date: 07/28/2025
 ms.custom:
 - ignite-2023
 - sfi-ropc-nochange
 ---

-# Detailed example of shapes and projections in a knowledge store
+# Example of shapes and projections in a knowledge store

-This article provides a detailed example that supplements [high-level concepts](knowledge-store-projection-overview.md) and [syntax-based articles](knowledge-store-projections-examples.md) by walking you through the shaping and projection steps required for fully expressing the output of a rich skillset in a [knowledge store](knowledge-store-concept-intro.md).
+This article provides a detailed example that supplements [high-level concepts](knowledge-store-projection-overview.md) and [syntax-based articles](knowledge-store-projections-examples.md) by walking you through the shaping and projection steps required for fully expressing the output of a rich skillset in a [knowledge store](knowledge-store-concept-intro.md) in Azure Storage.

-If your application requirements call for multiple skills and projections, this example can give you a better idea of how shapes and projections intersect.
+If your application requirements call for multiple skills and projections, this example can give you a better idea of how shapes and projections interact.

 ## Set up sample data

-Sample documents aren't included with the Projections collection, but the [AI enrichment demo data files](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/ai-enrichment-mixed-media) contain text and images that work with the projections described in this example.
+Sample documents aren't included with the Projections collection, but the [AI enrichment demo data files](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/ai-enrichment-mixed-media) contain text and images that work with the projections described in this example. If you use this sample data, you can skip step that [attaches an Azure AI multi-service account](cognitive-search-attach-cognitive-services.md) because you stay under the daily indexer limit for free enrichments.

 Create a blob container in Azure Storage and upload all 14 items.

@@ -39,7 +39,7 @@ Pay close attention to skill outputs (targetNames). Outputs written to the enric
 ```json
 {
 "name": "projections-demo-ss",
-"description": "Skillset that enriches blob data found in "merged_content". The enrichment granularity is a document.",
+"description": "Skillset that enriches blob data found in the merged_content field. The enrichment granularity is a document.",
 "skills": [
 {
 "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
@@ -182,12 +182,15 @@ Pay close attention to skill outputs (targetNames). Outputs written to the enric
 "cognitiveServices": {
 "@odata.type": "#Microsoft.Azure.Search.CognitiveServicesByKey",
 "description": "An Azure AI services resource in the same region as Search.",
-"key": "<Azure AI services All-in-ONE KEY>"
+"key": ""
 },
 "knowledgeStore": null
 }
 ```

+> [!NOTE]
+> Under `"cognitiveServices"`, the key field is unspecified because the indexer can use an Azure AI multi-service account in the same region as your search service and process up to 20 transactions daily at no charge. The sample data for this example stays under the 20 transaction limit.
+
 ## Example Shaper skill

 A [Shaper skill](cognitive-search-skill-shaper.md) is a utility for working with existing enriched content instead of creating new enriched content. Adding a Shaper to a skillset lets you create a custom shape that you can project into table or blob storage. Without a custom shape, projections are limited to referencing a single node (one projection per output), which isn't suitable for tables. Creating a custom shape aggregates various elements into a new logical whole that can be projected as a single table, or sliced and distributed across a collection of tables.

articles/search/multimodal-search-overview.md

Lines changed: 4 additions & 4 deletions
@@ -116,8 +116,8 @@ To help you get started with multimodal search in Azure AI Search, here's a coll
 | Content | Description |
 |--|--|
 | [Quickstart: Multimodal search in the Azure portal](search-get-started-portal-image-search.md) | Create and test a multimodal index in the Azure portal using the wizard and Search Explorer. |
-| [Tutorial: Image verbalization and Document Extraction skill](tutorial-document-extraction-image-verbalization.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. |
-| [Tutorial: Multimodal embeddings and Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. |
-| [Tutorial: Image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. |
-| [Tutorial: Multimodal embeddings and Document Layout skill](tutorial-document-layout-multimodal-embeddings.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. |
+| [Tutorial: Verbalize images using generative AI](tutorial-document-extraction-image-verbalization.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. |
+| [Tutorial: Vectorize images and text](tutorial-document-extraction-multimodal-embeddings.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. |
+| [Tutorial: Verbalize images from a structured document layout](tutorial-document-layout-image-verbalization.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. |
+| [Tutorial: Vectorize from a structured document layout](tutorial-document-layout-multimodal-embeddings.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. |
 | [Sample app: Multimodal RAG GitHub repository](https://aka.ms/azs-multimodal-sample-app-repo) | An end-to-end, code-ready RAG application with multimodal capabilities that surfaces both text snippets and image annotations. Ideal for jump-starting enterprise copilots. |