Commit 774b179

Merge pull request #5269 from haileytap/multimodal
[Azure Search] Rename multimodal tutorials
2 parents 9ed02ca + 1ed02d9 commit 774b179

11 files changed: +53 -33 lines changed

articles/search/.openpublishing.redirection.search.json

Lines changed: 20 additions & 0 deletions
@@ -395,6 +395,26 @@
 "source_path_from_root": "/articles/search/search-data-sources-terms-of-use.md",
 "redirect_url": "https://partner.microsoft.com/partnership/find-a-partner",
 "redirect_document_id": false
+},
+{
+"source_path_from_root": "/articles/search/tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md",
+"redirect_url": "/azure/search/tutorial-document-extraction-multimodal-embeddings",
+"redirect_document_id": true
+},
+{
+"source_path_from_root": "/articles/search/tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md",
+"redirect_url": "/azure/search/tutorial-document-extraction-image-verbalization",
+"redirect_document_id": true
+},
+{
+"source_path_from_root": "/articles/search/tutorial-multimodal-index-embeddings-skill.md",
+"redirect_url": "/azure/search/tutorial-document-layout-multimodal-embeddings",
+"redirect_document_id": true
+},
+{
+"source_path_from_root": "/articles/search/tutorial-multimodal-index-image-verbalization-skill.md",
+"redirect_url": "/azure/search/tutorial-document-layout-image-verbalization",
+"redirect_document_id": true
 }
 ]
 }
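
The four redirect entries added in this hunk follow one pattern: the old file path under `/articles/search/` maps to the new published URL under `/azure/search/`, with the `.md` extension dropped. A minimal Python sketch of that pattern follows. The helper `make_redirect_entry` and the hand-written `RENAMES` table are hypothetical names used for illustration; they are not part of this commit or of any publishing tool.

```python
import json
import os

# Old file name -> new file name, copied from the redirect entries in this commit.
RENAMES = {
    "tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md":
        "tutorial-document-extraction-multimodal-embeddings.md",
    "tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md":
        "tutorial-document-extraction-image-verbalization.md",
    "tutorial-multimodal-index-embeddings-skill.md":
        "tutorial-document-layout-multimodal-embeddings.md",
    "tutorial-multimodal-index-image-verbalization-skill.md":
        "tutorial-document-layout-image-verbalization.md",
}


def make_redirect_entry(old_filename: str, new_filename: str) -> dict:
    """Build one redirection entry (hypothetical helper, for illustration only)."""
    # The published URL uses the new file name without its extension.
    new_slug, _ext = os.path.splitext(new_filename)
    return {
        "source_path_from_root": f"/articles/search/{old_filename}",
        "redirect_url": f"/azure/search/{new_slug}",
        "redirect_document_id": True,
    }


entries = [make_redirect_entry(old, new) for old, new in RENAMES.items()]

if __name__ == "__main__":
    # Print the entries as JSON for comparison with the hunk above.
    print(json.dumps(entries, indent=2))
```

Keeping the rename table in one place means the redirect entries and any later link updates can be derived from the same data.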

articles/search/cognitive-search-skill-document-extraction.md

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ The **Document Extraction** skill extracts content from a file within the enrich

 For [vector](vector-search-overview.md) and [multimodal search](multimodal-search-overview.md), Document Extraction combined with the [Text Split skill](cognitive-search-skill-textsplit.md) is more affordable than other [data chunking approaches](vector-search-how-to-chunk-documents.md). The following tutorials demonstrate skill usage for different scenarios:

-+ [Tutorial: Index mixed content using multimodal embeddings and the Document Extraction skill](tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md)
++ [Tutorial: Index mixed content using multimodal embeddings and the Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md)

-+ [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md)
++ [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-document-extraction-image-verbalization.md)

 > [!NOTE]
 > This skill isn't bound to Azure AI services and has no Azure AI services key requirement.
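
The link updates in this file and in the files that follow are mechanical substitutions of the old tutorial file names. As an illustration only, here is a sketch of how such a substitution could be scripted. The function `rewrite_links`, the `RENAMES` table, and the regular expression are assumptions made for this example, not tooling used by this commit.

```python
import re

# Old file name -> new file name, taken from the renames in this commit.
RENAMES = {
    "tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md":
        "tutorial-document-extraction-multimodal-embeddings.md",
    "tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md":
        "tutorial-document-extraction-image-verbalization.md",
    "tutorial-multimodal-index-embeddings-skill.md":
        "tutorial-document-layout-multimodal-embeddings.md",
    "tutorial-multimodal-index-image-verbalization-skill.md":
        "tutorial-document-layout-image-verbalization.md",
}

# Matches the target of a relative Markdown link that ends in ".md",
# for example "](some-article.md". Anchors and query strings are not handled.
LINK_TARGET = re.compile(r"\]\(([^)#?\s]+\.md)")


def rewrite_links(markdown: str, renames: dict) -> str:
    """Replace renamed link targets; leave every other link untouched."""
    def swap(match: re.Match) -> str:
        target = match.group(1)
        return "](" + renames.get(target, target)

    return LINK_TARGET.sub(swap, markdown)


if __name__ == "__main__":
    line = ("+ [Tutorial: Index mixed content using multimodal embeddings and the "
            "Document Extraction skill]"
            "(tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md)")
    print(rewrite_links(line, RENAMES))
```

Matching only the link target, rather than replacing the file name anywhere in the text, avoids touching prose or code samples that happen to mention an old name.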

articles/search/cognitive-search-skill-document-intelligence-layout.md

Lines changed: 2 additions & 2 deletions
@@ -24,9 +24,9 @@ This article is the reference documentation for the Document Layout skill. For u

 It's common to use this skill on content such as PDFs that have structure and images. The following tutorials demonstrate several scenarios:

-+ [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
++ [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md)

-+ [Tutorial: Index mixed content using multimodal embeddings and the Document Layout skill](tutorial-multimodal-index-embeddings-skill.md)
++ [Tutorial: Index mixed content using multimodal embeddings and the Document Layout skill](tutorial-document-layout-multimodal-embeddings.md)

 > [!NOTE]
 > This skill uses the [Document Intelligence layout model](/azure/ai-services/document-intelligence/concept-layout) provided in [Azure AI Document Intelligence](/azure/ai-services/document-intelligence/overview).

articles/search/cognitive-search-skill-genai-prompt.md

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ The **GenAI (Generative AI) Prompt** skill executes a *chat completion* request

 Use this capability to create new information that can be indexed and stored as searchable content. Examples include verbalize images, summarize larger passages, simplify complex content, or any other task that an LLM can perform. The skill supports text, image, and multimodal content such as a PDF that contains text and images. It's common to use this skill combined with a data chunking skill. The following tutorials demonstrate the image verbalization scenarios with two different data chunking techniques:

-- [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
+- [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md)

-- [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md)
+- [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-document-extraction-image-verbalization.md)

 The GenAI Prompt skill is available in the [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true) only.

articles/search/multimodal-search-overview.md

Lines changed: 7 additions & 7 deletions
@@ -1,17 +1,17 @@
 ---
-title: Multimodal search concepts and guidance
+title: Multimodal Search Concepts and Guidance
 titleSuffix: Azure AI Search
 description: Learn what multimodal search is, how Azure AI Search supports it for text and image content, and where to find detailed concepts, tutorials, and samples.
 ms.service: azure-ai-search
 ms.topic: conceptual
-ms.date: 05/28/2025
+ms.date: 05/29/2025
 author: gmndrg
 ms.author: gimondra
 ---

 # Multimodal search in Azure AI Search

-Multimodal search refers to the ability to ingest, understand, and retrieve content across multiple data types, including text, images, video, and audio. In Azure AI Search, multimodal search natively supports the ingestion of documents containing text and images and the retrieval of their content, enabling you to perform searches that combine both modalities.
+Multimodal search refers to the ability to ingest, understand, and retrieve information across multiple content types, including text, images, video, and audio. In Azure AI Search, multimodal search natively supports the ingestion of documents containing text and images and the retrieval of their content, enabling you to perform searches that combine both modalities.

 Building a robust multimodal pipeline typically involves:

@@ -115,8 +115,8 @@ To help you get started with multimodal search in Azure AI Search, here's a coll
 | Content | Description |
 |--|--|
 | [Quickstart: Multimodal search in the Azure portal](search-get-started-portal-image-search.md) | Create and test a multimodal index in the Azure portal using the wizard and Search Explorer. |
-| [Tutorial: Image verbalization and Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. |
-| [Tutorial: Multimodal embeddings and Document Extraction skill](tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. |
-| [Tutorial: Image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. |
-| [Tutorial: Multimodal embeddings and Document Layout skill](tutorial-multimodal-index-embeddings-skill.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. |
+| [Tutorial: Image verbalization and Document Extraction skill](tutorial-document-extraction-image-verbalization.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. |
+| [Tutorial: Multimodal embeddings and Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. |
+| [Tutorial: Image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. |
+| [Tutorial: Multimodal embeddings and Document Layout skill](tutorial-document-layout-multimodal-embeddings.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. |
 | [Sample app: Multimodal RAG GitHub repository](https://aka.ms/azs-multimodal-sample-app-repo) | An end-to-end, code-ready RAG application with multimodal capabilities that surfaces both text snippets and image annotations. Ideal for jump-starting enterprise copilots. |

articles/search/search-get-started-portal-image-search.md

Lines changed: 1 addition & 1 deletion
@@ -348,4 +348,4 @@ This quickstart uses billable Azure resources. If you no longer need the resourc

 ## Next step

-This quickstart introduced you to the **Import and vectorize data wizard**, which creates all of the necessary objects for multimodal search. To explore each step in detail, see [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md).
+This quickstart introduced you to the **Import and vectorize data wizard**, which creates all of the necessary objects for multimodal search. To explore each step in detail, see [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md).

articles/search/toc.yml

Lines changed: 4 additions & 4 deletions
@@ -105,13 +105,13 @@ items:
 - name: Multimodal indexing tutorials
 items:
 - name: Use document extraction and multimodal embeddings
-href: tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md
+href: tutorial-document-extraction-multimodal-embeddings.md
 - name: Use document extraction and image verbalizations
-href: tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md
+href: tutorial-document-extraction-image-verbalization.md
 - name: Use semantic chunking and multimodal embeddings
-href: tutorial-multimodal-index-embeddings-skill.md
+href: tutorial-document-layout-multimodal-embeddings.md
 - name: Use semantic chunking and image verbalizations
-href: tutorial-multimodal-index-image-verbalization-skill.md
+href: tutorial-document-layout-image-verbalization.md
 - name: RAG tutorials
 items:
 - name: Build a RAG solution

articles/search/tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md renamed to articles/search/tutorial-document-extraction-image-verbalization.md

Lines changed: 4 additions & 4 deletions
@@ -1,5 +1,5 @@
 ---
-title: 'Tutorial: Index multimodal content using image verbalization and Document Extraction skill'
+title: 'Tutorial: Use Image Verbalization and Document Extraction Skill for Multimodal Indexing'
 titleSuffix: Azure AI Search
 description: Learn how to extract, index, and search multimodal content using the Document Extraction skill for chunking and GenAI Prompt skill for image verbalizations.

@@ -9,7 +9,7 @@ ms.author: mdonovan
 ms.service: azure-ai-search
 ms.custom:
 ms.topic: tutorial
-ms.date: 05/28/2025
+ms.date: 05/29/2025

 ---

@@ -31,7 +31,7 @@ In this tutorial, you use:

 This tutorial demonstrates a lower-cost approach for indexing multimodal content using Document Extraction skill and image captioning. It enables extraction and search over both text and images from documents in Azure Blob Storage. However, it doesn't include locational metadata for text, such as page numbers or bounding regions.

-For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md).
+For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md).

 > [!NOTE]
 > Setting `imageAction` to `generateNormalizedImages` is required for this tutorial and incurs an additional charge for image extraction according to [Azure AI Search pricing](https://azure.microsoft.com/pricing/details/search/).
@@ -751,4 +751,4 @@ Now that you're familiar with a sample implementation of a multimodal indexing s
 * [GenAI Prompt skill](cognitive-search-skill-genai-prompt.md)
 * [Vectors in Azure AI Search](vector-search-overview.md)
 * [Semantic ranking in Azure AI Search](semantic-search-overview.md)
-* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
+* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md)

articles/search/tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md renamed to articles/search/tutorial-document-extraction-multimodal-embeddings.md

Lines changed: 4 additions & 4 deletions
@@ -1,5 +1,5 @@
 ---
-title: 'Tutorial: Index multimodal content using embedding and Document Extraction skill'
+title: 'Tutorial: Use Multimodal Embeddings and Document Extraction Skill for Multimodal Indexing'
 titleSuffix: Azure AI Search
 description: Learn how to extract, index, and search multimodal content using the Document Extraction skill for chunking and Azure AI Vision for embeddings.

@@ -9,7 +9,7 @@ ms.author: mdonovan
 ms.service: azure-ai-search
 ms.custom:
 ms.topic: tutorial
-ms.date: 05/28/2025
+ms.date: 05/29/2025

 ---

@@ -29,7 +29,7 @@ In this tutorial, you use:

 This tutorial demonstrates a lower-cost approach for indexing multimodal content using Document Extraction skill and image captioning. It enables extraction and search over both text and images from documents in Azure Blob Storage. However, it doesn't include locational metadata for text, such as page numbers or bounding regions.

-For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md).
+For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md).

 > [!NOTE]
 > Setting `imageAction` to `generateNormalizedImages` as is required for this tutorial incurs an additional charge for image extraction according to [Azure AI Search pricing](https://azure.microsoft.com/pricing/details/search/).
@@ -711,4 +711,4 @@ Now that you're familiar with a sample implementation of a multimodal indexing s
 * [AI Vision multimodal embeddings skill](cognitive-search-skill-vision-vectorize.md)
 * [Vectors in Azure AI Search](vector-search-overview.md)
 * [Semantic ranking in Azure AI Search](semantic-search-overview.md)
-* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
+* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md)

articles/search/tutorial-multimodal-index-image-verbalization-skill.md renamed to articles/search/tutorial-document-layout-image-verbalization.md

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
 ---
-title: 'Tutorial: Index multimodal content using image verbalization and Document Layout skill'
+title: 'Tutorial: Use Image Verbalization and Document Layout Skill for Multimodal Indexing'
 titleSuffix: Azure AI Search
 description: Learn how to extract, index, and search multimodal content using the Document Layout skill for chunking and GenAI Prompt skill for image verbalizations.

@@ -9,7 +9,7 @@ ms.author: rawan
 ms.service: azure-ai-search
 ms.custom:
 ms.topic: tutorial
-ms.date: 05/28/2025
+ms.date: 05/29/2025

 ---

@@ -25,7 +25,7 @@ In this tutorial, you use:

 + The [Document Layout skill (preview)](cognitive-search-skill-document-intelligence-layout.md) for extracting text and normalized images with its locationMetadata from various documents, such as page numbers or bounding regions.

-The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) has limited regional availability, is bound to Azure AI services, and requires a [billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. For a lower-cost solution to indexing multimodal content, see [Index multimodal content using image verbalization and Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md).
+The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) has limited regional availability, is bound to Azure AI services, and requires a [billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. For a lower-cost solution to indexing multimodal content, see [Index multimodal content using image verbalization and Document Extraction skill](tutorial-document-extraction-image-verbalization.md).

 + The [GenAI Prompt skill (preview)](cognitive-search-skill-genai-prompt.md) to generate image captions, which are text-based descriptions of visual content, for search and grounding.
