Merge pull request #5269 from haileytap/multimodal

denrea · web-flow · commit 774b179b2927 · 2025-05-29T09:49:10.000-07:00
[Azure Search] Rename multimodal tutorials
diff --git a/articles/search/.openpublishing.redirection.search.json b/articles/search/.openpublishing.redirection.search.json
@@ -395,6 +395,26 @@
             "source_path_from_root": "/articles/search/search-data-sources-terms-of-use.md",
             "redirect_url": "https://partner.microsoft.com/partnership/find-a-partner",
             "redirect_document_id": false
+        },
+        {
+            "source_path_from_root": "/articles/search/tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md",
+            "redirect_url": "/azure/search/tutorial-document-extraction-multimodal-embeddings",
+            "redirect_document_id": true
+        },
+        {
+            "source_path_from_root": "/articles/search/tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md",
+            "redirect_url": "/azure/search/tutorial-document-extraction-image-verbalization",
+            "redirect_document_id": true
+        },
+        {
+            "source_path_from_root": "/articles/search/tutorial-multimodal-index-embeddings-skill.md",
+            "redirect_url": "/azure/search/tutorial-document-layout-multimodal-embeddings",
+            "redirect_document_id": true
+        },
+        {
+            "source_path_from_root": "/articles/search/tutorial-multimodal-index-image-verbalization-skill.md",
+            "redirect_url": "/azure/search/tutorial-document-layout-image-verbalization",
+            "redirect_document_id": true
         }
     ]
   }
diff --git a/articles/search/cognitive-search-skill-document-extraction.md b/articles/search/cognitive-search-skill-document-extraction.md
@@ -19,9 +19,9 @@ The **Document Extraction** skill extracts content from a file within the enrich
 
 For [vector](vector-search-overview.md) and [multimodal search](multimodal-search-overview.md), Document Extraction combined with the [Text Split skill](cognitive-search-skill-textsplit.md) is more affordable than other [data chunking approaches](vector-search-how-to-chunk-documents.md). The following tutorials demonstrate skill usage for different scenarios:
 
-+ [Tutorial: Index mixed content using multimodal embeddings and the Document Extraction skill](tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md)
++ [Tutorial: Index mixed content using multimodal embeddings and the Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md)
 
-+ [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md)
++ [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-document-extraction-image-verbalization.md)
 
 > [!NOTE]
 > This skill isn't bound to Azure AI services and has no Azure AI services key requirement.
diff --git a/articles/search/cognitive-search-skill-document-intelligence-layout.md b/articles/search/cognitive-search-skill-document-intelligence-layout.md
@@ -24,9 +24,9 @@ This article is the reference documentation for the Document Layout skill. For u
 
 It's common to use this skill on content such as PDFs that have structure and images. The following tutorials demonstrate several scenarios: 
 
-+ [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
++ [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md)
 
-+ [Tutorial: Index mixed content using multimodal embeddings and the Document Layout skill](tutorial-multimodal-index-embeddings-skill.md)
++ [Tutorial: Index mixed content using multimodal embeddings and the Document Layout skill](tutorial-document-layout-multimodal-embeddings.md)
 
 > [!NOTE]
 > This skill uses the [Document Intelligence layout model](/azure/ai-services/document-intelligence/concept-layout) provided in [Azure AI Document Intelligence](/azure/ai-services/document-intelligence/overview).
diff --git a/articles/search/cognitive-search-skill-genai-prompt.md b/articles/search/cognitive-search-skill-genai-prompt.md
@@ -19,9 +19,9 @@ The **GenAI (Generative AI) Prompt** skill executes a *chat completion* request
 
 Use this capability to create new information that can be indexed and stored as searchable content. Examples include verbalize images, summarize larger passages, simplify complex content, or any other task that an LLM can perform. The skill supports text, image, and multimodal content such as a PDF that contains text and images. It's common to use this skill combined with a data chunking skill. The following tutorials demonstrate the image verbalization scenarios with two different data chunking techniques: 
 
-- [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
+- [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md)
 
-- [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md)
+- [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-document-extraction-image-verbalization.md)
 
 The GenAI Prompt skill is available in the [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true) only. 
 
diff --git a/articles/search/multimodal-search-overview.md b/articles/search/multimodal-search-overview.md
@@ -1,17 +1,17 @@
 ---
-title: Multimodal search concepts and guidance
+title: Multimodal Search Concepts and Guidance
 titleSuffix: Azure AI Search
 description: Learn what multimodal search is, how Azure AI Search supports it for text and image content, and where to find detailed concepts, tutorials, and samples.
 ms.service: azure-ai-search
 ms.topic: conceptual
-ms.date: 05/28/2025
+ms.date: 05/29/2025
 author: gmndrg
 ms.author: gimondra
 ---
 
 # Multimodal search in Azure AI Search
 
-Multimodal search refers to the ability to ingest, understand, and retrieve content across multiple data types, including text, images, video, and audio. In Azure AI Search, multimodal search natively supports the ingestion of documents containing text and images and the retrieval of their content, enabling you to perform searches that combine both modalities.
+Multimodal search refers to the ability to ingest, understand, and retrieve information across multiple content types, including text, images, video, and audio. In Azure AI Search, multimodal search natively supports the ingestion of documents containing text and images and the retrieval of their content, enabling you to perform searches that combine both modalities.
 
 Building a robust multimodal pipeline typically involves:
 
@@ -115,8 +115,8 @@ To help you get started with multimodal search in Azure AI Search, here's a coll
 | Content | Description |
 |--|--|
 | [Quickstart: Multimodal search in the Azure portal](search-get-started-portal-image-search.md) | Create and test a multimodal index in the Azure portal using the wizard and Search Explorer. |
-| [Tutorial: Image verbalization and Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. |
-| [Tutorial: Multimodal embeddings and Document Extraction skill](tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. |
-| [Tutorial: Image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. |
-| [Tutorial: Multimodal embeddings and Document Layout skill](tutorial-multimodal-index-embeddings-skill.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. |
+| [Tutorial: Image verbalization and Document Extraction skill](tutorial-document-extraction-image-verbalization.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. |
+| [Tutorial: Multimodal embeddings and Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. |
+| [Tutorial: Image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. |
+| [Tutorial: Multimodal embeddings and Document Layout skill](tutorial-document-layout-multimodal-embeddings.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. |
 | [Sample app: Multimodal RAG GitHub repository](https://aka.ms/azs-multimodal-sample-app-repo) | An end-to-end, code-ready RAG application with multimodal capabilities that surfaces both text snippets and image annotations. Ideal for jump-starting enterprise copilots. |
diff --git a/articles/search/search-get-started-portal-image-search.md b/articles/search/search-get-started-portal-image-search.md
@@ -348,4 +348,4 @@ This quickstart uses billable Azure resources. If you no longer need the resourc
 
 ## Next step
 
-This quickstart introduced you to the **Import and vectorize data wizard**, which creates all of the necessary objects for multimodal search. To explore each step in detail, see [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md).
+This quickstart introduced you to the **Import and vectorize data wizard**, which creates all of the necessary objects for multimodal search. To explore each step in detail, see [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md).
diff --git a/articles/search/toc.yml b/articles/search/toc.yml
@@ -105,13 +105,13 @@ items:
   - name: Multimodal indexing tutorials
     items:
     - name: Use document extraction and multimodal embeddings
-      href: tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md
+      href: tutorial-document-extraction-multimodal-embeddings.md
     - name: Use document extraction and image verbalizations
-      href: tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md
+      href: tutorial-document-extraction-image-verbalization.md
     - name: Use semantic chunking and multimodal embeddings
-      href: tutorial-multimodal-index-embeddings-skill.md
+      href: tutorial-document-layout-multimodal-embeddings.md
     - name: Use semantic chunking and image verbalizations
-      href: tutorial-multimodal-index-image-verbalization-skill.md      
+      href: tutorial-document-layout-image-verbalization.md      
   - name: RAG tutorials
     items:
     - name: Build a RAG solution
diff --git a/articles/search/tutorial-document-extraction-image-verbalization.md b/articles/search/tutorial-document-extraction-image-verbalization.md
@@ -1,5 +1,5 @@
 ---
-title: 'Tutorial: Index multimodal content using image verbalization and Document Extraction skill'
+title: 'Tutorial: Use Image Verbalization and Document Extraction Skill for Multimodal Indexing'
 titleSuffix: Azure AI Search
 description: Learn how to extract, index, and search multimodal content using the Document Extraction skill for chunking and GenAI Prompt skill for image verbalizations.
 
@@ -9,7 +9,7 @@ ms.author: mdonovan
 ms.service: azure-ai-search
 ms.custom:
 ms.topic: tutorial
-ms.date: 05/28/2025
+ms.date: 05/29/2025
 
 ---
 
@@ -31,7 +31,7 @@ In this tutorial, you use:
 
 This tutorial demonstrates a lower-cost approach for indexing multimodal content using Document Extraction skill and image captioning. It enables extraction and search over both text and images from documents in Azure Blob Storage. However, it doesn't include locational metadata for text, such as page numbers or bounding regions.
 
-For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md).
+For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md).
 
 > [!NOTE]
 > Setting `imageAction` to `generateNormalizedImages` is required for this tutorial and incurs an additional charge for image extraction according to [Azure AI Search pricing](https://azure.microsoft.com/pricing/details/search/).
@@ -751,4 +751,4 @@ Now that you're familiar with a sample implementation of a multimodal indexing s
 * [GenAI Prompt skill](cognitive-search-skill-genai-prompt.md)
 * [Vectors in Azure AI Search](vector-search-overview.md)
 * [Semantic ranking in Azure AI Search](semantic-search-overview.md)
-* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
+* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md)
diff --git a/articles/search/tutorial-document-extraction-multimodal-embeddings.md b/articles/search/tutorial-document-extraction-multimodal-embeddings.md
@@ -1,5 +1,5 @@
 ---
-title: 'Tutorial: Index multimodal content using embedding and Document Extraction skill'
+title: 'Tutorial: Use Multimodal Embeddings and Document Extraction Skill for Multimodal Indexing'
 titleSuffix: Azure AI Search
 description: Learn how to extract, index, and search multimodal content using the Document Extraction skill for chunking and Azure AI Vision for embeddings.
 
@@ -9,7 +9,7 @@ ms.author: mdonovan
 ms.service: azure-ai-search
 ms.custom:
 ms.topic: tutorial
-ms.date: 05/28/2025
+ms.date: 05/29/2025
 
 ---
 
@@ -29,7 +29,7 @@ In this tutorial, you use:
 
 This tutorial demonstrates a lower-cost approach for indexing multimodal content using Document Extraction skill and image captioning. It enables extraction and search over both text and images from documents in Azure Blob Storage. However, it doesn't include locational metadata for text, such as page numbers or bounding regions.
 
-For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md).
+For a more comprehensive solution that includes structured text layout and spatial metadata, see [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md).
 
 > [!NOTE]
 > Setting `imageAction` to `generateNormalizedImages` as is required for this tutorial incurs an additional charge for image extraction according to [Azure AI Search pricing](https://azure.microsoft.com/pricing/details/search/).
@@ -711,4 +711,4 @@ Now that you're familiar with a sample implementation of a multimodal indexing s
 * [AI Vision multimodal embeddings skill](cognitive-search-skill-vision-vectorize.md)
 * [Vectors in Azure AI Search](vector-search-overview.md)
 * [Semantic ranking in Azure AI Search](semantic-search-overview.md)
-* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md)
+* [Indexing blobs with text and images for multimodal RAG scenarios using image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md)
diff --git a/articles/search/tutorial-document-layout-image-verbalization.md b/articles/search/tutorial-document-layout-image-verbalization.md
@@ -1,5 +1,5 @@
 ---
-title: 'Tutorial: Index multimodal content using image verbalization and Document Layout skill'
+title: 'Tutorial: Use Image Verbalization and Document Layout Skill for Multimodal Indexing'
 titleSuffix: Azure AI Search
 description: Learn how to extract, index, and search multimodal content using the Document Layout skill for chunking and GenAI Prompt skill for image verbalizations.
 
@@ -9,7 +9,7 @@ ms.author: rawan
 ms.service: azure-ai-search
 ms.custom:
 ms.topic: tutorial
-ms.date: 05/28/2025
+ms.date: 05/29/2025
 
 ---
 
@@ -25,7 +25,7 @@ In this tutorial, you use:
 
 + The [Document Layout skill (preview)](cognitive-search-skill-document-intelligence-layout.md) for extracting text and normalized images with its locationMetadata from various documents, such as page numbers or bounding regions.
 
-  The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) has limited regional availability, is bound to Azure AI services, and requires a [billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. For a lower-cost solution to indexing multimodal content, see [Index multimodal content using image verbalization and Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md).
+  The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) has limited regional availability, is bound to Azure AI services, and requires a [billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. For a lower-cost solution to indexing multimodal content, see [Index multimodal content using image verbalization and Document Extraction skill](tutorial-document-extraction-image-verbalization.md).
 
 + The [GenAI Prompt skill (preview)](cognitive-search-skill-genai-prompt.md) to generate image captions, which are text-based descriptions of visual content, for search and grounding.
 
diff --git a/articles/search/tutorial-document-layout-multimodal-embeddings.md b/articles/search/tutorial-document-layout-multimodal-embeddings.md
@@ -1,5 +1,5 @@
 ---
-title: 'Tutorial: Index multimodal content using multimodal embedding and Document Layout skill'
+title: 'Tutorial: Use Multimodal Embeddings and Document Layout Skill for Multimodal Indexing'
 titleSuffix: Azure AI Search
 description: Learn how to extract, index, and search multimodal content using the Document Layout skill for chunking and Azure AI Vision for embeddings.
 
@@ -9,7 +9,7 @@ ms.author: rawan
 ms.service: azure-ai-search
 ms.custom:
 ms.topic: tutorial
-ms.date: 05/28/2025
+ms.date: 05/29/2025
 
 ---
 
@@ -24,7 +24,7 @@ In this tutorial, you use:
 
 + The [Document Layout skill (preview)](cognitive-search-skill-document-intelligence-layout.md) for extracting text and normalized images with its locationMetadata from various documents, such as page numbers or bounding regions.
 
-  The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) has limited regional availability, is bound to Azure AI services, and requires a [billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. For a lower-cost solution to indexing multimodal content, see [Index multimodal content using image verbalization and Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md).
+  The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) has limited regional availability, is bound to Azure AI services, and requires a [billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. For a lower-cost solution to indexing multimodal content, see [Index multimodal content using image verbalization and Document Extraction skill](tutorial-document-extraction-image-verbalization.md).
 
 + Vectorization using the [Azure AI Vision multimodal embeddings skill](cognitive-search-skill-vision-vectorize.md), which generates embeddings for both text and images.
 
@@ -613,4 +613,4 @@ Now that you're familiar with a sample implementation of a multimodal indexing s
 + [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md)
 + [Vectors in Azure AI Search](vector-search-overview.md)
 + [Semantic ranking in Azure AI Search](semantic-search-overview.md)
-+ [Index multimodal content using embeddings and Document Extraction skill](tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md)
++ [Index multimodal content using embeddings and Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md)

Original file line number	Diff line number	Diff line change
`@@ -348,4 +348,4 @@ This quickstart uses billable Azure resources. If you no longer need the resourc`
`348`	`348`
`349`	`349`	`## Next step`
`350`	`350`
`351`		`-This quickstart introduced you to the Import and vectorize data wizard, which creates all of the necessary objects for multimodal search. To explore each step in detail, see [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md).`
	`351`	`+This quickstart introduced you to the Import and vectorize data wizard, which creates all of the necessary objects for multimodal search. To explore each step in detail, see [Tutorial: Index mixed content using image verbalizations and the Document Layout skill](tutorial-document-layout-image-verbalization.md).`