1 | 1 | --- |
2 | | -title: Multimodal search concepts and guidance |
| 2 | +title: Multimodal Search Concepts and Guidance |
3 | 3 | titleSuffix: Azure AI Search |
4 | 4 | description: Learn what multimodal search is, how Azure AI Search supports it for text and image content, and where to find detailed concepts, tutorials, and samples. |
5 | 5 | ms.service: azure-ai-search |
6 | 6 | ms.topic: conceptual |
7 | | -ms.date: 05/28/2025 |
| 7 | +ms.date: 05/29/2025 |
8 | 8 | author: gmndrg |
9 | 9 | ms.author: gimondra |
10 | 10 | --- |
11 | 11 |
12 | 12 | # Multimodal search in Azure AI Search |
13 | 13 |
14 | | -Multimodal search refers to the ability to ingest, understand, and retrieve content across multiple data types, including text, images, video, and audio. In Azure AI Search, multimodal search natively supports the ingestion of documents containing text and images and the retrieval of their content, enabling you to perform searches that combine both modalities. |
| 14 | +Multimodal search refers to the ability to ingest, understand, and retrieve information across multiple content types, including text, images, video, and audio. In Azure AI Search, multimodal search natively supports the ingestion of documents containing text and images and the retrieval of their content, enabling you to perform searches that combine both modalities. |
15 | 15 |
16 | 16 | Building a robust multimodal pipeline typically involves: |
17 | 17 |
@@ -115,8 +115,8 @@ To help you get started with multimodal search in Azure AI Search, here's a coll |
115 | 115 | | Content | Description | |
116 | 116 | |--|--| |
117 | 117 | | [Quickstart: Multimodal search in the Azure portal](search-get-started-portal-image-search.md) | Create and test a multimodal index in the Azure portal using the wizard and Search Explorer. | |
118 | | -| [Tutorial: Image verbalization and Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. | |
119 | | -| [Tutorial: Multimodal embeddings and Document Extraction skill](tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. | |
120 | | -| [Tutorial: Image verbalization and Document Layout skill](tutorial-multimodal-index-image-verbalization-skill.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. | |
121 | | -| [Tutorial: Multimodal embeddings and Document Layout skill](tutorial-multimodal-index-embeddings-skill.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. | |
| 118 | +| [Tutorial: Image verbalization and Document Extraction skill](tutorial-document-extraction-image-verbalization.md) | Extract text and images, verbalize diagrams, and embed the resulting descriptions and text into a searchable index. | |
| 119 | +| [Tutorial: Multimodal embeddings and Document Extraction skill](tutorial-document-extraction-multimodal-embeddings.md) | Use a vision-text model to embed both text and images directly, enabling visual-similarity search over scanned PDFs. | |
| 120 | +| [Tutorial: Image verbalization and Document Layout skill](tutorial-document-layout-image-verbalization.md) | Apply layout-aware chunking and diagram verbalization, capture location metadata, and store cropped images for precise citations and page highlights. | |
| 121 | +| [Tutorial: Multimodal embeddings and Document Layout skill](tutorial-document-layout-multimodal-embeddings.md) | Combine layout-aware chunking with unified embeddings for hybrid semantic and keyword search that returns exact hit locations. | |
122 | 122 | | [Sample app: Multimodal RAG GitHub repository](https://aka.ms/azs-multimodal-sample-app-repo) | An end-to-end, code-ready RAG application with multimodal capabilities that surfaces both text snippets and image annotations. Ideal for jump-starting enterprise copilots. | |