articles/search/multimodal-search-overview.md
Lines changed: 7 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -4,7 +4,7 @@ titleSuffix: Azure AI Search
4
4
description: Learn what multimodal search is, how Azure AI Search supports it for text + image content, and where to find detailed concepts, tutorials, and samples.
5
5
ms.service: azure-ai-search
6
6
ms.topic: conceptual
7
-
ms.date: 05/12/2025
7
+
ms.date: 05/19/2025
8
8
author: gmndrg
9
9
ms.author: gimondra
10
10
---
@@ -28,6 +28,9 @@ Azure AI Search simplifies the construction of a multimodal pipeline through a g
28
28
The functionality behind the **Import and vectorize data** wizard's multimodality option is powered by managed, configurable AI skills and the Azure Search knowledge store:
29
29
30
30
+ [Document Intelligence layout skill](cognitive-search-skill-document-intelligence-layout.md) and [document extraction skill](cognitive-search-skill-document-extraction.md) obtain page text, inline images, and structural metadata. The Document Extraction skill doesn't support polygon extraction or page number extraction. Also, the range of supported file types may vary. To confirm that a skill fits your use case, check each skill's documentation for details on compatibility and capabilities.
31
+
The built-in extraction skills (Document Layout and Document Extraction) don't extract tables or preserve table structure. To extract tables and retain their structure, you can:
32
+
1. Build a [custom Web API skill](cognitive-search-custom-skill-web-api.md).
33
+
1. Use this skill to call the [Azure AI Content Understanding service](/azure/ai-services/content-understanding/tutorial/build-rag-solution), which supports content extraction, including tables.
31
34
+ [Split skill](cognitive-search-skill-textsplit.md) chunks the extracted text for use by the rest of the pipeline (such as embedding skills).
32
35
+ [Gen AI prompt skill](cognitive-search-skill-genai-prompt.md) uses a large language model (LLM) to verbalize images, producing concise natural-language descriptions suitable for text search and embedding.
33
36
+ Text/image (or multimodal) embedding skills create embeddings for text and images, enabling similarity and hybrid retrieval. You can call [Azure OpenAI](cognitive-search-skill-azure-openai-embedding.md), [AI Foundry](cognitive-search-aml-skill.md), or [AI Vision](cognitive-search-skill-vision-vectorize.md) embedding models natively.
@@ -39,7 +42,9 @@ A multimodal pipeline begins by cracking each source document into chunks of tex
| Location metadata extraction based on file type | Multiple file support according to [Azure AI Document Intelligence layout model (preview)](/azure/ai-services/document-intelligence/prebuilt/layout)| PDF only |
43
48
| Data-extraction billing | Billed according to [Document Intelligence layout-model pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/). | Image extraction is billed as outlined in the [Azure AI Search pricing page](https://azure.microsoft.com/pricing/details/search/). |
44
49
| Recommended scenarios | RAG pipelines and agent workflows that need precise page numbers, on-page highlights, or diagram overlays in client apps. | Rapid prototyping or production pipelines where the exact position or detailed layout information isn't required. |
articles/search/search-how-to-create-search-index.md
Lines changed: 3 additions & 1 deletion
@@ -7,7 +7,7 @@ author: HeidiSteen
7
7
ms.author: heidist
8
8
ms.service: azure-ai-search
9
9
ms.topic: how-to
10
-
ms.date: 05/08/2025
10
+
ms.date: 05/19/2025
11
11
---
12
12
13
13
# Create an index in Azure AI Search
@@ -52,6 +52,8 @@ Use this checklist to assist the design decisions for your search index.
52
52
53
53
1. Review [supported data types](/rest/api/searchservice/supported-data-types). The data type affects how the field is used. For example, numeric content is filterable but not full text searchable. The most common data type is `Edm.String` for searchable text, which is tokenized and queried using the full text search engine. The most common data type for a vector field is `Edm.Single` but you can use other types as well.
54
54
55
+
1. Provide a description of the index (preview), 4,000 character maximum. This human-readable text is invaluable when a system must access several indexes and make a decision based on the description. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on index name alone. An index Description field is available in the [2025-05-01-preview REST API](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true), the Azure portal, or a prerelease package of an Azure SDK that provides the feature. For more information, see [Add an index description](search-howto-reindex.md#add-an-index-description-preview).
56
+
55
57
1. Identify a [document key](#document-keys). A document key is an index requirement. It's a single string field populated from a source data field that contains unique values. For example, if you're indexing from Blob Storage, the metadata storage path is often used as the document key because it uniquely identifies each blob in the container.
56
58
57
59
1. Identify the fields in your data source that contribute searchable content in the index.
articles/search/search-howto-index-one-to-many-blobs.md
Lines changed: 10 additions & 5 deletions
@@ -11,7 +11,7 @@ ms.service: azure-ai-search
11
11
ms.custom:
12
12
- ignite-2023
13
13
ms.topic: conceptual
14
-
ms.date: 01/18/2025
14
+
ms.date: 05/19/2025
15
15
---
16
16
17
17
# Indexing blobs and files to produce multiple search documents
@@ -24,11 +24,12 @@ When you use any of these parsing modes, the new search documents that emerge mu
24
24
25
25
To address this problem, the blob indexer generates an `AzureSearch_DocumentKey` that uniquely identifies each child search document created from the single blob parent. This article explains how this feature works.
26
26
27
+
27
28
## One-to-many document key
28
29
29
-
Each document in an index is uniquely identified by a document key. When no parsing mode is specified, and if there's no [explicit field mapping](search-indexer-field-mappings.md) in the indexer definition for the search document key, the blob indexer automatically maps the `metadata_storage_path property` as the document key. This default mapping ensures that each blob appears as a distinct search document, and it saves you the step of having to create this field mapping yourself (normally, only fields having identical names and types are automatically mapped).
30
+
A document key uniquely identifies each document in an index. When no parsing mode is specified, and if there's no [explicit field mapping](search-indexer-field-mappings.md) in the indexer definition for the search document key, the blob indexer automatically maps the `metadata_storage_path` property as the document key. This default mapping ensures that each blob appears as a distinct search document. It also saves you from creating this field mapping manually; normally, only fields with identical names and types are mapped automatically.
30
31
31
-
In a one-to-many search document scenario, an implicit document key based on `metadata_storage_path property` isn't possible. As a workaround, Azure AI Search can generate a document key for each individual entity extracted from a blob. The generated key is named `AzureSearch_DocumentKey` and it's added to each search document. The indexer keeps track of the "many documents" created from each blob, and can target updates to the search index when source data changes over time.
32
+
In a one-to-many search document scenario, an implicit document key based on the `metadata_storage_path` property isn't possible. As a workaround, Azure AI Search can generate a document key for each individual entity extracted from a blob. The system generates a key called `AzureSearch_DocumentKey` and adds it to each search document. The indexer keeps track of the "many documents" created from each blob, and can target updates to the search index when source data changes over time.
32
33
33
34
By default, when no explicit field mappings for the key index field are specified, the `AzureSearch_DocumentKey` is mapped to it, using the `base64Encode` field-mapping function.
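To illustrate why a generated key is encoded before it's stored, the following minimal sketch applies URL-safe Base64 to a made-up key value. It uses Python's standard library, not the service's `base64Encode` mapping function, so the exact output (for example, how padding is handled) might differ from what the indexer produces. The sample value and the helper names are hypothetical.

```python
import base64


def encode_key(raw_key: str) -> str:
    """URL-safe Base64-encode a raw key value (illustrative only)."""
    return base64.urlsafe_b64encode(raw_key.encode("utf-8")).decode("ascii")


def decode_key(encoded_key: str) -> str:
    """Reverse encode_key to recover the original value."""
    return base64.urlsafe_b64decode(encoded_key.encode("ascii")).decode("utf-8")


# Hypothetical raw value; real AzureSearch_DocumentKey values are generated by the indexer.
raw = "https://example.blob.core.windows.net/container/blob1.json;1"
encoded = encode_key(raw)
```

The encoded string contains only letters, digits, dashes, underscores, and equal signs, so characters such as `/` and `:` from the original value no longer appear in the key.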
Notice that each document contains the `id` field, which is defined as the `key` field in the index. In such a case, even though a document-unique `AzureSearch_DocumentKey` is generated, it isn't used as the "key" for the document. Rather, the value of the `id` field is mapped to the `key` field
136
+
Each document contains the `id` field, which is defined as the `key` field in the index. In this situation, the system generates a unique `AzureSearch_DocumentKey` for the document, but it isn't used as the "key." Instead, the value of the `id` field is mapped to the `key` field.
137
+
138
+
Similar to the previous example, this mapping doesn't result in four documents showing up in the index because the `id` field isn't unique _across blobs_. When this situation occurs, any JSON entry that specifies an `id` causes a merge with the existing document instead of uploading a new one. The index then reflects the latest state of the entry with the specified `id`.
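The merge behavior can be approximated with a small in-memory simulation. This is a sketch under stated assumptions, not the indexer's implementation: the index is modeled as a dictionary keyed by `id`, and the two blobs and their entries are hypothetical.

```python
def index_entries(index: dict, entries: list) -> None:
    """Merge each entry into the simulated index, keyed by its `id` field."""
    for entry in entries:
        existing = index.setdefault(entry["id"], {})
        existing.update(entry)  # values read later overwrite earlier ones


# Two hypothetical blobs with two JSON entries each; the ids repeat across blobs.
blob1 = [{"id": "1", "text": "from blob1"}, {"id": "2", "text": "from blob1"}]
blob2 = [{"id": "1", "text": "from blob2"}, {"id": "2", "text": "from blob2"}]

index = {}
index_entries(index, blob1)
index_entries(index, blob2)
```

Four entries are read, but the simulated index ends up with two documents, each holding the values from the blob that was read last.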
139
+
140
+
## Limitations
136
141
137
-
Similar to the previous example, this mapping doesn't result in four documents showing up in the index because the `id` field isn't unique _across blobs_. When this is the case, any json entry that specifies an `id` results in a merge on the existing document instead of an upload of a new document, and the state of the index reflects the latest read entry with the specified `id`.
142
+
When a document entry in the index is created from a line in a file, as explained in this article, deleting that line from the file doesn't automatically remove the corresponding entry from the index. To delete the document entry, you must manually submit a deletion request to the index using the [REST API deletion operation](/rest/api/searchservice/addupdate-or-delete-documents).
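As a rough sketch of such a deletion request, the following helper builds the JSON body for a batch of `delete` actions. The key field name `id` and the key values are assumptions for illustration; use the key field defined in your own index. The code only builds the body and doesn't send it.

```python
import json


def build_delete_batch(key_field: str, keys: list) -> str:
    """Build a JSON request body that deletes documents by key."""
    actions = [{"@search.action": "delete", key_field: key} for key in keys]
    return json.dumps({"value": actions})


# Hypothetical key field and key values.
body = build_delete_batch("id", ["1", "2"])
```

You would then send this body in a POST request to the documents endpoint of the index, as described in the linked REST reference.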
articles/search/search-howto-reindex.md
Lines changed: 41 additions & 1 deletion
@@ -11,7 +11,7 @@ ms.service: azure-ai-search
11
11
ms.custom:
12
12
- ignite-2024
13
13
ms.topic: how-to
14
-
ms.date: 04/28/2025
14
+
ms.date: 05/19/2025
15
15
---
16
16
17
17
# Update or rebuild an index in Azure AI Search
@@ -199,6 +199,7 @@ The index schema defines the physical data structures created on the search serv
199
199
200
200
The following list enumerates the schema changes that can be introduced seamlessly into an existing index. Generally, the list includes new fields and functionality used during query execution.
201
201
202
+
+ Add an [index description (preview)](#add-an-index-description-preview)
202
203
+ Add a new field
203
204
+ Set the `retrievable` attribute on an existing field
204
205
+ Update `searchAnalyzer` on a field having an existing `indexAnalyzer`
@@ -253,6 +254,45 @@ When you create the index, physical storage is allocated for each field in the i
253
254
254
255
To minimize disruption to application code, consider [creating an index alias](search-how-to-alias.md). Application code references the alias, but you can update the name of the index that the alias points to.
255
256
257
+
## Add an index description (preview)
258
+
259
+
Beginning with REST API version 2025-05-01-preview, you can add a description to an index. This human-readable text is invaluable when a system must access several indexes and make a decision based on the description. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on the index name alone.
260
+
261
+
An index description is a schema update, and you can add it without having to rebuild the entire index.
262
+
263
+
+ String length is 4,000 characters maximum.
264
+
+ Content must be human-readable, in Unicode. Your use case should determine which language to use.
265
+
266
+
Support for an index description is provided in the preview REST API, the Azure portal, or in a prerelease Azure SDK package that provides the feature.
267
+
268
+
### [**Azure portal**](#tab/portal)
269
+
270
+
The Azure portal supports the latest preview API.
271
+
272
+
1. Sign in to the Azure portal and find your search service.
273
+
274
+
1. Under **Search management** > **Indexes**, select an index.
275
+
276
+
1. Select **Edit JSON**.
277
+
278
+
1. Insert `"description"`, followed by the description.
279
+
280
+
:::image type="content" source="media/search-how-to-index/edit-index-json.png" alt-text="Screenshot of the JSON definition of an index in the Azure portal.":::
281
+
282
+
1. Save the index.
283
+
284
+
### [**REST**](#tab/rest)
285
+
286
+
1. [GET an index definition](/rest/api/searchservice/indexes/get).
287
+
288
+
1. Copy the JSON.
289
+
290
+
1. [Formulate an index update PUT request](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) using the preview API, providing the *full* JSON of the existing schema, plus the new description field.
291
+
292
+
1. To confirm the description, issue another [GET using the 2025-05-01-preview REST API](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true).
293
+
294
+
---
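The REST steps can be sketched as a small helper that takes the full index definition returned by GET, adds the description, and returns the body for the PUT request. This is a minimal sketch, not an official client: the property name `description` is an assumption taken from the schema example in search-what-is-an-index.md, and the sample index definition is hypothetical.

```python
import copy

MAX_DESCRIPTION_LENGTH = 4000  # limit stated in this article


def with_description(index_definition: dict, description: str) -> dict:
    """Return a copy of the full index definition with a description added."""
    if len(description) > MAX_DESCRIPTION_LENGTH:
        raise ValueError("Description exceeds 4,000 characters.")
    updated = copy.deepcopy(index_definition)
    updated["description"] = description  # assumed property name
    return updated


# Hypothetical schema, as it might be returned by a GET request.
existing = {
    "name": "health-plans",
    "fields": [{"name": "id", "type": "Edm.String", "key": True}],
}
updated = with_description(
    existing, "Health plan coverage for standard and premium plans."
)
```

Working on a copy keeps the original definition unchanged, which makes it easier to compare the two before you send the update.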
295
+
256
296
## Balancing workloads
257
297
258
298
Indexing doesn't run in the background, but the search service will balance any indexing jobs against ongoing queries. During indexing, you can [monitor query requests](search-monitor-queries.md) in the Azure portal to ensure queries are completing in a timely manner.
articles/search/search-index-access-control-lists-and-rbac-push-api.md
Lines changed: 5 additions & 5 deletions
@@ -4,7 +4,7 @@ titleSuffix: Azure AI Search
4
4
description: Learn how to use the REST API for indexing documents with ACLs and RBAC metadata.
5
5
ms.service: azure-ai-search
6
6
ms.topic: conceptual
7
-
ms.date: 05/09/2025
7
+
ms.date: 05/19/2025
8
8
author: admayber
9
9
ms.author: admayber
10
10
---
@@ -16,7 +16,7 @@ ms.author: admayber
16
16
Indexing documents, along with their associated [Access Control Lists (ACLs)](/azure/storage/blobs/data-lake-storage-access-control) and container [Role-Based Access Control (RBAC) roles](/azure/role-based-access-control/overview), into an Azure AI Search index via the [REST API](/rest/api/searchservice/) offers fine-grained control over the indexing pipeline. This approach enables the inclusion of document entries with precise, document-level permissions directly within the index. This article explains how to use the REST API to index document-level permissions' metadata in Azure AI Search. This process prepares your index to query and enforce end-user permissions.
17
17
18
18
## Supported scenarios
19
-
- Indexing ACLs metadata from [ENTRA-based](/entra/fundamentals/whatis), POSIX-style ACL systems, such as [Azure Data Lake Storage (ADLS) Gen2].(/azure/storage/blobs/data-lake-storage-introduction)
19
+
- Indexing ACLs metadata from [ENTRA-based](/entra/fundamentals/whatis), POSIX-style ACL systems, such as [Azure Data Lake Storage (ADLS) Gen2](/azure/storage/blobs/data-lake-storage-introduction)
20
20
- Indexing RBAC container metadata from ADLS Gen2.
21
21
22
22
### Limitations
@@ -107,9 +107,9 @@ This example illustrates how the document access rules are resolved based on the
107
107
| 3 |["none"]|["group1", "group2"]| Empty | Members of group1 or group2 ||
108
108
| 4 |["all"]|["none"]| Empty | Any user | Any querying user matches the ACL filter "all", so all users have access |
109
109
| 5 |["all"]|["group1", "group2"]| scope/to/container1 | Any user | Since all users match the "all" filter for userID, the groupID and RBAC filters don't have any impact |
110
-
|5|["user1", "user2"]|["group1"]| Empty | User1, user2, or any member of group1 ||
111
-
|5|["user1", "user2"]|[]| Empty | User1, user2, or any user with RBAC permissions to container1 ||
110
+
|6|["user1", "user2"]|["group1"]| Empty | User1, user2, or any member of group1 ||
111
+
|7|["user1", "user2"]|[]| Empty | User1, user2, or any user with RBAC permissions to container1 ||
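The resolution logic in the table can be approximated as a union of three filters: a user gets access if any one of them matches. The following sketch is a simplified model, not the service's implementation. The field names `userIds`, `groupIds`, and `rbacScope` are hypothetical, and `None` stands in for an empty RBAC scope.

```python
def has_access(user_id, user_groups, user_scopes, doc):
    """Return True if any of the document's three filters matches the user."""
    user_ids = doc["userIds"]
    group_ids = doc["groupIds"]
    # "all" matches every user; "none" matches no real user or group.
    if "all" in user_ids or user_id in user_ids:
        return True
    if "all" in group_ids or any(group in group_ids for group in user_groups):
        return True
    scope = doc.get("rbacScope")
    return scope is not None and scope in user_scopes


# Rows 3, 4, and 6 from the table, expressed with the hypothetical field names.
row3 = {"userIds": ["none"], "groupIds": ["group1", "group2"], "rbacScope": None}
row4 = {"userIds": ["all"], "groupIds": ["none"], "rbacScope": None}
row6 = {"userIds": ["user1", "user2"], "groupIds": ["group1"], "rbacScope": None}

member_of_group2 = has_access("user9", ["group2"], set(), row3)
outsider_on_row3 = has_access("user9", ["group3"], set(), row3)
```

Because the filters are combined with OR, a single `"all"` value is enough to open a document to every querying user, as in row 4.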
112
112
113
113
## Next steps
114
-
- [How to query the index using end user ENTRA-token to enforce document-level permissions](https://aka.ms/azs-query-preserving-permissions)
114
+
- [How to query the index using end user ENTRA-token to enforce document-level permissions](search-query-access-control-rbac-enforcement.md)
115
115
- [How to index ADLS Gen2 document-level permission information using indexers](tutorial-adls-gen2-indexer-acls.md)
articles/search/search-what-is-an-index.md
Lines changed: 4 additions & 2 deletions
@@ -11,7 +11,7 @@ ms.service: azure-ai-search
11
11
ms.custom:
12
12
- ignite-2023
13
13
ms.topic: conceptual
14
-
ms.date: 04/14/2025
14
+
ms.date: 05/19/2025
15
15
---
16
16
17
17
# Search indexes in Azure AI Search
@@ -35,6 +35,7 @@ The structure of a document is determined by the *index schema*, as illustrated
35
35
```json
36
36
{
37
37
"name": "name_of_index, unique across the service",
38
+
"description": "Health plan coverage for standard and premium plans for Northwind and Contoso employees.",
38
39
"fields": [
39
40
{
40
41
"name": "name_of_field",
@@ -50,7 +51,7 @@ The structure of a document is determined by the *index schema*, as illustrated
50
51
"indexAnalyzer": "name_of_indexing_analyzer", (only if 'searchAnalyzer' is set and 'analyzer' is not set)
51
52
"normalizer": "name_of_normalizer", (applies to fields that are filterable)
52
53
"synonymMaps": "name_of_synonym_map", (optional, only one synonym map per field is currently supported)
53
-
"dimensions": "number of dimensions used by an emedding models", (applies to vector fields only, of type Collection(Edm.Single))
54
+
"dimensions": "number of dimensions used by an embedding model", (applies to vector fields only, of type Collection(Edm.Single))
54
55
"vectorSearchProfile": "name_of_vector_profile"(indexes can have many configurations, a field can use just one)
55
56
}
56
57
],
@@ -187,6 +188,7 @@ You can get hands-on experience creating an index using almost any sample or wal
187
188
But you'll also want to become familiar with methodologies for loading an index with data. Index definition and data import strategies are defined in tandem. The following articles provide more information about creating and loading an index.
188
189
189
190
+ [Create a search index](search-how-to-create-search-index.md)
191
+
+ [Update an index](search-howto-reindex.md)
190
192
+ [Create a vector store](vector-search-how-to-create-index.md)