Commit b384b7c

Merge pull request #6081 from HeidiSteen/heidist-july
[azure search] ACL doc updates
2 parents 163cd9e + 3eb94b9 commit b384b7c

8 files changed

+107
-77
lines changed

articles/search/search-blob-indexer-role-based-access.md

Lines changed: 37 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
11
---
22
title: Use a Blob indexer to ingest RBAC scopes metadata
33
titleSuffix: Azure AI Search
4-
description: Learn how to configure Azure AI Search indexers for ingesting Azure Role-Based Access (RBAC) metadata on Azure Blobs.
4+
description: Learn how to configure Azure AI Search indexers for ingesting Azure Role-Based Access (RBAC) metadata on Azure blobs.
55
ms.service: azure-ai-search
66
ms.topic: how-to
7-
ms.date: 07/07/2025
7+
ms.date: 07/16/2025
88
author: vaishalishah
99
ms.author: vaishalishah
1010
---
@@ -13,35 +13,45 @@ ms.author: vaishalishah
1313

1414
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
1515

16-
Starting in 2025-05-01-preview, you can now include RBAC scope alongside document ingestion in Azure AI Search and use those permissions to control access to search results.
16+
Azure Storage allows for role-based access on containers in blob storage, where roles like **Storage Blob Data Reader** or **Storage Blob Data Contributor** determine whether someone has access to content. Starting in 2025-05-01-preview, you can now include RBAC scope alongside document ingestion in Azure AI Search and use those permissions to control access to search results. If you have rights to the content, you can see those results in a search query. If you don't have rights (or more specifically, a role assignment on the blob container), you *can't* see those results even if you personally have a **Search Index Data Reader** assignment on the index.
1717

18-
You can use the push APIs to upload and index content and permission metadata manually see [Indexing Permissions using the push REST API](search-index-access-control-lists-and-rbac-push-api.md), or you can use an indexer to automate data ingestion. This article focuses on the indexer approach.
18+
RBAC scope is set at the container level and flows to all blobs (documents) through permission inheritance. RBAC scope is captured during indexing as permission metadata. You can use the push APIs to upload and index content and permission metadata manually (see [Indexing Permissions using the push REST API](search-index-access-control-lists-and-rbac-push-api.md)), or you can use an indexer to automate data ingestion. This article focuses on the indexer approach.
19+
20+
At query time, user or group identities are included in the request header via the `x-ms-query-source-authorization` parameter. The identity must match the permission metadata on documents if the user is to see the search results.
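
As a rough illustration of the query-time behavior described above, the following Python sketch assembles (but doesn't send) a search request that carries the `x-ms-query-source-authorization` header. The function name, the service and index names, and the token placeholders are hypothetical. The URL shape and passing the user token as the raw header value are assumptions to verify against the 2025-05-01-preview REST reference.

```python
import json
from urllib.parse import quote

# Preview API version named in this article.
API_VERSION = "2025-05-01-preview"


def build_search_request(service_name, index_name, search_text, app_token, user_token):
    """Assemble the URL, headers, and body for a permission-aware query.

    Nothing is sent over the network; the caller decides how to issue the request.
    """
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{quote(index_name, safe='')}/docs/search"
        f"?api-version={API_VERSION}"
    )
    headers = {
        "Content-Type": "application/json",
        # Token for the client app (needs Search Index Data Reader on the index).
        "Authorization": f"Bearer {app_token}",
        # Token for the user whose identity is matched against permission metadata.
        # Assumption: the header takes the raw token value.
        "x-ms-query-source-authorization": user_token,
    }
    body = json.dumps({"search": search_text})
    return url, headers, body
```

Keeping request assembly separate from transport makes it easy to inspect the headers before any token leaves the application.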
1921

2022
The indexer approach is built on this foundation:
2123

2224
+ [Role-based access control (Azure RBAC)](/azure/storage/blobs/data-lake-storage-access-control-model#role-based-access-control-azure-rbac). There's no support for Attribute-based access control (Azure ABAC).
2325

24-
+ [An Azure AI Search indexer for Blob](search-howto-indexing-azure-blob-storage.md) that retrieves and ingests data and metadata, including permission filters. To get permission filter support, you must use the 2025-05-01-preview REST API or a prerelease package of an Azure SDK that supports the feature.
26+
+ [An Azure AI Search indexer for blobs](search-howto-indexing-azure-blob-storage.md) that retrieves and ingests data and metadata, including permission filters. To get permission filter support, you must use the 2025-05-01-preview REST API or a preview package of an Azure SDK that supports the feature.
2527

26-
+ [An index in Azure AI Search](search-how-to-create-search-index.md) containing the ingested documents and corresponding permissions. Permission metadata is stored as fields in the index. To set up queries that respect the permission filters, you must use the 2025-05-01-preview REST API or a prerelease package of an Azure SDK that supports the feature.
28+
+ [An index in Azure AI Search](search-how-to-create-search-index.md) containing the ingested documents and corresponding permissions. Permission metadata is stored as fields in the index. To set up queries that respect the permission filters, you must use the 2025-05-01-preview REST API or a preview package of an Azure SDK that supports the feature.
2729

2830
## Prerequisites
2931

30-
+ [Microsoft Entra ID authentication and authorization](/entra/identity/authentication/overview-authentication). Services and apps must be in the same tenant. Role assignments are used for each authenticated connection.
32+
+ [Microsoft Entra ID authentication and authorization](/entra/identity/authentication/overview-authentication). Services, apps, and users must be in the same tenant. Role assignments are used for each authenticated connection.
3133

3234
+ Azure AI Search, any region, on a billable tier (basic or higher). See [Service limits](search-limits-quotas-capacity.md) for managed identity support. The search service must be [configured for role-based access](search-security-enable-roles.md) and it must [have a managed identity (either system or user)](search-howto-managed-identities-data-sources.md).
3335

3436
## Limitations
3537

36-
+ The following indexer features don't support permission preservation capabilities but are otherwise operational for Azure Blob content-only indexing:
37-
+ One-to-many [parsing modes](/rest/api/searchservice/indexers/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true#blobindexerparsingmode), such as: `delimitedText`, `jsonArray`, `jsonLines`, and `markdown` with sub-mode `oneToMany`
38+
+ Permission inheritance isn't available if the blob indexer is using a [one-to-many parsing mode](/rest/api/searchservice/indexers/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true#blobindexerparsingmode), such as: `delimitedText`, `jsonArray`, `jsonLines`, and `markdown` with sub-mode `oneToMany`. You must use the default parsing mode that creates one search document for each blob in the container.
39+
40+
## Configure Blob storage
41+
42+
Verify your blob container uses role-based access.
43+
44+
1. Sign in to the Azure portal and find your storage account.
3845

46+
1. Expand **containers** and select the container that has the blobs you want to index.
47+
48+
1. Select **Access Control (IAM)** to check role assignments. Users and groups with **Storage Blob Data Reader** or **Storage Blob Data Contributor** will have access to search documents in the index after the container is indexed.
3949

4050
### Authorization
4151

42-
For indexer execution, your search service identity must have **Storage Blob Data Reader** permission see [Connect to Azure Storage using a managed identity](search-howto-managed-identities-storage.md).
52+
For indexer execution, your search service identity must have **Storage Blob Data Reader** permission. For more information, see [Connect to Azure Storage using a managed identity](search-howto-managed-identities-storage.md).
4353

44-
## Configure Azure AI Search for indexing permission filters
54+
## Configure Azure AI Search
4555

4656
Recall that the search service must have:
4757

@@ -52,14 +62,16 @@ Recall that the search service must have:
5262

5363
For indexer execution, the client issuing the API call must have **Search Service Contributor** permission to create objects, **Search Index Data Contributor** permission to perform data import, and **Search Index Data Reader** to query an index. For more information, see [Connect to Azure AI Search using roles](search-security-rbac.md).
5464

55-
## Indexing permission metadata
65+
## Configure indexing
5666

5767
In Azure AI Search, configure an indexer, data source, and index to pull permission metadata from blobs.
5868

59-
### Configure the data source
69+
### Create the data source
6070

6171
+ Data Source type must be `azureblob`.
6272

73+
+ Data source parsing mode must be the default.
74+
6375
+ Data source must have `indexerPermissionOptions` with `rbacScope`.
6476

6577
+ For `rbacScope`, configure the [connection string](search-howto-index-azure-data-lake-storage.md#supported-credentials-and-connection-strings) with managed identity format.
@@ -78,8 +90,8 @@ JSON example with system managed identity:
7890
"connectionString": "ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Storage/storageAccounts/<your storage account name>/;"
7991
},
8092
"container": {
81-
"name": "<your container name>",
82-
"query": "<optional-query>"
93+
"name": "<your-container-name>",
94+
"query": "<optional-query-used-for-selecting-specific-blobs>"
8395
}
8496
}
8597
```
@@ -95,8 +107,8 @@ JSON schema example with a user-managed identity in the connection string:
95107
"connectionString": "ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Storage/storageAccounts/<your storage account name>/;"
96108
},
97109
"container": {
98-
"name": "<your container name>",
99-
"query": "<optional-query>"
110+
"name": "<your-container-name>",
111+
"query": "<optional-query-used-for-selecting-specific-blobs>"
100112
},
101113
"identity": {
102114
"@odata.type": "#Microsoft.Azure.Search.DataUserAssignedIdentity",
@@ -116,7 +128,7 @@ Recommended schema attributes RBAC Scope:
116128
+ Use string fields for permission metadata
117129
+ Set `filterable` to true on all fields.
118130

119-
Notice that `retrievable` is false. You can set it true during development to verify permissions are present, but remember to set to back to false before deploying to a production environment.
131+
Notice that `retrievable` is false. You can set it to true during development to verify permissions are present, but remember to set it back to false before deploying to a production environment so that security principal identities aren't visible in results.
120132

121133
JSON schema example:
122134

@@ -139,7 +151,7 @@ JSON schema example:
139151

140152
### Configure the indexer
141153

142-
Field mappings within an indexer set the data path to fields in an index. Target and destination fields that vary by name or data type require an explicit field mapping. The following metadata fields in Azure Blob might need field mappings if you vary the field name:
154+
Field mappings within an indexer set the data path to fields in an index. Target and destination fields that vary by name or data type require an explicit field mapping. The following metadata fields in Azure Blob Storage might need field mappings if you vary the field name:
143155

144156
+ **metadata_rbac_scope** (`Edm.String`) - the container RBAC scope.
145157

@@ -160,3 +172,9 @@ JSON schema example:
160172

161173
To effectively manage blob deletion, ensure that you have enabled [deletion tracking](search-howto-index-changed-deleted-blobs.md) before your indexer runs for the first time. This feature allows the system to detect deleted blobs from your source and have them deleted from the index.
162174

175+
## Related content
176+
177+
+ [Search over Azure Blob Storage content](search-blob-storage-integration.md)
178+
+ [Configure a blob indexer](search-howto-indexing-azure-blob-storage.md)
179+
+ [Change and delete detection using indexers for Azure Storage](search-howto-index-changed-deleted-blobs.md)
180+
+ [Connect to Azure AI Search using roles](search-security-rbac.md)
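
The data source requirements listed under "Create the data source" in this file can be sketched in Python as a small payload builder. This is a minimal sketch, not the article's own sample: the function and parameter names are hypothetical, and expressing `indexerPermissionOptions` as a JSON array containing `rbacScope` is an assumption, because the full JSON example isn't visible in this diff. The connection string follows the `ResourceId=...;` managed identity format shown in the hunks above.

```python
def build_blob_data_source(name, subscription_id, resource_group,
                           storage_account, container, query=None):
    """Return a data source definition for a blob indexer that ingests RBAC scope."""
    resource_id = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}/"
    )
    container_def = {"name": container}
    if query:
        # Optional query used for selecting specific blobs.
        container_def["query"] = query
    return {
        "name": name,
        "type": "azureblob",  # required data source type
        # Assumption: the option is passed as an array of permission kinds.
        "indexerPermissionOptions": ["rbacScope"],
        # Managed identity format: no account key in the connection string.
        "credentials": {"connectionString": f"ResourceId={resource_id};"},
        "container": container_def,
    }
```

The returned dictionary can be serialized with `json.dumps` and sent as the body of a create-data-source request against the preview API.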

articles/search/search-document-level-access-overview.md

Lines changed: 16 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ titleSuffix: Azure AI Search
44
description: Conceptual overview of document-level permissions in Azure AI Search.
55
author: gmndrg
66
ms.author: gimondra
7-
ms.date: 07/03/2025
7+
ms.date: 07/16/2025
88
ms.service: azure-ai-search
99
ms.update-cycle: 90-days
1010
ms.topic: conceptual
@@ -21,37 +21,46 @@ Azure AI Search supports document-level access control, enabling organizations t
2121
| Approach | Description |
2222
|----------|-------------|
2323
| Security filters | String comparison. Your application passes in a user or group identity as a string, which populates a filter on a query, excluding any documents that don't match on the string. <br><br>Security filters are a technique for achieving document-level access control. This approach isn't bound to an API so you can use any version or package. |
24-
| ACLs / RBAC scopes (preview) | Microsoft Entra ID security principal behind the query token is compared to the permission metadata of documents returned in search results, excluding any documents that don't match on permissions. <br><br>Built-in support for preserving Access Control Lists (ACLs) and Azure Data Lake Storage (ADLS) Gen2 Role-Based Access Control (RBAC) container scopes at the file level for security principals is in preview, available in REST APIs and prerelease Azure SDK packages that provide the feature. |
24+
| ACLs / RBAC scopes (preview) | Microsoft Entra ID security principal behind the query token is compared to the permission metadata of documents returned in search results, excluding any documents that don't match on permissions. <br><br>Built-in support for identity-based access at the document level is in preview, available in REST APIs and prerelease Azure SDK packages that provide the feature. Be sure to check the [SDK package change log](#retrieve-permissions-metadata-during-data-ingestion-process) for evidence of feature support.|
2525

2626
## Pattern for security trimming using filters
2727

28-
For scenarios where native ACL/RBAC scopes integration isn't viable, we recommend security filters for trimming results based on exclusion criteria. The pattern includes the following components:
28+
For scenarios where native ACL/RBAC scopes integration isn't viable, we recommend security string filters for trimming results based on exclusion criteria. The pattern includes the following components:
2929

3030
- Create a string field in the index to store strings of user or group identities.
3131
- Load the index with source documents that include a field containing the identities.
3232
- Include a filter expression in your query logic for matching on the string.
3333
- At query time, get the identity of the caller.
3434
- Pass in the identity of the caller as the filter string.
35+
- Results are trimmed to exclude any matches that fail to include the user or group identity string.
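
The filter step in the pattern above can be sketched in Python as follows. This is an illustration under stated assumptions: the field name `group_ids` is hypothetical, and the sketch assumes the identities are stored in a filterable string collection field so that the OData `any` and `search.in` functions apply.

```python
def build_security_filter(field_name, identities):
    """Build an OData filter that keeps documents tagged with any caller identity."""
    if not identities:
        raise ValueError("At least one identity is required to build a filter.")
    cleaned = []
    for identity in identities:
        if "," in identity:
            # A comma would be read as a delimiter by search.in below.
            raise ValueError(f"Identity must not contain a comma: {identity!r}")
        # OData string literals escape a single quote by doubling it.
        cleaned.append(identity.replace("'", "''"))
    joined = ",".join(cleaned)
    return f"{field_name}/any(g: search.in(g, '{joined}', ','))"
```

For example, `build_security_filter("group_ids", ["g1", "g2"])` produces a string to pass as the `filter` parameter of a query, so documents that carry neither identity are excluded from results.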
3536

3637
You can use push or pull model APIs. Because this approach is API agnostic, you just need to ensure that the index and query have valid strings (identities) for the filtration step.
3738

3839
This approach is useful for systems with custom access models or non-Microsoft security frameworks. For more information about this approach, see [Security filters for trimming results in Azure AI Search](search-security-trimming-for-azure-search.md).
3940

4041
## Pattern for native support for POSIX-like ACL and RBAC scope permissions (preview)
4142

42-
Native support is based on Microsoft Entra ID user and group access IDs affiliated with documents that you want to index and query. ADLS container RBAC scopes preservation at document level is also supported.
43+
Native support is based on Microsoft Entra ID user and group access IDs affiliated with documents that you want to index and query.
4344

44-
For ACLs, we recommend group access IDs for ease of management. The pattern includes the following components:
45+
Azure Data Lake Storage (ADLS) Gen2 containers support ACLs on the container and on files. For ADLS Gen2, RBAC scope preservation at document level is natively supported when you use the [ADLS Gen2 indexer](search-howto-index-azure-data-lake-storage.md) and a preview API to ingest content.
46+
47+
For any content that's secured through ACLs, we recommend group access IDs over user access IDs for ease of management. The pattern includes the following components:
4548

4649
- Start with documents or files that have ACL assignments.
4750
- [Enable permission filters](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true#searchindexpermissionfilteroption) in the index.
4851
- [Add a permission filter](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true#permissionfilter) to a string field in an index.
4952
- Load the index with source documents having associated ACLs.
5053
- Query the index, [adding `x-ms-query-source-authorization`](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-05-01-preview&preserve-view=true#request-headers) in the request header.
5154

52-
You can use the push model API, pushing any JSON documents to the search index, where the payload includes a string field providing POSIX-like ACLs for each document.
55+
Your client app has read permissions to the index via **Search Index Data Reader**, but user or group permission metadata on indexed content determines access at query time. Queries that include a permission filter pass a user or group token as `x-ms-query-source-authorization` in the request header. When you use permission filters at query time, Azure AI Search checks for 2 things:
56+
57+
- First, it checks for **Search Index Data Reader** permission that allows your client application to access the index.
58+
59+
- Second, given the extra token on the request, it checks for user or group permissions on documents that are returned in search results, excluding any that don't match.
60+
61+
To get permission metadata into the index, you can use the push model API, pushing any JSON documents to the search index, where the payload includes a string field providing POSIX-like ACLs for each document. The important difference between this approach and security trimming is that the permission filter metadata in the index and query is recognized as Microsoft Entra ID authentication, whereas the security trimming workaround is simple string comparison. Also, you can use the Graph SDK to retrieve the identities.
5362

54-
Or, use the pull model (indexer) APIs if the data source is [Azure Data Lake Storage (ADLS) Gen2](/azure/storage/blobs/data-lake-storage-introduction).
63+
You can also use the pull model (indexer) APIs if the data source is [Azure Data Lake Storage (ADLS) Gen2](/azure/storage/blobs/data-lake-storage-introduction) and your code calls a preview API for indexing.
5564

5665
### Retrieve permissions metadata during data ingestion process
5766

articles/search/search-howto-run-reset-indexers.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -211,7 +211,7 @@ The [Indexers - Reset Docs](/rest/api/searchservice/indexers/reset-docs?view=res
211211

212212
On a per-document basis, all fields in the search document are refreshed with values and metadata from the data source. You can't pick and choose which fields to refresh.
213213

214-
If the data source is Azure Data Lake Storage (ADLS) Gen2, and the blobs are associated with permission metadata, those permissions are also re-ingested in the search index if permissions change in the underlying data. For more information, see [Re-indexing ACL and RBAC scope with ADLS Gen2 indexers](search-indexer-access-control-lists-and-role-based-access.md#keep-aclrbac-metadata-in-sync-with-the-data-source).
214+
If the data source is Azure Data Lake Storage (ADLS) Gen2, and the blobs are associated with permission metadata, those permissions are also re-ingested in the search index if permissions change in the underlying data. For more information, see [Re-indexing ACL and RBAC scope with ADLS Gen2 indexers](search-indexer-access-control-lists-and-role-based-access.md#synchronize-permissions-between-indexed-and-source-content).
215215

216216
If the document is enriched through a skillset and has cached data, the skillset is invoked for just the specified documents, and the cache is updated for the reprocessed documents.
217217
