Commit c15a8a6

Merge pull request #278859 from HeidiSteen/heidist-june12
[azure search] cross-link security trimming to related content
2 parents c9ca26d + 65ff64a commit c15a8a6

4 files changed

+46
-47
lines changed

articles/search/search-howto-index-sharepoint-online.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -58,7 +58,7 @@ Here are the limitations of this feature:
5858

5959
+ Renaming a SharePoint folder doesn't trigger incremental indexing. A renamed folder is treated as new content.
6060

61-
+ SharePoint supports a granular authorization model that determines per-user access at the document level. The indexer doesn't pull these permissions into the index, and Azure AI Search doesn't support document-level authorization. When a document is indexed from SharePoint into a search service, the content is available to anyone who has read access to the index. If you require document-level permissions, you should consider [security filters to trim results](search-security-trimming-for-azure-search-with-aad.md) and automate copying the permissions at a file level to a field in the index.
61+
+ SharePoint supports a granular authorization model that determines per-user access at the document level. The indexer doesn't pull these permissions into the index, and Azure AI Search doesn't support document-level authorization. When a document is indexed from SharePoint into a search service, the content is available to anyone who has read access to the index. If you require document-level permissions, you should consider [security filters to trim results](search-security-trimming-for-azure-search.md) and automate copying the permissions at a file level to a field in the index.
6262

6363
+ Indexing user-encrypted files, Information Rights Management (IRM) protected files, ZIP files with passwords or similar encrypted content isn't supported. For encrypted content to be processed, the user with proper permissions to the specific file must remove the encryption so the item can be indexed accordingly when the indexer runs the next scheduled iteration.
6464

@@ -287,7 +287,7 @@ There are a few steps to creating the indexer:
287287
}
288288
```
289289
290-
If you're using application permissions, it's necessary to wait until the initial run is complete before starting to query your index. The following instructions provided in this step pertain specifically to delegated permissions, and are not applicable to application permissions.
290+
If you're using application permissions, it's necessary to wait until the initial run is complete before starting to query your index. The following instructions provided in this step pertain specifically to delegated permissions, and are not applicable to application permissions.
291291
292292
1. When you create the indexer for the first time, the [Create Indexer (preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2023-10-01-preview&tabs=HTTP&preserve-view=true) request waits until you complete the next step. You must call [Get Indexer Status](/rest/api/searchservice/indexers/get-status?view=rest-searchservice-2023-10-01-preview&tabs=HTTP&preserve-view=true) to get the link and enter your new device code.
293293
@@ -299,7 +299,7 @@ If you're using application permissions, it's necessary to wait until the initia
299299
300300
If you don’t run the [Get Indexer Status](/rest/api/searchservice/indexers/get-status?view=rest-searchservice-2023-10-01-preview&tabs=HTTP&preserve-view=true) within 10 minutes, the code expires and you’ll need to recreate the [data source](#create-data-source).
301301
302-
1. Copy the device login code from the [Get Indexer Status](/rest/api/searchservice/indexers/get-status?view=rest-searchservice-2023-10-01-preview&tabs=HTTP&preserve-view=true) response. The device login can be found in the "errorMessage".
302+
1. Copy the device login code from the [Get Indexer Status](/rest/api/searchservice/indexers/get-status?view=rest-searchservice-2023-10-01-preview&tabs=HTTP&preserve-view=true) response. The device login can be found in the "errorMessage".
303303
304304
```http
305305
{
@@ -309,6 +309,7 @@ If you're using application permissions, it's necessary to wait until the initia
309309
}
310310
}
311311
```
312+
312313
1. Provide the code that was included in the error message.
313314
314315
:::image type="content" source="media/search-howto-index-sharepoint-online/enter-device-code.png" alt-text="Screenshot showing how to enter a device code.":::

articles/search/search-security-overview.md

Lines changed: 1 addition & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -156,12 +156,7 @@ For multitenancy solutions requiring security boundaries at the index level, it'
156156

157157
User permissions at the document level, also known as *row-level security*, isn't natively supported in Azure AI Search. If you import data from an external system that provides row-level security, such as Azure Cosmos DB, those permissions won't transfer with the data as it's being indexed by Azure AI Search.
158158

159-
If you require permissioned access over content in search results, there's a technique for applying filters that include or exclude documents based on user identity. This workaround adds a string field in the data source that represents a group or user identity, which you can make filterable in your index. The following table describes two approaches for trimming search results of unauthorized content.
160-
161-
| Approach | Description |
162-
|----------|-------------|
163-
|[Security trimming based on identity filters](search-security-trimming-for-azure-search.md) | Documents the basic workflow for implementing user identity access control. It covers adding security identifiers to an index, and then explains filtering against that field to trim results of prohibited content. |
164-
|[Security trimming based on Microsoft Entra identities](search-security-trimming-for-azure-search-with-aad.md) | This article expands on the previous article, providing steps for retrieving identities from Microsoft Entra ID, one of the [free services](https://azure.microsoft.com/free/) in the Azure cloud platform. |
159+
If you require permissioned access over content in search results, there's a technique for applying filters that include or exclude documents based on user identity. This workaround adds a string field in the data source that represents a group or user identity, which you can make filterable in your index. For more information about this pattern, see [Security trimming based on identity filters](search-security-trimming-for-azure-search.md).
165160

166161
## Data residency
167162

articles/search/search-security-trimming-for-azure-search-with-aad.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@ ms.date: 02/15/2024
1212
ms.custom:
1313
- devx-track-csharp
1414
- ignite-2023
15+
ROBOTS: NOINDEX,NOFOLLOW
1516
---
1617
# Security filters for trimming Azure AI Search results using Microsoft Entra tenants and identities
1718

articles/search/search-security-trimming-for-azure-search.md

Lines changed: 40 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Security filters for trimming results
2+
title: Security filter pattern
33
titleSuffix: Azure AI Search
44
description: Learn how to implement security privileges at the document level for Azure AI Search search results, using security filters and user identities.
55

@@ -10,68 +10,73 @@ ms.service: cognitive-search
1010
ms.custom:
1111
- ignite-2023
1212
ms.topic: how-to
13-
ms.date: 01/10/2024
13+
ms.date: 06/20/2024
1414
---
1515

1616
# Security filters for trimming results in Azure AI Search
1717

18-
Azure AI Search doesn't provide document-level permissions and can't vary search results from within the same index by user permissions. As a workaround, you can create a filter that trims search results based on a string containing a group or user identity.
18+
Azure AI Search doesn't provide native document-level permissions and can't vary search results from within the same index by user permissions. As a workaround, you can create a filter that trims search results based on a string containing a group or user identity.
1919

20-
This article describes a pattern for security filtering that includes following steps:
20+
This article describes a pattern for security filtering having the following steps:
2121

2222
> [!div class="checklist"]
2323
> * Assemble source documents with the required content
2424
> * Create a field for the principal identifiers
2525
> * Push the documents to the search index for indexing
2626
> * Query the index with the `search.in` filter function
2727
28+
It concludes with links to demos and examples that provide hands-on learning. We recommend reviewing this article first to understand the pattern.
29+
2830
## About the security filter pattern
2931

30-
Although Azure AI Search doesn't integrate with security subsystems for access to content within an index, many customers who have document-level security requirements have found that filters can meet their needs.
32+
Although Azure AI Search doesn't integrate with security subsystems for access to content within an index, many customers who have document-level security requirements find that filters can meet their needs.
3133

32-
In Azure AI Search, a security filter is a regular OData filter that includes or excludes a search result based on a matching value, except that in a security filter, the criteria is a string consisting of a security principal. There's no authentication or authorization through the security principal. The principal is just a string, used in a filter expression, to include or exclude a document from the search results.
34+
In Azure AI Search, a security filter is a regular OData filter that includes or excludes a search result based on a string consisting of a security principal. There's no authentication or authorization through the security principal. The principal is just a string, used in a filter expression, to include or exclude a document from the search results.
3335

3436
There are several ways to achieve security filtering. One way is through a complicated disjunction of equality expressions: for example, `Id eq 'id1' or Id eq 'id2'`, and so forth. This approach is error-prone, difficult to maintain, and in cases where the list contains hundreds or thousands of values, slows down query response time by many seconds.
3537

3638
A better solution is using the `search.in` function for security filters, as described in this article. If you use `search.in(Id, 'id1, id2, ...')` instead of an equality expression, you can expect subsecond response times.
3739

3840
## Prerequisites
3941

40-
* The field containing group or user identity must be a string with the filterable attribute. It should be a collection. It shouldn't allow nulls.
42+
* A string field containing a group or user identity, such as a Microsoft Entra object identifier.
4143

42-
* Other fields in the same document should provide the content that's accessible to that group or user. In the following JSON documents, the "security_id" fields contain identities used in a security filter, and the name, salary, and marital status will be included if the identity of the caller matches the "security_id" of the document.
44+
* Other fields in the same document should provide the content that's accessible to that group or user. In the following JSON documents, the "security_id" fields contain identities used in a security filter, and the name, salary, and marital status are included if the identity of the caller matches the "security_id" of the document.
4345

4446
```json
4547
{
4648
"Employee-1": {
47-
"id": "100-1000-10-1-10000-1",
49+
"employee_id": "100-1000-10-1-10000-1",
4850
"name": "Abram",
4951
"salary": 75000,
5052
"married": true,
51-
"security_id": "10011"
53+
"security_id": "alphanumeric-object-id-for-employee-1"
5254
},
5355
"Employee-2": {
54-
"id": "200-2000-20-2-20000-2",
56+
"employee_id": "200-2000-20-2-20000-2",
5557
"name": "Adams",
5658
"salary": 75000,
5759
"married": true,
58-
"security_id": "20022"
60+
"security_id": "alphanumeric-object-id-for-employee-2"
5961
}
6062
}
6163
```
6264

63-
>[!NOTE]
64-
> The process of retrieving the principal identifiers and injecting those strings into source documents that can be indexed by Azure AI Search isn't covered in this article. Refer to the documentation of your identity service provider for help with obtaining identifiers.
65-
6665
## Create security field
6766

68-
In the search index, within the field collection, you need one field that contains the group or user identity, similar to the fictitious "security_id" field in the previous example.
67+
In the search index, within the fields collection, you need one field that contains the group or user identity, similar to the fictitious "security_id" field in the previous example.
68+
69+
1. Add a security field as a `Collection(Edm.String)`.
70+
71+
1. Set the field's `filterable` attribute to `true`.
6972

70-
1. Add a security field as a `Collection(Edm.String)`. Make sure it has a `filterable` attribute set to `true` so that search results are filtered based on the access the user has. For example, if you set the `group_ids` field to `["group_id1, group_id2"]` for the document with `file_name` "secured_file_b", only users that belong to group IDs "group_id1" or "group_id2" have read access to the file.
73+
1. Set the field's `retrievable` attribute to `false` so that it isn't returned as part of the search request.
7174

72-
Set the field's `retrievable` attribute to `false` so that it isn't returned as part of the search request.
75+
1. Indexes require a document key. The "file_id" field satisfies that requirement.
7376

74-
1. Indexes require a document key. The "file_id" field satisfies that requirement. Indexes should also contain searchable content. The "file_name" and "file_description" fields represent that in this example.
77+
1. Indexes should also contain searchable and retrievable content. The "file_name" and "file_description" fields represent that in this example.
78+
79+
The following index schema satisfies the field requirements. Documents that you index on Azure AI Search should have values for all of these fields, including the "group_ids". For the document with `file_name` "secured_file_b", only users that belong to group IDs "group_id1" or "group_id2" have read access to the file.
7580

7681
```https
7782
POST https://[search service].search.windows.net/indexes/securedfiles/docs/index?api-version=2023-11-01
@@ -87,23 +92,25 @@ In the search index, within the field collection, you need one field that contai
8792
```
8893

8994
## Push data into your index using the REST API
90-
91-
Send an HTTP POST request to the docs collection of your index's URL endpoint (see [Documents - Index](/rest/api/searchservice/documents/)). The body of the HTTP request is a JSON rendering of the documents to be indexed:
9295

93-
```http
94-
POST https://[search service].search.windows.net/indexes/securedfiles/docs/index?api-version=2023-11-01
95-
```
96+
Populate your search index with documents that provide values for each field in the fields collection, including values for the security field. Azure AI Search doesn't provide APIs or features for populating the security field specifically. However, several of the examples listed at the end of this article explain techniques for populating this field.
9697

97-
In the request body, specify the content of your documents:
98+
In Azure AI Search, the approaches for loading data are:
9899

99-
```JSON
100+
* A single push or pull (indexer) operation that imports documents populated with all fields
101+
* Multiple push or pull operations. As long as secondary import operations target the right document identifier, you can load fields individually through multiple imports.
102+
103+
The following example shows a single HTTP POST request to the docs collection of your index's URL endpoint (see [Documents - Index](/rest/api/searchservice/documents/)). The body of the HTTP request is a JSON rendering of the documents to be indexed:
104+
105+
```http
106+
POST https://[search service].search.windows.net/indexes/securedfiles/docs/index?api-version=2023-11-01
100107
{
101108
"value": [
102109
{
103110
"@search.action": "upload",
104111
"file_id": "1",
105112
"file_name": "secured_file_a",
106-
"file_description": "File access is restricted to the Human Resources.",
113+
"file_description": "File access is restricted to Human Resources.",
107114
"group_ids": ["group_id1"]
108115
},
109116
{
@@ -147,17 +154,11 @@ For full details on searching documents using Azure AI Search, you can read [Sea
147154

148155
This sample shows how to set up a query using a POST request.
149156

150-
Issue the HTTP POST request:
157+
Issue the HTTP POST request, specifying the filter in the request body:
151158

152159
```http
153-
POST https://[service name].search.windows.net/indexes/securedfiles/docs/search?api-version=2020-06-30
154-
Content-Type: application/json
155-
api-key: [admin or query key]
156-
```
160+
POST https://[service name].search.windows.net/indexes/securedfiles/docs/search?api-version=2023-11-01
157161
158-
Specify the filter in the request body:
159-
160-
```JSON
161162
{
162163
"filter":"group_ids/any(g:search.in(g, 'group_id1, group_id2'))"
163164
}
@@ -186,7 +187,8 @@ You should get the documents back where `group_ids` contains either "group_id1"
186187

187188
This article describes a pattern for filtering results based on user identity and the `search.in()` function. You can use this function to pass in principal identifiers for the requesting user to match against principal identifiers associated with each target document. When a search request is handled, the `search.in` function filters out search results for which none of the user's principals have read access. The principal identifiers can represent things like security groups, roles, or even the user's own identity.
188189

189-
For an alternative pattern based on Microsoft Entra ID, or to revisit other security features, see the following links.
190+
For more examples, demos, and videos:
190191

191-
* [Security filters for trimming results using Microsoft Entra ID identities](search-security-trimming-for-azure-search-with-aad.md)
192-
* [Security in Azure AI Search](search-security-overview.md)
192+
* [Get started with chat document security in Python](/azure/developer/python/get-started-app-chat-document-security-trim)
193+
* [Set up optional sign in and document level access control (modifications to the AzureOpenAIDemo app)](https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/login_and_acl.md)
194+
* [Video: Secure your Intelligent Applications with Microsoft Entra](https://build.microsoft.com/en-US/sessions/b5636ca7-64c2-493c-9b30-4a35852acfbe?source=/speakers/cc9b56a0-4af0-4b60-a2f3-8312c5b35ca2)
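
Not part of this commit: a minimal Python sketch of how an application might assemble the `search.in` filter used in the query example from a caller's group identifiers. The function name `build_security_filter` and its parameters are hypothetical, and the sketch assumes the default `search.in` delimiters (space and comma), so identifiers containing either are rejected rather than silently split.

```python
import json


def build_security_filter(principal_ids, field="group_ids"):
    """Build an OData filter that keeps documents whose `field` collection
    contains at least one of the caller's principal identifiers.

    `build_security_filter` is a hypothetical helper, not an Azure SDK call.
    """
    cleaned = []
    for pid in principal_ids:
        pid = pid.strip()
        if not pid:
            continue
        # search.in splits its value list on spaces and commas by default,
        # so an identifier containing either would be read as two values.
        if any(ch.isspace() or ch == "," for ch in pid):
            raise ValueError(f"identifier contains a delimiter: {pid!r}")
        # OData string literals escape a single quote by doubling it.
        cleaned.append(pid.replace("'", "''"))

    if not cleaned:
        # Refuse to build a filter for a caller with no identifiers, so the
        # application cannot fall back to an unfiltered query by accident.
        raise ValueError("at least one principal identifier is required")

    value_list = ", ".join(cleaned)
    return f"{field}/any(g:search.in(g, '{value_list}'))"


# Request body for the POST .../docs/search call shown in the diff above.
body = json.dumps({"filter": build_security_filter(["group_id1", "group_id2"])})
print(body)
```

The identifiers would normally come from the caller's authenticated identity on the server side, never from client-supplied input, because the filter is only a string match and performs no authentication of its own.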
