Commit b2e071d

Initial draft of changes to first security trimming article
1 parent b3589e5 commit b2e071d

1 file changed

+64
-32
lines changed

articles/search/search-security-trimming-for-azure-search.md

Lines changed: 64 additions & 32 deletions
Original file line number | Diff line number | Diff line change
@@ -8,60 +8,88 @@ author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
1010
ms.topic: conceptual
11-
ms.date: 01/30/2023
11+
ms.date: 03/24/2023
1212
---
1313

1414
# Security filters for trimming results in Azure Cognitive Search
1515

16-
You can apply security filters to trim search results in Azure Cognitive Search based on user identity. This search experience generally requires comparing the identity of whoever requests the search against a field containing the principals who have permissions to the document. When a match is found, the user or principal (such as a group or role) has access to that document.
16+
Cognitive Search doesn't provide document-level permissions and can't vary search results based on user permissions. As a workaround, you can create a filter that trims search results based on a string consisting of user identity information.
1717

18-
One way to achieve security filtering is through a complicated disjunction of equality expressions: for example, `Id eq 'id1' or Id eq 'id2'`, and so forth. This approach is error-prone, difficult to maintain, and in cases where the list contains hundreds or thousands of values, slows down query response time by many seconds.
18+
This article describes a pattern for security filtering that includes the following steps:
1919

20-
A simpler and faster approach is through the `search.in` function. If you use `search.in(Id, 'id1, id2, ...')` instead of an equality expression, you can expect sub-second response times.
21-
22-
This article shows you how to accomplish security filtering using the following steps:
2320
> [!div class="checklist"]
24-
> * Create a field that contains the principal identifiers
25-
> * Push or update existing documents with the relevant principal identifiers
26-
> * Issue a search request with `search.in` `filter`
21+
> * Assemble source documents that contain the required content
22+
> * Create a field in your search index to contain the principal identifiers
23+
> * Push the documents to the search index for indexing
24+
> * Query the index with the `search.in` filter function
25+
26+
## Choosing the security filter pattern
27+
28+
Although Cognitive Search doesn't integrate with security subsystems at query time, many customers who have document-level security requirements have found that filters can meet their needs.
29+
30+
In Cognitive Search, a security filter is a regular OData filter that includes or excludes a search result based on a matching value. The security principal is just a string. There's no authentication or authorization. The service uses the string as filter criteria to include or exclude a document from the search results.
2731

28-
>[!NOTE]
29-
> The process of retrieving the principal identifiers is not covered in this document. You should get it from your identity service provider.
32+
There are several ways to achieve security filtering. One way is through a complicated disjunction of equality expressions: for example, `Id eq 'id1' or Id eq 'id2'`, and so forth. This approach is error-prone, difficult to maintain, and in cases where the list contains hundreds or thousands of values, slows down query response time by many seconds.
33+
34+
A better solution is using the `search.in` function for security filters. This solution is described in this article. If you use `search.in(Id, 'id1, id2, ...')` instead of an equality expression, you can expect subsecond response times.
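
As a rough illustration of the two approaches, the following Python sketch builds both filter strings from the same list of identifiers. The helper names `disjunction_filter` and `search_in_filter` are hypothetical and exist only for this example; they produce filter text and don't call the service.

```python
def disjunction_filter(field: str, values: list[str]) -> str:
    """Build `field eq 'v1' or field eq 'v2' ...`, the slower pattern."""
    return " or ".join(f"{field} eq '{value}'" for value in values)


def search_in_filter(field: str, values: list[str]) -> str:
    """Build `search.in(field, 'v1, v2, ...')`, the pattern this article uses."""
    joined = ", ".join(values)
    return f"search.in({field}, '{joined}')"


ids = ["id1", "id2", "id3"]
print(disjunction_filter("Id", ids))  # Id eq 'id1' or Id eq 'id2' or Id eq 'id3'
print(search_in_filter("Id", ids))    # search.in(Id, 'id1, id2, id3')
```

The disjunction grows by one full clause per identifier, while the `search.in` form only appends to a single delimited string.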
3035

3136
## Prerequisites
3237

33-
This article assumes you have an [Azure subscription](https://azure.microsoft.com/pricing/free-trial/?WT.mc_id=A261C142F), an[Azure Cognitive Search service](search-create-service-portal.md), and an [index](search-what-is-an-index.md).
38+
* You must have a [search index](search-what-is-an-index.md) that you can modify.
39+
40+
* You must also have source documents that include a field containing the group or user identity that has access to the document. This information becomes the filter criteria against which documents are selected or rejected from the result set returned to the issuer. In the following JSON documents, the "security_id" fields contain an identity string that can be used in a security filter.
41+
42+
```json
43+
{
44+
"Employee-1": {
45+
"id": "000-0000-00-0-00000-1",
46+
"name": "Sanchez",
47+
"salary": 75000,
48+
"married": true,
49+
"security_id": "10011"
50+
},
51+
"Employee-2": {
52+
"id": "000-0000-00-0-00000-2",
53+
"name": "Smith",
54+
"salary": 75000,
55+
"married": true,
56+
"security_id": "20022"
57+
}
58+
}
59+
```
60+
61+
>[!NOTE]
62+
> The process of retrieving the principal identifiers and injecting those strings into source documents that can be indexed by Cognitive Search isn't covered in this article. See the documentation of your identity service provider for help with obtaining identifiers.
3463

3564
## Create security field
3665

37-
Your documents must include a field specifying which groups have access. This information becomes the filter criteria against which documents are selected or rejected from the result set returned to the issuer.
38-
Let's assume that we have an index of secured files, and each file is accessible by a different set of users.
66+
In the search index, within the field collection, you need one field that contains the group or user identity, similar to the fictitious "security_id" field in the previous example.
67+
68+
1. Add a security field as a `Collection(Edm.String)`. Make sure it has a `filterable` attribute set to `true` so that search results are filtered based on the access the user has. For example, if you set the `group_ids` field to `["group_id1", "group_id2"]` for the document with `file_name` "secured_file_b", only users that belong to group IDs "group_id1" or "group_id2" have read access to the file.
3969

40-
1. Add field `group_ids` (you can choose any name here) as a `Collection(Edm.String)`. Make sure the field has a `filterable` attribute set to `true` so that search results are filtered based on the access the user has. For example, if you set the `group_ids` field to `["group_id1, group_id2"]` for the document with `file_name` "secured_file_b", only users that belong to group IDs "group_id1" or "group_id2" have read access to the file.
41-
42-
Make sure the field's `retrievable` attribute is set to `false` so that it isn't returned as part of the search request.
70+
Set the field's `retrievable` attribute to `false` so that it isn't returned as part of the search request.
4371

44-
2. Also add `file_id` and `file_name` fields for the sake of this example.
72+
1. Indexes require a document key. The "file_id" field satisfies that requirement. Indexes should also contain searchable content. The "file_name" and "file_description" fields represent that in this example.
4573

46-
```JSON
47-
{
74+
```http
75+
POST https://[search service].search.windows.net/indexes?api-version=2020-06-30
76+
{
4877
"name": "securedfiles",
4978
"fields": [
50-
{"name": "file_id", "type": "Edm.String", "key": true, "searchable": false, "sortable": false, "facetable": false},
51-
{"name": "file_name", "type": "Edm.String"},
52-
{"name": "group_ids", "type": "Collection(Edm.String)", "filterable": true, "retrievable": false}
79+
{"name": "file_id", "type": "Edm.String", "key": true, "searchable": false },
80+
{"name": "file_name", "type": "Edm.String", "searchable": true },
81+
{"name": "file_description", "type": "Edm.String", "searchable": true },
82+
{"name": "group_ids", "type": "Collection(Edm.String)", "filterable": true, "retrievable": false }
5383
]
5484
}
55-
```
85+
```
5686

57-
## Pushing data into your index using the REST API
87+
## Push data into your index using the REST API
5888

59-
Issue an HTTP POST request to your index's URL endpoint. The body of the HTTP request is a JSON object containing the documents to be added:
89+
Issue an HTTP POST request to your index's URL endpoint. The body of the HTTP request is a JSON object containing the documents to be indexed:
6090

6191
```http
6292
POST https://[search service].search.windows.net/indexes/securedfiles/docs/index?api-version=2020-06-30
63-
Content-Type: application/json
64-
api-key: [admin key]
6593
```
6694

6795
In the request body, specify the content of your documents:
@@ -73,18 +101,21 @@ In the request body, specify the content of your documents:
73101
"@search.action": "upload",
74102
"file_id": "1",
75103
"file_name": "secured_file_a",
104+
"file_description": "File access is restricted to Human Resources.",
76105
"group_ids": ["group_id1"]
77106
},
78107
{
79108
"@search.action": "upload",
80109
"file_id": "2",
81110
"file_name": "secured_file_b",
111+
"file_description": "File access is restricted to Human Resources and Recruiting.",
82112
"group_ids": ["group_id1", "group_id2"]
83113
},
84114
{
85115
"@search.action": "upload",
86116
"file_id": "3",
87117
"file_name": "secured_file_c",
118+
"file_description": "File access is restricted to Operations and Logistics.",
88119
"group_ids": ["group_id5", "group_id6"]
89120
}
90121
]
@@ -105,15 +136,16 @@ If you need to update an existing document with the list of groups, you can use
105136
}
106137
```
107138

108-
For full details on adding or updating documents, you can read [Edit documents](/rest/api/searchservice/addupdate-or-delete-documents).
139+
For more information on uploading documents, see [Add, Update, or Delete Documents (REST)](/rest/api/searchservice/addupdate-or-delete-documents).
109140

110-
## Apply the security filter
141+
## Apply the security filter in the query
111142

112143
In order to trim documents based on `group_ids` access, you should issue a search query with a `group_ids/any(g:search.in(g, 'group_id1, group_id2,...'))` filter, where 'group_id1, group_id2,...' are the groups to which the search request issuer belongs.
113144

114145
This filter matches all documents for which the `group_ids` field contains one of the given identifiers.
115146
For full details on searching documents using Azure Cognitive Search, you can read [Search Documents](/rest/api/searchservice/search-documents).
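
The filter described above could be assembled from the requesting user's group list along these lines. This is a minimal sketch, not part of the original article: the function names `build_security_filter` and `build_query_body` are hypothetical, the code assumes group identifiers contain no commas or spaces (the default `search.in` delimiters), and it only builds the request body rather than sending it.

```python
import json


def build_security_filter(group_ids: list[str], field: str = "group_ids") -> str:
    """Return an OData filter that matches documents whose `field`
    collection contains at least one of the caller's group IDs."""
    if not group_ids:
        # Assumption for this sketch: refuse to build a filter for a user
        # with no groups, so the caller has to handle that case explicitly.
        raise ValueError("group_ids must contain at least one identifier")
    # OData string literals escape a single quote by doubling it.
    escaped = [group_id.replace("'", "''") for group_id in group_ids]
    joined = ", ".join(escaped)
    return f"{field}/any(g:search.in(g, '{joined}'))"


def build_query_body(search_text: str, group_ids: list[str]) -> dict:
    """Return a JSON-serializable body for a POST search request."""
    return {"search": search_text, "filter": build_security_filter(group_ids)}


body = build_query_body("*", ["group_id1", "group_id2"])
print(json.dumps(body, indent=2))
```

Building the filter on the server side of your application, from identifiers obtained from your identity provider, keeps end users from supplying their own group list.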
116-
Note that this sample shows how to search documents using a POST request.
147+
148+
This sample shows how to set up a query using a POST request.
117149

118150
Issue the HTTP POST request:
119151

@@ -154,7 +186,7 @@ You should get the documents back where `group_ids` contains either "group_id1"
154186

155187
This article described a pattern for filtering results based on user identity and the `search.in()` function. You can use this function to pass in principal identifiers for the requesting user to match against principal identifiers associated with each target document. When a search request is handled, the `search.in` function filters out search results for which none of the user's principals have read access. The principal identifiers can represent things like security groups, roles, or even the user's own identity.
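
To make the matching behavior concrete, the following sketch imitates the trimming logic locally in Python, using the three sample documents from this article. It is only a model of the semantics described above (a document is kept when it shares at least one identifier with the user); the function `trim_results` is hypothetical and doesn't reflect how the service implements filtering.

```python
documents = [
    {"file_id": "1", "file_name": "secured_file_a", "group_ids": ["group_id1"]},
    {"file_id": "2", "file_name": "secured_file_b", "group_ids": ["group_id1", "group_id2"]},
    {"file_id": "3", "file_name": "secured_file_c", "group_ids": ["group_id5", "group_id6"]},
]


def trim_results(docs: list[dict], user_principals: list[str]) -> list[dict]:
    """Keep only documents that share at least one principal with the user."""
    allowed = set(user_principals)
    return [doc for doc in docs if allowed.intersection(doc["group_ids"])]


visible = trim_results(documents, ["group_id1", "group_id2"])
print([doc["file_name"] for doc in visible])  # ['secured_file_a', 'secured_file_b']
```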
156188

157-
For an alternative pattern based on Active Directory, or to revisit other security features, see the following links.
189+
For an alternative pattern based on Azure Active Directory, or to revisit other security features, see the following links.
158190

159191
* [Security filters for trimming results using Active Directory identities](search-security-trimming-for-azure-search-with-aad.md)
160192
* [Security in Azure Cognitive Search](search-security-overview.md)
