Commit 4d065ba

Merge pull request #284975 from HeidiSteen/heidist-rag2
[azure search] Vector filters and UUF
2 parents b584f11 + 76f2295 commit 4d065ba

5 files changed

+117
-105
lines changed

articles/search/TOC.yml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -322,7 +322,7 @@
322322
href: vector-search-how-to-index-binary-data.md
323323
- name: Query vectors
324324
href: vector-search-how-to-query.md
325-
- name: Filter vectors
325+
- name: Add filters to a vector query
326326
href: vector-search-filters.md
327327
- name: Vector quotas and limits
328328
href: vector-search-index-size.md

articles/search/query-odata-filter-orderby-syntax.md

Lines changed: 3 additions & 1 deletion
@@ -7,7 +7,7 @@ author: bevloh
77
ms.author: beloh
88
ms.service: cognitive-search
99
ms.topic: conceptual
10-
ms.date: 06/06/2024
10+
ms.date: 08/19/2024
1111
---
1212

1313
# OData language overview for `$filter`, `$orderby`, and `$select` in Azure AI Search
@@ -27,6 +27,8 @@ Once you understand these common concepts, you can continue with the top-level s
2727

2828
The syntax of these expressions is distinct from the [simple](query-simple-syntax.md) or [full](query-lucene-syntax.md) query syntax used in the **search** parameter, although there's some overlap in the syntax for referencing fields.
2929

30+
For examples in other languages such as Python or C#, see the examples in the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repository.
31+
3032
> [!NOTE]
3133
> Terminology in Azure AI Search differs from the [OData standard](https://www.odata.org/documentation/) in a few ways. What we call a **field** in Azure AI Search is called a **property** in OData, and similarly for **field path** versus **property path**. An **index** containing **documents** in Azure AI Search is referred to more generally in OData as an **entity set** containing **entities**. The Azure AI Search terminology is used throughout this reference.
3234

articles/search/vector-search-filters.md

Lines changed: 109 additions & 11 deletions
@@ -8,33 +8,131 @@ ms.author: heidist
88
ms.service: cognitive-search
99
ms.custom:
1010
- ignite-2023
11-
ms.topic: conceptual
12-
ms.date: 08/05/2024
11+
ms.topic: how-to
12+
ms.date: 08/19/2024
1313
---
1414

15-
# Filters in vector queries
15+
# Add a filter in a vector query in Azure AI Search
1616

17-
You can set a vector filter modes on a vector query to specify whether you want filtering before or after query execution.
17+
You can define a vector query request that includes a [filter expression](search-filters.md) to add inclusion or exclusion criteria to your queries. In this article, learn how to:
18+
19+
> [!div class="checklist"]
20+
> - [Define a `filter` expression](#define-a-filter)
21+
> - [Set the `vectorFilterMode` for pre-query or post-query filtering](#set-the-vectorfiltermode)
22+
23+
This article uses REST for illustration. For code samples in other languages, see the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) GitHub repository for end-to-end solutions that include vector queries.
24+
25+
You can also use [Search Explorer](search-get-started-portal-import-vectors.md#check-results) in the Azure portal to query vector content. If you use the JSON view, you can add filters and specify the filter mode.
26+
27+
## How filtering works in a vector query
28+
29+
Filters apply to `filterable` nonvector fields, either a string field or numeric, to include or exclude search documents based on filter criteria. Although a vector field isn't filterable itself, filters can be applied to other fields in the same index, including or excluding the documents that also contain vector fields.
30+
31+
Filters are applied before or after query execution based on the `vectorFilterMode` parameter.
32+
33+
## Define a filter
1834

1935
Filters determine the scope of a vector query. Filters are set on and iterate over nonvector string and numeric fields attributed as `filterable` in the index, but the purpose of a filter determines *what* the vector query executes over: the entire searchable space, or the contents of a search result.
2036

21-
This article provides conceptual information, describing each filter mode and providing guidance on when to use each one.
37+
If you don't have source fields with text or numeric values, check for document metadata, such as LastModified or CreatedBy properties, that might be useful in a metadata filter.
38+
39+
### [**2024-07-01**](#tab/filter-2024-07-01)
40+
41+
[**2024-07-01**](/rest/api/searchservice/search-service-api-versions#2024-07-01) is the stable version for this API. It has:
42+
43+
- `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
44+
- `filter` provides the criteria.
45+
46+
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1536 embeddings, so it's trimmed in this example for readability.
47+
48+
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
49+
50+
```http
51+
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-07-01
52+
Content-Type: application/json
53+
api-key: {{admin-api-key}}
54+
{
55+
"count": true,
56+
"select": "title, content, category",
57+
"filter": "category eq 'Databases'",
58+
"vectorFilterMode": "preFilter",
59+
"vectorQueries": [
60+
{
61+
"kind": "vector",
62+
"vector": [
63+
-0.009154141,
64+
0.018708462,
65+
. . .
66+
-0.02178128,
67+
-0.00086512347
68+
],
69+
"exhaustive": true,
70+
"fields": "contentVector",
71+
"k": 5
72+
}
73+
]
74+
}
75+
```
76+
77+
### [**2024-05-01-preview**](#tab/filter-2024-05-01-preview)
78+
79+
[**2024-05-01-preview**](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) introduces filter options. This version adds:
80+
81+
- `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
82+
- `filter` provides the criteria.
83+
84+
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1536 embeddings, so it's trimmed in this example for readability.
85+
86+
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
87+
88+
```http
89+
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-05-01-preview
90+
Content-Type: application/json
91+
api-key: {{admin-api-key}}
92+
{
93+
"count": true,
94+
"select": "title, content, category",
95+
"filter": "category eq 'Databases'",
96+
"vectorFilterMode": "preFilter",
97+
"vectorQueries": [
98+
{
99+
"kind": "vector",
100+
"vector": [
101+
-0.009154141,
102+
0.018708462,
103+
. . .
104+
-0.02178128,
105+
-0.00086512347
106+
],
107+
"exhaustive": true,
108+
"fields": "contentVector",
109+
"k": 5
110+
}
111+
]
112+
}
113+
```
114+
115+
---
116+
117+
## Set the vectorFilterMode
118+
119+
The vectorFilterMode query parameter determines whether the filter is applied before or after vector query execution.
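
As an illustration outside the diff itself, the following Python sketch builds the same request body as the REST example for either filter mode. The function name `build_filtered_vector_query` and its defaults are hypothetical, the four-element vector is a placeholder for a real query embedding, and `postFilter` is assumed to be the counterpart of the `preFilter` value shown in the request. The sketch only constructs the JSON body; sending it to the service is left out.

```python
import json

# Assumed mode values: "preFilter" appears in the REST example,
# "postFilter" is taken to be its counterpart.
VALID_FILTER_MODES = ("preFilter", "postFilter")


def build_filtered_vector_query(vector, filter_expr, mode="preFilter",
                                k=5, field="contentVector"):
    """Return a request body shaped like the REST example (hypothetical helper)."""
    if mode not in VALID_FILTER_MODES:
        raise ValueError(
            f"vectorFilterMode must be one of {VALID_FILTER_MODES}, got {mode!r}"
        )
    return {
        "count": True,
        "select": "title, content, category",
        "filter": filter_expr,        # OData filter over a filterable nonvector field
        "vectorFilterMode": mode,     # apply the filter before or after the vector query
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(vector),
                "exhaustive": True,
                "fields": field,
                "k": k,
            }
        ],
    }


# Placeholder values only; a real query vector comes from an embedding model.
query_vector = [-0.009154141, 0.018708462, -0.02178128, -0.00086512347]
body = build_filtered_vector_query(
    query_vector, "category eq 'Databases'", mode="postFilter"
)
print(json.dumps(body, indent=2))
```

Validating the mode up front is a design choice of the sketch: it turns a misspelled value into a local error before any request is made.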
22120

23-
For instructions on setting up the vector filter in your query, see [Vector query with filter](vector-search-how-to-query.md#vector-query-with-filter).
121+
### Use prefilter mode
24122

25-
## Prefilter mode
123+
Prefiltering applies filters before query execution, reducing the search surface area over which the vector search algorithm looks for similar content.
26124

27-
Prefiltering applies filters before query execution, reducing the search surface area over which the vector search algorithm looks for similar content. In a vector query, `preFilter` is the default.
125+
In a vector query, `preFilter` is the default.
28126

29127
:::image type="content" source="media/vector-search-filters/pre-filter.svg" alt-text="Diagram of prefilters." border="true" lightbox="media/vector-search-filters/pre-filter.png":::
30128

31-
## Postfilter mode
129+
### Use postfilter mode
32130

33131
Post-filtering applies filters after query execution, narrowing the search results.
34132

35133
:::image type="content" source="media/vector-search-filters/post-filter.svg" alt-text="Diagram of post-filters." border="true" lightbox="media/vector-search-filters/post-filter.png":::
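
The difference between the two modes can be sketched with a toy, brute-force model in Python. This is a conceptual illustration only, not how the search service is implemented: the documents, the two-dimensional vectors, and the helper names (`prefilter_search`, `postfilter_search`) are all made up for the example. It shows why postfiltering can return fewer than `k` results while prefiltering fills `k` from the filtered set.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical documents with a filterable field and a tiny vector field.
DOCS = [
    {"id": "1", "category": "Databases", "vec": [0.2, 1.0]},
    {"id": "2", "category": "Storage", "vec": [1.0, 0.1]},
    {"id": "3", "category": "Storage", "vec": [0.5, 0.9]},
    {"id": "4", "category": "Databases", "vec": [0.8, 0.6]},
    {"id": "5", "category": "Databases", "vec": [0.1, 1.0]},
]


def top_k(docs, query, k):
    """Exhaustive nearest-neighbor search: rank every document by similarity."""
    ranked = sorted(docs, key=lambda d: cosine(d["vec"], query), reverse=True)
    return ranked[:k]


def prefilter_search(docs, query, k, predicate):
    """Filter first, then run the vector search over the reduced set."""
    return top_k([d for d in docs if predicate(d)], query, k)


def postfilter_search(docs, query, k, predicate):
    """Run the vector search first, then filter the k results."""
    return [d for d in top_k(docs, query, k) if predicate(d)]


def is_database(doc):
    return doc["category"] == "Databases"


query = [1.0, 0.0]
pre = [d["id"] for d in prefilter_search(DOCS, query, 2, is_database)]
post = [d["id"] for d in postfilter_search(DOCS, query, 2, is_database)]
print(pre)   # ['4', '1']
print(post)  # ['4']
```

With `k` set to 2, the postfilter path loses one of its two nearest neighbors to the filter, so only one document comes back.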
36134

37-
## Benchmark testing of vector filter modes
135+
### Benchmark testing of vector filter modes
38136

39137
To understand the conditions under which one filter mode performs better than the other, we ran a series of tests to evaluate query outcomes over small, medium, and large indexes.
40138

@@ -93,7 +191,7 @@ Outcomes were measured in Queries Per Second (QPS).
93191
+ Postfiltering is for customers who:
94192

95193
+ value speed over selection (postfiltering can return fewer than `k` results)
96-
+ use filters that are not overly selective
194+
+ use filters that aren't overly selective
97195
+ have indexes of sufficient size such that prefiltering performance is unacceptable
98196

99197
### Details

articles/search/vector-search-how-to-create-index.md

Lines changed: 3 additions & 3 deletions
@@ -12,9 +12,9 @@ ms.date: 08/05/2024
1212

1313
# Create a vector index
1414

15-
In Azure AI Search, a *vector store* has an index schema that defines vector and nonvector fields, a vector configuration for algorithms that create and compress the embedding space, and settings on vector field definitions that are used in query requests. The [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) API creates the vector store.
15+
In Azure AI Search, a *vector store* has an index schema that defines vector and nonvector fields, a vector configuration for algorithms that create and compress the embedding space, and settings on vector field definitions that are used in query requests.
1616

17-
Follow these steps to index vector data:
17+
The [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) API creates the vector store. Follow these steps to index vector data:
1818

1919
> [!div class="checklist"]
2020
> + Define a schema with vector algorithms and optional compression
@@ -24,7 +24,7 @@ Follow these steps to index vector data:
2424
This article explains the workflow and uses REST for illustration. Once you understand the basic workflow, continue with the Azure SDK code samples in the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repository for guidance on using these features in test and production code.
2525

2626
> [!TIP]
27-
> Use the Azure portal to [create a vector index](search-get-started-portal-import-vectors.md) and try out integrated vectorization.
27+
> Use the Azure portal to [create a vector index](search-get-started-portal-import-vectors.md) and try integrated data chunking and vectorization.
2828
2929
## Prerequisites
3030

articles/search/vector-search-how-to-query.md

Lines changed: 1 addition & 89 deletions
@@ -9,7 +9,7 @@ ms.service: cognitive-search
99
ms.custom:
1010
- build-2024
1111
ms.topic: how-to
12-
ms.date: 08/05/2024
12+
ms.date: 08/19/2024
1313
---
1414

1515
# Create a vector query in Azure AI Search
@@ -18,7 +18,6 @@ In Azure AI Search, if you have a [vector index](vector-search-how-to-create-ind
1818

1919
> [!div class="checklist"]
2020
> + [Query vector fields](#vector-query-request)
21-
> + [Filter a vector query](#vector-query-with-filter)
2221
> + [Query multiple vector fields at once](#multiple-vector-fields)
2322
> + [Set vector weights](#vector-weighting)
2423
> + [Query with integrated vectorization](#query-with-integrated-vectorization)
@@ -256,93 +255,6 @@ If you do want vector fields in the result, here's an example of the response st
256255

257256
+ Fields in search results are either all `retrievable` fields, or fields in a `select` clause. During vector query execution, the match is made on vector data alone. However, a response can include any `retrievable` field in an index. Because there's no facility for decoding a vector field result, the inclusion of nonvector text fields is helpful for their human readable values.
258257

259-
## Vector query with filter
260-
261-
A query request can include a vector query and a [filter expression](search-filters.md). Filters apply to `filterable` nonvector fields, either a string field or numeric, and are useful for including or excluding search documents based on filter criteria. Although a vector field isn't filterable itself, filters can be applied to other fields in the same index.
262-
263-
You can apply filters as exclusion criteria before the query executes, or after query execution to filter search results. For a comparison of each mode and the expected performance based on index size, see [Filters in vector queries](vector-search-filters.md).
264-
265-
> [!TIP]
266-
> If you don't have source fields with text or numeric values, check for document metadata, such as LastModified or CreatedBy properties, that might be useful in a metadata filter.
267-
268-
### [**2024-07-01**](#tab/filter-2024-07-01)
269-
270-
[**2024-07-01**](/rest/api/searchservice/search-service-api-versions#2024-07-01) is the stable version for this API. It has:
271-
272-
+ `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
273-
+ `filter` provides the criteria.
274-
275-
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1536 embeddings, so it's trimmed in this example for readability.
276-
277-
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
278-
279-
```http
280-
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-07-01
281-
Content-Type: application/json
282-
api-key: {{admin-api-key}}
283-
{
284-
"count": true,
285-
"select": "title, content, category",
286-
"filter": "category eq 'Databases'",
287-
"vectorFilterMode": "preFilter",
288-
"vectorQueries": [
289-
{
290-
"kind": "vector",
291-
"vector": [
292-
-0.009154141,
293-
0.018708462,
294-
. . .
295-
-0.02178128,
296-
-0.00086512347
297-
],
298-
"exhaustive": true,
299-
"fields": "contentVector",
300-
"k": 5
301-
}
302-
]
303-
}
304-
```
305-
306-
### [**2024-05-01-preview**](#tab/filter-2024-05-01-preview)
307-
308-
[**2024-05-01-preview**](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) introduces filter options. This version adds:
309-
310-
+ `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
311-
+ `filter` provides the criteria.
312-
313-
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1536 embeddings, so it's trimmed in this example for readability.
314-
315-
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
316-
317-
```http
318-
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-05-01-preview
319-
Content-Type: application/json
320-
api-key: {{admin-api-key}}
321-
{
322-
"count": true,
323-
"select": "title, content, category",
324-
"filter": "category eq 'Databases'",
325-
"vectorFilterMode": "preFilter",
326-
"vectorQueries": [
327-
{
328-
"kind": "vector",
329-
"vector": [
330-
-0.009154141,
331-
0.018708462,
332-
. . .
333-
-0.02178128,
334-
-0.00086512347
335-
],
336-
"exhaustive": true,
337-
"fields": "contentVector",
338-
"k": 5
339-
}
340-
]
341-
}
342-
```
343-
344-
---
345-
346258
## Multiple vector fields
347259

348260
You can set the "vectorQueries.fields" property to multiple vector fields. The vector query executes against each vector field that you provide in the `fields` list. When querying multiple vector fields, make sure each one contains embeddings from the same embedding model, and that the query is also generated from the same embedding model.
