Commit 913cc1f

Vector filters and UUF
1 parent 39a9e92 commit 913cc1f

File tree

4 files changed

+114
-103
lines changed

articles/search/query-odata-filter-orderby-syntax.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ author: bevloh
77
ms.author: beloh
88
ms.service: cognitive-search
99
ms.topic: conceptual
10-
ms.date: 06/06/2024
10+
ms.date: 08/19/2024
1111
---
1212

1313
# OData language overview for `$filter`, `$orderby`, and `$select` in Azure AI Search
@@ -27,6 +27,8 @@ Once you understand these common concepts, you can continue with the top-level s
2727

2828
The syntax of these expressions is distinct from the [simple](query-simple-syntax.md) or [full](query-lucene-syntax.md) query syntax used in the **search** parameter, although there's some overlap in the syntax for referencing fields.
2929

30+
For examples in other languages such as Python or C#, see the examples in the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repository.
31+
3032
> [!NOTE]
3133
> Terminology in Azure AI Search differs from the [OData standard](https://www.odata.org/documentation/) in a few ways. What we call a **field** in Azure AI Search is called a **property** in OData, and similarly for **field path** versus **property path**. An **index** containing **documents** in Azure AI Search is referred to more generally in OData as an **entity set** containing **entities**. The Azure AI Search terminology is used throughout this reference.
3234

articles/search/vector-search-filters.md

Lines changed: 107 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -8,33 +8,130 @@ ms.author: heidist
88
ms.service: cognitive-search
99
ms.custom:
1010
- ignite-2023
11-
ms.topic: conceptual
12-
ms.date: 08/05/2024
11+
ms.topic: how-to
12+
ms.date: 08/19/2024
1313
---
1414

15-
# Filters in vector queries
15+
# Add a filter in a vector query in Azure AI Search
1616

17-
You can set a vector filter modes on a vector query to specify whether you want filtering before or after query execution.
17+
You can define a vector query request that includes a [filter expression](search-filters.md) to add inclusion or exclusion criteria to your queries.
18+
19+
In this article, learn how to:
20+
21+
- Define a `filter` expression
22+
- Set the `vectorFilterMode` to control whether the filter executes before or after the query.
23+
24+
This article uses REST for illustration. For code samples in other languages, see the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) GitHub repository for end-to-end solutions that include vector queries.
25+
26+
You can also use [Search Explorer](search-get-started-portal-import-vectors.md#check-results) in the Azure portal to query vector content. If you use the JSON view, you can add filters and specify the filter mode.
27+
28+
## How filtering works in a vector query
29+
30+
Filters apply to `filterable` nonvector fields, either string or numeric, and are useful for including or excluding search documents. Although a vector field isn't filterable itself, you can apply filters to other fields in the same index to include or exclude documents that also contain vector fields.
31+
32+
Filters can be applied before the query executes, or after query execution to filter search results. Set the `vectorFilterMode` parameter to choose between the two modes.
33+
34+
## Define a filter
1835

1936
Filters determine the scope of a vector query. Filters are set on and iterate over nonvector string and numeric fields attributed as `filterable` in the index, but the purpose of a filter determines *what* the vector query executes over: the entire searchable space, or the contents of a search result.
2037

21-
This article provides conceptual information, describing each filter mode and providing guidance on when to use each one.
38+
If you don't have source fields with text or numeric values, check for document metadata, such as LastModified or CreatedBy properties, that might be useful in a metadata filter.
39+
40+
### [**2024-07-01**](#tab/filter-2024-07-01)
41+
42+
[**2024-07-01**](/rest/api/searchservice/search-service-api-versions#2024-07-01) is the stable version for this API. It has:
43+
44+
- `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
45+
- `filter` provides the criteria.
46+
47+
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1,536 dimensions, so it's trimmed in this example for readability.
48+
49+
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
50+
51+
```http
52+
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-07-01
53+
Content-Type: application/json
54+
api-key: {{admin-api-key}}
55+
{
56+
"count": true,
57+
"select": "title, content, category",
58+
"filter": "category eq 'Databases'",
59+
"vectorFilterMode": "preFilter",
60+
"vectorQueries": [
61+
{
62+
"kind": "vector",
63+
"vector": [
64+
-0.009154141,
65+
0.018708462,
66+
. . .
67+
-0.02178128,
68+
-0.00086512347
69+
],
70+
"exhaustive": true,
71+
"fields": "contentVector",
72+
"k": 5
73+
}
74+
]
75+
}
76+
```
77+
78+
### [**2024-05-01-preview**](#tab/filter-2024-05-01-preview)
79+
80+
[**2024-05-01-preview**](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) introduces filter options. This version adds:
81+
82+
- `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
83+
- `filter` provides the criteria.
84+
85+
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1,536 dimensions, so it's trimmed in this example for readability.
86+
87+
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
88+
89+
```http
90+
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-05-01-preview
91+
Content-Type: application/json
92+
api-key: {{admin-api-key}}
93+
{
94+
"count": true,
95+
"select": "title, content, category",
96+
"filter": "category eq 'Databases'",
97+
"vectorFilterMode": "preFilter",
98+
"vectorQueries": [
99+
{
100+
"kind": "vector",
101+
"vector": [
102+
-0.009154141,
103+
0.018708462,
104+
. . .
105+
-0.02178128,
106+
-0.00086512347
107+
],
108+
"exhaustive": true,
109+
"fields": "contentVector",
110+
"k": 5
111+
}
112+
]
113+
}
114+
```
115+
116+
---
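The REST request above can also be assembled programmatically. The following is a minimal Python sketch that uses only the standard library and mirrors the request body and URL pattern shown in the REST example. The helper names (`build_search_body`, `build_search_request`), the service name, index name, and key are hypothetical placeholders, and the two-element vector is a stand-in for a real embedding that must match the dimensions of the `contentVector` field. The sketch builds the request but doesn't send it.

```python
import json
import urllib.request

# The two filter modes named in this article.
VALID_FILTER_MODES = {"preFilter", "postFilter"}


def build_search_body(vector, filter_expr, filter_mode="preFilter", k=5,
                      fields="contentVector"):
    """Return the JSON body for a filtered vector query as a dict (hypothetical helper)."""
    if filter_mode not in VALID_FILTER_MODES:
        raise ValueError(f"filter_mode must be one of {sorted(VALID_FILTER_MODES)}")
    return {
        "count": True,
        "select": "title, content, category",
        "filter": filter_expr,
        "vectorFilterMode": filter_mode,
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(vector),
                "exhaustive": True,
                "fields": fields,
                "k": k,
            }
        ],
    }


def build_search_request(service_name, index_name, api_key, body,
                         api_version="2024-07-01"):
    """Wrap the body in a POST request that follows the URL pattern of the REST example."""
    url = (f"https://{service_name}.search.windows.net/indexes/{index_name}"
           f"/docs/search?api-version={api_version}")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# Placeholder values: replace with a real embedding, service, index, and key.
body = build_search_body([-0.009154141, 0.018708462], "category eq 'Databases'")
request = build_search_request("my-service", "my-index", "my-api-key", body)
# To send the query, pass the request to urllib.request.urlopen(request).
```

Validating the mode before building the body catches a misspelled value locally instead of relying on an error response from the service.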
117+
118+
## Set the vectorFilterMode
22119

23-
For instructions on setting up the vector filter in your query, see [Vector query with filter](vector-search-how-to-query.md#vector-query-with-filter).
120+
The `vectorFilterMode` query parameter determines whether the filter is applied before or after vector query execution.
24121

25-
## Prefilter mode
122+
### Use prefilter mode
26123

27124
Prefiltering applies filters before query execution, reducing the search surface area over which the vector search algorithm looks for similar content. In a vector query, `preFilter` is the default.
28125

29126
:::image type="content" source="media/vector-search-filters/pre-filter.svg" alt-text="Diagram of prefilters." border="true" lightbox="media/vector-search-filters/pre-filter.png":::
30127

31-
## Postfilter mode
128+
### Use postfilter mode
32129

33130
Postfiltering applies filters after query execution, narrowing the search results.
34131

35132
:::image type="content" source="media/vector-search-filters/post-filter.svg" alt-text="Diagram of post-filters." border="true" lightbox="media/vector-search-filters/post-filter.png":::
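To make the difference between the two modes concrete, here's a toy Python sketch. It isn't how Azure AI Search executes queries internally: it assumes a brute-force cosine-similarity ranking over a four-document in-memory list, and all names and data are hypothetical. It illustrates only the ordering of the two steps, and why postfiltering can return fewer than `k` results.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def top_k(docs, query, k):
    """Brute-force nearest neighbors: rank every document, keep the best k."""
    ranked = sorted(docs, key=lambda d: cosine(d["vector"], query), reverse=True)
    return ranked[:k]


def prefilter_search(docs, query, k, predicate):
    """Filter first, then run the vector search over the reduced set."""
    return top_k([d for d in docs if predicate(d)], query, k)


def postfilter_search(docs, query, k, predicate):
    """Run the vector search first, then filter the k results."""
    return [d for d in top_k(docs, query, k) if predicate(d)]


# Hypothetical toy data: two-dimensional vectors stand in for real embeddings.
docs = [
    {"id": "1", "category": "Databases", "vector": [1.0, 0.0]},
    {"id": "2", "category": "Compute", "vector": [0.9, 0.1]},
    {"id": "3", "category": "Compute", "vector": [0.8, 0.2]},
    {"id": "4", "category": "Databases", "vector": [0.1, 0.9]},
]
query = [1.0, 0.0]


def is_database(doc):
    return doc["category"] == "Databases"


pre_ids = [d["id"] for d in prefilter_search(docs, query, 2, is_database)]
post_ids = [d["id"] for d in postfilter_search(docs, query, 2, is_database)]
print(pre_ids)   # ['1', '4']
print(post_ids)  # ['1']
```

With `k` set to 2, prefiltering returns two matching documents because the search runs only over the `Databases` documents. Postfiltering returns one, because the second-nearest neighbor overall is a `Compute` document that the filter removes after the search.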
36133

37-
## Benchmark testing of vector filter modes
134+
### Benchmark testing of vector filter modes
38135

39136
To understand the conditions under which one filter mode performs better than the other, we ran a series of tests to evaluate query outcomes over small, medium, and large indexes.
40137

@@ -93,7 +190,7 @@ Outcomes were measured in Queries Per Second (QPS).
93190
+ Postfiltering is for customers who:
94191

95192
+ value speed over selection (postfiltering can return fewer than `k` results)
96-
+ use filters that are not overly selective
193+
+ use filters that aren't overly selective
97194
+ have indexes of sufficient size such that prefiltering performance is unacceptable
98195

99196
### Details

articles/search/vector-search-how-to-create-index.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,9 @@ ms.date: 08/05/2024
1212

1313
# Create a vector index
1414

15-
In Azure AI Search, a *vector store* has an index schema that defines vector and nonvector fields, a vector configuration for algorithms that create and compress the embedding space, and settings on vector field definitions that are used in query requests. The [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) API creates the vector store.
15+
In Azure AI Search, a *vector store* has an index schema that defines vector and nonvector fields, a vector configuration for algorithms that create and compress the embedding space, and settings on vector field definitions that are used in query requests.
1616

17-
Follow these steps to index vector data:
17+
The [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) API creates the vector store. Follow these steps to index vector data:
1818

1919
> [!div class="checklist"]
2020
> + Define a schema with vector algorithms and optional compression
@@ -24,7 +24,7 @@ Follow these steps to index vector data:
2424
This article explains the workflow and uses REST for illustration. Once you understand the basic workflow, continue with the Azure SDK code samples in the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) repository for guidance on using these features in test and production code.
2525

2626
> [!TIP]
27-
> Use the Azure portal to [create a vector index](search-get-started-portal-import-vectors.md) and try out integrated vectorization.
27+
> Use the Azure portal to [create a vector index](search-get-started-portal-import-vectors.md) and try integrated data chunking and vectorization.
2828
2929
## Prerequisites
3030

articles/search/vector-search-how-to-query.md

Lines changed: 1 addition & 89 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ ms.service: cognitive-search
99
ms.custom:
1010
- build-2024
1111
ms.topic: how-to
12-
ms.date: 08/05/2024
12+
ms.date: 08/19/2024
1313
---
1414

1515
# Create a vector query in Azure AI Search
@@ -18,7 +18,6 @@ In Azure AI Search, if you have a [vector index](vector-search-how-to-create-ind
1818

1919
> [!div class="checklist"]
2020
> + [Query vector fields](#vector-query-request)
21-
> + [Filter a vector query](#vector-query-with-filter)
2221
> + [Query multiple vector fields at once](#multiple-vector-fields)
2322
> + [Set vector weights](#vector-weighting)
2423
> + [Query with integrated vectorization](#query-with-integrated-vectorization)
@@ -256,93 +255,6 @@ If you do want vector fields in the result, here's an example of the response st
256255

257256
+ Fields in search results are either all `retrievable` fields, or fields in a `select` clause. During vector query execution, the match is made on vector data alone. However, a response can include any `retrievable` field in an index. Because there's no facility for decoding a vector field result, the inclusion of nonvector text fields is helpful for their human readable values.
258257

259-
## Vector query with filter
260-
261-
A query request can include a vector query and a [filter expression](search-filters.md). Filters apply to `filterable` nonvector fields, either a string field or numeric, and are useful for including or excluding search documents based on filter criteria. Although a vector field isn't filterable itself, filters can be applied to other fields in the same index.
262-
263-
You can apply filters as exclusion criteria before the query executes, or after query execution to filter search results. For a comparison of each mode and the expected performance based on index size, see [Filters in vector queries](vector-search-filters.md).
264-
265-
> [!TIP]
266-
> If you don't have source fields with text or numeric values, check for document metadata, such as LastModified or CreatedBy properties, that might be useful in a metadata filter.
267-
268-
### [**2024-07-01**](#tab/filter-2024-07-01)
269-
270-
[**2024-07-01**](/rest/api/searchservice/search-service-api-versions#2024-07-01) is the stable version for this API. It has:
271-
272-
+ `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
273-
+ `filter` provides the criteria.
274-
275-
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1536 embeddings, so it's trimmed in this example for readability.
276-
277-
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
278-
279-
```http
280-
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-07-01
281-
Content-Type: application/json
282-
api-key: {{admin-api-key}}
283-
{
284-
"count": true,
285-
"select": "title, content, category",
286-
"filter": "category eq 'Databases'",
287-
"vectorFilterMode": "preFilter",
288-
"vectorQueries": [
289-
{
290-
"kind": "vector",
291-
"vector": [
292-
-0.009154141,
293-
0.018708462,
294-
. . .
295-
-0.02178128,
296-
-0.00086512347
297-
],
298-
"exhaustive": true,
299-
"fields": "contentVector",
300-
"k": 5
301-
}
302-
]
303-
}
304-
```
305-
306-
### [**2024-05-01-preview**](#tab/filter-2024-05-01-preview)
307-
308-
[**2024-05-01-preview**](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) introduces filter options. This version adds:
309-
310-
+ `vectorFilterMode` for prefilter (default) or postfilter [filtering modes](vector-search-filters.md).
311-
+ `filter` provides the criteria.
312-
313-
In the following example, the vector is a representation of this query string: "what Azure services support full text search". The query targets the `contentVector` field. The actual vector has 1536 embeddings, so it's trimmed in this example for readability.
314-
315-
The filter criteria are applied to a filterable text field (`category` in this example) before the search engine executes the vector query.
316-
317-
```http
318-
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-05-01-preview
319-
Content-Type: application/json
320-
api-key: {{admin-api-key}}
321-
{
322-
"count": true,
323-
"select": "title, content, category",
324-
"filter": "category eq 'Databases'",
325-
"vectorFilterMode": "preFilter",
326-
"vectorQueries": [
327-
{
328-
"kind": "vector",
329-
"vector": [
330-
-0.009154141,
331-
0.018708462,
332-
. . .
333-
-0.02178128,
334-
-0.00086512347
335-
],
336-
"exhaustive": true,
337-
"fields": "contentVector",
338-
"k": 5
339-
}
340-
]
341-
}
342-
```
343-
344-
---
345-
346258
## Multiple vector fields
347259

348260
You can set the "vectorQueries.fields" property to multiple vector fields. The vector query executes against each vector field that you provide in the `fields` list. When querying multiple vector fields, make sure each one contains embeddings from the same embedding model, and that the query is also generated from the same embedding model.
