Commit 29c5b49

Add some docs explaining filter performance and behavior for HNSW (#110108) (#110142)
1 parent 4860b3c commit 29c5b49

1 file changed, +18 -0 lines changed

docs/reference/search/search-your-data/knn-search.asciidoc

Lines changed: 18 additions & 0 deletions
@@ -278,6 +278,24 @@ post-filtering approach, where the filter is applied **after** the approximate
kNN search completes. Post-filtering has the downside that it sometimes
returns fewer than k results, even when there are enough matching documents.

+[discrete]
+[[approximate-knn-search-and-filtering]]
+==== Approximate kNN search and filtering
+
+Unlike conventional query filtering, where more restrictive filters typically lead to faster queries,
+applying filters in an approximate kNN search with an HNSW index can decrease performance.
+This is because searching the HNSW graph requires additional exploration to obtain the `num_candidates`
+that meet the filter criteria.
+
+To avoid significant performance drawbacks, Lucene implements the following strategies per segment:
+
+* If the filtered document count is less than or equal to num_candidates, the search bypasses the HNSW graph and
+uses a brute force search on the filtered documents.
+
+* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter,
+the search will stop exploring the graph and switch to a brute force search over the filtered documents.
+
+
[discrete]
==== Combine approximate kNN with other features
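
The two per-segment strategies described in the added "Approximate kNN search and filtering" text can be sketched as follows. This is a minimal illustration under stated assumptions, not Lucene's implementation: the graph is a single layer rather than a full HNSW hierarchy, distances are Euclidean, and every name here (`filtered_knn_segment`, `brute_force`, `neighbors`, `allowed`, the returned strategy label) is hypothetical.

```python
import heapq
import math


def brute_force(query, vectors, allowed, k):
    """Exact scan over only the documents that pass the filter."""
    return heapq.nsmallest(
        k, sorted(allowed), key=lambda doc: math.dist(query, vectors[doc])
    )


def filtered_knn_segment(query, vectors, neighbors, entry, allowed, k, num_candidates):
    """Hypothetical per-segment filtered search; returns (doc_ids, strategy_label)."""
    # Strategy 1: the filter matches no more than num_candidates documents,
    # so skip the graph entirely and scan the filtered documents.
    if len(allowed) <= num_candidates:
        return brute_force(query, vectors, allowed, k), "brute_force_upfront"

    visited = {entry}
    frontier = [(math.dist(query, vectors[entry]), entry)]  # min-heap by distance
    results = []  # max-heap (negated distance) of filter-passing documents
    explored = 0

    while frontier:
        dist, node = heapq.heappop(frontier)
        # Usual best-first stop: enough results, and the closest unexplored
        # node is already farther away than the worst collected result.
        if len(results) >= num_candidates and dist > -results[0][0]:
            break
        explored += 1
        # Strategy 2: more nodes explored than documents matching the filter,
        # so an exact scan of the filtered documents is assumed to be cheaper.
        if explored > len(allowed):
            return brute_force(query, vectors, allowed, k), "brute_force_fallback"
        if node in allowed:
            heapq.heappush(results, (-dist, node))
            if len(results) > num_candidates:
                heapq.heappop(results)  # drop the farthest collected result
        for nb in neighbors[node]:
            if nb not in visited:
                visited.add(nb)
                heapq.heappush(frontier, (math.dist(query, vectors[nb]), nb))

    top = sorted((-neg_dist, doc) for neg_dist, doc in results)[:k]
    return [doc for _, doc in top], "graph"
```

The strategy label in the return value exists only to make the chosen path visible in this sketch. A restrictive filter whose matches lie far from the graph entry point is the case that triggers the second fallback, which matches the text's point that tighter filters can cost more graph exploration.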
