Commit c7fe6ef

Add some docs explaining filter performance and behavior for HNSW (elastic#110108) (elastic#110139)
1 parent d566995 commit c7fe6ef

1 file changed: +18 -0 lines changed

docs/reference/search/search-your-data/knn-search.asciidoc

Lines changed: 18 additions & 0 deletions
@@ -284,6 +284,24 @@ post-filtering approach, where the filter is applied **after** the approximate
 kNN search completes. Post-filtering has the downside that it sometimes
 returns fewer than k results, even when there are enough matching documents.

+[discrete]
+[[approximate-knn-search-and-filtering]]
+==== Approximate kNN search and filtering
+
+Unlike conventional query filtering, where more restrictive filters typically lead to faster queries,
+applying filters in an approximate kNN search with an HNSW index can decrease performance.
+This is because the search must explore more of the HNSW graph to collect `num_candidates`
+documents that meet the filter criteria.
+
+To avoid significant performance drawbacks, Lucene implements the following strategies per segment:
+
+* If the filtered document count is less than or equal to `num_candidates`, the search bypasses the HNSW graph and
+  uses a brute-force search on the filtered documents.
+
+* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter,
+  the search stops exploring the graph and switches to a brute-force search over the filtered documents.
+
+
 [discrete]
 ==== Combine approximate kNN with other features
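The two per-segment strategies described in the added section can be illustrated with a small decision sketch. This is a toy model, not Lucene's implementation: the function name `plan_segment_search`, the strategy labels, and the `nodes_to_explore` parameter (how many graph nodes a search would have to visit to collect enough matching candidates) are all hypothetical names introduced for illustration.

```python
def plan_segment_search(filtered_doc_count: int,
                        num_candidates: int,
                        nodes_to_explore: int) -> str:
    """Pick a search strategy for one segment (illustrative only).

    filtered_doc_count: documents in the segment that pass the filter.
    num_candidates:     candidates requested per segment.
    nodes_to_explore:   hypothetical number of HNSW nodes the graph search
                        would need to visit to collect num_candidates matches.
    """
    # Rule 1: so few documents pass the filter that scoring all of them
    # directly is no more work than gathering num_candidates from the graph.
    if filtered_doc_count <= num_candidates:
        return "brute_force"

    # Rule 2: the graph search has visited more nodes than there are
    # matching documents, so it gives up and scores the matches directly.
    if nodes_to_explore > filtered_doc_count:
        return "hnsw_then_brute_force"

    # Otherwise the graph search finishes within its exploration budget.
    return "hnsw"


if __name__ == "__main__":
    print(plan_segment_search(50, 100, 1_000))       # very restrictive filter
    print(plan_segment_search(500, 100, 2_000))      # restrictive filter
    print(plan_segment_search(500_000, 100, 800))    # permissive filter
```

In this model the filtered document count acts as an upper bound on graph exploration, so the cost per segment never exceeds roughly one pass over the matching documents plus the abandoned graph traversal.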
