Commit 91977f2

Add some docs explaining filter performance and behavior for HNSW (elastic#110108) (elastic#110135)
1 parent ba99492 commit 91977f2

1 file changed: +18 -0 lines changed

docs/reference/search/search-your-data/knn-search.asciidoc

Lines changed: 18 additions & 0 deletions
@@ -285,6 +285,24 @@ post-filtering approach, where the filter is applied **after** the approximate
 kNN search completes. Post-filtering has the downside that it sometimes
 returns fewer than k results, even when there are enough matching documents.
 
+[discrete]
+[[approximate-knn-search-and-filtering]]
+==== Approximate kNN search and filtering
+
+Unlike conventional query filtering, where more restrictive filters typically lead to faster queries,
+applying filters in an approximate kNN search with an HNSW index can decrease performance.
+This is because searching the HNSW graph requires additional exploration to obtain the `num_candidates`
+that meet the filter criteria.
+
+To avoid significant performance drawbacks, Lucene implements the following strategies per segment:
+
+* If the filtered document count is less than or equal to num_candidates, the search bypasses the HNSW graph and
+uses a brute force search on the filtered documents.
+
+* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter,
+the search will stop exploring the graph and switch to a brute force search over the filtered documents.
+
+
 [discrete]
 ==== Combine approximate kNN with other features
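
For context, the kind of request the added section talks about is an approximate kNN search that carries a `filter` alongside `num_candidates`. The following is a minimal sketch of such a request body, not part of this commit. The helper name `build_filtered_knn_request`, the field names `image-vector` and `file-type`, and the vector values are placeholders chosen for illustration; `field`, `query_vector`, `k`, `num_candidates`, and `filter` are the options of the search API's `knn` clause.

```python
import json


def build_filtered_knn_request(query_vector, k, num_candidates, filter_query):
    """Build a search request body with a filtered `knn` clause.

    The field name below is a hypothetical dense_vector field.
    """
    return {
        "knn": {
            "field": "image-vector",            # placeholder field name
            "query_vector": list(query_vector),
            "k": k,                             # results returned
            "num_candidates": num_candidates,   # candidates gathered per shard
            "filter": filter_query,             # applied during the kNN search
        }
    }


body = build_filtered_knn_request(
    query_vector=[54, 10, -2],
    k=5,
    num_candidates=50,
    filter_query={"term": {"file-type": "png"}},  # placeholder filter
)
print(json.dumps(body, indent=2))
```

The more restrictive the `filter` in a body like this, the more of the graph has to be explored to collect `num_candidates` matching documents, which is the cost the new section describes.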
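
The two per-segment strategies listed in the added section can be read as a small decision procedure. The sketch below is an illustrative model only and is not Lucene's implementation: the function `plan_segment_search`, the strategy labels, and the `nodes_to_visit` parameter (a stand-in for how many graph nodes a traversal would need to explore) are all hypothetical names introduced here.

```python
def plan_segment_search(filtered_doc_count, num_candidates, nodes_to_visit):
    """Model the per-segment choice between graph search and brute force.

    filtered_doc_count: documents in the segment that satisfy the filter
    num_candidates:     candidates requested from the kNN search
    nodes_to_visit:     hypothetical count of nodes the graph traversal
                        would explore before it has enough candidates
    """
    # Strategy 1: few enough matching documents, so skip the graph entirely.
    if filtered_doc_count <= num_candidates:
        return "brute_force"

    # Strategy 2: exploration would exceed the number of matching documents,
    # so abandon the graph and scan the filtered documents instead.
    if nodes_to_visit > filtered_doc_count:
        return "hnsw_then_brute_force"

    # Otherwise the graph search runs to completion.
    return "hnsw"


for args in [(40, 50, 1000), (200, 50, 150), (200, 50, 201)]:
    print(args, plan_segment_search(*args))
```

Under this model, brute force caps the work at roughly one comparison per filtered document, which is why a very restrictive filter does not make the search arbitrarily slow.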