Commit 2a8cb9a

Add some docs explaining filter performance and behavior for HNSW (elastic#110108) (elastic#110143)
1 parent 43f993f commit 2a8cb9a

1 file changed: +18 -0 lines changed

docs/reference/search/search-your-data/knn-search.asciidoc

Lines changed: 18 additions & 0 deletions
@@ -387,6 +387,24 @@ post-filtering approach, where the filter is applied **after** the approximate
kNN search completes. Post-filtering has the downside that it sometimes
returns fewer than k results, even when there are enough matching documents.
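The post-filtering downside can be illustrated with a small toy sketch. It is not Elasticsearch code: `top_k`, `post_filtered_knn`, and the sample documents are hypothetical, and an exact nearest-neighbor ranking stands in for the approximate search to keep the example short.

```python
import math

def top_k(query, docs, k):
    """Return the ids of the k docs closest to the query (exact, for simplicity)."""
    ranked = sorted(docs, key=lambda d: math.dist(query, d["vector"]))
    return [d["id"] for d in ranked[:k]]

def post_filtered_knn(query, docs, k, keep):
    """Run the kNN search first, then drop hits that fail the filter."""
    by_id = {d["id"]: d for d in docs}
    return [i for i in top_k(query, docs, k) if keep(by_id[i])]

# Hypothetical sample data: three png documents exist, but two are far away.
docs = [
    {"id": 1, "vector": (0.0, 0.0), "type": "jpg"},
    {"id": 2, "vector": (0.1, 0.0), "type": "jpg"},
    {"id": 3, "vector": (0.2, 0.0), "type": "png"},
    {"id": 4, "vector": (5.0, 5.0), "type": "png"},
    {"id": 5, "vector": (6.0, 6.0), "type": "png"},
]

hits = post_filtered_knn((0.0, 0.0), docs, k=3, keep=lambda d: d["type"] == "png")
print(hits)  # [3]
```

Only one hit survives even though three `png` documents exist, because the filter runs after the top 3 have already been chosen.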

[discrete]
[[approximate-knn-search-and-filtering]]
==== Approximate kNN search and filtering

Unlike conventional query filtering, where more restrictive filters typically lead to faster queries,
applying filters in an approximate kNN search with an HNSW index can decrease performance.
This is because searching the HNSW graph requires additional exploration to collect `num_candidates`
documents that meet the filter criteria.

To avoid significant performance drawbacks, Lucene implements the following strategies per segment:

* If the filtered document count is less than or equal to `num_candidates`, the search bypasses the HNSW graph and
uses a brute-force search on the filtered documents.

* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter,
the search stops exploring the graph and switches to a brute-force search over the filtered documents.
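The two per-segment strategies above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Lucene's implementation: `search_segment` and `brute_force` are hypothetical helpers, the graph is a single-layer adjacency map rather than a real HNSW structure, and "nodes explored" is approximated by the number of nodes discovered so far.

```python
import math

def brute_force(query, vectors, allowed, k):
    """Exact search: score every filtered doc and keep the k closest."""
    return sorted(allowed, key=lambda d: math.dist(query, vectors[d]))[:k]

def search_segment(query, vectors, neighbors, entry, allowed, num_candidates):
    """Toy per-segment search; returns (strategy, doc ids). Not a Lucene API."""
    # Strategy 1: the filter matches few enough docs, so skip the graph.
    if len(allowed) <= num_candidates:
        return "brute_force", brute_force(query, vectors, allowed, num_candidates)

    visited, frontier, found = {entry}, [entry], []
    while frontier and len(found) < num_candidates:
        # Greedy best-first step: expand the unexplored node closest to the query.
        frontier.sort(key=lambda d: math.dist(query, vectors[d]))
        node = frontier.pop(0)
        if node in allowed:
            found.append(node)
        for nxt in neighbors[node]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(nxt)
        # Strategy 2: more nodes explored than docs matching the filter.
        if len(visited) > len(allowed):
            return "brute_force_fallback", brute_force(query, vectors, allowed, num_candidates)
    return "graph", sorted(found, key=lambda d: math.dist(query, vectors[d]))

# Hypothetical segment: ten 1-D vectors linked in a chain.
vectors = {i: (float(i),) for i in range(10)}
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
query = (0.0,)

few = search_segment(query, vectors, neighbors, 1, {2, 7}, 3)
broad = search_segment(query, vectors, neighbors, 1, set(range(6)), 2)
far = search_segment(query, vectors, neighbors, 0, {7, 8, 9}, 2)
print(few, broad, far)
```

With these toy inputs, the three calls take the brute-force, graph, and fallback paths respectively. The fallback case shows why a restrictive filter can cost more than a broad one: the traversal spends its budget on nodes that fail the filter before giving up on the graph.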

[discrete]
==== Combine approximate kNN with other features
