Commit 1d4a9f0

Add some docs explaining filter performance and behavior for HNSW (#110108) (#110145)
1 parent 8290743 commit 1d4a9f0

1 file changed: +18 -0 lines changed

docs/reference/search/search-your-data/knn-search.asciidoc

Lines changed: 18 additions & 0 deletions
@@ -410,6 +410,24 @@ post-filtering approach, where the filter is applied **after** the approximate
 kNN search completes. Post-filtering has the downside that it sometimes
 returns fewer than k results, even when there are enough matching documents.

+[discrete]
+[[approximate-knn-search-and-filtering]]
+==== Approximate kNN search and filtering
+
+Unlike conventional query filtering, where more restrictive filters typically lead to faster queries,
+applying filters in an approximate kNN search with an HNSW index can decrease performance.
+This is because searching the HNSW graph requires additional exploration to obtain the `num_candidates`
+that meet the filter criteria.
+
+To avoid significant performance drawbacks, Lucene implements the following strategies per segment:
+
+* If the filtered document count is less than or equal to num_candidates, the search bypasses the HNSW graph and
+uses a brute force search on the filtered documents.
+
+* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter,
+the search will stop exploring the graph and switch to a brute force search over the filtered documents.
+
+
 [discrete]
 ==== Combine approximate kNN with other features

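The two per-segment strategies added in this diff can be illustrated with a small sketch. This is not Lucene's implementation: it assumes a toy single-layer neighbor graph in place of a real HNSW index, a simplified best-first traversal, and Euclidean distance. The names `filtered_knn`, `exact_search`, `allowed`, `neighbors`, and the returned strategy labels are all hypothetical and exist only for illustration.

```python
import heapq
import math


def exact_search(query, vectors, allowed, k):
    """Brute force: score every document that passes the filter."""
    return heapq.nsmallest(k, allowed, key=lambda doc: math.dist(query, vectors[doc]))


def filtered_knn(query, vectors, neighbors, allowed, k, num_candidates, entry=0):
    """Return (strategy, top-k doc ids) for one segment (illustrative only)."""
    # Strategy 1: the filter leaves no more than num_candidates documents,
    # so skip the graph and score the filtered documents directly.
    if len(allowed) <= num_candidates:
        return "exact", exact_search(query, vectors, allowed, k)

    visited = {entry}
    frontier = [(math.dist(query, vectors[entry]), entry)]
    matches = []  # (distance, doc) pairs that satisfy the filter

    # Simplified best-first walk over the toy graph.
    while frontier and len(matches) < num_candidates:
        dist, node = heapq.heappop(frontier)
        if node in allowed:
            matches.append((dist, node))
        # Strategy 2: more nodes explored than documents passing the filter,
        # so give up on the graph and fall back to brute force.
        if len(visited) > len(allowed):
            return "exact_fallback", exact_search(query, vectors, allowed, k)
        for nxt in neighbors[node]:
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (math.dist(query, vectors[nxt]), nxt))

    return "graph", [doc for _, doc in heapq.nsmallest(k, matches)]


# Toy segment: ten 1-D vectors connected in a chain (assumed data).
vectors = [(float(i),) for i in range(10)]
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}

small_filter = filtered_knn((7.2,), vectors, neighbors, {2, 8}, k=1, num_candidates=3)
far_filter = filtered_knn((7.2,), vectors, neighbors, {6, 7, 8, 9}, k=1, num_candidates=2)
broad_filter = filtered_knn((1.2,), vectors, neighbors, set(range(8)), k=1, num_candidates=2)
```

In this toy setup, `small_filter` takes the first strategy because only two documents pass the filter, `far_filter` starts on the graph but switches to brute force once it has explored more nodes than there are matching documents, and `broad_filter` completes on the graph.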