Description
@benwtrent 's PR at #14836 introduced an estimator for the number of visited nodes based on k and the number of vectors called expectedVisitedNodes(int k, int size).
Currently, filtered vector search works roughly this way:
- compute a per-leaf k value based on the global k value
- if (filterCost <= perLeafK) do an exact search
- else run an approximate search, stop after visiting filterCost+1 nodes
- if the approximate search stopped early (and is thus incomplete), run an exact search
Could/should we update step 2 above to do an exact search if (filterCost <= expectedVisitedNodes(perLeafK, size))? This should help save approximate searches that are almost certainly going to be stopped early and ignored anyway?