This commit exposes a new vector_search_rerank_multiplier session setting that
controls how many vectors will be reranked at the end of a search. Empirical
testing shows that for difficult datasets (e.g. GloVe and CLIP), the number of
vectors that need to be reranked is proportional to:

  log2(search_beam_size) * log2(top-k-results)

We multiply this number by the vector_search_rerank_multiplier setting, which is
set to 50 by default (also derived empirically). This formula will rerank enough
vectors to allow us to achieve high accuracies while still providing a safeguard
that prevents runaway evaluation in edge cases.
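
As a rough illustration of the formula above, here is a minimal Go sketch of
how the rerank limit could be computed. The function name maxRerankCount, the
clamping of inputs to at least 2, and the rounding up are assumptions made for
this example; they are not taken from the actual implementation.

```go
package main

import (
	"fmt"
	"math"
)

// maxRerankCount is a hypothetical helper (not the actual CockroachDB code).
// It returns multiplier * log2(beamSize) * log2(topK), rounded up.
func maxRerankCount(beamSize, topK, multiplier int) int {
	// Assumption: clamp both inputs to at least 2 so that each log2 factor is
	// at least 1 and the limit never collapses to zero for tiny searches.
	b := math.Max(float64(beamSize), 2)
	k := math.Max(float64(topK), 2)
	return int(math.Ceil(float64(multiplier) * math.Log2(b) * math.Log2(k)))
}

func main() {
	// Beam size 512, top-16 results, default multiplier 50:
	// 50 * log2(512) * log2(16) = 50 * 9 * 4 = 1800.
	fmt.Println(maxRerankCount(512, 16, 50)) // prints 1800
}
```

Because the limit grows with the logarithm of both inputs, quadrupling the beam
size from 512 to 2048 only raises it from 1800 to 2200 in this sketch.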
Less difficult datasets need to rerank far fewer vectors, but it doesn't hurt
to have a limit that's too high, since RaBitQ error bounds let us avoid actually
evaluating unneeded vectors.
This commit also increases the max value of vector_search_beam_size from 512 to
2048, since some datasets need it.
Epic: CRDB-42943
Release note: None