This PR updates the IVF queries to return top k distances in addition
to top k neighbor indices.
All of our search functions now use a fixed-size min heap to keep
distances and indices. The heap is templated on the type of distance
and the type of index, which are kept together as a tuple in the heap.
The heap is ordered on the first element of the tuple (the score).
Previously, to create a matrix of top k neighbors, the search functions
sorted the heap and then copied only the neighbor indices into a matrix.
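As a rough illustration of the heap described above, here is a minimal sketch. The class name `fixed_min_heap_sketch` and its member functions are hypothetical and are not the implementation in this repository; the sketch only assumes that the heap keeps the k smallest `(score, index)` tuples and compares on the score.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical sketch, not the class in this repository: retains the k
// smallest (score, index) tuples, compared on the score only.
template <class Score, class Index>
class fixed_min_heap_sketch {
 public:
  using value_type = std::tuple<Score, Index>;

  explicit fixed_min_heap_sketch(std::size_t k) : k_(k) {
    heap_.reserve(k);
  }

  // The largest retained score sits at the front, so it is the entry
  // evicted when a candidate with a smaller score arrives.
  void insert(Score score, Index index) {
    if (k_ == 0) {
      return;
    }
    if (heap_.size() < k_) {
      heap_.emplace_back(score, index);
      std::push_heap(heap_.begin(), heap_.end(), by_score);
    } else if (score < std::get<0>(heap_.front())) {
      std::pop_heap(heap_.begin(), heap_.end(), by_score);
      heap_.back() = value_type{score, index};
      std::push_heap(heap_.begin(), heap_.end(), by_score);
    }
  }

  // Sort a copy by ascending score and return only the indices.
  std::vector<Index> sorted_indices() const {
    std::vector<value_type> sorted(heap_);
    std::sort(sorted.begin(), sorted.end(), by_score);
    std::vector<Index> indices;
    indices.reserve(sorted.size());
    for (auto const& entry : sorted) {
      indices.push_back(std::get<1>(entry));
    }
    return indices;
  }

 private:
  static bool by_score(value_type const& a, value_type const& b) {
    return std::get<0>(a) < std::get<0>(b);
  }

  std::size_t k_;
  std::vector<value_type> heap_;
};
```

Because the score travels with the index inside each tuple, returning distances as well is mostly a matter of copying the first tuple element out alongside the second.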
With the new API, queries can now be called as
```
auto&& [ D, I ] = query(...)
```
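To show how such a call site behaves, here is a small self-contained sketch. `query_sketch` is a hypothetical stand-in (a brute-force search over 1-D points), not one of the query functions in this PR; the only thing it shares with them is the assumed return shape, a tuple of distances and indices that can be unpacked with a C++17 structured binding.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a query function: returns the k nearest
// points to q as (distances, indices), both ordered nearest first.
std::tuple<std::vector<float>, std::vector<std::size_t>> query_sketch(
    std::vector<float> const& db, float q, std::size_t k) {
  std::vector<std::pair<float, std::size_t>> scored;
  scored.reserve(db.size());
  for (std::size_t i = 0; i < db.size(); ++i) {
    scored.emplace_back(std::abs(db[i] - q), i);
  }

  // Only the first k entries need to be in sorted order.
  k = std::min(k, scored.size());
  std::partial_sort(
      scored.begin(),
      scored.begin() + static_cast<std::ptrdiff_t>(k),
      scored.end());

  std::vector<float> distances;
  std::vector<std::size_t> indices;
  for (std::size_t i = 0; i < k; ++i) {
    distances.push_back(scored[i].first);
    indices.push_back(scored[i].second);
  }
  return std::make_tuple(std::move(distances), std::move(indices));
}
```

A caller would then write `auto&& [D, I] = query_sketch(db, 3.0f, 2);`. Binding with `auto&&` extends the lifetime of the returned temporary tuple, so `D` and `I` remain valid for the rest of the enclosing scope.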
To update the query functions to return distances in addition to
indices, this PR:
* Created a new file scoring.h which includes most of the
functionality previously in defs.h
* Deleted defs.h
* Created a function `consolidate` which consolidates distance and
index information from a vector of vectors into a single vector (the
0th vector in the vector of vectors)
* Factored out the code to copy the index information in the heap
structure to a matrix into new functions `get_top_k_from_heap` and two
overloads of `get_top_k`
* Created augmented functions `get_top_k_from_heap_with_scores` and
two overloads of `get_top_k_with_scores` that return a tuple of a
distance matrix and an index matrix.
* Replaced the final logic in all of the ivf queries with
`consolidate` + `get_top_k_with_scores` (not all functions needed
`consolidate`)
* Added extensive new tests in unit_fixed_min_heap.cc, unit_linalg.cc,
and unit_scoring.cc, notably to validate the new `get_top_k` family of
functions.
* Updated the ivf_flat C++ CLI program to use the new query API
* Updated the Python bindings in module.cc to use the new query API
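The `consolidate` + `get_top_k_with_scores` step listed above can be sketched roughly as follows. The names `consolidate_sketch` and `get_top_k_with_scores_sketch`, the type aliases, and the use of nested `std::vector`s in place of heaps and matrices are all assumptions made for illustration; the real functions in scoring.h may differ in signature and data layout.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

using scored = std::tuple<float, std::size_t>;       // (distance, index)
using per_query = std::vector<std::vector<scored>>;  // one list per query

inline bool by_score(scored const& a, scored const& b) {
  return std::get<0>(a) < std::get<0>(b);
}

// Hypothetical sketch: fold every partial result into partials[0],
// keeping at most the k best candidates per query. Assumes every
// partial result holds the same number of queries.
void consolidate_sketch(std::vector<per_query>& partials, std::size_t k) {
  if (partials.empty()) {
    return;
  }
  auto& target = partials[0];
  for (std::size_t q = 0; q < target.size(); ++q) {
    for (std::size_t t = 1; t < partials.size(); ++t) {
      target[q].insert(
          target[q].end(), partials[t][q].begin(), partials[t][q].end());
    }
    std::sort(target[q].begin(), target[q].end(), by_score);
    if (target[q].size() > k) {
      target[q].resize(k);
    }
  }
}

// Hypothetical sketch: split the consolidated candidates into a
// distance "matrix" and an index "matrix", one row per query.
std::tuple<std::vector<std::vector<float>>,
           std::vector<std::vector<std::size_t>>>
get_top_k_with_scores_sketch(per_query const& lists, std::size_t k) {
  std::vector<std::vector<float>> distances(lists.size());
  std::vector<std::vector<std::size_t>> indices(lists.size());
  for (std::size_t q = 0; q < lists.size(); ++q) {
    std::vector<scored> sorted(lists[q]);
    std::sort(sorted.begin(), sorted.end(), by_score);
    std::size_t const n = std::min(k, sorted.size());
    for (std::size_t i = 0; i < n; ++i) {
      distances[q].push_back(std::get<0>(sorted[i]));
      indices[q].push_back(std::get<1>(sorted[i]));
    }
  }
  return std::make_tuple(distances, indices);
}
```

In this sketch, a query function that accumulates results per thread would call `consolidate_sketch(partials, k)` and then `get_top_k_with_scores_sketch(partials[0], k)`. A function with a single result set would skip the first call, which matches the note that not all functions needed `consolidate`.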
In addition, this PR:
* Propagated the BLAS macro to exclude BLAS code from the library so
that the CLIs can be built without BLAS
* Added docstrings to the query functions in ivf/qv.h
* Added an initializer list constructor to `Matrix` to aid in creating
tests
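For readers unfamiliar with the pattern, an initializer list constructor of the kind mentioned in the last item might look like the sketch below. `MatrixSketch` is a hypothetical class and not the `Matrix` in this repository; in particular, the row-major storage and the nested-list-as-rows convention are assumptions.

```cpp
#include <cstddef>
#include <initializer_list>
#include <stdexcept>
#include <vector>

// Hypothetical sketch: a row-major matrix that can be built from a
// nested brace list, where each inner list is one row.
template <class T>
class MatrixSketch {
 public:
  MatrixSketch(std::initializer_list<std::initializer_list<T>> rows)
      : num_rows_(rows.size()),
        num_cols_(rows.size() == 0 ? 0 : rows.begin()->size()) {
    data_.reserve(num_rows_ * num_cols_);
    for (auto const& row : rows) {
      // Reject ragged input so every row has the same length.
      if (row.size() != num_cols_) {
        throw std::invalid_argument("rows must have equal length");
      }
      data_.insert(data_.end(), row.begin(), row.end());
    }
  }

  T const& operator()(std::size_t i, std::size_t j) const {
    return data_[i * num_cols_ + j];
  }

  std::size_t num_rows() const { return num_rows_; }
  std::size_t num_cols() const { return num_cols_; }

 private:
  std::size_t num_rows_;
  std::size_t num_cols_;
  std::vector<T> data_;
};
```

A test can then state its expected values inline, for example `MatrixSketch<float> expected{{1.0f, 2.0f}, {3.0f, 4.0f}};`, instead of filling a matrix element by element.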