apis/python/src/tiledb/vector_search/ivf_pq_index.py
Lines changed: 36 additions & 13 deletions
@@ -37,6 +37,12 @@ class IVFPQIndex(index.Index):
         If not provided, all index data are loaded in main memory.
         Otherwise, no index data are loaded in main memory and this memory budget is
         applied during queries.
+    preload_k_factor_vectors: bool
+        When using `k_factor` in a query, we first query for `k_factor * k` pq-encoded vectors,
+        and then do a re-ranking step using the original input vectors for the top `k` vectors.
+        If `True`, we will load all the input vectors in main memory. This can only be used with
+        `memory_budget` set to `-1`, and is useful when the input vectors are small enough to fit in
+        memory and you want to speed up re-ranking.
     open_for_remote_query_execution: bool
         If `True`, do not load any index data in main memory locally, and instead load index data in the TileDB Cloud taskgraph created when a non-`None` `driver_mode` is passed to `query()`.
         If `False`, load index data in main memory locally. Note that you can still use a taskgraph for query execution, you'll just end up loading the data both on your local machine and in the cloud taskgraph.
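The `preload_k_factor_vectors` docstring above describes a two-stage query: fetch `k_factor * k` approximate candidates from the PQ-encoded data, then re-rank them against the original vectors and keep the top `k`. A minimal pure-Python sketch of that re-ranking idea follows. It is not the library's implementation: `squared_l2`, `rerank_top_k`, the toy data, and the use of squared L2 distance are all assumptions made for illustration.

```python
# Illustrative only: these helpers are hypothetical and not part of tiledb.vector_search.

def squared_l2(a, b):
    """Exact squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rerank_top_k(query, candidate_ids, original_vectors, k):
    """Re-score candidates with exact distances and return the ids of the k closest."""
    scored = [(squared_l2(query, original_vectors[i]), i) for i in candidate_ids]
    scored.sort()  # ascending by exact distance, ties broken by id
    return [i for _, i in scored[:k]]


# Toy "original input vectors", indexed by id.
original_vectors = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 2.0],
    [3.0, 3.0],
    [0.5, 0.5],
]
query = [0.0, 0.0]
k = 2
k_factor = 2.0

# Assume the approximate (PQ) stage ranked the ids like this; its order is imperfect.
pq_ranked_ids = [3, 1, 4, 0, 2]
candidate_ids = pq_ranked_ids[: int(k_factor * k)]  # keep k_factor * k candidates

top_ids = rerank_top_k(query, candidate_ids, original_vectors, k)
print(top_ids)
```

Re-ranking has to read the original vector of every candidate, which is why keeping those vectors in main memory can speed up this step when they fit.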
@@ -48,15 +54,26 @@ def __init__(
         config: Optional[Mapping[str, Any]] = None,
         timestamp=None,
         memory_budget: int = -1,
+        preload_k_factor_vectors: bool = False,
         open_for_remote_query_execution: bool = False,
         group: tiledb.Group = None,
         **kwargs,
     ):
+        if preload_k_factor_vectors and memory_budget != -1:
+            raise ValueError(
+                "preload_k_factor_vectors can only be used with memory_budget set to -1."
+            )
# TODO(SC-48710): Add support for `open_for_remote_query_execution`. We don't leave `self.index` as `None` because we need to be able to call index.dimensions().
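The constructor check added in this diff rejects `preload_k_factor_vectors=True` unless `memory_budget` is `-1`. The sketch below isolates that rule in a standalone function so it can be run without TileDB. `check_preload_options` is a hypothetical helper written for illustration; only the condition and the error message are taken from the diff.

```python
# Hypothetical stand-in for the validation performed in IVFPQIndex.__init__.

def check_preload_options(
    preload_k_factor_vectors: bool = False, memory_budget: int = -1
) -> None:
    """Raise ValueError when preloading is requested together with a memory budget."""
    if preload_k_factor_vectors and memory_budget != -1:
        raise ValueError(
            "preload_k_factor_vectors can only be used with memory_budget set to -1."
        )


# Accepted: preloading with the default budget of -1 (all index data in memory).
check_preload_options(preload_k_factor_vectors=True, memory_budget=-1)

# Accepted: a memory budget without preloading.
check_preload_options(preload_k_factor_vectors=False, memory_budget=1000)

# Rejected: preloading combined with an explicit memory budget.
try:
    check_preload_options(preload_k_factor_vectors=True, memory_budget=1000)
    rejected = False
    message = ""
except ValueError as exc:
    rejected = True
    message = str(exc)

print(rejected, message)
```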