apis/python/src/tiledb/vector_search/flat_index.py
+17 -4 (17 additions & 4 deletions)
@@ -35,17 +35,32 @@ class FlatIndex(index.Index):
     timestamp: int or tuple(int)
         If int, open the index at a given timestamp.
         If tuple, open at the given start and end timestamps.
+    open_for_remote_query_execution: bool
+        If `True`, do not load any index data in main memory locally, and instead load index data in the TileDB Cloud taskgraph created when a non-`None` `driver_mode` is passed to `query()`.
+        If `False`, load index data in main memory locally. Note that you can still use a taskgraph for query execution; you'll just end up loading the data both on your local machine and in the cloud taskgraph.
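The docstring above describes how `open_for_remote_query_execution` interacts with `driver_mode`. A minimal stand-alone sketch of that interaction follows. The helper name `data_load_locations` and its string labels are hypothetical and not part of the library, and the rejected combination mirrors the `ValueError` that `query()` raises elsewhere in this diff.

```python
def data_load_locations(open_for_remote_query_execution: bool, driver_mode_set: bool) -> set:
    """Hypothetical helper: where index data is held in memory for one query.

    This models the docstring only; it does not call TileDB.
    """
    if open_for_remote_query_execution and not driver_mode_set:
        # Nothing was loaded locally and no taskgraph will load it either.
        raise ValueError("an index opened for remote execution needs a non-None driver_mode")

    locations = set()
    if not open_for_remote_query_execution:
        locations.add("local")      # loaded when the index is opened
    if driver_mode_set:
        locations.add("taskgraph")  # the index is re-opened inside the taskgraph
    return locations


# Opening locally and still passing a driver mode loads the data twice.
print(sorted(data_load_locations(False, True)))
```

Under these assumptions, setting `open_for_remote_query_execution=True` is the only way to avoid holding the index data on the local machine.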
         - Calls the algorithm specific implementation of `query_internal` to query the base data.
         - Merges the results applying the updated data.

+        You can control where the query is executed by setting the `driver_mode` parameter:
+        - With `driver_mode = None`, the driver logic for the query will be executed locally.
+        - If `driver_mode` is not `None`, we will use a TileDB Cloud taskgraph to re-open the index and run the query.
+        With both options, certain implementations, e.g. IVF Flat, may let you create further TileDB taskgraphs as defined in the implementation specific `query_internal` methods.
+
         Parameters
         ----------
         queries: np.ndarray
             2D array of query vectors. This can be used as a batch query interface by passing multiple queries in one call.
         k: int
             Number of results to return per query vector.
+        driver_mode: Mode
+            If not `None`, the query will be executed in a TileDB Cloud taskgraph using the driver mode specified.
+        driver_resources: Optional[str]
+            If `driver_mode` was not `None`, the resources to use for the driver execution.
+        driver_access_credentials_name: Optional[str]
+            If `driver_mode` was not `None`, the access credentials name to use for the driver execution.
         **kwargs
             Extra kwargs passed here are passed to the `query_internal` implementation of the concrete index class.
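The order in which `query()` validates its arguments and picks an execution path can be sketched without TileDB. This is an illustration under stated assumptions, not the library's code: `Mode` here is a stand-in enum (only `LOCAL` appears in this diff; `REALTIME` is assumed for the example), `choose_execution_path` is a hypothetical name, and the dtype is passed as a plain string to avoid a NumPy dependency.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    """Stand-in for the driver-mode enum; REALTIME is an assumed member."""
    LOCAL = "local"
    REALTIME = "realtime"


def choose_execution_path(
    dtype_name: str,
    driver_mode: Optional[Mode],
    open_for_remote_query_execution: bool,
) -> str:
    """Hypothetical model of the checks performed before a query runs."""
    if dtype_name != "float32":
        raise TypeError(f"Expected float32 queries, but got {dtype_name}")
    if driver_mode == Mode.LOCAL:
        # Rejected for now; callers pass driver_mode=None to query locally.
        raise TypeError("use driver_mode=None to query locally")
    if driver_mode is not None:
        return "taskgraph"  # the index is re-opened and queried remotely
    if open_for_remote_query_execution:
        # No driver and no locally loaded data: nothing can serve the query.
        raise ValueError("index data was not loaded in main memory")
    return "local"


print(choose_execution_path("float32", None, False))
print(choose_execution_path("float32", Mode.REALTIME, True))
```

Note that the driver check comes before the `open_for_remote_query_execution` check, so an index opened for remote execution is only rejected when no driver mode is supplied.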
                 f"A query in queries has {query_dimensions} dimensions, but the indexed data had {self.dimensions} dimensions"
             )

+        if queries.dtype != np.float32:
+            raise TypeError(
+                f"Expected queries to have dtype np.float32, but it had dtype {queries.dtype}"
+            )
+
+        if driver_mode == Mode.LOCAL:
+            # @todo: Fix bug with driver_mode=Mode.LOCAL and remove this check.
+            raise TypeError(
+                "Cannot pass driver_mode=Mode.LOCAL to query() - use driver_mode=None to query locally."
+            )
+
+        if driver_mode is not None:
+            return self._query_with_driver(
+                queries,
+                k,
+                driver_mode,
+                driver_resources,
+                driver_access_credentials_name,
+                **kwargs,
+            )
+
+        if self.open_for_remote_query_execution:
+            raise ValueError(
+                "Cannot query an index with driver_mode=None without loading the index data in main memory. Set open_for_remote_query_execution=False when creating the index to load the index data before query."
apis/python/src/tiledb/vector_search/ivf_flat_index.py
+33 -19 (33 additions & 19 deletions)
@@ -70,6 +70,9 @@ class IVFFlatIndex(index.Index):
         If not provided, all index data are loaded in main memory.
         Otherwise, no index data are loaded in main memory and this memory budget is
         applied during queries.
+    open_for_remote_query_execution: bool
+        If `True`, do not load any index data in main memory locally, and instead load index data in the TileDB Cloud taskgraph created when a non-`None` `driver_mode` is passed to `query()`. We then load index data in the taskgraph based on `memory_budget`.
+        If `False`, load index data in main memory locally according to `memory_budget`. Note that you can still use a taskgraph for query execution; you'll just end up loading the data both on your local machine and in the cloud taskgraph.
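For IVF Flat, the docstring combines `memory_budget` with `open_for_remote_query_execution`. The sketch below summarizes that combination as a small lookup. It is only a reading of the docstring: `ivf_load_plan`, the `uses_taskgraph` flag, and the use of `None` to mean "no memory budget provided" are assumptions made for illustration.

```python
from typing import Optional


def ivf_load_plan(
    memory_budget: Optional[int],
    open_for_remote_query_execution: bool,
    uses_taskgraph: bool,
) -> dict:
    """Hypothetical summary of how index data is loaded, and where."""
    if memory_budget is None:
        # No budget provided: all index data are loaded in main memory.
        strategy = "load all index data into memory"
    else:
        # A budget is provided: nothing is preloaded, the budget applies per query.
        strategy = f"preload nothing; apply a budget of {memory_budget} during queries"
    return {
        # The local machine loads data only when the index is not remote-only.
        "local": None if open_for_remote_query_execution else strategy,
        # The taskgraph loads data only when a non-None driver_mode is used.
        "taskgraph": strategy if uses_taskgraph else None,
    }


print(ivf_load_plan(None, True, True))
```

In this reading, the same `memory_budget` governs both locations, and the flag only decides whether the local load happens at all.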
apis/python/src/tiledb/vector_search/vamana_index.py
+18 -5 (18 additions & 5 deletions)
@@ -38,17 +38,33 @@ class VamanaIndex(index.Index):
         URI of the index.
     config: Optional[Mapping[str, Any]]
         TileDB config dictionary.
+    open_for_remote_query_execution: bool
+        If `True`, do not load any index data in main memory locally, and instead load index data in the TileDB Cloud taskgraph created when a non-`None` `driver_mode` is passed to `query()`.
+        If `False`, load index data in main memory locally. Note that you can still use a taskgraph for query execution; you'll just end up loading the data both on your local machine and in the cloud taskgraph.
# TODO(SC-48710): Add support for `open_for_remote_query_execution`. We don't leave `self.index` as `None` because we need to be able to call index.dimensions().