  --infinite Load the entire array into RAM for the search [default: false]
  --finite For backward compatibility, load only required partitions into memory [default: true]
  --blocksize NN number of vectors to process in an out of core block (0 = all) [default: 0]
- --nth use nth_element for top k [default: false]
- --nthreads NN number of threads to use (0 = all) [default: 0]
+ --nthreads NN number of threads to use (0 = hardware concurrency) [default: 0]
+ --ppt NN minimum number of partitions to assign to a thread (0 = no min) [default: 0]
+ --vpt NN minimum number of vectors to assign to a thread (0 = no min) [default: 0]
+ --nodes NN number of nodes to use for (emulated) distributed query [default: 1]
  --region REGION AWS S3 region [default: us-east-1]
  --log FILE log info to FILE (- for stdout)
+ --stats log TileDB stats [default: false]
  -d, --debug run in debug mode [default: false]
  -v, --verbose run in verbose mode [default: false]
```
@@ -152,20 +154,18 @@ The inverted file index consists of data stored in multiple TileDB arrays, which
  The user can also optionally specify
  * An array containing ground truth vectors (`--groundtruth_uri`), i.e., the nearest neighbors that would be returned from an exact (`flat L2`) search and/or
- * An array for saving the results of the query.
+ * An array for saving the results of the query (`--output_uri`).
@@ -178,8 +178,6 @@ The default is to use all the queries in the query array, which can also be spec
  * Which search algorithm in the C++ library to use for performing the search (`--algo`). It is recommended to use the default (other algorithms are currently WIP).
  * Whether to load the entire partitioned array into memory when performing the search (if the `--infinite` option is given) or to load only the necessary partitions for the specified query (the default). It is recommended to generally use the default value except in the case of large values of `nqueries` and `nprobe` and the availability of sufficient RAM to hold the entire partitioned array. (For backward compatibility, there is also a `--finite` flag, which has the complementary behavior to `--infinite`.) If `--blocksize` is specified with the finite-memory option, `ivf_flat` also operates in out-of-core fashion, loading subsets of partitions into memory in the order they appear in the partitioned vector array.
  * An upper bound on the number of vectors to be loaded during each batch in the finite-memory case (`--blocksize`). `ivf_flat` will load complete partitions on each out-of-core iteration, so the number of vectors loaded will generally be fewer than the specified upper bound. For the same reason, the specified upper bound must be larger than the largest partition in the partitioned array. Out-of-core operation is necessary if available RAM cannot hold all the index data (in general due to the size of the vector data to be searched). Even if available memory can accommodate the entire partitioned array, out-of-core operation can be useful for making more efficient use of hierarchical memory.
- * Whether to use the `nth_element` C++ standard library algorithm for ranking top-k vectors (`--nth`). The default value is `false` and the default should always be used. This option was used for performance experiments and should be considered deprecated.
- * How many threads to use when executing the parallelized sections of the search (`--nthreads`). The default is `std::thread::hardware_concurrency`, i.e., the number of available cores. In general the default value should be used.
  * The AWS region to use when accessing TileDB arrays stored in S3 (`--region`). The example array URIs provided with TileDB-Vector-Search are located in the `us-east-1` region, which is the default value.
  * The name of a file to write logging information to (`--log`). The default is nil, meaning no logs will be written. If the value `-` is specified, the output will be written to `std::cout`.
  * Whether to run in debug mode (`-d` or `--debug`). This will print copious information that is useful only to the library developers. End users should always use the default.
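
The out-of-core batching described for `--blocksize` can be illustrated with a small sketch. This is not the library's implementation: the function name `plan_batches` and its arguments are hypothetical, and it only models the stated behavior, namely that whole partitions are loaded in array order and each batch stays within the vector upper bound. Whether a bound exactly equal to the largest partition is accepted is not stated in the text; the sketch assumes it is.

```python
from typing import Iterator, List, Tuple


def plan_batches(partition_sizes: List[int], blocksize: int) -> Iterator[Tuple[int, int]]:
    """Yield (first, last_exclusive) partition index ranges, in array order.

    Hypothetical helper: each range holds whole partitions whose total
    vector count does not exceed `blocksize`. A blocksize of 0 means
    "load everything in one batch", mirroring the `(0 = all)` help text.
    """
    if blocksize == 0:
        yield (0, len(partition_sizes))
        return

    # Assumption: a bound smaller than the largest partition is an error,
    # because partitions are never split across batches.
    if partition_sizes and blocksize < max(partition_sizes):
        raise ValueError("blocksize is smaller than the largest partition")

    start = 0   # index of the first partition in the current batch
    loaded = 0  # number of vectors in the current batch so far
    for i, size in enumerate(partition_sizes):
        # Close the current batch if adding this partition would exceed the bound.
        if loaded + size > blocksize and i > start:
            yield (start, i)
            start, loaded = i, 0
        loaded += size

    # Emit the final, possibly partial, batch.
    if start < len(partition_sizes):
        yield (start, len(partition_sizes))
```

For partition sizes `[3, 4, 2, 5]` and a bound of 6, the sketch yields the ranges `(0, 1)`, `(1, 3)`, `(3, 4)`, which load 3, 6, and 5 vectors respectively. No batch exceeds the bound, and most fall short of it because partitions are kept whole.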