@@ -14,7 +14,7 @@ All data and metadata required for a TileDB-Vector-Search index are stored insid
1414Metadata values required for configuring the different properties of an index are stored in the ` index_uri ` group metadata. Some metadata values are required for all algorithm implementations, while others are specific to each algorithm. The table below lists the metadata values recorded for all algorithms.
1515
1616| Name | Description |
17- | ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
17+ | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
1818| ` dataset_type ` | The asset type for disambiguation in TileDB Cloud. Value: ` vector_search ` |
1919| ` index_type ` | The index algorithm used for this index. Can be one of the following values: ` FLAT ` , ` IVF_FLAT ` , ` VAMANA ` , ` IVF_PQ ` |
2020| ` storage_version ` | The storage version used for the index. The storage version ensures that indexing algorithms can update their storage logic without affecting previously created indexes, which maintains backward compatibility. |
@@ -25,22 +25,22 @@ Metadata values required for configuring the different properties of an index ar
2525
2626### Object metadata
2727
28- This is a 1D sparse array with ` external_id ` as the dimension and the user-defined metadata attributes of the respective vectors as its attributes.
28+ This is a 1D sparse array with ` external_id ` as the dimension and the user-defined metadata attributes of the respective vectors as its attributes.
2929
3030#### Basic schema parameters
3131
3232| ** Parameter** | ** Value** |
33- | :-------------- | :---------- |
33+ | :------------ | :-------- |
3434| Array type | Sparse |
3535| Rank | 1D |
3636| Cell order | Row-major |
3737| Tile order | Row-major |
3838
3939#### Dimensions
4040
41- | Dimension Name | TileDB Datatype |
42- | :------------- | :-------------------- |
43- | ` external_id ` | ` uint64_t ` |
41+ | Dimension Name | TileDB Datatype |
42+ | :------------- | :-------------- |
43+ | ` external_id ` | ` uint64_t ` |
4444
4545### Updates
4646
@@ -57,15 +57,15 @@ TileDB-Vector-Search offers support for updates for all different index algorith
5757
5858#### Dimensions
5959
60- | Dimension Name | TileDB Datatype |
61- | :------------- | :-------------------- |
62- | ` external_id ` | ` uint64_t ` |
60+ | Dimension Name | TileDB Datatype |
61+ | :------------- | :-------------- |
62+ | ` external_id ` | ` uint64_t ` |
6363
6464#### Attributes
6565
66- | Attribute Name | TileDB Datatype | Description |
67- | :--------------- | :-------------- | :---------------------------------------------------------------------------------------------- |
68- | ` vector ` | variable ` dtype ` | Contains the vector value. Empty values correspond to vector deletions. |
66+ | Attribute Name | TileDB Datatype | Description |
67+ | :------------- | :--------------- | :---------------------------------------------------------------------- |
68+ | ` vector ` | variable ` dtype ` | Contains the vector value. Empty values correspond to vector deletions. |
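
The deletion convention in the table above can be illustrated with a minimal sketch in plain Python. It assumes that updates are replayed in order and that an empty `vector` value marks a deletion, as the table states; the function name `apply_updates` and the in-memory dictionary are hypothetical and are not part of the TileDB-Vector-Search API.

```python
from typing import Dict, List, Sequence, Tuple


def apply_updates(
    base: Dict[int, List[float]],
    updates: Sequence[Tuple[int, Sequence[float]]],
) -> Dict[int, List[float]]:
    """Replay (external_id, vector) updates over a copy of `base`.

    Assumption: `updates` is already sorted in the order the updates
    were written. An empty vector deletes the entry for that id.
    """
    result = {eid: list(vec) for eid, vec in base.items()}
    for external_id, vector in updates:
        if len(vector) == 0:
            # Empty value: the vector with this external_id was deleted.
            result.pop(external_id, None)
        else:
            # Non-empty value: insert a new vector or overwrite an old one.
            result[external_id] = list(vector)
    return result
```

For instance, replaying `[(2, []), (3, [4.0, 5.0])]` over `{1: [0.0, 1.0], 2: [2.0, 3.0]}` drops id `2` and adds id `3`.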
6969
7070## Algorithm specific storage format
7171
@@ -78,7 +78,7 @@ This is a 2D dense array that holds all the vectors with no specific ordering.
7878#### Basic schema parameters
7979
8080| ** Parameter** | ** Value** |
81- | :-------------- | :---------- |
81+ | :------------ | :-------- |
8282| Array type | Dense |
8383| Rank | 2D |
8484| Cell order | Col-major |
@@ -87,15 +87,15 @@ This is a 2D dense array that holds all the vectors with no specific ordering.
8787#### Dimensions
8888
8989| Dimension Name | TileDB Datatype | Domain | Description |
90- | :--------------- | :---------------- | :------------------ | :---------------------------------------------------------- |
90+ | :------------- | :-------------- | :---------------- | :-------------------------------------------------------- |
9191| ` rows ` | ` int32_t ` | ` [0, dimensions] ` | Corresponds to the vector dimensions. |
9292| ` cols ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the vector position in the set of vectors. |
9393
9494#### Attributes
9595
96- | Attribute Name | TileDB Datatype | Description |
97- | :--------------- | :-------------- | :--------------------------------------------------------------------------- |
98- | ` values ` | ` dtype ` | Contains the vector value at the specific dimension. |
96+ | Attribute Name | TileDB Datatype | Description |
97+ | :------------- | :-------------- | :--------------------------------------------------- |
98+ | ` values ` | ` dtype ` | Contains the vector value at the specific dimension. |
9999
100100#### ` shuffled_ids `
101101
@@ -112,22 +112,22 @@ This is a 1D dense array that maps vector positions in the `shuffled_vectors` ar
112112
113113#### Dimensions
114114
115- | Dimension Name | TileDB Datatype | Domain | Description |
116- | :------------- | :-------------------- | :----------------- | :---------------------------------------------------------- |
117- | ` rows ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the vector position in ` shuffled_vectors ` . |
115+ | Dimension Name | TileDB Datatype | Domain | Description |
116+ | :------------- | :-------------- | :--------------- | :-------------------------------------------------------- |
117+ | ` rows ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the vector position in ` shuffled_vectors ` . |
118118
119119#### Attributes
120120
121- | Attribute Name | TileDB Datatype | Description |
122- | :--------------- | :-------------- | :--------------------------------------------------------------------------- |
123- | ` values ` | ` uint64_t ` | Contains the vector's ` external_id ` . |
121+ | Attribute Name | TileDB Datatype | Description |
122+ | :------------- | :-------------- | :----------------------------------- |
123+ | ` values ` | ` uint64_t ` | Contains the vector's ` external_id ` . |
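
To make the relationship between the two FLAT arrays concrete, here is a small sketch in plain Python. It assumes the column-major vector array has been read into a flat in-memory buffer (ignoring tiling), so that the vector at position `p` occupies `dimensions` consecutive values, and that `shuffled_ids[p]` holds its `external_id`. The helper names are hypothetical.

```python
from typing import List, Sequence


def vector_at(values: Sequence[float], dimensions: int, position: int) -> List[float]:
    """Return the vector stored at column `position`.

    In a column-major 2D layout with `dimensions` rows, cell (row, col)
    lives at offset col * dimensions + row, so a column is contiguous.
    """
    start = position * dimensions
    return list(values[start:start + dimensions])


def find_by_external_id(
    values: Sequence[float],
    shuffled_ids: Sequence[int],
    dimensions: int,
    external_id: int,
) -> List[float]:
    """Look up a vector by its external_id via the position-to-id mapping."""
    # Linear scan for clarity; raises ValueError if the id is not present.
    position = list(shuffled_ids).index(external_id)
    return vector_at(values, dimensions, position)
```

With two 3-dimensional vectors stored as `[1, 2, 3, 4, 5, 6]` and `shuffled_ids = [42, 7]`, looking up `external_id` 7 returns the second column.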
124124
125125### IVF_FLAT
126126
127127#### Metadata
128128
129- | Name | Description |
130- | ------ | ------ |
129+ | Name | Description |
130+ | ------------------- | ---------------------------------------------------------------------------------- |
131131| ` partition_history ` | An ordered list of the number of partitions used at different ingestion timestamps. |
132132
133133#### ` partition_centroids `
@@ -137,7 +137,7 @@ This is a 2D dense array storing the k-means centroids for the different vector
137137#### Basic schema parameters
138138
139139| ** Parameter** | ** Value** |
140- | :-------------- | :---------- |
140+ | :------------ | :-------- |
141141| Array type | Dense |
142142| Rank | 2D |
143143| Cell order | Col-major |
@@ -146,40 +146,40 @@ This is a 2D dense array storing the k-means centroids for the different vector
146146#### Dimensions
147147
148148| Dimension Name | TileDB Datatype | Domain | Description |
149- | :--------------- | :---------------- | :------------------ | :---------------------------------------- |
149+ | :------------- | :-------------- | :---------------- | :-------------------------------------- |
150150| ` rows ` | ` int32_t ` | ` [0, dimensions] ` | Corresponds to the centroid dimensions. |
151151| ` cols ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the centroid id. |
152152
153153#### Attributes
154154
155- | Attribute Name | TileDB Datatype | Description |
156- | :--------------- | :-------------- | :--------------------------------------------------------------------------- |
157- | ` centroids ` | ` dtype ` | Contains the centroid value at the specific dimension. |
155+ | Attribute Name | TileDB Datatype | Description |
156+ | :------------- | :-------------- | :----------------------------------------------------- |
157+ | ` centroids ` | ` dtype ` | Contains the centroid value at the specific dimension. |
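
As an illustration of how the stored centroids might be used, the sketch below picks the closest centroid for a query vector. It assumes squared Euclidean distance and a flat column-major buffer in which centroid `c` occupies `dimensions` consecutive values; both are assumptions made for this example, and the function name is hypothetical rather than part of the library.

```python
from typing import Sequence


def nearest_centroid(
    centroids: Sequence[float], dimensions: int, query: Sequence[float]
) -> int:
    """Return the id (column index) of the centroid closest to `query`."""
    num_centroids = len(centroids) // dimensions
    best_id, best_dist = -1, float("inf")
    for centroid_id in range(num_centroids):
        start = centroid_id * dimensions
        # Squared Euclidean distance between the query and this centroid.
        dist = sum(
            (centroids[start + row] - query[row]) ** 2
            for row in range(dimensions)
        )
        if dist < best_dist:
            best_id, best_dist = centroid_id, dist
    return best_id
```

The returned id is the value of the `cols` dimension described above, which identifies the partition the centroid belongs to.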
158158
159159#### ` partition_indexes `
160160
161- This is a 1D dense array recording the start-end index of each partition of vectors in the ` shuffled_vectors ` array.
161+ This is a 1D dense array recording the start-end index of each partition of vectors in the ` shuffled_vectors ` array.
162162
163163#### Basic schema parameters
164164
165165| ** Parameter** | ** Value** |
166- | :-------------- | :---------- |
166+ | :------------ | :-------- |
167167| Array type | Dense |
168168| Rank | 1D |
169169| Cell order | Col-major |
170170| Tile order | Col-major |
171171
172172#### Dimensions
173173
174- | Dimension Name | TileDB Datatype | Domain | Description |
175- | :------------- | :-------------------- | :----------------- | :------------------------------- |
176- | ` rows ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the partition id. |
174+ | Dimension Name | TileDB Datatype | Domain | Description |
175+ | :------------- | :-------------- | :--------------- | :------------------------------- |
176+ | ` rows ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the partition id. |
177177
178178#### Attributes
179179
180- | Attribute Name | TileDB Datatype | Description |
181- | :--------------- | :-------------- | :-------------------------------------------------------------------------------- |
182- | ` values ` | ` uint64_t ` | Contains the position of the partition split in the ` shuffled_vectors ` array. |
180+ | Attribute Name | TileDB Datatype | Description |
181+ | :------------- | :-------------- | :------------------------------------------------------------------------------- |
182+ | ` values ` | ` uint64_t ` | Contains the position of the partition split in the ` shuffled_vectors ` array. |
183183
184184#### ` shuffled_vectors `
185185
@@ -188,24 +188,24 @@ This is a 2D dense array that holds all the vectors. Each vector partition is st
188188#### Basic schema parameters
189189
190190| ** Parameter** | ** Value** |
191- | :-------------- | :---------- |
191+ | :------------ | :-------- |
192192| Array type | Dense |
193193| Rank | 2D |
194194| Cell order | Col-major |
195195| Tile order | Col-major |
196196
197197#### Dimensions
198198
199- | Dimension Name | TileDB Datatype | Domain | Description |
200- | :------------- | :-------------------- | :----------------- | :---------------------------------------------------------- |
201- | ` rows ` | ` int32_t ` | ` [0, dimensions] ` | Corresponds to the vector dimensions. |
202- | ` cols ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the vector position in the set of vectors. |
199+ | Dimension Name | TileDB Datatype | Domain | Description |
200+ | :------------- | :-------------- | :---------------- | :-------------------------------------------------------- |
201+ | ` rows ` | ` int32_t ` | ` [0, dimensions] ` | Corresponds to the vector dimensions. |
202+ | ` cols ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the vector position in the set of vectors. |
203203
204204#### Attributes
205205
206- | Attribute Name | TileDB Datatype | Description |
207- | :--------------- | :-------------- | :--------------------------------------------------------------------------- |
208- | ` values ` | ` dtype ` | Contains the vector value at the specific dimension. |
206+ | Attribute Name | TileDB Datatype | Description |
207+ | :------------- | :-------------- | :--------------------------------------------------- |
208+ | ` values ` | ` dtype ` | Contains the vector value at the specific dimension. |
209209
210210#### ` shuffled_ids `
211211
@@ -214,29 +214,28 @@ This is a 1D dense array that maps vector indices in the `shuffled_vectors` arra
214214#### Basic schema parameters
215215
216216| ** Parameter** | ** Value** |
217- | :-------------- | :---------- |
217+ | :------------ | :-------- |
218218| Array type | Dense |
219219| Rank | 1D |
220220| Cell order | Col-major |
221221| Tile order | Col-major |
222222
223223#### Dimensions
224224
225- | Dimension Name | TileDB Datatype | Domain | Description |
226- | :------------- | :-------------------- | :----------------- | :---------------------------------------------------------- |
227- | ` rows ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the vector position in ` shuffled_vectors ` . |
225+ | Dimension Name | TileDB Datatype | Domain | Description |
226+ | :------------- | :-------------- | :--------------- | :-------------------------------------------------------- |
227+ | ` rows ` | ` int32_t ` | ` [0, MAX_INT32] ` | Corresponds to the vector position in ` shuffled_vectors ` . |
228228
229229#### Attributes
230230
231- | Attribute Name | TileDB Datatype | Description |
232- | :--------------- | :-------------- | :---------------------------------------------------------------------------|
233- | ` values ` | ` uint64_t ` | Contains the vector ` external_id ` . |
234-
231+ | Attribute Name | TileDB Datatype | Description |
232+ | :------------- | :-------------- | :--------------------------------- |
233+ | ` values ` | ` uint64_t ` | Contains the vector ` external_id ` . |
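
The way the IVF_FLAT arrays fit together can be sketched in plain Python. This example assumes that `partition_indexes` holds one more entry than there are partitions, so that partition `p` spans positions `[partition_indexes[p], partition_indexes[p + 1])` in `shuffled_vectors` and `shuffled_ids`; that layout is an assumption made for illustration, and the helper names are hypothetical.

```python
from typing import List, Sequence, Tuple


def partition_range(partition_indexes: Sequence[int], partition_id: int) -> Tuple[int, int]:
    """Return the half-open [start, end) position range of a partition.

    Assumption: entry p is the start of partition p and the final entry
    is the total number of vectors.
    """
    return partition_indexes[partition_id], partition_indexes[partition_id + 1]


def ids_in_partition(
    shuffled_ids: Sequence[int],
    partition_indexes: Sequence[int],
    partition_id: int,
) -> List[int]:
    """Return the external_ids of all vectors stored in one partition."""
    start, end = partition_range(partition_indexes, partition_id)
    return list(shuffled_ids[start:end])
```

Because each partition is stored contiguously, the same `[start, end)` range selects the matching columns of `shuffled_vectors`.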
235234
236235### IVF_PQ
237236
238237TODO
239238
240239### VAMANA
241240
242- TODO
241+ TODO