[Doc] Issue with databricks_vector_search_index documentation (#4534)
## Changes
Adjust documentation for `databricks_vector_search_index`:
* Added documentation for param `embedding_writeback_table`
* Added separate configuration blocks to `delta_sync_index_spec` &
`direct_access_index_spec`
Fixes #4532
## Tests
No tests to be made since this is just a docs change
---------
Co-authored-by: Slim Ben Salah <[email protected]>
docs/resources/vector_search_index.md: 27 additions & 20 deletions
@@ -36,26 +36,33 @@ The following arguments are supported (change of any parameter leads to recreati
 * `index_type` - (required) Mosaic AI Vector Search index type. Currently supported values are:
   * `DELTA_SYNC`: An index that automatically syncs with a source Delta Table, automatically and incrementally updating the index as the underlying data in the Delta Table changes.
   * `DIRECT_ACCESS`: An index that supports the direct read and write of vectors and metadata through our REST and SDK APIs. With this model, the user manages index updates.
-* `delta_sync_index_spec` - (object) Specification for Delta Sync Index. Required if `index_type` is `DELTA_SYNC`.
-  * `source_table` (required) The name of the source table.
-  * `columns_to_sync` - (optional) list of columns to sync. If not specified, all columns are syncronized.
-  * `embedding_source_columns` - (required if `embedding_vector_columns` isn't provided) array of objects representing columns that contain the embedding source. Each entry consists of:
-    * `name` - The name of the column
-    * `embedding_model_endpoint_name` - The name of the embedding model endpoint
-  * `embedding_vector_columns` - (required if `embedding_source_columns` isn't provided) array of objects representing columns that contain the embedding vectors. Each entry consists of:
-    * `name` - The name of the column.
-    * `embedding_dimension` - Dimension of the embedding vector.
-  * `pipeline_type` - Pipeline execution mode. Possible values are:
-    * `TRIGGERED`: If the pipeline uses the triggered execution mode, the system stops processing after successfully refreshing the source table in the pipeline once, ensuring the table is updated based on the data available when the update started.
-    * `CONTINUOUS`: If the pipeline uses continuous execution, the pipeline processes new data as it arrives in the source table to keep the vector index fresh.
-* `direct_access_index_spec` - (object) Specification for Direct Vector Access Index. Required if `index_type` is `DIRECT_ACCESS`.
-  * `schema_json` - The schema of the index in JSON format. Check the [API documentation](https://docs.databricks.com/api/workspace/vectorsearchindexes/createindex#direct_access_index_spec-schema_json) for a list of supported data types.
-  * `embedding_source_columns` - (required if `embedding_vector_columns` isn't provided) array of objects representing columns that contain the embedding source. Each entry consists of:
-    * `name` - The name of the column
-    * `embedding_model_endpoint_name` - The name of the embedding model endpoint
-  * `embedding_vector_columns` - (required if `embedding_source_columns` isn't provided) array of objects representing columns that contain the embedding vectors. Each entry consists of:
-    * `name` - The name of the column.
-    * `embedding_dimension` - Dimension of the embedding vector.
+* `delta_sync_index_spec` - (object) Specification for Delta Sync Index. Required if `index_type` is `DELTA_SYNC`. This field is a block and is [documented below](#delta_sync_index_spec-Configuration-Block).
+* `direct_access_index_spec` - (object) Specification for Direct Vector Access Index. Required if `index_type` is `DIRECT_ACCESS`. This field is a block and is [documented below](#direct_access_index_spec-Configuration-Block).
+
+### delta_sync_index_spec Configuration Block
+
+* `source_table` (required) The name of the source table.
+* `columns_to_sync` - (optional) list of columns to sync. If not specified, all columns are syncronized.
+* `embedding_source_columns` - (required if `embedding_vector_columns` isn't provided) array of objects representing columns that contain the embedding source. Each entry consists of:
+  * `name` - The name of the column
+  * `embedding_model_endpoint_name` - The name of the embedding model endpoint
+* `embedding_vector_columns` - (required if `embedding_source_columns` isn't provided) array of objects representing columns that contain the embedding vectors. Each entry consists of:
+  * `name` - The name of the column.
+  * `embedding_dimension` - Dimension of the embedding vector.
+* `pipeline_type` - Pipeline execution mode. Possible values are:
+  * `TRIGGERED`: If the pipeline uses the triggered execution mode, the system stops processing after successfully refreshing the source table in the pipeline once, ensuring the table is updated based on the data available when the update started.
+  * `CONTINUOUS`: If the pipeline uses continuous execution, the pipeline processes new data as it arrives in the source table to keep the vector index fresh.
+* `embedding_writeback_table` - (optional) Automatically sync the vector index contents and computed embeddings to the specified Delta table. The only supported table name is the index name with the suffix `_writeback_table`.
+
+### direct_access_index_spec Configuration Block
+
+* `schema_json` - The schema of the index in JSON format. Check the [API documentation](https://docs.databricks.com/api/workspace/vectorsearchindexes/createindex#direct_access_index_spec-schema_json) for a list of supported data types.
+* `embedding_source_columns` - (required if `embedding_vector_columns` isn't provided) array of objects representing columns that contain the embedding source. Each entry consists of:
+  * `name` - The name of the column
+  * `embedding_model_endpoint_name` - The name of the embedding model endpoint
+* `embedding_vector_columns` - (required if `embedding_source_columns` isn't provided) array of objects representing columns that contain the embedding vectors. Each entry consists of:
+  * `name` - The name of the column.
+  * `embedding_dimension` - Dimension of the embedding vector.
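
For readers who want to see how the two configuration blocks documented in this change might look in practice, here is a minimal sketch, not part of the commit itself. The top-level arguments `name`, `endpoint_name`, and `primary_key` are assumed from the wider resource documentation and do not appear in this hunk. All catalog, schema, table, column, and endpoint names are hypothetical placeholders. The `schema_json` type names and the fully qualified form of the writeback table name are assumptions as well, so check the linked API documentation before relying on them.

```hcl
# Sketch 1: a DELTA_SYNC index using the delta_sync_index_spec block.
# All names below are hypothetical placeholders.
resource "databricks_vector_search_index" "sync" {
  name          = "main.default.docs_index"     # assumed argument, not shown in this diff
  endpoint_name = "my-vector-search-endpoint"   # assumed argument, not shown in this diff
  primary_key   = "id"                          # assumed argument, not shown in this diff
  index_type    = "DELTA_SYNC"

  delta_sync_index_spec {
    source_table    = "main.default.docs_source"
    columns_to_sync = ["id", "text"]            # optional; all columns are synced if omitted
    pipeline_type   = "TRIGGERED"               # or "CONTINUOUS"

    # Either embedding_source_columns or embedding_vector_columns is required.
    embedding_source_columns {
      name                          = "text"
      embedding_model_endpoint_name = "my-embedding-endpoint"
    }

    # Per the text above, the only supported name is the index name plus
    # the suffix "_writeback_table"; the fully qualified form is an assumption.
    embedding_writeback_table = "main.default.docs_index_writeback_table"
  }
}

# Sketch 2: a DIRECT_ACCESS index using the direct_access_index_spec block.
resource "databricks_vector_search_index" "direct" {
  name          = "main.default.direct_index"   # assumed argument, not shown in this diff
  endpoint_name = "my-vector-search-endpoint"   # assumed argument, not shown in this diff
  primary_key   = "id"                          # assumed argument, not shown in this diff
  index_type    = "DIRECT_ACCESS"

  direct_access_index_spec {
    # The type names here are assumptions; see the API documentation
    # linked above for the supported data types.
    schema_json = jsonencode({
      id          = "integer"
      text        = "string"
      text_vector = "array<float>"
    })

    # Self-managed vectors: name the vector column and its dimension.
    embedding_vector_columns {
      name                = "text_vector"
      embedding_dimension = 1024
    }
  }
}
```

Because the argument list states that a change to any parameter leads to recreation, editing either block in an existing configuration would be expected to replace the index rather than update it in place.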