You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Azure Cosmos DB may need to read secret/key data from Azure Key Vault. For example, your Azure Cosmos DB may require a customer-managed key stored in Azure Key Vault. To do this, Azure Cosmos DB should be configured with a managed identity, and then an Azure Key Vault access policy should grant the managed identity access.
17
18
19
+
18
20
## Prerequisites
19
21
20
22
- An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F).
Copy file name to clipboardExpand all lines: articles/cosmos-db/nosql/vector-search.md
+5-6Lines changed: 5 additions & 6 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -18,7 +18,7 @@ ms.date: 5/7/2024
18
18
19
19
[!INCLUDE[NoSQL](../includes/appliesto-nosql.md)]
20
20
21
-
Azure Cosmos DB for NoSQL now offers vector indexing and search in preview. This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can now store vectors directly in the documents alongside your data. This means that each document in your database can contain not only traditional schema-free data, but also high-dimensional vectors as other properties of the documents. This colocation of data and vectors allows for efficient indexing and searching, as the vectors are stored in the same logical unit as the data they represent. This simplifies data management, AI application architectures, and the efficiency of vector-based operations.
21
+
Azure Cosmos DB for NoSQL now offers vector indexing and search in preview. This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can now store vectors directly in the documents alongside your data. Each document in your database can contain not only traditional schema-free data, but also high-dimensional vectors as other properties of the documents. This colocation of data and vectors allows for efficient indexing and searching, as the vectors are stored in the same logical unit as the data they represent. Keeping vectors and data together simplifies data management, AI application architectures, and the efficiency of vector-based operations.
22
22
23
23
Azure Cosmos DB for NoSQL offers the flexibility it offers in choosing the vector indexing method:
24
24
- A "flat" or k-nearest neighbors exact search (sometimes called brute-force) can provide 100% retrieval recall for smaller, focused vector searches. especially when combined with query filters and partition-keys.
@@ -43,7 +43,7 @@ In a vector store, vector search algorithms are used to index and query embeddin
43
43
In the Integrated Vector Database in Azure Cosmos DB for NoSQL, embeddings can be stored, indexed, and queried alongside the original data. This approach eliminates the extra cost of replicating data in a separate pure vector database. Moreover, this architecture keeps the vector embeddings and original data together, which better facilitates multi-modal data operations, and enables greater data consistency, scale, and performance.
44
44
45
45
## Enroll in the Vector Search Preview Feature
46
-
Vector search for Azure Cosmos DB for NoSQL requires preview feature registration on the Features page of your Azure Cosmos DB. Follow the below steps to register:
46
+
Vector search for Azure Cosmos DB for NoSQL requires preview feature registration on the Features page of your Azure Cosmos DB. Follow the below steps to register:
47
47
48
48
1. Navigate to your Azure Cosmos DB for NoSQL resource page.
49
49
@@ -83,7 +83,7 @@ Performing vector search with Azure Cosmos DB for NoSQL requires you to define a
83
83
* “dimensions”: The dimensionality or length of each vector in the path. All vectors in a path should have the same number of dimensions. (default 1536).
84
84
* “distanceFunction”: The metric used to compute distance/similarity. Supported metrics are:
85
85
* [cosine](https://en.wikipedia.org/wiki/Cosine_similarity), which has values from -1 (least similar) to +1 (most similar).
86
-
* [dotproduct](https://en.wikipedia.org/wiki/Dot_product), which has values from -inf (least simialr) to +inf (most similar).
86
+
* [dot product](https://en.wikipedia.org/wiki/Dot_product), which has values from -inf (least similar) to +inf (most similar).
87
87
* [euclidean](https://en.wikipedia.org/wiki/Euclidean_distance), which has values from 0 (most similar) to +inf) (least similar).
88
88
89
89
@@ -203,7 +203,7 @@ Here are examples of valid vector index policies:
203
203
204
204
## Perform vector search with queries using VectorDistance()
205
205
206
-
Once you have created a container with the desired vector policy, and inserted vector data into the container, you can conduct a vector search using the [Vector Distance](query/vectordistance.md) system function in a query. An example of a NoSQL query that projects the similarity score as the alias `SimilarityScore`, and sorts in order of most-similar to least-similar is shown below:
206
+
Once you created a container with the desired vector policy, and inserted vector data into the container, you can conduct a vector search using the [Vector Distance](query/vectordistance.md) system function in a query. An example of a NoSQL query that projects the similarity score as the alias `SimilarityScore`, and sorts in order of most-similar to least-similar:
207
207
208
208
```sql
209
209
SELECTc.title, VectorDistance(c.contentVector, [1,2,3]) AS SimilarityScore
@@ -217,15 +217,14 @@ Vector indexing and search in Azure Cosmos DB for NoSQL has some limitations whi
217
217
- You can specify, at most, one DiskANN index type per container
218
218
- Vector indexing is only supported on new containers.
219
219
- Vectors indexed with the `flat` index type can be at most 505 dimensions. Vectors indexed with the `quantizedFlat` or `DiskANN` index type can be at most 4,096 dimensions.
220
-
-`quantizedFlat` utilizes the same quantization method as DiskANN and is not configurable at this time.
220
+
-`quantizedFlat` utilizes the same quantization method as DiskANN and isn't configurable at this time.
221
221
- Shared throughput databases can't use the vector search preview feature at this time.
222
222
- Ingestion rate should be limited while using an early preview of DiskANN.
223
223
224
224
## Next step
225
225
-[DiskANN + Azure Cosmos DB - Microsoft Mechanics Video](https://www.youtube.com/watch?v=MlMPIYONvfQ)
226
226
-[.NET - How-to Index and query vector data](how-to-dotnet-vector-index-query.md)
227
227
-[Python - How-to Index and query vector data](how-to-python-vector-index-query.md)
228
-
-[JavaScript - How-to Index and query vector data](how-to-javascript-vector-index-query.md)
229
228
-[Java - How-to Index and query vector data](how-to-java-vector-index-query.md)
230
229
-[VectorDistance system function](query/vectordistance.md)
231
230
-[Vector index overview](../index-overview.md#vector-indexes)
0 commit comments