Commit 5f38d68

Merge pull request #279929 from wmwxwa/patch-23
Update vector-search-overview.md
2 parents 0e52333 + be14c39 commit 5f38d68

5 files changed: +49 -4 lines changed

articles/cosmos-db/gen-ai/distance-functions.md

Lines changed: 7 additions & 0 deletions
@@ -30,3 +30,10 @@ Two vectors are multiplied to return a single number. It combines the two vector
 
 ## Related content
 - [VectorDistance system function](../nosql/query/vectordistance.md) in Azure Cosmos DB NoSQL
+- [What is a vector database?](../vector-database.md)
+- [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
+- [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
+- [What is vector search?](vector-search-overview.md)
+- LLM [tokens](tokens.md)
+- Vector [embeddings](vector-embeddings.md)
+- [kNN vs ANN vector search algorithms](knn-vs-ann.md)

articles/cosmos-db/gen-ai/knn-vs-ann.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,3 +30,12 @@ Two popular vector search algorithms are k-Nearest Neighbors (kNN) and Approxima
 4. Making Predictions:
 - Classification: For classification tasks, ANN assigns the class label to the query point that is most common among the identified neighbors, similar to kNN.
 - Regression: For regression tasks, ANN predicts the value for the query point as the average (or weighted average) of the values of the identified neighbors.
+
+## Related content
+- [What is a vector database?](../vector-database.md)
+- [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
+- [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
+- [What is vector search?](vector-search-overview.md)
+- LLM [tokens](tokens.md)
+- Vector [embeddings](vector-embeddings.md)
+- [Distance functions](distance-functions.md)
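
The prediction step described in the context lines of this hunk (majority vote for classification, averaging for regression) can be sketched as follows. This is an illustration only and not part of the commit. It finds neighbors by exact, brute-force Euclidean distance, so it corresponds to kNN; an ANN index would differ in how the neighbors are found, not in how the prediction is made from them. The function names and sample data are hypothetical.

```python
import math
from collections import Counter


def nearest_neighbors(points, query, k):
    """Return the k (vector, target) pairs closest to `query`.

    Uses exact Euclidean distance over every point (brute force).
    """
    return sorted(points, key=lambda p: math.dist(p[0], query))[:k]


def classify(points, query, k):
    """Classification: the most common label among the k neighbors."""
    labels = [label for _, label in nearest_neighbors(points, query, k)]
    return Counter(labels).most_common(1)[0][0]


def regress(points, query, k):
    """Regression: the unweighted average of the k neighbors' values."""
    values = [value for _, value in nearest_neighbors(points, query, k)]
    return sum(values) / len(values)


# Hypothetical sample data: (vector, label) and (vector, value) pairs.
LABELED = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
]
VALUED = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((10.0,), 50.0)]

print(classify(LABELED, (0.1, 0.1), k=3))
print(regress(VALUED, (0.5,), k=2))
```

A weighted average, which the text also mentions, would replace the plain mean in `regress` with weights that decrease with distance.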

articles/cosmos-db/gen-ai/tokens.md

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
@@ -1,6 +1,6 @@
 ---
-title: Tokens
-description: Overview of tokens in the context of large language models.
+title: LLM tokens
+description: Overview of tokens in large language models.
 author: wmwxwa
 ms.author: wangwilliam
 ms.service: cosmos-db
@@ -11,3 +11,12 @@ ms.date: 07/01/2024
 # What are tokens?
 
 Tokens are small chunks of text generated by splitting the input text into smaller segments. These segments can either be words or groups of characters, varying in length from a single character to an entire word. For instance, the word hamburger would be divided into tokens such as ham, bur, and ger while a short and common word like pear would be considered a single token. LLMs like GPT-3.5 or GPT-4 break words into tokens for processing.
+
+## Related content
+- [What is a vector database?](../vector-database.md)
+- [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
+- [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
+- [What is vector search?](vector-search-overview.md)
+- Vector [embeddings](vector-embeddings.md)
+- [Distance functions](distance-functions.md)
+- [kNN vs ANN vector search algorithms](knn-vs-ann.md)
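
The splitting behavior described in the tokens article (hamburger into ham, bur, and ger; pear as a single token) can be illustrated with a toy tokenizer. This sketch is not part of the commit and is not how GPT models tokenize: real tokenizers use a learned vocabulary (for example, byte-pair encoding), whereas this one assumes a tiny hand-written vocabulary, `TOY_VOCAB`, and a greedy longest-match rule, both of which are hypothetical.

```python
def tokenize(text, vocab):
    """Split `text` into tokens by greedy longest match against `vocab`.

    Characters not covered by any vocabulary entry become
    single-character tokens, so the function always terminates.
    """
    max_len = max((len(entry) for entry in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabulary chosen to reproduce the article's example.
TOY_VOCAB = {"ham", "bur", "ger", "pear"}

print(tokenize("hamburger", TOY_VOCAB))
print(tokenize("pear", TOY_VOCAB))
```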

articles/cosmos-db/gen-ai/vector-embeddings.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,3 +29,12 @@ This image shows the spatial closeness of vectors that are similar, contrasting
 Image source: [OpenAI](https://openai.com/index/introducing-text-and-code-embeddings/)
 
 You can see more examples in this [interactive visualization](https://openai.com/index/introducing-text-and-code-embeddings/#_1Vr7cWWEATucFxVXbW465e) that transforms data into a three-dimensional space.
+
+## Related content
+- [What is a vector database?](../vector-database.md)
+- [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
+- [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
+- [What is vector search?](vector-search-overview.md)
+- LLM [tokens](tokens.md)
+- [Distance functions](distance-functions.md)
+- [kNN vs ANN vector search algorithms](knn-vs-ann.md)

articles/cosmos-db/gen-ai/vector-search-overview.md

Lines changed: 13 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,8 +10,19 @@ ms.date: 07/01/2024
 
 # What is vector search?
 
-Vector search is a method that helps you find similar items based on their data characteristics rather than by exact matches on a property field. This technique is useful in applications such as searching for similar text, finding related images, making recommendations, or even detecting anomalies. It works by taking the [vector embeddings](vector-embeddings.md) of your data and query, and then measuring the [distance](distance-functions.md) between the data vectors and your query vector. The data vectors that are closest to your query vector are the ones that are found to be most similar semantically. Some well-known vector search algorithms include Hierarchical Navigable Small World (HNSW), Inverted File (IVF), and the state-of-the-art DiskANN.
+Vector search is a method that helps you find similar items based on their data characteristics rather than by exact matches on a property field. This technique is useful in applications such as searching for similar text, finding related images, making recommendations, or even detecting anomalies. It works by taking the [vector embeddings](vector-embeddings.md) of your data and query, and then measuring the [distance](distance-functions.md) between the data vectors and your query vector. The data vectors that are closest to your query vector are the ones that are found to be most similar semantically.
 
 This [interactive visualization](https://openai.com/index/introducing-text-and-code-embeddings/#_1Vr7cWWEATucFxVXbW465e) shows some examples of closeness and distance between vectors.
 
-Using an integrated vector search feature offers an efficient way to store, index, and search high-dimensional vector data directly alongside other application data. This approach removes the necessity of migrating your data to costlier alternative vector databases and provides a seamless integration of your AI-driven applications.
+Two popular types of vector search algorithms are [k-nearest neighbors (kNN) and approximate nearest neighbor (ANN)](knn-vs-ann.md). Some well-known vector search algorithms belonging to these categories include Inverted File (IVF), Hierarchical Navigable Small World (HNSW), and the state-of-the-art DiskANN.
+
+Using an integrated vector search feature in a fully featured database ([as opposed to a pure vector database](../vector-database.md#integrated-vector-database-vs-pure-vector-database)) offers an efficient way to store, index, and search high-dimensional vector data directly alongside other application data. This approach removes the necessity of migrating your data to costlier alternative vector databases and provides a seamless integration of your AI-driven applications.
+
+## Related content
+- [What is a vector database?](../vector-database.md)
+- [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
+- [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
+- LLM [tokens](tokens.md)
+- Vector [embeddings](vector-embeddings.md)
+- [Distance functions](distance-functions.md)
+- [kNN vs ANN vector search algorithms](knn-vs-ann.md)
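
The mechanism the overview describes (embed the data and the query, measure the distance between them, return the closest items) can be sketched as a brute-force search. This is an illustration only and not part of the commit. It assumes cosine distance and tiny hand-written 3-dimensional vectors in place of real embeddings; the names `cosine_distance`, `vector_search`, and `ITEMS` are hypothetical. Index-based algorithms such as IVF, HNSW, or DiskANN exist to avoid the full scan performed here.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity. Assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def vector_search(items, query_vector, k):
    """Return the ids of the k items closest to `query_vector`.

    `items` is a list of (id, vector) pairs; every item is scanned.
    """
    ranked = sorted(items, key=lambda item: cosine_distance(item[1], query_vector))
    return [item_id for item_id, _ in ranked[:k]]


# Hypothetical stand-ins for embeddings produced by a real model.
ITEMS = [
    ("cat", (0.9, 0.1, 0.0)),
    ("kitten", (0.8, 0.2, 0.1)),
    ("car", (0.0, 0.1, 0.9)),
]

print(vector_search(ITEMS, (1.0, 0.0, 0.0), k=2))
```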
