Commit fd676bc

Update vector-embeddings.md
1 parent 8c2ca6a commit fd676bc

1 file changed: +2 -2 lines changed

articles/cosmos-db/gen-ai/vector-embeddings.md

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ ms.date: 07/01/2024
# What are vector embeddings?
-Vectors, also known as embeddings or vector embeddings, are mathematical representations of data in a high-dimensional space. They represent various types of information — text, images, audio — a format that machine learning models can process. When an AI model receives text input, it first tokenizes the text into tokens. Each token is then converted into its corresponding embedding. This conversion process can done using an embedding generation model, such as [Azure OpenAI Embeddings](../../ai-services/openai/how-to/embeddings.md) or [Hugging Face on Azure](https://azure.microsoft.com/solutions/hugging-face-on-azure). The model processes these embeddings through multiple layers, capturing complex patterns and relationships within the text. The output embeddings can then be converted back into tokens if needed, generating readable text.
+Vectors, also known as embeddings or vector embeddings, are mathematical representations of data in a high-dimensional space. They represent various types of information — text, images, audio — a format that machine learning models can process. When an AI model receives text input, it first tokenizes the text into tokens. Each token is then converted into its corresponding embedding. This conversion process can be done using an embedding generation model, such as [Azure OpenAI Embeddings](../../ai-services/openai/how-to/embeddings.md) or [Hugging Face on Azure](https://azure.microsoft.com/solutions/hugging-face-on-azure). The model processes these embeddings through multiple layers, capturing complex patterns and relationships within the text. The output embeddings can then be converted back into tokens if needed, generating readable text.
Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space is correlated with semantic similarity between two inputs in the original format. For example, if two texts are similar, then their vector representations should also be similar. These high-dimensional representations capture semantic meaning, making it easier to perform tasks like searching, clustering, and classifying.
@@ -23,7 +23,7 @@ Each box containing floating-point numbers corresponds to a dimension, and each
Between the two vectors in the above example, some dimensions are similar while other dimensions are different, which are due to the similarities and differences in the meaning of the two phrases.
-This image shows the spatial closeness of vectors that are similar, constrasting vectors that are drastically different:
+This image shows the spatial closeness of vectors that are similar, contrasting vectors that are drastically different:
:::image type="content" source="../media/gen-ai/concepts/vector-closeness.png" lightbox="../media/gen-ai/concepts/vector-closeness.png" alt-text="Screenshot of vector closeness.":::
Image source: [OpenAI](https://openai.com/index/introducing-text-and-code-embeddings/)
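
The article text in this diff states that the distance between two embeddings correlates with the semantic similarity of the original inputs. A minimal sketch of that idea follows, assuming cosine similarity as the measure and using made-up 4-dimensional vectors in place of real model output (real embeddings typically have hundreds or thousands of dimensions). The names `cosine_similarity`, `cat`, `kitten`, and `invoice` are hypothetical and do not come from the article.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Illustrative values only: these are NOT produced by an embedding model.
cat = [0.8, 0.1, 0.6, 0.2]
kitten = [0.7, 0.2, 0.7, 0.1]
invoice = [-0.5, 0.9, -0.1, 0.4]

sim_close = cosine_similarity(cat, kitten)   # vectors pointing in similar directions
sim_far = cosine_similarity(cat, invoice)    # vectors pointing in different directions

print(f"cat vs kitten:  {sim_close:.3f}")
print(f"cat vs invoice: {sim_far:.3f}")
```

With these toy values the first pair scores much higher than the second, which mirrors the "spatial closeness" the image in the article is meant to convey. Other measures, such as Euclidean distance or dot product, could be swapped in under the same assumptions.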
