articles/search/vector-search-how-to-generate-embeddings.md
Lines changed: 8 additions & 6 deletions
@@ -20,17 +20,17 @@ Azure AI Search doesn't host embedding models, so you're responsible for creatin
 | Approach | Description |
 | --- | --- |
 | [Integrated vectorization](vector-search-integrated-vectorization.md) | Use built-in data chunking and vectorization in Azure AI Search. This approach takes a dependency on indexers, skillsets, and built-in or custom skills that point to external embedding models, such as those in Azure AI Foundry. |
-| Manual vectorization | Manage data chunking and vectorization yourself. For indexing, you [push prevectorized documents](vector-search-how-to-create-index.md#load-vector-data-for-indexing) into vector fields in a search index. For querying, you provide precomputed vectors to the search engine. For demos of this approach, see the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples/tree/main) GitHub repository. |
+| Manual vectorization | Manage data chunking and vectorization yourself. For indexing, you [push prevectorized documents](vector-search-how-to-create-index.md#load-vector-data-for-indexing) into vector fields in a search index. For querying, you [provide precomputed vectors](#generate-an-embedding-for-an-improvised-query) to the search engine. For demos of this approach, see the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples/tree/main) GitHub repository. |

-We recommend integrated vectorization for most scenarios. Although you can use any supported embedding model, this article uses Azure OpenAI embedding models for illustration.
+We recommend integrated vectorization for most scenarios. Although you can use any supported embedding model, this article uses Azure OpenAI models for illustration.

 ## How embedding models are used in vector queries

 Embedding models generate vectors for both [query inputs](#query-inputs) and [query outputs](#query-outputs).

 ### Query inputs

-Query inputs include the following:
+Query inputs include:

 + **Text or images that are converted to vectors during query processing**. As part of integrated vectorization, a [vectorizer](vector-search-how-to-configure-vectorizer.md) performs this task.
@@ -46,11 +46,11 @@ Your search index must already contain documents with one or more vector fields
 + **Identify use cases**. Evaluate specific use cases where embedding model integration for vector search features adds value to your search solution. Examples include [multimodal search](multimodal-search-overview.md) or matching image content with text content, multilingual search, and similarity search.

-+ **Design a chunking strategy**. Embedding models have limits on the number of tokens they accept, so data chunking is necessary for large files. For more information, see [Chunk large documents for vector search solutions](vector-search-how-to-chunk-documents.md).
++ **Design a chunking strategy**. Embedding models have limits on the number of tokens they accept, so [data chunking](vector-search-how-to-chunk-documents.md) is necessary for large files.

 + **Optimize cost and performance**. Vector search is resource intensive and subject to maximum limits, so vectorize only the fields that contain semantic meaning. [Reduce vector size](vector-search-how-to-configure-compression-storage.md) to store more vectors for the same price.

-+ **Choose the right embedding model**. Select a model for your use case, such as word embeddings for text-based searches or image embeddings for visual searches. Consider pretrained models, such as text-embedding-ada-002 from OpenAI or the Image Retrieval REST API from [Azure AI Computer Vision](/azure/ai-services/computer-vision/how-to/image-retrieval).
++ **Choose the right embedding model**. Select a model for your use case, such as word embeddings for text-based searches or image embeddings for visual searches. Consider pretrained models, such as text-embedding-ada-002 from OpenAI or the Image Retrieval REST API from [Azure AI Vision](/azure/ai-services/computer-vision/how-to/image-retrieval).

 + **Normalize vector lengths**. To improve the accuracy and performance of similarity search, normalize vector lengths before you store them in a search index. Most pretrained models are already normalized.
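
The "Normalize vector lengths" item above can be illustrated with a minimal sketch. It assumes that "normalize" means scaling each vector to unit L2 length, which is the usual convention for cosine similarity. The function name `l2_normalize` and the toy two-dimensional vector are hypothetical and stand in for a real embedding.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector so that its L2 (Euclidean) length is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction, so return a copy unchanged
        # instead of dividing by zero.
        return list(vector)
    return [x / norm for x in vector]


# Toy stand-in for an embedding returned by a model (assumption).
embedding = [3.0, 4.0]
unit = l2_normalize(embedding)
print(unit)  # [0.6, 0.8]
```

If a model already returns unit-length vectors, applying this step again leaves them effectively unchanged, so it is safe to run as a guard before indexing.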
@@ -79,8 +79,8 @@ When you add knowledge to an agent workflow in the [Azure AI Foundry portal](htt
 One step involves selecting an embedding model to vectorize your plain text content. The following models are supported:

-+ text-embedding-3-large
 + text-embedding-3-small
++ text-embedding-3-large
 + text-embedding-ada-002
 + Cohere-embed-v3-english
 + Cohere-embed-v3-multilingual
@@ -237,6 +237,8 @@ POST https://YOUR-OPENAI-RESOURCE.openai.azure.com/openai/deployments/text-embed
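
The hunk header above is truncated, so the exact request in the article isn't visible here. As a rough sketch only, the following Python builds a comparable embeddings request without sending it. The deployment name, API key, query text, and the `api-version` value are placeholders and assumptions, not values from the article, and `build_embedding_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_embedding_request(resource, deployment, api_key, text,
                            api_version="2024-02-01"):
    """Build (but don't send) a POST request for an Azure OpenAI embedding.

    The api_version default is an assumption; use a version that your
    resource supports.
    """
    url = (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/embeddings?api-version={api_version}"
    )
    body = json.dumps({"input": text}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# All argument values below are placeholders (assumptions).
request = build_embedding_request(
    "YOUR-OPENAI-RESOURCE",
    "YOUR-DEPLOYMENT-NAME",
    "YOUR-API-KEY",
    "example query text",
)

# To send the request and read the vector, something like:
#   with urllib.request.urlopen(request) as response:
#       embedding = json.load(response)["data"][0]["embedding"]
```

Separating request construction from sending keeps the sketch runnable offline and makes the URL, headers, and body easy to inspect before any call is made.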