
Commit 1cbb71f

Merge pull request #6375 from haileytap/vectors
[Azure Search] Refactor vector-search-how-to-generate-embeddings.md
2 parents 9dc60e3 + e0e3494 commit 1cbb71f

File tree

1 file changed: +185 −41 lines changed

Lines changed: 185 additions & 41 deletions
@@ -1,5 +1,5 @@
  ---
- title: Generate embeddings
+ title: Generate Embeddings
  titleSuffix: Azure AI Search
  description: Learn how to generate embeddings for downstream indexing into an Azure AI Search index.
  author: haileytap
@@ -9,38 +9,196 @@ ms.update-cycle: 180-days
  ms.custom:
  - ignite-2023
  ms.topic: how-to
- ms.date: 06/11/2025
+ ms.date: 08/06/2025
  ---
  
  # Generate embeddings for search queries and documents
  
- Azure AI Search doesn't host embedding models, so one of your challenges is creating vectors for query inputs and outputs. You can use any supported embedding model, but this article assumes Azure OpenAI embedding models for illustration.
+ Azure AI Search doesn't host embedding models, so you're responsible for creating vectors for query inputs and outputs. Choose one of the following approaches:
  
- We recommend [integrated vectorization](vector-search-integrated-vectorization.md), which provides built-in data chunking and vectorization. Integrated vectorization takes a dependency on indexers, skillsets, and built-in or custom skills that point to a model that executes externally from Azure AI Search. Several built-in skills point to embedding models in Azure AI Foundry, which makes integrated vectorization your easiest solution for solving the embedding challenge.
+ | Approach | Description |
+ | --- | --- |
+ | [Integrated vectorization](vector-search-integrated-vectorization.md) | Use built-in data chunking and vectorization in Azure AI Search. This approach takes a dependency on indexers, skillsets, and built-in or custom skills that point to external embedding models, such as those in Azure AI Foundry. |
+ | Manual vectorization | Manage data chunking and vectorization yourself. For indexing, you [push prevectorized documents](vector-search-how-to-create-index.md#load-vector-data-for-indexing) into vector fields in a search index. For querying, you [provide precomputed vectors](#generate-an-embedding-for-an-improvised-query) to the search engine. For demos of this approach, see the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples/tree/main) GitHub repository. |
  
- If you want to handle data chunking and vectorization yourself, we provide demos in the [sample repository](https://github.com/Azure/azure-search-vector-samples/tree/main) that show you how to integrate with other community solutions.
+ We recommend integrated vectorization for most scenarios. Although you can use any supported embedding model, this article uses Azure OpenAI models for illustration.
  
  ## How embedding models are used in vector queries
  
- + Query inputs are either vectors, or text or images that are converted to vectors during query processing. The built-in solution in Azure AI Search is to use a vectorizer.
+ Embedding models generate vectors for both query inputs and query outputs. Query inputs include:
  
- Alternatively, you can also handle the conversion yourself by passing the query input to an embedding model of your choice. To avoid [rate limiting](/azure/ai-services/openai/quotas-limits), you can implement retry logic in your workload. For the Python demo, we used [tenacity](https://pypi.org/project/tenacity/).
+ + **Text or images that are converted to vectors during query processing**. As part of integrated vectorization, a [vectorizer](vector-search-how-to-configure-vectorizer.md) performs this task.
  
- + Query outputs are any matching documents found in a search index. Your search index must have been previously loaded with documents having one or more vector fields with embeddings. Whatever embedding model you used for indexing, use that same model for queries.
+ + **Precomputed vectors**. You can generate these vectors by passing the query input to an embedding model of your choice. To avoid [rate limiting](/azure/ai-services/openai/quotas-limits), implement retry logic in your workload. Our [Python demo](https://github.com/Azure/azure-search-vector-samples/tree/93c839591bf92c2f10001d287871497b0f204a7c/demo-python) uses [tenacity](https://pypi.org/project/tenacity/).
+ 
+ Based on the query input, the search engine retrieves matching documents from your search index. These documents are the query outputs.
+ 
+ Your search index must already contain documents with one or more vector fields populated by embeddings. You can create these embeddings through integrated or manual vectorization. To ensure accurate results, use the same embedding model for indexing and querying.
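The retry logic recommended for rate-limited embedding calls can be sketched in plain Python. This is a minimal stand-in for what a library such as tenacity provides; `fake_embed` is a hypothetical flaky call standing in for a real embeddings request, not part of the article.

```python
import time

def retry_with_backoff(fn, max_attempts=5, base_delay=1.0):
    # Retry fn with exponential backoff, the pattern used to absorb
    # 429 (rate limit) responses from an embeddings endpoint.
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical stand-in for an embeddings call that fails twice, then succeeds.
calls = {"count": 0}
def fake_embed():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return [0.1, 0.2, 0.3]

print(retry_with_backoff(fake_embed, base_delay=0.01))  # prints [0.1, 0.2, 0.3] after two retries
```

A library like tenacity adds jitter, typed exception filters, and decorators on top of this basic loop.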
+ 
+ ## Tips for embedding model integration
+ 
+ + **Identify use cases**. Evaluate specific use cases where embedding model integration for vector search features adds value to your search solution. Examples include [multimodal search](multimodal-search-overview.md) or matching image content with text content, multilingual search, and similarity search.
+ 
+ + **Design a chunking strategy**. Embedding models have limits on the number of tokens they accept, so [data chunking](vector-search-how-to-chunk-documents.md) is necessary for large files.
+ 
+ + **Optimize cost and performance**. Vector search is resource intensive and subject to maximum limits, so vectorize only the fields that contain semantic meaning. [Reduce vector size](vector-search-how-to-configure-compression-storage.md) to store more vectors for the same price.
+ 
+ + **Choose the right embedding model**. Select a model for your use case, such as word embeddings for text-based searches or image embeddings for visual searches. Consider pretrained models, such as text-embedding-ada-002 from OpenAI or the Image Retrieval REST API from [Azure AI Vision](/azure/ai-services/computer-vision/how-to/image-retrieval).
+ 
+ + **Normalize vector lengths**. To improve the accuracy and performance of similarity search, normalize vector lengths before you store them in a search index. Most pretrained models are already normalized.
+ 
+ + **Fine-tune the model**. If needed, fine-tune the model on your domain-specific data to improve its performance and relevance to your search application.
+ 
+ + **Test and iterate**. Continuously test and refine the embedding model integration to achieve your desired search performance and user satisfaction.
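The normalization tip above can be illustrated with a short sketch. This is pure Python for clarity; a real pipeline would typically use numpy. A unit-length vector makes cosine similarity equivalent to a dot product.

```python
import math

def normalize(vector):
    # Scale a vector to unit length so that cosine similarity
    # reduces to a plain dot product at query time.
    magnitude = math.sqrt(sum(x * x for x in vector))
    return [x / magnitude for x in vector]

unit = normalize([3.0, 4.0])
print(unit)  # [0.6, 0.8]
```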
  
  ## Create resources in the same region
  
  Although integrated vectorization with Azure OpenAI embedding models doesn't require resources to be in the same region, using the same region can improve performance and reduce latency.
  
- 1. [Check regions for a text embedding model](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability).
+ To use the same region for your resources:
+ 
+ 1. Check the [regional availability of text embedding models](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability).
+ 
+ 1. Check the [regional availability of Azure AI Search](search-region-support.md).
  
- 1. [Find the same region for Azure AI Search](search-region-support.md).
+ 1. Create an Azure OpenAI resource and Azure AI Search service in the same region.
+ 
+ > [!TIP]
+ > Want to use [semantic ranking](semantic-how-to-query-request.md) for [hybrid queries](hybrid-search-overview.md) or a machine learning model in a [custom skill](cognitive-search-custom-skill-interface.md) for [AI enrichment](cognitive-search-concept-intro.md)? Choose an Azure AI Search region that provides those features.
+ 
+ ## Choose an embedding model in Azure AI Foundry
+ 
+ When you add knowledge to an agent workflow in the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs), you have the option of creating a search index. A wizard guides you through the steps.
+ 
+ One step involves selecting an embedding model to vectorize your plain text content. The following models are supported:
+ 
+ + text-embedding-3-small
+ + text-embedding-3-large
+ + text-embedding-ada-002
+ + Cohere-embed-v3-english
+ + Cohere-embed-v3-multilingual
  
- 1. To support hybrid queries that include [semantic ranking](semantic-how-to-query-request.md), or if you want to try machine learning model integration using a [custom skill](cognitive-search-custom-skill-interface.md) in an [AI enrichment pipeline](cognitive-search-concept-intro.md), select an Azure AI Search region that provides those features.
+ Your model must already be deployed, and you must have permission to access it. For more information, see [Deployment overview for Azure AI Foundry Models](/azure/ai-foundry/concepts/deployments-overview).
  
  ## Generate an embedding for an improvised query
  
- The following Python code generates an embedding that you can paste into the "values" property of a vector query.
+ If you don't want to use integrated vectorization, you can manually generate an embedding and paste it into the `vectorQueries.vector` property of a vector query. For more information, see [Create a vector query in Azure AI Search](vector-search-how-to-query.md).
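A query that supplies such a precomputed embedding might look like the following sketch. The service name, index name, `contentVector` field, and truncated three-element vector are hypothetical placeholders; the request shape follows the stable Azure AI Search Search Documents REST API.

```http
POST https://YOUR-SEARCH-SERVICE.search.windows.net/indexes/YOUR-INDEX/docs/search?api-version=2024-07-01
Content-Type: application/json
api-key: YOUR-ADMIN-API-KEY

{
  "select": "title, content",
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [0.0124, -0.0356, 0.0281],
      "fields": "contentVector",
      "k": 5
    }
  ]
}
```

In practice the `vector` array carries the full embedding, for example 1,536 floats for text-embedding-ada-002.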
+ 
+ The following examples assume the text-embedding-ada-002 model. Replace `YOUR-API-KEY` and `YOUR-OPENAI-RESOURCE` with your Azure OpenAI resource details.
+ 
+ ### [.NET](#tab/dotnet)
+ 
+ ```csharp
+ using System;
+ using System.Net.Http;
+ using System.Text;
+ using System.Threading.Tasks;
+ using Newtonsoft.Json;
+ 
+ class Program
+ {
+     static async Task Main(string[] args)
+     {
+         var apiKey = "YOUR-API-KEY";
+         var apiBase = "https://YOUR-OPENAI-RESOURCE.openai.azure.com";
+         var apiVersion = "2024-02-01";
+         var engine = "text-embedding-ada-002";
+ 
+         var client = new HttpClient();
+         client.DefaultRequestHeaders.Add("Authorization", $"Bearer {apiKey}");
+ 
+         var requestBody = new
+         {
+             input = "How do I use C# in VS Code?"
+         };
+ 
+         var response = await client.PostAsync(
+             $"{apiBase}/openai/deployments/{engine}/embeddings?api-version={apiVersion}",
+             new StringContent(JsonConvert.SerializeObject(requestBody), Encoding.UTF8, "application/json")
+         );
+ 
+         var responseBody = await response.Content.ReadAsStringAsync();
+         Console.WriteLine(responseBody);
+     }
+ }
+ ```
+ 
+ ### [Java](#tab/java)
+ 
+ ```java
+ import java.net.HttpURLConnection;
+ import java.net.URL;
+ import java.io.OutputStream;
+ import java.io.BufferedReader;
+ import java.io.InputStreamReader;
+ 
+ public class Main {
+     public static void main(String[] args) {
+         String apiKey = "YOUR-API-KEY";
+         String apiBase = "https://YOUR-OPENAI-RESOURCE.openai.azure.com";
+         String engine = "text-embedding-ada-002";
+         String apiVersion = "2024-02-01";
+ 
+         try {
+             URL url = new URL(String.format("%s/openai/deployments/%s/embeddings?api-version=%s", apiBase, engine, apiVersion));
+             HttpURLConnection connection = (HttpURLConnection) url.openConnection();
+             connection.setRequestMethod("POST");
+             connection.setRequestProperty("Authorization", "Bearer " + apiKey);
+             connection.setRequestProperty("Content-Type", "application/json");
+             connection.setDoOutput(true);
+ 
+             String requestBody = "{\"input\": \"How do I use Java in VS Code?\"}";
+ 
+             try (OutputStream os = connection.getOutputStream()) {
+                 os.write(requestBody.getBytes());
+             }
+ 
+             try (BufferedReader br = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
+                 StringBuilder response = new StringBuilder();
+                 String line;
+                 while ((line = br.readLine()) != null) {
+                     response.append(line);
+                 }
+                 System.out.println(response);
+             }
+         } catch (Exception e) {
+             e.printStackTrace();
+         }
+     }
+ }
+ ```
+ 
+ ### [JavaScript](#tab/javascript)
+ 
+ ```javascript
+ const apiKey = "YOUR-API-KEY";
+ const apiBase = "https://YOUR-OPENAI-RESOURCE.openai.azure.com";
+ const engine = "text-embedding-ada-002";
+ const apiVersion = "2024-02-01";
+ 
+ async function generateEmbedding() {
+   const response = await fetch(
+     `${apiBase}/openai/deployments/${engine}/embeddings?api-version=${apiVersion}`,
+     {
+       method: "POST",
+       headers: {
+         "Authorization": `Bearer ${apiKey}`,
+         "Content-Type": "application/json",
+       },
+       body: JSON.stringify({
+         input: "How do I use JavaScript in VS Code?",
+       }),
+     }
+   );
+ 
+   const data = await response.json();
+   console.log(data.data[0].embedding);
+ }
+ 
+ generateEmbedding();
+ ```
+ 
+ ### [Python](#tab/python)
  
  ```python
  !pip install openai
@@ -60,39 +218,25 @@ embeddings = response['data'][0]['embedding']
  print(embeddings)
  ```
  
- Output is a vector array of 1,536 dimensions.
+ ### [REST API](#tab/rest-api)
  
- ## Choose an embedding model in Azure AI Foundry
- 
- In the Azure AI Foundry portal, you have the option of creating a search index when you add knowledge to your agent workflow. A wizard guides you through the steps. When asked to provide an embedding model that vectorizes your plain text content, you can use one of the following supported models:
- 
- + text-embedding-3-large
- + text-embedding-3-small
- + text-embedding-ada-002
- + Cohere-embed-v3-english
- + Cohere-embed-v3-multilingual
- 
- Your model must already be deployed and you must have permission to access it. For more information, see [Deploy AI models in Azure AI Foundry portal](/azure/ai-foundry/concepts/deployments-overview).
- 
- ## Tips and recommendations for embedding model integration
- 
- + **Identify use cases**: Evaluate the specific use cases where embedding model integration for vector search features can add value to your search solution. This can include multimodal or matching image content with text content, multilingual search, or similarity search.
- 
- + **Design a chunking strategy**: Embedding models have limits on the number of tokens they can accept, which introduces a data chunking requirement for large files. For more information, see [Chunk large documents for vector search solutions](vector-search-how-to-chunk-documents.md).
- 
- + **Optimize cost and performance**: Vector search can be resource-intensive and is subject to maximum limits, so consider only vectorizing the fields that contain semantic meaning. [Reduce vector size](vector-search-how-to-configure-compression-storage.md) so that you can store more vectors for the same price.
- 
- + **Choose the right embedding model:** Select an appropriate model for your specific use case, such as word embeddings for text-based searches or image embeddings for visual searches. Consider using pretrained models like **text-embedding-ada-002** from OpenAI or **Image Retrieval** REST API from [Azure AI Computer Vision](/azure/ai-services/computer-vision/how-to/image-retrieval).
- 
- + **Normalize Vector lengths**: Ensure that the vector lengths are normalized before storing them in the search index to improve the accuracy and performance of similarity search. Most pretrained models already are normalized but not all.
+ ```http
+ POST https://YOUR-OPENAI-RESOURCE.openai.azure.com/openai/deployments/text-embedding-ada-002/embeddings?api-version=2024-02-01
+ Authorization: Bearer YOUR-API-KEY
+ Content-Type: application/json
+ 
+ {
+     "input": "How do I use REST APIs in VS Code?"
+ }
+ ```
  
- + **Fine-tune the model**: If needed, fine-tune the selected model on your domain-specific data to improve its performance and relevance to your search application.
+ ---
  
- + **Test and iterate**: Continuously test and refine your embedding model integration to achieve the desired search performance and user satisfaction.
+ The output is a vector array of 1,536 dimensions.
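The 1,536-dimension output has to match the dimension declared on the vector field in the index. A minimal sketch of a guard for that invariant; the `validate_embedding` helper and the zero vector are hypothetical illustrations, not part of the article.

```python
EXPECTED_DIMENSIONS = 1536  # text-embedding-ada-002 output size

def validate_embedding(embedding, expected=EXPECTED_DIMENSIONS):
    # Reject vectors whose size doesn't match the index field definition,
    # which would otherwise fail at indexing or query time.
    if len(embedding) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(embedding)}")
    return embedding

# Hypothetical stand-in vector (a real response carries 1,536 floats).
fake = [0.0] * 1536
print(len(validate_embedding(fake)))  # 1536
```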
  
- ## Next steps
+ ## Related content
  
- + [Understanding embeddings in Azure OpenAI in Azure AI Foundry Models](/azure/ai-services/openai/concepts/understand-embeddings)
- + [Learn how to generate embeddings](/azure/ai-services/openai/how-to/embeddings?tabs=console)
+ + [Understand embeddings in Azure OpenAI in Azure AI Foundry Models](/azure/ai-services/openai/concepts/understand-embeddings)
+ + [Generate embeddings with Azure OpenAI](/azure/ai-services/openai/how-to/embeddings?tabs=console)
  + [Tutorial: Explore Azure OpenAI embeddings and document search](/azure/ai-services/openai/tutorials/embeddings?tabs=command-line)
  + [Tutorial: Choose a model (RAG solutions in Azure AI Search)](tutorial-rag-build-solution-models.md)
