Commit 1e43265

Merge pull request #247984 from HeidiSteen/heidist-vectors
[azure search] tabs for placeholder content
2 parents 0b00ba6 + 4183a43 commit 1e43265

4 files changed

+82
-38
lines changed

articles/search/vector-search-how-to-create-index.md

Lines changed: 47 additions & 25 deletions
@@ -15,27 +15,21 @@ ms.date: 08/10/2023
1515
> [!IMPORTANT]
1616
> Vector search is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). It's available through the Azure portal, preview REST API, and [beta client libraries](https://github.com/Azure/cognitive-search-vector-pr#readme).
1717
18-
In Azure Cognitive Search, vector data is indexed as *vector fields* in a [search index](search-what-is-an-index.md), using a *vector configuration* to specify the embedding space. Do this to create an index schema that contains vector data:
18+
In Azure Cognitive Search, vector data is indexed as *vector fields* in a [search index](search-what-is-an-index.md), using a *vector configuration* to specify the embedding space. Follow these steps to index vector data:
1919

20-
+ Add one or more vector fields of type `Collection(Edm.Single)`. This type holds single-precision floating-point values. A field of this type also has a "dimensions" property and a "vectorConfiguration" property.
21-
22-
+ Add one or more vector configurations. A configuration specifies the algorithm and parameters used during indexing to create "nearest neighbor" information among the vector nodes. Currently, only Hierarchical Navigable Small World (HNSW) is supported.
23-
24-
During indexing, HNSW determines how closely the vectors match and stores the neighborhood information as a proximity graph in the index. You can have multiple configurations within an index if you want different HNSW parameter combinations. As long as the vector fields contain embeddings from the same model, having a different vector configuration per field has no effect on queries.
25-
26-
[Loading the index with vector data](#load-vector-data-for-indexing) is a separate step that can occur once the index definition is in place.
20+
> [!div class="checklist"]
21+
> + Add one or more vector fields to the index schema.
22+
> + Add one or more vector configurations to the index schema.
23+
> + Load the index with vector data [as a separate step](#load-vector-data-for-indexing), after the index schema is defined.
2724
2825
## Prerequisites
2926

3027
+ Azure Cognitive Search, in any region and on any tier. Most existing services support vector search. For a small subset of services created prior to January 2019, an index containing vector fields fails on creation. In this situation, a new service must be created.
3128

32-
+ Pre-existing vector embeddings in your source documents. Cognitive Search doesn't generate vectors. We recommend [Azure OpenAI embedding models](/azure/ai-services/openai/concepts/models#embeddings-models) but you can use any model for vectorization.
29+
+ Pre-existing vector embeddings in your source documents. Cognitive Search doesn't generate vectors. We recommend [Azure OpenAI embedding models](/azure/ai-services/openai/concepts/models#embeddings-models) but you can use any model for vectorization. For more information, see [Create and use embeddings for search queries and documents](vector-search-how-to-generate-embeddings.md).
3330

3431
+ You should know the dimensions limit of the model used to create the embeddings and how similarity is computed. In Azure OpenAI, for **text-embedding-ada-002**, the length of the numerical vector is 1536. Similarity is computed using `cosine`.
3532

36-
> [!NOTE]
37-
> During query execution, your workflow must call an embedding model that converts the user's query string into a vector. Be sure to use the same embedding model for both queries and indexing. For more information, see [Create and use embeddings for search queries and documents](vector-search-how-to-generate-embeddings.md).
38-
3933
## Prepare documents for indexing
4034

4135
Prior to indexing, assemble a document payload that includes fields of vector and non-vector data. The document structure must conform to the index schema.
@@ -56,13 +50,21 @@ A short example of a documents payload that includes vector and non-vector field
5650

5751
## Add a vector field to the fields collection
5852

59-
The schema must include a `vectorConfiguration`` section, a field for the document key, vector fields, and any other fields that you require for hybrid search scenarios.
53+
The schema must include a `vectorConfiguration` section, a field for the document key, vector fields, and any other fields that you need for hybrid search scenarios.
54+
55+
+ `vectorConfiguration` specifies the algorithm and parameters used during indexing to create "nearest neighbor" information among the vector nodes. Currently, only Hierarchical Navigable Small World (HNSW) is supported.
56+
57+
+ Vector fields are of type `Collection(Edm.Single)`, which holds single-precision floating-point values. A field of this type also has a `dimensions` property and a `vectorConfiguration` property.
58+
59+
During indexing, HNSW determines how closely the vectors match and stores the neighborhood information as a proximity graph in the index. You can have multiple configurations within an index if you want different HNSW parameter combinations. As long as the vector fields contain embeddings from the same model, having a different vector configuration per field has no effect on queries.
60+
61+
You can use the Azure portal, REST APIs, or the beta packages of the Azure SDKs to index vectors.
6062

6163
### [**Azure portal**](#tab/portal-add-field)
6264

63-
You can use the index designer in the Azure portal to add vector field definitions. If the index doesn't have a vector configuration, you're prompted to create one when you add your first vector field to the index.
65+
Use the index designer in the Azure portal to add vector field definitions. If the index doesn't have a vector configuration, you're prompted to create one when you add your first vector field to the index.
6466

65-
Although you can add a field definition, there's no portal support for loading vectors into fields. Use the REST APIs or an SDK for data import.
67+
Although you can add a field to an index, there's no portal (Import data wizard) support for loading it with vector data. Instead, use the REST APIs or an SDK for data import.
6668

6769
1. [Sign in to Azure portal](https://portal.azure.com) and open your search service page in a browser.
6870

@@ -95,15 +97,15 @@ Although you can add a field definition, there's no portal support for loading v
9597
+ "efSearch" default is 500. It's the number of nearest neighbors used during search.
9698
+ "Similarity metric" should be "cosine" if you're using Azure OpenAI, otherwise use the similarity metric of the embedding model. Supported values are `cosine`, `dotProduct`, `euclidean`.
9799

98-
If you're familiar with HNSW parameters, you might be wondering about "k" number of nearest neighbors to return in the result. In Cognitive Search, that value is set on the query request.
100+
If you're familiar with HNSW parameters, you might be wondering about how to set the "k" number of nearest neighbors to return in the result. In Cognitive Search, that value is set on the [query request](vector-search-how-to-query.md).
99101

100102
1. Select **Save** to save the vector configuration and the field definition.
101103

102104
### [**REST API**](#tab/rest-add-field)
103105

104-
In the following example, "title" and "content" contain textual content used in full text search and semantic search, while "titleVector" and "contentVector" contain vector data.
106+
Use the **2023-07-01-Preview** REST API for vector scenarios. If you're updating an existing index to include vector fields, make sure the `allowIndexDowntime` query parameter is set to `true`.
105107

106-
Updating an existing index with vector fields requires `allowIndexDowntime` query parameter to be `true`.
108+
In the following REST API example, "title" and "content" contain textual content used in full text search and semantic search, while "titleVector" and "contentVector" contain vector data.
107109

108110
1. Use the [Create or Update Index Preview REST API](/rest/api/searchservice/preview-api/create-or-update-index) to create the index.
109111

@@ -202,13 +204,31 @@ Updating an existing index with vector fields requires `allowIndexDowntime` quer
202204
}
203205
```
204206
207+
### [**.NET**](#tab/dotnet-add-field)
208+
209+
+ Use the [**Azure.Search.Documents 11.5.0-beta.4**](https://www.nuget.org/packages/Azure.Search.Documents/11.5.0-beta.4) package for vector scenarios.
210+
211+
+ See the [cognitive-search-vector-pr](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-dotnet) GitHub repository for .NET code samples.
212+
213+
### [**Python**](#tab/python-add-field)
214+
215+
+ Use the [**azure-search-documents 11.4.0b8**](https://pypi.org/project/azure-search-documents/11.4.0b8/) package for vector scenarios.
216+
217+
+ See the [cognitive-search-vector-pr](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-python) GitHub repository for Python code samples.
218+
219+
### [**JavaScript**](#tab/js-add-field)
220+
221+
+ Use the [**@azure/search-documents 12.0.0-beta.2**](https://www.npmjs.com/package/@azure/search-documents/v/12.0.0-beta.2) package for vector scenarios.
222+
223+
+ See the [cognitive-search-vector-pr](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-javascript) GitHub repository for JavaScript code samples.
224+
205225
---
206226
207227
## Load vector data for indexing
208228
209-
Content that you provide for indexing must conform to the index schema and include a unique string value for the document key. Vector data is loaded into one or more vector fields, which can coexist with other fields containing alphanumeric text.
229+
Content that you provide for indexing must conform to the index schema and include a unique string value for the document key. Vector data is loaded into one or more vector fields, which can coexist with other fields containing alphanumeric content.
210230
211-
You can use either [push or pull methodologies](search-what-is-data-import.md) for data ingestion. You can't use the portal for this step.
231+
You can use either [push or pull methodologies](search-what-is-data-import.md) for data ingestion. You can't use the portal (Import data wizard) for this step.
212232
213233
### [**Push APIs**](#tab/push)
214234
@@ -279,13 +299,15 @@ For validation purposes, you can query the index using Search Explorer in Azure
279299

280300
Fields must be attributed as "retrievable" to be included in the results.
281301

282-
### [**Azure portal**](#tab/portal-add-field)
302+
### [**Azure portal**](#tab/portal-check-index)
283303

284-
You can use [Search Explorer](search-explorer.md) to query an index. Search explorer has two views: Query view (default) and JSON view. The default query view is for full text search only. You can issue an empty search (`search=*`) to return all fields, including vector fields, as a quick check to confirm the presence of vector content.
304+
You can use [Search Explorer](search-explorer.md) to query an index. Search explorer has two views: Query view (default) and JSON view.
285305

286-
If you want to execute a vector query, use the JSON view and paste in a JSON definition of a vector query. For more information, see [Query vector data in a search index](vector-search-how-to-query.md).
306+
+ [Use the JSON view for vector queries](vector-search-how-to-query.md), pasting in a JSON definition of the vector query you want to execute.
287307

288-
### [**REST API**](#tab/rest-add-field)
308+
+ Use the default Query view for a quick confirmation that the index contains vectors. The query view is for full text search. Although you can't use it for vector queries, you can send an empty search (`search=*`) to check for content. The content of all fields, including vector fields, is returned as plain text.
309+
310+
### [**REST API**](#tab/rest-check-index)
289311

290312
The following REST API example is a vector query, but it returns only non-vector fields (title, content, category). Only fields marked as "retrievable" can be returned in search results.
291313

@@ -315,4 +337,4 @@ api-key: {{admin-api-key}}
315337

316338
As a next step, we recommend [Query vector data in a search index](vector-search-how-to-query.md).
317339

318-
You might also consider reviewing the demo code for [Python](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-python) or [C#](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-dotnet).
340+
You might also consider reviewing the demo code for [Python](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-python), [C#](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-dotnet) or [JavaScript](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-javascript).
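
Reader's note on the index article changes above: the prerequisites state that the embedding length must match the field's dimensions (1536 for **text-embedding-ada-002**) and that similarity is computed with `cosine`. The following is a minimal sketch of both ideas in plain Python, not part of the commit. The helper names `validate_document` and `cosine_similarity`, and the sample field names, are hypothetical, and nothing here calls the search service.

```python
import math

# Length of a text-embedding-ada-002 vector, as stated in the article.
EXPECTED_DIMENSIONS = 1536


def validate_document(doc, vector_fields, dimensions=EXPECTED_DIMENSIONS):
    """Return a list of problems found in one document payload (hypothetical helper)."""
    problems = []
    for name in vector_fields:
        vector = doc.get(name)
        if vector is None:
            problems.append(f"{name}: missing")
        elif len(vector) != dimensions:
            problems.append(f"{name}: expected {dimensions} values, got {len(vector)}")
    return problems


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Running a check like this before pushing documents may catch a mismatch between the embedding model and the field definition early, since the document structure must conform to the index schema.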

articles/search/vector-search-how-to-query.md

Lines changed: 26 additions & 4 deletions
@@ -15,9 +15,9 @@ ms.date: 08/10/2023
1515
> [!IMPORTANT]
1616
> Vector search is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). It's available through the Azure portal, preview REST API, and [beta client libraries](https://github.com/Azure/cognitive-search-vector-pr#readme).
1717
18-
In Azure Cognitive Search, if you added vector fields to a search index, this article explains how to query those fields. It also explains how to combine vector queries with full text search and semantic search for hybrid query combination scenarios.
18+
In Azure Cognitive Search, if you added vector fields to a search index, this article explains how to query those fields. It also explains how to combine vector queries with full text search and semantic search for *hybrid query* combination scenarios.
1919

20-
Query execution in Cognitive Search doesn't include vector conversion of the input string. Encoding (text-to-vector) of the query string requires that you pass the text to an embedding model for vectorization. You would then pass the output of the call to the embedding model to the search engine for similarity search over vector fields.
20+
Cognitive Search doesn't provide built-in vectorization of the input string. Encoding (text-to-vector) of the query string requires that you pass the string to an embedding model for vectorization. You would then pass the output of the call to the embedding model to the search engine for similarity search over vector fields.
2121

2222
All results are returned in plain text, including vectors. If you use Search Explorer in the Azure portal to query an index that contains vectors, the numeric vectors are returned in plain text. Because numeric vectors aren't useful in search results, choose other fields in the index as a proxy for the vector match. For example, if an index has "descriptionVector" and "descriptionText" fields, the query can match on "descriptionVector" but the search result shows "descriptionText". Use the `select` parameter to specify only human-readable fields in the results.
2323

@@ -45,7 +45,9 @@ You can also send an empty query (`search=*`) against the index. If the vector f
4545

4646
To query a vector field, the query itself must be a vector. To convert a text query string provided by a user into a vector representation, your application must call an embedding library that provides this capability. Use the same embedding library that you used to generate embeddings in the source documents.
4747

48-
Here's an example of a query string submitted to a deployment of an Azure OpenAI model:
48+
You can find multiple instances of query string conversion in the [cognitive-search-vector-pr](https://github.com/Azure/cognitive-search-vector-pr/) repository for each of the Azure SDKs.
49+
50+
Here's a REST API example of a query string submitted to a deployment of an Azure OpenAI model:
4951

5052
```http
5153
POST https://{{openai-service-name}}.openai.azure.com/openai/deployments/{{openai-deployment-name}}/embeddings?api-version={{openai-api-version}}
@@ -86,6 +88,8 @@ The actual response for this POST call to the deployment model includes 1536 emb
8688

8789
## Query syntax for vector search
8890

91+
You can use the Azure portal, REST APIs, or the beta packages of the Azure SDKs to query vectors.
92+
8993
### [**Azure portal**](#tab/portal-vector-query)
9094

9195
Be sure to use the **JSON view** and formulate the query in JSON. The search bar in **Query view** is for full text search and will treat any vector input as plain text.
@@ -100,7 +104,7 @@ Be sure to use the **JSON view** and formulate the query in JSON. The search bar in
100104

101105
:::image type="content" source="media/vector-search-how-to-query/select-json-view.png" alt-text="Screenshot of the index list." border="true":::
102106

103-
1. By default, the search API is 2023-07-01-Preview. This is the correct API version for vector search.
107+
1. By default, the search API is **2023-07-01-Preview**. This is the correct API version for vector search.
104108

105109
1. Paste in a JSON vector query, and then select **Search**. You can use the REST example as a template for your JSON query.
106110

@@ -136,6 +140,24 @@ The response includes 5 matches, and each result provides a search score, title,
136140

137141
Notice that "select" returns textual fields from the index. Although the vector field is "retrievable" in this example, its content isn't usable as a search result.
138142

143+
### [**.NET**](#tab/dotnet-vector-query)
144+
145+
+ Use the [**Azure.Search.Documents 11.5.0-beta.4**](https://www.nuget.org/packages/Azure.Search.Documents/11.5.0-beta.4) package for vector scenarios.
146+
147+
+ See the [cognitive-search-vector-pr](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-dotnet) GitHub repository for .NET code samples.
148+
149+
### [**Python**](#tab/python-vector-query)
150+
151+
+ Use the [**azure-search-documents 11.4.0b8**](https://pypi.org/project/azure-search-documents/11.4.0b8/) package for vector scenarios.
152+
153+
+ See the [cognitive-search-vector-pr](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-python) GitHub repository for Python code samples.
154+
155+
### [**JavaScript**](#tab/js-vector-query)
156+
157+
+ Use the [**@azure/search-documents 12.0.0-beta.2**](https://www.npmjs.com/package/@azure/search-documents/v/12.0.0-beta.2) package for vector scenarios.
158+
159+
+ See the [cognitive-search-vector-pr](https://github.com/Azure/cognitive-search-vector-pr/tree/main/demo-javascript) GitHub repository for JavaScript code samples.
160+
139161
---
140162

141163
## Query syntax for hybrid search
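
Reader's note on the query article changes above: the text says the query itself must be a vector, that "k" is set on the query request, and that `select` should name only human-readable fields. A rough sketch of assembling such a request body in Python follows. It isn't part of the commit. The function `build_vector_query` is hypothetical, and the JSON key names (`vector`, `value`, `fields`, `k`, `select`, `search`) are an assumption about the **2023-07-01-Preview** request shape that should be checked against the REST reference. Passing `search_text` alongside the vector is how a hybrid request is assumed to look.

```python
def build_vector_query(query_vector, vector_field, k=5, select=None, search_text=None):
    """Assemble a JSON-serializable query body (hypothetical helper, assumed key names)."""
    if not query_vector:
        raise ValueError("query_vector must be a non-empty list of floats")
    body = {
        # The embedding of the user's query string, produced by the same
        # model that generated the embeddings in the indexed documents.
        "vector": {"value": list(query_vector), "fields": vector_field, "k": k},
    }
    if select:
        # Return only human-readable fields; raw vectors aren't useful in results.
        body["select"] = ", ".join(select)
    if search_text is not None:
        # Assumed hybrid form: full text search combined with the vector query.
        body["search"] = search_text
    return body
```

The resulting dictionary can be serialized with `json.dumps` and pasted into the JSON view of Search Explorer or sent as the body of a REST call.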
