Commit ce7c481

Merge pull request #247966 from HeidiSteen/heidist-vectors
[azure search] vector query updates, proxy diagram
2 parents c1e8931 + e0af2d3 commit ce7c481

7 files changed

+56
-19
lines changed

articles/search/vector-search-how-to-create-index.md

Lines changed: 14 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -7,31 +7,31 @@ author: HeidiSteen
77
ms.author: heidist
88
ms.service: cognitive-search
99
ms.topic: how-to
10-
ms.date: 07/31/2023
10+
ms.date: 08/10/2023
1111
---
1212

1313
# Add vector fields to a search index
1414

1515
> [!IMPORTANT]
1616
> Vector search is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). It's available through the Azure portal, preview REST API, and [beta client libraries](https://github.com/Azure/cognitive-search-vector-pr#readme).
1717
18-
In Azure Cognitive Search, vector data is indexed as *vector fields* within a [search index](search-what-is-an-index.md), using a *vector configuration* to create the embedding space.
18+
In Azure Cognitive Search, vector data is indexed as *vector fields* in a [search index](search-what-is-an-index.md), using a *vector configuration* to specify the embedding space. To create an index schema that contains vector data:
1919

20-
+ A vector field is of type `Collection(Edm.Single)` so that it can hold single-precision floating-point values. It also has a "dimensions" property and a "vectorConfiguration" property.
20+
+ Add one or more vector fields of type `Collection(Edm.Single)`. This type holds single-precision floating-point values. A field of this type also has a "dimensions" property and a "vectorConfiguration" property.
2121

22-
+ A vector configuration specifies the algorithm and parameters used during indexing to create the proximity graph. Currently, only Hierarchical Navigable Small World (HNSW) is supported.
22+
+ Add one or more vector configurations. A configuration specifies the algorithm and parameters used during indexing to create "nearest neighbor" information among the vector nodes. Currently, only Hierarchical Navigable Small World (HNSW) is supported.
2323

24-
During indexing, HNSW determines how closely the vectors match and stores the neighborhood information among vectors in the index. You can have multiple configurations within an index if you want different HNSW parameter combinations. As long as the vector fields contain embeddings from the same model, having a different vector configuration per field has no effect on queries.
24+
During indexing, HNSW determines how closely the vectors match and stores the neighborhood information as a proximity graph in the index. You can have multiple configurations within an index if you want different HNSW parameter combinations. As long as the vector fields contain embeddings from the same model, having a different vector configuration per field has no effect on queries.
2525

26-
## Prerequisites
26+
[Loading the index with vector data](#load-vector-data-for-indexing) is a separate step that can occur once the index definition is in place.
2727

28-
+ Azure Cognitive Search, in any region and on any tier.
28+
## Prerequisites
2929

30-
Most existing services support vector search. For a small subset of services created prior to January 2019, an index containing vector fields fails on creation. In this situation, a new service must be created.
30+
+ Azure Cognitive Search, in any region and on any tier. Most existing services support vector search. For a small subset of services created prior to January 2019, an index containing vector fields fails on creation. In this situation, a new service must be created.
3131

3232
+ Pre-existing vector embeddings in your source documents. Cognitive Search doesn't generate vectors. We recommend [Azure OpenAI embedding models](/azure/ai-services/openai/concepts/models#embeddings-models) but you can use any model for vectorization.
3333

34-
+ You should know the dimensions limit of the model used to create the embeddings and how similarity is computed. For **text-embedding-ada-002**, the length of the numerical vector is 1536. Similarity is computed using `cosine`.
34+
+ You should know the dimensions limit of the model used to create the embeddings and how similarity is computed. In Azure OpenAI, for **text-embedding-ada-002**, the length of the numerical vector is 1536. Similarity is computed using `cosine`.
3535

3636
> [!NOTE]
3737
> During query execution, your workflow must call an embedding model that converts the user's query string into a vector. Be sure to use the same embedding model for both queries and indexing. For more information, see [Create and use embeddings for search queries and documents](vector-search-how-to-generate-embeddings.md).
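For reference, the schema that this hunk describes (a `Collection(Edm.Single)` field with a "dimensions" property, tied to a named HNSW configuration) can be sketched as a Python dictionary. This is a minimal sketch, not the article's own sample: the index, field, and configuration names are hypothetical, and the property names `vectorSearchConfiguration` and `vectorSearch.algorithmConfigurations` are assumptions about the preview REST API that should be checked against the REST reference. The helper `check_vector_fields` is also hypothetical; it only confirms that every vector field declares dimensions and points at a declared configuration.

```python
import json

# Assumption: 1536 is the vector length of text-embedding-ada-002, as stated in the article.
EMBEDDING_DIMENSIONS = 1536

# Hypothetical index definition. Names and preview property names are illustrative.
index_definition = {
    "name": "example-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "title", "type": "Edm.String", "searchable": True, "retrievable": True},
        {
            "name": "titleVector",
            "type": "Collection(Edm.Single)",  # single-precision floating-point values
            "searchable": True,
            "retrievable": True,
            "dimensions": EMBEDDING_DIMENSIONS,
            "vectorSearchConfiguration": "my-hnsw-config",  # assumed property name
        },
    ],
    "vectorSearch": {
        "algorithmConfigurations": [
            {"name": "my-hnsw-config", "kind": "hnsw", "hnswParameters": {"metric": "cosine"}}
        ]
    },
}


def check_vector_fields(definition):
    """Return a list of problems found in the vector fields of an index definition."""
    declared = {c["name"] for c in definition["vectorSearch"]["algorithmConfigurations"]}
    problems = []
    for field in definition["fields"]:
        if field["type"] != "Collection(Edm.Single)":
            continue  # only vector fields are checked
        dims = field.get("dimensions")
        if not isinstance(dims, int) or dims <= 0:
            problems.append(f"{field['name']}: missing or invalid dimensions")
        if field.get("vectorSearchConfiguration") not in declared:
            problems.append(f"{field['name']}: unknown vector configuration")
    return problems


if __name__ == "__main__":
    print(check_vector_fields(index_definition))
    print(json.dumps(index_definition, indent=2))
```

Serializing the dictionary with `json.dumps` gives a request body that could be adapted for a create-index call, once the property names are verified for the API version in use.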
@@ -56,7 +56,7 @@ A short example of a documents payload that includes vector and non-vector field
5656

5757
## Add a vector field to the fields collection
5858

59-
The schema must include fields for the document key, vector fields, and any other fields that you require for hybrid search scenarios.
59+
The schema must include a `vectorConfiguration` section, a field for the document key, vector fields, and any other fields that you require for hybrid search scenarios.
6060

6161
### [**Azure portal**](#tab/portal-add-field)
6262

@@ -277,13 +277,13 @@ Data sources provide the vectors in whatever format the data source supports (su
277277

278278
For validation purposes, you can query the index using Search Explorer in Azure portal or a REST API call. Because Cognitive Search can't convert a vector to human-readable text, try to return fields from the same document that provide evidence of the match. For example, if the vector query targets the "titleVector" field, you could select "title" for the search results.
279279

280-
### [**Azure portal**](#tab/portal-add-field)
280+
Fields must be attributed as "retrievable" to be included in the results.
281281

282-
You can use [Search Explorer](search-explorer.md) to query an index that contains vector fields. However, the query string in Search Explorer is plain text and isn't converted to a vector, so you can't use Search Explorer to test vector queries, but you can verify that data import occurred and that vector fields are populated with the expected numeric values.
282+
### [**Azure portal**](#tab/portal-add-field)
283283

284-
Fields must be attributed as "retrievable" to be included in the results.
284+
You can use [Search Explorer](search-explorer.md) to query an index. Search Explorer has two views: Query view (default) and JSON view. The default query view is for full text search only. You can issue an empty search (`search=*`) to return all fields, including vector fields, as a quick check to confirm the presence of vector content.
285285

286-
You can issue an empty search (`search=*`) to return all fields, including vector fields. You can also `$select` specific fields for the result set.
286+
If you want to execute a vector query, use the JSON view and paste in a JSON definition of a vector query. For more information, see [Query vector data in a search index](vector-search-how-to-query.md).
287287

288288
### [**REST API**](#tab/rest-add-field)
289289

articles/search/vector-search-how-to-query.md

Lines changed: 30 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ author: HeidiSteen
77
ms.author: heidist
88
ms.service: cognitive-search
99
ms.topic: how-to
10-
ms.date: 07/31/2023
10+
ms.date: 08/10/2023
1111
---
1212

1313
# Query vector data in a search index
@@ -27,7 +27,7 @@ All results are returned in plain text, including vectors. If you use Search Exp
2727

2828
+ A search index containing vector fields. See [Add vector fields to a search index](vector-search-how-to-create-index.md).
2929

30-
+ Use REST API version 2023-07-01-preview or Azure portal to query vector fields. You can also use [beta client libraries](https://github.com/Azure/cognitive-search-vector-pr/tree/main).
30+
+ Use REST API version 2023-07-01-preview, the [beta client libraries](https://github.com/Azure/cognitive-search-vector-pr/tree/main), or Search Explorer in the Azure portal.
3131

3232
+ (Optional) If you want to also use [semantic search (preview)](semantic-search-overview.md) and vector search together, your search service must be Basic tier or higher, with [semantic search enabled](semantic-search-overview.md#enable-semantic-search).
3333

@@ -56,7 +56,10 @@ api-key: {{admin-api-key}}
5656
}
5757
```
5858

59-
The expected response is 202 for a successful call to the deployed model. The body of the response provides the vector representation of the "input". The vector for the query is in the "embedding" field. For testing purposes, you would copy the value of the "embedding" array into "vector.value" in a query request, using syntax shown in the next several sections. The actual response for this call to the deployment model includes 1536 embeddings, trimmed here for brevity.
59+
The expected response is 202 for a successful call to the deployed model.
60+
The "embedding" field in the body of the response is the vector representation of the query string "input". For testing purposes, you would copy the value of the "embedding" array into "vector.value" in a query request, using syntax shown in the next several sections.
61+
62+
The actual response for this POST call to the deployment model includes an embedding of 1536 values, trimmed here to just the first few values for readability.
6063

6164
```json
6265
{
@@ -83,6 +86,28 @@ The expected response is 202 for a successful call to the deployed model. The bo
8386

8487
## Query syntax for vector search
8588

89+
### [**Azure portal**](#tab/portal-vector-query)
90+
91+
Be sure to use the **JSON view** and formulate the query in JSON. The search bar in **Query view** is for full text search and will treat any vector input as plain text.
92+
93+
1. Sign in to Azure portal and find your search service.
94+
95+
1. Under **Search management** and **Indexes**, select the index.
96+
97+
:::image type="content" source="media/vector-search-how-to-query/select-index.png" alt-text="Screenshot of the indexes menu." border="true":::
98+
99+
1. On Search Explorer, under **View**, select **JSON view**.
100+
101+
:::image type="content" source="media/vector-search-how-to-query/select-json-view.png" alt-text="Screenshot of the index list." border="true":::
102+
103+
1. By default, the search API is 2023-07-01-Preview. This is the correct API version for vector search.
104+
105+
1. Paste in a JSON vector query, and then select **Search**. You can use the REST example as a template for your JSON query.
106+
107+
:::image type="content" source="media/vector-search-how-to-query/paste-vector-query.png" alt-text="Screenshot of the JSON query." border="true":::
108+
109+
### [**REST API**](#tab/rest-vector-query)
110+
86111
In this vector query, which is shortened for brevity, the "value" contains the vectorized text of the query input. The "fields" property specifies which vector fields are searched. The "k" property specifies the number of nearest neighbors to return as top hits.
87112

88113
The sample vector query for this article is: `"what Azure services support full text search"`. The query targets the "contentVector" field.
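The copy step described earlier (taking the "embedding" array from the model response and placing it in the query's "value") can be sketched in Python. This is an illustration under stated assumptions rather than the article's REST sample. The nesting of "embedding" under `data[0]` is assumed from the usual embeddings response shape, and the top-level `vector` wrapper is assumed from the "vector.value" wording in this article (the article also uses "vectors.value" elsewhere, so check the API version you target). The function names are hypothetical and the sample numbers are placeholders, not real model output.

```python
def extract_embedding(embeddings_response):
    """Pull the "embedding" array out of an embeddings response body.

    Assumes the array sits at data[0].embedding.
    """
    return embeddings_response["data"][0]["embedding"]


def build_vector_query(vector, fields, k, select=None):
    """Build a vector query body from "value", "fields", and "k".

    fields: name of the vector field to search, for example "contentVector".
    k: number of nearest neighbors to return as top hits.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    body = {"vector": {"value": list(vector), "fields": fields, "k": k}}
    if select:
        body["select"] = select  # textual fields to return with each match
    return body


# Placeholder response, trimmed to three values; a real vector has 1536 values.
sample_response = {"data": [{"embedding": [0.009, -0.021, 0.013]}]}

query = build_vector_query(
    extract_embedding(sample_response),
    fields="contentVector",
    k=5,
    select="title, content",
)
```

Selecting textual fields such as "title" alongside the vector query gives readable evidence of each match, since the vector itself isn't human-readable.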
@@ -111,6 +136,8 @@ The response includes 5 matches, and each result provides a search score, title,
111136

112137
Notice that "select" returns textual fields from the index. Although the vector field is "retrievable" in this example, its content isn't usable as a search result.
113138

139+
---
140+
114141
## Query syntax for hybrid search
115142

116143
A hybrid query combines full text search and vector search, where the `"search"` parameter takes a query string and `"vectors.value"` takes the vector query. The search engine runs full text and vector queries in parallel. All matches are evaluated for relevance using Reciprocal Rank Fusion (RRF) and a single result set is returned in the response.
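To illustrate how Reciprocal Rank Fusion can merge the full text and vector result lists into one ranking, here is a minimal Python sketch of the general RRF formula, where each document scores the sum of `1 / (c + rank)` over the lists it appears in. It isn't the service's implementation. The constant `c = 60` is a commonly used default and an assumption here, and the document keys are hypothetical.

```python
def rrf_fuse(rankings, c=60):
    """Fuse ranked lists of document keys with Reciprocal Rank Fusion.

    rankings: iterable of lists, each ordered from best to worst match.
    c: smoothing constant (assumed value; not taken from the article).
    Returns document keys ordered by descending fused score, ties broken by key.
    """
    scores = {}
    for ranking in rankings:
        for rank, key in enumerate(ranking, start=1):
            scores[key] = scores.get(key, 0.0) + 1.0 / (c + rank)
    return sorted(scores, key=lambda key: (-scores[key], key))


# Hypothetical results from the two queries that run in parallel.
text_hits = ["doc2", "doc1", "doc3"]
vector_hits = ["doc1", "doc4", "doc2"]

fused = rrf_fuse([text_hits, vector_hits])
```

Documents that rank well in both lists (here "doc1" and "doc2") rise above documents that appear in only one, which is the intended effect of fusing the two result sets.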

articles/search/vector-search-overview.md

Lines changed: 12 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ author: robertklee
77
ms.author: robertlee
88
ms.service: cognitive-search
99
ms.topic: conceptual
10-
ms.date: 07/28/2023
10+
ms.date: 08/10/2023
1111
---
1212

1313
# Vector search within Azure Cognitive Search
@@ -25,11 +25,21 @@ We recommend this article for background, but if you'd rather get started, follo
2525
> + [Load vector data](search-what-is-data-import.md) into an index using push or pull methodologies.
2626
> + [Query vector data](vector-search-how-to-query.md) using Azure portal or the preview REST APIs.
2727
28+
You could also start with the [REST quickstart](search-get-started-vector.md) or [code samples on GitHub](https://github.com/Azure/cognitive-search-vector-pr).
29+
2830
## What's vector search in Cognitive Search?
2931

3032
Vector search is a new capability for indexing, storing, and retrieving vector embeddings from a search index. You can use it to power similarity search, multi-modal search, recommendations engines, or applications implementing the [Retrieval Augmented Generation (RAG) architecture](https://arxiv.org/abs/2005.11401).
3133

32-
Support for vector search is in public preview and available through the [**2023-07-01-Preview REST APIs**](/rest/api/searchservice/index-preview). To use vector search, define a *vector field* in the index definition and index documents with vector data. Then you can issue a search request with a query vector, returning documents with the requested `k` nearest neighbors (kNN) according to the selected vector similarity metric.
34+
Support for vector search is in public preview and available through the [**2023-07-01-Preview REST APIs**](/rest/api/searchservice/index-preview), Azure portal, and the more recent beta packages of the Azure SDKs for [.NET](https://www.nuget.org/packages/Azure.Search.Documents/11.5.0-beta.4), [Python](https://pypi.org/project/azure-search-documents/11.4.0b8/), and [JavaScript](https://www.npmjs.com/package/@azure/search-documents/v/12.0.0-beta.2).
35+
36+
The following diagram shows the indexing and query workflows for vector search.
37+
38+
:::image type="content" source="media/vector-search-overview/vector-search-architecture-diagram.png" alt-text="Architecture of vector search workflow." border="true":::
39+
40+
On the indexing side, prepare and load source documents that contain embeddings. Cognitive Search doesn't generate embeddings, so your solution should include calls to Azure OpenAI or other models that can create a vector representation of your image, audio, text, and other content. Add a *vector field* in your index definition on Cognitive Search. Load the index with a documents payload that includes the embeddings. Your index is now ready to query.
41+
42+
On the query side, in your client application, collect the query input. Add a step that converts the input into a vector, and then send the vector query to your index on Cognitive Search for a similarity search. Cognitive Search returns documents with the requested `k` nearest neighbors (kNN).
3343

3444
You can index vector data as fields in documents alongside textual and other types of content. Vector queries can be issued independently or in combination with other query types, including term queries (hybrid search) and filters in the same search request.
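The query-side step above, returning the `k` nearest neighbors, can be illustrated with a small Python sketch that ranks stored vectors by cosine similarity to a query vector. This is an exhaustive (brute-force) search for illustration only. The service uses HNSW, which approximates this result with a proximity graph instead of comparing against every vector. The document keys and two-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest(query, documents, k):
    """Return the keys of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in documents.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by key
    return [key for _, key in scored[:k]]


# Hypothetical two-dimensional embeddings; real embeddings have many more dimensions.
documents = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}

nearest = k_nearest([1.0, 0.0], documents, k=2)
```

The dimension check reflects the requirement that queries and indexed documents use the same embedding model, so that both vectors live in the same embedding space.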
3545
