articles/search/vector-search-how-to-create-index.md
Lines changed: 11 additions & 5 deletions
@@ -7,7 +7,7 @@ author: HeidiSteen
7
7
ms.author: heidist
8
8
ms.service: cognitive-search
9
9
ms.topic: how-to
10
-
ms.date: 07/07/2023
10
+
ms.date: 07/14/2023
11
11
---
12
12
13
13
# Add vector fields to a search index
@@ -39,7 +39,7 @@ Prior to indexing, assemble a document payload that includes vector data. The do
39
39
40
40
1. Provide any other fields with alphanumeric content for any nonvector queries you want to support, as well as for hybrid query scenarios that include full text search or semantic ranking in the same request.
41
41
42
-
Your search index should include fields and content for all of the query scenarios you want to support. Suppose you want to search or filter over product names, versions, metadata, or addresses. In this case, similarity search isn't especially helpful and keyword search, geo-search, or filters would be a better choice. A search index that includes a comprehensive field collection of vector and non-vector data provides maximum flexibility for query construction.
42
+
Your search index should include fields and content for all of the query scenarios you want to support. Suppose you want to search or filter over product names, versions, metadata, or addresses. In this case, similarity search isn't especially helpful. Keyword search, geo-search, or filters would be a better choice. A search index that includes a comprehensive field collection of vector and non-vector data provides maximum flexibility for query construction and response composition.
43
43
44
44
## Add a vector field to the fields collection
45
45
@@ -71,14 +71,18 @@ The schema must include fields for the document key, vector fields, and any othe
71
71
}
72
72
```
73
73
74
-
1. Add vector fields to the fields collection. You can store one generated embedding per document field. For each field:
74
+
1. Add fields that define the substance and structure of the content you're indexing. At a minimum, you need a document key.
75
75
76
-
+ Assign the `Collection(Edm.Single)` data type
76
+
You should also add fields that are useful in the query response. The example below shows vector fields for title and content ("titleVector", "contentVector"). It also provides fields for equivalent textual content ("title", "content") that users can read in a search result.
77
+
78
+
1. Add vector fields to the fields collection. You can store one generated embedding per document field. For each vector field:
79
+
80
+
+ Assign the `Collection(Edm.Single)` data type.
81
+
+ For `Collection(Edm.Single)`, the "filterable", "facetable", "sortable" attributes are "false" by default. Don't set them to "true" because those behaviors don't apply within the context of vector fields and the request will fail.
77
82
+ Provide the name of the vector search algorithm configuration.
78
83
+ Provide the number of dimensions generated by the embedding model.
79
84
+ "searchable" must be "true".
80
85
+ "retrievable" set to "true" allows you to display the raw vectors (for example, as a verification step), but doing so will increase storage usage. Set to "false" if you don't need to return raw vectors.
81
-
+ For `Collection(Edm.Single)`, the "filterable", "facetable", "sortable" attributes are "false" by default. Don't set them to "true" because those behaviors don't apply within the context of vector fields and the request will fail.
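The attribute rules in the list above can be sketched as a small helper that builds one vector field definition. This is a minimal sketch, not the service's SDK: `make_vector_field` is a hypothetical name, and the JSON property names `dimensions` and `vectorSearchConfiguration` are assumed to match the preview REST API referenced in this article.

```python
from typing import Any, Dict

# Attributes the article says must stay false on vector fields.
UNSUPPORTED_ON_VECTOR_FIELDS = ("filterable", "facetable", "sortable")


def make_vector_field(name: str, dimensions: int, config_name: str,
                      retrievable: bool = False, **extra: Any) -> Dict[str, Any]:
    """Build one vector field definition for the index's fields collection.

    Hypothetical helper: it only assembles a dictionary and sends nothing.
    """
    for attr in UNSUPPORTED_ON_VECTOR_FIELDS:
        if extra.get(attr):
            raise ValueError(f"'{attr}' must not be true on a vector field")
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    field: Dict[str, Any] = {
        "name": name,
        "type": "Collection(Edm.Single)",
        "searchable": True,          # required for vector fields
        "retrievable": retrievable,  # True returns raw vectors, at a storage cost
        "dimensions": dimensions,    # must match the embedding model's output size
        "vectorSearchConfiguration": config_name,
    }
    field.update(extra)
    return field


if __name__ == "__main__":
    print(make_vector_field("contentVector", 1536, "my-vector-config"))
```

Rejecting `filterable`, `facetable`, and `sortable` on the client side is a design choice of this sketch. It surfaces the mistake before the request reaches the service and fails there.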
82
86
83
87
```http
84
88
PUT https://my-search-service.search.windows.net/indexes/my-index?api-version=2023-07-01-Preview&allowIndexDowntime=true
@@ -97,6 +101,8 @@ The schema must include fields for the document key, vector fields, and any othe
articles/search/vector-search-how-to-query.md
Lines changed: 23 additions & 7 deletions
@@ -17,7 +17,7 @@ ms.date: 07/14/2023
17
17
18
18
In Azure Cognitive Search, if you added vector fields to a search index, this article explains how to query those fields. It also explains how to combine vector queries with full text search and semantic search for hybrid query combination scenarios.
19
19
20
-
Query execution in Cognitive Search doesn't include vector conversion. Encoding (text-to-vector) must be performed external to a search service. For both indexing and querying, your application code should call the same embedding model. To retrieve the text associated with a vector, remember that a query response can include non-vector fields in your search index. This allows you to query on a vector field (descriptionVector) but return the text field (description) in the response.
20
+
Query execution in Cognitive Search doesn't include vector conversion. Encoding (text-to-vector) of the query string requires that you pass the text to an embedding model for vectorization. The output of the call to the embedding model is then passed to the search engine for similarity search over vector fields.
21
21
22
22
## Prerequisites
23
23
@@ -31,13 +31,13 @@ Query execution in Cognitive Search doesn't include vector conversion. Encoding
31
31
32
32
## Check your index for vector fields
33
33
34
-
In the index schema, check for:
34
+
If you aren't sure whether your search index already has vector fields, look for:
35
35
36
-
+ A `vectorSearch` algorithm configuration.
36
+
+ A `vectorSearch` algorithm configuration embedded in the index schema.
37
37
38
38
+ In the fields collection, look for fields of type `Collection(Edm.Single)`, with a `dimensions` attribute and a `vectorSearchConfiguration` set to the name of the `vectorSearch` algorithm configuration used by the field.
39
39
40
-
Search documents containing vector data have fields containing many hundreds of floating point values.
40
+
You can also send an empty query (`search=*`) against the index. Search documents containing vector data have fields containing many hundreds of floating point values.
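The checks described above could be automated against an index definition that has already been retrieved as JSON. The following is a rough sketch under that assumption. The function names are hypothetical, and the sample schema is invented for illustration.

```python
from typing import Any, Dict, List


def has_vector_search_config(index_schema: Dict[str, Any]) -> bool:
    """True if the schema embeds a non-empty `vectorSearch` section."""
    return bool(index_schema.get("vectorSearch"))


def find_vector_fields(index_schema: Dict[str, Any]) -> List[str]:
    """Return names of fields that look like vector fields.

    A field qualifies if it is typed Collection(Edm.Single) and carries both
    a `dimensions` attribute and a `vectorSearchConfiguration` name.
    """
    names = []
    for field in index_schema.get("fields", []):
        if (field.get("type") == "Collection(Edm.Single)"
                and "dimensions" in field
                and "vectorSearchConfiguration" in field):
            names.append(field["name"])
    return names


if __name__ == "__main__":
    # Invented sample schema, used only to exercise the helpers.
    sample = {
        "name": "my-index",
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True},
            {"name": "content", "type": "Edm.String", "searchable": True},
            {"name": "contentVector", "type": "Collection(Edm.Single)",
             "dimensions": 1536, "vectorSearchConfiguration": "my-vector-config"},
        ],
        "vectorSearch": {"algorithmConfigurations": [{"name": "my-vector-config"}]},
    }
    print(has_vector_search_config(sample), find_vector_fields(sample))
```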
41
41
42
42
## Convert query input into a vector
43
43
@@ -54,7 +54,7 @@ api-key: {{admin-api-key}}
54
54
}
55
55
```
56
56
57
-
The expected response is 202 for a successful call to the deployed model. The body of the response provides the vector representation of the "input". The vector for the query is in the "embedding" field. For testing purposes, you would copy the embedding value into "vector.value" in a query request, using syntax from the next sections. Note that the actual response for this query included 1536 embeddings, trimmed here for brevity.
57
+
The expected response is 202 for a successful call to the deployed model. The body of the response provides the vector representation of the "input". The vector for the query is in the "embedding" field. For testing purposes, you would copy the value of the "embedding" array into "vector.value" in a query request, using syntax shown in the next several sections. The actual response for this call to the deployed model includes an embedding of 1536 values, trimmed here for brevity.
58
58
59
59
```json
60
60
{
@@ -79,6 +79,20 @@ The expected response is 202 for a successful call to the deployed model. The bo
79
79
}
80
80
```
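In application code, copying the "embedding" array into "vector.value" might look like the sketch below. It assumes the response body nests the array under `data[0].embedding`, which is the shape OpenAI-style embedding endpoints return. Treat that path as an assumption if your deployment differs. `extract_query_vector` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def extract_query_vector(embedding_response: Dict[str, Any]) -> List[float]:
    """Pull the first embedding array out of an embedding-model response.

    Assumes an OpenAI-style body: {"data": [{"embedding": [...]}], ...}.
    """
    data = embedding_response.get("data") or []
    if not data or "embedding" not in data[0]:
        raise ValueError("response does not contain an embedding")
    return [float(x) for x in data[0]["embedding"]]


if __name__ == "__main__":
    # Shortened, made-up response; a real one carries many more values.
    response = {"data": [{"object": "embedding", "index": 0,
                          "embedding": [-0.009, 0.018, -0.021]}]}
    vector = extract_query_vector(response)
    # The extracted list becomes "vector.value" in the search request body.
    query_body = {"vector": {"value": vector, "fields": "contentVector", "k": 5}}
    print(query_body)
```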
81
81
82
+
## Design a query response
83
+
84
+
When you're setting up the vector query, think about how you want to structure the response. Search results are composed of either all "retrievable" fields (a REST API default) or the fields explicitly listed in a "select" parameter. In the query examples that follow, each one includes a "select" parameter that specifies text (non-vector) content for the response.
85
+
86
+
Vector fields themselves aren't human readable, so avoid returning them in the response. Instead, choose non-vector fields that provide equivalent information from the same search document. For example, if the query is on a vector field ("descriptionVector"), return an equivalent text field ("description") in the response.
87
+
88
+
The quantity of results is determined by query parameters. Quantity is either:
89
+
90
+
+ `"k": n` results for vector-only queries
91
+
+ `"top": n` results for hybrid queries
92
+
93
+
> [!NOTE]
94
+
> If you're familiar with full text search in Cognitive Search, you already know that a term or keyword, synonym, or filter criteria must match in order for a document to qualify as a match. Similarity search is less exacting because it's comparing vector compositions. It's possible for the HNSW model to sometimes return matches that don't seem especially relevant.
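The response-design guidance above can be sketched as a request-body builder. This is illustrative only: `build_query` is a hypothetical helper, the comma-separated form of "fields" and "select" is assumed from the preview REST API, and refusing vector fields in "select" is this sketch's own choice, not a service rule.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_query(vector: Sequence[float], vector_fields: List[str],
                select: List[str], k: int = 5,
                search_text: Optional[str] = None,
                top: Optional[int] = None) -> Dict[str, Any]:
    """Assemble a search request body that returns readable text fields.

    "k" bounds the results of a vector-only query. When `search_text` is
    given the query is hybrid, and "top" bounds the results instead.
    """
    overlap = set(select) & set(vector_fields)
    if overlap:
        # Sketch-level policy: raw vectors aren't human readable.
        raise ValueError(f"select includes vector fields: {sorted(overlap)}")
    body: Dict[str, Any] = {
        "vector": {
            "value": list(vector),
            "fields": ", ".join(vector_fields),
            "k": k,
        },
        "select": ", ".join(select),
    }
    if search_text is not None:
        body["search"] = search_text
        if top is not None:
            body["top"] = top
    return body


if __name__ == "__main__":
    print(build_query([0.1, 0.2], ["descriptionVector"], ["description"], k=3))
```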
95
+
82
96
## Query syntax for vector search
83
97
84
98
In this vector query, which is shortened for brevity, the "value" contains the vectorized text of the query input. The "fields" property specifies which vector fields are searched. The "k" property specifies the number of nearest neighbors to return as top hits.
@@ -107,6 +121,8 @@ api-key: {{admin-api-key}}
107
121
108
122
The response includes 5 matches, and each result provides a search score, title, content, and category. In a similarity search, the response always includes "k" matches, even if the similarity is weak. For indexes that have fewer than "k" documents, only that number of documents is returned.
109
123
124
+
Notice that "select" returns textual fields from the index. Although the vector field is "retrievable" in this example, its content isn't usable as a search result.
125
+
110
126
## Query syntax for hybrid search
111
127
112
128
A hybrid query combines full text search and vector search. The search engine runs full text and vector queries in parallel. All matches are evaluated for relevance using Reciprocal Rank Fusion (RRF) and a single result set is returned in the response.
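To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion as it is generally defined: each document scores the sum of 1 / (c + rank) over every ranked list it appears in. The constant `c = 60` is a commonly used default and is an assumption here. The article doesn't state which value the search engine uses, and the service's implementation may differ in other details.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], c: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranked list.

    Ranks start at 1. Ties are broken by document ID for stable output.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (c + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    text_results = ["a", "b", "c"]    # made-up full text ranking
    vector_results = ["b", "d", "a"]  # made-up vector ranking
    print(reciprocal_rank_fusion([text_results, vector_results]))
```

Documents that rank well in both lists rise to the top, which is why a hybrid query can surface a result that neither query alone ranks first.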
@@ -145,7 +161,7 @@ api-key: {{admin-api-key}}
145
161
146
162
## Query syntax for vector query over multiple fields
147
163
148
-
You can set "vector.fields" property to multiple vector fields. For example, the Postman collection has vector fields named titleVector and contentVector. Your vector query executes over both the titleVector and contentVector fields, which must have the same embedding space since they share the same query vector.
164
+
You can set the "vector.fields" property to multiple vector fields. For example, the Postman collection has vector fields named "titleVector" and "contentVector". Your vector query executes over both the "titleVector" and "contentVector" fields, which must have the same embedding space since they share the same query vector.
149
165
150
166
```http
151
167
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version={{api-version}}
@@ -170,7 +186,7 @@ api-key: {{admin-api-key}}
170
186
171
187
## Query syntax for multiple vector queries
172
188
173
-
You can issue a search request containing multiple query vectors using the `vectors` query parameter. The queries execute concurrently in the search index, each one looking for similarities in the target vector fields. The result set is a union of the documents that matched both vector queries. A common example of this query request is when using models such as [CLIP](https://openai.com/research/clip) for a multi-modal vector search where the same model can vectorize image and non-image content.
189
+
You can issue a search request containing multiple query vectors using the "vectors" query parameter. The queries execute concurrently in the search index, each one looking for similarities in the target vector fields. The result set is a union of the documents that matched both vector queries. A common example of this query request is when using models such as [CLIP](https://openai.com/research/clip) for a multi-modal vector search where the same model can vectorize image and non-image content.
174
190
175
191
You must use REST for this scenario. Currently, there isn't support for multiple vector queries in the alpha SDKs.
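Because this scenario is REST-only, application code has to assemble the request body itself. The sketch below shows one way to do that. It assumes each entry in "vectors" has the same shape as the single "vector" object (value, fields, k), and `build_multi_vector_query` is a hypothetical helper name. Verify the body against the REST reference for your API version before relying on it.

```python
from typing import Any, Dict, List, Sequence, Tuple

# (query vector, target vector field(s), k) for one vector query.
VectorQuery = Tuple[Sequence[float], str, int]


def build_multi_vector_query(queries: List[VectorQuery],
                             select: List[str]) -> Dict[str, Any]:
    """Assemble a request body carrying several query vectors."""
    if not queries:
        raise ValueError("at least one vector query is required")
    return {
        "vectors": [
            {"value": list(value), "fields": fields, "k": k}
            for value, fields, k in queries
        ],
        "select": ", ".join(select),
    }


if __name__ == "__main__":
    # Made-up, shortened vectors: one from a text query, one from an image.
    body = build_multi_vector_query(
        [([0.1, 0.2], "contentVector", 5), ([0.3, 0.4], "imageVector", 5)],
        ["title", "content"],
    )
    print(body)
```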