articles/search/vector-search-how-to-create-index.md
6 additions & 6 deletions
@@ -23,21 +23,21 @@ In Azure Cognitive Search, vector data is represented in fields in a [search ind
Most existing services support vector search. For a small subset of services created prior to January 2019, an index containing vector fields will fail on creation. In this situation, a new service must be created.
-
+ Pre-existing embeddings in your source documents. Cognitive Search doesn't generate embeddings. We recommend Azure OpenAI but you can use any model for vectorization. For more information, see [Create and use embeddings for search queries and documents](vector-search-how-to-generate-embeddings.md).
+
+ Pre-existing vector embeddings in your source documents. Cognitive Search doesn't generate vectors. We recommend Azure OpenAI but you can use any model for vectorization. For more information, see [Create and use embeddings for search queries and documents](vector-search-how-to-generate-embeddings.md).
-
Be sure to use the same embedding model for both indexing and queries. At query time, you must include a step that converts the user's query into a vector.
+
Be sure to use the same embedding model for both indexing and queries. At query time, you must include a step that converts the user's query string into a vector.
## Prepare documents for indexing
-
Prior to indexing, assemble a document payload that includes vector data. The document structure must conform to the index schema. Make sure your documents include the following elements:
+
Prior to indexing, assemble a document payload that includes vector data. The document structure must conform to the index schema. Make sure your documents:
-
1. Provide a unique value or a metadata property that uniquely identifies each source document. All search indexes require a document key as a unique identifier, which means all documents must have one field that can be mapped to type `Edm.String` and `key=true` in the search index.
+
1. Provide a field or a metadata property that uniquely identifies each document. All search indexes require a document key. Your documents must have one field or property that can be mapped to type `Edm.String` and `key=true` in the search index.
1. Provide vector data (an array of single-precision floating point numbers) in source fields.
-
Vector fields contain vector data generated by embedding models. We recommend the embedding models in [Azure OpenAI](https://aka.ms/oai/access), such as **text-embedding-ada-002** for text documents or the [Image Retrieval REST API](/rest/api/computervision/2023-02-01-preview/image-retrieval/vectorize-image) for images.
+
Vector fields contain vector data generated by embedding models, one embedding per field. We recommend the embedding models in [Azure OpenAI](https://aka.ms/oai/access), such as **text-embedding-ada-002** for text documents or the [Image Retrieval REST API](/rest/api/computervision/2023-02-01-preview/image-retrieval/vectorize-image) for images.
-
1. Provide any other fields with alphanumeric content for any nonvector queries you want to support, as well as for hybrid query scenarios that include full text search or semantic ranking in the same request.
+
1. Provide other fields with alphanumeric content for the search response and for hybrid query scenarios that include full text search or semantic ranking in the same request.
Your search index should include fields and content for all of the query scenarios you want to support. Suppose you want to search or filter over product names, versions, metadata, or addresses. In this case, similarity search isn't especially helpful. Keyword search, geo-search, or filters would be a better choice. A search index that includes a comprehensive field collection of vector and non-vector data provides maximum flexibility for query construction and response composition.
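The document requirements listed above can be checked before upload. The following is a minimal sketch, not part of the article: the field names `id` and `contentVector` are hypothetical, and the 1536-dimension default assumes embeddings from **text-embedding-ada-002**. Adjust both to match your own index schema.

```python
import math
from typing import Any

# Assumption: text-embedding-ada-002 produces 1536-dimensional vectors.
EXPECTED_DIMENSIONS = 1536


def validate_document(
    doc: dict[str, Any],
    key_field: str = "id",                 # hypothetical key field name
    vector_field: str = "contentVector",   # hypothetical vector field name
    dimensions: int = EXPECTED_DIMENSIONS,
) -> list[str]:
    """Return a list of problems found in a document; an empty list means none were found."""
    problems: list[str] = []

    # The document key must map to Edm.String with key=true.
    key = doc.get(key_field)
    if not isinstance(key, str) or not key:
        problems.append(f"'{key_field}' must be a non-empty string")

    # Vector data is an array of floating point numbers of a fixed length.
    vector = doc.get(vector_field)
    if not isinstance(vector, list):
        problems.append(f"'{vector_field}' must be an array of numbers")
    elif len(vector) != dimensions:
        problems.append(
            f"'{vector_field}' has {len(vector)} dimensions, expected {dimensions}"
        )
    elif not all(
        isinstance(x, (int, float)) and not isinstance(x, bool) and math.isfinite(x)
        for x in vector
    ):
        problems.append(f"'{vector_field}' must contain only finite numbers")

    return problems


# Usage: a document with a string key, a text field, and one embedding.
sample = {"id": "doc-1", "content": "Azure services overview", "contentVector": [0.0] * 1536}
sample_problems = validate_document(sample)
```

The check only covers the key and the vector field. Other text fields needed for hybrid queries or for the search response are left to the index schema.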
articles/search/vector-search-how-to-query.md
14 additions & 8 deletions
@@ -81,28 +81,34 @@ The expected response is 202 for a successful call to the deployed model. The bo
## Configure a query response
-
When you're setting up the vector query, think about the response structure. You can control the shape of the response by choosing which fields are in the results and how many results are included. The search engine ranks the results. Ranking algorithms aren't generally configurable.
+
When you're setting up the vector query, think about the response structure. The response is a flattened rowset. Parameters on the query determine which fields are in each row and how many rows are in the response. The search engine ranks the matching documents and returns the most relevant results.
### Fields in a response
-
Search results are composed of either all "retrievable" fields (a REST API default) or the fields explicitly listed in a "select" parameter on the query. In the examples that follow, each one includes a "select" statement that specifies text (non-vector) fields to include the response.
+
Search results are composed of "retrievable" fields from your search index. A result is either:
-
Vectors aren't designed for readability, so avoid returning them in the response. Instead, choose non-vector fields that are representative of the search document. For example, if the query targets a "descriptionVector" field, return an equivalent text field if you have one ("description") in the response.
+
+ All "retrievable" fields (a REST API default).
+
+ Fields explicitly listed in a "select" parameter on the query.
+
+
The examples in this article include a "select" statement that specifies text (non-vector) fields to include in the response.
+
+
> [!NOTE]
+
> Vectors aren't designed for readability, so avoid returning them in the response. Instead, choose non-vector fields that are representative of the search document. For example, if the query targets a "descriptionVector" field, return an equivalent text field if you have one ("description") in the response.
### Number of results
-
A query might match to any number of documents, up to all of them in the search index if the search criteria are weak. However, the size of the results sent back in the response is determined by the query parameters "k" and "top". Maximum results in a response are either:
+
A query might match any number of documents, as many as all of them if the search criteria are weak (for example, "search=*" for a null query). Because it's seldom practical to return unbounded results, you should specify a maximum for the response:
+`"k": n` results for vector-only queries
-
+`"top": n` results for hybrid queries
+
+`"top": n` results for hybrid queries that include a "search" parameter
-
Both "k" and "top" are optional. Unspecified, the default number of results in a response is 50. You can set "top" and "skip" to [page through more results](search-pagination-page-layout.md#paging-results).
+
Both "k" and "top" are optional. If neither is specified, the default number of results in a response is 50. You can set "top" and "skip" to [page through more results](search-pagination-page-layout.md#paging-results) or to change the default.
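The response-shaping parameters described in this section ("select", "k", "top") can be assembled into a request body as in the sketch below. This is an illustration under assumptions, not the article's own code: the helper name `build_query_body` is hypothetical, and the top-level `"vector"` property is assumed to follow the preview REST API shape in which "value", "fields", and "k" are properties of the vector query.

```python
from typing import Any, Optional, Sequence


def build_query_body(
    query_vector: Sequence[float],
    vector_field: str = "contentVector",
    k: Optional[int] = None,
    select: Optional[Sequence[str]] = None,
    search_text: Optional[str] = None,
    top: Optional[int] = None,
) -> dict[str, Any]:
    """Build a search request body; optional parameters are omitted when not set."""
    # Assumption: the vector query is a "vector" object holding value, fields, and k.
    vector: dict[str, Any] = {"value": list(query_vector), "fields": vector_field}
    if k is not None:
        vector["k"] = k  # number of nearest neighbors for a vector-only query

    body: dict[str, Any] = {"vector": vector}
    if select:
        # Return readable text fields instead of the vectors themselves.
        body["select"] = ", ".join(select)
    if search_text is not None:
        body["search"] = search_text  # including "search" makes the query hybrid
    if top is not None:
        body["top"] = top  # maximum results for a hybrid query
    return body


# Usage: a vector-only query that returns five matches with two text fields.
vector_only = build_query_body([0.01, 0.02, 0.03], k=5, select=["title", "content"])
```

Leaving "k" and "top" out of the body relies on the service default described above.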
### Ranking
Ranking of results is computed by either:
-
+ The similarity metric specified in the index `vectorConfiguration` for a vector-only query.
+
+ The similarity metric specified in the index `vectorConfiguration` for a vector-only query. Valid values are `cosine`, `euclidean`, and `dotProduct`.
+ Reciprocal Rank Fusion (RRF) if there are multiple sets of search results.
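To illustrate the second option, the following sketch shows the general Reciprocal Rank Fusion technique: each document earns `1 / (c + rank)` from every result set it appears in, and the sums determine the merged order. The constant `c = 60` is a commonly used value and is an assumption here, as are the function name and the tie-breaking rule. It isn't a description of the service's internal implementation.

```python
from collections import defaultdict
from typing import Sequence


def reciprocal_rank_fusion(
    result_sets: Sequence[Sequence[str]], c: int = 60
) -> list[tuple[str, float]]:
    """Merge ranked lists of document keys into one list ordered by fused score."""
    scores: dict[str, float] = defaultdict(float)
    for results in result_sets:
        # Ranks start at 1, so the top hit in each set contributes 1 / (c + 1).
        for rank, doc_key in enumerate(results, start=1):
            scores[doc_key] += 1.0 / (c + rank)
    # Highest score first; ties are broken by key so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Usage: fuse a vector result set with a keyword result set.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "c", "a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Because only ranks are used, result sets scored on different scales (for example, a similarity metric and a keyword relevance score) can be merged without normalizing the scores first.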
Azure OpenAI embedding models use cosine similarity, so if you're using Azure OpenAI embedding models, `cosine` is the recommended metric. Other supported ranking metrics include `euclidean` and `dotProduct`.
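For reference, the three metrics named above can be written out as plain functions. This is a minimal sketch of the standard mathematical definitions, with hypothetical function names. How the service converts these values into a search score isn't covered here.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means the same direction."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


# Usage: compare a query vector with a document vector.
query = [1.0, 0.0]
document = [0.0, 1.0]
similarity = cosine_similarity(query, document)
```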
@@ -113,7 +119,7 @@ Multiple sets are created if the query targets multiple vector fields, or if the
In this vector query, which is shortened for brevity, the "value" contains the vectorized text of the query input. The "fields" property specifies which vector fields are searched. The "k" property specifies the number of nearest neighbors to return as top hits.
-
Recall that the vector query was generated from this string: `"what Azure services support full text search"`. The search targets the "contentVector" field.
+
The sample vector query for this article is: `"what Azure services support full text search"`. The query targets the "contentVector" field.
```http
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version={{api-version}}
```