articles/search/vector-search-how-to-create-index.md
+7 -7 (7 additions & 7 deletions)
@@ -46,13 +46,13 @@ Your search index should include fields and content for all of the query scenari
 The schema must include fields for the document key, vector fields, and any other fields that you require for hybrid search scenarios. In the following example, "title" and "content" contain textual content used in full text search and semantic search, while "titleVector" and "contentVector" contain vector data.
 
 > [!NOTE]
-> + You don't need a special "vector index" to use vector search. You'll only need to add one or more "vector fields" to a new or existing index.
-> + Both new and existing indexes support vector search. However, there is a small subset of older services that don't support vector search. In this case, a new search service must be created to use it.
+> + Vectors are added to fields in a search index. Internally, a *vector index* is created for each vector field, but indexing and queries target fields in a search index, and not the vector indexes directly.
+> + Both new and existing search indexes support vector search. However, there is a small subset of older services that don't support vector search. In this case, a new search service must be created to use it.
 > + Updating an existing index to add vector fields requires `allowIndexDowntime` query parameter to be `true`.
 
 1. Use the [Create or Update Index Preview REST API](/rest/api/searchservice/preview-api/create-or-update-index) to create the index.
 
-1. Create a `vectorSearch` section in the index that specifies the algorithm used to create the embedding space. Currently, only `"hnsw"` is supported.
+1. Add a `vectorSearch` section in the index that specifies the algorithm used to create the embedding space. Currently, only `"hnsw"` is supported. For "metric", valid values are `cosine`, `euclidean`, and `dotProduct`. The `cosine` metric is specified because it's the similarity metric that the Azure OpenAI models use to create embeddings.
 
 ```json
 "vectorSearch": {
@@ -71,10 +71,6 @@ The schema must include fields for the document key, vector fields, and any othe
 }
 ```
 
-1. Add fields that define the substance and structure of the content you're indexing. At a minimum, you need a document key.
-
-You should also add fields that are useful in the query or in it's response. The example below shows vector fields for title and content ("titleVector", "contentVector"). It also provides fields for equivalent textual content ("title", "content") useful for sorting, filtering, and reading in a search result.
-
 1. Add vector fields to the fields collection. You can store one generated embedding per document field. For each vector field:
 
 + Assign the `Collection(Edm.Single)` data type.
@@ -84,6 +80,10 @@ The schema must include fields for the document key, vector fields, and any othe
 + "searchable" must be "true".
 + "retrievable" set to "true" allows you to display the raw vectors (for example, as a verification step), but doing so will increase storage. Set to "false" if you don't need to return raw vectors. You don't need to return vectors for a query, but if you're passing a vector result to a downstream app then set "retrievable" to "true".
 
+1. Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
+
+You should also add fields that are useful in the query or in its response. The example below shows vector fields for title and content ("titleVector", "contentVector") that are equivalent to vectors. It also provides fields for equivalent textual content ("title", "content") useful for sorting, filtering, and reading in a search result.
+
An index definition with the described elements looks like this:
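
The field rules listed in the steps above can also be expressed as a small check. The following Python is a minimal sketch under stated assumptions, not part of the search service or its SDK: `validate_metric`, `validate_vector_field`, and the sample field dictionaries (including the `name` key) are hypothetical. Only the `Collection(Edm.Single)` type, the "searchable" and "retrievable" attributes, and the three metric values come from the text.

```python
# Hypothetical helpers: they encode only the rules stated in the steps above.
VALID_METRICS = {"cosine", "euclidean", "dotProduct"}
VECTOR_TYPE = "Collection(Edm.Single)"


def validate_metric(metric):
    """Raise ValueError unless the metric is one of the values listed in the text."""
    if metric not in VALID_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")


def validate_vector_field(field):
    """Return (errors, warnings) for one vector field definition (a plain dict)."""
    errors = []
    warnings = []
    if field.get("type") != VECTOR_TYPE:
        errors.append(f'"type" must be {VECTOR_TYPE}')
    if field.get("searchable") is not True:
        errors.append('"searchable" must be true')
    if field.get("retrievable") is True:
        # Allowed, but the text notes that returning raw vectors increases storage.
        warnings.append('"retrievable" is true: raw vectors are returned and storage increases')
    return errors, warnings


# Illustrative field definitions (assumed shape, for demonstration only).
good_field = {
    "name": "contentVector",
    "type": "Collection(Edm.Single)",
    "searchable": True,
    "retrievable": False,
}
bad_field = {"name": "content", "type": "Edm.String", "searchable": False}

print(validate_vector_field(good_field))
print(validate_vector_field(bad_field))
```

A check like this could run before the index definition is sent to the service, so that a mistyped data type or a non-searchable vector field is caught locally.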
articles/search/vector-search-how-to-query.md
+7 -3 (7 additions & 3 deletions)
@@ -85,17 +85,21 @@ When you're setting up the vector query, think about the response structure. Sea
 
 Vectors aren't designed for readability, so avoid returning them in the response. Instead, choose non-vector fields that are representative of the search document. For example, if the query targets a "descriptionVector" field, return an equivalent text field if you have one ("description") in the response.
 
-Size of the results is determined by the query parameters "k" and "top". Maximum results in a response are either:
+Size of the results is determined by the query parameters "k" and "top". Maximum results in a response are either:
 
 + `"k": n` results for vector-only queries
 
 + `"top": n` results for hybrid queries
 
 Ranking of results is either:
 
-+ Cosine similarity if the query is over a single vector field, assuming `cosine` is what you specified in the index `vectorConfiguration`. Azure OpenAI embedding models use cosine similarity, so if you're using Azure OpenAI embedding models, `cosine` is the recommended metric. Other supported ranking metrics include `euclidean` and `dotProduct`.
++ Cosine similarity if the query is over a single vector field, assuming `cosine` is what you specified in the index `vectorConfiguration`.
 
-+ Reciprocal Rank Fusion (RRF) if there are multiple sets of search results. Multiple sets are created if the query targets multiple vector fields, or if the query is a hybrid of vector and full text search, with or without the optional semantic reranking capabilities of [semantic search](semantic-search-overview.md).
+Azure OpenAI embedding models use cosine similarity, so if you're using Azure OpenAI embedding models, `cosine` is the recommended metric. Other supported ranking metrics include `euclidean` and `dotProduct`.
+
++ Reciprocal Rank Fusion (RRF) if there are multiple sets of search results.
+
+Multiple sets are created if the query targets multiple vector fields, or if the query is a hybrid of vector and full text search, with or without the optional semantic reranking capabilities of [semantic search](semantic-search-overview.md).
 
Within vector search, a vector query can only target one internal vector index. So for [multiple vector fields](#query-syntax-for-vector-query-over-multiple-fields) and [multiple vector queries](#query-syntax-for-multiple-vector-queries), the search engine generates parallel queries that target the respective vector indexes of each field. Output is a set of ranked results for each query, which are fused using RRF. For more information, see [Vector query execution and scoring](vector-search-ranking.md).
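
The flow described above (one ranked list per vector field, then fusion with RRF) can be sketched in Python. This is an illustration under assumptions, not the search engine's implementation: a brute-force cosine scan stands in for the HNSW vector index, the function names and sample documents are hypothetical, and the constant `c = 60` is a value commonly used with RRF rather than one stated in this text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_cosine(query, docs, field, k):
    """Brute-force stand-in for one vector query: top-k document ids for one field."""
    scored = sorted(docs, key=lambda d: cosine_similarity(query, d[field]), reverse=True)
    return [d["id"] for d in scored[:k]]


def rrf_fuse(rankings, c=60):
    """Fuse ranked id lists: each list contributes 1 / (c + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (c + rank)
    # Highest fused score first; ties are broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical documents with two vector fields each.
docs = [
    {"id": "a", "titleVector": [1.0, 0.0], "contentVector": [0.0, 1.0]},
    {"id": "b", "titleVector": [0.8, 0.6], "contentVector": [0.6, 0.8]},
    {"id": "c", "titleVector": [0.0, 1.0], "contentVector": [1.0, 0.0]},
]
query = [1.0, 0.0]

# One ranked list per vector field, mirroring the parallel queries described above.
per_field = [rank_by_cosine(query, docs, field, k=3) for field in ("titleVector", "contentVector")]
fused = rrf_fuse(per_field)
print(per_field)
print(fused)
```

Because RRF uses only rank positions, the raw similarity scores from each field never need to be on a comparable scale, which is what makes it suitable for merging vector results with full text results in a hybrid query.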