Commit 5ee473e

more misc edits
1 parent 05f05c6 commit 5ee473e

2 files changed

+14
-10
lines changed

articles/search/vector-search-how-to-create-index.md

Lines changed: 7 additions & 7 deletions
@@ -46,13 +46,13 @@ Your search index should include fields and content for all of the query scenari
4646
The schema must include fields for the document key, vector fields, and any other fields that you require for hybrid search scenarios. In the following example, "title" and "content" contain textual content used in full text search and semantic search, while "titleVector" and "contentVector" contain vector data.
4747

4848
> [!NOTE]
49-
> + You don't need a special "vector index" to use vector search. You'll only need to add one or more "vector fields" to a new or existing index.
50-
> + Both new and existing indexes support vector search. However, there is a small subset of older services that don't support vector search. In this case, a new search service must be created to use it.
49+
> + Vectors are added to fields in a search index. Internally, a *vector index* is created for each vector field, but indexing and queries target fields in a search index, and not the vector indexes directly.
50+
> + Both new and existing search indexes support vector search. However, there is a small subset of older services that don't support vector search. In this case, a new search service must be created to use it.
5151
> + Updating an existing index to add vector fields requires `allowIndexDowntime` query parameter to be `true`.
5252
5353
1. Use the [Create or Update Index Preview REST API](/rest/api/searchservice/preview-api/create-or-update-index) to create the index.
5454

55-
1. Create a `vectorSearch` section in the index that specifies the algorithm used to create the embedding space. Currently, only `"hnsw"` is supported.
55+
1. Add a `vectorSearch` section in the index that specifies the algorithm used to create the embedding space. Currently, only `"hnsw"` is supported. For "metric", valid values are `cosine`, `euclidean`, and `dotProduct`. The `cosine` metric is specified because it's the similarity metric that the Azure OpenAI models use to create embeddings.
5656

5757
```json
5858
"vectorSearch": {
@@ -71,10 +71,6 @@ The schema must include fields for the document key, vector fields, and any othe
7171
}
7272
```
7373

74-
1. Add fields that define the substance and structure of the content you're indexing. At a minimum, you need a document key.
75-
76-
You should also add fields that are useful in the query or in it's response. The example below shows vector fields for title and content ("titleVector", "contentVector"). It also provides fields for equivalent textual content ("title", "content") useful for sorting, filtering, and reading in a search result.
77-
7874
1. Add vector fields to the fields collection. You can store one generated embedding per document field. For each vector field:
7975

8076
+ Assign the `Collection(Edm.Single)` data type.
@@ -84,6 +80,10 @@ The schema must include fields for the document key, vector fields, and any othe
8480
+ "searchable" must be "true".
8581
+ "retrievable" set to "true" allows you to display the raw vectors (for example, as a verification step), but doing so will increase storage. Set to "false" if you don't need to return raw vectors. You don't need to return vectors for a query, but if you're passing a vector result to a downstream app then set "retrievable" to "true".
8682

83+
1. Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
84+
85+
You should also add fields that are useful in the query or in its response. The example below shows vector fields for title and content ("titleVector", "contentVector") that are equivalent to vectors. It also provides fields for equivalent textual content ("title", "content") useful for sorting, filtering, and reading in a search result.
86+
8787
An index definition with the described elements looks like this:
8888

8989
```http

articles/search/vector-search-how-to-query.md

Lines changed: 7 additions & 3 deletions
@@ -85,17 +85,21 @@ When you're setting up the vector query, think about the response structure. Sea
8585

8686
Vectors aren't designed for readability, so avoid returning them in the response. Instead, choose non-vector fields that are representative of the search document. For example, if the query targets a "descriptionVector" field, return an equivalent text field if you have one ("description") in the response.
8787

88-
Size of the results is determined by the query parameters "k" and "top". Maximum results in a response are either:
88+
Size of the results is determined by the query parameters "k" and "top". Maximum results in a response are either:
8989

9090
+ `"k": n` results for vector-only queries
9191

9292
+ `"top": n` results for hybrid queries
9393

9494
Ranking of results is either:
9595

96-
+ Cosine similarity if the query is over a single vector field, assuming `cosine` is what you specified in the index `vectorConfiguration`. Azure OpenAI embedding models use cosine similarity, so if you're using Azure OpenAI embedding models, `cosine` is the recommended metric. Other supported ranking metrics include `euclidean` and `dotProduct`.
96+
+ Cosine similarity if the query is over a single vector field, assuming `cosine` is what you specified in the index `vectorConfiguration`.
9797

98-
+ Reciprocal Rank Fusion (RRF) if there are multiple sets of search results. Multiple sets are created if the query targets multiple vector fields, or if the query is a hybrid of vector and full text search, with or without the optional semantic reranking capabilities of [semantic search](semantic-search-overview.md).
98+
Azure OpenAI embedding models use cosine similarity, so if you're using Azure OpenAI embedding models, `cosine` is the recommended metric. Other supported ranking metrics include `euclidean` and `dotProduct`.
99+
100+
+ Reciprocal Rank Fusion (RRF) if there are multiple sets of search results.
101+
102+
Multiple sets are created if the query targets multiple vector fields, or if the query is a hybrid of vector and full text search, with or without the optional semantic reranking capabilities of [semantic search](semantic-search-overview.md).
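
As a rough illustration of how the three metrics named above differ, the following Python sketch computes each one from its textbook definition. It is not the service's implementation: the function names and sample vectors are hypothetical, and how the search engine converts a metric into a final score is not covered here.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product of the two vectors divided by the product of their lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical two-dimensional embeddings, for illustration only.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # points the same way as the query, but longer
unit_length = [0.6, 0.8]      # unit length, different direction

for name, vec in [("same_direction", same_direction), ("unit_length", unit_length)]:
    print(name,
          cosine_similarity(query, vec),
          dot_product(query, vec),
          euclidean_distance(query, vec))
```

Cosine similarity ignores vector length, so `same_direction` is a perfect match under `cosine` even though it is not the nearest point under `euclidean`. When all vectors are normalized to unit length, `cosine` and `dotProduct` produce the same ordering.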
99103

100104
Within vector search, a vector query can only target one internal vector index. So for [multiple vector fields](#query-syntax-for-vector-query-over-multiple-fields) and [multiple vector queries](#query-syntax-for-multiple-vector-queries), the search engine generates parallel queries that target the respective vector indexes of each field. Output is a set of ranked results for each query, which are fused using RRF. For more information, see [Vector query execution and scoring](vector-search-ranking.md).
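
The fusion step described above can be sketched in Python using the commonly published RRF formula, in which each document receives the sum of `1 / (rank_constant + rank)` over every result list it appears in. This is a minimal sketch under stated assumptions: the function name, the document IDs, and the default `rank_constant` of 60 are illustrative and are not confirmed by this article as the values the search engine uses.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, rank_constant=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    rank_constant is an assumed smoothing value, not a documented service setting.
    """
    scores = defaultdict(float)
    for results in result_lists:
        # Ranks start at 1 for the top result in each list.
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical ranked results from two parallel queries,
# one per vector field ("titleVector" and "contentVector").
title_hits = ["doc2", "doc1", "doc3"]
content_hits = ["doc1", "doc4", "doc2"]

fused = reciprocal_rank_fusion([title_hits, content_hits])
print(fused)
```

A document that appears near the top of several lists outranks one that appears at the top of only one, which is why RRF needs only rank positions and not scores that are comparable across queries.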
101105
