articles/search/vector-search-how-to-query.md (1 addition, 1 deletion)

@@ -95,7 +95,7 @@ Ranking of results is either:
+ Cosine similarity if the query is over a single vector field, assuming `cosine` is what you specified in the index `vectorConfiguration`. Azure OpenAI embedding models use cosine similarity metrics. Other supported ranking metrics include `euclidean` and `dotProduct`.

-+ Reciprocal Rank Fusion (RRF) if there multiple sets of search results. Multiple sets are created if the query targets multiple vector fields, or if the query is a hybrid of vector and full text search, with or withou the optional semantic re-ranking capabilities of [semantic search](semantic-search-overview.md).
++ Reciprocal Rank Fusion (RRF) if there are multiple sets of search results. Multiple sets are created if the query targets multiple vector fields, or if the query is a hybrid of vector and full text search, with or without the optional semantic reranking capabilities of [semantic search](semantic-search-overview.md).

Within vector search, a vector query can only target one internal vector index. So for [multiple vector fields](#query-syntax-for-vector-query-over-multiple-fields) and [multiple vector queries](#query-syntax-for-multiple-vector-queries), the search engine generates parallel queries that target the respective vector indexes of each field. Output is a set of ranked results for each query, which are fused using RRF. For more information, see [Vector query execution and scoring](vector-search-ranking.md).
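To make the fusion step described above concrete, here is a minimal Python sketch of Reciprocal Rank Fusion over two ranked result sets. The smoothing constant `k=60` and the helper names are assumptions based on the commonly published RRF formula, not details taken from the diff or from Cognitive Search internals.

```python
from collections import defaultdict

def rrf_fuse(result_sets, k=60):
    """Fuse several ranked lists of document keys into one ranked list.

    Each document's fused score is the sum of 1 / (k + rank) over every
    list in which it appears, so documents ranked highly by several
    parallel queries rise to the top.
    """
    scores = defaultdict(float)
    for results in result_sets:
        for rank, doc_key in enumerate(results, start=1):
            scores[doc_key] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# One ranked list per vector field (or per vector/text query in a hybrid search).
results_field_a = ["doc3", "doc1", "doc7"]
results_field_b = ["doc1", "doc9", "doc3"]
print(rrf_fuse([results_field_a, results_field_b]))  # doc1 and doc3 score highest
```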
articles/search/vector-search-overview.md (3 additions, 3 deletions)

@@ -54,9 +54,9 @@ Scenarios for vector search include:
+**Multi-lingual search**. Use a multi-lingual embeddings model to represent your document in multiple languages in a single vector space to find documents regardless of the language they are in.

-+**Hybrid search**. Vector search is implemented at the field level, which means you can build qeuries that include vector fields and searchable text fields. The queries execute in parallel and the results are merged into a single reponse. Optionally, add [semantic search (preview)](semantic-search-overview.md) for even more accuracy with L2 reranking using the same language models that power Bing.
++**Hybrid search**. Vector search is implemented at the field level, which means you can build queries that include vector fields and searchable text fields. The queries execute in parallel and the results are merged into a single response. Optionally, add [semantic search (preview)](semantic-search-overview.md) for even more accuracy with L2 reranking using the same language models that power Bing.

-+**Filtered vector search**. A query request can include a vector query and a [filter expression](search-filters.md). Filters apply to text and numeric fields, and are useful for including or excluding search documents based on filter criteria. Athough a vector field is not filterable itself, you can set up a filterable text or numeric field. The search engine processes the filter first, reducing the surface area of the search corpus before running the vector query.
++**Filtered vector search**. A query request can include a vector query and a [filter expression](search-filters.md). Filters apply to text and numeric fields, and are useful for including or excluding search documents based on filter criteria. Although a vector field isn't filterable itself, you can set up a filterable text or numeric field. The search engine processes the filter first, reducing the surface area of the search corpus before running the vector query.

+**Vector database**. Use Cognitive Search as a vector store to serve as long-term memory or an external knowledge base for Large Language Models (LLMs), or other applications.
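As an illustration of the "filter first, then run the vector query" flow described in the filtered vector search item above, here is a small Python sketch. The in-memory document list, the `category` field, and the cosine scoring are illustrative assumptions; the actual service evaluates the filter inside the engine before the vector query runs.

```python
import numpy as np

documents = [
    {"id": "1", "category": "hotel",  "vector": np.array([0.1, 0.9])},
    {"id": "2", "category": "museum", "vector": np.array([0.8, 0.2])},
    {"id": "3", "category": "hotel",  "vector": np.array([0.7, 0.3])},
]

def filtered_vector_search(query_vector, category, k=2):
    # 1. Apply the filter on a filterable (non-vector) field to shrink the corpus.
    candidates = [d for d in documents if d["category"] == category]

    # 2. Rank only the remaining documents by cosine similarity to the query vector.
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(candidates,
                    key=lambda d: cosine(query_vector, d["vector"]),
                    reverse=True)
    return ranked[:k]

print([d["id"] for d in filtered_vector_search(np.array([0.6, 0.4]), "hotel")])  # -> ['3', '1']
```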
@@ -99,7 +99,7 @@ For example, documents that talk about different species of dogs would be cluste

Popular vector similarity metrics include the following, which are all supported by Azure Cognitive Search.

+`euclidean` (also known as `L2 norm`): This measures the length of the vector difference between two vectors.
-+`cosine`: This measures the angle between two vectors, and is not affected by differing vector lengths.
++`cosine`: This measures the angle between two vectors, and isn't affected by differing vector lengths.
+`dotProduct`: This measures both the length of each of the pair of two vectors, and the angle between them. For normalized vectors, this is identical to `cosine` similarity, but slightly more performant.
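A small Python sketch of what these three metrics compute, using NumPy. These are the standard textbook definitions, shown for intuition; they aren't the service's internal implementation.

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))            # length of the difference vector (L2 norm)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # angle only

def dot_product(a, b):
    return float(np.dot(a, b))                      # grows with both magnitude and alignment

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])                       # same direction as a, twice the length
print(euclidean(a, b))                              # 3.74... (nonzero distance)
print(cosine(a, b))                                 # 1.0 (identical direction, length ignored)
print(dot_product(a, b))                            # 28.0 (reflects both length and direction)
```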
articles/search/vector-search-ranking.md (5 additions, 5 deletions)

@@ -19,22 +19,22 @@ This article is for developers who need a deeper understanding of vector query e
## Vector similarity

-In a vector query, the search query is a vector as opposed to text in full-text queries. Documents that match the vector query are ranked using vector similarity configured on the vector field defined in the index. A vector query specifies the `k` parameter which determines how many nearest neighbors of the query vector should be returned from the index.
+In a vector query, the search query is a vector as opposed to text in full-text queries. Documents that match the vector query are ranked using vector similarity configured on the vector field defined in the index. A vector query specifies the `k` parameter, which determines how many nearest neighbors of the query vector should be returned from the index.

> [!NOTE]
> Full-text search queries could return fewer than the requested number of results if there are fewer or no matches, but vector search will return up to `k` matches as long as there are enough documents in the index. This is because with vector search, similarity is relative to the input query vector, not absolute. This means less relevant results have a worse similarity score, but they can still be the "nearest" vectors if there aren't any closer vectors. As such, a response with no meaningful results can still return `k` results, but each result's similarity score would be low.

In a typical application, the input data within a query request would be fed into the same machine learning model that generated the embedding space for the vector index. This model would output a vector in the same embedding space. Since similar data are clustered close together, finding matches is equivalent to finding the nearest vectors and returning the associated documents as the search result.

-If a query request is about dogs, the model maps the query into a vector that exists somewhere in the cluster of vectors representing documents about dogs. Finding the nearest vectors, or the most "similar" vector based on a similarity metric, would return those relevant documents.
+If a query request is about dogs, the model maps the query into a vector that exists somewhere in the cluster of vectors representing documents about dogs. Identifying which vectors are the most similar to the query, based on a similarity metric, determines which documents are the most relevant.

Commonly used similarity metrics include `cosine`, `euclidean` (also known as `l2 norm`), and `dotProduct`, which are summarized here:

-+ cosine calculates the angle between two vectors. Cosine is the similarity metric used by [Azure OpenAI embedding models](/azure/cognitive-services/openai/concepts/understand-embeddings#cosine-similarity).
++`cosine` calculates the angle between two vectors. Cosine is the similarity metric used by [Azure OpenAI embedding models](/azure/cognitive-services/openai/concepts/understand-embeddings#cosine-similarity).

-+ euclidean calculates the Euclidean distance between two vectors, which is the l2-norm of the difference of the two vectors.
++`euclidean` calculates the Euclidean distance between two vectors, which is the l2-norm of the difference of the two vectors.

-+ dotProduct is affected by both vectors' magnitudes and the angle between them.
++`dotProduct` is affected by both vectors' magnitudes and the angle between them.

For normalized embedding spaces, dotProduct is equivalent to the cosine similarity, but is more efficient.
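A quick numeric check of that last claim, as a sketch: once vectors are normalized to unit length, the dot product and the cosine similarity coincide, which is why the cheaper `dotProduct` metric can stand in for `cosine`. The 1,536-dimension size below is only an example, not a requirement.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=1536)          # example dimensionality; any size works
b = rng.normal(size=1536)
a /= np.linalg.norm(a)             # normalize both vectors to unit length
b /= np.linalg.norm(b)

dot = np.dot(a, b)
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.isclose(dot, cos))        # True: dotProduct equals cosine for normalized vectors
```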