articles/search/index-ranking-similarity.md
+4 -4 (4 additions & 4 deletions)
@@ -7,7 +7,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 10/14/2022
+ms.date: 04/18/2023
 ---

 # Configure relevance scoring
@@ -18,10 +18,10 @@ Configuration changes are scoped to individual indexes, which means you can adju

 ## Default scoring algorithm

-Depending on the age of your search service, Azure Cognitive Search supports two [similarity scoring algorithms](index-similarity-and-scoring.md) for assigning relevance to results in a full text search query:
+Depending on the age of your search service, Azure Cognitive Search supports two [similarity scoring algorithms](index-similarity-and-scoring.md) for a full text search query:

-+ An *Okapi BM25* algorithm, used in all search services created after July 15, 2020
-+ A *classic similarity* algorithm, used by all search services created before July 15, 2020
++ Okapi BM25 algorithm (after July 15, 2020)
++ Classic similarity algorithm (before July 15, 2020)

 BM25 ranking is the default because it tends to produce search rankings that align better with user expectations. It includes [parameters](#set-bm25-parameters) for tuning results based on factors such as document size. For search services created after July 2020, BM25 is the only scoring algorithm. If you try to set "similarity" to ClassicSimilarity on a new service, an HTTP 400 error will be returned because that algorithm is not supported by the service.
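
Note on the "similarity" setting mentioned in the last context line: the sketch below shows roughly what a BM25 similarity fragment in an index definition might look like, built as a Python dictionary. The helper name `bm25_similarity` is hypothetical, and the `k1` and `b` defaults and the range checks are assumptions for illustration; check them against the linked parameter documentation before relying on them.

```python
import json

# Type name for the BM25 option in the index "similarity" property.
BM25_TYPE = "#Microsoft.Azure.Search.BM25Similarity"


def bm25_similarity(k1: float = 1.2, b: float = 0.75) -> dict:
    """Build the "similarity" fragment of an index definition (hypothetical helper).

    k1 controls term-frequency saturation; b controls how strongly
    document length affects the score. Ranges checked here are assumptions.
    """
    if k1 < 0:
        raise ValueError("k1 must be non-negative")
    if not 0.0 <= b <= 1.0:
        raise ValueError("b must be between 0 and 1")
    return {"similarity": {"@odata.type": BM25_TYPE, "k1": k1, "b": b}}


if __name__ == "__main__":
    # Print a fragment that could be merged into an index definition body.
    print(json.dumps(bm25_similarity(k1=1.3, b=0.5), indent=2))
```

Keeping the fragment in a small helper makes it harder to send an out-of-range parameter to the service by accident.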
articles/search/index-similarity-and-scoring.md
+17 -2 (17 additions & 2 deletions)
@@ -7,7 +7,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 10/14/2022
+ms.date: 04/18/2023
 ---

 # Relevance and scoring in Azure Cognitive Search
@@ -41,7 +41,12 @@ If you want to break the tie among repeating scores, you can add an **$orderby**

 ## Scoring algorithms in Search

-Azure Cognitive Search provides the `BM25Similarity` ranking algorithm. On older search services, you might be using `ClassicSimilarity`.
+Azure Cognitive Search provides the following scoring algorithms:
+
+| Algorithm | Usage | Range |
+|-----------|-------------|-------|
+| BM25Similarity | Built-in algorithm on all search services created after July 2020. You can tune relevance ranking, but on newer services, changing the algorithm isn't supported. | Unbounded range |
+|ClassicSimilarity | Used on older search services. You can [opt-in for BM25](index-ranking-similarity.md). | 0 < 1.00 |

 Both BM25 and Classic are TF-IDF-like retrieval functions that use the term frequency (TF) and the inverse document frequency (IDF) as variables to calculate relevance scores for each document-query pair, which is then used for ranking results. While conceptually similar to classic, BM25 is rooted in probabilistic information retrieval that produces more intuitive matches, as measured by user research.

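
Note on the TF and IDF description in this hunk: the following is a minimal sketch of the textbook Okapi BM25 per-term score, included only to show how term frequency, inverse document frequency, and document length interact. It is not the service's implementation; the function name and all parameter names are assumptions made for illustration.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term to one document's score.

    tf          -- occurrences of the term in the document
    doc_len     -- length of the document (in terms)
    avg_doc_len -- average document length in the collection
    n_docs      -- number of documents in the collection
    doc_freq    -- number of documents that contain the term
    """
    # Rare terms (small doc_freq) get a larger IDF weight.
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: b=0 ignores length, b=1 applies it fully.
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # Term frequency saturates as tf grows, at a rate controlled by k1.
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * norm)


if __name__ == "__main__":
    # A rare term in an average-length document scores well above 1.0.
    print(bm25_term_score(tf=3, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=1))
```

Because a document's score sums these per-term contributions, it is not capped at 1.00, which is consistent with the "Unbounded range" entry in the table above.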
@@ -54,6 +59,16 @@ The following video segment fast-forwards to an explanation of the generally ava
Search scores convey general sense of relevance, reflecting the strength of match relative to other documents in the same result set. But scores aren't always consistent from one query to the next, so as you work with queries, you might notice small discrepancies in how search documents are ordered. There are several explanations for why this might occur.
+
+| Cause | Description |
+|-----------|-------------|
+| Data volatility | Index content varies as you add, modify, or delete documents. Term frequencies will change as index updates are processed over time, affecting the search scores of matching documents. |
+| Multiple replicas | For services using multiple replicas, queries are issued against each replica in parallel. The index statistics used to calculate a search score are calculated on a per-replica basis, with results merged and ordered in the query response. Replicas are mostly mirrors of each other, but statistics can differ due to small differences in state. For example, one replica might have deleted documents contributing to their statistics, which were merged out of other replicas. Typically, differences in per-replica statistics are more noticeable in smaller indexes. For more information about this condition, see [Concepts: search units, replicas, partitions, shards](search-capacity-planning.md#concepts-search-units-replicas-partitions-shards) in the capacity planning documentation. |
+| Identical scores | If multiple documents have the same score, any one of them might appear first. |
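
Note on the "Identical scores" row: one way to see why a tie-breaker matters is to sort a small result list locally. The sketch below assumes each result is a dictionary carrying "@search.score" plus a unique key field; the `HotelId` field and the sample values are made up for illustration, and in practice the tie-break would more likely be expressed with **$orderby** on the request.

```python
def order_with_tiebreak(results, key_field):
    """Sort by descending score, then by a unique key so ties are deterministic."""
    return sorted(results, key=lambda doc: (-doc["@search.score"], doc[key_field]))


if __name__ == "__main__":
    # Hypothetical results: two documents share the same score.
    hits = [
        {"HotelId": "7", "@search.score": 2.5},
        {"HotelId": "3", "@search.score": 2.5},
        {"HotelId": "9", "@search.score": 4.1},
    ]
    for doc in order_with_tiebreak(hits, "HotelId"):
        print(doc["HotelId"], doc["@search.score"])
```

Without the second sort key, the two tied documents could legitimately come back in either order from one query to the next.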
articles/search/search-pagination-page-layout.md
+18 -20 (18 additions & 20 deletions)
@@ -8,7 +8,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 11/02/2022
+ms.date: 04/18/2023
 ---

 # How to work with search results in Azure Cognitive Search
@@ -27,9 +27,9 @@ Parameters on the query determine:

 Results are tabular, composed of fields of either all "retrievable" fields, or limited to just those fields specified in the **`$select`** parameters. Rows are the matching documents.

-While a search document might consist of a large number of fields, typically only a few are needed to represent each document in the result set. On a query request, append `$select=<field list>` to specify which fields include in the response. A field must be attributed as "retrievable" in the index to be included in a result.
+You can choose which fields are in search results. While a search document might have a large number of fields, typically only a few are needed to represent each document in results. On a query request, append `$select=<field list>` to specify which "retrievable" fields should appear in the response.

-Fields that work best include those that contrast and differentiate among documents, providing sufficient information to invite a click-through response on the part of the user. On an e-commerce site, it might be a product name, description, brand, color, size, price, and rating. For the built-in hotels-sample index, it might be the "select" fields in the following example:
+Pick fields that offer contrast and differentiation among documents, providing sufficient information to invite a click-through response on the part of the user. On an e-commerce site, it might be a product name, description, brand, color, size, price, and rating. For the built-in hotels-sample index, it might be the "select" fields in the following example:

```http
POST /indexes/hotels-sample-index/docs/search?api-version=2020-06-30
@@ -41,7 +41,7 @@ POST /indexes/hotels-sample-index/docs/search?api-version=2020-06-30
```

 > [!NOTE]
-> If want to include image files in a result, such as a product photo or logo, store them outside of Azure Cognitive Search, but include a field in your index to reference the image URL in the search document. Sample indexes that support images in the results include the **realestate-sample-us** demo (a built-in sample dataset that you can build easily in the Import Data wizard), and the [New York City Jobs demo app](https://aka.ms/azjobsdemo).
+> For images in results, such as a product photo or logo, store them outside of Azure Cognitive Search, but add a field in your index to reference the image URL in the search document. Sample indexes that demonstrate images in the results include the **realestate-sample-us** demo (a built-in sample dataset that you can build easily in the Import Data wizard), and the [New York City Jobs demo app](https://aka.ms/azjobsdemo).

 ### Tips for unexpected results

@@ -66,7 +66,7 @@ Count won't be affected by routine maintenance or other workloads on the search

 ## Paging results

-By default, the search engine returns up to the first 50 matches. The top 50 are determined by search score, assuming the query is full text search or semantic search. Otherwise, the top 50 are an arbitrary order for exact match queries (where "@searchScore=1.0").
+By default, the search engine returns up to the first 50 matches. The top 50 are determined by search score, assuming the query is full text search or semantic search. Otherwise, the top 50 are an arbitrary order for exact match queries (where uniform "@searchScore=1.0" indicates arbitrary ranking).

 To control the paging of all documents returned in a result set, add `$top` and `$skip` parameters to the query request. The following list explains the logic.

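
Note on the paging paragraph in this hunk: the arithmetic behind `$top` and `$skip` can be sketched as a small helper that builds a POST request body for a given page. The helper name `page_request` is hypothetical, the body keys `top`, `skip`, and `count` are assumed to mirror the query parameters, and the default page size of 50 is taken from the text above.

```python
def page_request(search_text, page, page_size=50):
    """Build a search request body for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return {
        "search": search_text,
        "count": True,                    # ask for the total number of matches
        "top": page_size,                 # how many results to return
        "skip": (page - 1) * page_size,   # how many results to pass over first
    }


if __name__ == "__main__":
    # Third page of 10 results each: skip the first 20 matches.
    print(page_request("beach", page=3, page_size=10))
```

As the next hunk header points out, a document can still be fetched twice if the index changes between page requests, so this helper alone does not guarantee stable paging.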
@@ -103,33 +103,31 @@ Notice that document 2 is fetched twice. This is because the new document 5 has

 ## Ordering results

-In a full text search query, results can be ranked by a search score, a semantic reranker score (if using [semantic search](semantic-search-overview.md)), or by an **`$orderby`** expression in the query request that specifies an explicit sort order.
+In a full text search query, results can be ranked by:

-Sorting methodologies aren't designed to be used together. For example, if you're sorting with **`$orderby`** for primary sorting, you can't apply a secondary sort based on search score (because the search score will be uniform).
++ a search score
++ a semantic reranker score
++ a sort order on a "sortable" field

-### Ordering by search score
+You can also boost any matches found in specific fields by adding a scoring profile.

-For full text search queries, results are automatically ranked by a search score, calculated based on term frequency and proximity in a document (derived from [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf)), with higher scores going to documents having more or stronger matches on a search term.
+### Order by search score

-The "@search.score" range is 0 up to (but not including) 1.00. A "@search.score" equal to 1.00 indicates an unscored or unranked result set, where the 1.0 score is uniform across all results. Unscored results occur when the query form is fuzzy search, wildcard or regex queries, or an empty search (`search=*`). If you need to impose a ranking structure over unscored results, an **`$orderby`** expression will help you achieve that objective.
+For full text search queries, results are automatically [ranked by a search score](index-similarity-and-scoring.md), calculated based on term frequency and proximity in a document (derived from [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf)), with higher scores going to documents having more or stronger matches on a search term.

-Search scores convey general sense of relevance, reflecting the strength of match relative to other documents in the same result set. But scores aren't always consistent from one query to the next, so as you work with queries, you might notice small discrepancies in how search documents are ordered. There are several explanations for why this might occur.
+The "@search.score" range is either unbounded, or 0 up to (but not including) 1.00 on older services.

-| Cause | Description |
-|-----------|-------------|
-| Data volatility | Index content varies as you add, modify, or delete documents. Term frequencies will change as index updates are processed over time, affecting the search scores of matching documents. |
-| Multiple replicas | For services using multiple replicas, queries are issued against each replica in parallel. The index statistics used to calculate a search score are calculated on a per-replica basis, with results merged and ordered in the query response. Replicas are mostly mirrors of each other, but statistics can differ due to small differences in state. For example, one replica might have deleted documents contributing to their statistics, which were merged out of other replicas. Typically, differences in per-replica statistics are more noticeable in smaller indexes. For more information about this condition, see [Concepts: search units, replicas, partitions, shards](search-capacity-planning.md#concepts-search-units-replicas-partitions-shards) in the capacity planning documentation. |
-| Identical scores | If multiple documents have the same score, any one of them might appear first. |
+For either algorithm, a "@search.score" equal to 1.00 indicates an unscored or unranked result set, where the 1.0 score is uniform across all results. Unscored results occur when the query form is fuzzy search, wildcard or regex queries, or an empty search (`search=*`). If you need to impose a ranking structure over unscored results, consider an **`$orderby`** expression to achieve that objective.

-### Ordering by the semantic reranker
+### Order by the semantic reranker

 If you're using [semantic search](semantic-search-overview.md), the "@search.rerankerScore" determines the sort order of your results.

 The "@search.rerankerScore" range is 1 to 4.00, where a higher score indicates a stronger semantic match.

-### Ordering with $orderby
+### Order with $orderby

-If consistent ordering is an application requirement, you can explicitly define an [**`$orderby`** expression](query-odata-filter-orderby-syntax.md) on a field. Only fields that are indexed as "sortable" can be used to order results.
+If consistent ordering is an application requirement, you can define an [**`$orderby`** expression](query-odata-filter-orderby-syntax.md) on a field. Only fields that are indexed as "sortable" can be used to order results.

 Fields commonly used in an **`$orderby`** include rating, date, and location. Filtering by location requires that the filter expression calls the [**`geo.distance()` function**](search-query-odata-geo-spatial-functions.md?#order-by-examples), in addition to the field name.

@@ -143,7 +141,7 @@ String fields (Edm.String, Edm.ComplexType subfields) are sorted in either [ASCI

 + Strings that lead with diacritics appear last (Äpfel, Öffnen, Üben)

-### Use a scoring profile to influence relevance
+### Boost relevance using a scoring profile

 Another approach that promotes order consistency is using a [custom scoring profile](index-add-scoring-profiles.md). Scoring profiles give you more control over the ranking of items in search results, with the ability to boost matches found in specific fields. The extra scoring logic can help override minor differences among replicas because the search scores for each document are farther apart. We recommend the [ranking algorithm](index-ranking-similarity.md) for this approach.
articles/search/search-query-odata-search-score-function.md
+6 -3 (6 additions & 3 deletions)
@@ -8,7 +8,7 @@ author: bevloh
 ms.author: beloh
 ms.service: cognitive-search
 ms.topic: reference
-ms.date: 09/16/2021
+ms.date: 04/18/2023
 translation.priority.mt:
 - "de-de"
 - "es-es"
@@ -23,11 +23,14 @@ translation.priority.mt:
 ---
 # OData `search.score` function in Azure Cognitive Search

-When you send a query to Azure Cognitive Search without the [**$orderby** parameter](search-query-odata-orderby.md), the results that come back will be sorted in descending order by relevance score. Even when you do use **$orderby**, the relevance score will be used to break ties by default. However, sometimes it is useful to use the relevance score as an initial sort criteria, and some other criteria as the tie-breaker. The `search.score` function allows you to do this.
+When you send a query to Azure Cognitive Search without the [**$orderby** parameter](search-query-odata-orderby.md), the results that come back will be sorted in descending order by relevance score. Even when you do use **$orderby**, the relevance score is used to break ties by default. However, sometimes it's useful to use the relevance score as an initial sort criteria, and some other criteria as the tie-breaker. The example in this article demonstrates using the `search.score` function for sorting.
+
+> [!NOTE]
+> The relevance score is computed by the similarity ranking algorithm, and the range varies depending on which algorithm you use. For more information, see [Relevance and scoring in Azure Cognitive Search](index-similarity-and-scoring.md).

 ## Syntax

-The syntax for `search.score` in **$orderby** is `search.score()`. The function `search.score`does not take any parameters. It can be used with the `asc` or `desc` sort-order specifier, just like any other clause in the **$orderby** parameter. It can appear anywhere in the list of sort criteria.
+The syntax for `search.score` in **$orderby** is `search.score()`. The function `search.score`doesn't take any parameters. It can be used with the `asc` or `desc` sort-order specifier, just like any other clause in the **$orderby** parameter. It can appear anywhere in the list of sort criteria.
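
Note on the syntax paragraph: the sketch below assembles an **$orderby** value that uses `search.score()` as the first sort criterion and another field as the tie-breaker. The helper name `build_orderby` and the `Rating` field are assumptions for illustration; only `search.score()` and the `asc`/`desc` specifiers come from the text above.

```python
def build_orderby(*clauses):
    """Join (expression, direction) pairs into an $orderby string (hypothetical helper)."""
    parts = []
    for expression, direction in clauses:
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction!r}")
        parts.append(f"{expression} {direction}")
    return ",".join(parts)


if __name__ == "__main__":
    # Relevance first, then the (assumed) Rating field as the tie-breaker.
    print(build_orderby(("search.score()", "desc"), ("Rating", "desc")))
```

Because `search.score()` can appear anywhere in the list, the same helper also covers the reverse arrangement, with a field first and the relevance score as the tie-breaker.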