MicrosoftDocs
diff --git a/articles/search/index-ranking-similarity.md b/articles/search/index-ranking-similarity.md
Lines changed: 7 additions & 2 deletions
diff --git a/articles/search/index-similarity-and-scoring.md b/articles/search/index-similarity-and-scoring.md
Lines changed: 19 additions & 19 deletions
diff --git a/articles/search/media/search-explorer/search-explorer-full-text-query-hotels.png b/articles/search/media/search-explorer/search-explorer-full-text-query-hotels.png
99.7 KB
@@ -6,14 +6,19 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 09/07/2023
+ms.date: 09/25/2023
 ---
 
 # Configure BM25 relevance scoring
 
 In this article, learn how to configure the [BM25 relevance scoring algorithm](https://en.wikipedia.org/wiki/Okapi_BM25) used by Azure Cognitive Search for full text search queries. It also explains how to enable BM25 on older search services.
 
-BM25 applies to strings (text) on fields having a "searchable" attribution. At query time, the search engine uses BM25 to calculate a **@searchScore** for each match in a given query. Matching documents are ranked by their search score, with the top results returned in the query response.
+BM25 applies to:
+
++ Queries that use the `search` parameter for full text search.
++ Text fields that have a `searchable` attribution. Scoring is scoped to `searchFields`, or to all `searchable` fields if `searchFields` is null.
+
+The search engine uses BM25 to calculate a **@search.score** for each match in a given query. Matching documents are ranked by their search score, with the top results returned in the query response. Some [score variation](index-similarity-and-scoring.md#score-variation) is possible, even when the same query runs over the same search index, but these variations are usually small and don't change the overall ranking of results.
 
 BM25 has defaults for weighting term frequency and document length. You can customize these properties if the defaults aren't suited to your content. Configuration changes are scoped to individual indexes, which means you can adjust relevance scoring based on the characteristics of each index.
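
To make the two tunable weights concrete, here is a minimal sketch of the textbook Okapi BM25 score for a single query term, using the non-negative IDF variant popularized by Lucene. It is an illustration only, not the exact computation inside the service. The function name and arguments are hypothetical, and `k1=1.2` and `b=0.75` are assumed as the commonly cited defaults.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term against one document (textbook BM25, illustrative only)."""
    # Inverse document frequency: terms that are rare across the corpus weigh more.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: b=0 ignores document length, b=1 applies it fully.
    norm = 1 - b + b * (doc_len / avg_doc_len)
    # Term-frequency saturation: k1 controls how quickly repeated terms stop adding score.
    return idf * (tf * (k1 + 1)) / (tf + k1 * norm)


# Same term statistics, two document lengths (hypothetical numbers).
short_doc = bm25_term_score(tf=2, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_doc = bm25_term_score(tf=2, doc_len=400, avg_doc_len=100, n_docs=1000, doc_freq=10)
```

In this sketch, lowering `b` shrinks the gap between `short_doc` and `long_doc`, and raising `k1` lets repeated occurrences of a term keep adding to the score for longer.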
 
 
@@ -1,39 +1,36 @@
 ---
-title: Relevance and scoring
+title: BM25 relevance scoring
 titleSuffix: Azure Cognitive Search
-description: Explains the concepts of relevance and scoring in Azure Cognitive Search, and what a developer can do to customize the scoring result.
+description: Explains the concepts of BM25 relevance and scoring in Azure Cognitive Search, and what a developer can do to customize the scoring result.
 author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 08/31/2023
+ms.date: 09/25/2023
 ---
 
-# Relevance and scoring in Azure Cognitive Search
+# BM25 relevance and scoring for full text search
 
-This article explains the relevance and the scoring algorithms used to compute search scores in Azure Cognitive Search. A relevance score is computed for each match found in a [full text search](search-lucene-query-architecture.md), where the strongest matches are assigned higher search scores. 
+This article explains the BM25 relevance scoring algorithm used to compute search scores for [full text search](search-lucene-query-architecture.md). BM25 relevance is exclusive to full text search. Filter queries, autocomplete and suggested queries, and wildcard or fuzzy search queries aren't scored or ranked for relevance.
 
-Relevance applies to full text search only. Filter queries, autocomplete and suggested queries, wildcard search or fuzzy search queries aren't scored or ranked for relevance.
-
-In Azure Cognitive Search, you can tune search relevance and boost search scores through these mechanisms:
+In Azure Cognitive Search, you can configure algorithm parameters, tune search relevance, and boost search scores through these mechanisms:
 
 + Scoring algorithm configuration
-+ Semantic ranking (in preview, described in [this article](semantic-search-overview.md))
 + Scoring profiles
++ [Semantic ranking](semantic-search-overview.md)
 + Custom scoring logic enabled through the *featuresMode* parameter
 
-> [!NOTE]
-> Matches are scored and ranked from high to low. The score is returned as "@search.score". By default, the top 50 are returned in the response, but you can use the **$top** parameter to return a smaller or larger number of items (up to 1000 in a single response), and **$skip** to get the next set of results.
-
 ## Relevance scoring
 
-Relevance scoring refers to the computation of a search score that serves as an indicator of an item's relevance in the context of the current query. The higher the score, the more relevant the item. 
+Relevance scoring refers to the computation of a search score (**@search.score**) that serves as an indicator of an item's relevance in the context of the current query. The range is unbounded; the higher the score, the more relevant the item.
+
+By default, the 50 highest-scoring matches are returned in the response, but you can use the **$top** parameter to return a smaller or larger number of items (up to 1000 in a single response), and **$skip** to get the next set of results.
 
 The search score is computed based on statistical properties of the string input and the query itself. Azure Cognitive Search finds documents that match on search terms (some or all, depending on [searchMode](/rest/api/searchservice/search-documents#query-parameters)), favoring documents that contain many instances of the search term. The search score goes up even higher if the term is rare across the data index, but common within the document. The basis for this approach to computing relevance is known as *TF-IDF*, or term frequency-inverse document frequency.
 
-Search scores can be repeated throughout a result set. When multiple hits have the same search score, the ordering of the same scored items is undefined and not stable. Run the query again, and you might see items shift position, especially if you are using the free service or a billable service with multiple replicas. Given two items with an identical score, there's no guarantee that one appears first.
+Search scores can be repeated throughout a result set. When multiple hits have the same search score, the ordering of the same scored items is undefined and not stable. Run the query again, and you might see items shift position, especially if you're using the free service or a billable service with multiple replicas. Given two items with an identical score, there's no guarantee that one appears first.
 
-If you want to break the tie among repeating scores, you can add an **$orderby** clause to first order by score, then order by another sortable field (for example, `$orderby=search.score() desc,Rating desc`). For more information, see [$orderby](search-query-odata-orderby.md).
+To break the tie among repeating scores, you can add an **$orderby** clause to first order by score, then order by another sortable field (for example, `$orderby=search.score() desc,Rating desc`). For more information, see [$orderby](search-query-odata-orderby.md).
 
 > [!NOTE]
 > A `@search.score = 1` indicates an un-scored or un-ranked result set. The score is uniform across all results. Un-scored results occur when the query form is fuzzy search, wildcard or regex queries, or an empty search (`search=*`, sometimes paired with filters, where the filter is the primary means for returning a match).
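
The effect of the tie-breaking `$orderby` clause described above can be sketched client-side. The snippet below is only an illustration of the resulting order, because the service performs this sort itself when the clause is supplied. The sample hits and the `HotelName` field are made up; `Rating` stands in for any sortable field.

```python
def order_results(results):
    """Order by @search.score descending, then by Rating descending.

    Mirrors the ordering requested by
    $orderby=search.score() desc,Rating desc (illustrative only).
    """
    return sorted(results, key=lambda r: (-r["@search.score"], -r["Rating"]))


# Hypothetical hits: A and B share a score, so Rating decides their order.
hits = [
    {"HotelName": "A", "@search.score": 2.5, "Rating": 3.9},
    {"HotelName": "B", "@search.score": 2.5, "Rating": 4.6},
    {"HotelName": "C", "@search.score": 3.1, "Rating": 4.0},
]
ordered_names = [hit["HotelName"] for hit in order_results(hits)]
```

Without the second sort key, the relative order of A and B would be undefined and could change between runs.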
@@ -76,7 +73,7 @@ For scalability, Azure Cognitive Search distributes each index horizontally thro
 
 By default, the score of a document is calculated based on statistical properties of the data *within a shard*. This approach is generally not a problem for a large corpus of data, and it provides better performance than having to calculate the score based on information across all shards. That said, using this performance optimization could cause two very similar documents (or even identical documents) to end up with different relevance scores if they end up in different shards.
 
-If you prefer to compute the score based on the statistical properties across all shards, you can do so by adding *scoringStatistics=global* as a [query parameter](/rest/api/searchservice/search-documents) (or add *"scoringStatistics": "global"* as a body parameter of the [query request](/rest/api/searchservice/search-documents)).
+If you prefer to compute the score based on the statistical properties across all shards, you can do so by adding `scoringStatistics=global` as a [query parameter](/rest/api/searchservice/search-documents) (or add `"scoringStatistics": "global"` as a body parameter of the [query request](/rest/api/searchservice/search-documents)).
 
 ```http
 POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-version=2020-06-30
@@ -86,7 +83,7 @@ POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-ve
 }
 ```
 
-Using scoringStatistics will ensure that all shards in the same replica provide the same results. That said, different replicas may be slightly different from one another as they are always getting updated with the latest changes to your index. In some scenarios, you may want your users to get more consistent results during a "query session". In such scenarios, you can provide a `sessionId` as part of your queries. The `sessionId` is a unique string that you create to refer to a unique user session.
+Setting `scoringStatistics` to `global` ensures that all shards in the same replica provide the same results. That said, different replicas may be slightly different from one another as they're always getting updated with the latest changes to your index. In some scenarios, you may want your users to get more consistent results during a "query session". In such scenarios, you can provide a `sessionId` as part of your queries. The `sessionId` is a unique string that you create to refer to a unique user session.
 
 ```http
 POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-version=2020-06-30
@@ -96,7 +93,7 @@ POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-ve
 }
 ```
 
-As long as the same `sessionId` is used, a best-effort attempt will be made to target the same replica, increasing the consistency of results your users will see. 
+As long as the same `sessionId` is used, a best-effort attempt is made to target the same replica, increasing the consistency of results your users will see. 
 
 > [!NOTE]
 > Reusing the same `sessionId` values repeatedly can interfere with the load balancing of the requests across replicas and adversely affect the performance of the search service. The value used as sessionId cannot start with a '_' character.
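
As a rough sketch of how a client might assemble a request body that combines both options, the helper below builds the JSON payload without sending it. The function name is hypothetical, and generating the session value with `uuid` is an assumption; any unique string that doesn't start with `_` should work.

```python
import json
import uuid


def build_query_body(search_text, session_id=None, global_stats=False):
    """Build a Search Documents request body (hypothetical helper, no network call)."""
    body = {"search": search_text}
    if global_stats:
        # Compute scores from statistics across all shards instead of per shard.
        body["scoringStatistics"] = "global"
    if session_id is not None:
        # The service rejects session values that start with an underscore.
        if session_id.startswith("_"):
            raise ValueError("sessionId cannot start with '_'")
        body["sessionId"] = session_id
    return body


# One session value per user session, reused across that user's queries.
session_id = uuid.uuid4().hex
payload = json.dumps(build_query_body("hotel", session_id=session_id, global_stats=True))
```

Creating a fresh value for each user session, rather than sharing one value across all users, avoids the load-balancing problem called out in the note above.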
@@ -111,7 +108,7 @@ A scoring profile is part of the index definition, composed of weighted fields,
 
 ## featuresMode parameter (preview)
 
-[Search Documents](/rest/api/searchservice/preview-api/search-documents) requests have a new [featuresMode](/rest/api/searchservice/preview-api/search-documents#featuresmode) parameter that can provide additional detail about relevance at the field level. Whereas the `@searchScore` is calculated for the document all-up (how relevant is this document in the context of this query), through featuresMode you can get information about individual fields, as expressed in a `@search.features` structure. The structure contains all fields used in the query (either specific fields through **searchFields** in a query, or all fields attributed as **searchable** in an index). For each field, you get the following values:
+[Search Documents](/rest/api/searchservice/preview-api/search-documents) requests have a new [featuresMode](/rest/api/searchservice/preview-api/search-documents#featuresmode) parameter that can provide more detail about relevance at the field level. Whereas the `@search.score` is calculated for the document all-up (how relevant is this document in the context of this query), through `featuresMode` you can get information about individual fields, as expressed in a `@search.features` structure. The structure contains all fields used in the query (either specific fields through **searchFields** in a query, or all fields attributed as **searchable** in an index). For each field, you get the following values:
 
 + Number of unique tokens found in the field
 + Similarity score, or a measure of how similar the content of the field is, relative to the query term
@@ -134,6 +131,9 @@ For a query that targets the "description" and "title" fields, a response that i
             "similarityScore": 1.75451557,
             "termFrequency" : 6
         }
+    }
+ }
+]
 ```
 
 You can consume these data points in [custom scoring solutions](https://github.com/Azure-Samples/search-ranking-tutorial) or use the information to debug search relevance problems.
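
As one possible way to consume these data points, the sketch below picks the field that contributed the highest similarity score for a single result. The helper name is hypothetical, and the sample document is made up apart from the `title` values, which echo the response fragment above.

```python
def strongest_field(document):
    """Return the name of the field with the highest similarityScore, or None."""
    features = document.get("@search.features", {})
    if not features:
        return None
    return max(features, key=lambda name: features[name]["similarityScore"])


# Hypothetical result document with per-field relevance details.
doc = {
    "@search.score": 5.18,
    "@search.features": {
        "description": {"similarityScore": 0.29, "termFrequency": 2},
        "title": {"similarityScore": 1.75451557, "termFrequency": 6},
    },
}
best = strongest_field(doc)
```

A check like this can help when debugging relevance, for example to see whether matches in one field dominate the ranking more than intended.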