Commit 9f8447b

BM25 edits
1 parent 8639ecd commit 9f8447b

3 files changed

+35
-37
lines changed

articles/search/index-similarity-and-scoring.md

Lines changed: 14 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -1,39 +1,36 @@
11
---
2-
title: Relevance and scoring
2+
title: BM25 relevance scoring
33
titleSuffix: Azure Cognitive Search
4-
description: Explains the concepts of relevance and scoring in Azure Cognitive Search, and what a developer can do to customize the scoring result.
4+
description: Explains the concepts of BM25 relevance and scoring in Azure Cognitive Search, and what a developer can do to customize the scoring result.
55
author: HeidiSteen
66
ms.author: heidist
77
ms.service: cognitive-search
88
ms.topic: conceptual
9-
ms.date: 08/31/2023
9+
ms.date: 09/25/2023
1010
---
1111

12-
# Relevance and scoring in Azure Cognitive Search
12+
# BM25 relevance and scoring for full text search
1313

14-
This article explains the relevance and the scoring algorithms used to compute search scores in Azure Cognitive Search. A relevance score is computed for each match found in a [full text search](search-lucene-query-architecture.md), where the strongest matches are assigned higher search scores.
14+
This article explains the BM25 relevance scoring algorithm used to compute search scores for [full text search](search-lucene-query-architecture.md) queries in Azure Cognitive Search. A relevance score is computed for every match as **@search.score**, where the strongest matches are assigned higher search scores. By default, the top 50 are returned in the response, but you can use the **$top** parameter to return a smaller or larger number of items (up to 1000 in a single response), and **$skip** to get the next set of results.
1515

16-
Relevance applies to full text search only. Filter queries, autocomplete and suggested queries, wildcard search or fuzzy search queries aren't scored or ranked for relevance.
16+
BM25 relevance applies to *full text search* only. Filter queries, autocomplete and suggested queries, wildcard search or fuzzy search queries aren't scored or ranked for relevance.
1717

18-
In Azure Cognitive Search, you can tune search relevance and boost search scores through these mechanisms:
18+
In Azure Cognitive Search, you can configure algorithm parameters, and tune search relevance and boost search scores through these mechanisms:
1919

2020
+ Scoring algorithm configuration
21-
+ Semantic ranking (in preview, described in [this article](semantic-search-overview.md))
2221
+ Scoring profiles
22+
+ [Semantic ranking](semantic-search-overview.md)
2323
+ Custom scoring logic enabled through the *featuresMode* parameter
2424

25-
> [!NOTE]
26-
> Matches are scored and ranked from high to low. The score is returned as "@search.score". By default, the top 50 are returned in the response, but you can use the **$top** parameter to return a smaller or larger number of items (up to 1000 in a single response), and **$skip** to get the next set of results.
27-
2825
## Relevance scoring
2926

30-
Relevance scoring refers to the computation of a search score that serves as an indicator of an item's relevance in the context of the current query. The higher the score, the more relevant the item.
27+
Relevance scoring refers to the computation of a search score that serves as an indicator of an item's relevance in the context of the current query. The range is unbounded. However, the higher the score, the more relevant the item.
3128

3229
The search score is computed based on statistical properties of the string input and the query itself. Azure Cognitive Search finds documents that match on search terms (some or all, depending on [searchMode](/rest/api/searchservice/search-documents#query-parameters)), favoring documents that contain many instances of the search term. The search score goes up even higher if the term is rare across the data index, but common within the document. The basis for this approach to computing relevance is known as *TF-IDF*, or term frequency-inverse document frequency.
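
To make the paragraph above concrete, here is a minimal sketch of the textbook Okapi BM25 formula in Python. It is not the service's implementation: the function name `bm25_scores`, the toy documents, and the whitespace tokenization are all assumptions for illustration, and the `k1` and `b` values are simply common defaults. It only demonstrates the two effects described above: terms that are rare across the index contribute more, and repeated occurrences within a document raise the score with diminishing returns.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with textbook Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Terms that are rare across the collection get a larger IDF.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates (k1) and is length-normalized (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (illustrative only), tokenized on whitespace.
docs = [
    "luxury hotel with ocean view".split(),
    "budget hotel near airport hotel shuttle".split(),
    "ocean ocean view restaurant".split(),
]
scores = bm25_scores(["ocean"], docs)
print(scores)
```

In this toy corpus the third document mentions the term twice and is shorter than average, so it outranks the first; the second document contains no match and scores zero.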
3330

34-
Search scores can be repeated throughout a result set. When multiple hits have the same search score, the ordering of the same scored items is undefined and not stable. Run the query again, and you might see items shift position, especially if you are using the free service or a billable service with multiple replicas. Given two items with an identical score, there's no guarantee that one appears first.
31+
Search scores can be repeated throughout a result set. When multiple hits have the same search score, the ordering of the same scored items is undefined and not stable. Run the query again, and you might see items shift position, especially if you're using the free service or a billable service with multiple replicas. Given two items with an identical score, there's no guarantee that one appears first.
3532

36-
If you want to break the tie among repeating scores, you can add an **$orderby** clause to first order by score, then order by another sortable field (for example, `$orderby=search.score() desc,Rating desc`). For more information, see [$orderby](search-query-odata-orderby.md).
33+
To break the tie among repeating scores, you can add an **$orderby** clause to first order by score, then order by another sortable field (for example, `$orderby=search.score() desc,Rating desc`). For more information, see [$orderby](search-query-odata-orderby.md).
3734

3835
> [!NOTE]
3936
> A `@search.score = 1` indicates an un-scored or un-ranked result set. The score is uniform across all results. Un-scored results occur when the query form is fuzzy search, wildcard or regex queries, or an empty search (`search=*`, sometimes paired with filters, where the filter is the primary means for returning a match).
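
The paging and tie-breaking parameters touched by this hunk can be sketched together as a request body. This is a rough Python illustration, not code from the article: `build_query_body` is a hypothetical helper, the `Rating` field is borrowed from the `$orderby` example above, and the `top`, `skip`, and `orderby` keys are the POST-body counterparts of `$top`, `$skip`, and `$orderby`.

```python
import json

MAX_TOP = 1000  # per the article text: up to 1000 items in a single response


def build_query_body(search_text, top=50, skip=0, tie_breaker="Rating desc"):
    """Build a JSON-serializable body for a full text search request (hypothetical helper)."""
    if not 0 <= top <= MAX_TOP:
        raise ValueError(f"top must be between 0 and {MAX_TOP}")
    if skip < 0:
        raise ValueError("skip must be non-negative")
    return {
        "search": search_text,
        "top": top,
        "skip": skip,
        # Order by score first, then by a sortable field to break ties.
        "orderby": f"search.score() desc,{tie_breaker}",
    }


# Second page of 50 results for an illustrative query.
body = build_query_body("beach access", top=50, skip=50)
print(json.dumps(body, indent=2))
```

The secondary sort key only helps if the field is attributed as sortable in the index and has few repeated values; otherwise ties on both keys remain in undefined order.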
@@ -86,7 +83,7 @@ POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-ve
8683
}
8784
```
8885

89-
Using scoringStatistics will ensure that all shards in the same replica provide the same results. That said, different replicas may be slightly different from one another as they are always getting updated with the latest changes to your index. In some scenarios, you may want your users to get more consistent results during a "query session". In such scenarios, you can provide a `sessionId` as part of your queries. The `sessionId` is a unique string that you create to refer to a unique user session.
86+
Using scoringStatistics will ensure that all shards in the same replica provide the same results. That said, different replicas may be slightly different from one another as they're always getting updated with the latest changes to your index. In some scenarios, you may want your users to get more consistent results during a "query session". In such scenarios, you can provide a `sessionId` as part of your queries. The `sessionId` is a unique string that you create to refer to a unique user session.
9087

9188
```http
9289
POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-version=2020-06-30
@@ -96,7 +93,7 @@ POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-ve
9693
}
9794
```
9895

99-
As long as the same `sessionId` is used, a best-effort attempt will be made to target the same replica, increasing the consistency of results your users will see.
96+
As long as the same `sessionId` is used, a best-effort attempt is made to target the same replica, increasing the consistency of results your users will see.
10097

10198
> [!NOTE]
10299
> Reusing the same `sessionId` values repeatedly can interfere with the load balancing of the requests across replicas and adversely affect the performance of the search service. The value used as sessionId cannot start with a '_' character.
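
As a rough sketch of the `sessionId` guidance above, the following Python generates one identifier per user session and attaches it to each query body. The helper names `new_session_id` and `with_session` are assumptions for illustration; only the `sessionId` property and the rule about a leading `_` come from the text.

```python
import uuid


def new_session_id():
    """Return a unique string to use as sessionId (hypothetical helper)."""
    # uuid4().hex contains only hex digits, so it never starts with '_'.
    return uuid.uuid4().hex


def with_session(body, session_id):
    """Return a copy of a query body with sessionId attached."""
    if not session_id or session_id.startswith("_"):
        raise ValueError("sessionId must be non-empty and must not start with '_'")
    return {**body, "sessionId": session_id}


# Reuse one id for every query in the same user session.
session_id = new_session_id()
first = with_session({"search": "hotel"}, session_id)
second = with_session({"search": "hotel", "skip": 50}, session_id)
print(first["sessionId"] == second["sessionId"])
```

Generating a fresh id per user session, rather than sharing one value across all users, keeps with the note above about load balancing across replicas.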
@@ -111,7 +108,7 @@ A scoring profile is part of the index definition, composed of weighted fields,
111108

112109
## featuresMode parameter (preview)
113110

114-
[Search Documents](/rest/api/searchservice/preview-api/search-documents) requests have a new [featuresMode](/rest/api/searchservice/preview-api/search-documents#featuresmode) parameter that can provide additional detail about relevance at the field level. Whereas the `@searchScore` is calculated for the document all-up (how relevant is this document in the context of this query), through featuresMode you can get information about individual fields, as expressed in a `@search.features` structure. The structure contains all fields used in the query (either specific fields through **searchFields** in a query, or all fields attributed as **searchable** in an index). For each field, you get the following values:
111+
[Search Documents](/rest/api/searchservice/preview-api/search-documents) requests have a new [featuresMode](/rest/api/searchservice/preview-api/search-documents#featuresmode) parameter that can provide more detail about relevance at the field level. Whereas the `@search.score` is calculated for the document all-up (how relevant is this document in the context of this query), through featuresMode you can get information about individual fields, as expressed in a `@search.features` structure. The structure contains all fields used in the query (either specific fields through **searchFields** in a query, or all fields attributed as **searchable** in an index). For each field, you get the following values:
115112

116113
+ Number of unique tokens found in the field
117114
+ Similarity score, or a measure of how similar the content of the field is, relative to the query term

articles/search/search-query-create.md

Lines changed: 19 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,25 +1,31 @@
11
---
2-
title: Create a query
2+
title: Full-text query
33
titleSuffix: Azure Cognitive Search
4-
description: Learn how to construct a query request in Cognitive Search, which tools and APIs to use for testing and code, and how query decisions start with index design.
4+
description: Learn how to construct a query request for full text search in Azure Cognitive Search.
55

66
manager: nitinme
77
author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
10-
ms.topic: conceptual
11-
ms.date: 03/22/2023
10+
ms.topic: how-to
11+
ms.date: 09/25/2023
1212
---
1313

14-
# Creating queries in Azure Cognitive Search
14+
# Create a full-text query in Azure Cognitive Search
1515

16-
If you're building a query for the first time, this article describes approaches and methods for setting up the request. It also introduces a query structure, and explains how field attributes and linguistic analyzers can impact query outcomes.
16+
If you're building a query for [full text search](search-lucene-query-architecture.md), this article provides steps for setting up the request. It also introduces a query structure, and explains how field attributes and linguistic analyzers can impact query outcomes.
1717

18-
## What's a query request?
18+
## Prerequisites
1919

20-
A query is a read-only request against the docs collection of a single search index. It specifies a 'search' parameter, which contains the query expression consisting of terms, quote-enclosed phrases, and operators.
20+
+ A [search index](search-how-to-create-search-index.md) with string fields attributed as `searchable`.
2121

22-
Other parameters on the request provide more definition to the query and response. For example, 'searchFields' scopes query execution to specific fields, 'select' specifies which fields are returned in results, and 'count' returns the number of matches found in the index.
22+
+ Read permissions on the documents collection of a search index. To send a query, include a [query API key](search-security-api-keys.md) on the request, or give the caller [Search Index Data Reader](search-security-rbac.md) permissions.
23+
24+
## Example of a basic query request
25+
26+
In Azure Cognitive Search, a query is a read-only request against the docs collection of a single search index. The query expression is specified in a `search` parameter and consists of terms, quote-enclosed phrases, and operators.
27+
28+
Other parameters on the request add definition to the query and response. For example, `searchFields` scopes query execution to specific fields, `select` specifies which fields are returned in results, and `count` returns the number of matches found in the index.
2329

2430
The following example gives you a general idea of a query request by showing some of the available parameters. For more information about query composition, see [Query types and compositions](search-query-overview.md) and [Search Documents (REST)](/rest/api/searchservice/search-documents).
2531

@@ -37,15 +43,9 @@ POST https://[service name].search.windows.net/indexes/hotels-sample-index/docs/
3743

3844
## Choose a client
3945

40-
For early development and proof-of-concept testing, we recommend starting with an interactive tool like Azure portal, or the Postman app for making REST API calls. With these approaches, you can test a query request in isolation and assess the effects of different properties without having to write any code.
41-
42-
To call search from within an app, we recommend the Azure.Document.Search client libraries in the Azure SDKs for .NET, Java, JavaScript, and Python.
43-
44-
### Permissions
45-
46-
A query request requires read permissions, granted via an API key passed in the header. Any operation, including query requests, will work under an [admin API key](search-security-api-keys.md), but query requests can optionally use a [query API key](search-security-api-keys.md#create-query-keys). Query API keys are strongly recommended. You can create up to 50 per service and assign different keys to different applications.
46+
For early development and proof-of-concept testing, start with Azure portal or the Postman app for making REST API calls. These approaches are interactive, useful for targeted testing, and help you assess the effects of different properties without having to write any code.
4747

48-
In Azure portal, access to the built-in tools, wizards, and objects require membership in the Contributor role or higher on the search service.
48+
To call search from within an app, use the **Azure.Search.Documents** client libraries in the Azure SDKs for .NET, Java, JavaScript, and Python.
4949

5050
### Use Azure portal to query an index
5151

@@ -96,9 +96,9 @@ Search is fundamentally a user-driven exercise, where terms or phrases are colle
9696

9797
## Effect of field attributes on queries
9898

99-
If you're familiar with [query types and composition](search-query-overview.md), you might remember that the parameters on a query request depend on field attributes in an index. For example, only fields marked as *searchable* and *retrievable* can be used in queries and search results. When setting the `search`, `filter`, and `orderby` parameters in your request, you should check attributes to avoid unexpected results.
99+
If you're familiar with [query types and composition](search-query-overview.md), you might remember that the parameters on a query request depend on field attributes in an index. For example, only fields marked as `searchable` and `retrievable` can be used in queries and search results. When setting the `search`, `filter`, and `orderby` parameters in your request, you should check attributes to avoid unexpected results.
100100

101-
In the portal screenshot below of the [hotels sample index](search-get-started-portal.md), only the last two fields "LastRenovationDate" and "Rating" can be used in an `"$orderby"` only clause.
101+
In the portal screenshot below of the [hotels sample index](search-get-started-portal.md), only the last two fields, "LastRenovationDate" and "Rating", are `sortable`, which is required for use in an `$orderby` clause.
102102

103103
![Index definition for the hotel sample](./media/search-query-overview/hotel-sample-index-definition.png "Index definition for the hotel sample")
104104

articles/search/search-query-overview.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,8 +8,9 @@ author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
1010
ms.topic: conceptual
11-
ms.date: 03/01/2023
11+
ms.date: 09/25/2023
1212
---
13+
1314
# Querying in Azure Cognitive Search
1415

1516
Azure Cognitive Search offers a rich query language to support a broad range of scenarios, from free text search, to highly-specified query patterns. This article describes query requests and the kinds of queries you can create.
