Commit 41dfb69

Merge pull request #202653 from HeidiSteen/heidist-support-case
[azure search] Edits to similarity scoring articles
2 parents b4ae0c1 + 29b3077 commit 41dfb69

2 files changed

+14
-15
lines changed

articles/search/index-ranking-similarity.md

Lines changed: 7 additions & 9 deletions
@@ -17,20 +17,16 @@ Depending on the age of your search service, Azure Cognitive Search supports two
1717
+ An *Okapi BM25* algorithm, used in all search services created after July 15, 2020
1818
+ A *classic similarity* algorithm, used by all search services created before July 15, 2020
1919

20-
BM25 ranking is the default because it tends to produce search rankings that align better with user expectations. It includes [parameters](#set-bm25-parameters) for tuning results based on factors such as document size.
21-
22-
For search services created after July 2020, BM25 is the sole similarity algorithm. If you try to set similarity to ClassicSimilarity on a new service, an HTTP 400 error will be returned because that algorithm is not supported by the service.
20+
BM25 ranking is the default because it tends to produce search rankings that align better with user expectations. It includes [parameters](#set-bm25-parameters) for tuning results based on factors such as document size. For search services created after July 2020, BM25 is the sole similarity algorithm. If you try to set "similarity" to ClassicSimilarity on a new service, an HTTP 400 error will be returned because that algorithm is not supported by the service.
2321

2422
For older services, classic similarity remains the default algorithm. Older services can [upgrade to BM25](#enable-bm25-scoring-on-older-services) on a per-index basis. When switching from classic to BM25, you can expect to see some differences how search results are ordered.
2523

2624
## Set BM25 parameters
2725

2826
BM25 similarity adds two parameters to control the relevance score calculation. To set "similarity" parameters, issue a [Create or Update Index](/rest/api/searchservice/create-index) request as illustrated by the following example.
2927

30-
Because Cognitive Search won't allow updates to a live index, you'll need to take the index offline so that the parameters can be added. Indexing and query requests will fail while the index is offline. The duration of the outage is the amount of time it takes to update the index, usually no more than several seconds. When the update is complete, the index comes back automatically. To take the index offline, append the "allowIndexDowntime=true" URI parameter on the request that sets the "similarity" property:
31-
3228
```http
33-
PUT https://[search service name].search.windows.net/indexes/[index name]?api-version=2020-06-30&allowIndexDowntime=true
29+
PUT [service-name].search.windows.net/indexes/[index-name]?api-version=2020-06-30&allowIndexDowntime=true
3430
{
3531
"similarity": {
3632
"@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
@@ -40,6 +36,8 @@ PUT https://[search service name].search.windows.net/indexes/[index name]?api-ve
4036
}
4137
```
4238

39+
Because Cognitive Search won't allow updates to a live index, you'll need to take the index offline so that the parameters can be added. Indexing and query requests will fail while the index is offline. The duration of the outage is the amount of time it takes to update the index, usually no more than several seconds. When the update is complete, the index comes back automatically. To take the index offline, append the "allowIndexDowntime=true" URI parameter on the request that sets the "similarity" property.
40+
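As a rough illustration of the request described in the paragraph above, the following Python sketch assembles the URL and the "similarity" fragment of the body without sending anything. The function name, the placeholder service and index names, and the `k1`/`b` values are assumptions made for this example, not part of the commit; a real call would also need the rest of the index definition in the body and an `api-key` header.

```python
import json
from urllib.parse import urlencode


def build_bm25_update_request(service_name, index_name, k1=1.2, b=0.75,
                              api_version="2020-06-30"):
    """Return (url, json_body) for an index update that sets BM25 similarity.

    Hypothetical helper: it only builds the request pieces. The k1 and b
    defaults are illustrative values chosen for this sketch.
    """
    # allowIndexDowntime=true takes the index offline for the duration of the update.
    query = urlencode({"api-version": api_version, "allowIndexDowntime": "true"})
    url = f"https://{service_name}.search.windows.net/indexes/{index_name}?{query}"

    # Only the "similarity" fragment is shown; a full request body would
    # also contain the index name and field definitions.
    body = {
        "similarity": {
            "@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
            "k1": k1,
            "b": b,
        }
    }
    return url, json.dumps(body, indent=2)


if __name__ == "__main__":
    url, body = build_bm25_update_request("my-service", "my-index")
    print("PUT", url)
    print(body)
```

Building the query string with `urlencode` keeps the `allowIndexDowntime` flag from being dropped or mistyped when the URL is assembled by hand.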
4341
### BM25 property reference
4442

4543
| Property | Type | Description |
@@ -49,7 +47,7 @@ PUT https://[search service name].search.windows.net/indexes/[index name]?api-ve
4947

5048
## Enable BM25 scoring on older services
5149

52-
If you are running a search service that was created from March 2014 through July 15, 2020, you can enable BM25 by setting a "similarity" property on new indexes. The property is only exposed on new indexes, so if want BM25 on an existing index, you must drop and [rebuild the index](search-howto-reindex.md) with a "similarity" property set to "Microsoft.Azure.Search.BM25Similarity".
50+
If you're running a search service that was created from March 2014 through July 15, 2020, you can enable BM25 by setting a "similarity" property on new indexes. The property is only exposed on new indexes, so if want BM25 on an existing index, you must drop and [rebuild the index](search-howto-reindex.md) with a "similarity" property set to "Microsoft.Azure.Search.BM25Similarity".
5351

5452
Once an index exists with a "similarity" property, you can switch between `BM25Similarity` or `ClassicSimilarity`.
5553

@@ -64,10 +62,10 @@ The following links describe the Similarity property in the Azure SDKs.
6462

6563
### REST example
6664

67-
You can also use the [REST API](/rest/api/searchservice/create-index), as the following example illustrates:
65+
You can also use the [REST API](/rest/api/searchservice/create-index). The following example creates a new index with the "similarity" property set to BM25:
6866

6967
```http
70-
PUT https://[search service name].search.windows.net/indexes/[index name]?api-version=2020-06-30
68+
PUT [service-name].search.windows.net/indexes/[index name]?api-version=2020-06-30
7169
{
7270
"name": "indexName",
7371
"fields": [

articles/search/index-similarity-and-scoring.md

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ ms.date: 06/22/2022
1212

1313
# Similarity and scoring in Azure Cognitive Search
1414

15-
This article describes relevance scoring and the similarity ranking algorithms used to rank search results in Azure Cognitive Search. A relevance score applies to matches returned in a [full text search query](search-lucene-query-architecture.md). Filter queries, autocomplete and suggested queries, wildcard search or fuzzy search queries are not scored or ranked.
15+
This article describes relevance scoring and the similarity ranking algorithms used to compute search scores in Azure Cognitive Search. A relevance score applies to matches returned in [full text search](search-lucene-query-architecture.md), where the most relevant matches appear first. Filter queries, autocomplete and suggested queries, wildcard search or fuzzy search queries are not scored or ranked for relevance.
1616

1717
In Azure Cognitive Search, you can tune search relevance and boost search scores through these mechanisms:
1818

@@ -21,15 +21,16 @@ In Azure Cognitive Search, you can tune search relevance and boost search scores
2121
+ Scoring profiles
2222
+ Custom scoring logic enabled through the *featuresMode* parameter
2323

24-
## Relevance scoring
24+
> [!NOTE]
25+
> Matches are scored and ranked from high to low. The score is returned as "@search.score". By default, the top 50 are returned in the response, but you can use the **$top** parameter to return a smaller or larger number of items (up to 1000 in a single response), and **$skip** to get the next set of results.
2526
26-
Relevance scoring refers to the computation of a search score for every item returned in search results for full text search queries. The score is an indicator of an item's relevance in the context of the current query. The higher the score, the more relevant the item.
27+
## Relevance scoring
2728

28-
In search results, items are rank ordered from high to low, based on the search scores calculated for each item. The score is returned in the response as "@search.score" on every document. By default, the top 50 are returned in the response, but you can use the **$top** parameter to return a smaller or larger number of items (up to 1000 in a single response), and **$skip** to get the next set of results.
29+
Relevance scoring refers to the computation of a search score that serves as an indicator of an item's relevance in the context of the current query. The higher the score, the more relevant the item.
2930

30-
The search score is computed based on statistical properties of the data and the query. Azure Cognitive Search finds documents that match on search terms (some or all, depending on [searchMode](/rest/api/searchservice/search-documents#query-parameters)), favoring documents that contain many instances of the search term. The search score goes up even higher if the term is rare across the data index, but common within the document. The basis for this approach to computing relevance is known as *TF-IDF or* term frequency-inverse document frequency.
31+
The search score is computed based on statistical properties of the string input and the query itself. Azure Cognitive Search finds documents that match on search terms (some or all, depending on [searchMode](/rest/api/searchservice/search-documents#query-parameters)), favoring documents that contain many instances of the search term. The search score goes up even higher if the term is rare across the data index, but common within the document. The basis for this approach to computing relevance is known as *TF-IDF or* term frequency-inverse document frequency.
3132

32-
Search score values can be repeated throughout a result set. When multiple hits have the same search score, the ordering of the same scored items is not defined, and is not stable. Run the query again, and you might see items shift position, especially if you are using the free service or a billable service with multiple replicas. Given two items with an identical score, there is no guarantee which one appears first.
33+
Search scores can be repeated throughout a result set. When multiple hits have the same search score, the ordering of the same scored items is undefined and not stable. Run the query again, and you might see items shift position, especially if you are using the free service or a billable service with multiple replicas. Given two items with an identical score, there is no guarantee which one appears first.
3334

3435
If you want to break the tie among repeating scores, you can add an **$orderby** clause to first order by score, then order by another sortable field (for example, `$orderby=search.score() desc,Rating desc`). For more information, see [$orderby](search-query-odata-orderby.md).
3536
