Commit 0a69b20

Merge pull request #214595 from HeidiSteen/heidist-fresh
[azure search] relevance doc refresh
2 parents 6c17a10 + 21e0570 commit 0a69b20

5 files changed: +130 -111 lines changed

articles/search/TOC.yml

Lines changed: 4 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -143,6 +143,8 @@
143143
href: search-query-overview.md
144144
- name: Relevance scoring
145145
href: index-similarity-and-scoring.md
146+
- name: Semantic ranking
147+
href: semantic-ranking.md
146148
- name: Indexing
147149
items:
148150
- name: Search indexes
@@ -456,11 +458,9 @@
456458
href: search-normalizers.md
457459
- name: Relevance
458460
items:
459-
- name: Similarity ranking
461+
- name: Configure scoring
460462
href: index-ranking-similarity.md
461-
- name: Semantic ranking
462-
href: semantic-ranking.md
463-
- name: Scoring profiles
463+
- name: Add a scoring profile
464464
href: index-add-scoring-profiles.md
465465
- name: Performance and monitoring
466466
items:

articles/search/index-add-scoring-profiles.md

Lines changed: 89 additions & 86 deletions
@@ -1,29 +1,33 @@
11
---
2-
title: Add scoring profiles to boost search scores
2+
title: Add scoring profiles
33
titleSuffix: Azure Cognitive Search
44
description: Boost search relevance scores for Azure Cognitive Search results by adding scoring profiles to a search index.
55

66
manager: nitinme
77
author: shmed
88
ms.author: ramero
99
ms.service: cognitive-search
10-
ms.topic: conceptual
11-
ms.date: 06/24/2022
10+
ms.topic: how-to
11+
ms.date: 10/14/2022
1212
---
1313

14-
# Add scoring profiles to a search index
14+
# Add scoring profiles to boost search scores
1515

16-
For full text search queries, the search engine computes a search score for each matching document, which allows results to be ranked from high to low. Azure Cognitive Search uses a default scoring algorithm to compute an initial score, but you can customize the calculation through a *scoring profile*.
16+
In this article, you'll learn how to define a scoring profile for boosting search scores based on criteria.
1717

18-
Scoring profiles are embedded in index definitions and include properties for boosting the score of matches, where additional criteria found in the profile provides the boosting logic. For example, you might want to boost matches based on their revenue potential, promote newer items, or perhaps boost items that have been in inventory too long.
18+
Criteria can be a weighted field, such as when a match found in a "tags" field is more relevant than a match found in "descriptions". Criteria can also be a function, such as the `distance` function that favors results that are within a specified distance of the current location.
1919

20-
Unfamiliar with relevance concepts? The following video segment fast-forwards to how scoring profiles work in Azure Cognitive Search, but the video also covers basic concepts. You might also want to review [Relevance and scoring in Azure Cognitive Search](index-similarity-and-scoring.md) for more background.
20+
Scoring profiles are defined in a search index and invoked on query requests. You can create multiple profiles and then modify query logic to choose which one is used.
2121

22-
> [!VIDEO https://www.youtube.com/embed/Y_X6USgvB1g?version=3&start=463&end=970]
22+
> [!NOTE]
23+
> Unfamiliar with relevance concepts? The following video segment fast-forwards to how scoring profiles work in Azure Cognitive Search. You can also visit [Relevance and scoring in Azure Cognitive Search](index-similarity-and-scoring.md) for more background.
24+
>
25+
> > [!VIDEO https://www.youtube.com/embed/Y_X6USgvB1g?version=3&start=463&end=970]
26+
>
2327
24-
## What is a scoring profile?
28+
## Scoring profile definition
2529

26-
A scoring profile is part of the index definition and is composed of weighted fields, functions, and parameters. The purpose of a scoring profile is to boost or amplify matching documents based on criteria you provide.
30+
A scoring profile is part of the index definition and is composed of weighted fields, functions, and parameters.
2731

2832
The following definition shows a simple profile named 'geo'. This example boosts results that have the search term in the hotelName field. It also uses the `distance` function to favor results that are within 10 kilometers of the current location. If someone searches on the term 'inn', and 'inn' happens to be part of the hotel name, documents that include hotels with 'inn' within a 10 KM radius of the current location will appear higher in the search results.
2933

@@ -52,7 +56,7 @@ The following definition shows a simple profile named 'geo'. This example boosts
5256
]
5357
```
5458

55-
To use this scoring profile, your query is formulated to specify scoringProfile parameter in the request.
59+
Parameters are specified on invocation. To use this scoring profile, your query is formulated to specify scoringProfile parameter in the request.
5660

5761
```http
5862
POST /indexes/hotels/docs/search?api-version=2020-06-30
@@ -73,88 +77,14 @@ See the [Extended example](#bkmk_ex) to review a more detailed example of a scor
7377

7478
Scores are computed for full text search queries for the purpose of ranking the most relevant matches and returning them at the top of the response. The overall score for each document is an aggregation of the individual scores for each field, where the individual score of each field is computed based on the term frequency and document frequency of the searched terms within that field (known as [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) or term frequency-inverse document frequency).
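
The aggregation described above can be illustrated with a minimal Python sketch. It uses a textbook TF-IDF formula with smoothed IDF, which is an assumption for illustration and not the service's actual implementation. The function names, the sample document, the document-frequency counts, and the idea of applying a per-field weight as a simple multiplier are all hypothetical.

```python
import math
from collections import Counter


def tf_idf_field_score(query_terms, field_tokens, doc_freq, num_docs):
    """Sum a textbook TF-IDF score over the query terms for one field."""
    counts = Counter(field_tokens)
    score = 0.0
    for term in query_terms:
        tf = counts[term]
        if tf == 0:
            continue  # term does not occur in this field
        # Smoothed IDF: terms found in fewer documents contribute more.
        idf = math.log((1 + num_docs) / (1 + doc_freq.get(term, 0))) + 1
        score += tf * idf
    return score


def document_score(query_terms, doc_fields, doc_freq, num_docs, weights=None):
    """Aggregate per-field scores; fields without a weight default to 1.0."""
    weights = weights or {}
    return sum(
        weights.get(name, 1.0)
        * tf_idf_field_score(query_terms, tokens, doc_freq, num_docs)
        for name, tokens in doc_fields.items()
    )


# Hypothetical document and corpus statistics.
doc = {
    "hotelName": ["harbor", "inn"],
    "description": ["a", "quiet", "inn", "near", "the", "harbor"],
}
doc_freq = {"inn": 3, "harbor": 1}

base = document_score(["inn"], doc, doc_freq, num_docs=10)
boosted = document_score(["inn"], doc, doc_freq, num_docs=10,
                         weights={"hotelName": 2.0})
print(boosted > base)
```

With the hotelName weight doubled, the same match contributes more to the document total, which is the effect a weighted field in a scoring profile is meant to have.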
7579

76-
> [!TIP]
77-
> You can use the [featuresMode](index-similarity-and-scoring.md#featuresmode-parameter-preview) parameter to request additional scoring details with the search results (including the field level scores).
80+
You can use the [featuresMode (preview)](index-similarity-and-scoring.md#featuresmode-parameter-preview) parameter to request additional scoring details with the search results (including the field level scores).
7881

7982
## When to add scoring logic
8083

8184
You should create one or more scoring profiles when the default ranking behavior doesn’t go far enough in meeting your business objectives. For example, you might decide that search relevance should favor newly added items. Likewise, you might have a field that contains profit margin, or some other field indicating revenue potential. Boosting results that are more meaningful to your users or the business is often the deciding factor in adoption of scoring profiles.
8285

8386
Relevancy-based ordering in a search page is also implemented through scoring profiles. Consider search results pages you’ve used in the past that let you sort by price, date, rating, or relevance. In Azure Cognitive Search, scoring profiles can be used to drive the ‘relevance’ option. The definition of relevance is user-defined, predicated on business objectives and the type of search experience you want to deliver.
8487

85-
<a name="bkmk_ex"></a>
86-
87-
## Extended example
88-
89-
The following example shows the schema of an index with two scoring profiles (`boostGenre`, `newAndHighlyRated`). Any query against this index that includes either profile as a query parameter will use the profile to score the result set.
90-
91-
The `boostGenre` profile uses weighted text fields, boosting matches found in albumTitle, genre, and artistName fields. The fields are boosted 1.5, 5, and 2 respectively. Why is genre boosted so much higher than the others? If search is conducted over data that is somewhat homogenous (as is the case with 'genre' in the musicstoreindex), you might need a larger variance in the relative weights. For example, in the musicstoreindex, 'rock' appears as both a genre and in identically phrased genre descriptions. If you want genre to outweigh genre description, the genre field will need a much higher relative weight.
92-
93-
```json
94-
{
95-
"name": "musicstoreindex",
96-
"fields": [
97-
{ "name": "key", "type": "Edm.String", "key": true },
98-
{ "name": "albumTitle", "type": "Edm.String" },
99-
{ "name": "albumUrl", "type": "Edm.String", "filterable": false },
100-
{ "name": "genre", "type": "Edm.String" },
101-
{ "name": "genreDescription", "type": "Edm.String", "filterable": false },
102-
{ "name": "artistName", "type": "Edm.String" },
103-
{ "name": "orderableOnline", "type": "Edm.Boolean" },
104-
{ "name": "rating", "type": "Edm.Int32" },
105-
{ "name": "tags", "type": "Collection(Edm.String)" },
106-
{ "name": "price", "type": "Edm.Double", "filterable": false },
107-
{ "name": "margin", "type": "Edm.Int32", "retrievable": false },
108-
{ "name": "inventory", "type": "Edm.Int32" },
109-
{ "name": "lastUpdated", "type": "Edm.DateTimeOffset" }
110-
],
111-
"scoringProfiles": [
112-
{
113-
"name": "boostGenre",
114-
"text": {
115-
"weights": {
116-
"albumTitle": 1.5,
117-
"genre": 5,
118-
"artistName": 2
119-
}
120-
}
121-
},
122-
{
123-
"name": "newAndHighlyRated",
124-
"functions": [
125-
{
126-
"type": "freshness",
127-
"fieldName": "lastUpdated",
128-
"boost": 10,
129-
"interpolation": "quadratic",
130-
"freshness": {
131-
"boostingDuration": "P365D"
132-
}
133-
},
134-
{
135-
"type": "magnitude",
136-
"fieldName": "rating",
137-
"boost": 10,
138-
"interpolation": "linear",
139-
"magnitude": {
140-
"boostingRangeStart": 1,
141-
"boostingRangeEnd": 5,
142-
"constantBoostBeyondRange": false
143-
}
144-
}
145-
]
146-
}
147-
],
148-
"suggesters": [
149-
{
150-
"name": "sg",
151-
"searchMode": "analyzingInfixMatching",
152-
"sourceFields": [ "albumTitle", "artistName" ]
153-
}
154-
]
155-
}
156-
```
157-
15888
## Steps for adding a scoring profile
15989

16090
To implement custom scoring behavior, add a scoring profile to the schema that defines the index. You can have up to 100 scoring profiles within an index (see [Service Limits](search-limits-quotas-capacity.md)), but you can only specify one profile at a time in any given query.
@@ -330,6 +260,79 @@ The following table provides several examples.
330260

331261
For more examples, see [XML Schema: Datatypes (W3.org web site)](https://www.w3.org/TR/xmlschema11-2/#dayTimeDuration).
332262

263+
<a name="bkmk_ex"></a>
264+
265+
## Extended example
266+
267+
The following example shows the schema of an index with two scoring profiles (`boostGenre`, `newAndHighlyRated`). Any query against this index that includes either profile as a query parameter will use the profile to score the result set.
268+
269+
The `boostGenre` profile uses weighted text fields, boosting matches found in albumTitle, genre, and artistName fields. The fields are boosted 1.5, 5, and 2 respectively. Why is genre boosted so much higher than the others? If search is conducted over data that is somewhat homogenous (as is the case with 'genre' in the musicstoreindex), you might need a larger variance in the relative weights. For example, in the musicstoreindex, 'rock' appears as both a genre and in identically phrased genre descriptions. If you want genre to outweigh genre description, the genre field will need a much higher relative weight.
270+
271+
```json
272+
{
273+
"name": "musicstoreindex",
274+
"fields": [
275+
{ "name": "key", "type": "Edm.String", "key": true },
276+
{ "name": "albumTitle", "type": "Edm.String" },
277+
{ "name": "albumUrl", "type": "Edm.String", "filterable": false },
278+
{ "name": "genre", "type": "Edm.String" },
279+
{ "name": "genreDescription", "type": "Edm.String", "filterable": false },
280+
{ "name": "artistName", "type": "Edm.String" },
281+
{ "name": "orderableOnline", "type": "Edm.Boolean" },
282+
{ "name": "rating", "type": "Edm.Int32" },
283+
{ "name": "tags", "type": "Collection(Edm.String)" },
284+
{ "name": "price", "type": "Edm.Double", "filterable": false },
285+
{ "name": "margin", "type": "Edm.Int32", "retrievable": false },
286+
{ "name": "inventory", "type": "Edm.Int32" },
287+
{ "name": "lastUpdated", "type": "Edm.DateTimeOffset" }
288+
],
289+
"scoringProfiles": [
290+
{
291+
"name": "boostGenre",
292+
"text": {
293+
"weights": {
294+
"albumTitle": 1.5,
295+
"genre": 5,
296+
"artistName": 2
297+
}
298+
}
299+
},
300+
{
301+
"name": "newAndHighlyRated",
302+
"functions": [
303+
{
304+
"type": "freshness",
305+
"fieldName": "lastUpdated",
306+
"boost": 10,
307+
"interpolation": "quadratic",
308+
"freshness": {
309+
"boostingDuration": "P365D"
310+
}
311+
},
312+
{
313+
"type": "magnitude",
314+
"fieldName": "rating",
315+
"boost": 10,
316+
"interpolation": "linear",
317+
"magnitude": {
318+
"boostingRangeStart": 1,
319+
"boostingRangeEnd": 5,
320+
"constantBoostBeyondRange": false
321+
}
322+
}
323+
]
324+
}
325+
],
326+
"suggesters": [
327+
{
328+
"name": "sg",
329+
"searchMode": "analyzingInfixMatching",
330+
"sourceFields": [ "albumTitle", "artistName" ]
331+
}
332+
]
333+
}
334+
```
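
A query that invokes one of the profiles in the extended example might be assembled as in the following Python sketch. The helper name `build_profile_query` and the trimmed-down index dictionary are hypothetical. Only the `search` and `scoringProfile` request properties are taken from the query shape this article describes, and the sketch builds the request body without sending it.

```python
import json


def build_profile_query(index_definition, search_text, profile_name):
    """Return a search request body that invokes exactly one scoring profile."""
    known = {p["name"] for p in index_definition.get("scoringProfiles", [])}
    if profile_name not in known:
        raise ValueError(f"Index has no scoring profile named {profile_name!r}")
    return {"search": search_text, "scoringProfile": profile_name}


# Trimmed-down stand-in for the index above: only the profile names matter here.
index_definition = {
    "name": "musicstoreindex",
    "scoringProfiles": [{"name": "boostGenre"}, {"name": "newAndHighlyRated"}],
}

body = build_profile_query(index_definition, "rock", "boostGenre")
print(json.dumps(body))  # {"search": "rock", "scoringProfile": "boostGenre"}
```

Checking the profile name against the index definition before building the body is a design choice in this sketch, not a service requirement. It surfaces a misspelled profile name on the client before any request is made.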
335+
333336
## See also
334337

335338
+ [Relevance and scoring in Azure Cognitive Search](index-similarity-and-scoring.md)

articles/search/index-ranking-similarity.md

Lines changed: 31 additions & 17 deletions
@@ -1,44 +1,58 @@
11
---
2-
title: Configure scoring algorithm
2+
title: Configure relevance scoring
33
titleSuffix: Azure Cognitive Search
44
description: Enable Okapi BM25 ranking to upgrade the search ranking and relevance behavior on older Azure Search services.
55

66
author: HeidiSteen
77
ms.author: heidist
88
ms.service: cognitive-search
99
ms.topic: how-to
10-
ms.date: 06/22/2022
10+
ms.date: 10/14/2022
1111
---
1212

13-
# Configure the scoring algorithm in Azure Cognitive Search
13+
# Configure relevance scoring
1414

15-
Depending on the age of your search service, Azure Cognitive Search supports two [scoring algorithms](index-similarity-and-scoring.md) for assigning relevance to results in a full text search query:
15+
In this article, you'll learn how to configure the similarity scoring algorithm used by Azure Cognitive Search. The BM25 scoring model has defaults for weighting term frequency and document length. You can customize these properties if the defaults aren't suited to your content.
16+
17+
Configuration changes are scoped to individual indexes, which means you can adjust relevance scoring based on the characteristics of each index.
18+
19+
## Default scoring algorithm
20+
21+
Depending on the age of your search service, Azure Cognitive Search supports two [similarity scoring algorithms](index-similarity-and-scoring.md) for assigning relevance to results in a full text search query:
1622

1723
+ An *Okapi BM25* algorithm, used in all search services created after July 15, 2020
1824
+ A *classic similarity* algorithm, used by all search services created before July 15, 2020
1925

20-
BM25 ranking is the default because it tends to produce search rankings that align better with user expectations. It includes [parameters](#set-bm25-parameters) for tuning results based on factors such as document size. For search services created after July 2020, BM25 is the sole scoring algorithm. If you try to set "similarity" to ClassicSimilarity on a new service, an HTTP 400 error will be returned because that algorithm is not supported by the service.
26+
BM25 ranking is the default because it tends to produce search rankings that align better with user expectations. It includes [parameters](#set-bm25-parameters) for tuning results based on factors such as document size. For search services created after July 2020, BM25 is the only scoring algorithm. If you try to set "similarity" to ClassicSimilarity on a new service, an HTTP 400 error will be returned because that algorithm is not supported by the service.
2127

2228
For older services, classic similarity remains the default algorithm. Older services can [upgrade to BM25](#enable-bm25-scoring-on-older-services) on a per-index basis. When switching from classic to BM25, you can expect to see some differences in how search results are ordered.
2329

2430
## Set BM25 parameters
2531

26-
BM25 similarity adds two parameters to control the relevance score calculation. To set "similarity" parameters, issue a [Create or Update Index](/rest/api/searchservice/create-index) request as illustrated by the following example.
32+
BM25 similarity adds two parameters to control the relevance score calculation.
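
To show how the two parameters shape a score, here is a minimal Python sketch of the textbook Okapi BM25 term formula. It's an assumption that this form approximates what the service computes; the function name, IDF value, and document lengths are hypothetical and chosen only for illustration.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    """Textbook BM25 contribution of a single term to a document's score.

    k1 controls how quickly repeated occurrences of a term stop adding score.
    b controls how strongly document length is normalized (0 turns it off).
    """
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)


# Same term frequency, short versus long document, default parameters.
short_doc = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, idf=1.0)
long_doc = bm25_term_score(tf=2, doc_len=500, avg_doc_len=100, idf=1.0)
print(short_doc > long_doc)

# With b=0, document length no longer affects the score.
no_norm_short = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, idf=1.0, b=0)
no_norm_long = bm25_term_score(tf=2, doc_len=500, avg_doc_len=100, idf=1.0, b=0)
print(math.isclose(no_norm_short, no_norm_long))
```

In this form, a larger "b" penalizes long documents more heavily, and a larger "k1" lets additional occurrences of a term keep raising the score for longer before it levels off.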
2733

28-
```http
29-
PUT [service-name].search.windows.net/indexes/[index-name]?api-version=2020-06-30&allowIndexDowntime=true
30-
{
31-
"similarity": {
32-
"@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
33-
"b" : 0.5,
34-
"k1" : 1.3
34+
1. Formulate a [Create or Update Index](/rest/api/searchservice/create-index) request as illustrated by the following example.
35+
36+
```http
37+
PUT [service-name].search.windows.net/indexes/[index-name]?api-version=2020-06-30&allowIndexDowntime=true
38+
{
39+
"similarity": {
40+
"@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
41+
"b" : 0.75,
42+
"k1" : 1.2
43+
}
3544
}
36-
}
37-
```
45+
```
46+
47+
1. Set "b" and "k1" to custom values. See the property descriptions in the next section for details.
48+
49+
1. If the index is live, append the "allowIndexDowntime=true" URI parameter on the request.
50+
51+
Because Cognitive Search won't allow updates to a live index, you'll need to take the index offline so that the parameters can be added. Indexing and query requests will fail while the index is offline. The duration of the outage is the amount of time it takes to update the index, usually no more than several seconds. When the update is complete, the index comes back automatically.
3852
39-
Because Cognitive Search won't allow updates to a live index, you'll need to take the index offline so that the parameters can be added. Indexing and query requests will fail while the index is offline. The duration of the outage is the amount of time it takes to update the index, usually no more than several seconds. When the update is complete, the index comes back automatically. To take the index offline, append the "allowIndexDowntime=true" URI parameter on the request that sets the "similarity" property.
53+
1. Send the request.
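
The steps above can be sketched in Python as follows. The helper name `build_similarity_update`, the sample service and index names, and the `https://` scheme are assumptions. The sketch also assumes the request body carries the existing index definition with "similarity" added, and it only builds the URL and body rather than sending them.

```python
import json


def build_similarity_update(service_name, index_name, index_definition,
                            b=0.75, k1=1.2, api_version="2020-06-30",
                            index_is_live=True):
    """Build the URL and JSON body for a request that sets BM25 parameters."""
    url = (f"https://{service_name}.search.windows.net"
           f"/indexes/{index_name}?api-version={api_version}")
    if index_is_live:
        # A live index must be taken offline briefly for the update.
        url += "&allowIndexDowntime=true"

    body = dict(index_definition)  # shallow copy; the caller's dict is untouched
    body["similarity"] = {
        "@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
        "b": b,
        "k1": k1,
    }
    return url, json.dumps(body)


# Hypothetical service, index, and custom parameter values.
url, payload = build_similarity_update(
    "my-service", "hotels", {"name": "hotels", "fields": []}, b=0.5, k1=1.3
)
print(url)
```

The returned URL and payload can then be sent as a PUT request with whatever HTTP client and authentication your environment uses.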
4054
41-
### BM25 property reference
55+
### BM25 property descriptions
4256
4357
| Property | Type | Description |
4458
|----------|------|-------------|

articles/search/index-similarity-and-scoring.md

Lines changed: 5 additions & 3 deletions
@@ -7,12 +7,14 @@ author: HeidiSteen
77
ms.author: heidist
88
ms.service: cognitive-search
99
ms.topic: conceptual
10-
ms.date: 06/22/2022
10+
ms.date: 10/14/2022
1111
---
1212

1313
# Relevance and scoring in Azure Cognitive Search
1414

15-
This article describes relevance and the scoring algorithms used to compute search scores in Azure Cognitive Search. A relevance score applies to matches returned in [full text search](search-lucene-query-architecture.md), where the most relevant matches appear first. Filter queries, autocomplete and suggested queries, wildcard search or fuzzy search queries are not scored or ranked for relevance.
15+
This article explains the relevance and the scoring algorithms used to compute search scores in Azure Cognitive Search. A relevance score is computed for each match found in a [full text search](search-lucene-query-architecture.md), where the strongest matches are assigned higher search scores.
16+
17+
Relevance applies to full text search only. Filter queries, autocomplete and suggested queries, wildcard search or fuzzy search queries are not scored or ranked for relevance.
1618

1719
In Azure Cognitive Search, you can tune search relevance and boost search scores through these mechanisms:
1820

@@ -87,7 +89,7 @@ As long as the same `sessionId` is used, a best-effort attempt will be made to t
8789
8890
## Scoring profiles
8991

90-
You can customize the way different fields are ranked by defining a *scoring profile*. Scoring profiles give you greater control over the ranking of items in search results. For example, you might want to boost items based on their revenue potential, promote newer items, or perhaps boost items that have been in inventory too long.
92+
You can customize the way different fields are ranked by defining a *scoring profile*. Scoring profiles provide criteria for boosting the search score of a match based on content characteristics. For example, you might want to boost matches based on their revenue potential, promote newer items, or perhaps boost items that have been in inventory too long.
9193

9294
A scoring profile is part of the index definition, composed of weighted fields, functions, and parameters. For more information about defining one, see [Scoring Profiles](index-add-scoring-profiles.md).
9395
