
Commit d91662d

Merge pull request #6188 from HeidiSteen/heidist-freshness
[azure search] scoring profile updates
2 parents 994ef4f + 44194f2 commit d91662d

12 files changed

+240
-154
lines changed

articles/search/hybrid-search-ranking.md

Lines changed: 1 addition & 33 deletions
Original file line number | Diff line number | Diff line change
@@ -49,7 +49,7 @@ RRF is used anytime there's more than one query execution. The following example
4949

5050
Whenever results are ranked, **`@search.score`** property contains the value used to order the results. Scores are generated by ranking algorithms that vary for each method. Each algorithm has its own range and magnitude.
5151

52-
The following chart identifies the scoring property returned on each match, algorithm, and range of scores for each relevance ranking algorithm.
52+
The following chart identifies the scoring property returned on each match, algorithm, and range of scores for each relevance ranking algorithm. For more information and a diagram of the scoring workflow, see [Relevance in Azure AI Search](search-relevance-overview.md).
5353

5454
| Search method | Parameter | Scoring algorithm | Range |
5555
|---------------|-----------|-------------------|-------|
@@ -141,38 +141,6 @@ By default, full text search is subject to a maximum limit of 1,000 matches (see
141141

142142
For more information, see [How to work with search results](search-pagination-page-layout.md).
143143

144-
## Diagram of a search scoring workflow
145-
146-
The following diagram illustrates a hybrid query that invokes keyword and vector search, with [boosting through scoring profiles](index-add-scoring-profiles.md#how-search-scoring-works-in-azure-ai-search), and semantic ranking.
147-
148-
:::image type="content" source="media/scoring-profiles/scoring-over-ranked-results.png" alt-text="Diagram of prefilters." border="true" lightbox="media/scoring-profiles/scoring-over-ranked-results.png":::
149-
150-
A query that generates the previous workflow might look like this:
151-
152-
```http
153-
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-07-01
154-
Content-Type: application/json
155-
api-key: {{admin-api-key}}
156-
{
157-
"queryType":"semantic",
158-
"search":"hello world",
159-
"searchFields":"field_a, field_b",
160-
"vectorQueries": [
161-
{
162-
"kind":"vector",
163-
"vector": [1.0, 2.0, 3.0],
164-
"fields": "field_c, field_d"
165-
},
166-
{
167-
"kind":"vector",
168-
"vector": [4.0, 5.0, 6.0],
169-
"fields": "field_d, field_e"
170-
}
171-
],
172-
"scoringProfile":"my_scoring_profile"
173-
}
174-
```
175-
176144
## See also
177145

178146
+ [Learn more about hybrid search](hybrid-search-overview.md)

articles/search/index-add-scoring-profiles.md

Lines changed: 65 additions & 88 deletions
Large diffs are not rendered by default.
151 KB

articles/search/query-lucene-syntax.md

Lines changed: 1 addition & 1 deletion
@@ -130,7 +130,7 @@ Proximity searches are used to find terms that are near each other in a document
130130

131131
Term boosting refers to ranking a document higher if it contains the boosted term, relative to documents that don't contain the term. This differs from scoring profiles in that scoring profiles boost certain fields, rather than specific terms.
132132

133-
The following example helps illustrate the differences. Suppose that there's a scoring profile that boosts matches in a certain field, say *genre* in the [musicstoreindex example](index-add-scoring-profiles.md#extended-example-for-keyword-search). Term boosting could be used to further boost certain search terms higher than others. For example, `rock^2 electronic` boosts documents that contain the search terms in the genre field higher than other searchable fields in the index. Further, documents that contain the search term *rock* are ranked higher than the other search term *electronic* as a result of the term boost value (2).
133+
The following example helps illustrate the differences. Suppose that there's a scoring profile that boosts matches in a certain field, say *genre* in the [musicstoreindex example](index-add-scoring-profiles.md#example-of-a-scoring-profile). Term boosting could be used to further boost certain search terms higher than others. For example, `rock^2 electronic` boosts documents that contain the search terms in the genre field higher than other searchable fields in the index. Further, documents that contain the search term *rock* are ranked higher than the other search term *electronic* as a result of the term boost value (2).
134134

135135
To boost a term, use the caret, `^`, symbol with a boost factor (a number) at the end of the term you're searching. You can also boost phrases. The higher the boost factor, the more relevant the term is relative to other search terms. By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (for example, 0.20).
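The boosting rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of any SDK: the helper name `boost` is hypothetical, and the request body assumes that `queryType` is set to `full` so that the Lucene syntax is parsed.

```python
import json


def boost(term: str, factor: float = 1.0) -> str:
    """Append a Lucene boost factor to a term or phrase (hypothetical helper)."""
    if factor <= 0:
        # The boost factor must be positive, although it can be less than 1.
        raise ValueError("boost factor must be positive")
    if " " in term:
        # Phrases are quoted so that the boost applies to the whole phrase.
        term = f'"{term}"'
    return f"{term}^{factor:g}"


# Rank documents that contain "rock" higher than those that contain "electronic".
search_text = f"{boost('rock', 2)} electronic"

# Assumed request body; "full" selects the full Lucene query parser.
request_body = {"queryType": "full", "search": search_text}
print(json.dumps(request_body))
```

A factor below 1, such as `boost("electronic", 0.2)`, lowers a term's weight relative to the default of 1 without excluding it.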
136136

articles/search/search-agentic-retrieval-how-to-create.md

Lines changed: 1 addition & 1 deletion
@@ -196,7 +196,7 @@ PUT https://{{search-url}}/agents/{{agent-name}}?api-version=2025-05-01-preview
196196
}
197197
```
198198

199-
+ `defaultRerankerThreshold` is the minimum semantic reranker score that's acceptable for inclusion in a response. [Reranker scores](semantic-search-overview.md#how-ranking-is-scored) range from 1 to 4. Plan on revising this value based on testing and what works for your content.
199+
+ `defaultRerankerThreshold` is the minimum semantic reranker score that's acceptable for inclusion in a response. [Reranker scores](semantic-search-overview.md#how-results-are-scored) range from 1 to 4. Plan on revising this value based on testing and what works for your content.
200200

201201
+ `defaultIncludeReferenceSourceData` is a boolean that determines whether the reference portion of the response includes source data. We recommend starting with this value set to true if you want to shape your own response using output from the search engine. Otherwise, if you want to use the output in the response `content` string, you can set it to false.
202202

articles/search/search-agentic-retrieval-how-to-retrieve.md

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ POST https://{{search-url}}/agents/{{agent-name}}/retrieve?api-version=2025-05-0
103103

104104
+ `rerankerThreshold` and `maxDocsForReranker` are also initially set in the knowledge agent definition as defaults. You can override them in the retrieve action to configure [semantic reranker](semantic-how-to-configure.md), setting minimum thresholds and the maximum number of inputs sent to the reranker.
105105

106-
`rerankerThreshold` is the minimum semantic reranker score that's acceptable for inclusion in a response. [Reranker scores](semantic-search-overview.md#how-ranking-is-scored) range from 1 to 4. Plan on revising this value based on testing and what works for your content.
106+
`rerankerThreshold` is the minimum semantic reranker score that's acceptable for inclusion in a response. [Reranker scores](semantic-search-overview.md#how-results-are-scored) range from 1 to 4. Plan on revising this value based on testing and what works for your content.
107107

108108
`maxDocsForReranker` dictates the maximum number of documents to consider for the final response string. Semantic reranker accepts 50 documents. If the maximum is 200, four more subqueries are added to the query plan to ensure all 200 documents are semantically ranked. If the number isn't evenly divisible by 50, the query plan rounds up to the nearest whole number.
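The rounding rule can be illustrated with a short Python sketch. It assumes only what the paragraph states: the reranker accepts 50 documents at a time and partial batches round up. The function name is hypothetical, and the way a batch maps to a subquery in the query plan is not modeled here.

```python
import math

# From the text: semantic reranker accepts 50 documents.
RERANKER_BATCH_SIZE = 50


def reranker_batches(max_docs_for_reranker: int) -> int:
    """Return how many 50-document batches are needed, rounding up (hypothetical helper)."""
    if max_docs_for_reranker < 1:
        raise ValueError("maxDocsForReranker must be at least 1")
    return math.ceil(max_docs_for_reranker / RERANKER_BATCH_SIZE)


print(reranker_batches(200))  # 4
print(reranker_batches(120))  # 3, because 120 isn't evenly divisible by 50
```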
109109

articles/search/search-relevance-overview.md

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
1+
---
2+
title: How relevance scoring works
3+
titleSuffix: Azure AI Search
4+
description: Describes how the scoring and ranking algorithms work in Azure AI Search and how to use them together.
5+
6+
manager: nitinme
7+
author: HeidiSteen
8+
ms.author: heidist
9+
ms.service: azure-ai-search
10+
ms.topic: concept-article
11+
ms.date: 07/23/2025
12+
---
13+
14+
# Relevance in Azure AI Search
15+
16+
In a query operation, the relevance of any given result is measured by a ranking algorithm that determines the strength of a match based on how closely it aligns in content or characteristics. An algorithm assigns a score, and results are rank ordered by that score, with the most relevant matches returned in the response.
17+
18+
Ranking occurs whenever the query request includes full text or vector queries. It doesn't occur if the query invokes strict pattern matching, such as a filter-only query or a specialized query form like autocomplete, suggestions, geospatial search, fuzzy search, or regular expression search. A uniform search score of 1.0 indicates the absence of a ranking algorithm.
19+
20+
The query engine in Azure AI Search supports a multi-level approach to ranking search results, where there's a built-in ranking modality for each query type, plus extra ranking capabilities for extended relevance tuning.
21+
22+
## Levels of ranking
23+
24+
This section describes the levels of scoring operations. For an illustration of how they work together, see the [diagram](#diagram-of-ranking-algorithms) in this article. A [comparison of all search score types and ranges](#types-of-search-scores) is also provided in this article.
25+
26+
| Level | Description |
27+
|-------|-------------|
28+
| Level&nbsp;1&nbsp;(L1) | Initial search score (`@search.score`). <br>For text queries matching on tokenized strings, results are always initially ranked using the [BM25 ranking algorithm](index-similarity-and-scoring.md). <br>For vector queries, results are ranked using either [Hierarchical Navigable Small World (HNSW) or exhaustive K-nearest neighbor (KNN)](vector-search-ranking.md). Image search or multimodal searches are based on vector queries and scored using the L1 vector ranking algorithms. |
29+
| Fused&nbsp;L1 | Scoring from multiple queries using the [Reciprocal Rank Fusion (RRF) algorithm](hybrid-search-ranking.md). RRF is used for hybrid queries that include text and vector components. RRF is also used when multiple vector queries execute in parallel. A search score from RRF is reflected in `@search.score` over a different range. |
30+
| Level&nbsp;2&nbsp;(L2) | [Semantic ranking score (`@search.rerankerScore`)](semantic-search-overview.md) applies machine reading comprehension to textual content. Semantic ranking is a premium feature that bills for use of the semantic ranking models. It's optional for text queries and vector queries that contain text, but required for [agentic retrieval (preview)](search-agentic-retrieval-concept.md). Although agentic retrieval sends multiple queries to the query engine, the ranking algorithm for agentic retrieval is the semantic ranker. |
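To make the fused L1 level concrete, the following Python sketch applies the general Reciprocal Rank Fusion formula, where each document receives the sum of `1 / (k + rank)` over every ranked list in which it appears. It's a simplified illustration rather than the service's implementation: the function name, the document IDs, and the constant `k = 60` are assumptions chosen for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs using Reciprocal Rank Fusion.

    `k` is a smoothing constant; 60 is an assumed value for this sketch.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result in each list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical L1 results from a text query and a vector query.
text_ranking = ["doc-a", "doc-b", "doc-c"]
vector_ranking = ["doc-b", "doc-d", "doc-a"]

fused = rrf_fuse([text_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

In this formula, a document ranked first in every list scores `n / (k + 1)` for `n` queries, which is why the upper limit of a fused score depends on the number of queries.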
31+
32+
## Custom boosting logic using scoring profiles
33+
34+
[Scoring profiles](index-add-scoring-profiles.md) are an optional feature for boosting scores based on extra user-defined criteria. Criteria can include weighted fields, or functions that boost by freshness, proximity, magnitude, or range. There's no extra charge for using a scoring profile. To use a scoring profile, you define it in an index and then specify it on a query.
35+
36+
Scoring logic applies to text and numeric nonvector content. You can use scoring profiles with:
37+
38+
+ [Text (keyword) search](search-query-create.md)
39+
+ [Pure vector queries](vector-search-how-to-query.md)
40+
+ [Hybrid queries](hybrid-search-how-to-query.md), with text and vector subqueries that execute in parallel
41+
+ [Semantically ranked queries](semantic-how-to-query-request.md)
42+
43+
For standalone text queries, scoring profiles identify the top 1,000 matches in a [BM25-ranked search](index-similarity-and-scoring.md), with the top 50 matches returned in the response.
44+
45+
For pure vectors, the query is vector-only, but if the [*k*-matching documents](vector-search-ranking.md) include nonvector fields with human-readable content, a scoring profile is applied to nonvector fields in `k` documents.
46+
47+
For the text component of a hybrid query, scoring profiles identify the top 1,000 matches in a BM25-ranked search. However, once those 1,000 results are identified, they're restored to their original BM25 order so that they can be rescored alongside vector results in the final [Reciprocal Rank Fusion (RRF)](hybrid-search-ranking.md) ordering. The scoring profile (identified as "final document boosting adjustment" in the illustration) is then applied to the merged results, along with [vector weighting](vector-search-how-to-query.md#vector-weighting), with [semantic ranking](semantic-search-overview.md) as the last step.
48+
49+
For semantically ranked queries (not shown in the diagram), assuming you use the latest preview REST API or a preview Azure SDK package, scoring profiles can be applied over an L2 ranked result set, generating a new `@search.rerankerBoostedScore` that determines the final ranking.
50+
51+
## Types of search scores
52+
53+
Scored results are indicated for each match in the query response. This table lists all of the search scores with an associated range. Range varies by algorithm.
54+
55+
| Score | Range | Algorithm|
56+
|-------|-------|-------------|
57+
| `@search.score` | 0 through unlimited | [BM25 ranking algorithm](index-similarity-and-scoring.md#scores-in-a-text-results) for text search |
58+
| `@search.score` | 0.333 - 1.00 | [HNSW or exhaustive KNN algorithm](vector-search-ranking.md#scores-in-a-vector-search-results) for vector search |
59+
| `@search.score` | 0 through an upper limit determined by the number of queries | [RRF algorithm](hybrid-search-ranking.md#scores-in-a-hybrid-search-results) |
60+
| `@search.rerankerScore` | 0.00 - 4.00 | [Semantic ranking algorithm](semantic-search-overview.md#how-results-are-scored) for L2 ranking |
61+
| `@search.rerankerBoostedScore` | 0.00 - 4.00 | Semantic ranking algorithm for L2 ranking and custom boosting through a scoring profile |
62+
63+
## Diagram of ranking algorithms
64+
65+
The following diagram illustrates how the ranking algorithms work together.
66+
67+
:::image type="content" source="media/scoring-profiles/scoring-over-ranked-results.png" alt-text="Diagram showing which fields have a scoring profile and when ranking occurs.":::
68+
69+
> [!NOTE]
70+
> This workflow diagram currently omits `@search.rerankerBoostedScore` and a step for semantic ranking with boosting from a scoring profile. If you use semantic ranking with a scoring profile, the scoring profile is applied after L2 ranking, and the final score is based on `@search.rerankerBoostedScore`.
71+
72+
## Example query inclusive of all ranking algorithms
73+
74+
A query that generates the previous workflow might look like the following example. This hybrid semantic query is scored using RRF (based on L1 scores for text and vectors), and semantic ranking.
75+
76+
```http
77+
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2025-05-01-preview
78+
79+
{
80+
"search": "cloud formation over water",
81+
"count": true,
82+
"vectorQueries": [
83+
{
84+
"kind": "text",
85+
"text": "cloud formation over water",
86+
"fields": "text_vector,image_vector"
87+
}
88+
],
89+
"queryType": "semantic",
90+
"semanticConfiguration": "my-semantic-configuration",
91+
"select": "title,chunk",
92+
"top": 5
93+
}
94+
```
95+
96+
A response for the above query includes the original RRF `@search.score` and the `@search.rerankerScore`.
97+
98+
```json
99+
"value": [
100+
{
101+
"@search.score": 0.03177805617451668,
102+
"@search.rerankerScore": 2.6919238567352295,
103+
"chunk": "A\nT\n\nM\nO\n\nS\nP\n\nH\nE\n\nR\nE\n\nE\nA\n\nR\nT\n\nH\n\n32\n\nFraming an Iceberg\nSouth Atlantic Ocean\n\nIn June 2016, the Suomi NPP satellite captured this image of various cloud formations in the South Atlantic Ocean. Note how low \n\nstratus clouds framed a hole over iceberg A-56 as it drifted across the sea. \n\nThe exact reason for the hole in the clouds is somewhat of a mystery. It could have formed by chance, although imagery from the \n\ndays before and after this date suggest something else was at work. It could be that the relatively unobstructed path of the clouds \n\nover the ocean surface was interrupted by thermal instability created by the iceberg. In other words, if an obstacle is big enough, \n\nit can divert the low-level atmospheric flow of air around it, a phenomenon often caused by islands.",
104+
"title": "page-39.pdf"
105+
},
106+
{
107+
"@search.score": 0.030621785670518875,
108+
"@search.rerankerScore": 2.557225465774536,
109+
"chunk": "A\nT\n\nM\nO\n\nS\nP\n\nH\nE\n\nR\nE\n\nE\nA\n\nR\nT\n\nH\n\n24\n\nMaking Tracks\nPacific Ocean\n\nShips steaming across the Pacific Ocean left this cluster of bright cloud trails lingering in the atmosphere in February 2012. The \n\nnarrow clouds, known as ship tracks, form when water vapor condenses around tiny particles of pollution from ship exhaust. The \n\ncrisscrossing clouds off the coast of California stretched for many hundreds of kilometers from end to end. The narrow ends of the \n\nclouds are youngest, while the broader, wavier ends are older.\n\nSome of the pollution particles generated by ships (especially sulfates) are soluble in water and can serve as the seeds around which \n\ncloud droplets form. Clouds infused with ship exhaust have more and smaller droplets than unpolluted clouds. As a result, light \n\nhitting the ship tracks scatters in many directions, often making them appear brighter than other types of marine clouds, which are \n\nusually seeded by larger, naturally occurring particles like sea salt.",
110+
"title": "page-31.pdf"
111+
},
112+
{
113+
"@search.score": 0.013698630034923553,
114+
"@search.rerankerScore": 2.515575408935547,
115+
"chunk": "A\nT\n\nM\nO\n\nS\nP\n\nH\nE\n\nR\nE\n\nE\nA\n\nR\nT\n\nH\n\n16\n\nRiding the Waves\nMauritania\n\nYou cannot see it directly, but air masses from Africa and the Atlantic Ocean are colliding in this Landsat 8 image from August 2016. \n\nThe collision off the coast of Mauritania produces a wave structure in the atmosphere. \n\nCalled an undular bore or solitary wave, this cloud formation was created by the interaction between cool, dry air coming off the \n\ncontinent and running into warm, moist air over the ocean. The winds blowing out from the land push a wave of air ahead like a \n\nbow wave moving ahead of a boat. \n\nParts of these waves are favorable for cloud formation, while other parts are not. The dust blowing out from Africa appears to be \n\nriding these waves. Dust has been known to affect cloud growth, but it probably has little to do with the cloud pattern observed here.",
116+
"title": "page-23.pdf"
117+
},
118+
{
119+
"@search.score": 0.028949543833732605,
120+
"@search.rerankerScore": 2.4990925788879395,
121+
"chunk": "A\nT\n\nM\nO\n\nS\nP\n\nH\nE\n\nR\nE\n\nE\nA\n\nR\nT\n\nH\n\n14\n\nBering Streets\nArctic Ocean\n\nWinds from the northeast pushed sea ice southward and formed cloud streets—parallel rows of clouds—over the Bering Strait in \n\nJanuary 2010. The easternmost reaches of Russia, blanketed in snow and ice, appear in the upper left. To the east, sea ice spans \n\nthe Bering Strait. Along the southern edge of the ice, wavy tendrils of newly formed, thin sea ice predominate.\n\nThe cloud streets run in the direction of the northerly wind that helps form them. When wind blows out from a cold surface like sea \n\nice over the warmer, moister air near the open ocean, cylinders of spinning air may develop. Clouds form along the upward cycle in \n\nthe cylinders, where air is rising, and skies remain clear along the downward cycle, where air is falling. The cloud streets run toward \n\nthe southwest in this image from the Terra satellite.",
122+
"title": "page-21.pdf"
123+
},
124+
{
125+
"@search.score": 0.027637723833322525,
126+
"@search.rerankerScore": 2.4686081409454346,
127+
"chunk": "A\nT\n\nM\nO\n\nS\nP\n\nH\nE\n\nR\nE\n\nE\nA\n\nR\nT\n\nH\n\n38\n\nLofted Over Land\nMadagascar\n\nAlong the muddy Mania River, midday clouds form over the forested land but not the water. In the tropical rainforests of Madagascar, \n\nthere is ample moisture for cloud formation. Sunlight heats the land all day, warming that moist air and causing it to rise high into the \n\natmosphere until it cools and condenses into water droplets. Clouds generally form where air is ascending (over land in this case), \n\nbut not where it is descending (over the river). Landsat 8 acquired this image in January 2015.",
128+
"title": "page-45.pdf",
129+
}
130+
]
131+
```
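As a rough illustration of how the two scores relate, the following Python sketch loads a trimmed copy of the response (scores and titles only) and compares the order implied by each score. Wrapping the fragment in an outer JSON object is an assumption made so that it parses; the variable names are illustrative.

```python
import json

# Trimmed copy of the response above, wrapped in an object so that it parses.
response_text = """
{
  "value": [
    {"@search.score": 0.03177805617451668, "@search.rerankerScore": 2.6919238567352295, "title": "page-39.pdf"},
    {"@search.score": 0.030621785670518875, "@search.rerankerScore": 2.557225465774536, "title": "page-31.pdf"},
    {"@search.score": 0.013698630034923553, "@search.rerankerScore": 2.515575408935547, "title": "page-23.pdf"},
    {"@search.score": 0.028949543833732605, "@search.rerankerScore": 2.4990925788879395, "title": "page-21.pdf"},
    {"@search.score": 0.027637723833322525, "@search.rerankerScore": 2.4686081409454346, "title": "page-45.pdf"}
  ]
}
"""

results = json.loads(response_text)["value"]


def titles_sorted_by(score_name):
    """Return titles ordered from highest to lowest value of the named score."""
    ordered = sorted(results, key=lambda r: r[score_name], reverse=True)
    return [r["title"] for r in ordered]


by_rrf = titles_sorted_by("@search.score")
by_reranker = titles_sorted_by("@search.rerankerScore")

print("RRF order:     ", by_rrf)
print("Reranker order:", by_reranker)
```

The two orders differ: `page-23.pdf` has the lowest RRF score of the five results but is third by `@search.rerankerScore`, which is the order used in the response.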

articles/search/search-security-overview.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ ms.topic: conceptual
1414
ms.date: 02/28/2025
1515
---
1616

17-
# Security overview for Azure AI Search
17+
# Security in Azure AI Search
1818

1919
This article describes the security features in Azure AI Search that protect data and operations.
2020
