Commit 0c4b910

Merge main and fix conflicts
2 parents 9471826 + e1a9170 commit 0c4b910

21 files changed: +1057 -56 lines changed
docs/changelog/136141.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 136141
+summary: Add settings for health indicator `shard_capacity` thresholds
+area: Health
+type: enhancement
+issues:
+- 116697

docs/reference/elasticsearch/configuration-reference/health-diagnostic-settings.md

Lines changed: 4 additions & 0 deletions
@@ -47,4 +47,8 @@ The following are the *expert-level* settings available for configuring an inter
 `health.periodic_logger.poll_interval`
 : ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting), [time unit value](/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) How often {{es}} logs the health status of the cluster and of each health indicator as observed by the Health API. Defaults to `60s` (60 seconds).
 
+`health.shard_capacity.unhealthy_threshold.yellow` {applies_to}`stack: ga 9.3`
+: ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting)) The minimum number of additional shards the cluster must still be able to allocate (on data or frozen nodes) for shard capacity health to remain `GREEN`. If fewer are available, health becomes `YELLOW`. Must be greater than `health.shard_capacity.unhealthy_threshold.red`. Defaults to `10`.
 
+`health.shard_capacity.unhealthy_threshold.red` {applies_to}`stack: ga 9.3`
+: ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting)) If the number of additional shards the cluster can still allocate (on data or frozen nodes) drops below this value, shard capacity health becomes `RED`. Must be less than `health.shard_capacity.unhealthy_threshold.yellow`. Defaults to `5`.
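
As a rough illustration of how the two thresholds documented above might partition the remaining shard capacity, here is a minimal Java sketch. The class and method names are hypothetical and are not taken from `ShardsCapacityHealthIndicatorService`. The boundary handling (strictly fewer than the threshold) is an assumption based on the wording of the settings documentation.

```java
// Hypothetical sketch of the threshold semantics described in the settings docs;
// this is not the Elasticsearch implementation.
final class ShardCapacityThresholdSketch {

    enum Status { GREEN, YELLOW, RED }

    /**
     * @param remainingShards how many more shards the cluster could still allocate
     * @param yellowThreshold value of health.shard_capacity.unhealthy_threshold.yellow
     * @param redThreshold    value of health.shard_capacity.unhealthy_threshold.red
     */
    static Status classify(int remainingShards, int yellowThreshold, int redThreshold) {
        if (remainingShards < redThreshold) {
            return Status.RED;
        }
        if (remainingShards < yellowThreshold) {
            return Status.YELLOW;
        }
        return Status.GREEN;
    }

    public static void main(String[] args) {
        // Uses the documented defaults: yellow = 10, red = 5.
        System.out.println(classify(12, 10, 5));
        System.out.println(classify(7, 10, 5));
        System.out.println(classify(3, 10, 5));
    }
}
```

With the default thresholds, a cluster that can still allocate exactly 10 shards would stay `GREEN` under this reading, and one that can allocate exactly 5 would be `YELLOW` rather than `RED`.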

docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md

Lines changed: 51 additions & 1 deletion
@@ -113,7 +113,9 @@ First, let’s examine how to combine two different types of queries: a `kNN` qu
 While these queries may produce scores in different ranges, we can use Reciprocal Rank Fusion (`rrf`) to combine the results and generate a merged final result list.
 
 To implement this in the retriever framework, we start with the top-level element: our `rrf` retriever.
-This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever. Our query structure would look like this:
+This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever.
+We can specify weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:
 
 ```console
 GET /retrievers_example/_search
@@ -197,6 +199,54 @@ This returns the following response based on the final rrf score for each result
 ::::
 
 
+### Using the expanded format with weights {applies_to}`stack: ga 9.2`
+
+The same query can be written using the expanded format, which allows you to specify custom weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:
+
+```console
+GET /retrievers_example/_search
+{
+  "retriever": {
+    "rrf": {
+      "retrievers": [
+        {
+          "retriever": {
+            "standard": {
+              "query": {
+                "query_string": {
+                  "query": "(information retrieval) OR (artificial intelligence)",
+                  "default_field": "text"
+                }
+              }
+            }
+          },
+          "weight": 2.0
+        },
+        {
+          "retriever": {
+            "knn": {
+              "field": "vector",
+              "query_vector": [
+                0.23,
+                0.67,
+                0.89
+              ],
+              "k": 3,
+              "num_candidates": 5
+            }
+          },
+          "weight": 1.0
+        }
+      ],
+      "rank_window_size": 10,
+      "rank_constant": 1
+    }
+  },
+  "_source": false
+}
+```
+
 
 ## Example: Hybrid search with linear retriever [retrievers-examples-linear-retriever]
 
docs/reference/elasticsearch/rest-apis/retrievers/rrf-retriever.md

Lines changed: 153 additions & 2 deletions
@@ -6,7 +6,7 @@ applies_to:
 
 # RRF retriever [rrf-retriever]
 
-An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers.
+An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, combining two or more child retrievers.
 Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.
 
 
@@ -32,7 +32,13 @@ Combining `query` and `retrievers` is not supported.
 : (Optional, array of retriever objects)
 
 A list of child retrievers to specify which sets of returned top documents will have the RRF formula applied to them.
-Each child retriever carries an equal weight as part of the RRF formula. Two or more child retrievers are required.
+Each retriever can optionally include a weight to adjust its influence on the final ranking. {applies_to}`stack: ga 9.2`
+
+When weights are specified, the final RRF score is calculated as:
+```
+rrf_score = weight_1 × rrf_score_1 + weight_2 × rrf_score_2 + ... + weight_n × rrf_score_n
+```
+where `rrf_score_i` is the RRF score for the document from retriever `i`, and `weight_i` is the weight for that retriever.
 
 `rank_constant`
 : (Optional, integer)
@@ -53,6 +59,82 @@ Combining `query` and `retrievers` is not supported.
 
 Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.
 
+Each entry in the `retrievers` array can be specified using the direct format or the wrapped format. {applies_to}`stack: ga 9.2`
+
+**Direct format** (default weight of `1.0`):
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "standard": {
+          "query": {
+            "multi_match": {
+              "query": "search text",
+              "fields": ["field1", "field2"]
+            }
+          }
+        }
+      },
+      {
+        "knn": {
+          "field": "vector",
+          "query_vector": [1, 2, 3],
+          "k": 10,
+          "num_candidates": 50
+        }
+      }
+    ]
+  }
+}
+```
+
+**Wrapped format with custom weights** {applies_to}`stack: ga 9.2`:
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "retriever": {
+          "standard": {
+            "query": {
+              "multi_match": {
+                "query": "search text",
+                "fields": ["field1", "field2"]
+              }
+            }
+          }
+        },
+        "weight": 2.0
+      },
+      {
+        "retriever": {
+          "knn": {
+            "field": "vector",
+            "query_vector": [1, 2, 3],
+            "k": 10,
+            "num_candidates": 50
+          }
+        },
+        "weight": 1.0
+      }
+    ]
+  }
+}
+```
+
+In the wrapped format:
+
+`retriever`
+: (Required, a retriever object)
+
+Specifies a child retriever. Any valid retriever type can be used (for example, `standard`, `knn`, or `text_similarity_reranker`).
+
+`weight` {applies_to}`stack: ga 9.2`
+: (Optional, float)
+
+The weight by which each score of this retriever's top docs is multiplied in the RRF formula. Higher values increase this retriever's influence on the final ranking. Must be non-negative. Defaults to `1.0`.
+
 ## Example: Hybrid search [rrf-retriever-example-hybrid]
 
 A simple hybrid search example (lexical search + dense vector search) combining a `standard` retriever with a `knn` retriever using RRF:
@@ -182,6 +264,75 @@ GET /restaurants/_search
 5. The rank constant for the RRF retriever.
 6. The rank window size for the RRF retriever.
 
+## Example: Weighted hybrid search [rrf-retriever-example-weighted]
+
+{applies_to}`stack: ga 9.2`
+
+This example demonstrates how to use weights to adjust the influence of different retrievers in the RRF ranking.
+In this case, we're giving the `standard` retriever more importance (weight 2.0) compared to the `knn` retriever (weight 1.0):
+
+```console
+GET /restaurants/_search
+{
+  "retriever": {
+    "rrf": {
+      "retrievers": [
+        {
+          "retriever": { <1>
+            "standard": {
+              "query": {
+                "multi_match": {
+                  "query": "Austria",
+                  "fields": ["city", "region"]
+                }
+              }
+            }
+          },
+          "weight": 2.0 <2>
+        },
+        {
+          "retriever": { <3>
+            "knn": {
+              "field": "vector",
+              "query_vector": [10, 22, 77],
+              "k": 10,
+              "num_candidates": 10
+            }
+          },
+          "weight": 1.0 <4>
+        }
+      ],
+      "rank_constant": 60,
+      "rank_window_size": 50
+    }
+  }
+}
+```
+% TEST[continued]
+
+1. The first retriever in weighted format.
+2. This retriever has a weight of 2.0, giving it twice the influence of the kNN retriever.
+3. The second retriever in weighted format.
+4. This retriever has a weight of 1.0 (default weight).
+
+::::{note}
+You can mix weighted and non-weighted formats in the same query.
+The direct format (without explicit `retriever` wrapper) uses the default weight of `1.0`:
+
+```json
+{
+  "rrf": {
+    "retrievers": [
+      { "standard": { "query": {...} } },
+      { "retriever": { "knn": {...} }, "weight": 2.0 }
+    ]
+  }
+}
+```
+
+In this example, the `standard` retriever uses weight `1.0` (default), while the `knn` retriever uses weight `2.0`.
+::::
+
 ## Example: Hybrid search with sparse vectors [rrf-retriever-example-hybrid-sparse]
 
 A more complex hybrid search example (lexical search + ELSER sparse vector search + dense vector search) using RRF:
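
To make the weighted formula documented in this file's diff concrete, the following Java sketch fuses two ranked lists. It assumes each retriever's per-document RRF score is `1 / (rank_constant + rank)` with 1-based ranks, and that a document missing from a retriever's list contributes nothing. The class, field, and method names are hypothetical and are not part of the Elasticsearch codebase.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of weighted reciprocal rank fusion; not the Elasticsearch implementation.
final class WeightedRrfSketch {

    /** Ranked document IDs (best first) from one child retriever, plus that retriever's weight. */
    static final class RankedList {
        final List<String> docIds;
        final double weight;

        RankedList(List<String> docIds, double weight) {
            if (weight < 0) {
                throw new IllegalArgumentException("weight must be non-negative");
            }
            this.docIds = docIds;
            this.weight = weight;
        }
    }

    /** Returns doc ID -> sum over retrievers of weight_i * 1 / (rankConstant + rank_i). */
    static Map<String, Double> fuse(List<RankedList> lists, int rankConstant) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (RankedList list : lists) {
            for (int i = 0; i < list.docIds.size(); i++) {
                int rank = i + 1; // ranks are 1-based
                double contribution = list.weight * (1.0 / (rankConstant + rank));
                scores.merge(list.docIds.get(i), contribution, Double::sum);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // The "standard" list gets weight 2.0 and the "knn" list gets weight 1.0.
        RankedList standard = new RankedList(List.of("doc_a", "doc_b"), 2.0);
        RankedList knn = new RankedList(List.of("doc_b", "doc_c"), 1.0);
        System.out.println(fuse(List.of(standard, knn), 1));
    }
}
```

With `rank_constant` set to 1, `doc_b` scores 2.0/3 + 1.0/2 and so outranks `doc_a` at 2.0/2, even though `doc_a` is first in the more heavily weighted list. Appearing in both lists still matters under weighting.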

muted-tests.yml

Lines changed: 6 additions & 3 deletions
@@ -456,9 +456,6 @@ tests:
 - class: org.elasticsearch.test.rest.yaml.RcsCcsCommonYamlTestSuiteIT
   method: test {p0=search.vectors/200_dense_vector_docvalue_fields/Enable docvalue_fields parameter for dense_vector fields}
   issue: https://github.com/elastic/elasticsearch/issues/136443
-- class: org.elasticsearch.xpack.downsample.ILMDownsampleDisruptionIT
-  method: testILMDownsampleRollingRestart
-  issue: https://github.com/elastic/elasticsearch/issues/136585
 - class: org.elasticsearch.xpack.esql.heap_attack.HeapAttackIT
   method: testManyConcat
   issue: https://github.com/elastic/elasticsearch/issues/136728
@@ -504,6 +501,12 @@ tests:
 - class: org.elasticsearch.readiness.ReadinessClusterIT
   method: testReadinessDuringRestartsNormalOrder
   issue: https://github.com/elastic/elasticsearch/issues/136955
+- class: org.elasticsearch.xpack.esql.expression.function.aggregate.DimensionValuesByteRefGroupingAggregatorFunctionTests
+  method: testSimple
+  issue: https://github.com/elastic/elasticsearch/issues/137378
+- class: org.elasticsearch.xpack.ilm.TimeSeriesDataStreamsIT
+  method: testSearchableSnapshotAction
+  issue: https://github.com/elastic/elasticsearch/issues/137167
 
 # Examples:
 #

server/src/internalClusterTest/java/org/elasticsearch/health/HealthMetadataServiceIT.java

Lines changed: 25 additions & 5 deletions
@@ -30,6 +30,8 @@
 import static org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING;
 import static org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_LOW_DISK_MAX_HEADROOM_SETTING;
 import static org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_LOW_DISK_WATERMARK_SETTING;
+import static org.elasticsearch.health.node.ShardsCapacityHealthIndicatorService.SETTING_SHARD_CAPACITY_UNHEALTHY_THRESHOLD_RED;
+import static org.elasticsearch.health.node.ShardsCapacityHealthIndicatorService.SETTING_SHARD_CAPACITY_UNHEALTHY_THRESHOLD_YELLOW;
 import static org.elasticsearch.indices.ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE;
 import static org.elasticsearch.indices.ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN;
 import static org.elasticsearch.test.NodeRoles.onlyRoles;
@@ -55,7 +57,12 @@ public void testEachMasterPublishesTheirThresholds() throws Exception {
         ByteSizeValue randomBytes = ByteSizeValue.ofBytes(randomLongBetween(6, 19));
         String customWatermark = percentageMode ? randomIntBetween(86, 94) + "%" : randomBytes.toString();
         ByteSizeValue customMaxHeadroom = percentageMode ? randomBytes : ByteSizeValue.MINUS_ONE;
-        var customShardLimits = new HealthMetadata.ShardLimits(randomIntBetween(1, 1000), randomIntBetween(1001, 2000));
+        var customShardLimits = new HealthMetadata.ShardLimits(
+            randomIntBetween(1, 1000),
+            randomIntBetween(1001, 2000),
+            randomIntBetween(101, 200),
+            randomIntBetween(1, 100)
+        );
         String nodeName = startNode(internalCluster, customWatermark, customMaxHeadroom.toString(), customShardLimits);
         watermarkByNode.put(nodeName, customWatermark);
         maxHeadroomByNode.put(nodeName, customMaxHeadroom);
@@ -111,7 +118,9 @@ public void testWatermarkSettingUpdate() throws Exception {
         ByteSizeValue initialMaxHeadroom = percentageMode ? randomBytes : ByteSizeValue.MINUS_ONE;
         HealthMetadata.ShardLimits initialShardLimits = new HealthMetadata.ShardLimits(
             randomIntBetween(1, 1000),
-            randomIntBetween(1001, 2000)
+            randomIntBetween(1001, 2000),
+            randomIntBetween(101, 200),
+            randomIntBetween(1, 100)
         );
         for (int i = 0; i < numberOfNodes; i++) {
             startNode(internalCluster, initialWatermark, initialMaxHeadroom.toString(), initialShardLimits);
@@ -128,7 +137,9 @@ public void testWatermarkSettingUpdate() throws Exception {
         ByteSizeValue updatedFloodStageMaxHeadroom = percentageMode ? randomBytes : ByteSizeValue.MINUS_ONE;
         HealthMetadata.ShardLimits updatedShardLimits = new HealthMetadata.ShardLimits(
             randomIntBetween(3000, 4000),
-            randomIntBetween(4001, 5000)
+            randomIntBetween(4001, 5000),
+            randomIntBetween(101, 200),
+            randomIntBetween(1, 100)
         );
 
         ensureStableCluster(numberOfNodes);
@@ -146,7 +157,9 @@ public void testWatermarkSettingUpdate() throws Exception {
             .put(CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING.getKey(), updatedHighWatermark)
             .put(CLUSTER_ROUTING_ALLOCATION_DISK_FLOOD_STAGE_WATERMARK_SETTING.getKey(), updatedFloodStageWatermark)
             .put(SETTING_CLUSTER_MAX_SHARDS_PER_NODE.getKey(), updatedShardLimits.maxShardsPerNode())
-            .put(SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN.getKey(), updatedShardLimits.maxShardsPerNodeFrozen());
+            .put(SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN.getKey(), updatedShardLimits.maxShardsPerNodeFrozen())
+            .put(SETTING_SHARD_CAPACITY_UNHEALTHY_THRESHOLD_YELLOW.getKey(), updatedShardLimits.shardCapacityUnhealthyThresholdYellow())
+            .put(SETTING_SHARD_CAPACITY_UNHEALTHY_THRESHOLD_RED.getKey(), updatedShardLimits.shardCapacityUnhealthyThresholdRed());
 
         if (percentageMode) {
             settingsBuilder.put(CLUSTER_ROUTING_ALLOCATION_LOW_DISK_MAX_HEADROOM_SETTING.getKey(), updatedLowMaxHeadroom)
@@ -214,7 +227,12 @@ public void testHealthNodeToggleEnabled() throws Exception {
         ByteSizeValue randomBytes = ByteSizeValue.ofBytes(randomLongBetween(6, 19));
         String customWatermark = percentageMode ? randomIntBetween(86, 94) + "%" : randomBytes.toString();
         ByteSizeValue customMaxHeadroom = percentageMode ? randomBytes : ByteSizeValue.MINUS_ONE;
-        var customShardLimits = new HealthMetadata.ShardLimits(randomIntBetween(1, 1000), randomIntBetween(1001, 2000));
+        var customShardLimits = new HealthMetadata.ShardLimits(
+            randomIntBetween(1, 1000),
+            randomIntBetween(1001, 2000),
+            randomIntBetween(101, 200),
+            randomIntBetween(1, 100)
+        );
         String nodeName = startNode(internalCluster, customWatermark, customMaxHeadroom.toString(), customShardLimits);
         watermarkByNode.put(nodeName, customWatermark);
         maxHeadroomByNode.put(nodeName, customMaxHeadroom);
@@ -270,6 +288,8 @@ private String startNode(
                 .put(createWatermarkSettings(customWatermark, customMaxHeadroom))
                 .put(SETTING_CLUSTER_MAX_SHARDS_PER_NODE.getKey(), customShardLimits.maxShardsPerNode())
                 .put(SETTING_CLUSTER_MAX_SHARDS_PER_NODE_FROZEN.getKey(), customShardLimits.maxShardsPerNodeFrozen())
+                .put(SETTING_SHARD_CAPACITY_UNHEALTHY_THRESHOLD_YELLOW.getKey(), customShardLimits.shardCapacityUnhealthyThresholdYellow())
+                .put(SETTING_SHARD_CAPACITY_UNHEALTHY_THRESHOLD_RED.getKey(), customShardLimits.shardCapacityUnhealthyThresholdRed())
                 .build()
         );
     }
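
The test changes in this file draw the yellow threshold from 101 to 200 and the red threshold from 1 to 100. Because the ranges are disjoint, the documented constraint (yellow must be greater than red) holds for every sampled pair. The sketch below illustrates that reasoning under stated assumptions: `validate` and the local `randomIntBetween` helper are hypothetical stand-ins, and the actual validation logic in Elasticsearch is not shown in this diff.

```java
import java.util.Random;

// Hypothetical sketch; the names here are illustrative and not taken from Elasticsearch.
final class ThresholdRangeSketch {

    /** Rejects threshold pairs that violate the documented constraint yellow > red. */
    static void validate(int yellow, int red) {
        if (yellow <= red) {
            throw new IllegalArgumentException(
                "yellow threshold [" + yellow + "] must be greater than red threshold [" + red + "]");
        }
    }

    /** Inclusive random int in [min, max]; a local stand-in for the test framework helper. */
    static int randomIntBetween(Random random, int min, int max) {
        return min + random.nextInt(max - min + 1);
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        for (int i = 0; i < 1_000; i++) {
            int yellow = randomIntBetween(random, 101, 200);
            int red = randomIntBetween(random, 1, 100);
            // Never throws: the smallest yellow (101) exceeds the largest red (100).
            validate(yellow, red);
        }
        System.out.println("all sampled threshold pairs are valid");
    }
}
```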
