Commit 15eb5e9

Merge branch 'main' into esql_fuse_length
2 parents 8746cfa + fa252fa commit 15eb5e9

232 files changed: +5576 -1777 lines changed

docs/changelog/136141.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 136141
+summary: Add settings for health indicator `shard_capacity` thresholds
+area: Health
+type: enhancement
+issues:
+ - 116697

docs/changelog/136828.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 136828
+summary: Can match phase coordinator duration APM metric
+area: Search
+type: enhancement
+issues: []

docs/changelog/136996.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 136996
+summary: Add periodic PKC JWK set reloading capability to JWT realm
+area: Security
+type: enhancement
+issues: []

docs/reference/elasticsearch/configuration-reference/health-diagnostic-settings.md

Lines changed: 4 additions & 0 deletions
@@ -47,4 +47,8 @@ The following are the *expert-level* settings available for configuring an inter
 `health.periodic_logger.poll_interval`
 : ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting), [time unit value](/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) How often {{es}} logs the health status of the cluster and of each health indicator as observed by the Health API. Defaults to `60s` (60 seconds).

+`health.shard_capacity.unhealthy_threshold.yellow` {applies_to}`stack: ga 9.3`
+: ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting)) The minimum number of additional shards the cluster must still be able to allocate (on data or frozen nodes) for shard capacity health to remain `GREEN`. If fewer are available, health becomes `YELLOW`. Must be greater than `health.shard_capacity.unhealthy_threshold.red`. Defaults to `10`.

+`health.shard_capacity.unhealthy_threshold.red` {applies_to}`stack: ga 9.3`
+: ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting)) The minimum number of additional shards the cluster must still be able to allocate (on data or frozen nodes) below which shard capacity health becomes `RED`. Must be less than `health.shard_capacity.unhealthy_threshold.yellow`. Defaults to `5`.
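
The relationship between the two thresholds described in the hunk above can be sketched in a few lines of Python. This is only an illustration of the documented rule, not the Elasticsearch implementation: the names `Status`, `validate_thresholds`, and `shard_capacity_status` are hypothetical, and the exact boundary comparisons (strictly fewer than the threshold) are an assumption based on the wording "if fewer are available".

```python
from enum import Enum


class Status(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def validate_thresholds(yellow: int, red: int) -> None:
    """Reject threshold pairs where yellow is not strictly greater than red."""
    if yellow <= red:
        raise ValueError(
            f"yellow threshold ({yellow}) must be greater than red threshold ({red})"
        )


def shard_capacity_status(remaining_shards: int, yellow: int = 10, red: int = 5) -> Status:
    """Classify health from the number of shards that can still be allocated.

    Defaults mirror the documented defaults (yellow=10, red=5).
    Assumption: a threshold is breached when strictly fewer shards remain.
    """
    validate_thresholds(yellow, red)
    if remaining_shards < red:
        return Status.RED
    if remaining_shards < yellow:
        return Status.YELLOW
    return Status.GREEN
```

Under these assumptions, a cluster with room for 9 more shards would report `YELLOW` with the default thresholds, and one with room for 4 would report `RED`.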

docs/reference/elasticsearch/configuration-reference/security-settings.md

Lines changed: 13 additions & 1 deletion
@@ -1523,7 +1523,19 @@ $$$jwt-claim-pattern-principal$$$
 : ([Static](docs-content://deploy-manage/stack-settings.md#static-cluster-setting)) Specifies the time-to-live for the period of time to cache JWT entries. JWTs can only be cached if client authentication is successful (or disabled). Uses the standard {{es}} [time units](/reference/elasticsearch/rest-apis/api-conventions.md#time-units). If clients use a different JWT for every request, set to `0` to disable the JWT cache. Defaults to `20m`.

 `pkc_jwkset_path` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
-: ([Static](docs-content://deploy-manage/stack-settings.md#static-cluster-setting)) The file name or URL to a JSON Web Key Set (JWKS) with the public key material that the JWT Realm uses for verifying token signatures. A value is considered a file name if it does not begin with `https`. The file name is resolved relative to the {{es}} configuration directory. If a URL is provided, then it must begin with `https://` (`http://` is not supported). {{es}} automatically caches the JWK set and will attempt to refresh the JWK set upon signature verification failure, as this might indicate that the JWT Provider has rotated the signing keys.
+: ([Static](docs-content://deploy-manage/stack-settings.md#static-cluster-setting)) The file name or URL to a JSON Web Key Set (JWKS) with the public key material that the JWT Realm uses for verifying token signatures. A value is considered a file name if it does not begin with `https`. The file name is resolved relative to the {{es}} configuration directory. If a URL is provided, then it must begin with `https://` (`http://` is not supported). {{es}} automatically caches the JWK set and will attempt to refresh the JWK set upon signature verification failure, as this might indicate that the JWT Provider has rotated the signing keys. Background JWKS reloading can also be configured with the setting `pkc_jwkset_reload.enabled`. This ensures that rotated keys are automatically discovered and used to verify JWT signatures.
+
+`pkc_jwkset_reload.enabled` {applies_to}`stack: ga 9.3` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
+: ([Static](docs-content://deploy-manage/stack-settings.md#static-cluster-setting)) Indicates whether JWKS background reloading is enabled. Defaults to `false`.
+
+`pkc_jwkset_reload.file_interval` {applies_to}`stack: ga 9.3` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
+: ([Static](docs-content://deploy-manage/stack-settings.md#static-cluster-setting)) Specifies the reload interval for file-based JWKS. Defaults to `5m`.
+
+`pkc_jwkset_reload.url_interval_min` {applies_to}`stack: ga 9.3` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
+: ([Static](docs-content://deploy-manage/stack-settings.md#static-cluster-setting)) Specifies the minimum reload interval for URL-based JWKS. The `Expires` and `Cache-Control` HTTP response headers inform the reload interval. This configuration setting is the lower bound of what is considered, and it is also the default interval in the absence of useful response headers. Defaults to `1h`.
+
+`pkc_jwkset_reload.url_interval_max` {applies_to}`stack: ga 9.3` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
+: ([Static](docs-content://deploy-manage/stack-settings.md#static-cluster-setting)) Specifies the maximum reload interval for URL-based JWKS. This configuration setting is the upper bound of what is considered from header responses (`5d`).

 `hmac_jwkset` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
 : ([Secure](docs-content://deploy-manage/security/secure-settings.md)) Contents of a JSON Web Key Set (JWKS), including the secret key that the JWT realm uses to verify token signatures. This format supports multiple keys and optional attributes, and is preferred over the `hmac_key` setting. Cannot be used in conjunction with the `hmac_key` setting. Refer to [Configure {{es}} to use a JWT realm](docs-content://deploy-manage/users-roles/cluster-or-deployment-auth/jwt.md).
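
The way `pkc_jwkset_reload.url_interval_min` and `pkc_jwkset_reload.url_interval_max` bound the header-derived reload interval can be pictured as a simple clamp. The sketch below is an illustration under stated assumptions, not the realm's actual code: `next_reload_interval` is a hypothetical helper, `header_hint` stands for a duration already parsed from `Expires` or `Cache-Control` (parsing is out of scope here), and treating `5d` as the default upper bound is an assumption drawn from the parenthetical in the setting description.

```python
from datetime import timedelta
from typing import Optional


def next_reload_interval(
    header_hint: Optional[timedelta],
    min_interval: timedelta = timedelta(hours=1),  # documented default: 1h
    max_interval: timedelta = timedelta(days=5),   # assumed default: 5d
) -> timedelta:
    """Pick the next URL-based JWKS reload interval.

    header_hint is a hypothetical, pre-parsed cache lifetime from the
    HTTP response headers, or None when no useful header was present.
    """
    if min_interval > max_interval:
        raise ValueError("min_interval must not exceed max_interval")
    if header_hint is None:
        # No usable headers: fall back to the lower bound.
        return min_interval
    # Clamp the header-derived value into [min_interval, max_interval].
    return max(min_interval, min(header_hint, max_interval))
```

With these defaults, a header suggesting a 10-minute lifetime would yield a 1-hour interval, and one suggesting 30 days would yield 5 days.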

docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md

Lines changed: 54 additions & 1 deletion
@@ -113,7 +113,9 @@ First, let’s examine how to combine two different types of queries: a `kNN` qu
 While these queries may produce scores in different ranges, we can use Reciprocal Rank Fusion (`rrf`) to combine the results and generate a merged final result list.

 To implement this in the retriever framework, we start with the top-level element: our `rrf` retriever.
-This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever. Our query structure would look like this:
+This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever.
+We can specify weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:

 ```console
 GET /retrievers_example/_search
@@ -197,6 +199,57 @@ This returns the following response based on the final rrf score for each result
 ::::


+### Using the expanded format with weights
+```{applies_to}
+stack: ga 9.2
+```
+
+The same query can be written using the expanded format, which allows you to specify custom weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:
+
+```console
+GET /retrievers_example/_search
+{
+  "retriever": {
+    "rrf": {
+      "retrievers": [
+        {
+          "retriever": {
+            "standard": {
+              "query": {
+                "query_string": {
+                  "query": "(information retrieval) OR (artificial intelligence)",
+                  "default_field": "text"
+                }
+              }
+            }
+          },
+          "weight": 2.0
+        },
+        {
+          "retriever": {
+            "knn": {
+              "field": "vector",
+              "query_vector": [
+                0.23,
+                0.67,
+                0.89
+              ],
+              "k": 3,
+              "num_candidates": 5
+            }
+          },
+          "weight": 1.0
+        }
+      ],
+      "rank_window_size": 10,
+      "rank_constant": 1
+    }
+  },
+  "_source": false
+}
+```
+

 ## Example: Hybrid search with linear retriever [retrievers-examples-linear-retriever]


docs/reference/elasticsearch/rest-apis/retrievers/rrf-retriever.md

Lines changed: 153 additions & 2 deletions
@@ -6,7 +6,7 @@ applies_to:

 # RRF retriever [rrf-retriever]

-An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers.
+An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, combining two or more child retrievers.
 Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.


@@ -32,7 +32,13 @@ Combining `query` and `retrievers` is not supported.
 : (Optional, array of retriever objects)

 A list of child retrievers to specify which sets of returned top documents will have the RRF formula applied to them.
-Each child retriever carries an equal weight as part of the RRF formula. Two or more child retrievers are required.
+Each retriever can optionally include a weight to adjust its influence on the final ranking. {applies_to}`stack: ga 9.2`
+
+When weights are specified, the final RRF score is calculated as:
+```
+rrf_score = weight_1 × rrf_score_1 + weight_2 × rrf_score_2 + ... + weight_n × rrf_score_n
+```
+where `rrf_score_i` is the RRF score for the document from retriever `i`, and `weight_i` is the weight for that retriever.

 `rank_constant`
 : (Optional, integer)
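
The weighted formula in the hunk above can be made concrete with a small Python sketch. It assumes the standard reciprocal rank contribution `1 / (rank_constant + rank)` for each retriever, with ranks starting at 1. The function name `weighted_rrf`, the input shape (one ranked list of document IDs per retriever), and breaking ties by document ID are illustrative assumptions, not Elasticsearch behavior.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def weighted_rrf(
    ranked_lists: Sequence[Sequence[str]],
    weights: Sequence[float],
    rank_constant: int = 60,
) -> List[Tuple[str, float]]:
    """Fuse ranked lists of document IDs with per-retriever weights.

    Each document's score is the sum over retrievers of
    weight_i * 1 / (rank_constant + rank_i), where rank_i starts at 1.
    Documents absent from a retriever's list contribute nothing for it.
    """
    if len(ranked_lists) != len(weights):
        raise ValueError("one weight is required per ranked list")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    scores: Dict[str, float] = defaultdict(float)
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] += weight / (rank_constant + rank)

    # Highest score first; ties broken by document ID (an assumption).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

For instance, with `rank_constant=1`, a `standard` list `["a", "b"]` weighted `2.0`, and a `knn` list `["b", "a"]` weighted `1.0`, document `a` scores 2/2 + 1/3 and document `b` scores 2/3 + 1/2, so `a` ranks first. With equal weights the two documents would tie.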
@@ -53,6 +59,82 @@ Combining `query` and `retrievers` is not supported.

 Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.

+Each entry in the `retrievers` array can be specified using the direct format or the wrapped format. {applies_to}`stack: ga 9.2`
+
+**Direct format** (default weight of `1.0`):
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "standard": {
+          "query": {
+            "multi_match": {
+              "query": "search text",
+              "fields": ["field1", "field2"]
+            }
+          }
+        }
+      },
+      {
+        "knn": {
+          "field": "vector",
+          "query_vector": [1, 2, 3],
+          "k": 10,
+          "num_candidates": 50
+        }
+      }
+    ]
+  }
+}
+```
+
+**Wrapped format with custom weights** {applies_to}`stack: ga 9.2`:
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "retriever": {
+          "standard": {
+            "query": {
+              "multi_match": {
+                "query": "search text",
+                "fields": ["field1", "field2"]
+              }
+            }
+          }
+        },
+        "weight": 2.0
+      },
+      {
+        "retriever": {
+          "knn": {
+            "field": "vector",
+            "query_vector": [1, 2, 3],
+            "k": 10,
+            "num_candidates": 50
+          }
+        },
+        "weight": 1.0
+      }
+    ]
+  }
+}
+```
+
+In the wrapped format:
+
+`retriever`
+: (Required, a retriever object)
+
+Specifies a child retriever. Any valid retriever type can be used (e.g., `standard`, `knn`, `text_similarity_reranker`, etc.).
+
+`weight` {applies_to}`stack: ga 9.2`
+: (Optional, float)
+
+The weight by which each score of this retriever's top docs is multiplied in the RRF formula. Higher values increase this retriever's influence on the final ranking. Must be non-negative. Defaults to `1.0`.
+
 ## Example: Hybrid search [rrf-retriever-example-hybrid]

 A simple hybrid search example (lexical search + dense vector search) combining a `standard` retriever with a `knn` retriever using RRF:
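
One way to picture how the direct and wrapped formats from the hunk above relate is a client-side helper that maps either shape to a `(retriever, weight)` pair. This is a hypothetical sketch for illustration only: `normalize_retriever_entry` is not part of any Elasticsearch client, and the real parser may accept or reject inputs differently.

```python
from typing import Any, Dict, Tuple

DEFAULT_WEIGHT = 1.0  # documented default for entries without an explicit weight


def normalize_retriever_entry(entry: Dict[str, Any]) -> Tuple[Dict[str, Any], float]:
    """Return (retriever, weight) for a direct or wrapped `retrievers` entry."""
    if "retriever" in entry:
        # Wrapped format: {"retriever": {...}, "weight": <float>}
        retriever = entry["retriever"]
        weight = float(entry.get("weight", DEFAULT_WEIGHT))
    else:
        # Direct format: the entry itself is the retriever object.
        retriever = entry
        weight = DEFAULT_WEIGHT
    if weight < 0:
        raise ValueError("weight must be non-negative")
    return retriever, weight
```

Applied to a mixed list, a direct `standard` entry would come back with weight `1.0`, while a wrapped `knn` entry with `"weight": 2.0` would keep its explicit weight.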
@@ -182,6 +264,75 @@ GET /restaurants/_search
 5. The rank constant for the RRF retriever.
 6. The rank window size for the RRF retriever.

+## Example: Weighted hybrid search [rrf-retriever-example-weighted]
+
+{applies_to}`stack: ga 9.2`
+
+This example demonstrates how to use weights to adjust the influence of different retrievers in the RRF ranking.
+In this case, we're giving the `standard` retriever more importance (weight 2.0) compared to the `knn` retriever (weight 1.0):
+
+```console
+GET /restaurants/_search
+{
+  "retriever": {
+    "rrf": {
+      "retrievers": [
+        {
+          "retriever": { <1>
+            "standard": {
+              "query": {
+                "multi_match": {
+                  "query": "Austria",
+                  "fields": ["city", "region"]
+                }
+              }
+            }
+          },
+          "weight": 2.0 <2>
+        },
+        {
+          "retriever": { <3>
+            "knn": {
+              "field": "vector",
+              "query_vector": [10, 22, 77],
+              "k": 10,
+              "num_candidates": 10
+            }
+          },
+          "weight": 1.0 <4>
+        }
+      ],
+      "rank_constant": 60,
+      "rank_window_size": 50
+    }
+  }
+}
+```
+% TEST[continued]
+
+1. The first retriever in weighted format.
+2. This retriever has a weight of 2.0, giving it twice the influence of the kNN retriever.
+3. The second retriever in weighted format.
+4. This retriever has a weight of 1.0 (default weight).
+
+::::{note}
+You can mix weighted and non-weighted formats in the same query.
+The direct format (without explicit `retriever` wrapper) uses the default weight of `1.0`:
+
+```json
+{
+  "rrf": {
+    "retrievers": [
+      { "standard": { "query": {...} } },
+      { "retriever": { "knn": {...} }, "weight": 2.0 }
+    ]
+  }
+}
+```
+
+In this example, the `standard` retriever uses weight `1.0` (default), while the `knn` retriever uses weight `2.0`.
+::::
+
 ## Example: Hybrid search with sparse vectors [rrf-retriever-example-hybrid-sparse]

 A more complex hybrid search example (lexical search + ELSER sparse vector search + dense vector search) using RRF:

muted-tests.yml

Lines changed: 12 additions & 3 deletions
@@ -456,9 +456,6 @@ tests:
 - class: org.elasticsearch.test.rest.yaml.RcsCcsCommonYamlTestSuiteIT
   method: test {p0=search.vectors/200_dense_vector_docvalue_fields/Enable docvalue_fields parameter for dense_vector fields}
   issue: https://github.com/elastic/elasticsearch/issues/136443
-- class: org.elasticsearch.xpack.downsample.ILMDownsampleDisruptionIT
-  method: testILMDownsampleRollingRestart
-  issue: https://github.com/elastic/elasticsearch/issues/136585
 - class: org.elasticsearch.xpack.esql.heap_attack.HeapAttackIT
   method: testManyConcat
   issue: https://github.com/elastic/elasticsearch/issues/136728
@@ -504,6 +501,18 @@ tests:
 - class: org.elasticsearch.readiness.ReadinessClusterIT
   method: testReadinessDuringRestartsNormalOrder
   issue: https://github.com/elastic/elasticsearch/issues/136955
+- class: org.elasticsearch.xpack.esql.expression.function.aggregate.DimensionValuesByteRefGroupingAggregatorFunctionTests
+  method: testSimple
+  issue: https://github.com/elastic/elasticsearch/issues/137378
+- class: org.elasticsearch.xpack.ilm.TimeSeriesDataStreamsIT
+  method: testSearchableSnapshotAction
+  issue: https://github.com/elastic/elasticsearch/issues/137167
+- class: org.elasticsearch.xpack.security.CoreWithSecurityClientYamlTestSuiteIT
+  method: test {yaml=indices.validate_query/20_query_string/validate_query with query_string parameters}
+  issue: https://github.com/elastic/elasticsearch/issues/137391
+- class: org.elasticsearch.xpack.downsample.ILMDownsampleDisruptionIT
+  method: testILMDownsampleRollingRestart
+  issue: https://github.com/elastic/elasticsearch/issues/136585

 # Examples:
 #
