elastic
diff --git a/docs/changelog/134320.yaml b/docs/changelog/134320.yaml
Lines changed: 5 additions & 0 deletions
diff --git a/docs/changelog/136141.yaml b/docs/changelog/136141.yaml
Lines changed: 6 additions & 0 deletions
diff --git a/docs/reference/elasticsearch/configuration-reference/health-diagnostic-settings.md b/docs/reference/elasticsearch/configuration-reference/health-diagnostic-settings.md
Lines changed: 4 additions & 0 deletions
diff --git a/docs/reference/elasticsearch/mapping-reference/pattern-text.md b/docs/reference/elasticsearch/mapping-reference/pattern-text.md
Lines changed: 5 additions & 3 deletions
diff --git a/docs/reference/elasticsearch/mapping-reference/semantic-text.md b/docs/reference/elasticsearch/mapping-reference/semantic-text.md
Lines changed: 9 additions & 9 deletions
diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md b/docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md
Lines changed: 54 additions & 1 deletion
diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/rrf-retriever.md b/docs/reference/elasticsearch/rest-apis/retrievers/rrf-retriever.md
Lines changed: 153 additions & 2 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/functions/description/chunk.md b/docs/reference/query-languages/esql/_snippets/functions/description/chunk.md
Lines changed: 10 additions & 0 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/functions/examples/chunk.md b/docs/reference/query-languages/esql/_snippets/functions/examples/chunk.md
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 134320
+summary: Add CHUNK function
+area: ES|QL
+type: enhancement
+issues: []
@@ -0,0 +1,6 @@
+pr: 136141
+summary: Add settings for health indicator `shard_capacity` thresholds
+area: Health
+type: enhancement
+issues:
+ - 116697
@@ -47,4 +47,8 @@ The following are the *expert-level* settings available for configuring an inter
 `health.periodic_logger.poll_interval`
 :   ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting), [time unit value](/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) How often {{es}} logs the health status of the cluster and of each health indicator as observed by the Health API. Defaults to `60s` (60 seconds).
 
+`health.shard_capacity.unhealthy_threshold.yellow` {applies_to}`stack: ga 9.3`
+:   ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting)) The minimum number of additional shards the cluster must still be able to allocate (on data or frozen nodes) for shard capacity health to remain `GREEN`. If fewer are available, health becomes `YELLOW`. Must be greater than `health.shard_capacity.unhealthy_threshold.red`. Defaults to `10`.
 
+`health.shard_capacity.unhealthy_threshold.red` {applies_to}`stack: ga 9.3`
+:   ([Dynamic](docs-content://deploy-manage/stack-settings.md#dynamic-cluster-setting)) The minimum number of additional shards the cluster must still be able to allocate (on data or frozen nodes) for shard capacity health to stay out of `RED`. If fewer are available, health becomes `RED`. Must be less than `health.shard_capacity.unhealthy_threshold.yellow`. Defaults to `5`.
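
The way the two thresholds interact can be illustrated with a short Python sketch. This is not the Elasticsearch implementation: the function name `shard_capacity_status` and the `available_shards` parameter are hypothetical, and the boundary handling is an assumption read from the wording of the setting descriptions. Only the defaults (`10` and `5`) and the rule that the yellow threshold must exceed the red threshold come from the text above.

```python
def shard_capacity_status(available_shards: int, yellow: int = 10, red: int = 5) -> str:
    """Map the number of shards that can still be allocated to a health colour.

    Illustrative only: mirrors the documented semantics of
    health.shard_capacity.unhealthy_threshold.yellow / .red.
    """
    # The yellow threshold must be greater than the red threshold.
    if yellow <= red:
        raise ValueError("yellow threshold must be greater than red threshold")
    # Fewer shards left than the red threshold: RED.
    if available_shards < red:
        return "RED"
    # Fewer shards left than the yellow threshold (but not below red): YELLOW.
    if available_shards < yellow:
        return "YELLOW"
    # At least `yellow` shards can still be allocated: GREEN.
    return "GREEN"
```

With the defaults, a cluster that can still allocate 7 more shards would be reported as `YELLOW` under this reading.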
@@ -46,14 +46,16 @@ In both cases, all queries return a constant score of 1.0.
 
 ## Index sorting for improved compression
 The compression provided by `pattern_text` can be significantly improved if the index is sorted by the `template_id` field.
-For example, a typical approach would be to sort first by `message.template_id`, then by `@timestamp`, as shown in the following example.
+This sorting is not applied by default. For LogsDB indices whose `message` field is of type `pattern_text`, you can enable it by setting the `index.logsdb.default_sort_on_message_template` index setting to `true`.
+The index is then sorted by `host.name` (if present), then by `message.template_id`, and finally by `@timestamp`.
+If the index is not a LogsDB index, or the `pattern_text` field is named something other than `message`, you can still apply index sorting manually, as shown in the following example.
 
 ```console
 PUT logs
 {
   "settings": {
     "index": {
-      "sort.field": [ "message.template_id", "@timestamp" ],
+      "sort.field": [ "notice.template_id", "@timestamp" ],
       "sort.order": [ "asc", "desc" ]
     }
   },
@@ -62,7 +64,7 @@ PUT logs
       "@timestamp": {
         "type": "date"
       },
-      "message": {
+      "notice": {
         "type": "pattern_text"
       }
     }
 
@@ -424,20 +424,20 @@ POST test-index/_search
 
 ## Updates and partial updates for `semantic_text` fields [semantic-text-updates]
 
-When updating documents that contain `semantic_text` fields, it’s important to understand how inference is triggered:
+When updating documents that contain `semantic_text` fields, it's important to understand how inference is triggered:
 
-* **Full document updates**
-  When you perform a full document update, **all `semantic_text` fields will re-run inference** even if their values did not change. This ensures that the embeddings are always consistent with the current document state but can increase ingestion costs.
+Full document updates
+:   Full document updates re-run inference on all `semantic_text` fields, even if their values did not change. This ensures that embeddings remain consistent with the current document state but can increase ingestion costs.
 
-* **Partial updates using the Bulk API**
-  Partial updates that **omit `semantic_text` fields** and are submitted through the [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) will **reuse the existing embeddings** stored in the index. In this case, inference is **not triggered** for fields that were not updated, which can significantly reduce processing time and cost.
+Partial updates using the Bulk API
+:   Partial updates submitted through the [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) reuse existing embeddings when you omit `semantic_text` fields. Inference does not run for omitted fields, which can significantly reduce processing time and cost.
 
-* **Partial updates using the Update API**
-  When using the [Update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) with a `doc` object that **omits `semantic_text` fields**, inference **will still run** on all `semantic_text` fields. This means that even if the field values are not changed, embeddings will be re-generated.
+Partial updates using the Update API
+:   Partial updates submitted through the [Update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) re-run inference on all `semantic_text` fields, even when you omit them from the `doc` object. Embeddings are re-generated regardless of whether field values changed.
 
-If you want to avoid unnecessary inference and keep existing embeddings:
+To preserve existing embeddings and avoid unnecessary inference costs:
 
- * Use **partial updates through the Bulk API**.
+ * Use partial updates with the Bulk API.
  * Omit any `semantic_text` fields that did not change from the `doc` object in your request.
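
As a rough illustration of the two steps above, the following Python sketch builds the newline-delimited body of a Bulk API partial update and drops unchanged `semantic_text` fields from the `doc` object. The helper name `bulk_partial_update`, the field names, and the document ID are hypothetical; sending the body to the `_bulk` endpoint is left to whichever client you use.

```python
import json


def bulk_partial_update(index: str, doc_id: str, doc: dict, unchanged_semantic_fields: set) -> str:
    """Build an NDJSON Bulk API body for one partial update.

    Unchanged semantic_text fields are omitted from the `doc` object so that
    the existing embeddings can be reused instead of re-running inference.
    """
    # Keep only the fields that actually need to be written.
    partial_doc = {
        name: value
        for name, value in doc.items()
        if name not in unchanged_semantic_fields
    }
    action = {"update": {"_index": index, "_id": doc_id}}
    # The Bulk API expects one JSON object per line and a trailing newline.
    return json.dumps(action) + "\n" + json.dumps({"doc": partial_doc}) + "\n"


body = bulk_partial_update(
    "test-index",
    "1",
    {"title": "Updated title", "semantic_field": "unchanged text"},
    unchanged_semantic_fields={"semantic_field"},
)
```

Here `body` contains an `update` action line followed by a `doc` line that carries only `title`.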
 
 ### Scripted updates
 
@@ -113,7 +113,9 @@ First, let’s examine how to combine two different types of queries: a `kNN` qu
 While these queries may produce scores in different ranges, we can use Reciprocal Rank Fusion (`rrf`) to combine the results and generate a merged final result list.
 
 To implement this in the retriever framework, we start with the top-level element: our `rrf` retriever.
-This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever. Our query structure would look like this:
+This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever.
+We can specify weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:
 
 ```console
 GET /retrievers_example/_search
@@ -197,6 +199,57 @@ This returns the following response based on the final rrf score for each result
 ::::
 
 
+### Using the expanded format with weights 
+```{applies_to}
+stack: ga 9.2
+```
+
+The same query can be written using the expanded format, which allows you to specify custom weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:
+
+```console
+GET /retrievers_example/_search
+{
+    "retriever": {
+        "rrf": {
+            "retrievers": [
+                {
+                    "retriever": {
+                        "standard": {
+                            "query": {
+                                "query_string": {
+                                    "query": "(information retrieval) OR (artificial intelligence)",
+                                    "default_field": "text"
+                                }
+                            }
+                        }
+                    },
+                    "weight": 2.0
+                },
+                {
+                    "retriever": {
+                        "knn": {
+                            "field": "vector",
+                            "query_vector": [
+                                0.23,
+                                0.67,
+                                0.89
+                            ],
+                            "k": 3,
+                            "num_candidates": 5
+                        }
+                    },
+                    "weight": 1.0
+                }
+            ],
+            "rank_window_size": 10,
+            "rank_constant": 1
+        }
+    },
+    "_source": false
+}
+```
+
 
 ## Example: Hybrid search with linear retriever [retrievers-examples-linear-retriever]
 
 
@@ -6,7 +6,7 @@ applies_to:
 
 # RRF retriever [rrf-retriever]
 
-An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers.
+An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, combining two or more child retrievers.
 Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.
 
 
@@ -32,7 +32,13 @@ Combining `query` and `retrievers` is not supported.
 :   (Optional, array of retriever objects)
 
     A list of child retrievers to specify which sets of returned top documents will have the RRF formula applied to them.
-    Each child retriever carries an equal weight as part of the RRF formula. Two or more child retrievers are required.
+    Each retriever can optionally include a weight to adjust its influence on the final ranking. {applies_to}`stack: ga 9.2`
+    
+    When weights are specified, the final RRF score is calculated as:
+    ```
+    rrf_score = weight_1 × rrf_score_1 + weight_2 × rrf_score_2 + ... + weight_n × rrf_score_n
+    ```
+    where `rrf_score_i` is the RRF score that retriever `i` assigns to the document, and `weight_i` is the weight for that retriever.
 
 `rank_constant`
 :   (Optional, integer)
@@ -53,6 +59,82 @@ Combining `query` and `retrievers` is not supported.
 
     Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.
 
+Each entry in the `retrievers` array can be specified using the direct format or the wrapped format. {applies_to}`stack: ga 9.2`
+
+**Direct format** (default weight of `1.0`):
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "standard": {
+          "query": {
+            "multi_match": {
+              "query": "search text",
+              "fields": ["field1", "field2"]
+            }
+          }
+        }
+      },
+      {
+        "knn": {
+          "field": "vector",
+          "query_vector": [1, 2, 3],
+          "k": 10,
+          "num_candidates": 50
+        }
+      }
+    ]
+  }
+}
+```
+
+**Wrapped format with custom weights** {applies_to}`stack: ga 9.2`:
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "retriever": {
+          "standard": {
+            "query": {
+              "multi_match": {
+                "query": "search text",
+                "fields": ["field1", "field2"]
+              }
+            }
+          }
+        },
+        "weight": 2.0
+      },
+      {
+        "retriever": {
+          "knn": {
+            "field": "vector",
+            "query_vector": [1, 2, 3],
+            "k": 10,
+            "num_candidates": 50
+          }
+        },
+        "weight": 1.0
+      }
+    ]
+  }
+}
+```
+
+In the wrapped format:
+
+`retriever`
+:   (Required, a retriever object)
+
+    Specifies a child retriever. Any valid retriever type can be used (for example, `standard`, `knn`, or `text_similarity_reranker`).
+
+`weight` {applies_to}`stack: ga 9.2`
+:   (Optional, float)
+
+    The weight by which each score of this retriever's top documents is multiplied in the RRF formula. Higher values increase this retriever's influence on the final ranking. Must be non-negative. Defaults to `1.0`.
+
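
To make the effect of `weight` concrete, here is a minimal Python sketch of weighted RRF scoring. It assumes the standard RRF contribution of `1 / (rank_constant + rank)` per retriever, with ranks starting at 1, and treats documents missing from a retriever's results as contributing nothing. The function name `weighted_rrf` and its input shape are hypothetical, and the sketch is not the Elasticsearch implementation.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, rank_constant=60):
    """Combine ranked lists of document IDs using weighted RRF.

    rankings: one list of document IDs per retriever, best match first.
    weights:  one non-negative weight per retriever.
    """
    if len(rankings) != len(weights):
        raise ValueError("one weight is needed per retriever")
    if any(weight < 0 for weight in weights):
        raise ValueError("weights must be non-negative")

    scores = defaultdict(float)
    for ranked_ids, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Weighted contribution of this retriever for this document.
            scores[doc_id] += weight * (1.0 / (rank_constant + rank))

    # Highest combined score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Two retrievers; the first one carries twice the weight of the second.
results = weighted_rrf([["a", "b"], ["b", "c"]], weights=[2.0, 1.0], rank_constant=1)
```

In this toy input, document `b` appears in both lists and ends up ranked first, followed by `a` and then `c`.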
 ## Example: Hybrid search [rrf-retriever-example-hybrid]
 
 A simple hybrid search example (lexical search + dense vector search) combining a `standard` retriever with a `knn` retriever using RRF:
@@ -182,6 +264,75 @@ GET /restaurants/_search
 5. The rank constant for the RRF retriever.
 6. The rank window size for the RRF retriever.
 
+## Example: Weighted hybrid search [rrf-retriever-example-weighted]
+
+{applies_to}`stack: ga 9.2`
+
+This example demonstrates how to use weights to adjust the influence of different retrievers in the RRF ranking.
+In this case, we're giving the `standard` retriever more importance (weight 2.0) compared to the `knn` retriever (weight 1.0):
+
+```console
+GET /restaurants/_search
+{
+  "retriever": {
+    "rrf": {
+      "retrievers": [
+        {
+          "retriever": { <1>
+            "standard": {
+              "query": {
+                "multi_match": {
+                  "query": "Austria",
+                  "fields": ["city", "region"]
+                }
+              }
+            }
+          },
+          "weight": 2.0 <2>
+        },
+        {
+          "retriever": { <3>
+            "knn": {
+              "field": "vector",
+              "query_vector": [10, 22, 77],
+              "k": 10,
+              "num_candidates": 10
+            }
+          },
+          "weight": 1.0 <4>
+        }
+      ],
+      "rank_constant": 60,
+      "rank_window_size": 50
+    }
+  }
+}
+```
+% TEST[continued]
+
+1. The first retriever, specified in the wrapped format.
+2. This retriever has a weight of 2.0, giving it twice the influence of the kNN retriever.
+3. The second retriever, specified in the wrapped format.
+4. This retriever has a weight of 1.0 (the default weight).
+
+::::{note}
+You can mix the direct and wrapped formats in the same query.
+The direct format (without an explicit `retriever` wrapper) uses the default weight of `1.0`:
+
+```json
+{
+  "rrf": {
+    "retrievers": [
+      { "standard": { "query": {...} } },
+      { "retriever": { "knn": {...} }, "weight": 2.0 }
+    ]
+  }
+}
+```
+
+In this example, the `standard` retriever uses weight `1.0` (default), while the `knn` retriever uses weight `2.0`.
+::::
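
The mixing rule described in the note can be sketched in Python as a normalization step that turns every entry into a `(retriever, weight)` pair. The helper name `normalize_rrf_entries` is hypothetical, and the sketch assumes that a top-level `retriever` key is what distinguishes the wrapped format from the direct one; it is not how Elasticsearch parses the request.

```python
def normalize_rrf_entries(entries):
    """Return (retriever, weight) pairs for a mixed `retrievers` array."""
    normalized = []
    for entry in entries:
        if "retriever" in entry:
            # Wrapped format: explicit retriever object and optional weight.
            retriever = entry["retriever"]
            weight = float(entry.get("weight", 1.0))
        else:
            # Direct format: the entry is the retriever; the weight defaults to 1.0.
            retriever = entry
            weight = 1.0
        if weight < 0:
            raise ValueError("weight must be non-negative")
        normalized.append((retriever, weight))
    return normalized


pairs = normalize_rrf_entries([
    {"standard": {"query": {"match_all": {}}}},
    {"retriever": {"knn": {"field": "vector"}}, "weight": 2.0},
])
```

After normalization, the `standard` entry carries weight `1.0` and the `knn` entry carries weight `2.0`.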
+
 ## Example: Hybrid search with sparse vectors [rrf-retriever-example-hybrid-sparse]
 
 A more complex hybrid search example (lexical search + ELSER sparse vector search + dense vector search) using RRF: