Docs: Add weight parameter documentation for Weighted RRF retriever (elastic#136698) (elastic#137380)

mridula-s109 · kderusso · web-flow · commit 3ab5519666ab · 2025-10-30T14:39:50.000+01:00
* Added weighted rrf examples to the doc

* Cleaned up doc

* Modified information

* Fixed discrepancies

* made it more explicit

* Update docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md



* Resolved PR comments

* Resolved comments

* Resolved comments

* Update docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md



* addressed comments

* changes

* Update rrf-retriever.md

---------

Co-authored-by: Kathleen DeRusso <kathleen.derusso@elastic.co>
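
Note on the weighted formula documented in this change: the sketch below is one way to read `rrf_score = weight_1 × rrf_score_1 + ... + weight_n × rrf_score_n`. It is not Elasticsearch code. It assumes each retriever contributes `1 / (rank_constant + rank)` for a document at 1-based position `rank`; the function name `weighted_rrf`, the document IDs, and the two rankings are hypothetical. The weights (2.0 and 1.0) and `rank_constant` of 1 mirror the example added to retrievers-examples.md.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, rank_constant=60):
    """Fuse ranked lists of document IDs with weighted reciprocal rank fusion.

    ranked_lists: one list of doc IDs per retriever, best match first.
    weights: one non-negative weight per retriever.
    """
    if len(ranked_lists) != len(weights):
        raise ValueError("need exactly one weight per ranked list")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    scores = defaultdict(float)
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(docs, start=1):
            # Assumed per-retriever contribution: 1 / (rank_constant + rank),
            # scaled by that retriever's weight.
            scores[doc_id] += weight / (rank_constant + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


standard_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical lexical ranking
knn_hits = ["doc_c", "doc_a", "doc_d"]       # hypothetical vector ranking
fused = weighted_rrf([standard_hits, knn_hits], weights=[2.0, 1.0], rank_constant=1)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

With these made-up rankings, `doc_a` ends up first: its top lexical rank counts double, which outweighs the top kNN rank of `doc_c`.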
diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md b/docs/reference/elasticsearch/rest-apis/retrievers/retrievers-examples.md
@@ -113,7 +113,9 @@ First, let’s examine how to combine two different types of queries: a `kNN` qu
 While these queries may produce scores in different ranges, we can use Reciprocal Rank Fusion (`rrf`) to combine the results and generate a merged final result list.
 
 To implement this in the retriever framework, we start with the top-level element: our `rrf` retriever.
-This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever. Our query structure would look like this:
+This retriever operates on top of two other retrievers: a `knn` retriever and a `standard` retriever.
+We can specify weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:
 
 ```console
 GET /retrievers_example/_search
@@ -195,6 +197,54 @@ This returns the following response based on the final rrf score for each result
 ::::
 
 
+### Using the expanded format with weights {applies_to}`stack: ga 9.2`
+
+The same query can be written using the expanded format, which allows you to specify custom weights to adjust the influence of each retriever on the final ranking.
+In this example, we're giving the `standard` retriever twice the influence of the `knn` retriever:
+
+```console
+GET /retrievers_example/_search
+{
+    "retriever": {
+        "rrf": {
+            "retrievers": [
+                {
+                    "retriever": {
+                        "standard": {
+                            "query": {
+                                "query_string": {
+                                    "query": "(information retrieval) OR (artificial intelligence)",
+                                    "default_field": "text"
+                                }
+                            }
+                        }
+                    },
+                    "weight": 2.0
+                },
+                {
+                    "retriever": {
+                        "knn": {
+                            "field": "vector",
+                            "query_vector": [
+                                0.23,
+                                0.67,
+                                0.89
+                            ],
+                            "k": 3,
+                            "num_candidates": 5
+                        }
+                    },
+                    "weight": 1.0
+                }
+            ],
+            "rank_window_size": 10,
+            "rank_constant": 1
+        }
+    },
+    "_source": false
+}
+```
+
 
 ## Example: Hybrid search with linear retriever [retrievers-examples-linear-retriever]
 
diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/rrf-retriever.md b/docs/reference/elasticsearch/rest-apis/retrievers/rrf-retriever.md
@@ -6,7 +6,7 @@ applies_to:
 
 # RRF retriever [rrf-retriever]
 
-An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers.
+An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, combining two or more child retrievers.
 Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.
 
 
@@ -32,7 +32,13 @@ Combining `query` and `retrievers` is not supported.
 :   (Optional, array of retriever objects)
 
     A list of child retrievers to specify which sets of returned top documents will have the RRF formula applied to them.
-    Each child retriever carries an equal weight as part of the RRF formula. Two or more child retrievers are required.
+    Each retriever can optionally include a weight to adjust its influence on the final ranking. {applies_to}`stack: ga 9.2`
+    
+    When weights are specified, the final RRF score is calculated as:
+    ```
+    rrf_score = weight_1 × rrf_score_1 + weight_2 × rrf_score_2 + ... + weight_n × rrf_score_n
+    ```
+    where `rrf_score_i` is the RRF score that retriever `i` assigns to the document, and `weight_i` is the weight for that retriever.
 
 `rank_constant`
 :   (Optional, integer)
@@ -53,6 +59,82 @@ Combining `query` and `retrievers` is not supported.
 
     Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.
 
+Each entry in the `retrievers` array can be specified using the direct format or the wrapped format. {applies_to}`stack: ga 9.2`
+
+**Direct format** (default weight of `1.0`):
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "standard": {
+          "query": {
+            "multi_match": {
+              "query": "search text",
+              "fields": ["field1", "field2"]
+            }
+          }
+        }
+      },
+      {
+        "knn": {
+          "field": "vector",
+          "query_vector": [1, 2, 3],
+          "k": 10,
+          "num_candidates": 50
+        }
+      }
+    ]
+  }
+}
+```
+
+**Wrapped format with custom weights** {applies_to}`stack: ga 9.2`:
+```json
+{
+  "rrf": {
+    "retrievers": [
+      {
+        "retriever": {
+          "standard": {
+            "query": {
+              "multi_match": {
+                "query": "search text",
+                "fields": ["field1", "field2"]
+              }
+            }
+          }
+        },
+        "weight": 2.0
+      },
+      {
+        "retriever": {
+          "knn": {
+            "field": "vector",
+            "query_vector": [1, 2, 3],
+            "k": 10,
+            "num_candidates": 50
+          }
+        },
+        "weight": 1.0
+      }
+    ]
+  }
+}
+```
+
+In the wrapped format:
+
+`retriever`
+:   (Required, a retriever object)
+
+    Specifies a child retriever. Any valid retriever type can be used (for example, `standard`, `knn`, or `text_similarity_reranker`).
+
+`weight` {applies_to}`stack: ga 9.2`
+:   (Optional, float)
+
+    The weight by which each score of this retriever's top docs is multiplied in the RRF formula. Higher values increase this retriever's influence on the final ranking. Must be non-negative. Defaults to `1.0`.
+
 ## Example: Hybrid search [rrf-retriever-example-hybrid]
 
 A simple hybrid search example (lexical search + dense vector search) combining a `standard` retriever with a `knn` retriever using RRF:
@@ -99,6 +181,75 @@ GET /restaurants/_search
 5. The rank constant for the RRF retriever.
 6. The rank window size for the RRF retriever.
 
+## Example: Weighted hybrid search [rrf-retriever-example-weighted]
+
+{applies_to}`stack: ga 9.2`
+
+This example demonstrates how to use weights to adjust the influence of different retrievers in the RRF ranking.
+In this case, we're giving the `standard` retriever more importance (weight 2.0) compared to the `knn` retriever (weight 1.0):
+
+```console
+GET /restaurants/_search
+{
+  "retriever": {
+    "rrf": {
+      "retrievers": [
+        {
+          "retriever": { <1>
+            "standard": {
+              "query": {
+                "multi_match": {
+                  "query": "Austria",
+                  "fields": ["city", "region"]
+                }
+              }
+            }
+          },
+          "weight": 2.0 <2>
+        },
+        {
+          "retriever": { <3>
+            "knn": {
+              "field": "vector",
+              "query_vector": [10, 22, 77],
+              "k": 10,
+              "num_candidates": 10
+            }
+          },
+          "weight": 1.0 <4>
+        }
+      ],
+      "rank_constant": 60,
+      "rank_window_size": 50
+    }
+  }
+}
+```
+% TEST[continued]
+
+1. The first retriever, specified in the wrapped format.
+2. This retriever has a weight of 2.0, giving it twice the influence of the kNN retriever.
+3. The second retriever, specified in the wrapped format.
+4. This retriever has a weight of 1.0 (the default weight).
+
+::::{note}
+You can mix the wrapped and direct formats in the same query.
+The direct format (without an explicit `retriever` wrapper) uses the default weight of `1.0`:
+
+```json
+{
+  "rrf": {
+    "retrievers": [
+      { "standard": { "query": {...} } },
+      { "retriever": { "knn": {...} }, "weight": 2.0 }
+    ]
+  }
+}
+```
+
+In this example, the `standard` retriever uses weight `1.0` (default), while the `knn` retriever uses weight `2.0`.
+::::
+
 ## Example: Hybrid search with sparse vectors [rrf-retriever-example-hybrid-sparse]
 
 A more complex hybrid search example (lexical search + ELSER sparse vector search + dense vector search) using RRF: