elastic · markjhoy · Nov 17, 2025 · Oct 2, 2025 · Oct 3, 2025 · Oct 3, 2025
diff --git a/docs/changelog/135880.yaml b/docs/changelog/135880.yaml
@@ -0,0 +1,5 @@
+pr: 135873
+summary: Adds retriever for result diversification using MMR
+area: Search
+type: enhancement
+issues: [ ]
diff --git a/docs/reference/elasticsearch/rest-apis/retrievers.md b/docs/reference/elasticsearch/rest-apis/retrievers.md
@@ -8,87 +8,138 @@ applies_to:
 
 # Retrievers [retriever]
 
-A retriever is a specification to describe top documents returned from a search. A retriever replaces other elements of the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) that also return top documents such as [`query`](/reference/query-languages/querydsl.md) and [`knn`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-api-knn). A retriever may have child retrievers where a retriever with two or more children is considered a compound retriever. This allows for complex behavior to be depicted in a tree-like structure, called the retriever tree, which clarifies the order of operations that occur during a search.
+A retriever is a specification to describe top documents returned from a search.
+A retriever replaces other elements of
+the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search)
+that also return top documents such
+as [`query`](/reference/query-languages/querydsl.md)
+and [`knn`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-api-knn).
+A retriever may have child retrievers where a retriever with two or more
+children is considered a compound retriever. This allows for complex behavior to
+be depicted in a tree-like structure, called the retriever tree, which clarifies
+the order of operations that occur during a search.
 
 ::::{tip}
-Refer to [*Retrievers*](docs-content://solutions/search/retrievers-overview.md) for a high level overview of the retrievers abstraction. Refer to [Retrievers examples](retrievers/retrievers-examples.md) for additional examples.
+Refer to [*Retrievers*](docs-content://solutions/search/retrievers-overview.md)
+for a high level overview of the retrievers abstraction. Refer
+to [Retrievers examples](retrievers/retrievers-examples.md) for additional
+examples.
 
 ::::
 
 The following retrievers are available:
 
+`diversify`
+:   The [diversify](retrievers/diversify-retriever.md) retriever pares down the
+result set from an inner retriever so that the final results are more diverse
+and less similar to one another.
+
 `knn`
-:   The [knn](retrievers/knn-retriever.md) retriever replaces the functionality of a [knn search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-api-knn).
+:   The [knn](retrievers/knn-retriever.md) retriever replaces the functionality
+of
+a [knn search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-api-knn).
 
 `linear`
-:   The [linear](retrievers/linear-retriever.md) retriever linearly combines the scores of other retrievers for the top documents.
+:   The [linear](retrievers/linear-retriever.md) retriever linearly combines the
+scores of other retrievers for the top documents.
 
 `pinned` {applies_to}`stack: GA 9.1`
-:   The [pinned](retrievers/pinned-retriever.md) retriever always places specified documents at the top of the results, with the remaining hits provided by a secondary retriever.
+:   The [pinned](retrievers/pinned-retriever.md) retriever always places
+specified documents at the top of the results, with the remaining hits provided
+by a secondary retriever.
 
 `rescorer`
-:   The [rescorer](retrievers/rescorer-retriever.md) retriever replaces the functionality of the [query rescorer](/reference/elasticsearch/rest-apis/rescore-search-results.md#rescore).
+:   The [rescorer](retrievers/rescorer-retriever.md) retriever replaces the
+functionality of
+the [query rescorer](/reference/elasticsearch/rest-apis/rescore-search-results.md#rescore).
 
 `rrf`
-:   The [rrf](retrievers/rrf-retriever.md) retriever produces top documents from [reciprocal rank fusion (RRF)](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md).
+:   The [rrf](retrievers/rrf-retriever.md) retriever produces top documents
+from [reciprocal rank fusion (RRF)](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md).
 
 `rule`
-:   The [rule](retrievers/rule-retriever.md) retriever applies contextual [Searching with query rules](/reference/elasticsearch/rest-apis/searching-with-query-rules.md#query-rules) to pin or exclude documents for specific queries.
+:   The [rule](retrievers/rule-retriever.md) retriever applies
+contextual [Searching with query rules](/reference/elasticsearch/rest-apis/searching-with-query-rules.md#query-rules)
+to pin or exclude documents for specific queries.
 
 `standard`
-:   The [standard](retrievers/standard-retriever.md) retriever replaces the functionality of a traditional [query](/reference/query-languages/querydsl.md).
+:   The [standard](retrievers/standard-retriever.md) retriever replaces the
+functionality of a traditional [query](/reference/query-languages/querydsl.md).
 
 `text_similarity_reranker`
-:   The [text_similarity_reranker](retrievers/text-similarity-reranker-retriever.md) retriever enhances search results by re-ranking documents based on semantic similarity to a specified inference text, using a machine learning model.
+:   The [text_similarity_reranker](retrievers/text-similarity-reranker-retriever.md)
+retriever enhances search results by re-ranking documents based on
+semantic similarity to a specified inference text, using a machine
+learning model.
 
 ## Common usage guidelines [retriever-common-parameters]
 
-
 ### Using `from` and `size` with a retriever tree [retriever-size-pagination]
 
-The [`from`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-from-param) and [`size`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-size-param) parameters are provided globally as part of the general [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search). They are applied to all retrievers in a retriever tree, unless a specific retriever overrides the `size` parameter using a different parameter such as `rank_window_size`. Though, the final search hits are always limited to `size`.
-
+The [`from`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-from-param)
+and [`size`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-size-param)
+parameters are provided globally as part of the
+general [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search).
+They are applied to all retrievers in a retriever tree, unless a specific
+retriever overrides the `size` parameter using a different parameter such
+as `rank_window_size`. However, the final search hits are always limited
+to `size`.
 
 ### Using aggregations with a retriever tree [retriever-aggregations]
 
-[Aggregations](/reference/aggregations/index.md) are globally specified as part of a search request. The query used for an aggregation is the combination of all leaf retrievers as `should` clauses in a [boolean query](/reference/query-languages/query-dsl/query-dsl-bool-query.md).
-
+[Aggregations](/reference/aggregations/index.md) are globally specified as part
+of a search request. The query used for an aggregation is the combination of all
+leaf retrievers as `should` clauses in
+a [boolean query](/reference/query-languages/query-dsl/query-dsl-bool-query.md).
 
 ### Restrictions on search parameters when specifying a retriever [retriever-restrictions]
 
-When a retriever is specified as part of a search, the following elements are not allowed at the top-level:
+When a retriever is specified as part of a search, the following elements are
+not allowed at the top-level:
 
 * [`query`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#request-body-search-query)
 * [`knn`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-api-knn)
 * [`search_after`](/reference/elasticsearch/rest-apis/paginate-search-results.md#search-after)
 * [`terminate_after`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#request-body-search-terminate-after)
 * [`sort`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-sort-param)
-* [`rescore`](/reference/elasticsearch/rest-apis/rescore-search-results.md#rescore) use a [rescorer retriever](retrievers/rescorer-retriever.md) instead
-
+* [`rescore`](/reference/elasticsearch/rest-apis/rescore-search-results.md#rescore)
+  (use a [rescorer retriever](retrievers/rescorer-retriever.md) instead)
 
 ## Multi-field query format [multi-field-query-format]
+
 ```yaml {applies_to}
 stack: ga 9.1
 ```
 
-The [`linear`](retrievers/linear-retriever.md) and [`rrf`](retrievers/rrf-retriever.md) retrievers support a multi-field query format that provides a simplified way to define searches across multiple fields without explicitly specifying inner retrievers.
-This format automatically generates appropriate inner retrievers based on the field types and query parameters.
-This is a great way to search an index, knowing little to nothing about its schema, while also handling normalization across lexical and semantic matches.
+The [`linear`](retrievers/linear-retriever.md)
+and [`rrf`](retrievers/rrf-retriever.md) retrievers support a multi-field query
+format that provides a simplified way to define searches across multiple fields
+without explicitly specifying inner retrievers.
+This format automatically generates appropriate inner retrievers based on the
+field types and query parameters.
+This is a great way to search an index, knowing little to nothing about its
+schema, while also handling normalization across lexical and semantic matches.
 
 ### Field grouping [multi-field-field-grouping]
 
 The multi-field query format groups queried fields into two categories:
 
-- **Lexical fields**: fields that support term queries, such as `keyword` and `text` fields.
-- **Semantic fields**: [`semantic_text` fields](/reference/elasticsearch/mapping-reference/semantic-text.md).
+- **Lexical fields**: fields that support term queries, such as `keyword`
+  and `text` fields.
+- **Semantic fields**:
+  [`semantic_text` fields](/reference/elasticsearch/mapping-reference/semantic-text.md).
 
-Each field group is queried separately and the scores/ranks are normalized such that each contributes 50% to the final score/rank.
+Each field group is queried separately and the scores/ranks are normalized such
+that each contributes 50% to the final score/rank.
 This balances the importance of lexical and semantic fields.
-Most indices contain more lexical than semantic fields, and without this grouping the results would often bias towards lexical field matches.
+Most indices contain more lexical than semantic fields, and without this
+grouping the results would often bias towards lexical field matches.
 
 ::::{warning}
-In the `linear` retriever, this grouping relies on using a normalizer other than `none` (i.e., `minmax` or `l2_norm`).
-If you use the `none` normalizer, the scores across field groups will not be normalized and the results may be biased towards lexical field matches.
+In the `linear` retriever, this grouping relies on using a normalizer other
+than `none` (i.e., `minmax` or `l2_norm`).
+If you use the `none` normalizer, the scores across field groups will not be
+normalized and the results may be biased towards lexical field matches.
 ::::
 
 ### Linear retriever field boosting [multi-field-field-boosting]
@@ -199,7 +250,8 @@ GET books/_search
 2. 2x weight
 3. 1x weight (default)
 
-Due to how the [field group scores](#multi-field-field-grouping) are normalized, per-field boosts have no effect on the range of the final score.
+Due to how the [field group scores](#multi-field-field-grouping) are normalized,
+per-field boosts have no effect on the range of the final score.
 Instead, they affect the importance of the field's score within its group.
 
 For example, if the schema looks like:
@@ -227,6 +279,7 @@ PUT /books
   }
 }
 ```
+
 % TEST[continued]
 
 And we run this query:
@@ -248,16 +301,18 @@ GET books/_search
   }
 }
 ```
+
 % TEST[continued]
 
 The score breakdown would be:
 
 * Lexical fields (50% of score):
-  * `title`: 50% of lexical fields group score, 25% of final score
-  * `description`: 50% of lexical fields group score, 25% of final score
+    * `title`: 50% of lexical fields group score, 25% of final score
+    * `description`: 50% of lexical fields group score, 25% of final score
 * Semantic fields (50% of score):
-  * `title_semantic`: 50% of semantic fields group score, 25% of final score
-  * `description_semantic`: 50% of semantic fields group score, 25% of final score
+    * `title_semantic`: 50% of semantic fields group score, 25% of final score
+    * `description_semantic`: 50% of semantic fields group score, 25% of final
+      score
 
 If we apply per-field boosts like so:
 
@@ -278,6 +333,7 @@ GET books/_search
   }
 }
 ```
+
 % TEST[continued]
 
 The score breakdown would change to:
@@ -287,7 +343,8 @@ The score breakdown would change to:
     * `description`: 40% of lexical fields group score, 20% of final score
 * Semantic fields (50% of score):
     * `title_semantic`: 33% of semantic fields group score, 16.5% of final score
-    * `description_semantic`: 66% of semantic fields group score, 33% of final score
+    * `description_semantic`: 66% of semantic fields group score, 33% of final
+      score
 
 ### Wildcard field patterns [multi-field-wildcard-field-patterns]
 
@@ -307,20 +364,24 @@ GET books/_search
   }
 }
 ```
+
 % TEST[continued]
 
 1. Match fields that start with `title`
 2. Match fields that end with `_text`
 
-Note, however, that wildcard field patterns will only resolve to fields that either:
+Note, however, that wildcard field patterns will only resolve to fields that
+either:
 
 - Support term queries, such as `keyword` and `text` fields
 - Are `semantic_text` fields
 
 ### Limitations
 
-- **Single index**: Until 9.2, multi-field queries only work with single index searches.
-- **CCS (Cross Cluster Search)**: Multi-field queries do not support remote cluster searches
+- **Single index**: Until 9.2, multi-field queries only work with single index
+  searches.
+- **CCS (Cross Cluster Search)**: Multi-field queries do not support remote
+  cluster searches.
 
 ### Examples
 

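The score breakdowns listed under "Linear retriever field boosting" in `retrievers.md` follow a simple proportion: each field group gets an equal slice of the final score, and a field's boost sets its share inside its group. The sketch below reproduces that arithmetic only. It is not Elasticsearch's scoring code, the function name `final_score_shares` is hypothetical, and the boost values in the second call are assumptions chosen to match the 33% / 66% split shown for the semantic group.

```python
def final_score_shares(groups):
    """Map each field to its share of the final score.

    groups: {group_name: {field_name: boost}}.
    Every group receives an equal slice of the final score, and each field
    receives a part of that slice proportional to its boost.
    """
    group_weight = 1.0 / len(groups)
    shares = {}
    for fields in groups.values():
        total_boost = sum(fields.values())
        for field, boost in fields.items():
            shares[field] = group_weight * boost / total_boost
    return shares


# No per-field boosts: every field ends up with 25% of the final score.
unboosted = final_score_shares({
    "lexical": {"title": 1.0, "description": 1.0},
    "semantic": {"title_semantic": 1.0, "description_semantic": 1.0},
})

# Assumed boosts: doubling description_semantic moves the semantic group to a
# one-third / two-thirds split, while the group as a whole stays at 50%.
boosted = final_score_shares({
    "lexical": {"title": 1.0, "description": 1.0},
    "semantic": {"title_semantic": 1.0, "description_semantic": 2.0},
})
```

In both calls the shares add up to 1.0, which mirrors the statement that per-field boosts do not change the range of the final score.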
diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/diversify-retriever.md b/docs/reference/elasticsearch/rest-apis/retrievers/diversify-retriever.md
@@ -0,0 +1,84 @@
+---
+applies_to:
+  stack: all
+  serverless:
+---
+
+# Diversify retriever [diversify-retriever]
+
+The diversify retriever pares down the results from another retriever so that
+the top-N results are diverse.
+This is particularly useful when you need results that are relevant to your
+query but not similar to each other. For example, you can use it to provide
+more diverse context to a RAG prompt.
+
+Using MMR (Maximal Marginal Relevance) diversification, the retriever discards
+any inner retriever results that are too similar to each other, based on
+the `field` parameter and in reference to any `query_vector` that is provided.
+Note that the order of the results from the inner retriever is not changed.
+
+## Parameters [diversify-retriever-parameters]
+
+`type`
+:   (Required, string)
+
+    The type of diversification to use. Currently only `mmr` (maximal marginal relevance) is supported.
+
+`field`
+:   (Required, string)
+
+    The name of the field whose values are used for the diversification process.
+    The field must be a `dense_vector` type.
+
+`num_candidates`
+:   (Required, integer)
+
+    The maximum number of top-N results to return.
+
+`retriever`
+:   (Required, retriever object)
+
+    A single child retriever whose returned top documents have the diversification applied to them.
+    Note that although some of the inner retriever's results may be removed, the rank and order will not change.
+
+`query_vector`
+:   (Optional, array of `float` or `byte`)
+
+    Query vector. Must have the same number of dimensions as the vector field you are searching against.
+    Must be either an array of floats or a hex-encoded byte vector.
+
+`lambda`
+:   (Required if `mmr` is used, float)
+
+    A number between 0.0 and 1.0 specifying how much weight for diversification should be given to the query vector as opposed to the amount of weight given to the field values.
+
+## Example
+
+The following example uses an MMR diversification retriever to diversify and
+return the top three results from the inner standard retriever.
+The lambda is set to 0.7, which favors the weight from the comparisons of the
+vectors in `my_dense_vector_field` over the query vector for determining the
+differences between the documents.
+
+```console
+GET my_index/_search
+{
+  "retriever": {
+    "diversify": {
+      "type": "mmr",
+      "field": "my_dense_vector_field",
+      "lambda": 0.7,
+      "num_candidates": 3,
+      "query_vector": [0.1, 0.2, 0.3],
+      "retriever": {
+        "standard": {
+          "query": {
+            "match": {
+              "title": "elasticsearch"
+            }
+          }
+        }
+      }
+    }
+  }
+}
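For readers unfamiliar with MMR, the selection idea described in `diversify-retriever.md` can be sketched in a few lines. This is a minimal illustration of the textbook greedy formulation, in which `lam` weights similarity to the query and `1 - lam` penalizes similarity to documents already selected. It is not the implementation in this change: the helper names `cosine` and `mmr_select` are hypothetical, cosine similarity is an assumption, and the retriever's own interpretation of `lambda` is the one stated in its parameter documentation and may differ from this convention.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(doc_vectors, query_vector, lam, k):
    """Pick up to k document indices with greedy MMR.

    The indices are returned in their original rank order, mirroring the note
    that the diversify retriever does not reorder the inner results.
    """
    selected = []
    remaining = list(range(len(doc_vectors)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(doc_vectors[i], query_vector)
            # Similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vectors[i], doc_vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)


docs = [
    [1.0, 0.0],   # rank 0
    [0.99, 0.1],  # rank 1: near-duplicate of rank 0
    [0.0, 1.0],   # rank 2: points in a different direction
]
query = [1.0, 0.2]

diverse = mmr_select(docs, query, lam=0.5, k=2)         # keeps one of the near-duplicates
relevance_only = mmr_select(docs, query, lam=1.0, k=2)  # ignores redundancy entirely
```

With `lam=0.5` the sketch keeps ranks 1 and 2, dropping one of the two near-duplicates. With `lam=1.0` the redundancy term vanishes and it keeps ranks 0 and 1, the two documents closest to the query.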
diff --git a/docs/reference/elasticsearch/toc.yml b/docs/reference/elasticsearch/toc.yml
@@ -114,6 +114,7 @@ toc:
                 - file: rest-apis/retrievers/rule-retriever.md
                 - file: rest-apis/retrievers/standard-retriever.md
                 - file: rest-apis/retrievers/text-similarity-reranker-retriever.md
+                - file: rest-apis/retrievers/diversify-retriever.md
                 - file: rest-apis/retrievers/retrievers-examples.md
             - file: rest-apis/search-multiple-data-streams-indices.md
             - file: rest-apis/search-profile.md

diff --git a/server/src/main/java/org/elasticsearch/search/SearchModule.java b/server/src/main/java/org/elasticsearch/search/SearchModule.java
@@ -208,6 +208,7 @@
 import org.elasticsearch.search.aggregations.pipeline.StatsBucketPipelineAggregationBuilder;
 import org.elasticsearch.search.aggregations.pipeline.SumBucketPipelineAggregationBuilder;
 import org.elasticsearch.search.aggregations.support.ValuesSourceRegistry;
+import org.elasticsearch.search.diversification.ResultDiversificationRetrieverBuilder;
 import org.elasticsearch.search.fetch.FetchPhase;
 import org.elasticsearch.search.fetch.FetchSubPhase;
 import org.elasticsearch.search.fetch.subphase.ExplainPhase;
@@ -1087,6 +1088,9 @@ private void registerRetrieverParsers(List<SearchPlugin> plugins) {
         registerRetriever(new RetrieverSpec<>(StandardRetrieverBuilder.NAME, StandardRetrieverBuilder::fromXContent));
         registerRetriever(new RetrieverSpec<>(KnnRetrieverBuilder.NAME, KnnRetrieverBuilder::fromXContent));
         registerRetriever(new RetrieverSpec<>(RescorerRetrieverBuilder.NAME, RescorerRetrieverBuilder::fromXContent));
+        registerRetriever(
+            new RetrieverSpec<>(ResultDiversificationRetrieverBuilder.NAME, ResultDiversificationRetrieverBuilder::fromXContent)
+        );
 
         registerFromPlugin(plugins, SearchPlugin::getRetrievers, this::registerRetriever);
     }