Commit eb8a871

Updating retriever-examples documentation to run validation tests on the provided snippets (#116643)
1 parent 84c04ef commit eb8a871

2 files changed

+1149
-219
lines changed

docs/reference/search/rrf.asciidoc

Lines changed: 97 additions & 1 deletion
@@ -105,7 +105,7 @@ The `rrf` retriever does not currently support:
 * <<rescore, rescore>>
 
 Using unsupported features as part of a search with an `rrf` retriever results in an exception.
-+
+
 IMPORTANT: It is best to avoid providing a <<search-api-pit, point in time>> as part of the request, as
 RRF creates one internally that is shared by all sub-retrievers to ensure consistent results.
 
@@ -703,3 +703,99 @@ So for the same params as above, we would now have:
 
 * `from=0, size=2` would return [`1`, `5`] with ranks `[1, 2]`
 * `from=2, size=2` would return an empty result set as it would fall outside the available `rank_window_size` results.
+
+==== Aggregations in RRF
+
+The `rrf` retriever supports aggregations from all specified sub-retrievers. Important notes about aggregations:
+
+* They operate on the complete result set from all sub-retrievers
+* They are not limited by the `rank_window_size` parameter
+* They process the union of all matching documents
+
+For example, consider the following document set:
+[source,js]
+----
+[
+    { "_id": 1, "termA": "foo" },
+    { "_id": 2, "termA": "foo", "termB": "bar" },
+    { "_id": 3, "termA": "aardvark", "termB": "bar" },
+    { "_id": 4, "termA": "foo", "termB": "bar" }
+]
+----
+// NOTCONSOLE
+
+Perform a terms aggregation on the `termA` field using an `rrf` retriever:
+[source,js]
+----
+{
+    "retriever": {
+        "rrf": {
+            "retrievers": [
+                {
+                    "standard": {
+                        "query": {
+                            "term": {
+                                "termB": "bar"
+                            }
+                        }
+                    }
+                },
+                {
+                    "standard": {
+                        "query": {
+                            "match_all": { }
+                        }
+                    }
+                }
+            ],
+            "rank_window_size": 1
+        }
+    },
+    "size": 1,
+    "aggs": {
+        "termA_agg": {
+            "terms": {
+                "field": "termA"
+            }
+        }
+    }
+}
+----
+// NOTCONSOLE
+
+The aggregation results will include *all* matching documents, regardless of `rank_window_size`.
+[source, js]
+----
+{
+    "foo": 3,
+    "aardvark": 1
+}
+
+----
+// NOTCONSOLE
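
The counts in the example above can be checked with a small in-memory simulation. This is only a sketch of the documented semantics (aggregate over the union of documents matched by any sub-retriever, ignoring `rank_window_size`), not how Elasticsearch computes aggregations. The helper names `term_query`, `match_all`, `union_of_matches`, and `terms_agg` are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Document set from the example above.
DOCS = [
    {"_id": 1, "termA": "foo"},
    {"_id": 2, "termA": "foo", "termB": "bar"},
    {"_id": 3, "termA": "aardvark", "termB": "bar"},
    {"_id": 4, "termA": "foo", "termB": "bar"},
]


def term_query(field, value):
    """Hypothetical stand-in for a `term` query: exact match on one field."""
    return lambda doc: doc.get(field) == value


def match_all():
    """Hypothetical stand-in for `match_all`: every document matches."""
    return lambda doc: True


def union_of_matches(docs, queries):
    """Return the documents matched by at least one sub-retriever query."""
    return [doc for doc in docs if any(query(doc) for query in queries)]


def terms_agg(docs, field):
    """Count documents per value of `field`, skipping documents without it."""
    return dict(Counter(doc[field] for doc in docs if field in doc))


# The two sub-retrievers from the request above.
queries = [term_query("termB", "bar"), match_all()]

# rank_window_size is deliberately not applied here: per the text above,
# it limits the ranked hits but not the documents fed to aggregations.
matched = union_of_matches(DOCS, queries)
buckets = terms_agg(matched, "termA")
print(buckets)
```

Because the second sub-retriever is `match_all`, the union covers all four documents, which gives the `foo` and `aardvark` counts shown in the documented result.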
+
+==== Highlighting in RRF
+
+Using the `rrf` retriever, you can add <<highlighting, highlight snippets>> to show relevant text snippets in your search results. Highlighted snippets are computed based
+on the matching text queries defined on the sub-retrievers.
+
+IMPORTANT: Highlighting on vector fields, using either the `knn` retriever or a `knn` query, is not supported.
+
+A more specific example of highlighting in RRF can also be found in the <<retrievers-examples-highlighting-retriever-results, retrievers examples>> page.
+
+==== Inner hits in RRF
+
+The `rrf` retriever supports <<inner-hits,inner hits>> functionality, allowing you to retrieve
+related nested or parent/child documents alongside your main search results. Inner hits can be
+specified as part of any nested sub-retriever and will be propagated to the top-level parent
+retriever. Note that the inner hit computation takes place only at the end of the `rrf` retriever's
+evaluation on the top matching documents, and not as part of the query execution of the nested
+sub-retrievers.
+
+[IMPORTANT]
+====
+When defining multiple `inner_hits` sections across sub-retrievers:
+
+* Each `inner_hits` section must have a unique name
+* Names must be unique across all sub-retrievers in the search request
+====
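
One way to catch clashing names before sending a request is a client-side check like the one below. It is a minimal sketch, not part of Elasticsearch or any client library: the helpers `collect_inner_hits_names` and `duplicate_inner_hits_names` are hypothetical, and the nested path `comments` and its fields are invented for illustration. As a simplification, an `inner_hits` section without an explicit `name` is recorded as `<unnamed>`.

```python
def collect_inner_hits_names(node):
    """Recursively gather the name of every `inner_hits` section in a request body."""
    names = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "inner_hits" and isinstance(value, dict):
                # Simplification: sections without an explicit name share one label.
                names.append(value.get("name", "<unnamed>"))
            names.extend(collect_inner_hits_names(value))
    elif isinstance(node, list):
        for item in node:
            names.extend(collect_inner_hits_names(item))
    return names


def duplicate_inner_hits_names(body):
    """Return the sorted list of `inner_hits` names used more than once."""
    seen, duplicates = set(), set()
    for name in collect_inner_hits_names(body):
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return sorted(duplicates)


def nested_retriever(field, text, inner_hits_name):
    """Build a `standard` sub-retriever around a nested query (hypothetical fields)."""
    return {
        "standard": {
            "query": {
                "nested": {
                    "path": "comments",
                    "query": {"match": {field: text}},
                    "inner_hits": {"name": inner_hits_name},
                }
            }
        }
    }


valid_body = {
    "retriever": {
        "rrf": {
            "retrievers": [
                nested_retriever("comments.text", "search", "comment_text_hits"),
                nested_retriever("comments.author", "kim", "comment_author_hits"),
            ]
        }
    }
}

clashing_body = {
    "retriever": {
        "rrf": {
            "retrievers": [
                nested_retriever("comments.text", "search", "comment_hits"),
                nested_retriever("comments.author", "kim", "comment_hits"),
            ]
        }
    }
}

print(duplicate_inner_hits_names(valid_body))     # no duplicates found
print(duplicate_inner_hits_names(clashing_body))  # reports the reused name
```

The check walks the whole request body rather than a single sub-retriever, which mirrors the requirement that names be unique across the entire search request.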
