@pmpailis pmpailis commented Sep 9, 2024

In this PR we introduce a different way to evaluate the rrf retriever, by leveraging the query rewrite step at the beginning of query execution.

The main points of this PR are:

  • The introduction of a new CompoundRetrieverBuilder class that defines a class of retrievers that operate on the top results of other retriever(s) and combine their results appropriately. All nested retrievers, after being rewritten themselves, are executed asynchronously through msearch and then the compound retriever is responsible for merging the top results from each sub-retriever through the abstract combineInnerRetrieverResults method.
  • Changes to the RankDocsRetrieverBuilder when extracting to source. The idea now is that we enhance this to also compute topDocs for the innerRetrievers when either profile or explain is present.
  • profile and explain currently require the inner retrievers to be evaluated twice. This is something that we can build upon and improve in subsequent PRs.
  • Removing the existing limitation on the depth of nested retrievers.
  • Updating RankDocsSortField to encode rank and score for each result (now named RankDocsAndScoreSortField)
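
To make the first bullet more concrete, here is a minimal sketch of the pattern it describes: nested retrievers are started asynchronously, their top results are collected, and an abstract merge method combines them. Every name except combineInnerRetrieverResults is a hypothetical stand-in (Retriever, Compound, DedupUnion, topDocs), CompletableFuture stands in for msearch, and the union-based merge rule is purely illustrative. This is not the actual Elasticsearch implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical, simplified stand-ins; these are NOT the Elasticsearch classes.
class CompoundRetrieverSketch {

    /** Stand-in for a retriever: returns its top document ids, best first. */
    interface Retriever {
        List<String> topDocs();
    }

    /** Stand-in for the compound retriever idea described in the PR. */
    abstract static class Compound implements Retriever {
        private final List<Retriever> inner;

        Compound(List<Retriever> inner) {
            this.inner = inner;
        }

        /** Merge step that each concrete compound retriever supplies. */
        abstract List<String> combineInnerRetrieverResults(List<List<String>> perRetriever);

        @Override
        public List<String> topDocs() {
            // Start every nested retriever asynchronously (stand-in for msearch).
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            for (Retriever r : inner) {
                futures.add(CompletableFuture.supplyAsync(r::topDocs));
            }
            // join() rethrows the failure of any nested retriever, so one
            // failed sub-search fails the whole compound evaluation.
            List<List<String>> results = new ArrayList<>();
            for (CompletableFuture<List<String>> f : futures) {
                results.add(f.join());
            }
            return combineInnerRetrieverResults(results);
        }
    }

    /** Toy merge rule (an assumption): union of all results in first-seen order. */
    static class DedupUnion extends Compound {
        DedupUnion(List<Retriever> inner) {
            super(inner);
        }

        @Override
        List<String> combineInnerRetrieverResults(List<List<String>> perRetriever) {
            Set<String> seen = new LinkedHashSet<>();
            for (List<String> docs : perRetriever) {
                seen.addAll(docs);
            }
            return new ArrayList<>(seen);
        }
    }

    public static void main(String[] args) {
        Retriever a = () -> List.of("d1", "d2");
        Retriever b = () -> List.of("d2", "d3");
        Retriever c = () -> List.of("d4", "d1");
        // A compound retriever nested inside another compound retriever.
        Retriever nested = new DedupUnion(List.of(new DedupUnion(List.of(a, b)), c));
        System.out.println(nested.topDocs());
    }
}
```

Because the compound type is itself a retriever in this sketch, compounds can be nested to any depth without special handling, which is the property the PR relies on when it removes the depth limitation.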

The high-level idea now is that all compound retrievers (including rrf) will try to rewrite themselves into a "simpler" RankDocsRetrieverBuilder by rewriting and asynchronously executing all nested retrievers. This gives us the option to support arbitrary nesting of retrievers, as well as more fine-grained search params like collapse, highlight, and nested queries.
If any of the subsearches fails, we fail the request and return a 5xx response.
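
For rrf specifically, the merge step is reciprocal rank fusion. The PR text does not show that code, so the following is only a sketch of the commonly stated formula, in which each document scores the sum of 1 / (rankConstant + rank) over the ranked lists it appears in. The class name RrfCombineSketch, the method combine, and the use of plain document-id strings are assumptions made for illustration, not the PR's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; a sketch of reciprocal rank fusion, not the PR's code.
class RrfCombineSketch {

    /**
     * Fuses several ranked lists of document ids (best first) into one list,
     * ordered by descending RRF score.
     */
    static List<Map.Entry<String, Double>> combine(List<List<String>> rankedLists, int rankConstant) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                int rank = i + 1; // ranks are 1-based
                scores.merge(ranked.get(i), 1.0 / (rankConstant + rank), Double::sum);
            }
        }
        List<Map.Entry<String, Double>> merged = new ArrayList<>(scores.entrySet());
        // List.sort is stable, so tied documents keep their first-seen order.
        merged.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> lists = List.of(
            List.of("a", "b", "c"),
            List.of("b", "c", "a")
        );
        // 60 is used here only as an illustrative rank constant.
        for (Map.Entry<String, Double> e : combine(lists, 60)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

A document that ranks well in several lists accumulates more score than one that ranks first in a single list only, which is why the merged order can differ from every individual sub-retriever's order.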

Note: We still support running rrf rank with sub_searches, but this uses the old approach, which is a bit different: no nesting is supported, and explain/profile are also evaluated differently.

@pmpailis
Contributor Author

> I think some of the yml tests can be combined into a single file. This will make test infra not so rough.

Merged all the different yaml test files in ff05d13 :)

Member

@benwtrent benwtrent left a comment

This looks good to me, barring any breaking-change decision around _rank and how rrf behavior has changed (though since it's all tech-preview, I personally think this is OK).

int rank = 1;
for (ScoreDoc scoreDoc : rrfRankResult) {
    final int findex = index;
    final int frank = rank;
Member

let me be frank.... 😂

@pmpailis
Contributor Author

@elasticmachine update branch

@pmpailis pmpailis added the auto-backport Automatically create backport pull requests when merged label Oct 3, 2024
@pmpailis
Contributor Author

pmpailis commented Oct 3, 2024

@benwtrent and @jimczi a huge thank you for reviewing and for all the support throughout ❤️

@pmpailis pmpailis merged commit dc8c20d into elastic:main Oct 3, 2024
16 checks passed
@elasticsearchmachine
Collaborator

💚 Backport successful

Branch: 8.x

pmpailis added a commit to pmpailis/elasticsearch that referenced this pull request Oct 3, 2024
matthewabbott pushed a commit to matthewabbott/elasticsearch that referenced this pull request Oct 4, 2024
@pmpailis pmpailis deleted the rework_rrf_to_work_through_rewrite_phase branch May 27, 2025 03:49
Labels

  • auto-backport (Automatically create backport pull requests when merged)
  • >non-issue
  • :Search Relevance/Ranking (Scoring, rescoring, rank evaluation.)
  • Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
  • v8.16.0
  • v9.0.0
