Hybrid Retriever POC - DO NOT MERGE #125052
kderusso left a comment
This is a great POC! Since it's a POC, I didn't review the code itself, but I left one suggestion to think about when we move to a production implementation.
One really compelling example would be to write a complex query using the linear retriever, and then write the same (nicer, smaller) query using the hybrid retriever.
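As a rough sketch of that comparison, the two requests below search the same two fields. The long form uses the existing `linear` retriever with one `standard` sub-retriever per field; the short form uses the `hybrid` syntax from this PR. The equal weights, the `minmax` normalizer, the assumption that `content.semantic` is a `semantic_text` field, and the idea that `hybrid` expands to exactly this shape are illustrative and not taken from the PR.

```console
# Long form with the linear retriever.
# Assumed for illustration: equal weights, minmax normalization,
# and "content.semantic" being a semantic_text field.
POST wiki-index/_search
{
  "retriever": {
    "linear": {
      "retrievers": [
        {
          "retriever": {
            "standard": {
              "query": { "match": { "content": "foo" } }
            }
          },
          "weight": 1,
          "normalizer": "minmax"
        },
        {
          "retriever": {
            "standard": {
              "query": { "semantic": { "field": "content.semantic", "query": "foo" } }
            }
          },
          "weight": 1,
          "normalizer": "minmax"
        }
      ]
    }
  }
}

# Short form with the hybrid retriever proposed in this PR.
POST wiki-index/_search
{
  "retriever": {
    "hybrid": {
      "fields": ["content", "content.semantic"],
      "query": "foo"
    }
  }
}
```

The long form grows by one sub-retriever block per field, while the short form only grows by one entry in `fields`.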
public static final String NAME = "hybrid";
public static final ParseField FIELDS_FIELD = new ParseField("fields");
public static final ParseField QUERY_FIELD = new ParseField("query");
public static final ParseField RERANK_FIELD = new ParseField("rerank");
I like that we incorporated rerank into this POC. I'd also like to see rule incorporated, since I think business rules will be a critical part of the hybrid retriever. In that vein, I wonder if there's something more generic we can do with the retrievers that we call: something like the following, which could be easily extended as we add future retrievers or want more customization.
POST wiki-index/_search
{
  "retriever": {
    "hybrid": {
      "fields": ["content", "content.semantic"],
      "query": "foo",
      "rank_modifiers": [
        {
          "rule": {
            ...
          }
        },
        {
          "rerank": {
            "inference_id": "my-reranker-service",
            "field": "content"
          }
        }
      ]
    }
  }
}
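For comparison, here is roughly how the same chain has to be written today by nesting the existing `rule` and `text_similarity_reranker` retrievers, which is the boilerplate a `rank_modifiers` list could flatten. This is a sketch under assumptions: that `hybrid` can be used as a child retriever, that each modifier maps to one wrapping retriever with `rule` outermost, and that the ruleset id and match criteria shown are hypothetical placeholders.

```console
# Hypothetical placeholders: "my-ruleset" and the match_criteria values.
# Assumes the hybrid retriever can be nested as a child retriever.
POST wiki-index/_search
{
  "retriever": {
    "rule": {
      "ruleset_ids": ["my-ruleset"],
      "match_criteria": { "query_string": "foo" },
      "retriever": {
        "text_similarity_reranker": {
          "inference_id": "my-reranker-service",
          "inference_text": "foo",
          "field": "content",
          "retriever": {
            "hybrid": {
              "fields": ["content", "content.semantic"],
              "query": "foo"
            }
          }
        }
      }
    }
  }
}
```

Each additional modifier adds another level of nesting in this form, whereas a list of modifiers would stay flat.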
Adds a `hybrid` retriever for simple hybrid search across lexical & semantic text fields.
Semantic reranking using the `text_similarity_reranker` is integrated.
You can use the caret notation to boost matches in certain fields, and you can use `query_settings` to customize the query run against certain fields.
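A minimal sketch of the caret notation, assuming it follows the `field^boost` convention that `multi_match` uses for its `fields` list. The boost value is illustrative, and the exact semantics in the `hybrid` retriever are an assumption.

```console
# Assumed: "content^2" gives matches in "content" twice the weight.
POST wiki-index/_search
{
  "retriever": {
    "hybrid": {
      "fields": ["content^2", "content.semantic"],
      "query": "foo"
    }
  }
}
```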