Similarity search accuracy #3360
-
Hi Team,
Replies: 1 comment 1 reply
-
Hi @JaisVJ, if some of the queries are not full sentences but something like "Error code #32", then this looks to me more like a keyword search, with "error", "code" and "32" being the keywords. You could check whether the results for such queries get better if you use the BM25Retriever instead of the EmbeddingRetriever. If that's the case, you could change your pipeline so that it contains two retrievers. Our tutorial 11 contains an example; just search for "CustomQueryClassifier" in that tutorial.
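To make the routing idea concrete, here is a minimal sketch of the decision such a classifier could make. It is plain Python and does not use the Haystack API; the function names, the `max_tokens` threshold and the heuristic itself are assumptions for illustration, not what the tutorial does. In a real pipeline this logic would sit inside a custom classifier node that forwards the query to one retriever or the other.

```python
import re


def looks_like_keyword_query(query: str, max_tokens: int = 4) -> bool:
    """Hypothetical heuristic: treat short, non-sentence queries as keyword queries."""
    # Split into alphanumeric tokens, so "#32" contributes the token "32".
    tokens = re.findall(r"[A-Za-z0-9]+", query)
    # Purely numeric tokens (error codes, IDs) are a strong hint for keyword search.
    has_number = any(token.isdigit() for token in tokens)
    # Full sentences and questions usually end with punctuation.
    ends_like_sentence = query.strip().endswith(("?", "."))
    return len(tokens) <= max_tokens and (has_number or not ends_like_sentence)


def route_query(query: str) -> str:
    """Return which retriever a query should go to: 'bm25' or 'embedding'."""
    return "bm25" if looks_like_keyword_query(query) else "embedding"


if __name__ == "__main__":
    for q in ["Error code #32", "How do I reset the device after a failed update?"]:
        print(q, "->", route_query(q))
```

The threshold and the punctuation check are deliberately crude; it is worth trying them on a sample of your real queries before relying on them.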
Another idea would be to store "issue" and "solution" in two different fields of the document when you create it from your dataset. "solution" could be still stored in "content" but "issue" cou…