Description
[X] I checked the documentation and related resources and couldn't find an answer to my question.
https://docs.ragas.io/en/stable/concepts/metrics/critique.html
Your Question
In the documentation (linked above), the calculation example refers to collecting 3 different verdicts from 3 LLM calls. It seems the strictness parameter then determines how the final aggregate score is produced for the particular aspect.
Looking at the source code, I’m having trouble finding how the multiple verdicts are derived. The prompt engineering only seems to ask for a single final verdict per aspect critique. Interestingly, there does seem to be specific code that looks for the “commonality” of verdicts according to the strictness parameter. I’m struggling to see how the LLM could produce more than one verdict given how the code and prompt engineering are currently written.
(I really like the idea, which is why I’m asking. I’d like to implement that with that strictness parameter working. 😃)
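To make the question concrete, here is a minimal sketch of the behavior the documentation seems to describe: call the LLM several times with the same critique prompt and take the majority verdict. This is not the ragas implementation. The names `ask_llm`, `majority_verdict`, and `critique` are hypothetical, and rounding an even strictness up to the next odd number (to avoid ties) is an assumption on my part.

```python
from collections import Counter
from typing import Callable, List


def majority_verdict(verdicts: List[int]) -> int:
    """Return the most common binary verdict (1 = yes, 0 = no)."""
    return Counter(verdicts).most_common(1)[0][0]


def critique(ask_llm: Callable[[str], int], prompt: str, strictness: int = 3) -> int:
    """Collect several verdicts for one prompt and aggregate them by majority vote.

    `ask_llm` is a hypothetical callable that sends the prompt to an LLM
    and returns a single binary verdict.
    """
    # Assumption: use an odd number of calls so a binary vote cannot tie.
    n_calls = strictness if strictness % 2 == 1 else strictness + 1
    verdicts = [ask_llm(prompt) for _ in range(n_calls)]
    return majority_verdict(verdicts)
```

With `strictness=3`, verdicts of 1, 0, 1 from the three calls would give a final score of 1. What I can't find is where the equivalent of the repeated `ask_llm` calls happens in the actual source.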
Code Examples
Linked source code in the material above
Additional context
None