Skipping an item in LLM-as-a-judge eval #11994
Describe your question
I am using an LLM-as-a-judge evaluator for a particular metric when evaluating a dataset. However, not all items may be valid to be scored by that evaluator. With SDK evaluators, I know it is possible to "skip" an item via the evaluator's return value. How do I do this for an LLM-as-a-judge evaluator?

Langfuse Cloud or Self-Hosted?
Self-Hosted

If Self-Hosted
v3.148.0

If Langfuse Cloud
No response

SDK and integration versions
No response
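For reference, a minimal sketch of the SDK-side pattern the question alludes to: a custom evaluator function that declines to score certain items. The evaluator name, the item fields, and the `None`-as-skip convention are illustrative assumptions, not the exact Langfuse SDK API:

```python
# Hypothetical custom evaluator: scores output conciseness against a
# reference answer, but declines to score items that have no reference.
def conciseness_evaluator(item: dict):
    if not item.get("expected_output"):
        return None  # sentinel meaning "no score for this item"
    # Ratio of reference length to output length, capped at 1.0.
    return min(1.0, len(item["expected_output"]) / max(len(item["output"]), 1))

items = [
    {"output": "a" * 10, "expected_output": "a" * 5},
    {"output": "b" * 4},  # no reference answer -> should be skipped
]

# The harness keeps only items for which the evaluator produced a score.
scores = [s for s in (conciseness_evaluator(it) for it in items) if s is not None]
print(scores)  # one score; the second item was skipped
```

The question is about achieving this same item-level opt-out with the managed LLM-as-a-Judge feature, where no such return value exists.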
Based on the available documentation, there isn't a direct way to skip individual items in LLM-as-a-Judge evaluators the way you can with an SDK evaluator's return value. LLM-as-a-Judge evaluators work differently from SDK evaluators: when you set one up, you configure it to run on specific data by choosing filters (trace name, tags, userId, metadata, etc.) and variable mappings. The evaluator then processes all matched items according to these filters.

Workarounds you could consider: tighten the evaluator's filters so that only items valid for scoring are matched, or separate the data so the evaluator only sees scoreable items.

The LLM-as-a-Judge feature is designed to run evaluations at scale on filtered data sets, but it doesn't currently support item-level conditional skipping within the evaluator itself.

Have another question? Just tag @inkeep.
Hi @sidhi-ts, this functionality is currently not supported. What we recommend in this case is to split up your datasets according to what should be evaluated. You can then configure your evaluators to only run on the datasets that are relevant.
If this would not work for you, feel free to share more details about your use case so we can consider it as a feature request :)
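The recommended workaround above amounts to partitioning your items before creating datasets, so the evaluator can be scoped to only the dataset that should be judged. A sketch of that partition step — the `needs_judge` flag and the two-dataset split are hypothetical examples, not a Langfuse API:

```python
# Items tagged (by whatever domain logic applies) with whether the
# LLM-as-a-Judge evaluator should score them.
items = [
    {"input": "q1", "needs_judge": True},
    {"input": "q2", "needs_judge": False},
    {"input": "q3", "needs_judge": True},
]

# Partition into two groups before uploading.
judged = [it for it in items if it["needs_judge"]]
unjudged = [it for it in items if not it["needs_judge"]]

# Upload `judged` to the dataset your LLM-as-a-Judge evaluator is
# configured to run on, and `unjudged` to a separate dataset with no
# evaluator attached.
print(len(judged), len(unjudged))  # 2 1
```

This keeps the evaluator configuration simple: instead of per-item skip logic, the dataset boundary itself encodes which items are eligible for scoring.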