Help with the LLM As A Judge Evaluation #11937
Replies: 1 comment 1 reply
To evaluate each nested LLM call separately in your trace, use observation-level evaluators instead of trace-level evaluators (1). Based on your setup, where you have multiple LLM generations (named "test") within a single trace, here's what you need to do:
Solution: Use Live Observations Evaluation
When setting up your LLM-as-a-Judge evaluator, select "Live Observations" as your evaluation target instead of "Live Traces" (1). This allows you to:
Configuration Steps
Important Note for SDK Users
If you're using the OpenTelemetry-based SDKs (Python v3+ or JS/TS v4+), observation-level evaluators are the recommended approach (1). If you need to filter observations by trace-level attributes like
This approach will evaluate both of your "test" generations (the one in
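To make the difference concrete, here is a minimal plain-Python sketch of the idea. It does not use the Langfuse SDK or its API; `Observation`, `select_targets`, `evaluate_each`, and `toy_judge` are hypothetical names invented for illustration, and the tree simply mirrors the structure described in the question (a "nested" span that calls "joke" and "poetry", each producing a generation named "test"). An observation-level evaluator conceptually scores every matching generation rather than producing one result per trace.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple


@dataclass
class Observation:
    """Hypothetical stand-in for one node in a trace (not a Langfuse class)."""
    name: str
    type: str  # illustrative values: "SPAN" or "GENERATION"
    output: str = ""
    children: List["Observation"] = field(default_factory=list)


def walk(obs: Observation) -> Iterator[Observation]:
    """Yield the observation and all of its descendants, depth-first."""
    yield obs
    for child in obs.children:
        yield from walk(child)


def select_targets(root: Observation, name: str) -> List[Observation]:
    """Pick every generation with the given name, not just the last one."""
    return [o for o in walk(root) if o.type == "GENERATION" and o.name == name]


def evaluate_each(
    root: Observation, name: str, judge: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Run the judge once per matching observation."""
    return [(o.output, judge(o.output)) for o in select_targets(root, name)]


def toy_judge(text: str) -> float:
    """Placeholder for an LLM-as-a-Judge call; scores non-empty output as 1.0."""
    return 1.0 if text else 0.0


# Trace shaped like the one in the question: nested -> joke -> test, nested -> poetry -> test
trace = Observation("nested", "SPAN", children=[
    Observation("joke", "SPAN", children=[
        Observation("test", "GENERATION", output="a joke"),
    ]),
    Observation("poetry", "SPAN", children=[
        Observation("test", "GENERATION", output="a poem"),
    ]),
])

results = evaluate_each(trace, "test", toy_judge)
for output, score in results:
    print(output, score)
```

In this sketch both "test" generations get their own score, which is the behavior the observation-level target is meant to give you in the UI.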
Describe your question
I have created a FastAPI application that contains nested LLM calls, and I want to run an evaluator on those nested LLM calls. For a better understanding of the call structure, please refer to the image:
Here, the nested function calls the joke function, which then calls the LLM. Later, the nested function calls the poetry function, which in turn calls the LLM again. I want to run the evaluation on test (the name I set via run_name in the LangChain config).
Each trace can contain multiple test observations (LLM calls). For each trace, I want to be able to run the evaluator on every LLM call within it.
So far, I have tried selecting generation as the object and test as the name. This works, but it only considers the last LLM call of the trace for the evaluation.
Langfuse Cloud or Self-Hosted?
Langfuse Cloud
If Self-Hosted
No response
If Langfuse Cloud
No response
SDK and integration versions
No response
Pre-Submission Checklist