Using the library with more complex functions, such as PydanticAI agents, is similar: all you need to do is define a task function wrapping the function you want to evaluate, with a signature that matches the inputs and outputs of your test cases.
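
For example, a task wrapping a question-answering agent might look like the sketch below. The model name, prompt, and case contents are illustrative placeholders; on recent PydanticAI releases the agent's result text is exposed as `.output` (older releases used `.data`).

```python
from pydantic_ai import Agent
from pydantic_evals import Case, Dataset

# Illustrative agent; any model supported by PydanticAI could be used here.
agent = Agent('openai:gpt-4o', system_prompt='Answer questions concisely.')


async def answer_question(question: str) -> str:
    """Task function: its signature matches the cases' inputs and expected outputs (both str)."""
    result = await agent.run(question)
    return result.output  # `.data` on older pydantic-ai releases


dataset = Dataset(
    cases=[
        Case(
            name='capital_of_france',
            inputs='What is the capital of France?',
            expected_output='Paris',
        ),
    ],
)

report = dataset.evaluate_sync(answer_question)
report.print()
```

The task can be an `async def` even though `evaluate_sync` is called synchronously; the library drives the async cases itself.
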
## Logfire Integration
Pydantic Evals uses OpenTelemetry to record traces for each case in your evaluations.

You can send these traces to any OpenTelemetry-compatible backend. For the best experience, we recommend [Pydantic Logfire](https://logfire.pydantic.dev/docs), which includes custom views for evals:
<img src="https://ai.pydantic.dev/img/logfire-evals-case.png" alt="Logfire Evals Case View" width="48%">
</div>

You'll see full details about the inputs, outputs, token usage, execution durations, etc. You'll also have access to the full trace for each case, which is ideal for debugging, writing path-aware evaluators (see the sketch after the setup block below), or running similar evaluations against production traces.

Basic setup:
```python {test="skip" lint="skip" format="skip"}
import logfire

logfire.configure(
    send_to_logfire='if-token-present',
    environment='development',
    service_name='evals',
)

...

my_dataset.evaluate_sync(my_task)
```
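
As a sketch of the path-aware evaluators mentioned above, a custom evaluator can inspect the spans recorded for a case. The class and span names below are hypothetical, and the snippet assumes the span tree is exposed on the evaluator context as `ctx.span_tree` with a `find` method over named nodes; check the Pydantic Evals evaluator docs for the exact API.

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class CalledRetriever(Evaluator):
    """Passes only if a span whose name mentions 'retrieve' was recorded for the case."""

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        # Assumption: ctx.span_tree holds this case's recorded spans, and find()
        # returns the nodes matching a predicate on the span name.
        matches = ctx.span_tree.find(lambda node: 'retrieve' in node.name)
        return len(matches) > 0
```

An instance of such an evaluator is passed in the dataset's `evaluators` list like any other evaluator.
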
[Read more about the Logfire integration here.](https://ai.pydantic.dev/evals/#logfire-integration)
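
If you're not using Logfire, the same traces can go to any other OpenTelemetry-compatible backend, as noted above. Below is a minimal sketch with the standard OpenTelemetry SDK, assuming an OTLP collector at a placeholder localhost endpoint and that Pydantic Evals emits spans through the globally configured tracer provider:

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Export spans to a local OTLP/HTTP collector; the endpoint is a placeholder.
provider = TracerProvider(resource=Resource.create({'service.name': 'evals'}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint='http://localhost:4318/v1/traces'))
)
trace.set_tracer_provider(provider)

my_dataset.evaluate_sync(my_task)
```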