2. Save the dataset to a JSON file. This will also write `questions_cases_schema.json` with the JSON schema for `questions_cases.json`. This time the `$schema` key is included in the JSON file to define the schema for IDEs to use while you edit the file; there's no formal spec for this, but it works in VS Code and PyCharm and is discussed at length in [json-schema-org/json-schema-spec#828](https://github.com/json-schema-org/json-schema-spec/issues/828).
_(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main(answer))` to run `main`)_
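The `$schema` convention described in step 2 can be illustrated with the standard-library `json` module. This is a minimal sketch of the mechanism only; the file names and schema contents are hypothetical, and `pydantic_evals` generates both files for you when you save a dataset.

```python
import json
from pathlib import Path

# Hypothetical file names, mirroring the ones used above.
schema_path = Path('questions_cases_schema.json')
data_path = Path('questions_cases.json')

# A toy schema standing in for the one pydantic_evals would generate.
schema = {'type': 'object', 'properties': {'cases': {'type': 'array'}}}
schema_path.write_text(json.dumps(schema, indent=2))

# The dataset file's '$schema' key points IDEs at the sibling schema file,
# enabling completion and validation while you edit the dataset by hand.
dataset = {'$schema': schema_path.name, 'cases': []}
data_path.write_text(json.dumps(dataset, indent=2))

print(json.loads(data_path.read_text())['$schema'])
# → questions_cases_schema.json
```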
## Integration with Logfire
Pydantic Evals is implemented using OpenTelemetry to record traces of the evaluation process. These traces contain all
the information included in the terminal output as attributes, but also include full tracing from the executions of the
evaluation task function.
You can send these traces to any OpenTelemetry-compatible backend, including [Pydantic Logfire](https://logfire.pydantic.dev/docs).
All you need to do is configure Logfire via `logfire.configure`:
```python {title="logfire_integration.py"}
import logfire

from judge_recipes import recipe_dataset, transform_recipe

logfire.configure(
    send_to_logfire='if-token-present',  # (1)!
    environment='development',  # (2)!
    service_name='evals',  # (3)!
)

recipe_dataset.evaluate_sync(transform_recipe)
```
1. The `send_to_logfire` argument controls when traces are sent to Logfire. You can set it to `'if-token-present'` to send data to Logfire only if the `LOGFIRE_TOKEN` environment variable is set. See the [Logfire configuration docs](https://logfire.pydantic.dev/docs/reference/configuration/) for more details.
2. The `environment` argument sets the environment for the traces. It's a good idea to set this to `'development'` when running tests or evaluations and sending data to a project with production data, to make it easier to filter these traces out while reviewing data from your production environment(s).
3. The `service_name` argument sets the service name for the traces. This is displayed in the Logfire UI to help you identify the source of the associated spans.
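The `'if-token-present'` behavior described in note 1 can be sketched in a few lines. This is an illustration of the documented behavior, not Logfire's actual implementation; the helper function is hypothetical.

```python
import os


def should_send_to_logfire(setting: str) -> bool:
    """Rough sketch of how send_to_logfire='if-token-present' is resolved.

    Illustrative only — Logfire's real configuration logic lives in the
    logfire package and handles more cases than this.
    """
    if setting == 'if-token-present':
        # Only export traces when a write token is configured.
        return 'LOGFIRE_TOKEN' in os.environ
    return setting == 'always'


# With no LOGFIRE_TOKEN set, traces stay local:
os.environ.pop('LOGFIRE_TOKEN', None)
print(should_send_to_logfire('if-token-present'))  # False

# With a token set (value here is a placeholder), traces are exported:
os.environ['LOGFIRE_TOKEN'] = 'my-write-token'
print(should_send_to_logfire('if-token-present'))  # True
```

This is why `'if-token-present'` is convenient for evals code shared between environments: the same script silently skips exporting on machines without a token.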
Logfire has some special integration with Pydantic Evals traces, including a table view of the evaluation results
on the evaluation root span (which is generated in each call to [`Dataset.evaluate`][pydantic_evals.Dataset.evaluate]):