Remove UserWarning: Pydantic serializer warnings when using dataset.evaluate_sync without logfire installed #2515

@avergalogi

Description

Hello,

When I run the code below from the documentation without logfire installed, the dataset.evaluate_sync call always emits Pydantic serializer warnings.

It seems that when logfire is not installed, pydantic-ai substitutes a MagicMock object for it. Pydantic then encounters this MagicMock during serialization and warns because it expected a real value (such as a float).
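
To illustrate what I think is happening, here is a minimal sketch using only the standard library. It is not pydantic-ai's actual code: stub_span, task_duration, and is_real_float are hypothetical names, and the assumption is simply that some value expected to be a float is read from a MagicMock stand-in.

```python
from unittest.mock import MagicMock

# Hypothetical stand-in for a tracing span when the real backend is missing.
# This is NOT pydantic-ai's actual code, only an illustration of the pattern.
stub_span = MagicMock()

# Any attribute read from a MagicMock is itself a MagicMock, not a number.
task_duration = stub_span.duration


def is_real_float(value: object) -> bool:
    """Return True only for genuine float values."""
    return isinstance(value, float)


print(type(task_duration).__name__)
print(is_real_float(task_duration))
```

If a value like task_duration ends up in a field that Pydantic serializes as a float, that would explain the "Expected float" message with input_type=MagicMock.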

If I install logfire, the warnings disappear.

It would be great if the library did not show these warnings when logfire isn't installed. I am not using logfire in my project, and the warnings create a lot of noise in my logs.

The warning I see:

UserWarning: Pydantic serializer warnings:
PydanticSerializationUnexpectedValue(Expected float - serialized value may not be as expected [input_value=, input_type=MagicMock])
PydanticSerializationUnexpectedValue(Expected float - serialized value may not be as expected [input_value=, input_type=MagicMock])
return self.serializer.to_python(

Could this be fixed so that no warnings are shown when using the library without logfire?
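
In the meantime, one possible workaround is to filter the warning with the standard warnings module. This is only a sketch: silence_pydantic_serializer_warnings is a hypothetical helper name, and it assumes the message keeps starting with "Pydantic serializer warnings". It also hides genuine serializer warnings, so it is not a real fix.

```python
import warnings


def silence_pydantic_serializer_warnings() -> None:
    """Ignore UserWarnings whose message starts with the Pydantic prefix.

    Assumption: the prefix matches the warning text shown in this issue.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"Pydantic serializer warnings",  # regex matched at message start
        category=UserWarning,
    )


# Call once at startup, before running dataset.evaluate_sync(...).
silence_pydantic_serializer_warnings()
```

The filter is process-wide, so it is probably best kept in the entry-point script rather than in shared library code.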

Thank you!

Example Code

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext, IsInstance

case1 = Case(
    name='simple_case',
    inputs='What is the capital of France?',
    expected_output='Paris',
    metadata={'difficulty': 'easy'},
)


class MyEvaluator(Evaluator[str, str]):
    def evaluate(self, ctx: EvaluatorContext[str, str]) -> float:
        if ctx.output == ctx.expected_output:
            return 1.0
        elif (
            isinstance(ctx.output, str)
            and ctx.expected_output.lower() in ctx.output.lower()
        ):
            return 0.8
        else:
            return 0.0


dataset = Dataset(
    cases=[case1],
    evaluators=[IsInstance(type_name='str'), MyEvaluator()],
)


async def guess_city(question: str) -> str:
    return 'Paris'


report = dataset.evaluate_sync(guess_city)
report.print(include_input=True, include_output=True, include_durations=False)
"""
                              Evaluation Summary: guess_city
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Case ID     ┃ Inputs                         ┃ Outputs ┃ Scores            ┃ Assertions ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ simple_case │ What is the capital of France? │ Paris   │ MyEvaluator: 1.00 │ ✔          │
├─────────────┼────────────────────────────────┼─────────┼───────────────────┼────────────┤
│ Averages    │                                │         │ MyEvaluator: 1.00 │ 100.0% ✔   │
└─────────────┴────────────────────────────────┴─────────┴───────────────────┴────────────┘
"""

Python, Pydantic AI & LLM client version

Python 3.13.3
PydanticAI 0.6.2
