Deprecate Usage in favour of RequestUsage and RunUsage to work with genai-prices #2378
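Judging from the names in the diff below, the PR splits the old `Usage` into `RequestUsage` (token counts for a single model request) and `RunUsage` (totals accumulated across a whole agent run). A rough self-contained sketch of that split (the field names come from the diff; the `incr` helper and the rest are illustrative, not the library's actual code):

```python
from dataclasses import dataclass


@dataclass
class RequestUsage:
    """Token usage for a single model request (illustrative stand-in)."""

    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class RunUsage:
    """Usage accumulated across all requests in an agent run (illustrative stand-in)."""

    requests: int = 0
    input_tokens: int = 0
    output_tokens: int = 0

    def incr(self, incr_usage: RequestUsage) -> None:
        # Fold one request's usage into the running total for the whole run.
        self.requests += 1
        self.input_tokens += incr_usage.input_tokens
        self.output_tokens += incr_usage.output_tokens


run = RunUsage()
run.incr(RequestUsage(input_tokens=56, output_tokens=7))
run.incr(RequestUsage(input_tokens=62, output_tokens=1))
print(run)
#> RunUsage(requests=2, input_tokens=118, output_tokens=8)
```

Keeping per-request usage separate from run totals is what lets a pricing library like genai-prices cost each request individually while the run object still reports aggregates.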

Open

wants to merge 38 commits into base: main
Commits (38)
95b543e
new Usage interface to work with genai-prices
samuelcolvin Jul 30, 2025
0b9a37d
Merge branch 'main' into genai-prices-usage
alexmojaki Aug 8, 2025
ee4cf04
Merge branch 'main' of github.com:pydantic/pydantic-ai into genai-pri…
alexmojaki Aug 8, 2025
0b325c3
fix
alexmojaki Aug 8, 2025
fcdf9df
fix
alexmojaki Aug 8, 2025
c34ebdb
fix
alexmojaki Aug 8, 2025
1f273ae
fix
alexmojaki Aug 8, 2025
0e7415c
fix
alexmojaki Aug 8, 2025
7a6d333
fix
alexmojaki Aug 8, 2025
3f38c9a
fix
alexmojaki Aug 8, 2025
ab67fdb
Merge branch 'main' of github.com:pydantic/pydantic-ai into genai-pri…
alexmojaki Aug 8, 2025
5754d89
Simplify diff
alexmojaki Aug 8, 2025
4c78e4f
Simplify diff
alexmojaki Aug 8, 2025
851a582
Simplify diff
alexmojaki Aug 8, 2025
d2cf1ed
Simplify diff
alexmojaki Aug 8, 2025
10e91df
Simplify diff
alexmojaki Aug 8, 2025
e6b5498
Simplify diff
alexmojaki Aug 8, 2025
c54352d
Simplify diff
alexmojaki Aug 8, 2025
4c0c5d8
Remove logfire.configure calls
alexmojaki Aug 8, 2025
15404b6
fix
alexmojaki Aug 8, 2025
516ea49
Disable instrumentation in tests
alexmojaki Aug 8, 2025
2bbeadc
docstrings
alexmojaki Aug 8, 2025
1a1b4ae
shutdown logfire after each test
alexmojaki Aug 8, 2025
79f04d8
shutdown logfire after each test
alexmojaki Aug 8, 2025
0874e88
debugging
alexmojaki Aug 8, 2025
02fe793
debugging
alexmojaki Aug 8, 2025
78416fa
debugging
alexmojaki Aug 8, 2025
76ccb51
debugging
alexmojaki Aug 8, 2025
547c126
revert ci.yml
alexmojaki Aug 11, 2025
75435ce
Merge branch 'main' of github.com:pydantic/pydantic-ai into genai-pri…
alexmojaki Aug 11, 2025
e245215
debugging
alexmojaki Aug 11, 2025
05020aa
fix
alexmojaki Aug 11, 2025
4ce9ec9
Merge branch 'main' of github.com:pydantic/pydantic-ai into genai-pri…
alexmojaki Aug 12, 2025
6dfe676
update genai-prices
alexmojaki Aug 12, 2025
47c61a3
tests
alexmojaki Aug 12, 2025
75ab2c4
tests
alexmojaki Aug 12, 2025
2902e8b
tests
alexmojaki Aug 12, 2025
6d0538d
pragma
alexmojaki Aug 12, 2025
8 changes: 4 additions & 4 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -118,7 +118,7 @@ jobs:
--extra anthropic
--extra mistral
--extra cohere
pytest tests/test_live.py -v
pytest --durations=0 tests/test_live.py -v
--durations=100
env:
PYDANTIC_AI_LIVE_TEST_DANGEROUS: "CHARGE-ME!"
@@ -168,7 +168,7 @@ jobs:
- run: mkdir .coverage

- run: uv sync --group dev
- run: uv run ${{ steps.install-command.outputs.install-command }} coverage run -m pytest -n auto --dist=loadgroup
- run: uv run ${{ steps.install-command.outputs.install-command }} coverage run -m pytest --durations=0 -n auto --dist=loadgroup
env:
COVERAGE_FILE: .coverage/.coverage.${{ matrix.python-version }}-${{ matrix.install }}

@@ -209,7 +209,7 @@ jobs:

- run: unset UV_FROZEN

- run: uv run --all-extras --resolution lowest-direct coverage run -m pytest -n auto --dist=loadgroup
- run: uv run --all-extras --resolution lowest-direct coverage run -m pytest --durations=0 -n auto --dist=loadgroup
env:
COVERAGE_FILE: .coverage/.coverage.${{matrix.python-version}}-lowest-versions

@@ -289,7 +289,7 @@ jobs:

- run: make lint-js

- run: uv run --package mcp-run-python pytest mcp-run-python -v --durations=100
- run: uv run --package mcp-run-python pytest --durations=0 mcp-run-python -v --durations=100

- run: deno task dev warmup
working-directory: mcp-run-python
25 changes: 7 additions & 18 deletions docs/agents.md
@@ -302,9 +302,7 @@ async def main():
CallToolsNode(
model_response=ModelResponse(
parts=[TextPart(content='The capital of France is Paris.')],
usage=Usage(
requests=1, request_tokens=56, response_tokens=7, total_tokens=63
),
usage=RequestUsage(input_tokens=56, output_tokens=7),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
)
@@ -367,12 +365,7 @@ async def main():
CallToolsNode(
model_response=ModelResponse(
parts=[TextPart(content='The capital of France is Paris.')],
usage=Usage(
requests=1,
request_tokens=56,
response_tokens=7,
total_tokens=63,
),
usage=RequestUsage(input_tokens=56, output_tokens=7),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
)
@@ -391,7 +384,7 @@ _(This example is complete, it can be run "as is" — you'll need to add `asynci

#### Accessing usage and final output

You can retrieve usage statistics (tokens, requests, etc.) at any time from the [`AgentRun`][pydantic_ai.agent.AgentRun] object via `agent_run.usage()`. This method returns a [`Usage`][pydantic_ai.usage.Usage] object containing the usage data.
You can retrieve usage statistics (tokens, requests, etc.) at any time from the [`AgentRun`][pydantic_ai.agent.AgentRun] object via `agent_run.usage()`. This method returns a [`RunUsage`][pydantic_ai.usage.RunUsage] object containing the usage data.

Once the run finishes, `agent_run.result` becomes an [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] object containing the final output (and related metadata).

@@ -570,7 +563,7 @@ result_sync = agent.run_sync(
print(result_sync.output)
#> Rome
print(result_sync.usage())
#> Usage(requests=1, request_tokens=62, response_tokens=1, total_tokens=63)
#> RunUsage(requests=1, input_tokens=62, output_tokens=1)

try:
result_sync = agent.run_sync(
@@ -579,7 +572,7 @@
)
except UsageLimitExceeded as e:
print(e)
#> Exceeded the response_tokens_limit of 10 (response_tokens=32)
#> Exceeded the output_tokens_limit of 10 (output_tokens=32)
```

Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling:
@@ -1018,9 +1011,7 @@ with capture_run_messages() as messages:  # (2)!
tool_call_id='pyd_ai_tool_call_id',
)
],
usage=Usage(
requests=1, request_tokens=62, response_tokens=4, total_tokens=66
),
usage=RequestUsage(input_tokens=62, output_tokens=4),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
),
@@ -1042,9 +1033,7 @@ with capture_run_messages() as messages:  # (2)!
tool_call_id='pyd_ai_tool_call_id',
)
],
usage=Usage(
requests=1, request_tokens=72, response_tokens=8, total_tokens=80
),
usage=RequestUsage(input_tokens=72, output_tokens=8),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
),
4 changes: 2 additions & 2 deletions docs/direct.md
@@ -28,7 +28,7 @@ model_response = model_request_sync(
print(model_response.parts[0].content)
#> The capital of France is Paris.
print(model_response.usage)
#> Usage(requests=1, request_tokens=56, response_tokens=7, total_tokens=63)
#> RequestUsage(input_tokens=56, output_tokens=7)
```

_(This example is complete, it can be run "as is")_
@@ -83,7 +83,7 @@ async def main():
tool_call_id='pyd_ai_2e0e396768a14fe482df90a29a78dc7b',
)
],
usage=Usage(requests=1, request_tokens=55, response_tokens=7, total_tokens=62),
usage=RequestUsage(input_tokens=55, output_tokens=7),
model_name='gpt-4.1-nano',
timestamp=datetime.datetime(...),
)
12 changes: 6 additions & 6 deletions docs/message-history.md
@@ -58,7 +58,7 @@ print(result.all_messages())
content='Did you hear about the toothpaste scandal? They called it Colgate.'
)
],
usage=Usage(requests=1, request_tokens=60, response_tokens=12, total_tokens=72),
usage=RequestUsage(input_tokens=60, output_tokens=12),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
),
@@ -126,7 +126,7 @@ async def main():
content='Did you hear about the toothpaste scandal? They called it Colgate.'
)
],
usage=Usage(request_tokens=50, response_tokens=12, total_tokens=62),
usage=RequestUsage(input_tokens=50, output_tokens=12),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
),
@@ -180,7 +180,7 @@ print(result2.all_messages())
content='Did you hear about the toothpaste scandal? They called it Colgate.'
)
],
usage=Usage(requests=1, request_tokens=60, response_tokens=12, total_tokens=72),
usage=RequestUsage(input_tokens=60, output_tokens=12),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
),
@@ -198,7 +198,7 @@ print(result2.all_messages())
content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.'
)
],
usage=Usage(requests=1, request_tokens=61, response_tokens=26, total_tokens=87),
usage=RequestUsage(input_tokens=61, output_tokens=26),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
),
@@ -299,7 +299,7 @@ print(result2.all_messages())
content='Did you hear about the toothpaste scandal? They called it Colgate.'
)
],
usage=Usage(requests=1, request_tokens=60, response_tokens=12, total_tokens=72),
usage=RequestUsage(input_tokens=60, output_tokens=12),
model_name='gpt-4o',
timestamp=datetime.datetime(...),
),
@@ -317,7 +317,7 @@ print(result2.all_messages())
content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.'
)
],
usage=Usage(requests=1, request_tokens=61, response_tokens=26, total_tokens=87),
usage=RequestUsage(input_tokens=61, output_tokens=26),
model_name='gemini-1.5-pro',
timestamp=datetime.datetime(...),
),
2 changes: 1 addition & 1 deletion docs/models/index.md
@@ -117,7 +117,7 @@ print(response.all_messages())
model_name='claude-3-5-sonnet-latest',
timestamp=datetime.datetime(...),
kind='response',
vendor_id=None,
provider_request_id=None,
),
]
"""
4 changes: 2 additions & 2 deletions docs/models/openai.md
@@ -272,7 +272,7 @@ result = agent.run_sync('Where were the olympics held in 2012?')
print(result.output)
#> city='London' country='United Kingdom'
print(result.usage())
#> Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65)
#> RunUsage(requests=1, input_tokens=57, output_tokens=8)
```

#### Example using a remote server
@@ -301,7 +301,7 @@ result = agent.run_sync('Where were the olympics held in 2012?')
print(result.output)
#> city='London' country='United Kingdom'
print(result.usage())
#> Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65)
#> RunUsage(requests=1, input_tokens=57, output_tokens=8)
```

1. The name of the model running on the remote server
12 changes: 6 additions & 6 deletions docs/multi-agent-applications.md
@@ -53,7 +53,7 @@ result = joke_selection_agent.run_sync(
print(result.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.
print(result.usage())
#> Usage(requests=3, request_tokens=204, response_tokens=24, total_tokens=228)
#> RunUsage(requests=3, input_tokens=204, output_tokens=24)
```

1. The "parent" or controlling agent.
@@ -144,7 +144,7 @@ async def main():
print(result.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.
print(result.usage()) # (6)!
#> Usage(requests=4, request_tokens=309, response_tokens=32, total_tokens=341)
#> RunUsage(requests=4, input_tokens=309, output_tokens=32)
```

1. Define a dataclass to hold the client and API key dependencies.
@@ -188,7 +188,7 @@ from rich.prompt import Prompt

from pydantic_ai import Agent, RunContext
from pydantic_ai.messages import ModelMessage
from pydantic_ai.usage import Usage, UsageLimits
from pydantic_ai.usage import RunUsage, UsageLimits


class FlightDetails(BaseModel):
@@ -221,7 +221,7 @@ async def flight_search(
usage_limits = UsageLimits(request_limit=15) # (3)!


async def find_flight(usage: Usage) -> Union[FlightDetails, None]: # (4)!
async def find_flight(usage: RunUsage) -> Union[FlightDetails, None]: # (4)!
message_history: Union[list[ModelMessage], None] = None
for _ in range(3):
prompt = Prompt.ask(
@@ -259,7 +259,7 @@ seat_preference_agent = Agent[None, Union[SeatPreference, Failed]](  # (5)!
)


async def find_seat(usage: Usage) -> SeatPreference: # (6)!
async def find_seat(usage: RunUsage) -> SeatPreference: # (6)!
message_history: Union[list[ModelMessage], None] = None
while True:
answer = Prompt.ask('What seat would you like?')
@@ -278,7 +278,7 @@ async def find_seat(usage: RunUsage) -> SeatPreference:  # (6)!


async def main(): # (7)!
usage: Usage = Usage()
usage: RunUsage = RunUsage()

opt_flight_details = await find_flight(usage)
if opt_flight_details is not None:
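The multi-agent changes above thread a single `RunUsage` through `find_flight` and `find_seat`, so delegated runs all report into one total. A rough self-contained sketch of that pattern (the stand-in class, helper function, and token numbers below are illustrative, not the library's actual code):

```python
from dataclasses import dataclass


@dataclass
class RunUsage:
    """Illustrative stand-in for pydantic_ai.usage.RunUsage."""

    requests: int = 0
    input_tokens: int = 0
    output_tokens: int = 0


def delegated_run(usage: RunUsage, input_tokens: int, output_tokens: int) -> None:
    # Each delegated agent run adds its own usage to the caller's RunUsage,
    # so the outer flow reports one combined total.
    usage.requests += 1
    usage.input_tokens += input_tokens
    usage.output_tokens += output_tokens


usage = RunUsage()
delegated_run(usage, 204, 24)  # e.g. a flight-search run
delegated_run(usage, 105, 8)  # e.g. a seat-preference run
print(usage)
#> RunUsage(requests=2, input_tokens=309, output_tokens=32)
```

Passing the same mutable object to every sub-run is what lets `UsageLimits` apply across the whole conversation rather than per agent.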
4 changes: 2 additions & 2 deletions docs/output.md
@@ -1,6 +1,6 @@
"Output" refers to the final value returned from [running an agent](agents.md#running-agents). This can be plain text, [structured data](#structured-output), or the result of a [function](#output-functions) called with arguments provided by the model.

The output is wrapped in [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] or [`StreamedRunResult`][pydantic_ai.result.StreamedRunResult] so that you can access other data, like [usage][pydantic_ai.usage.Usage] of the run and [message history](message-history.md#accessing-messages-from-results).
The output is wrapped in [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] or [`StreamedRunResult`][pydantic_ai.result.StreamedRunResult] so that you can access other data, like [usage][pydantic_ai.usage.RunUsage] of the run and [message history](message-history.md#accessing-messages-from-results).

Both `AgentRunResult` and `StreamedRunResult` are generic in the data they wrap, so typing information about the data returned by the agent is preserved.

@@ -24,7 +24,7 @@ result = agent.run_sync('Where were the olympics held in 2012?')
print(result.output)
#> city='London' country='United Kingdom'
print(result.usage())
#> Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65)
#> RunUsage(requests=1, input_tokens=57, output_tokens=8)
```

_(This example is complete, it can be run "as is")_
20 changes: 7 additions & 13 deletions docs/testing.md
@@ -97,7 +97,7 @@ from pydantic_ai.messages import (
UserPromptPart,
ModelRequest,
)
from pydantic_ai.usage import Usage
from pydantic_ai.usage import RequestUsage

from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent
@@ -141,12 +141,9 @@ async def test_forecast():
tool_call_id=IsStr(),
)
],
usage=Usage(
requests=1,
request_tokens=71,
response_tokens=7,
total_tokens=78,
details=None,
usage=RequestUsage(
input_tokens=71,
output_tokens=7,
),
model_name='test',
timestamp=IsNow(tz=timezone.utc),
@@ -167,12 +164,9 @@ async def test_forecast():
content='{"weather_forecast":"Sunny with a chance of rain"}',
)
],
usage=Usage(
requests=1,
request_tokens=77,
response_tokens=16,
total_tokens=93,
details=None,
usage=RequestUsage(
input_tokens=77,
output_tokens=16,
),
model_name='test',
timestamp=IsNow(tz=timezone.utc),
8 changes: 3 additions & 5 deletions docs/tools.md
@@ -95,7 +95,7 @@ print(dice_result.all_messages())
tool_name='roll_dice', args={}, tool_call_id='pyd_ai_tool_call_id'
)
],
usage=Usage(requests=1, request_tokens=90, response_tokens=2, total_tokens=92),
usage=RequestUsage(input_tokens=90, output_tokens=2),
model_name='gemini-1.5-flash',
timestamp=datetime.datetime(...),
),
@@ -115,7 +115,7 @@ print(dice_result.all_messages())
tool_name='get_player_name', args={}, tool_call_id='pyd_ai_tool_call_id'
)
],
usage=Usage(requests=1, request_tokens=91, response_tokens=4, total_tokens=95),
usage=RequestUsage(input_tokens=91, output_tokens=4),
model_name='gemini-1.5-flash',
timestamp=datetime.datetime(...),
),
@@ -135,9 +135,7 @@ print(dice_result.all_messages())
content="Congratulations Anne, you guessed correctly! You're a winner!"
)
],
usage=Usage(
requests=1, request_tokens=92, response_tokens=12, total_tokens=104
),
usage=RequestUsage(input_tokens=92, output_tokens=12),
model_name='gemini-1.5-flash',
timestamp=datetime.datetime(...),
),
6 changes: 3 additions & 3 deletions examples/pydantic_ai_examples/flight_booking.py
@@ -13,7 +13,7 @@

from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.messages import ModelMessage
from pydantic_ai.usage import Usage, UsageLimits
from pydantic_ai.usage import RunUsage, UsageLimits

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
@@ -182,7 +182,7 @@ async def main():
req_date=datetime.date(2025, 1, 10),
)
message_history: list[ModelMessage] | None = None
usage: Usage = Usage()
usage: RunUsage = RunUsage()
# run the agent until a satisfactory flight is found
while True:
result = await search_agent.run(
@@ -213,7 +213,7 @@
)


async def find_seat(usage: Usage) -> SeatPreference:
async def find_seat(usage: RunUsage) -> SeatPreference:
message_history: list[ModelMessage] | None = None
while True:
answer = Prompt.ask('What seat would you like?')
4 changes: 2 additions & 2 deletions pydantic_ai_slim/pydantic_ai/_agent_graph.py
@@ -76,7 +76,7 @@ class GraphAgentState:
"""State kept across the execution of the agent graph."""

message_history: list[_messages.ModelMessage]
usage: _usage.Usage
usage: _usage.RunUsage
retries: int
run_step: int

@@ -342,7 +342,7 @@ async def _make_request(

model_settings, model_request_parameters, message_history, _ = await self._prepare_request(ctx)
model_response = await ctx.deps.model.request(message_history, model_settings, model_request_parameters)
ctx.state.usage.incr(_usage.Usage())
ctx.state.usage.requests += 1

return self._finish_handling(ctx, model_response)
