Commit d146551

docs: #1722 LiteLLM usage tracking (#1743)
Fixes #1722

## Summary
- explain how to enable usage reporting for LiteLLM integrations with `ModelSettings(include_usage=True)`
- link LiteLLM docs and general usage guide to the new configuration snippet

## Testing
- `make build-docs`

https://chatgpt.com/codex/tasks/task_i_68c7f4e8f0748320b28b17d68cfd0bf4
1 parent c583dfc commit d146551

File tree

2 files changed, +35 −0 lines changed


docs/models/litellm.md

Lines changed: 17 additions & 0 deletions
@@ -71,3 +71,20 @@ if __name__ == "__main__":
```python
    asyncio.run(main(model, api_key))
```

## Tracking usage data

If you want LiteLLM responses to populate the Agents SDK usage metrics, pass `ModelSettings(include_usage=True)` when creating your agent.

```python
from agents import Agent, ModelSettings
from agents.extensions.models.litellm_model import LitellmModel

agent = Agent(
    name="Assistant",
    model=LitellmModel(model="your/model", api_key="..."),
    model_settings=ModelSettings(include_usage=True),
)
```

With `include_usage=True`, LiteLLM requests report token and request counts through `result.context_wrapper.usage`, just like the built-in OpenAI models.
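As a rough illustration of the counters involved, here is a minimal sketch using a stand-in dataclass (not the SDK's own class; a real run, which needs a configured model and API key, would surface these fields on `result.context_wrapper.usage`):

```python
from dataclasses import dataclass


# Stand-in for the SDK's usage object, for illustration only;
# in a real run these counters come from result.context_wrapper.usage.
@dataclass
class Usage:
    requests: int = 0
    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0


usage = Usage(requests=1, input_tokens=40, output_tokens=12, total_tokens=52)
print("Requests:", usage.requests)
print("Input tokens:", usage.input_tokens)
print("Output tokens:", usage.output_tokens)
print("Total tokens:", usage.total_tokens)
```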

docs/usage.md

Lines changed: 18 additions & 0 deletions
@@ -28,6 +28,24 @@ print("Total tokens:", usage.total_tokens)
Usage is aggregated across all model calls during the run (including tool calls and handoffs).

### Enabling usage with LiteLLM models

LiteLLM providers do not report usage metrics by default. When you are using [`LitellmModel`](models/litellm.md), pass `ModelSettings(include_usage=True)` to your agent so that LiteLLM responses populate `result.context_wrapper.usage`.

```python
from agents import Agent, ModelSettings, Runner
from agents.extensions.models.litellm_model import LitellmModel

agent = Agent(
    name="Assistant",
    model=LitellmModel(model="your/model", api_key="..."),
    model_settings=ModelSettings(include_usage=True),
)

result = await Runner.run(agent, "What's the weather in Tokyo?")
print(result.context_wrapper.usage.total_tokens)
```

## Accessing usage with sessions

When you use a `Session` (e.g., `SQLiteSession`), each call to `Runner.run(...)` returns usage for that specific run. Sessions maintain conversation history for context, but each run's usage is independent.
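To illustrate the per-run independence, here is a minimal sketch (again with a stand-in dataclass rather than the SDK's real usage object): since each `Runner.run(...)` result reports only its own usage, a session-wide total has to be accumulated by the caller.

```python
from dataclasses import dataclass


# Stand-in for per-run usage, for illustration; with the real SDK each
# Runner.run(...) result carries its own result.context_wrapper.usage.
@dataclass
class Usage:
    requests: int = 0
    total_tokens: int = 0


# Two runs against the same session: usage is reported per run,
# so a session-wide total must be summed manually.
per_run = [Usage(requests=1, total_tokens=52), Usage(requests=1, total_tokens=87)]

session_total = Usage()
for u in per_run:
    session_total.requests += u.requests
    session_total.total_tokens += u.total_tokens

print(session_total.requests, session_total.total_tokens)  # 2 139
```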
