Initial Checks
- I'm using the latest version of Pydantic AI
- I've searched for my issue in the issue tracker before opening this issue
Description
When using run_stream, my code is handling 429 errors by doing something like:
    try:
        agent.run_stream(...)
    except ModelHTTPError as e:
        if e.status_code == 429:
            retry(...)
But I'm still seeing unhandled raw 429 google.genai.errors.ClientError showing up in production.
This is because the library only catches and wraps API errors that occur during the initial request setup in _generate_content:
pydantic-ai/pydantic_ai_slim/pydantic_ai/models/google.py
Lines 507 to 516 in bc7f6a2
    try:
        return await func(model=self._model_name, contents=contents, config=config)  # type: ignore
    except errors.APIError as e:
        if (status_code := e.code) >= 400:
            raise ModelHTTPError(
                status_code=status_code,
                model_name=self._model_name,
                body=cast(Any, e.details),  # pyright: ignore[reportUnknownMemberType]
            ) from e
        raise ModelAPIError(model_name=self._model_name, message=str(e)) from e
For example, errors that occur during _process_streamed_response here will not be caught:
pydantic-ai/pydantic_ai_slim/pydantic_ai/models/google.py
Lines 382 to 383 in bc7f6a2
    response = await self._generate_content(messages, True, model_settings, model_request_parameters)
    yield await self._process_streamed_response(response, model_request_parameters)  # type: ignore
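To make the gap concrete, here is a minimal, self-contained sketch. It does not use the real classes: `FakeAPIError` and `FakeModelHTTPError` are stand-ins for `google.genai.errors.APIError` and `ModelHTTPError`, and `wrap_stream_errors` is a hypothetical helper that does not exist in the library. The point it illustrates is that the error conversion has to cover iteration over the stream, not only the call that creates it. How the actual fix should be structured inside GoogleModel is up to the maintainers.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator


class FakeAPIError(Exception):
    """Stand-in for google.genai.errors.APIError (assumption, not the real class)."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


class FakeModelHTTPError(Exception):
    """Stand-in for pydantic_ai's ModelHTTPError (assumption, not the real class)."""

    def __init__(self, status_code: int, model_name: str) -> None:
        super().__init__(f"status_code: {status_code}, model_name: {model_name}")
        self.status_code = status_code
        self.model_name = model_name


async def fake_stream() -> AsyncIterator[str]:
    """Yield one chunk, then fail mid-stream the way a late 429 would."""
    yield "chunk-1"
    raise FakeAPIError(429, "Rate limit exceeded")


async def wrap_stream_errors(
    stream: AsyncIterator[str], model_name: str
) -> AsyncIterator[str]:
    """Hypothetical helper: convert provider errors raised during iteration."""
    try:
        async for chunk in stream:
            yield chunk
    except FakeAPIError as e:
        # Same conversion the library already applies to the initial request.
        raise FakeModelHTTPError(status_code=e.code, model_name=model_name) from e


async def consume(stream: AsyncIterator[str]) -> tuple[list[str], Exception | None]:
    """Collect chunks and return the exception that ended the stream, if any."""
    chunks: list[str] = []
    try:
        async for chunk in stream:
            chunks.append(chunk)
    except Exception as e:
        return chunks, e
    return chunks, None
```

Consuming `fake_stream()` directly surfaces the raw `FakeAPIError`, while consuming `wrap_stream_errors(fake_stream(), "some-model")` surfaces a `FakeModelHTTPError` with `status_code == 429` and the original error preserved as `__cause__`.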
Retrying after rate limiting is a foundational feature that we should fully support.
For now, the workaround is for user code to handle the raw google.genai.errors.ClientError, which defeats the point of the library: allowing model-agnostic user code.
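As a rough sketch of what that workaround can look like in user code, the helper below retries on a 429 regardless of which error shape arrives. `http_status` and `retry_on_429` are hypothetical names, not part of any library. The only facts relied on are the ones visible in the snippets in this issue: ModelHTTPError exposes `status_code`, and the google-genai error exposes `code`. Duck typing is used so the sketch runs without either package installed; real code would more likely use `isinstance` checks against the actual classes.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


def http_status(exc: BaseException) -> int | None:
    """Return an HTTP status from either error shape, or None if there is none.

    Checks `status_code` (ModelHTTPError) first, then `code` (google-genai errors).
    """
    for attr in ("status_code", "code"):
        value = getattr(exc, attr, None)
        if isinstance(value, int):
            return value
    return None


async def retry_on_429(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Hypothetical helper: retry `call` with exponential backoff on HTTP 429."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except Exception as e:
            last_attempt = attempt == max_attempts - 1
            # Re-raise anything that is not a rate limit, or the final failure.
            if http_status(e) != 429 or last_attempt:
                raise
            await asyncio.sleep(base_delay * 2**attempt)
    raise RuntimeError("max_attempts must be at least 1")
```

For this to help with streaming, `call` has to wrap the whole operation, both opening the stream and consuming it, since the error can surface after the stream has been opened.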
Note
In addition to fixing this for GoogleModel, we should audit the other models to make sure they work properly.
To reproduce
Add this code to GoogleModel._process_streamed_response:
    raise ClientError(code=429, response_json={"error": {"message": "Rate limit exceeded"}}, response=None)
Note that the raw ClientError bubbles up to the top.
Python, Pydantic AI & LLM client version
- Python: 3.10
- Pydantic AI: 1.63.0
- LLM provider SDK: Google