
Commit 4941468

Add tenacity utilities/integration for improved retry handling (#2282)

dmontagu and DouweM authored
Co-authored-by: Douwe Maan <[email protected]>
1 parent 5cf372a · commit 4941468

File tree: 8 files changed, +1174 −4 lines changed

docs/api/retries.md

Lines changed: 3 additions & 0 deletions

# `pydantic_ai.retries`

::: pydantic_ai.retries

docs/retries.md

Lines changed: 338 additions & 0 deletions

# HTTP Request Retries

Pydantic AI provides retry functionality for HTTP requests made by model providers through custom HTTP transports.
This is particularly useful for handling transient failures like rate limits, network timeouts, or temporary server errors.

## Overview

The retry functionality is built on top of the [tenacity](https://github.com/jd/tenacity) library and integrates
seamlessly with httpx clients. You can configure retry behavior for any provider that accepts a custom HTTP client.

## Installation

To use the retry transports, you need to install `tenacity`, which you can do via the `retries` dependency group:

```bash
pip/uv-add 'pydantic-ai-slim[retries]'
```

## Usage Example

Here's an example of adding retry functionality with smart retry handling:

```python {title="smart_retry_example.py"}
from httpx import AsyncClient, HTTPStatusError
from tenacity import (
    AsyncRetrying,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type
)

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after
from pydantic_ai.providers.openai import OpenAIProvider

def create_retrying_client():
    """Create a client with smart retry handling for multiple error types."""

    def should_retry_status(response):
        """Raise exceptions for retryable HTTP status codes."""
        if response.status_code in (429, 502, 503, 504):
            response.raise_for_status()  # This will raise HTTPStatusError

    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            # Retry on HTTP errors and connection issues
            retry=retry_if_exception_type((HTTPStatusError, ConnectionError)),
            # Smart waiting: respects Retry-After headers, falls back to exponential backoff
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300
            ),
            # Stop after 5 attempts
            stop=stop_after_attempt(5),
            # Re-raise the last exception if all retries fail
            reraise=True
        ),
        validate_response=should_retry_status
    )
    return AsyncClient(transport=transport)

# Use the retrying client with a model
client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

## Wait Strategies

### wait_retry_after

The `wait_retry_after` function is a smart wait strategy that automatically respects HTTP `Retry-After` headers:

```python {title="wait_strategy_example.py"}
from pydantic_ai.retries import wait_retry_after
from tenacity import wait_exponential

# Basic usage - respects Retry-After headers, falls back to exponential backoff
wait_strategy_1 = wait_retry_after()

# Custom configuration
wait_strategy_2 = wait_retry_after(
    fallback_strategy=wait_exponential(multiplier=2, max=120),
    max_wait=600  # Never wait more than 10 minutes
)
```

This wait strategy:

- Automatically parses `Retry-After` headers from HTTP 429 responses
- Supports both seconds format (`"30"`) and HTTP date format (`"Wed, 21 Oct 2015 07:28:00 GMT"`); both formats are illustrated in the sketch below
- Falls back to your chosen strategy when no header is present
- Respects the `max_wait` limit to prevent excessive delays

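As a rough illustration of the first two bullets, here's how a `Retry-After` value can be converted into a number of seconds to wait. This standalone sketch is for intuition only and is not the library's actual implementation; `wait_retry_after` performs the equivalent parsing internally:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def seconds_from_retry_after(value: str) -> float:
    """Convert a Retry-After header value into seconds to wait (illustrative helper)."""
    try:
        # Seconds format, e.g. "30"
        return float(value)
    except ValueError:
        # HTTP date format, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        target = parsedate_to_datetime(value)
        return max(0.0, (target - datetime.now(timezone.utc)).total_seconds())

print(seconds_from_retry_after('30'))
#> 30.0
```
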
## Transport Classes

### AsyncTenacityTransport

For asynchronous HTTP clients (recommended for most use cases):

```python {title="async_transport_example.py"}
from httpx import AsyncClient
from tenacity import AsyncRetrying, stop_after_attempt
from pydantic_ai.retries import AsyncTenacityTransport

# Create the basic components
async_retrying = AsyncRetrying(stop=stop_after_attempt(3), reraise=True)

def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.

    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()

# Create the transport
transport = AsyncTenacityTransport(
    controller=async_retrying,  # AsyncRetrying instance
    validate_response=validator  # Optional response validator
)

# Create a client using the transport:
client = AsyncClient(transport=transport)
```

### TenacityTransport

For synchronous HTTP clients:

```python {title="sync_transport_example.py"}
from httpx import Client
from tenacity import Retrying, stop_after_attempt
from pydantic_ai.retries import TenacityTransport

# Create the basic components
retrying = Retrying(stop=stop_after_attempt(3), reraise=True)

def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.

    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()

# Create the transport
transport = TenacityTransport(
    controller=retrying,  # Retrying instance
    validate_response=validator  # Optional response validator
)

# Create a client using the transport
client = Client(transport=transport)
```

## Common Retry Patterns

### Rate Limit Handling with Retry-After Support

```python {title="rate_limit_handling.py"}
from httpx import AsyncClient, HTTPStatusError
from tenacity import AsyncRetrying, stop_after_attempt, retry_if_exception_type, wait_exponential
from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after

def create_rate_limit_client():
    """Create a client that respects Retry-After headers from rate limiting responses."""
    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            retry=retry_if_exception_type(HTTPStatusError),
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300  # Don't wait more than 5 minutes
            ),
            stop=stop_after_attempt(10),
            reraise=True
        ),
        validate_response=lambda r: r.raise_for_status()  # Raises HTTPStatusError for 4xx/5xx
    )
    return AsyncClient(transport=transport)

# Example usage
client = create_rate_limit_client()
# Client is now ready to use with any HTTP requests and will respect Retry-After headers
```

The `wait_retry_after` function automatically detects `Retry-After` headers in 429 (rate limit) responses and waits for the specified time. If no header is present, it falls back to exponential backoff.

### Network Error Handling

```python {title="network_error_handling.py"}
187+
import httpx
188+
from tenacity import AsyncRetrying, retry_if_exception_type, wait_exponential, stop_after_attempt
189+
from pydantic_ai.retries import AsyncTenacityTransport
190+
191+
def create_network_resilient_client():
192+
"""Create a client that handles network errors with retries."""
193+
transport = AsyncTenacityTransport(
194+
controller=AsyncRetrying(
195+
retry=retry_if_exception_type((
196+
httpx.TimeoutException,
197+
httpx.ConnectError,
198+
httpx.ReadError
199+
)),
200+
wait=wait_exponential(multiplier=1, max=10),
201+
stop=stop_after_attempt(3),
202+
reraise=True
203+
)
204+
)
205+
return httpx.AsyncClient(transport=transport)
206+
207+
# Example usage
208+
client = create_network_resilient_client()
209+
# Client will now retry on timeout, connection, and read errors
210+
```
211+
212+
### Custom Retry Logic

```python {title="custom_retry_logic.py"}
import httpx
from tenacity import AsyncRetrying, retry_if_exception, wait_exponential, stop_after_attempt
from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after

def create_custom_retry_client():
    """Create a client with custom retry logic."""
    def custom_retry_condition(exception):
        """Custom logic to determine if we should retry."""
        if isinstance(exception, httpx.HTTPStatusError):
            # Retry on server errors but not client errors
            return 500 <= exception.response.status_code < 600
        return isinstance(exception, (httpx.TimeoutException, httpx.ConnectError))

    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            # Wrap the predicate so tenacity calls it with each raised exception
            retry=retry_if_exception(custom_retry_condition),
            # Use wait_retry_after for smart waiting on rate limits,
            # with custom exponential backoff as fallback
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=2, max=30),
                max_wait=120
            ),
            stop=stop_after_attempt(5),
            reraise=True
        ),
        validate_response=lambda r: r.raise_for_status()
    )
    return httpx.AsyncClient(transport=transport)

client = create_custom_retry_client()
# Client will retry server errors (5xx) and network errors, but not client errors (4xx)
```

## Using with Different Providers

The retry transports work with any provider that accepts a custom HTTP client:

### OpenAI

```python {title="openai_with_retries.py" requires="smart_retry_example.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

### Anthropic

```python {title="anthropic_with_retries.py" requires="smart_retry_example.py"}
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = AnthropicModel('claude-3-5-sonnet-20241022', provider=AnthropicProvider(http_client=client))
agent = Agent(model)
```

### Any OpenAI-Compatible Provider

```python {title="openai_compatible_with_retries.py" requires="smart_retry_example.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel(
    'your-model-name',  # Replace with actual model name
    provider=OpenAIProvider(
        base_url='https://api.example.com/v1',  # Replace with actual API URL
        api_key='your-api-key',  # Replace with actual API key
        http_client=client
    )
)
agent = Agent(model)
```

## Best Practices

1. **Start Conservative**: Begin with a small number of retries (3-5) and reasonable wait times.

2. **Use Exponential Backoff**: This helps avoid overwhelming servers during outages.

3. **Set Maximum Wait Times**: Prevent indefinite delays with reasonable maximum wait times.

4. **Handle Rate Limits Properly**: Respect `Retry-After` headers when possible.

5. **Log Retry Attempts**: Add logging to monitor retry behavior in production, as in the sketch after this list. (This will be picked up by Logfire automatically if you instrument httpx.)

6. **Consider Circuit Breakers**: For high-traffic applications, consider implementing circuit breaker patterns.

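For point 5, tenacity ships a `before_sleep_log` hook that logs each failed attempt before the wait between retries begins. A minimal sketch combining it with `AsyncTenacityTransport` (the logger name and log level are illustrative choices, not part of the library):

```python
import logging

from httpx import AsyncClient
from tenacity import AsyncRetrying, before_sleep_log, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport

logger = logging.getLogger('my_app.retries')  # hypothetical logger name

transport = AsyncTenacityTransport(
    controller=AsyncRetrying(
        wait=wait_exponential(multiplier=1, max=60),
        stop=stop_after_attempt(5),
        # Log a warning, including the exception, before sleeping between attempts
        before_sleep=before_sleep_log(logger, logging.WARNING),
        reraise=True
    ),
    validate_response=lambda r: r.raise_for_status()
)
client = AsyncClient(transport=transport)
```
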
## Error Handling

The retry transports will re-raise the last exception if all retry attempts fail. Make sure to handle these appropriately in your application:

```python {title="error_handling_example.py" requires="smart_retry_example.py"}
320+
from pydantic_ai import Agent
321+
from pydantic_ai.models.openai import OpenAIModel
322+
from pydantic_ai.providers.openai import OpenAIProvider
323+
324+
from smart_retry_example import create_retrying_client
325+
326+
client = create_retrying_client()
327+
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
328+
agent = Agent(model)
329+
```
330+
331+
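Because the examples above configure `reraise=True`, a run that exhausts its retries surfaces the final exception directly. A minimal sketch of catching it around an agent call, assuming the response validator raises `HTTPStatusError` as in `smart_retry_example.py`:

```python
import asyncio

from httpx import HTTPStatusError

from error_handling_example import agent

async def main():
    try:
        result = await agent.run('Tell me a joke.')
        print(result.output)
    except HTTPStatusError as exc:
        # All retry attempts failed; the last HTTP error was re-raised
        print(f'Request failed after retries: {exc.response.status_code}')

asyncio.run(main())
```
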
## Performance Considerations

- Retries add latency to requests, especially with exponential backoff
- Consider the total timeout for your application when configuring retry behavior (see the sketch below)
- Monitor retry rates to detect systemic issues
- Use async transports for better concurrency when handling multiple requests

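On the second point, note that per-attempt timeouts compose with the retry policy: the httpx client timeout bounds each individual attempt, while the retry controller bounds the number of attempts and the waits between them. A minimal sketch (all values are illustrative):

```python
import httpx
from tenacity import AsyncRetrying, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport

transport = AsyncTenacityTransport(
    controller=AsyncRetrying(
        wait=wait_exponential(multiplier=1, max=10),
        # Worst case is roughly: 3 attempts x client timeout, plus the waits in between
        stop=stop_after_attempt(3),
        reraise=True
    )
)
client = httpx.AsyncClient(
    transport=transport,
    timeout=httpx.Timeout(10.0, connect=5.0)  # Bounds each individual attempt
)
```
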
For more advanced retry configurations, refer to the [tenacity documentation](https://tenacity.readthedocs.io/).

mkdocs.yml

Lines changed: 2 additions & 0 deletions

@@ -43,6 +43,7 @@ nav:
     - thinking.md
     - direct.md
     - common-tools.md
+    - retries.md
     - MCP:
         - mcp/index.md
         - mcp/client.md
@@ -101,6 +102,7 @@ nav:
     - api/models/mcp-sampling.md
     - api/profiles.md
     - api/providers.md
+    - api/retries.md
     - api/pydantic_graph/graph.md
     - api/pydantic_graph/nodes.md
     - api/pydantic_graph/persistence.md
