docs/models/overview.md
137 additions & 2 deletions
@@ -153,8 +153,16 @@ attributes showing the queue depth and configured limits. The `name` parameter o
 ## Fallback Model
 
 You can use [`FallbackModel`][pydantic_ai.models.fallback.FallbackModel] to attempt multiple models
-in sequence until one successfully returns a result. Under the hood, Pydantic AI automatically switches
-from one model to the next if the current model returns a 4xx or 5xx status code.
+in sequence until one succeeds. Pydantic AI can switch to the next model when the current model
+raises an exception (like a 4xx/5xx API error) **or** when the response content indicates a semantic
+failure (like a truncated response or a failed built-in tool call).
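The switching behavior described above can be pictured as a loop over the candidate models. The following is a minimal, self-contained sketch of that idea, not Pydantic AI's actual implementation: `ModelAPIError` is a local stand-in class, the models are plain callables, and `run_with_fallback` and `reject_response` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Sequence


class ModelAPIError(Exception):
    """Local stand-in for pydantic_ai.exceptions.ModelAPIError (assumption)."""


def run_with_fallback(
    models: Sequence[Callable[[str], str]],
    prompt: str,
    reject_response: Callable[[str], bool] = lambda response: False,
) -> str:
    """Try each model in order and return the first accepted response."""
    failures = []
    for index, model in enumerate(models):
        try:
            response = model(prompt)
        except ModelAPIError as exc:
            # Exception-based fallback: record the error, try the next model.
            failures.append(f'model {index}: {exc}')
            continue
        if reject_response(response):
            # Response-based fallback: the call succeeded, but the content
            # signals a semantic failure.
            failures.append(f'model {index}: rejected response')
            continue
        return response
    raise RuntimeError('all models failed: ' + '; '.join(failures))


def overloaded(prompt: str) -> str:
    raise ModelAPIError('503 service unavailable')


def truncated(prompt: str) -> str:
    return ''  # pretend the response was cut off


def healthy(prompt: str) -> str:
    return f'answer to {prompt!r}'


result = run_with_fallback(
    [overloaded, truncated, healthy],
    'hello',
    reject_response=lambda response: response == '',
)
print(result)
#> answer to 'hello'
```

In this sketch the first model fails with an exception, the second returns a response that the check rejects, and the third is accepted. If every model fails, the collected failures are reported together.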
+
+By default, fallback triggers on [`ModelAPIError`][pydantic_ai.exceptions.ModelAPIError] (4xx/5xx API errors),
+so you don't need to configure anything for the most common use case.
+
+This behavior is controlled by the `fallback_on` parameter (see
+[`FallbackModel`][pydantic_ai.models.fallback.FallbackModel]), which accepts exception types,
+exception handlers, and response handlers, all of which can be sync or async.
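Because handlers may be sync or async, whatever calls them has to cope with both. One common way to do that is to call the handler and await the result only when it is awaitable. This is an assumption about the general approach, not a description of Pydantic AI internals, and `call_handler` is a hypothetical helper.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Union

Handler = Callable[[Any], Union[bool, Awaitable[bool]]]


async def call_handler(handler: Handler, value: Any) -> bool:
    """Call a sync or async handler and always return a plain bool."""
    result = handler(value)
    if inspect.isawaitable(result):
        result = await result
    return bool(result)


def sync_handler(exc: Exception) -> bool:
    return isinstance(exc, TimeoutError)


async def async_handler(exc: Exception) -> bool:
    await asyncio.sleep(0)  # placeholder for real async work
    return isinstance(exc, TimeoutError)


async def main() -> list:
    error = TimeoutError('upstream timed out')
    return [
        await call_handler(sync_handler, error),
        await call_handler(async_handler, error),
        await call_handler(sync_handler, ValueError('bad input')),
    ]


results = asyncio.run(main())
print(results)
#> [True, True, False]
```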
 
 !!! note
     The provider SDKs on which Models are based (like OpenAI, Anthropic, etc.) often have built-in retry logic that can delay the `FallbackModel` from activating.
@@ -305,6 +313,133 @@ passing a custom `fallback_on` argument to the `FallbackModel` constructor.
 !!! note
     Validation errors (from [structured output](../output.md#structured-output) or [tool parameters](../tools.md)) do **not** trigger fallback. These errors use the [retry mechanism](../agent.md#reflection-and-self-correction) instead, which re-prompts the same model to try again. This is intentional: validation errors stem from the non-deterministic nature of LLMs and may succeed on retry, whereas API errors (4xx/5xx) generally indicate issues that won't resolve by retrying the same request.
 
+### Response-Based Fallback
+
+In addition to exception-based fallback, you can also trigger fallback based on the **content** of a model's response. This is useful when a model returns a successful HTTP response (no exception), but the response content indicates a semantic failure, for example an unexpected finish reason or a built-in tool reporting failure.
+
+!!! note "Non-streaming only"
+    Response-based fallback currently only works with non-streaming requests (`agent.run()` and `agent.run_sync()`).
+    For streaming requests (`agent.run_stream()`), only exception-based fallback is supported.
+
+The `fallback_on` parameter accepts:
+
+- A tuple of exception types: `(ModelAPIError, ModelHTTPError)`
+- An exception handler (sync or async): `lambda exc: isinstance(exc, MyError)`
+- A response handler (sync or async): `def check(r: ModelResponse) -> bool`
+- A list mixing all of the above: `[ModelAPIError, exc_handler, response_handler]`
+
+Handler type is auto-detected by inspecting type hints on the first parameter. If the first parameter is hinted as [`ModelResponse`][pydantic_ai.messages.ModelResponse], it's a response handler. Otherwise (including untyped handlers and lambdas), it's an exception handler.
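The detection rule above can be illustrated with the standard library alone. This is a rough sketch under stated assumptions rather than the library's own code: `ModelResponse` is a local stand-in class, and `is_response_handler` is a hypothetical helper name.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional, get_type_hints


@dataclass
class ModelResponse:
    """Local stand-in for pydantic_ai.messages.ModelResponse (assumption)."""

    finish_reason: Optional[str] = None


def is_response_handler(handler: Callable[..., Any]) -> bool:
    """Return True if the handler's first parameter is hinted as ModelResponse."""
    try:
        parameters = list(inspect.signature(handler).parameters.values())
        hints = get_type_hints(handler)
    except Exception:
        # Missing or unresolvable annotations: treat as an exception handler.
        return False
    if not parameters:
        return False
    return hints.get(parameters[0].name) is ModelResponse


def check_truncated(response: ModelResponse) -> bool:
    return response.finish_reason == 'length'


def untyped(value):
    return False


print(is_response_handler(check_truncated))
#> True
print(is_response_handler(untyped))
#> False
print(is_response_handler(lambda exc: isinstance(exc, ValueError)))
#> False
```

The sketch only matches an exact `ModelResponse` hint. How the real detection treats unions, optionals, or subclasses is not stated here, so a handler with a more elaborate annotation should be tested rather than assumed to work.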
+
+#### Finish Reason Example
+
+A simple use case is checking the model's finish reason, for example falling back if the response was truncated due to length limits:
+
+```python {title="fallback_on_finish_reason.py"}
+from pydantic_ai import Agent
+from pydantic_ai.messages import FinishReason, ModelResponse
+from pydantic_ai.models.fallback import FallbackModel
+When you pass a single response handler as `fallback_on` (as above), it **replaces** the default `(ModelAPIError,)` exception fallback entirely. This means API errors (4xx/5xx) will propagate as exceptions instead of triggering fallback to the next model.
+
+To keep exception-based fallback alongside a response handler, pass them together as a list (see the [mixed example below](#combining-handlers)).
+
+!!! note
+    Pydantic AI already handles some finish reasons automatically in the [agent loop](../agent.md):
+    responses with a `'length'` or `'content_filter'` finish reason raise exceptions (which `FallbackModel`
+    catches by default), and empty responses are retried. A response handler is useful for custom
+    checks beyond these built-in behaviors.
+
+#### Built-in Tool Failure Example
+
+A more complex use case is when using built-in tools like web search or URL fetching. For example, Google's [`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool] may return a successful response with a status indicating the URL fetch failed:
+
+```python {title="fallback_on_builtin_tool.py"}
+from pydantic_ai import Agent
+from pydantic_ai.messages import ModelResponse
+from pydantic_ai.models.anthropic import AnthropicModel
+from pydantic_ai.models.fallback import FallbackModel
+# If Google's web_fetch fails, automatically falls back to Anthropic
+result = agent.run_sync('Summarize https://ai.pydantic.dev')
+print(result.output)
+"""
+Pydantic AI is a Python agent framework for building production-grade LLM applications.
+"""
+```
+
+Response handlers receive the [`ModelResponse`][pydantic_ai.messages.ModelResponse] returned by the model and should return `True` to trigger fallback to the next model, or `False` to accept the response.
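To make the `True`/`False` contract concrete, here is a small standalone sketch of a custom check that rejects canned refusals. Everything in it is illustrative: `TextPart` and `ModelResponse` are simplified local stand-ins that only mirror the names of the real classes, and `REFUSAL_MARKERS` and `looks_like_refusal` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TextPart:
    """Local stand-in for pydantic_ai.messages.TextPart (assumption)."""

    content: str


@dataclass
class ModelResponse:
    """Local stand-in for pydantic_ai.messages.ModelResponse (assumption)."""

    parts: list = field(default_factory=list)


# Hypothetical phrases that this sketch treats as a refusal.
REFUSAL_MARKERS = ("i can't help with that", 'i cannot assist')


def looks_like_refusal(response: ModelResponse) -> bool:
    """Return True (fall back) if any text part contains a refusal marker."""
    text = ' '.join(
        part.content for part in response.parts if isinstance(part, TextPart)
    ).lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


accepted = ModelResponse(parts=[TextPart('Pydantic AI is an agent framework.')])
refused = ModelResponse(parts=[TextPart("Sorry, I can't help with that.")])

print(looks_like_refusal(accepted))  # keep this response
#> False
print(looks_like_refusal(refused))  # try the next model
#> True
```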
+
+#### Combining Handlers
+
+You can combine exception types, exception handlers, and response handlers in a single list: