Commit 8e0f4ef

sarth6, claude, and DouweM authored
Add response-based fallback support for FallbackModel (#3786)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Douwe Maan <douwe@pydantic.dev>
1 parent 62bc4dc commit 8e0f4ef

6 files changed (+872 -48 lines changed)
docs/models/overview.md

Lines changed: 137 additions & 2 deletions
@@ -153,8 +153,16 @@ attributes showing the queue depth and configured limits. The `name` parameter o
 ## Fallback Model

 You can use [`FallbackModel`][pydantic_ai.models.fallback.FallbackModel] to attempt multiple models
-in sequence until one successfully returns a result. Under the hood, Pydantic AI automatically switches
-from one model to the next if the current model returns a 4xx or 5xx status code.
+in sequence until one succeeds. Pydantic AI can switch to the next model when the current model
+raises an exception (like a 4xx/5xx API error) **or** when the response content indicates a semantic
+failure (like a truncated response or a failed built-in tool call).
+
+By default, fallback triggers on [`ModelAPIError`][pydantic_ai.exceptions.ModelAPIError] (4xx/5xx API errors),
+so you don't need to configure anything for the most common use case.
+
+This behavior is controlled by the `fallback_on` parameter (see
+[`FallbackModel`][pydantic_ai.models.fallback.FallbackModel]), which accepts exception types,
+exception handlers, and response handlers, all of which can be sync or async.

 !!! note
     The provider SDKs on which Models are based (like OpenAI, Anthropic, etc.) often have built-in retry logic that can delay the `FallbackModel` from activating.
@@ -305,6 +313,133 @@ passing a custom `fallback_on` argument to the `FallbackModel` constructor.
 !!! note
     Validation errors (from [structured output](../output.md#structured-output) or [tool parameters](../tools.md)) do **not** trigger fallback. These errors use the [retry mechanism](../agent.md#reflection-and-self-correction) instead, which re-prompts the same model to try again. This is intentional: validation errors stem from the non-deterministic nature of LLMs and may succeed on retry, whereas API errors (4xx/5xx) generally indicate issues that won't resolve by retrying the same request.

+### Response-Based Fallback
+
+In addition to exception-based fallback, you can also trigger fallback based on the **content** of a model's response. This is useful when a model returns a successful HTTP response (no exception), but the response content indicates a semantic failure, for example an unexpected finish reason or a built-in tool reporting failure.
+
+!!! note "Non-streaming only"
+    Response-based fallback currently only works with non-streaming requests (`agent.run()` and `agent.run_sync()`).
+    For streaming requests (`agent.run_stream()`), only exception-based fallback is supported.
+
+The `fallback_on` parameter accepts:
+
+- A tuple of exception types: `(ModelAPIError, ModelHTTPError)`
+- An exception handler (sync or async): `lambda exc: isinstance(exc, MyError)`
+- A response handler (sync or async): `def check(r: ModelResponse) -> bool`
+- A list mixing all of the above: `[ModelAPIError, exc_handler, response_handler]`
+
+Handler type is auto-detected by inspecting type hints on the first parameter. If the first parameter is hinted as [`ModelResponse`][pydantic_ai.messages.ModelResponse], it's a response handler. Otherwise (including untyped handlers and lambdas), it's an exception handler.
+
+#### Finish Reason Example
+
+A simple use case is checking the model's finish reason, for example falling back if the response was truncated due to length limits:
+
+```python {title="fallback_on_finish_reason.py"}
+from pydantic_ai import Agent
+from pydantic_ai.messages import FinishReason, ModelResponse
+from pydantic_ai.models.fallback import FallbackModel
+
+
+def bad_finish_reason(response: ModelResponse) -> bool:
+    """Fallback if the model stopped due to length limit, content filter, or error."""
+    reason: FinishReason | None = response.finish_reason
+    # Trigger fallback for problematic finish reasons
+    return reason in ('length', 'content_filter', 'error')
+
+
+fallback_model = FallbackModel(
+    'openai:gpt-5.2',
+    'anthropic:claude-sonnet-4-5',
+    fallback_on=bad_finish_reason,
+)
+
+agent = Agent(fallback_model)
+result = agent.run_sync('What is the capital of France?')
+print(result.output)
+#> The capital of France is Paris.
+```
+
+!!! warning "Solo response handlers replace default exception fallback"
+    When you pass a single response handler as `fallback_on` (as above), it **replaces** the default `(ModelAPIError,)` exception fallback entirely. This means API errors (4xx/5xx) will propagate as exceptions instead of triggering fallback to the next model.
+
+    To keep exception-based fallback alongside a response handler, pass them together as a list; see the [mixed example below](#combining-handlers).
+
+!!! note
+    Note that Pydantic AI already handles some finish reasons automatically in the [agent loop](../agent.md):
+    responses with a `'length'` or `'content_filter'` finish reason raise exceptions (which `FallbackModel`
+    catches by default), and empty responses are retried. A response handler is useful for custom
+    checks beyond these built-in behaviors.
+
+#### Built-in Tool Failure Example
+
+A more complex use case is when using built-in tools like web search or URL fetching. For example, Google's [`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool] may return a successful response with a status indicating the URL fetch failed:
+
+```python {title="fallback_on_builtin_tool.py"}
+from pydantic_ai import Agent
+from pydantic_ai.messages import ModelResponse
+from pydantic_ai.models.anthropic import AnthropicModel
+from pydantic_ai.models.fallback import FallbackModel
+from pydantic_ai.models.google import GoogleModel
+
+
+def web_fetch_failed(response: ModelResponse) -> bool:
+    """Check if a web_fetch built-in tool failed to retrieve content."""
+    for call, result in response.builtin_tool_calls:
+        if call.tool_name != 'web_fetch':
+            continue
+        if not isinstance(result.content, list):
+            continue
+        for item in result.content:
+            if isinstance(item, dict):
+                status = item.get('url_retrieval_status', '')
+                if status and status != 'URL_RETRIEVAL_STATUS_SUCCESS':
+                    return True
+    return False
+
+
+google_model = GoogleModel('gemini-2.5-flash')
+anthropic_model = AnthropicModel('claude-sonnet-4-5')
+
+# Auto-detected as response handler via type hint
+fallback_model = FallbackModel(
+    google_model,
+    anthropic_model,
+    fallback_on=web_fetch_failed,
+)
+
+agent = Agent(fallback_model)
+
+# If Google's web_fetch fails, automatically falls back to Anthropic
+result = agent.run_sync('Summarize https://ai.pydantic.dev')
+print(result.output)
+"""
+Pydantic AI is a Python agent framework for building production-grade LLM applications.
+"""
+```
+
+Response handlers receive the [`ModelResponse`][pydantic_ai.messages.ModelResponse] returned by the model and should return `True` to trigger fallback to the next model, or `False` to accept the response.
+
+#### Combining Handlers
+
+You can combine exception types, exception handlers, and response handlers in a single list:
+
+```python {title="fallback_on_mixed.py" requires="fallback_on_builtin_tool.py"}
+from pydantic_ai.exceptions import ModelAPIError
+from pydantic_ai.models.fallback import FallbackModel
+
+from fallback_on_builtin_tool import anthropic_model, google_model, web_fetch_failed
+
+fallback_model = FallbackModel(
+    google_model,
+    anthropic_model,
+    fallback_on=[
+        ModelAPIError,  # Exception type
+        lambda exc: 'rate limit' in str(exc).lower(),  # Exception handler (untyped lambda)
+        web_fetch_failed,  # Response handler (auto-detected via type hint)
+    ],
+)
+```
+
 ### Exception Handling in Middleware and Decorators

 When using `FallbackModel`, it's important to understand that [`FallbackExceptionGroup`][pydantic_ai.exceptions.FallbackExceptionGroup]
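
The response-based fallback flow documented in `docs/models/overview.md` can be illustrated with a small, library-free sketch. This is not Pydantic AI's implementation: `StubResponse`, `StubAPIError`, `run_with_fallback`, and the stub models are hypothetical names invented for illustration, and the sketch assumes each model is a plain synchronous callable that either raises or returns a response. It only shows the ordering of the two checks: exceptions first, then response handlers.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StubResponse:
    """Hypothetical stand-in for a model response."""

    text: str
    finish_reason: Optional[str] = None


class StubAPIError(Exception):
    """Hypothetical stand-in for an API error."""


def run_with_fallback(
    models: list,
    prompt: str,
    exception_types: tuple = (),
    response_handlers: tuple = (),
) -> StubResponse:
    """Try each model in order; move on when it raises or a handler rejects its response."""
    failures = []
    for index, model in enumerate(models):
        try:
            response = model(prompt)
        except exception_types as exc:
            # Exception-based fallback: only the listed exception types are caught.
            failures.append(f'model {index}: {exc!r}')
            continue
        if any(handler(response) for handler in response_handlers):
            # Response-based fallback: a handler returned True, so reject this response.
            failures.append(f'model {index}: response rejected')
            continue
        return response
    raise RuntimeError('all models failed: ' + '; '.join(failures))


def truncated(response: StubResponse) -> bool:
    return response.finish_reason == 'length'


def primary(prompt: str) -> StubResponse:
    return StubResponse('The capital of', finish_reason='length')


def secondary(prompt: str) -> StubResponse:
    return StubResponse('Paris', finish_reason='stop')


result = run_with_fallback(
    [primary, secondary],
    'What is the capital of France?',
    exception_types=(StubAPIError,),
    response_handlers=(truncated,),
)
print(result.text)
#> Paris
```

Leaving `exception_types` empty in this sketch means no exception is caught, which mirrors the documented warning that a lone response handler replaces the default exception fallback.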

pydantic_ai_slim/pydantic_ai/_function_schema.py

Lines changed: 5 additions & 23 deletions
@@ -12,7 +12,7 @@
 from typing import TYPE_CHECKING, Any, Concatenate, Literal, cast, get_origin

 from pydantic import ConfigDict
-from pydantic._internal import _decorators, _generate_schema, _typing_extra
+from pydantic._internal import _decorators, _generate_schema
 from pydantic._internal._config import ConfigWrapper
 from pydantic.fields import FieldInfo
 from pydantic.json_schema import GenerateJsonSchema
@@ -22,7 +22,7 @@

 from ._griffe import doc_descriptions
 from ._run_context import RunContext
-from ._utils import check_object_json_schema, is_async_callable, is_model_like, run_in_executor
+from ._utils import check_object_json_schema, get_first_param_type, is_async_callable, is_model_like, run_in_executor

 if TYPE_CHECKING:
     from .tools import DocstringFormat, ObjectJsonSchema
@@ -240,28 +240,10 @@ def _takes_ctx(callable_obj: TargetCallable[P, R]) -> TypeIs[WithCtx[P, R]]: #
     Returns:
         `True` if the callable takes a `RunContext` as first argument, `False` otherwise.
     """
-    try:
-        sig = signature(callable_obj)
-    except ValueError:
-        return False
-    try:
-        first_param_name = next(iter(sig.parameters.keys()))
-    except StopIteration:
+    first_param_type = get_first_param_type(callable_obj)
+    if first_param_type is None:
         return False
-    else:
-        # See https://github.com/pydantic/pydantic/pull/11451 for a similar implementation in Pydantic
-        if not isinstance(callable_obj, _decorators._function_like):  # pyright: ignore[reportPrivateUsage]
-            call_func = getattr(type(callable_obj), '__call__', None)
-            if call_func is not None:
-                callable_obj = call_func
-            else:
-                return False  # pragma: no cover
-
-        type_hints = _typing_extra.get_function_type_hints(_decorators.unwrap_wrapped_function(callable_obj))
-        annotation = type_hints.get(first_param_name)
-        if annotation is None:
-            return False
-        return True is not sig.empty and _is_call_ctx(annotation)
+    return _is_call_ctx(first_param_type)


 def _build_schema(

pydantic_ai_slim/pydantic_ai/_utils.py

Lines changed: 40 additions & 0 deletions
@@ -27,6 +27,7 @@

 from anyio.to_thread import run_sync
 from pydantic import BaseModel, TypeAdapter
+from pydantic._internal import _decorators, _typing_extra
 from pydantic.json_schema import JsonSchemaValue
 from typing_extensions import (
     ParamSpec,
@@ -425,6 +426,45 @@ def is_async_callable(obj: Any) -> Any:
     return inspect.iscoroutinefunction(obj) or (callable(obj) and inspect.iscoroutinefunction(obj.__call__))


+def get_first_param_type(callable_obj: Callable[..., Any]) -> Any | None:
+    """Get the type annotation of the first parameter of a callable.
+
+    Handles regular functions, methods, and callable classes with __call__.
+    Uses Pydantic internals to properly resolve type hints including forward references.
+
+    Args:
+        callable_obj: The callable to inspect.
+
+    Returns:
+        The type annotation of the first parameter, or None if it cannot be determined.
+    """
+    try:
+        sig = inspect.signature(callable_obj)
+    except ValueError:
+        return None
+
+    try:
+        first_param_name = next(iter(sig.parameters.keys()))
+    except StopIteration:
+        return None
+
+    # See https://github.com/pydantic/pydantic/pull/11451 for a similar implementation in Pydantic
+    callable_for_hints = callable_obj
+    if not isinstance(callable_obj, _decorators._function_like):  # pyright: ignore[reportPrivateUsage]
+        call_func = getattr(type(callable_obj), '__call__', None)
+        if call_func is not None:
+            callable_for_hints = call_func
+        else:
+            return None  # pragma: no cover
+
+    try:
+        type_hints = _typing_extra.get_function_type_hints(_decorators.unwrap_wrapped_function(callable_for_hints))
+    except (NameError, TypeError, AttributeError):
+        return None
+
+    return type_hints.get(first_param_name)
+
+
 def _update_mapped_json_schema_refs(s: dict[str, Any], name_mapping: dict[str, str]) -> None:
     """Update $refs in a schema to use the new names from name_mapping."""
     if '$ref' in s:
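
The `get_first_param_type` helper added to `_utils.py` relies on Pydantic internals. As a rough standard-library approximation (not the code from this commit), the same idea can be sketched with `inspect.signature` and `typing.get_type_hints`. The names `first_param_type`, `classify_handler`, `StubResponse`, and `Checker` are hypothetical, and `typing.get_type_hints` may resolve forward references differently from Pydantic's `_typing_extra.get_function_type_hints`. The classification rule follows the documented behavior: a first parameter hinted as the response type marks a response handler, and anything else (including untyped lambdas) is treated as an exception handler.

```python
import inspect
import typing
from typing import Any, Callable


class StubResponse:
    """Hypothetical stand-in for ModelResponse."""


def first_param_type(func: Callable[..., Any]) -> Any:
    """Return the resolved annotation of the first parameter, or None if unknown."""
    try:
        sig = inspect.signature(func)
    except (ValueError, TypeError):
        return None
    first_name = next(iter(sig.parameters), None)
    if first_name is None:
        return None
    # For callable instances, read the hints from the class's __call__ method.
    target = func if inspect.isroutine(func) else getattr(type(func), '__call__', None)
    if target is None:
        return None
    try:
        hints = typing.get_type_hints(target)
    except (NameError, TypeError, AttributeError):
        # Unresolvable forward references and similar problems count as "unknown".
        return None
    return hints.get(first_name)


def classify_handler(func: Callable[..., Any]) -> str:
    """Label a handler 'response' or 'exception' from its first parameter's type hint."""
    return 'response' if first_param_type(func) is StubResponse else 'exception'


def check_response(response: StubResponse) -> bool:
    return False


class Checker:
    def __call__(self, response: StubResponse) -> bool:
        return False


print(classify_handler(check_response))
#> response
print(classify_handler(lambda exc: True))
#> exception
print(classify_handler(Checker()))
#> response
```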
