pydantic
diff --git a/docs/models/overview.md b/docs/models/overview.md
Lines changed: 137 additions & 2 deletions
diff --git a/pydantic_ai_slim/pydantic_ai/_function_schema.py b/pydantic_ai_slim/pydantic_ai/_function_schema.py
Lines changed: 5 additions & 23 deletions
diff --git a/pydantic_ai_slim/pydantic_ai/_utils.py b/pydantic_ai_slim/pydantic_ai/_utils.py
Lines changed: 40 additions & 0 deletions
@@ -153,8 +153,16 @@ attributes showing the queue depth and configured limits. The `name` parameter o
 ## Fallback Model
 
 You can use [`FallbackModel`][pydantic_ai.models.fallback.FallbackModel] to attempt multiple models
-in sequence until one successfully returns a result. Under the hood, Pydantic AI automatically switches
-from one model to the next if the current model returns a 4xx or 5xx status code.
+in sequence until one succeeds. Pydantic AI can switch to the next model when the current model
+raises an exception (like a 4xx/5xx API error) **or** when the response content indicates a semantic
+failure (like a truncated response or a failed built-in tool call).
+
+By default, fallback triggers on [`ModelAPIError`][pydantic_ai.exceptions.ModelAPIError] (4xx/5xx API errors),
+so you don't need to configure anything for the most common use case.
+
+This behavior is controlled by the `fallback_on` parameter (see
+[`FallbackModel`][pydantic_ai.models.fallback.FallbackModel]), which accepts exception types,
+exception handlers, and response handlers — all of which can be sync or async.
 
 !!! note
     The provider SDKs on which Models are based (like OpenAI, Anthropic, etc.) often have built-in retry logic that can delay the `FallbackModel` from activating.
@@ -305,6 +313,133 @@ passing a custom `fallback_on` argument to the `FallbackModel` constructor.
 !!! note
     Validation errors (from [structured output](../output.md#structured-output) or [tool parameters](../tools.md)) do **not** trigger fallback. These errors use the [retry mechanism](../agent.md#reflection-and-self-correction) instead, which re-prompts the same model to try again. This is intentional: validation errors stem from the non-deterministic nature of LLMs and may succeed on retry, whereas API errors (4xx/5xx) generally indicate issues that won't resolve by retrying the same request.
 
+### Response-Based Fallback
+
+In addition to exception-based fallback, you can trigger fallback based on the **content** of a model's response. This is useful when a model returns a successful HTTP response (no exception) whose content indicates a semantic failure, such as an unexpected finish reason or a built-in tool reporting failure.
+
+!!! note "Non-streaming only"
+    Response-based fallback currently only works with non-streaming requests (`agent.run()` and `agent.run_sync()`).
+    For streaming requests (`agent.run_stream()`), only exception-based fallback is supported.
+
+The `fallback_on` parameter accepts:
+
+- A tuple of exception types: `(ModelAPIError, ModelHTTPError)`
+- An exception handler (sync or async): `lambda exc: isinstance(exc, MyError)`
+- A response handler (sync or async): `def check(r: ModelResponse) -> bool`
+- A list mixing all of the above: `[ModelAPIError, exc_handler, response_handler]`
+
+The handler type is auto-detected from the type hint on the handler's first parameter. If that parameter is hinted as [`ModelResponse`][pydantic_ai.messages.ModelResponse], the handler is treated as a response handler. Otherwise (including untyped handlers and lambdas), it is treated as an exception handler.
+
+#### Finish Reason Example
+
+A simple use case is checking the model's finish reason — for example, falling back if the response was truncated due to length limits:
+
+```python {title="fallback_on_finish_reason.py"}
+from pydantic_ai import Agent
+from pydantic_ai.messages import FinishReason, ModelResponse
+from pydantic_ai.models.fallback import FallbackModel
+
+
+def bad_finish_reason(response: ModelResponse) -> bool:
+    """Fall back if the model stopped due to a length limit, content filter, or error."""
+    reason: FinishReason | None = response.finish_reason
+    # Trigger fallback for problematic finish reasons
+    return reason in ('length', 'content_filter', 'error')
+
+
+fallback_model = FallbackModel(
+    'openai:gpt-5.2',
+    'anthropic:claude-sonnet-4-5',
+    fallback_on=bad_finish_reason,
+)
+
+agent = Agent(fallback_model)
+result = agent.run_sync('What is the capital of France?')
+print(result.output)
+#> The capital of France is Paris.
+```
+
+!!! warning "Solo response handlers replace default exception fallback"
+    When you pass a single response handler as `fallback_on` (as above), it **replaces** the default `(ModelAPIError,)` exception fallback entirely. This means API errors (4xx/5xx) will propagate as exceptions instead of triggering fallback to the next model.
+
+    To keep exception-based fallback alongside a response handler, pass them together as a list — see the [mixed example below](#combining-handlers).
+
+!!! note
+    Pydantic AI already handles some finish reasons automatically in the [agent loop](../agent.md):
+    responses with a `'length'` or `'content_filter'` finish reason raise exceptions (which `FallbackModel`
+    catches by default), and empty responses are retried. A response handler is useful for custom
+    checks beyond these built-in behaviors.
+
+#### Built-in Tool Failure Example
+
+A more complex use case is when using built-in tools like web search or URL fetching. For example, Google's [`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool] may return a successful response with a status indicating the URL fetch failed:
+
+```python {title="fallback_on_builtin_tool.py"}
+from pydantic_ai import Agent
+from pydantic_ai.messages import ModelResponse
+from pydantic_ai.models.anthropic import AnthropicModel
+from pydantic_ai.models.fallback import FallbackModel
+from pydantic_ai.models.google import GoogleModel
+
+
+def web_fetch_failed(response: ModelResponse) -> bool:
+    """Check if a web_fetch built-in tool failed to retrieve content."""
+    for call, result in response.builtin_tool_calls:
+        if call.tool_name != 'web_fetch':
+            continue
+        if not isinstance(result.content, list):
+            continue
+        for item in result.content:
+            if isinstance(item, dict):
+                status = item.get('url_retrieval_status', '')
+                if status and status != 'URL_RETRIEVAL_STATUS_SUCCESS':
+                    return True
+    return False
+
+
+google_model = GoogleModel('gemini-2.5-flash')
+anthropic_model = AnthropicModel('claude-sonnet-4-5')
+
+# Auto-detected as response handler via type hint
+fallback_model = FallbackModel(
+    google_model,
+    anthropic_model,
+    fallback_on=web_fetch_failed,
+)
+
+agent = Agent(fallback_model)
+
+# If Google's web_fetch fails, automatically falls back to Anthropic
+result = agent.run_sync('Summarize https://ai.pydantic.dev')
+print(result.output)
+"""
+Pydantic AI is a Python agent framework for building production-grade LLM applications.
+"""
+```
+
+Response handlers receive the [`ModelResponse`][pydantic_ai.messages.ModelResponse] returned by the model and should return `True` to trigger fallback to the next model, or `False` to accept the response.
+
+#### Combining Handlers
+
+You can combine exception types, exception handlers, and response handlers in a single list:
+
+```python {title="fallback_on_mixed.py" requires="fallback_on_builtin_tool.py"}
+from pydantic_ai.exceptions import ModelAPIError
+from pydantic_ai.models.fallback import FallbackModel
+
+from fallback_on_builtin_tool import anthropic_model, google_model, web_fetch_failed
+
+fallback_model = FallbackModel(
+    google_model,
+    anthropic_model,
+    fallback_on=[
+        ModelAPIError,  # Exception type
+        lambda exc: 'rate limit' in str(exc).lower(),  # Exception handler (untyped lambda)
+        web_fetch_failed,  # Response handler (auto-detected via type hint)
+    ],
+)
+```
+
 ### Exception Handling in Middleware and Decorators
 
 When using `FallbackModel`, it's important to understand that [`FallbackExceptionGroup`][pydantic_ai.exceptions.FallbackExceptionGroup]
 
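Illustrative sketch (not part of the diff): the fallback behavior documented in this hunk, where a model is skipped either because it raised a listed exception or because a response handler rejected its output, can be modeled as a plain loop. This is a stdlib-only approximation under stated assumptions. `StubResponse`, `StubAPIError`, and `run_with_fallback` are hypothetical names, and the real `FallbackModel` implementation is not shown in this diff and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class StubResponse:
    """Hypothetical stand-in for a model response (not Pydantic AI's ModelResponse)."""

    text: str
    finish_reason: Optional[str] = None


class StubAPIError(Exception):
    """Hypothetical stand-in for a 4xx/5xx API error."""


def run_with_fallback(
    models: Sequence[Callable[[str], StubResponse]],
    prompt: str,
    exception_types: tuple = (StubAPIError,),
    response_handlers: Sequence[Callable[[StubResponse], bool]] = (),
) -> StubResponse:
    """Try each model in order and return the first accepted response."""
    failures = []
    for model in models:
        try:
            response = model(prompt)
        except exception_types as exc:
            # Exception-based fallback: record the error and try the next model.
            failures.append(exc)
            continue
        if any(handler(response) for handler in response_handlers):
            # Response-based fallback: the call succeeded but the content was rejected.
            failures.append(RuntimeError(f'rejected response: {response.finish_reason}'))
            continue
        return response
    raise RuntimeError(f'all models failed: {failures!r}')


def failing(prompt: str) -> StubResponse:
    raise StubAPIError('503 service unavailable')


def truncated(prompt: str) -> StubResponse:
    return StubResponse(text='The capital of', finish_reason='length')


def healthy(prompt: str) -> StubResponse:
    return StubResponse(text='Paris', finish_reason='stop')


result = run_with_fallback(
    [failing, truncated, healthy],
    'What is the capital of France?',
    response_handlers=[lambda r: r.finish_reason == 'length'],
)
print(result.text)
#> Paris
```

In this sketch the first model is skipped because it raises, the second because its finish reason is rejected, and the third is accepted. Passing `exception_types=()` would let the `StubAPIError` propagate, which mirrors the documented warning about a solo response handler replacing the default exception fallback.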
@@ -12,7 +12,7 @@
 from typing import TYPE_CHECKING, Any, Concatenate, Literal, cast, get_origin
 
 from pydantic import ConfigDict
-from pydantic._internal import _decorators, _generate_schema, _typing_extra
+from pydantic._internal import _decorators, _generate_schema
 from pydantic._internal._config import ConfigWrapper
 from pydantic.fields import FieldInfo
 from pydantic.json_schema import GenerateJsonSchema
@@ -22,7 +22,7 @@
 
 from ._griffe import doc_descriptions
 from ._run_context import RunContext
-from ._utils import check_object_json_schema, is_async_callable, is_model_like, run_in_executor
+from ._utils import check_object_json_schema, get_first_param_type, is_async_callable, is_model_like, run_in_executor
 
 if TYPE_CHECKING:
     from .tools import DocstringFormat, ObjectJsonSchema
@@ -240,28 +240,10 @@ def _takes_ctx(callable_obj: TargetCallable[P, R]) -> TypeIs[WithCtx[P, R]]:  #
     Returns:
         `True` if the callable takes a `RunContext` as first argument, `False` otherwise.
     """
-    try:
-        sig = signature(callable_obj)
-    except ValueError:
-        return False
-    try:
-        first_param_name = next(iter(sig.parameters.keys()))
-    except StopIteration:
+    first_param_type = get_first_param_type(callable_obj)
+    if first_param_type is None:
         return False
-    else:
-        # See https://github.com/pydantic/pydantic/pull/11451 for a similar implementation in Pydantic
-        if not isinstance(callable_obj, _decorators._function_like):  # pyright: ignore[reportPrivateUsage]
-            call_func = getattr(type(callable_obj), '__call__', None)
-            if call_func is not None:
-                callable_obj = call_func
-            else:
-                return False  # pragma: no cover
-
-        type_hints = _typing_extra.get_function_type_hints(_decorators.unwrap_wrapped_function(callable_obj))
-        annotation = type_hints.get(first_param_name)
-        if annotation is None:
-            return False
-        return True is not sig.empty and _is_call_ctx(annotation)
+    return _is_call_ctx(first_param_type)
 
 
 def _build_schema(
 
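Illustrative sketch (not part of the diff): after this refactor, `_takes_ctx` reduces to a check on the resolved type of the first parameter. The body of `_is_call_ctx` is not shown in this hunk, so the version below assumes it accepts both the bare context class and a parametrized form. `StubContext` and `takes_ctx` are hypothetical stand-ins, and only `inspect` and `typing` from the standard library are used.

```python
import inspect
from typing import Any, Callable, Generic, TypeVar, get_origin, get_type_hints

DepsT = TypeVar('DepsT')


class StubContext(Generic[DepsT]):
    """Hypothetical stand-in for RunContext."""


def takes_ctx(func: Callable[..., Any]) -> bool:
    """Return True if the first parameter is annotated as StubContext or StubContext[...]."""
    try:
        sig = inspect.signature(func)
        hints = get_type_hints(func)
    except (ValueError, TypeError, NameError):
        # Signature or hints could not be resolved: treat as "no context".
        return False
    first_name = next(iter(sig.parameters), None)
    if first_name is None:
        return False
    annotation = hints.get(first_name)
    # Assumption: match the bare class and its parametrized generic alias.
    return annotation is StubContext or get_origin(annotation) is StubContext


def tool_with_ctx(ctx: StubContext[int], x: int) -> int:
    return x


def plain_tool(x: int) -> int:
    return x


print(takes_ctx(tool_with_ctx))
#> True
print(takes_ctx(plain_tool))
#> False
```

Unlike the helper added in `_utils.py`, this sketch only handles plain functions; callable instances would need the `type(obj).__call__` lookup shown in the diff.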
@@ -27,6 +27,7 @@
 
 from anyio.to_thread import run_sync
 from pydantic import BaseModel, TypeAdapter
+from pydantic._internal import _decorators, _typing_extra
 from pydantic.json_schema import JsonSchemaValue
 from typing_extensions import (
     ParamSpec,
@@ -425,6 +426,45 @@ def is_async_callable(obj: Any) -> Any:
     return inspect.iscoroutinefunction(obj) or (callable(obj) and inspect.iscoroutinefunction(obj.__call__))
 
 
+def get_first_param_type(callable_obj: Callable[..., Any]) -> Any | None:
+    """Get the type annotation of the first parameter of a callable.
+
+    Handles regular functions, methods, and callable classes with __call__.
+    Uses Pydantic internals to properly resolve type hints including forward references.
+
+    Args:
+        callable_obj: The callable to inspect.
+
+    Returns:
+        The type annotation of the first parameter, or None if it cannot be determined.
+    """
+    try:
+        sig = inspect.signature(callable_obj)
+    except ValueError:
+        return None
+
+    try:
+        first_param_name = next(iter(sig.parameters.keys()))
+    except StopIteration:
+        return None
+
+    # See https://github.com/pydantic/pydantic/pull/11451 for a similar implementation in Pydantic
+    callable_for_hints = callable_obj
+    if not isinstance(callable_obj, _decorators._function_like):  # pyright: ignore[reportPrivateUsage]
+        call_func = getattr(type(callable_obj), '__call__', None)
+        if call_func is not None:
+            callable_for_hints = call_func
+        else:
+            return None  # pragma: no cover
+
+    try:
+        type_hints = _typing_extra.get_function_type_hints(_decorators.unwrap_wrapped_function(callable_for_hints))
+    except (NameError, TypeError, AttributeError):
+        return None
+
+    return type_hints.get(first_param_name)
+
+
 def _update_mapped_json_schema_refs(s: dict[str, Any], name_mapping: dict[str, str]) -> None:
     """Update $refs in a schema to use the new names from name_mapping."""
     if '$ref' in s:
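
Illustrative sketch (not part of the diff): the docs in this PR say that `fallback_on` handlers are classified by the type hint on their first parameter, and `get_first_param_type` is the helper that resolves that hint. The stdlib-only approximation below shows how such a classification could work. It uses `typing.get_type_hints` instead of Pydantic internals, and `StubResponse`, `first_param_type`, and `classify_handler` are hypothetical names rather than the library's API.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional, get_type_hints


@dataclass
class StubResponse:
    """Hypothetical stand-in for ModelResponse."""

    finish_reason: Optional[str] = None


def first_param_type(func: Callable[..., Any]) -> Optional[Any]:
    """Resolve the annotation of the first parameter, or None if unavailable."""
    try:
        sig = inspect.signature(func)
    except (ValueError, TypeError):
        return None
    first_name = next(iter(sig.parameters), None)
    if first_name is None:
        return None
    # For callable instances, read the hints from the class's __call__ method.
    is_function_like = inspect.isfunction(func) or inspect.ismethod(func)
    target = func if is_function_like else type(func).__call__
    try:
        hints = get_type_hints(target)
    except (NameError, TypeError, AttributeError):
        return None
    return hints.get(first_name)


def classify_handler(handler: Callable[..., Any]) -> str:
    """Label a handler 'response' if its first parameter is hinted as StubResponse."""
    return 'response' if first_param_type(handler) is StubResponse else 'exception'


def bad_finish_reason(response: StubResponse) -> bool:
    return response.finish_reason == 'length'


class CallableCheck:
    def __call__(self, response: StubResponse) -> bool:
        return False


print(classify_handler(bad_finish_reason))
#> response
print(classify_handler(CallableCheck()))
#> response
print(classify_handler(lambda exc: 'rate limit' in str(exc).lower()))
#> exception
```

The untyped lambda falls through to the exception branch because it carries no annotations, which matches the documented default for untyped handlers.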