Add handoff history mapper

jhills20 · jhills20 · commit 17aa135bb974 · 2025-11-07T16:44:41.000-05:00
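
As a rough illustration of the new hook, a custom mapper might look like the sketch below. It follows the `HandoffHistoryMapper` shape introduced in this change (a list of input items in, a list of input items out). The function name `keep_recent_messages`, the `MAX_MESSAGES` constant, and the plain-dict `InputItem` alias are hypothetical stand-ins chosen for illustration and are not part of the SDK.

```python
from typing import Any

# Stand-in for the SDK's TResponseInputItem; plain dicts are assumed here for illustration.
InputItem = dict[str, Any]

# Hypothetical cap on how many messages the next agent receives.
MAX_MESSAGES = 4


def keep_recent_messages(transcript: list[InputItem]) -> list[InputItem]:
    """Hypothetical mapper: forward only the most recent user/assistant messages.

    Items without a user or assistant role (for example tool calls and their
    outputs) are dropped instead of being folded into a summary message.
    """
    messages = [item for item in transcript if item.get("role") in ("user", "assistant")]
    return messages[-MAX_MESSAGES:]


# Possible wiring, assuming the `agents` package containing this change is installed:
#   from agents import RunConfig
#   run_config = RunConfig(handoff_history_mapper=keep_recent_messages)
```

Because `nest_handoff_history` deep-copies whatever the mapper returns, a mapper like this can hand back items from the transcript it received without copying them first. Per the `execute_handoffs` change, the mapper is only consulted when no input filter handles the handoff.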
diff --git a/docs/handoffs.md b/docs/handoffs.md
@@ -82,7 +82,7 @@ handoff_obj = handoff(
 
 When a handoff occurs, it's as though the new agent takes over the conversation, and gets to see the entire previous conversation history. If you want to change this, you can set an [`input_filter`][agents.handoffs.Handoff.input_filter]. An input filter is a function that receives the existing input via a [`HandoffInputData`][agents.handoffs.HandoffInputData], and must return a new `HandoffInputData`.
 
-By default the runner now wraps the prior transcript inside a developer-role summary message (see [`RunConfig.nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]). The summary appears inside a `<CONVERSATION HISTORY>` block that keeps appending new turns when multiple handoffs happen during the same run. That default only applies when neither the handoff nor the run supplies an explicit `input_filter`, so existing code that already customizes the payload (including the examples in this repository) keeps its current behavior without changes.
+By default, the runner now collapses the prior transcript into a single assistant summary message (see [`RunConfig.nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]). The summary appears inside a `<CONVERSATION HISTORY>` block that keeps appending new turns when multiple handoffs happen during the same run. To replace the generated message without writing a full `input_filter`, provide your own mapping function via [`RunConfig.handoff_history_mapper`][agents.run.RunConfig.handoff_history_mapper]. The default nesting applies only when neither the handoff nor the run supplies an explicit `input_filter`, so existing code that already customizes the payload (including the examples in this repository) keeps its current behavior without changes.
 
 There are some common patterns (for example removing all tool calls from the history), which are implemented for you in [`agents.extensions.handoff_filters`][]
 
diff --git a/docs/running_agents.md b/docs/running_agents.md
@@ -51,13 +51,14 @@ The `run_config` parameter lets you configure some global settings for the agent
 -   [`model_settings`][agents.run.RunConfig.model_settings]: Overrides agent-specific settings. For example, you can set a global `temperature` or `top_p`.
 -   [`input_guardrails`][agents.run.RunConfig.input_guardrails], [`output_guardrails`][agents.run.RunConfig.output_guardrails]: A list of input or output guardrails to include on all runs.
 -   [`handoff_input_filter`][agents.run.RunConfig.handoff_input_filter]: A global input filter to apply to all handoffs, if the handoff doesn't already have one. The input filter allows you to edit the inputs that are sent to the new agent. See the documentation in [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] for more details.
--   [`nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]: When `True` (the default) the runner wraps the prior transcript in a developer-role summary message, placing the content inside a `<CONVERSATION HISTORY>` block while keeping the latest user turn separate before invoking the next agent. The block automatically appends new turns as subsequent handoffs occur. Set this to `False` or provide a custom handoff filter if you prefer to pass through the raw transcript. You can also call [`nest_handoff_history`](agents.extensions.handoff_filters.nest_handoff_history) from your own filters to reuse the default behavior. All [`Runner` methods](agents.run.Runner) automatically create a `RunConfig` when you do not pass one, so the quickstarts and examples pick up this default automatically, and any explicit [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] callbacks continue to override it.
+-   [`nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]: When `True` (the default), the runner collapses the prior transcript into a single assistant message before invoking the next agent. The message wraps the content in a `<CONVERSATION HISTORY>` block that keeps appending new turns as subsequent handoffs occur. Set this to `False` or provide a custom handoff filter if you prefer to pass through the raw transcript. All [`Runner` methods][agents.run.Runner] create a `RunConfig` when you do not pass one, so the quickstarts and examples pick up this default automatically, and any explicit [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] callbacks continue to override it.
+-   [`handoff_history_mapper`][agents.run.RunConfig.handoff_history_mapper]: Optional callable that receives the normalized transcript (history + handoff items) whenever `nest_handoff_history` is `True` and no input filter applies to the handoff. It must return the exact list of input items to forward to the next agent, allowing you to replace the built-in summary without writing a full handoff filter.
 -   [`tracing_disabled`][agents.run.RunConfig.tracing_disabled]: Allows you to disable [tracing](tracing.md) for the entire run.
 -   [`trace_include_sensitive_data`][agents.run.RunConfig.trace_include_sensitive_data]: Configures whether traces will include potentially sensitive data, such as LLM and tool call inputs/outputs.
 -   [`workflow_name`][agents.run.RunConfig.workflow_name], [`trace_id`][agents.run.RunConfig.trace_id], [`group_id`][agents.run.RunConfig.group_id]: Sets the tracing workflow name, trace ID and trace group ID for the run. We recommend at least setting `workflow_name`. The group ID is an optional field that lets you link traces across multiple runs.
 -   [`trace_metadata`][agents.run.RunConfig.trace_metadata]: Metadata to include on all traces.
 
-By default, the SDK now nests prior turns inside a developer summary message whenever an agent hands off to another agent. This reduces repeated assistant messages and keeps the most recent user turn explicit for the receiving agent. If you'd like to return to the legacy behavior, pass `RunConfig(nest_handoff_history=False)` or supply a `handoff_input_filter` that forwards the conversation exactly as you need.
+By default, the SDK now nests prior turns inside a single assistant summary message whenever an agent hands off to another agent. This reduces repeated assistant messages and keeps the full transcript inside a single block that new agents can scan quickly. If you'd like to return to the legacy behavior, pass `RunConfig(nest_handoff_history=False)` or supply a `handoff_input_filter` (or `handoff_history_mapper`) that forwards the conversation exactly as you need.
 
 ## Conversations/chat threads
 
diff --git a/src/agents/_run_impl.py b/src/agents/_run_impl.py
@@ -1042,7 +1042,10 @@ async def execute_handoffs(
                 pre_step_items = list(filtered.pre_handoff_items)
                 new_step_items = list(filtered.new_items)
             elif run_config.nest_handoff_history and handoff_input_data is not None:
-                nested = nest_handoff_history(handoff_input_data)
+                nested = nest_handoff_history(
+                    handoff_input_data,
+                    history_mapper=run_config.handoff_history_mapper,
+                )
                 original_input = (
                     nested.input_history
                     if isinstance(nested.input_history, str)
diff --git a/src/agents/extensions/handoff_filters.py b/src/agents/extensions/handoff_filters.py
@@ -4,7 +4,7 @@
 from copy import deepcopy
 from typing import Any, cast
 
-from ..handoffs import HandoffInputData
+from ..handoffs import HandoffHistoryMapper, HandoffInputData
 from ..items import (
     HandoffCallItem,
     HandoffOutputItem,
@@ -43,8 +43,12 @@ def remove_all_tools(handoff_input_data: HandoffInputData) -> HandoffInputData:
 _CONVERSATION_HISTORY_END = "</CONVERSATION HISTORY>"
 
 
-def nest_handoff_history(handoff_input_data: HandoffInputData) -> HandoffInputData:
-    """Summarizes the previous transcript into a developer message for the next agent."""
+def nest_handoff_history(
+    handoff_input_data: HandoffInputData,
+    *,
+    history_mapper: HandoffHistoryMapper | None = None,
+) -> HandoffInputData:
+    """Summarizes the previous transcript for the next agent."""
 
     normalized_history = _normalize_input_history(handoff_input_data.input_history)
     flattened_history = _flatten_nested_history_messages(normalized_history)
@@ -54,24 +58,29 @@ def nest_handoff_history(handoff_input_data: HandoffInputData) -> HandoffInputDa
     new_items_as_inputs = [_run_item_to_plain_input(item) for item in handoff_input_data.new_items]
     transcript = flattened_history + pre_items_as_inputs + new_items_as_inputs
 
-    developer_message = _build_developer_message(transcript)
-    latest_user = _find_latest_user_turn(transcript)
-    history_items: list[TResponseInputItem] = [developer_message]
-    if latest_user is not None:
-        history_items.append(latest_user)
-
+    mapper = history_mapper or default_handoff_history_mapper
+    history_items = mapper(transcript)
     filtered_pre_items = tuple(
         item
         for item in handoff_input_data.pre_handoff_items
         if _get_run_item_role(item) != "assistant"
     )
 
     return handoff_input_data.clone(
-        input_history=tuple(history_items),
+        input_history=tuple(deepcopy(item) for item in history_items),
         pre_handoff_items=filtered_pre_items,
     )
 
 
+def default_handoff_history_mapper(
+    transcript: list[TResponseInputItem],
+) -> list[TResponseInputItem]:
+    """Returns a single assistant message summarizing the transcript."""
+
+    summary_message = _build_summary_message(transcript)
+    return [summary_message]
+
+
 def _normalize_input_history(
     input_history: str | tuple[TResponseInputItem, ...],
 ) -> list[TResponseInputItem]:
@@ -84,7 +93,7 @@ def _run_item_to_plain_input(run_item: RunItem) -> TResponseInputItem:
     return deepcopy(run_item.to_input_item())
 
 
-def _build_developer_message(transcript: list[TResponseInputItem]) -> TResponseInputItem:
+def _build_summary_message(transcript: list[TResponseInputItem]) -> TResponseInputItem:
     transcript_copy = [deepcopy(item) for item in transcript]
     if transcript_copy:
         summary_lines = [
@@ -96,11 +105,11 @@ def _build_developer_message(transcript: list[TResponseInputItem]) -> TResponseI
 
     content_lines = [_CONVERSATION_HISTORY_START, *summary_lines, _CONVERSATION_HISTORY_END]
     content = "\n".join(content_lines)
-    developer_message: dict[str, Any] = {
-        "role": "developer",
+    assistant_message: dict[str, Any] = {
+        "role": "assistant",
         "content": content,
     }
-    return cast(TResponseInputItem, developer_message)
+    return cast(TResponseInputItem, assistant_message)
 
 
 def _format_transcript_item(item: TResponseInputItem) -> str:
@@ -133,15 +142,6 @@ def _stringify_content(content: Any) -> str:
         return str(content)
 
 
-def _find_latest_user_turn(
-    transcript: list[TResponseInputItem],
-) -> TResponseInputItem | None:
-    for item in reversed(transcript):
-        if item.get("role") == "user":
-            return deepcopy(item)
-    return None
-
-
 def _flatten_nested_history_messages(
     items: list[TResponseInputItem],
 ) -> list[TResponseInputItem]:
@@ -158,8 +158,6 @@ def _flatten_nested_history_messages(
 def _extract_nested_history_transcript(
     item: TResponseInputItem,
 ) -> list[TResponseInputItem] | None:
-    if item.get("role") != "developer":
-        return None
     content = item.get("content")
     if not isinstance(content, str):
         return None
diff --git a/src/agents/handoffs.py b/src/agents/handoffs.py
@@ -69,6 +69,9 @@ def clone(self, **kwargs: Any) -> HandoffInputData:
 HandoffInputFilter: TypeAlias = Callable[[HandoffInputData], MaybeAwaitable[HandoffInputData]]
 """A function that filters the input data passed to the next agent."""
 
+HandoffHistoryMapper: TypeAlias = Callable[[list[TResponseInputItem]], list[TResponseInputItem]]
+"""A function that rewrites the conversation history before the next agent sees it."""
+
 
 @dataclass
 class Handoff(Generic[TContext, TAgent]):
diff --git a/src/agents/run.py b/src/agents/run.py
@@ -44,7 +44,7 @@
     OutputGuardrail,
     OutputGuardrailResult,
 )
-from .handoffs import Handoff, HandoffInputFilter, handoff
+from .handoffs import Handoff, HandoffHistoryMapper, HandoffInputFilter, handoff
 from .items import (
     HandoffCallItem,
     ItemHelpers,
@@ -197,8 +197,16 @@ class RunConfig:
     """
 
     nest_handoff_history: bool = True
-    """Wrap prior run history in a developer message before handing off when no custom input
-    filter is set. Set to False to preserve the raw transcript behavior from previous releases.
+    """Wrap prior run history in a single assistant message before handing off when no custom
+    input filter is set. Set to False to preserve the raw transcript behavior from previous
+    releases.
+    """
+
+    handoff_history_mapper: HandoffHistoryMapper | None = None
+    """Optional function that receives the normalized transcript (history + handoff items) and
+    returns the input history that should be passed to the next agent. When left as `None`, the
+    runner collapses the transcript into a single assistant message. This function only runs when
+    `nest_handoff_history` is True and no custom input filter is set.
     """
 
     input_guardrails: list[InputGuardrail[Any]] | None = None
diff --git a/tests/test_agent_runner.py b/tests/test_agent_runner.py
@@ -172,9 +172,9 @@ async def test_handoffs():
 
     assert result.final_output == "done"
     assert len(result.raw_responses) == 3, "should have three model responses"
-    assert len(result.to_input_list()) == 8, (
-        "should have 8 inputs: dev summary, latest user input, tool call, tool result, message, "
-        "handoff, handoff result, and done message"
+    assert len(result.to_input_list()) == 7, (
+        "should have 7 inputs: summary message, tool call, tool result, message, handoff, "
+        "handoff result, and done message"
     )
     assert result.last_agent == agent_1, "should have handed off to agent_1"
 
@@ -301,15 +301,14 @@ async def test_default_handoff_history_nested_and_filters_respected():
     result = await Runner.run(agent_2, input="user_message")
 
     assert isinstance(result.input, list)
-    developer = _as_message(result.input[0])
-    assert developer["role"] == "developer"
-    developer_content = developer["content"]
-    assert isinstance(developer_content, str)
-    assert "<CONVERSATION HISTORY>" in developer_content
-    assert "triage summary" in developer_content
-    latest_user = _as_message(result.input[1])
-    assert latest_user["role"] == "user"
-    assert latest_user["content"] == "user_message"
+    assert len(result.input) == 1
+    summary = _as_message(result.input[0])
+    assert summary["role"] == "assistant"
+    summary_content = summary["content"]
+    assert isinstance(summary_content, str)
+    assert "<CONVERSATION HISTORY>" in summary_content
+    assert "triage summary" in summary_content
+    assert "user_message" in summary_content
 
     passthrough_model = FakeModel()
     delegate = Agent(name="delegate", model=passthrough_model)
@@ -360,16 +359,14 @@ async def test_default_handoff_history_accumulates_across_multiple_handoffs():
     assert closer_model.first_turn_args is not None
     closer_input = closer_model.first_turn_args["input"]
     assert isinstance(closer_input, list)
-    developer = _as_message(closer_input[0])
-    assert developer["role"] == "developer"
-    developer_content = developer["content"]
-    assert isinstance(developer_content, str)
-    assert developer_content.count("<CONVERSATION HISTORY>") == 1
-    assert "triage summary" in developer_content
-    assert "delegate update" in developer_content
-    latest_user = _as_message(closer_input[1])
-    assert latest_user["role"] == "user"
-    assert latest_user["content"] == "user_question"
+    summary = _as_message(closer_input[0])
+    assert summary["role"] == "assistant"
+    summary_content = summary["content"]
+    assert isinstance(summary_content, str)
+    assert summary_content.count("<CONVERSATION HISTORY>") == 1
+    assert "triage summary" in summary_content
+    assert "delegate update" in summary_content
+    assert "user_question" in summary_content
 
 
 @pytest.mark.asyncio
diff --git a/tests/test_agent_runner_streamed.py b/tests/test_agent_runner_streamed.py
@@ -175,9 +175,9 @@ async def test_handoffs():
 
     assert result.final_output == "done"
     assert len(result.raw_responses) == 3, "should have three model responses"
-    assert len(result.to_input_list()) == 8, (
-        "should have 8 inputs: dev summary, latest user input, tool call, tool result, message, "
-        "handoff, handoff result, and done message"
+    assert len(result.to_input_list()) == 7, (
+        "should have 7 inputs: summary message, tool call, tool result, message, handoff, "
+        "handoff result, and done message"
     )
     assert result.last_agent == agent_1, "should have handed off to agent_1"
 
diff --git a/tests/test_extension_filters.py b/tests/test_extension_filters.py