Skip to content

Commit 98d154c

Browse files
committed
Improve nested handoff conversation history
1 parent 94447af commit 98d154c

File tree

5 files changed

+140
-10
lines changed

5 files changed

+140
-10
lines changed

docs/handoffs.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -82,7 +82,7 @@ handoff_obj = handoff(
8282

8383
When a handoff occurs, it's as though the new agent takes over the conversation, and gets to see the entire previous conversation history. If you want to change this, you can set an [`input_filter`][agents.handoffs.Handoff.input_filter]. An input filter is a function that receives the existing input via a [`HandoffInputData`][agents.handoffs.HandoffInputData], and must return a new `HandoffInputData`.
8484

85-
By default the runner now wraps the prior transcript inside a developer-role summary message (see [`RunConfig.nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]). That default only applies when neither the handoff nor the run supplies an explicit `input_filter`, so existing code that already customizes the payload (including the examples in this repository) keeps its current behavior without changes.
85+
By default the runner now wraps the prior transcript inside a developer-role summary message (see [`RunConfig.nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]). The summary appears inside a `<CONVERSATION HISTORY>` block that keeps appending new turns when multiple handoffs happen during the same run. That default only applies when neither the handoff nor the run supplies an explicit `input_filter`, so existing code that already customizes the payload (including the examples in this repository) keeps its current behavior without changes.
8686

8787
There are some common patterns (for example removing all tool calls from the history), which are implemented for you in [`agents.extensions.handoff_filters`][]
8888

@@ -127,7 +127,7 @@ router = Agent(
127127
)
128128
```
129129

130-
The new [examples/handoffs/log_handoff_history.py](https://github.com/openai/openai-agents-python/tree/main/examples/handoffs/log_handoff_history.py) script contains a complete runnable sample that prints the nested transcript every time a handoff occurs.
130+
The new [examples/handoffs/log_handoff_history.py](https://github.com/openai/openai-agents-python/tree/main/examples/handoffs/log_handoff_history.py) script contains a complete runnable sample that prints the nested transcript every time a handoff occurs so you can see the `<CONVERSATION HISTORY>` block that will be passed to the next agent.
131131

132132
## Recommended prompts
133133

docs/running_agents.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,7 @@ The `run_config` parameter lets you configure some global settings for the agent
5151
- [`model_settings`][agents.run.RunConfig.model_settings]: Overrides agent-specific settings. For example, you can set a global `temperature` or `top_p`.
5252
- [`input_guardrails`][agents.run.RunConfig.input_guardrails], [`output_guardrails`][agents.run.RunConfig.output_guardrails]: A list of input or output guardrails to include on all runs.
5353
- [`handoff_input_filter`][agents.run.RunConfig.handoff_input_filter]: A global input filter to apply to all handoffs, if the handoff doesn't already have one. The input filter allows you to edit the inputs that are sent to the new agent. See the documentation in [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] for more details.
54-
- [`nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]: When `True` (the default) the runner wraps the prior transcript in a developer-role summary message and keeps the latest user turn separate before invoking the next agent. Set this to `False` or provide a custom handoff filter if you prefer to pass through the raw transcript. You can also call [`nest_handoff_history`](agents.extensions.handoff_filters.nest_handoff_history) from your own filters to reuse the default behavior. All [`Runner` methods](agents.run.Runner) automatically create a `RunConfig` when you do not pass one, so the quickstarts and examples pick up this default automatically, and any explicit [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] callbacks continue to override it.
54+
- [`nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]: When `True` (the default) the runner wraps the prior transcript in a developer-role summary message, placing the content inside a `<CONVERSATION HISTORY>` block while keeping the latest user turn separate before invoking the next agent. The block automatically appends new turns as subsequent handoffs occur. Set this to `False` or provide a custom handoff filter if you prefer to pass through the raw transcript. You can also call [`nest_handoff_history`](agents.extensions.handoff_filters.nest_handoff_history) from your own filters to reuse the default behavior. All [`Runner` methods](agents.run.Runner) automatically create a `RunConfig` when you do not pass one, so the quickstarts and examples pick up this default automatically, and any explicit [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] callbacks continue to override it.
5555
- [`tracing_disabled`][agents.run.RunConfig.tracing_disabled]: Allows you to disable [tracing](tracing.md) for the entire run.
5656
- [`trace_include_sensitive_data`][agents.run.RunConfig.trace_include_sensitive_data]: Configures whether traces will include potentially sensitive data, such as LLM and tool call inputs/outputs.
5757
- [`workflow_name`][agents.run.RunConfig.workflow_name], [`trace_id`][agents.run.RunConfig.trace_id], [`group_id`][agents.run.RunConfig.group_id]: Sets the tracing workflow name, trace ID and trace group ID for the run. We recommend at least setting `workflow_name`. The group ID is an optional field that lets you link traces across multiple runs.

src/agents/extensions/handoff_filters.py

Lines changed: 54 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -39,15 +39,22 @@ def remove_all_tools(handoff_input_data: HandoffInputData) -> HandoffInputData:
3939
)
4040

4141

42+
_CONVERSATION_HISTORY_START = "<CONVERSATION HISTORY>"
43+
_CONVERSATION_HISTORY_END = "</CONVERSATION HISTORY>"
44+
_NEST_HISTORY_METADATA_KEY = "nest_handoff_history"
45+
_NEST_HISTORY_TRANSCRIPT_KEY = "transcript"
46+
47+
4248
def nest_handoff_history(handoff_input_data: HandoffInputData) -> HandoffInputData:
4349
"""Summarizes the previous transcript into a developer message for the next agent."""
4450

4551
normalized_history = _normalize_input_history(handoff_input_data.input_history)
52+
flattened_history = _flatten_nested_history_messages(normalized_history)
4653
pre_items_as_inputs = [
4754
_run_item_to_plain_input(item) for item in handoff_input_data.pre_handoff_items
4855
]
4956
new_items_as_inputs = [_run_item_to_plain_input(item) for item in handoff_input_data.new_items]
50-
transcript = normalized_history + pre_items_as_inputs + new_items_as_inputs
57+
transcript = flattened_history + pre_items_as_inputs + new_items_as_inputs
5158

5259
developer_message = _build_developer_message(transcript)
5360
latest_user = _find_latest_user_turn(transcript)
@@ -80,15 +87,23 @@ def _run_item_to_plain_input(run_item: RunItem) -> TResponseInputItem:
8087

8188

8289
def _build_developer_message(transcript: list[TResponseInputItem]) -> TResponseInputItem:
83-
if transcript:
90+
transcript_copy = [deepcopy(item) for item in transcript]
91+
if transcript_copy:
8492
summary_lines = [
85-
f"{idx + 1}. {_format_transcript_item(item)}" for idx, item in enumerate(transcript)
93+
f"{idx + 1}. {_format_transcript_item(item)}" for idx, item in enumerate(transcript_copy)
8694
]
8795
else:
8896
summary_lines = ["(no previous turns recorded)"]
8997

90-
content = "Previous conversation before this handoff:\n" + "\n".join(summary_lines)
91-
return {"role": "developer", "content": content}
98+
content_lines = [_CONVERSATION_HISTORY_START, *summary_lines, _CONVERSATION_HISTORY_END]
99+
content = "\n".join(content_lines)
100+
return {
101+
"role": "developer",
102+
"content": content,
103+
"metadata": {
104+
_NEST_HISTORY_METADATA_KEY: {_NEST_HISTORY_TRANSCRIPT_KEY: transcript_copy}
105+
},
106+
}
92107

93108

94109
def _format_transcript_item(item: TResponseInputItem) -> str:
@@ -130,6 +145,40 @@ def _find_latest_user_turn(
130145
return None
131146

132147

148+
def _flatten_nested_history_messages(
149+
items: list[TResponseInputItem],
150+
) -> list[TResponseInputItem]:
151+
flattened: list[TResponseInputItem] = []
152+
for item in items:
153+
nested_transcript = _extract_nested_history_transcript(item)
154+
if nested_transcript is not None:
155+
flattened.extend(nested_transcript)
156+
continue
157+
flattened.append(deepcopy(item))
158+
return flattened
159+
160+
161+
def _extract_nested_history_transcript(
162+
item: TResponseInputItem,
163+
) -> list[TResponseInputItem] | None:
164+
if item.get("role") != "developer":
165+
return None
166+
metadata = item.get("metadata")
167+
if not isinstance(metadata, dict):
168+
return None
169+
payload = metadata.get(_NEST_HISTORY_METADATA_KEY)
170+
if not isinstance(payload, dict):
171+
return None
172+
transcript = payload.get(_NEST_HISTORY_TRANSCRIPT_KEY)
173+
if not isinstance(transcript, list):
174+
return None
175+
normalized: list[TResponseInputItem] = []
176+
for entry in transcript:
177+
if isinstance(entry, dict):
178+
normalized.append(deepcopy(entry))
179+
return normalized if normalized else []
180+
181+
133182
def _get_run_item_role(run_item: RunItem) -> str | None:
134183
role_candidate = run_item.to_input_item().get("role")
135184
return role_candidate if isinstance(role_candidate, str) else None

tests/test_agent_runner.py

Lines changed: 34 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -294,7 +294,7 @@ async def test_default_handoff_history_nested_and_filters_respected():
294294

295295
assert isinstance(result.input, list)
296296
assert result.input[0]["role"] == "developer"
297-
assert "Previous conversation" in result.input[0]["content"]
297+
assert "<CONVERSATION HISTORY>" in result.input[0]["content"]
298298
assert "triage summary" in result.input[0]["content"]
299299
assert result.input[1]["role"] == "user"
300300
assert result.input[1]["content"] == "user_message"
@@ -324,6 +324,39 @@ def passthrough_filter(data: HandoffInputData) -> HandoffInputData:
324324
assert filtered_result.input == "user_message"
325325

326326

327+
@pytest.mark.asyncio
328+
async def test_default_handoff_history_accumulates_across_multiple_handoffs():
329+
triage_model = FakeModel()
330+
delegate_model = FakeModel()
331+
closer_model = FakeModel()
332+
333+
closer = Agent(name="closer", model=closer_model)
334+
delegate = Agent(name="delegate", model=delegate_model, handoffs=[closer])
335+
triage = Agent(name="triage", model=triage_model, handoffs=[delegate])
336+
337+
triage_model.add_multiple_turn_outputs(
338+
[[get_text_message("triage summary"), get_handoff_tool_call(delegate)]]
339+
)
340+
delegate_model.add_multiple_turn_outputs(
341+
[[get_text_message("delegate update"), get_handoff_tool_call(closer)]]
342+
)
343+
closer_model.add_multiple_turn_outputs([[get_text_message("resolution")]])
344+
345+
result = await Runner.run(triage, input="user_question")
346+
347+
assert result.final_output == "resolution"
348+
assert closer_model.first_turn_args is not None
349+
closer_input = closer_model.first_turn_args["input"]
350+
assert isinstance(closer_input, list)
351+
assert closer_input[0]["role"] == "developer"
352+
developer_content = closer_input[0]["content"]
353+
assert developer_content.count("<CONVERSATION HISTORY>") == 1
354+
assert "triage summary" in developer_content
355+
assert "delegate update" in developer_content
356+
assert closer_input[1]["role"] == "user"
357+
assert closer_input[1]["content"] == "user_question"
358+
359+
327360
@pytest.mark.asyncio
328361
async def test_async_input_filter_supported():
329362
# DO NOT rename this without updating pyproject.toml

tests/test_extension_filters.py

Lines changed: 49 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -243,7 +243,19 @@ def test_nest_handoff_history_wraps_transcript() -> None:
243243

244244
assert isinstance(nested.input_history, tuple)
245245
assert nested.input_history[0]["role"] == "developer"
246-
assert "Assist reply" in nested.input_history[0]["content"]
246+
developer_content = nested.input_history[0]["content"]
247+
assert "<CONVERSATION HISTORY>" in developer_content
248+
assert "</CONVERSATION HISTORY>" in developer_content
249+
assert "Assist reply" in developer_content
250+
metadata = nested.input_history[0].get("metadata")
251+
assert isinstance(metadata, dict)
252+
history_payload = metadata.get("nest_handoff_history")
253+
assert isinstance(history_payload, dict)
254+
transcript = history_payload.get("transcript")
255+
assert isinstance(transcript, list)
256+
assert len(transcript) == 4
257+
assert transcript[0]["role"] == "user"
258+
assert transcript[1]["role"] == "assistant"
247259
assert nested.input_history[1]["role"] == "user"
248260
assert nested.input_history[1]["content"] == "Hello"
249261
assert len(nested.pre_handoff_items) == 0
@@ -264,3 +276,39 @@ def test_nest_handoff_history_handles_missing_user() -> None:
264276
assert len(nested.input_history) == 1
265277
assert nested.input_history[0]["role"] == "developer"
266278
assert "reasoning" in nested.input_history[0]["content"].lower()
279+
280+
281+
def test_nest_handoff_history_appends_existing_history() -> None:
282+
first = HandoffInputData(
283+
input_history=(_get_user_input_item("Hello"),),
284+
pre_handoff_items=(_get_message_output_run_item("First reply"),),
285+
new_items=(),
286+
run_context=RunContextWrapper(context=()),
287+
)
288+
289+
first_nested = nest_handoff_history(first)
290+
developer_message = first_nested.input_history[0]
291+
292+
follow_up_history = (
293+
developer_message,
294+
_get_user_input_item("Another question"),
295+
)
296+
297+
second = HandoffInputData(
298+
input_history=follow_up_history,
299+
pre_handoff_items=(_get_message_output_run_item("Second reply"),),
300+
new_items=(_get_handoff_output_run_item("transfer"),),
301+
run_context=RunContextWrapper(context=()),
302+
)
303+
304+
second_nested = nest_handoff_history(second)
305+
306+
assert isinstance(second_nested.input_history, tuple)
307+
developer = second_nested.input_history[0]
308+
assert developer["role"] == "developer"
309+
content = developer["content"]
310+
assert content.count("<CONVERSATION HISTORY>") == 1
311+
assert content.count("</CONVERSATION HISTORY>") == 1
312+
assert "First reply" in content
313+
assert "Second reply" in content
314+
assert "Another question" in content

0 commit comments

Comments
 (0)