fix: preserve multimodal content during handoffs with nest_handoff_hi… #2233

saksham-1304 · 2025-12-25T13:36:25Z

Fix: Preserve multimodal content during same-turn handoffs

Summary

This PR fixes an issue where multimodal content (images, files, audio) uploaded by users was being lost when the agent performed a handoff to another agent in the same turn with nest_handoff_history=True (the default behavior).

Problem: When a user uploaded an image and the first agent handed off to a specialist agent, the target agent lost access to the uploaded image because the nest_handoff_history function converted all content to a plain text summary, discarding the multimodal data.

Solution: Modified nest_handoff_history to:

Extract and preserve multimodal content from user messages during the handoff
Include the preserved multimodal content as a separate user message alongside the text summary
Provide a cleaner summary that indicates the presence of multimodal content without dumping raw JSON
Mark preserved messages to prevent duplication across chained handoffs

Changes

Core Fix

src/agents/handoffs/history.py:
- Added _MULTIMODAL_CONTENT_TYPES constant to identify multimodal types
- Added _PRESERVED_MULTIMODAL_MARKER constant to mark preserved messages and prevent duplication
- Modified default_handoff_history_mapper() to extract, preserve, and carry forward multimodal content
- Added _extract_multimodal_content() function to scan user messages for images, files, and audio
- Added _collect_preserved_multimodal_content() function to collect already-preserved content from previous handoffs
- Added _stringify_content_list() function to provide human-readable summaries of multimodal content

Tests

tests/test_extension_filters.py:
- Added 7 comprehensive tests covering:
  - Image content preservation
  - File content preservation
  - Multiple images across messages
  - Audio content preservation
  - Text-only messages (backward compatibility)
  - Multimodal content in non-user messages (negative case)
  - Chained handoffs (no duplication across A → B → C handoffs)

Testing

✅ All 20 existing extension filter tests pass
✅ All 14 handoff tool tests pass
✅ All 17 run step processing tests pass
✅ All 7 new multimodal preservation tests pass (including chained handoffs test)
✅ Code passes ruff format and ruff check
✅ Code passes mypy --ignore-missing-imports --no-warn-unused-ignores

Test Plan:

Run: pytest tests/test_extension_filters.py -v
Run: pytest tests/test_handoff_tool.py tests/test_run_step_processing.py -v
Run: ruff format && ruff check
Run: mypy src/agents/handoffs/history.py tests/test_extension_filters.py --ignore-missing-imports

Impact

Backward Compatible: Yes. Text-only conversations work exactly as before. Multimodal content is now additionally preserved when present.

Example Usage:

# Before: Image lost during handoff
# After: Image preserved alongside conversation summary
welcome_agent.handoffs = [
    handoff(specialist_agent, input_filter=HandoffFilters.welcome_to_specialist),
    # nest_handoff_history=True by default - now preserves multimodal content
]

Related Issue

Resolves the issue reported about multimodal content being lost during same-turn handoffs with nest_handoff_history=True.

Checklist

New tests added for multimodal content preservation
All existing tests pass
Code formatted with ruff format
Code passes linting with ruff check
Code passes type checking with mypy
Changes are backward compatible
Commit messages are concise and in imperative mood
Changes follow SDK guidelines for style and documentation

…story When nest_handoff_history=True (the default), multimodal content (images, files, audio) from user messages was being lost during handoffs because the content was converted to a plain text summary. This fix: - Extracts multimodal content from user messages before summarization - Adds the multimodal content as a separate user message after the summary - Improves the text summary to show '[N image(s) attached]' instead of raw JSON Fixes issue where target agents lose access to uploaded images during same-turn handoffs.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/agents/handoffs/history.py

Add marker system to track preserved multimodal content: - Add _PRESERVED_MULTIMODAL_MARKER constant to mark preserved messages - Skip already-preserved messages during extraction - Add _collect_preserved_multimodal_content() to carry forward existing preserved content across chained handoffs - Add test for chained handoffs scenario This addresses the P1 code review feedback about duplicate conversation turns across chained handoffs.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4223ce12f5

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/agents/handoffs/history.py

Copilot

Pull request overview

This PR fixes an issue where multimodal content (images, files, audio) was being lost during agent handoffs when using nest_handoff_history=True (the default behavior). The solution preserves multimodal content from user messages by extracting it and including it as a separate user message alongside the conversation summary.

Changes:

Modified default_handoff_history_mapper() to extract and preserve multimodal content from user messages during handoffs
Added helper functions to extract, collect, and format multimodal content with deduplication support for chained handoffs
Enhanced _stringify_content_list() to provide human-readable summaries indicating presence of images, files, and audio

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
src/agents/handoffs/history.py	Core implementation: added multimodal content preservation logic including extraction, collection, and formatting functions; uses marker to prevent duplication in chained handoffs
tests/test_extension_filters.py	Added 7 comprehensive tests covering image, file, and audio preservation; backward compatibility; and chained handoff deduplication

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/agents/handoffs/history.py

… content validation - Filter out messages with __multimodal_preserved__ marker from summaries to prevent confusing output and duplicate lines in chained handoffs - Add validation for multimodal content types in _collect_preserved_multimodal_content to ensure consistency - Handle output_text type in _stringify_content_list for consistent behavior with assistant messages - All 34 related tests pass with no linting errors

t log --oneline -5 Merge branch 'fix/image-loss-during-handoffs' of https://github.com/saksham-1304/openai-agents-python into fix/image-loss-during-handoffs

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

saksham-1304 · 2026-01-10T23:56:36Z

@seratch Please review the PR

seratch · 2026-01-11T00:20:12Z

Thanks for sending this patch!

I can see this issue caused by nested handoff approach, but I am not convinced this is the best solution. Given the nature of nested handoffs, it does make sense to include different types of items as you suggest, but using a marker name feels like a fragile workaround, even if it happens to work with the server right now.

My understanding is that this is a fundamental limitation of nested handoffs, and we plan to move this logic to an opt-in feature starting from the next minor release. Because of that, I do not plan to pursue this enhancement in the short term.

For reference, here is a thorough review by Codex, and I agree with most of the points raised there as well:

Findings

High: The preserved user message injects a name field ("name": _PRESERVED_MULTIMODAL_MARKER) into a Responses input message. The current OpenAI Responses input schema in this repo’s SDK does not include name for message inputs, so this may be rejected by the API or at least be an undocumented field. This is a compatibility risk that could break handoffs at runtime.
Medium: All multimodal parts are collapsed into a single user message (all_multimodal_content), which loses per-turn grouping between each user prompt and its attachment(s). With multiple images/files across turns, the model gets one message containing all attachments and must infer mapping from the summary lines, which is ambiguous and can degrade reasoning.
Medium: input_audio is treated as a message content part, but the current SDK’s message content list type does not include input_audio. The tests already need type: ignore, so this may be ahead of the supported schema and could fail at runtime if the API rejects it.
Low: There is no explicit test asserting the summary excludes the preserved marker or that the preserved message is marked (the no-duplicate test is indirect). A small targeted test would make the intent clearer and guard against regressions.

Open questions / assumptions

Is name officially allowed on Responses input messages in this SDK/API version? If not, we should avoid it.
Do we want to preserve per-turn attachment grouping (one preserved message per original user turn) so the next agent can reliably map images/files to the right prompt?
Is input_audio intended to be inside message content here, or should audio be handled in a different input shape?

Alternative approaches

Preserve original multimodal user messages as-is (or only strip their text), exclude them from the summary, and return them in order after the summary. This keeps attachment–prompt pairing, avoids a name marker, and still prevents duplication by simply carrying forward those items.
If you must remove text, preserve one multimodal-only message per original user turn to keep boundaries, rather than flattening all attachments into a single message.

Overall, the goal (preventing multimodal loss and avoiding huge JSON summaries) is solid, and the tests cover the main scenarios. The main risk is the name marker compatibility and the loss of per-turn attachment grouping. If those two are addressed, this would be a clean and reasonable solution.

chatgpt-codex-connector bot reviewed Dec 25, 2025

View reviewed changes

src/agents/handoffs/history.py Show resolved Hide resolved

seratch added the feature:core label Dec 27, 2025

saksham-1304 and others added 2 commits December 27, 2025 11:27

fix tests

89e8acb

Merge branch 'main' into fix/image-loss-during-handoffs

4223ce1

chatgpt-codex-connector bot reviewed Jan 10, 2026

View reviewed changes

src/agents/handoffs/history.py Show resolved Hide resolved

Merge branch 'main' into fix/image-loss-during-handoffs

0c67e0e

Copilot AI review requested due to automatic review settings January 10, 2026 07:14

Copilot started reviewing on behalf of saksham-1304 January 10, 2026 07:15 View session

Copilot AI reviewed Jan 10, 2026

View reviewed changes

src/agents/handoffs/history.py Show resolved Hide resolved

src/agents/handoffs/history.py Outdated Show resolved Hide resolved

src/agents/handoffs/history.py Outdated Show resolved Hide resolved

saksham-1304 added 2 commits January 11, 2026 01:43

t status

d96a835

t log --oneline -5 Merge branch 'fix/image-loss-during-handoffs' of https://github.com/saksham-1304/openai-agents-python into fix/image-loss-during-handoffs

saksham-1304 requested a review from Copilot January 10, 2026 20:17

Copilot started reviewing on behalf of saksham-1304 January 10, 2026 20:18 View session

Copilot AI reviewed Jan 10, 2026

View reviewed changes

seratch marked this pull request as draft January 11, 2026 00:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: preserve multimodal content during handoffs with nest_handoff_hi… #2233

fix: preserve multimodal content during handoffs with nest_handoff_hi… #2233

saksham-1304 commented Dec 25, 2025 •

edited

Loading

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

saksham-1304 commented Jan 10, 2026

Uh oh!

seratch commented Jan 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: preserve multimodal content during handoffs with nest_handoff_hi… #2233

Are you sure you want to change the base?

fix: preserve multimodal content during handoffs with nest_handoff_hi… #2233

Conversation

saksham-1304 commented Dec 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fix: Preserve multimodal content during same-turn handoffs

Summary

Changes

Core Fix

Tests

Testing

Impact

Related Issue

Checklist

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

saksham-1304 commented Jan 10, 2026

Uh oh!

seratch commented Jan 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

saksham-1304 commented Dec 25, 2025 •

edited

Loading