Skip to content

fix: include generated media in run output regardless of store_media setting#6793

Open
Br1an67 wants to merge 14 commits intoagno-agi:mainfrom
Br1an67:fix/media-in-run-output
Open

fix: include generated media in run output regardless of store_media setting#6793
Br1an67 wants to merge 14 commits intoagno-agi:mainfrom
Br1an67:fix/media-in-run-output

Conversation

@Br1an67
Copy link
Contributor

@Br1an67 Br1an67 commented Feb 28, 2026

Summary

Decouples store_media from in-run media availability. Previously, store_media=False prevented media from appearing in both the DB and the returned RunOutput, making it impossible to forward generated images/videos/audio to external services (WhatsApp, Slack) immediately after a run.

Now store_media controls only DB persistence. Generated media is always available on the returned RunOutput for the caller to act on.

Closes #5101


Bugs Fixed

This PR fixes 4 distinct bugs in the media pipeline:

Bug 1: store_media=False strips media from RunOutput

store_media_util() — which copies media from ModelResponse to RunOutput — was gated behind if agent.store_media: in all 10 run paths (4 agent + 6 team). When store_media=False, media was never placed on RunOutput, so callers couldn't forward images to WhatsApp/Slack.

Fix: Remove the gate. store_media_util() always runs. DB scrubbing happens later in cleanup_and_store.

Bug 2: Streaming images gated on store_media (asymmetry)

In handle_model_response_chunk, only images were gated behind if agent.store_media:. Videos, audio, and files were never gated. This meant streaming with store_media=False dropped images but not other media types.

Fix: Remove the gate. All media types are now consistent — always collected into run_response during streaming.

Bug 3: Session cache aliasing leaks media back to DB

After cleanup_and_store scrubbed media and stored the run, session.upsert_run(run=run_response) stored a reference to the same RunOutput object. If media was later restored on that object (e.g., via try/finally), subsequent save_session() calls would re-persist the old run with restored media — leaking it to DB despite store_media=False.

Fix: session.upsert_run(run=copy.copy(run_response)) — shallow copy breaks reference aliasing so the session cache holds an independent scrubbed snapshot.

Bug 4: scrub_media_from_run_output missed top-level output fields

scrub_media_from_run_output() scrubbed input and message media but didn't null the top-level output fields (images, videos, audio, files). This caused team member responses and workflow executor runs to leak media to DB when store_media=False, because _scrub_member_responses and _store_executor_response delegate to this function.

Fix: Add 4 lines to null top-level output fields. Safe because cleanup_and_store uses storage_copy (caller's original is untouched).


Full Data Lifecycle

BEFORE (store_media=False):

  Model generates image
        |
        v
  store_media_util()  --- if store_media: --- SKIPPED
        |                                       |
        v                                       v
  RunOutput.images = []           Media never reaches RunOutput
        |
        v
  cleanup_and_store()
        |
        +-- scrub input/messages --> DB gets scrubbed input
        |
        +-- upsert_run(run_response) --> session holds REFERENCE
        |
        +-- save_session() --> DB gets no media
        |
        v
  Caller gets RunOutput.images = None   <-- can't forward to WhatsApp/Slack


AFTER (store_media=False):

  Model generates image
        |
        v
  store_media_util()  --- ALWAYS runs --> RunOutput.images = [img]
        |
        v
  cleanup_and_store()
        |
        +-- storage_copy = copy.copy(run_response)  <-- breaks aliasing
        |
        +-- scrub_run_output_for_storage(storage_copy)
        |       |
        |       +-- scrub input/message media
        |       +-- null images/videos/audio/files
        |
        +-- upsert_run(storage_copy) --> session holds SCRUBBED COPY
        |
        +-- save_session() --> DB gets no media
        |
        v
  Caller gets RunOutput.images = [img]   <-- forwards to WhatsApp/Slack


TEAM MEMBER RESPONSES (store_media=False on member):

  Member agent generates image
        |
        v
  store_media_util() --> member RunOutput.images = [img]
        |
        v
  _scrub_member_responses()
        |
        +-- scrub_run_output_for_storage(member, member_response)
        |       |
        |       +-- scrub_media_from_run_output(member_response)
        |       |       |
        |       |       +-- null input/message media
        |       |       +-- null images/videos/audio/files  <-- Bug 4 fix
        |       |
        |       +-- (permanent scrub, no restore)
        |
        v
  Team session saved --> member media NOT in DB

Files Changed

File Changes What
agent/_response.py +6 -6 Remove store_media gate on streaming image collection (Bug 2)
agent/_run.py +38 -23 Remove store_media gates in 4 run paths (Bug 1); storage_copy pattern in cleanup_and_store + acleanup_and_store (Bug 3)
team/_run.py +39 -25 Same decoupling across 6 team run paths (Bug 1); storage_copy in _cleanup_and_store + _acleanup_and_store (Bug 3)
utils/agent.py +7 -0 Null top-level output media fields in scrub_media_from_run_output (Bug 4)
tests/unit/agent/test_store_media_run_output.py +192 8 tests: sync/async x non-streaming/streaming x yield_run_output/stream_events
tests/unit/agent/test_store_media_scrub_leak.py +218 10 tests: scrub function behavior, member response scrubbing, session cache isolation
tests/unit/os/routers/test_slack_store_media.py +450 7 tests: Slack integration — non-streaming + streaming with mock and real Agent

Test Coverage (25 tests)

Agent core (8 tests)test_store_media_run_output.py

  • sync/async x non-streaming/streaming x yield_run_output/stream_events
  • Regression: store_media=True still works

Scrub + DB isolation (10 tests)test_store_media_scrub_leak.py

  • scrub_media_from_run_output nulls all 4 media types (images/videos/audio/files)
  • Works on both RunOutput and TeamRunOutput
  • Member response scrubbing via scrub_run_output_for_storage
  • store_media=True preserves media (regression guard)
  • cleanup_and_store restores media for caller (sync + async)
  • Session cache does NOT contain media when store_media=False

Slack integration (7 tests)test_slack_store_media.py

  • Non-streaming: mock agent, real Agent with store_media=False
  • Streaming: completion chunks, content chunks with multi-image
  • Real Agent streaming with store_media=False
  • Regression: store_media=True still uploads

Design Decisions

Why copy.copy() instead of deepcopy?
Shallow copy is sufficient because we only need to break top-level field aliasing (storage_copy.images = None doesn't affect run_response.images). Deep copy would unnecessarily clone large media objects. Input/message media IS shared between the copy and original — this is acceptable because scrub_run_output_for_storage has always scrubbed those on the original in pre-PR code.

Why explicit nulling after scrub_run_output_for_storage?
Defensive — scrub_media_from_run_output now also nulls these fields, but the explicit block makes the intent clear at the cleanup_and_store call site.

Interaction with _scrub_member_responses:
For team member responses, scrub_media_from_run_output is called permanently (no save/restore). The top-level nulling fix (Bug 4) is critical here — without it, member-generated images leak to DB even when the member has store_media=False.


Related Issues

Issue Problem How this PR helps
#5101 (closes) store_media=False strips media from RunOutput — can't forward images to WhatsApp Media now stays on RunOutput for immediate forwarding
#6529 DB bloat from inline base64 media (165MB sessions, 4GB+ tables) Users who set store_media=False to avoid bloat can now still act on media in the current run
#5960 Expired media URLs in history break subsequent LLM calls (404 errors) Reinforces "act now, don't persist" pattern — forward immediately, don't rely on stored URLs
#5838 Session storage growth "extremely exaggerated" in Team mode Users combining store_media=False + store_tool_messages=False now get working media without DB growth

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Improvement
  • Model update
  • Other:

Checklist

  • Code complies with style guidelines
  • Ran format/validation scripts (./scripts/format.sh and ./scripts/validate.sh)
  • Self-review completed
  • Documentation updated (comments, docstrings)
  • Examples and guides: Relevant cookbook examples have been included or updated (if applicable)
  • Tested in clean environment
  • Tests added/updated (25 new tests)

@Br1an67 Br1an67 requested a review from a team as a code owner February 28, 2026 13:23
Mustafa-Esoofally added a commit that referenced this pull request Mar 2, 2026
Apply the same store_media decoupling to Team that PR #6793 applied to
Agent: remove if-guards on store_media_util() in all 6 team run paths
and add save/restore of media fields around cleanup_and_store so callers
still see generated media when store_media=False.

Add unit tests verifying sync and async agent.run() returns images in
RunOutput even with store_media=False.

Closes #5101
@Mustafa-Esoofally Mustafa-Esoofally self-assigned this Mar 3, 2026
Br1an67 and others added 5 commits March 3, 2026 19:10
Apply the same store_media decoupling to Team that PR agno-agi#6793 applied to
Agent: remove if-guards on store_media_util() in all 6 team run paths
and add save/restore of media fields around cleanup_and_store so callers
still see generated media when store_media=False.

Add unit tests verifying sync and async agent.run() returns images in
RunOutput even with store_media=False.

Closes agno-agi#5101
If persistence fails between scrub and restore, media would be
left scrubbed on the returned RunOutput. Wrapping in try/finally
guarantees the caller always sees media regardless of DB errors.
Tests cover both non-streaming and streaming Slack paths with
store_media=False, including real Agent end-to-end scenarios.
Verifies media reaches upload_response_media_async in both paths.
Cover sync/async streaming with both stream_events=True and
yield_run_output=True to verify images survive cleanup_and_store.
@Mustafa-Esoofally Mustafa-Esoofally force-pushed the fix/media-in-run-output branch from b2d325c to 16cefe1 Compare March 4, 2026 00:20
Mustafa-Esoofally and others added 9 commits March 3, 2026 19:37
When cache_session=True and store_media=False, the finally-block
media restore in cleanup_and_store rehydrated media on the same
RunOutput object stored in session.runs (reference aliasing).
On the next run, save_session would re-persist the old run with
restored media, leaking it to DB despite store_media=False.

Shallow-copy run_response before upsert_run so session.runs holds
an independent snapshot of the scrubbed state.
scrub_media_from_run_output() scrubbed input/messages media but not
top-level output fields (images/videos/audio/files). This caused
member responses and workflow executor runs to leak media to DB
when store_media=False, since _scrub_member_responses and
_store_executor_response delegate to this function.

The fix adds 4 lines to null these fields. This is safe because
cleanup_and_store saves/restores media refs in try/finally, so
callers still receive media — only the DB copy is scrubbed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
scrub_run_output_for_storage() already calls scrub_media_from_run_output()
which nulls images/videos/audio/files — the explicit null blocks after it
were redundant.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Generated media not added to run_output

3 participants