Agents file injection during an active run #5128
Hi, this comes more as a question. The idea is quite simple: I want to allow my agent to read a file (using the code interpreter, which is the important step) that was created during the current run. Let's presume we have the following agent definition (using OpenAI agents): `agent = Agent(`. If the DB result is too big, we are not able to return it as a function result because it exceeds 512KB. Popular solutions are pagination and filtering.

In some cases none of those solutions is applicable (the data set may contain unique values, so we cannot paginate, and we do not know which values to filter on), so we need to retrieve the whole data set. An idea would be to upload the data set to a file and give that file id back to the agent. The problem is that, because the file id was not given to the agent at initialization time, it will not be able to access it. In this case an "update session" operation, or something similar, would be useful. Are there any guidelines on how to deal with this specific problem?
Replies: 3 comments
This is achievable today without rebuilding the agent. The trick is that OpenAI's code interpreter tool takes `container.file_ids` as a list, and within a single `agent.run()` the tool loop passes the same tool dict to the Responses API on each iteration, by reference. So your function tool can upload the oversized result to the Files API and append the new `file_id` to that list before returning; the next round trip in the same run will see the file.

```python
import io
import json

from agent_framework import Agent
from agent_framework.openai import OpenAIChatClient
from openai import AsyncOpenAI

openai_files = AsyncOpenAI()
client = OpenAIChatClient()

# Hold a reference to the file_ids list that the tool will append to.
ci_file_ids: list[str] = []
code_interpreter = client.get_code_interpreter_tool(file_ids=ci_file_ids)

INLINE_LIMIT = 400_000  # bytes; keep well under the 512KB tool-result cap


async def get_phone_book() -> str:
    """Return the phone book. Large results are uploaded and exposed via code interpreter."""
    contacts = await fetch_contacts_from_db()
    payload = json.dumps(contacts).encode()
    if len(payload) < INLINE_LIMIT:
        return payload.decode()
    uploaded = await openai_files.files.create(
        file=("phone_book.json", io.BytesIO(payload)),
        purpose="assistants",
    )
    ci_file_ids.append(uploaded.id)  # <-- visible on the next tool-loop iteration
    return (
        f"The phone book was too large to return inline. "
        f"Uploaded as file_id={uploaded.id}. "
        "Use the code interpreter to load and query that file."
    )


agent = Agent(
    client=client,
    name="PhoneBookAgent",
    instructions=(
        "Use get_phone_book to retrieve contacts. "
        "If it returns a file_id, use the code interpreter to read that file "
        "and answer the user's query instead of calling get_phone_book again."
    ),
    tools=[get_phone_book, code_interpreter],
)

await agent.run("Who works in Engineering and earns over 90k?")
```

What's happening under the hood (relevant if you want to verify or extend the pattern):
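The by-reference behavior the pattern relies on can be sketched in plain Python. The dict below mirrors the shape of the Responses API code interpreter tool spec; the real dict is built by the framework, so treat this as an illustration rather than the framework's internals:

```python
# The code-interpreter tool spec holds a *reference* to the list, not a copy.
ci_file_ids: list[str] = []
tool_spec = {
    "type": "code_interpreter",
    "container": {"type": "auto", "file_ids": ci_file_ids},
}

# Later, inside a function tool, we append a newly uploaded file id:
ci_file_ids.append("file-abc123")

# The next tool-loop iteration serializes the same dict and sees the file:
print(tool_spec["container"]["file_ids"])  # ['file-abc123']
```

If the framework ever copied the list when building the tool spec, this would silently stop working, which is why it is worth verifying against the framework version you run.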
A couple of practical notes:
If you prefer a more decoupled approach, you can do the same thing from
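One way to decouple it (a sketch, not from the thread above; `spill_to_file` and its parameters are hypothetical names) is a decorator that applies the same upload-and-append logic to any async tool, so the tools themselves never deal with the size limit:

```python
import functools
import io
import json

INLINE_LIMIT = 400_000  # bytes; keep well under the 512KB tool-result cap


def spill_to_file(files_client, ci_file_ids: list[str], filename: str):
    """Wrap an async tool so oversized JSON results are uploaded, not inlined.

    `files_client` is expected to expose `files.create(file=..., purpose=...)`,
    e.g. an `openai.AsyncOpenAI` instance; `ci_file_ids` is the same list that
    was handed to the code interpreter tool at construction time.
    """
    def wrap(fn):
        @functools.wraps(fn)
        async def inner(*args, **kwargs):
            result = await fn(*args, **kwargs)
            payload = json.dumps(result).encode()
            if len(payload) < INLINE_LIMIT:
                return payload.decode()
            uploaded = await files_client.files.create(
                file=(filename, io.BytesIO(payload)),
                purpose="assistants",
            )
            ci_file_ids.append(uploaded.id)  # visible on the next tool-loop pass
            return (
                f"Result too large to return inline; uploaded as "
                f"file_id={uploaded.id}. Use the code interpreter to query it."
            )
        return inner
    return wrap
```

Any DB-backed tool can then be decorated without knowing about the file machinery, which keeps the size handling in one place instead of repeated in every tool.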
Let me just start by saying thanks, you made my day. I've been trying to find a solution to this problem for a while. I'll leave here the complete code for whoever else is facing the same issue: