Tool calling is not working with Realtime Agent #2308

@christopheragnus

Description

Please read this first

  • Have you read the docs? (Agents SDK docs) Yes
  • Have you searched for related issues? Others may have faced similar issues. Yes

Describe the bug

I’m integrating Twilio SIP with OpenAI Realtime using openai-agents==0.6.5 (FastAPI server). I accept the SIP call via:

await _openai_client.post(
    f"/realtime/calls/{call_id}/accept",
    body={"type": "realtime", "model": "gpt-realtime"},
    cast_to=dict,
)

Then I attach a RealtimeRunner:

initial_model_settings = {
    "voice": "alloy",
    "modalities": ["audio"],
    "turn_detection": {"type": "semantic_vad", "interrupt_response": True},
}

_agent = RealtimeAgent(
    name="Customer Support",
    instructions=base_instructions,
    tools=tools,
)

runner = RealtimeRunner(
    starting_agent=_agent,
    model=OpenAIRealtimeSIPModel(),
)

async with await runner.run(
    model_config={
        "call_id": call_id,
        "initial_model_settings": initial_model_settings,
    }
) as session:
    …

I’m trying to configure the audio format for Twilio (PCMU/8k). The Realtime docs show session.update with audio.input.format and output_modalities.

So I tried:

session_update = {
    "type": "realtime",
    "model": "gpt-realtime",
    "output_modalities": ["audio"],
    "audio": {
        "input": {"format": {"type": "audio/pcmu", "rate": 8000}},
        "output": {"format": {"type": "audio/pcmu"}, "voice": "alloy"},
    },
}

await session.model.send_event(
    RealtimeModelSendRawMessage(
        message={"type": "session.update", "session": session_update}
    )
)

But I get:

ERROR:openai.agents:Failed to convert raw message: RealtimeModelSendRawMessage(…)
ERROR: RealtimeError(message="Invalid type for 'session.audio.input.format': expected an object, but got null instead.", param='session.audio.input.format')

If I remove the explicit session.update, I still get the session.audio.input.format null error. If I set legacy fields like input_audio_format / output_audio_format in initial_model_settings, the call connects but I get no audio on the Twilio leg.
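
To help narrow down where the null comes from, here is a minimal, standard-library-only sketch that rebuilds the session.update event from this issue as a plain dict and reports any null fields before it is handed to the SDK. The helper names build_session_update and find_null_paths are hypothetical and not part of openai-agents. The payload mirrors the one above and is not confirmed to be the shape the server accepts.

```python
from typing import Any


def build_session_update(voice: str = "alloy") -> dict[str, Any]:
    """Rebuild the session.update event from this issue as a plain dict.

    Assumption: the payload shape is copied from the issue text and has not
    been validated against the Realtime API.
    """
    return {
        "type": "session.update",
        "session": {
            "type": "realtime",
            "model": "gpt-realtime",
            "output_modalities": ["audio"],
            "audio": {
                "input": {"format": {"type": "audio/pcmu", "rate": 8000}},
                "output": {"format": {"type": "audio/pcmu"}, "voice": voice},
            },
        },
    }


def find_null_paths(value: Any, prefix: str = "") -> list[str]:
    """Return the dotted path of every None found in nested dicts.

    Lists are not traversed; only dict values are inspected.
    """
    if value is None:
        return [prefix]
    if isinstance(value, dict):
        paths: list[str] = []
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            paths.extend(find_null_paths(child, child_prefix))
        return paths
    return []


event = build_session_update()
null_paths = find_null_paths(event)
```

If null_paths is empty for the dict built locally but the server still reports session.audio.input.format as null, that would suggest the field is being dropped or overwritten somewhere between this dict and the wire, for example during the SDK's raw-message conversion or when initial_model_settings is applied.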

Questions:

What is the correct way to set session.audio.input.format / output_modalities when using the openai-agents RealtimeRunner?
Is RealtimeModelSendRawMessage supposed to support session.update, or do I need a different API/event shape?
Do I need to upgrade openai-agents or use input_audio_format/output_audio_format in some other place for SIP?

Debug information

Python 3.12.3.
openai-agents 0.6.5
openai 2.14.0
FastAPI 0.128.0
Twilio SIP (PCMU 8k)

Repro steps

Ideally provide a minimal python script that can be run to reproduce the bug.

Expected behavior

The call should work and use function tool calling when I say "I would like to make a booking", or look up availabilities when I say "What times are available?"
