Labels: bug (Something isn't working)
Description
Confirm this is an issue with the Python library and not an underlying OpenAI API
- This is an issue with the Python library
Describe the bug
The AsyncRealtime class cannot be used for real-time transcription because its requirements conflict with the API's:
- The AsyncRealtime class requires users to provide a model parameter.
- When using intent=transcription, the API explicitly forbids the model parameter.
- Even when passing model=None, the parameter is still sent as a query parameter, causing the API to reject the request.
Error received:
{
"error": {
"message": "You must not provide a model parameter for transcription sessions.",
"type": "invalid_request_error",
"code": "invalid_model",
"event_id": null,
"param": null
},
"event_id": "xxx",
"type": "error"
}
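Until the library handles this case, client code may want to recognize this specific rejection. The following is a minimal sketch, assuming server events arrive as JSON text shaped like the payload above; `is_model_forbidden_error` is a hypothetical helper name, not part of the openai library.

```python
import json


def is_model_forbidden_error(raw_event: str) -> bool:
    """Hypothetical helper: detect the 'invalid_model' error event shown above."""
    event = json.loads(raw_event)
    # Only events of type "error" carry an "error" object.
    if event.get("type") != "error":
        return False
    error = event.get("error") or {}
    return error.get("code") == "invalid_model"
```

A caller could use this to log a clearer message, for example that the model query parameter must be omitted for transcription sessions.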
Expected behavior:
The model parameter should be optional in the AsyncRealtime class. When model=None is passed (or model is not provided), it should not be included as a query parameter in the request. This would allow transcription mode to work properly while maintaining backward compatibility for other use cases.
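The requested behavior could look roughly like the sketch below. It is an illustration only, not the library's actual implementation: `build_connect_query` is a hypothetical helper name, and the way `extra_query` is merged in is an assumption.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_connect_query(
    model: Optional[str] = None,
    extra_query: Optional[Dict[str, str]] = None,
) -> str:
    """Hypothetical helper: build the query string, omitting model when it is None."""
    params: Dict[str, str] = {}
    if model is not None:
        # Only send model when the caller actually provided one.
        params["model"] = model
    params.update(extra_query or {})
    return urlencode(params)
```

With this shape, `build_connect_query(extra_query={"intent": "transcription"})` yields a query string with no model key, while callers that pass a model keep the existing behavior.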
Current workaround:
Bypass the AsyncRealtime class and create the connection manually:
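# Assumed imports; module paths may differ between library versions.
from websockets.asyncio.client import connect
from openai import AsyncOpenAI
from openai.resources.beta.realtime.realtime import AsyncRealtimeConnection

# WEBSOCKET_BASE_URL is defined by the caller and is assumed to end with "?".
# The await below must run inside an async function.
realtime_client = AsyncOpenAI(max_retries=0).beta.realtime
conn = AsyncRealtimeConnection(
    await connect(
        WEBSOCKET_BASE_URL + "intent=transcription",
        additional_headers={
            "Authorization": "Bearer " + realtime_client._client.api_key,
        },
    )
)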
To Reproduce
import asyncio
import logging

from openai import AsyncOpenAI
from openai.types.beta.realtime import transcription_session_update_param

MODEL = "gpt-4o-transcribe"

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


async def main() -> None:
    realtime_client = AsyncOpenAI(max_retries=0).beta.realtime
    async with realtime_client.connect(model=MODEL, extra_query={"intent": "transcription"}) as conn:
        await conn.transcription_session.update(
            session=transcription_session_update_param.Session(
                input_audio_format="pcm16",
                input_audio_transcription=transcription_session_update_param.SessionInputAudioTranscription(
                    model="gpt-4o-transcribe"
                ),
                turn_detection=transcription_session_update_param.SessionTurnDetection(
                    type="server_vad",
                    threshold=0.9,
                    prefix_padding_ms=300,
                    silence_duration_ms=500,
                ),
                input_audio_noise_reduction=transcription_session_update_param.SessionInputAudioNoiseReduction(
                    type="near_field"
                ),
            )
        )


if __name__ == "__main__":
    asyncio.run(main())
Code snippets
OS
macOS
Python version
Python 3.12
Library version
openai[realtime]==1.108.0