articles/ai-services/openai/realtime-audio-reference.md
37 additions & 37 deletions
@@ -54,15 +54,15 @@ There are nine client events that can be sent from the client to the server:
| Event | Description |
|-------|-------------|
-|[RealtimeClientEventConversationItemCreate](#realtimeclienteventconversationitemcreate)|Send this client event when adding an item to the conversation. |
-|[RealtimeClientEventConversationItemDelete](#realtimeclienteventconversationitemdelete)|Send this client event when you want to remove any item from the conversation history. |
-|[RealtimeClientEventConversationItemTruncate](#realtimeclienteventconversationitemtruncate)|Send this client event when you want to truncate a previous assistant message's audio. |
-|[RealtimeClientEventInputAudioBufferAppend](#realtimeclienteventinputaudiobufferappend)|Send this client event to append audio bytes to the input audio buffer. |
-|[RealtimeClientEventInputAudioBufferClear](#realtimeclienteventinputaudiobufferclear)|Send this client event to clear the audio bytes in the buffer. |
-|[RealtimeClientEventInputAudioBufferCommit](#realtimeclienteventinputaudiobuffercommit)|Send this client event to commit audio bytes to a user message. |
-|[RealtimeClientEventResponseCancel](#realtimeclienteventresponsecancel)|Send this client event to cancel an in-progress response. |
-|[RealtimeClientEventResponseCreate](#realtimeclienteventresponsecreate)|Send this client event to trigger a response generation. |
-|[RealtimeClientEventSessionUpdate](#realtimeclienteventsessionupdate)|Send this client event to update the session's default configuration. |
+|[RealtimeClientEventConversationItemCreate](#realtimeclienteventconversationitemcreate)|The client `conversation.item.create` event is used to add a new item to the conversation's context, including messages, function calls, and function call responses. |
+|[RealtimeClientEventConversationItemDelete](#realtimeclienteventconversationitemdelete)|The client `conversation.item.delete` event is used to remove an item from the conversation history. |
+|[RealtimeClientEventConversationItemTruncate](#realtimeclienteventconversationitemtruncate)|The client `conversation.item.truncate` event is used to truncate a previous assistant message's audio. |
+|[RealtimeClientEventInputAudioBufferAppend](#realtimeclienteventinputaudiobufferappend)|The client `input_audio_buffer.append` event is used to append audio bytes to the input audio buffer. |
+|[RealtimeClientEventInputAudioBufferClear](#realtimeclienteventinputaudiobufferclear)|The client `input_audio_buffer.clear` event is used to clear the audio bytes in the buffer. |
+|[RealtimeClientEventInputAudioBufferCommit](#realtimeclienteventinputaudiobuffercommit)|The client `input_audio_buffer.commit` event is used to commit the user input audio buffer. |
+|[RealtimeClientEventResponseCancel](#realtimeclienteventresponsecancel)|The client `response.cancel` event is used to cancel an in-progress response. |
+|[RealtimeClientEventResponseCreate](#realtimeclienteventresponsecreate)|The client `response.create` event is used to instruct the server to create a response via model inferencing. |
+|[RealtimeClientEventSessionUpdate](#realtimeclienteventsessionupdate)|The client `session.update` event is used to update the session's default configuration. |
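
As a rough illustration of how the client event names in the table above might be used, the following sketch serializes client events to JSON and rejects any type string that is not one of the nine listed. The helper names `make_client_event` and `make_audio_append` are hypothetical, and the base64-encoded `audio` field is an assumption about the payload shape; check each event's own section of the reference for the authoritative fields.

```python
import base64
import json

# Event type strings copied from the client events table.
CLIENT_EVENT_TYPES = {
    "conversation.item.create",
    "conversation.item.delete",
    "conversation.item.truncate",
    "input_audio_buffer.append",
    "input_audio_buffer.clear",
    "input_audio_buffer.commit",
    "response.cancel",
    "response.create",
    "session.update",
}


def make_client_event(event_type: str, **fields) -> str:
    """Serialize a client event to JSON (hypothetical helper).

    Raises ValueError if the type is not one of the nine listed client events.
    """
    if event_type not in CLIENT_EVENT_TYPES:
        raise ValueError(f"unknown client event type: {event_type}")
    return json.dumps({"type": event_type, **fields})


def make_audio_append(audio_bytes: bytes) -> str:
    """Build an `input_audio_buffer.append` event (hypothetical helper).

    Assumption: the audio bytes travel as base64 text in an `audio` field.
    """
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return make_client_event("input_audio_buffer.append", audio=encoded)
```

Validating the type string before sending is a design choice for this sketch only; it catches typos such as `input_audio_buffer.appended` locally instead of relying on a server `error` event.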

### RealtimeClientEventConversationItemCreate
@@ -281,34 +281,34 @@ There are 28 server events that can be received from the server:
| Event | Description |
|-------|-------------|
-|[RealtimeServerEventConversationCreated](#realtimeservereventconversationcreated)|Server event when a conversation is created. Emitted right after session creation. |
-|[RealtimeServerEventConversationItemCreated](#realtimeservereventconversationitemcreated)|Server event when a conversation item is created. |
-|[RealtimeServerEventConversationItemDeleted](#realtimeservereventconversationitemdeleted)|Server event when an item in the conversation is deleted. |
-|[RealtimeServerEventConversationItemInputAudioTranscriptionCompleted](#realtimeservereventconversationiteminputaudiotranscriptioncompleted)|Server event when input audio transcription is enabled and a transcription succeeds. |
-|[RealtimeServerEventConversationItemInputAudioTranscriptionFailed](#realtimeservereventconversationiteminputaudiotranscriptionfailed)|Server event when input audio transcription is configured, and a transcription request for a user message failed. |
-|[RealtimeServerEventConversationItemTruncated](#realtimeservereventconversationitemtruncated)|Server event when the client truncates an earlier assistant audio message item. |
-|[RealtimeServerEventError](#realtimeservereventerror)|Server event when an error occurs. |
-|[RealtimeServerEventInputAudioBufferCleared](#realtimeservereventinputaudiobuffercleared)|Server event when the client clears the input audio buffer. |
-|[RealtimeServerEventInputAudioBufferCommitted](#realtimeservereventinputaudiobuffercommitted)|Server event when an input audio buffer is committed, either by the client or automatically in server VAD mode. |
-|[RealtimeServerEventInputAudioBufferSpeechStarted](#realtimeservereventinputaudiobufferspeechstarted)|Server event in server turn detection mode when speech is detected. |
-|[RealtimeServerEventInputAudioBufferSpeechStopped](#realtimeservereventinputaudiobufferspeechstopped)|Server event in server turn detection mode when speech stops. |
-|[RealtimeServerEventRateLimitsUpdated](#realtimeservereventratelimitsupdated)|Emitted after every "response.done" event to indicate the updated rate limits. |
-|[RealtimeServerEventResponseAudioDelta](#realtimeservereventresponseaudiodelta)|Server event when the model-generated audio is updated. |
-|[RealtimeServerEventResponseAudioDone](#realtimeservereventresponseaudiodone)|Server event when the model-generated audio is done. Also emitted when a response is interrupted, incomplete, or canceled. |
-|[RealtimeServerEventResponseAudioTranscriptDelta](#realtimeservereventresponseaudiotranscriptdelta)|Server event when the model-generated transcription of audio output is updated. |
-|[RealtimeServerEventResponseAudioTranscriptDone](#realtimeservereventresponseaudiotranscriptdone)|Server event when the model-generated transcription of audio output is done streaming. Also emitted when a response is interrupted, incomplete, or canceled. |
-|[RealtimeServerEventResponseContentPartAdded](#realtimeservereventresponsecontentpartadded)|Server event when a new content part is added to an assistant message item during response generation. |
-|[RealtimeServerEventResponseContentPartDone](#realtimeservereventresponsecontentpartdone)|Server event when a content part is done streaming in an assistant message item. Also emitted when a response is interrupted, incomplete, or canceled. |
-|[RealtimeServerEventResponseCreated](#realtimeservereventresponsecreated)|Server event when a new Response is created. The first event of response creation, where the response is in an initial state of "in_progress". |
-|[RealtimeServerEventResponseDone](#realtimeservereventresponsedone)|Server event when a response is done streaming. Always emitted, no matter the final state. |
-|[RealtimeServerEventResponseFunctionCallArgumentsDelta](#realtimeservereventresponsefunctioncallargumentsdelta)|Server event when the model-generated function call arguments are updated. |
-|[RealtimeServerEventResponseFunctionCallArgumentsDone](#realtimeservereventresponsefunctioncallargumentsdone)|Server event when the model-generated function call arguments are done streaming. Also emitted when a response is interrupted, incomplete, or canceled. |
-|[RealtimeServerEventResponseOutputItemAdded](#realtimeservereventresponseoutputitemadded)|Server event when a new output item is added to a response. |
-|[RealtimeServerEventResponseOutputItemDone](#realtimeservereventresponseoutputitemdone)|Server event when an output item is done streaming. Also emitted when a response is interrupted, incomplete, or canceled. |
-|[RealtimeServerEventResponseTextDelta](#realtimeservereventresponsetextdelta)|Server event when the model-generated text is updated. |
-|[RealtimeServerEventResponseTextDone](#realtimeservereventresponsetextdone)|Server event when the model-generated text is done. Also emitted when a response is interrupted, incomplete, or canceled. |
-|[RealtimeServerEventSessionCreated](#realtimeservereventsessioncreated)|Server event when a session is created. |
-|[RealtimeServerEventSessionUpdated](#realtimeservereventsessionupdated)|Server event when a session is updated. |
+|[RealtimeServerEventConversationCreated](#realtimeservereventconversationcreated)|The server `conversation.created` event is returned right after session creation. One conversation is created per session. |
+|[RealtimeServerEventConversationItemCreated](#realtimeservereventconversationitemcreated)|The server `conversation.item.created` event is returned when a conversation item is created. |
+|[RealtimeServerEventConversationItemDeleted](#realtimeservereventconversationitemdeleted)|The server `conversation.item.deleted` event is returned when the client deleted an item in the conversation with a `conversation.item.delete` event. |
+|[RealtimeServerEventConversationItemInputAudioTranscriptionCompleted](#realtimeservereventconversationiteminputaudiotranscriptioncompleted)|The server `conversation.item.input_audio_transcription.completed` event is the result of audio transcription for speech written to the audio buffer. |
+|[RealtimeServerEventConversationItemInputAudioTranscriptionFailed](#realtimeservereventconversationiteminputaudiotranscriptionfailed)|The server `conversation.item.input_audio_transcription.failed` event is returned when input audio transcription is configured, and a transcription request for a user message failed. |
+|[RealtimeServerEventConversationItemTruncated](#realtimeservereventconversationitemtruncated)|The server `conversation.item.truncated` event is returned when the client truncates an earlier assistant audio message item with a `conversation.item.truncate` event. |
+|[RealtimeServerEventError](#realtimeservereventerror)|The server `error` event is returned when an error occurs, which could be a client problem or a server problem. |
+|[RealtimeServerEventInputAudioBufferCleared](#realtimeservereventinputaudiobuffercleared)|The server `input_audio_buffer.cleared` event is returned when the client clears the input audio buffer with an `input_audio_buffer.clear` event. |
+|[RealtimeServerEventInputAudioBufferCommitted](#realtimeservereventinputaudiobuffercommitted)|The server `input_audio_buffer.committed` event is returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. |
+|[RealtimeServerEventInputAudioBufferSpeechStarted](#realtimeservereventinputaudiobufferspeechstarted)|The server `input_audio_buffer.speech_started` event is returned in `server_vad` mode when speech is detected in the audio buffer. |
+|[RealtimeServerEventInputAudioBufferSpeechStopped](#realtimeservereventinputaudiobufferspeechstopped)|The server `input_audio_buffer.speech_stopped` event is returned in `server_vad` mode when the server detects the end of speech in the audio buffer. |
+|[RealtimeServerEventRateLimitsUpdated](#realtimeservereventratelimitsupdated)|The server `rate_limits.updated` event is emitted at the beginning of a response to indicate the updated rate limits. |
+|[RealtimeServerEventResponseAudioDelta](#realtimeservereventresponseaudiodelta)|The server `response.audio.delta` event is returned when the model-generated audio is updated. |
+|[RealtimeServerEventResponseAudioDone](#realtimeservereventresponseaudiodone)|The server `response.audio.done` event is returned when the model-generated audio is done. |
+|[RealtimeServerEventResponseAudioTranscriptDelta](#realtimeservereventresponseaudiotranscriptdelta)|The server `response.audio_transcript.delta` event is returned when the model-generated transcription of audio output is updated. |
+|[RealtimeServerEventResponseAudioTranscriptDone](#realtimeservereventresponseaudiotranscriptdone)|The server `response.audio_transcript.done` event is returned when the model-generated transcription of audio output is done streaming. |
+|[RealtimeServerEventResponseContentPartAdded](#realtimeservereventresponsecontentpartadded)|The server `response.content_part.added` event is returned when a new content part is added to an assistant message item. |
+|[RealtimeServerEventResponseContentPartDone](#realtimeservereventresponsecontentpartdone)|The server `response.content_part.done` event is returned when a content part is done streaming. |
+|[RealtimeServerEventResponseCreated](#realtimeservereventresponsecreated)|The server `response.created` event is returned when a new response is created. This is the first event of response creation, where the response is in an initial state of `in_progress`. |
+|[RealtimeServerEventResponseDone](#realtimeservereventresponsedone)|The server `response.done` event is returned when a response is done streaming. |
+|[RealtimeServerEventResponseFunctionCallArgumentsDelta](#realtimeservereventresponsefunctioncallargumentsdelta)|The server `response.function_call_arguments.delta` event is returned when the model-generated function call arguments are updated. |
+|[RealtimeServerEventResponseFunctionCallArgumentsDone](#realtimeservereventresponsefunctioncallargumentsdone)|The server `response.function_call_arguments.done` event is returned when the model-generated function call arguments are done streaming. |
+|[RealtimeServerEventResponseOutputItemAdded](#realtimeservereventresponseoutputitemadded)|The server `response.output_item.added` event is returned when a new item is created during response generation. |
+|[RealtimeServerEventResponseOutputItemDone](#realtimeservereventresponseoutputitemdone)|The server `response.output_item.done` event is returned when an item is done streaming. |
+|[RealtimeServerEventResponseTextDelta](#realtimeservereventresponsetextdelta)|The server `response.text.delta` event is returned when the model-generated text is updated. |
+|[RealtimeServerEventResponseTextDone](#realtimeservereventresponsetextdone)|The server `response.text.done` event is returned when the model-generated text is done streaming. |
+|[RealtimeServerEventSessionCreated](#realtimeservereventsessioncreated)|The server `session.created` event is the first server event when you establish a new connection to the Realtime API. This event creates and returns a new session with the default session configuration. |
+|[RealtimeServerEventSessionUpdated](#realtimeservereventsessionupdated)|The server `session.updated` event is returned when a session is updated by the client. If there's an error, the server sends an `error` event instead. |
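
Since every server event in the table above is identified by its type string, a client could route incoming messages through a small dispatcher keyed on that string. The sketch below is one possible shape, not the API's own client: `ServerEventDispatcher` and `collect_text` are hypothetical names, and it assumes each message is a JSON object with a `type` field and that `response.text.delta` carries its text in a `delta` field.

```python
import json
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], None]


class ServerEventDispatcher:
    """Route decoded server events to handlers by type (hypothetical helper)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        # Events with no registered handler are kept for inspection.
        self.unhandled: List[dict] = []

    def on(self, event_type: str, handler: Handler) -> None:
        """Register the handler for one event type, replacing any earlier one."""
        self._handlers[event_type] = handler

    def dispatch(self, raw_message: str) -> bool:
        """Decode one JSON message; return True if a handler ran."""
        event = json.loads(raw_message)
        handler = self._handlers.get(event.get("type"))
        if handler is None:
            self.unhandled.append(event)
            return False
        handler(event)
        return True


def collect_text(raw_messages: List[str]) -> Tuple[str, List[dict]]:
    """Concatenate text deltas and report any events that were not handled.

    Assumption: `response.text.delta` events carry their text in `delta`.
    """
    chunks: List[str] = []
    dispatcher = ServerEventDispatcher()
    dispatcher.on("response.text.delta", lambda e: chunks.append(e.get("delta", "")))
    for raw in raw_messages:
        dispatcher.dispatch(raw)
    return "".join(chunks), dispatcher.unhandled
```

Keeping unhandled events in a list rather than discarding them makes it easier to notice an `error` event that arrived where a `session.updated` event was expected.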