articles/ai-services/openai/realtime-audio-reference.md
14 additions & 14 deletions
@@ -294,19 +294,19 @@ There are 28 server events that can be received from the server:
294
294
|[RealtimeServerEventInputAudioBufferSpeechStopped](#realtimeservereventinputaudiobufferspeechstopped)| Server event in server turn detection mode when speech stops. |
295
295
|[RealtimeServerEventRateLimitsUpdated](#realtimeservereventratelimitsupdated)| Emitted after every "response.done" event to indicate the updated rate limits. |
296
296
|[RealtimeServerEventResponseAudioDelta](#realtimeservereventresponseaudiodelta)| Server event when the model-generated audio is updated. |
297
-
|[RealtimeServerEventResponseAudioDone](#realtimeservereventresponseaudiodone)| Server event when the model-generated audio is done. Also emitted when a response is interrupted, incomplete, or cancelled. |
297
+
|[RealtimeServerEventResponseAudioDone](#realtimeservereventresponseaudiodone)| Server event when the model-generated audio is done. Also emitted when a response is interrupted, incomplete, or canceled. |
298
298
|[RealtimeServerEventResponseAudioTranscriptDelta](#realtimeservereventresponseaudiotranscriptdelta)| Server event when the model-generated transcription of audio output is updated. |
299
-
|[RealtimeServerEventResponseAudioTranscriptDone](#realtimeservereventresponseaudiotranscriptdone)| Server event when the model-generated transcription of audio output is done streaming. Also emitted when a response is interrupted, incomplete, or cancelled. |
299
+
|[RealtimeServerEventResponseAudioTranscriptDone](#realtimeservereventresponseaudiotranscriptdone)| Server event when the model-generated transcription of audio output is done streaming. Also emitted when a response is interrupted, incomplete, or canceled. |
300
300
|[RealtimeServerEventResponseContentPartAdded](#realtimeservereventresponsecontentpartadded)| Server event when a new content part is added to an assistant message item during response generation. |
301
-
|[RealtimeServerEventResponseContentPartDone](#realtimeservereventresponsecontentpartdone)| Server event when a content part is done streaming in an assistant message item. Also emitted when a response is interrupted, incomplete, or cancelled. |
301
+
|[RealtimeServerEventResponseContentPartDone](#realtimeservereventresponsecontentpartdone)| Server event when a content part is done streaming in an assistant message item. Also emitted when a response is interrupted, incomplete, or canceled. |
302
302
|[RealtimeServerEventResponseCreated](#realtimeservereventresponsecreated)| Server event when a new Response is created. The first event of response creation, where the response is in an initial state of "in_progress". |
303
303
|[RealtimeServerEventResponseDone](#realtimeservereventresponsedone)| Server event when a response is done streaming. Always emitted, no matter the final state. |
304
304
|[RealtimeServerEventResponseFunctionCallArgumentsDelta](#realtimeservereventresponsefunctioncallargumentsdelta)| Server event when the model-generated function call arguments are updated. |
305
-
|[RealtimeServerEventResponseFunctionCallArgumentsDone](#realtimeservereventresponsefunctioncallargumentsdone)| Server event when the model-generated function call arguments are done streaming. Also emitted when a response is interrupted, incomplete, or cancelled. |
305
+
|[RealtimeServerEventResponseFunctionCallArgumentsDone](#realtimeservereventresponsefunctioncallargumentsdone)| Server event when the model-generated function call arguments are done streaming. Also emitted when a response is interrupted, incomplete, or canceled. |
306
306
|[RealtimeServerEventResponseOutputItemAdded](#realtimeservereventresponseoutputitemadded)| Server event when a new output item is added to a response. |
307
-
|[RealtimeServerEventResponseOutputItemDone](#realtimeservereventresponseoutputitemdone)| Server event when an output item is done streaming. Also emitted when a response is interrupted, incomplete, or cancelled. |
307
+
|[RealtimeServerEventResponseOutputItemDone](#realtimeservereventresponseoutputitemdone)| Server event when an output item is done streaming. Also emitted when a response is interrupted, incomplete, or canceled. |
308
308
|[RealtimeServerEventResponseTextDelta](#realtimeservereventresponsetextdelta)| Server event when the model-generated text is updated. |
309
-
|[RealtimeServerEventResponseTextDone](#realtimeservereventresponsetextdone)| Server event when the model-generated text is done. Also emitted when a response is interrupted, incomplete, or cancelled. |
309
+
|[RealtimeServerEventResponseTextDone](#realtimeservereventresponsetextdone)| Server event when the model-generated text is done. Also emitted when a response is interrupted, incomplete, or canceled. |
310
310
|[RealtimeServerEventSessionCreated](#realtimeservereventsessioncreated)| Server event when a session is created. |
311
311
|[RealtimeServerEventSessionUpdated](#realtimeservereventsessionupdated)| Server event when a session is updated. |
312
312
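The table above notes that several `*.done` events are also emitted when a response is interrupted, incomplete, or canceled, and that `response.done` is always emitted. A minimal sketch of how a client might sort incoming events on that basis follows. The event type strings come from the reference text; the `classify_event` helper and the assumption that each event arrives as a dictionary with a `type` key are illustrative only.

```python
# Event types that the reference says are also emitted when a response is
# interrupted, incomplete, or canceled.
PART_DONE_EVENTS = frozenset({
    "response.audio.done",
    "response.audio_transcript.done",
    "response.content_part.done",
    "response.function_call_arguments.done",
    "response.output_item.done",
    "response.text.done",
})


def classify_event(event: dict) -> str:
    """Return a coarse category for a server event (hypothetical helper).

    Assumes the event is a dict with a "type" key; that shape is an
    assumption made for this sketch.
    """
    event_type = event.get("type", "")
    if event_type == "response.done":
        # Always emitted, no matter the final state of the response.
        return "response_finished"
    if event_type in PART_DONE_EVENTS:
        # A part finished streaming, possibly because of a cancellation.
        return "part_finished"
    if event_type.endswith(".delta"):
        return "streaming"
    return "other"
```

Because a part-level `*.done` event can arrive after a cancellation, a handler written this way should not treat it as proof that the content is complete; the final state would have to be read from the response itself.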
@@ -662,7 +662,7 @@ The server `response.audio.delta` event is returned when the model-generated aud
662
662
663
663
The server `response.audio.done` event is returned when the model-generated audio is done.
664
664
665
-
This event is also returned when a response is interrupted, incomplete, or cancelled.
665
+
This event is also returned when a response is interrupted, incomplete, or canceled.
666
666
667
667
#### Event structure
668
668
@@ -718,7 +718,7 @@ The server `response.audio_transcript.delta` event is returned when the model-ge
718
718
719
719
The server `response.audio_transcript.done` event is returned when the model-generated transcription of audio output is done streaming.
720
720
721
-
This event is also returned when a response is interrupted, incomplete, or cancelled.
721
+
This event is also returned when a response is interrupted, incomplete, or canceled.
722
722
723
723
#### Event structure
724
724
@@ -781,7 +781,7 @@ The server `response.content_part.added` event is returned when a new content pa
781
781
782
782
The server `response.content_part.done` event is returned when a content part is done streaming in an assistant message item.
783
783
784
-
This event is also returned when a response is interrupted, incomplete, or cancelled.
784
+
This event is also returned when a response is interrupted, incomplete, or canceled.
785
785
786
786
#### Event structure
787
787
@@ -882,7 +882,7 @@ The server `response.function_call_arguments.delta` event is returned when the m
882
882
883
883
The server `response.function_call_arguments.done` event is returned when the model-generated function call arguments are done streaming.
884
884
885
-
This event is also returned when a response is interrupted, incomplete, or cancelled.
885
+
This event is also returned when a response is interrupted, incomplete, or canceled.
886
886
887
887
#### Event structure
888
888
@@ -935,7 +935,7 @@ The server `response.output_item.added` event is returned when a new item is cre
935
935
936
936
The server `response.output_item.done` event is returned when an item is done streaming.
937
937
938
-
This event is also returned when a response is interrupted, incomplete, or cancelled.
938
+
This event is also returned when a response is interrupted, incomplete, or canceled.
939
939
940
940
#### Event structure
941
941
@@ -988,7 +988,7 @@ The server `response.text.delta` event is returned when the model-generated text
988
988
989
989
The server `response.text.done` event is returned when the model-generated text is done streaming. The text corresponds to the `text` content part of an assistant message item.
990
990
991
-
This event is also returned when a response is interrupted, incomplete, or cancelled.
991
+
This event is also returned when a response is interrupted, incomplete, or canceled.
992
992
993
993
#### Event structure
994
994
@@ -1258,7 +1258,7 @@ The definition of a function tool as used by the realtime endpoint.
1258
1258
| status |[RealtimeResponseStatus](#realtimeresponsestatus)| The status of the response.<br><br>The default status value is `in_progress`. |
1259
1259
| status_details |[RealtimeResponseStatusDetails](#realtimeresponsestatusdetails)| The details of the response status.<br><br>This property is nullable. |
1260
1260
| output | array of [RealtimeConversationResponseItem](#realtimeconversationresponseitem)| The output items of the response. |
1261
-
| usage | object | Usage statistics for the response. Each Realtime API session maintains a conversation context and append new items to the conversation. Output from previous turns (text and audio tokens) are input for later turns.<br><br>See nested properties next.|
1261
+
| usage | object | Usage statistics for the response. Each Realtime API session maintains a conversation context and appends new items to the conversation. Output from previous turns (text and audio tokens) is input for later turns.<br><br>See nested properties next.|
1262
1262
| + total_tokens | integer | The total number of tokens in the Response including input and output text and audio tokens.<br><br>A property of the `usage` object. |
1263
1263
| + input_tokens | integer | The number of input tokens used in the response, including text and audio tokens.<br><br>A property of the `usage` object. |
1264
1264
| + output_tokens | integer | The number of output tokens sent in the response, including text and audio tokens.<br><br>A property of the `usage` object. |
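Because output from earlier turns becomes input for later turns, per-response usage tends to grow over a session. The sketch below adds up the three `usage` fields listed in the table across several responses. The `summarize_usage` helper, the dictionary shape of each response, and the token numbers are all hypothetical and only illustrate the bookkeeping.

```python
from typing import Iterable, Mapping

# Field names as listed in the usage table above.
USAGE_FIELDS = ("total_tokens", "input_tokens", "output_tokens")


def summarize_usage(responses: Iterable[Mapping]) -> dict:
    """Sum the usage fields over a sequence of response objects.

    Assumes each response is a mapping with an optional "usage" mapping;
    missing fields count as zero.
    """
    totals = {field: 0 for field in USAGE_FIELDS}
    for response in responses:
        usage = response.get("usage") or {}
        for field in USAGE_FIELDS:
            totals[field] += int(usage.get(field, 0))
    return totals


# Made-up numbers: the second turn has more input tokens because the first
# turn's output is now part of the conversation context.
turns = [
    {"usage": {"total_tokens": 150, "input_tokens": 100, "output_tokens": 50}},
    {"usage": {"total_tokens": 260, "input_tokens": 200, "output_tokens": 60}},
]
print(summarize_usage(turns))
```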
@@ -1420,7 +1420,7 @@ The response resource.
1420
1420
| type | string | The type of turn detection.<br><br>Allowed values: `server_vad`|
1421
1421
| threshold | number | The activation threshold for the server VAD turn detection. In noisy environments, you might need to increase the threshold to avoid false positives. In quiet environments, you might need to decrease the threshold to avoid false negatives.<br><br>Defaults to `0.5`. You can set the threshold to a value between `0.0` and `1.0`. |
1422
1422
| prefix_padding_ms | string | The duration of speech audio (in milliseconds) to include before the start of detected speech.<br><br>Defaults to `300`. |
1423
-
| silence_duration_ms | string | The duration of silence (in milliseconds) to detect the end of speech. You want to detect the end of speech as soon as possible, but not too soon to avoid cutting off the last part of the speech.<br><br>The model will response more quickly if you set this value to a lower number, but it might cut off the last part of the speech. If you set this value to a higher number, the model will wait longer to detect the end of speech, but it might take longer to respond. |
1423
+
| silence_duration_ms | string | The duration of silence (in milliseconds) to detect the end of speech. You want to detect the end of speech as soon as possible, but not too soon to avoid cutting off the last part of the speech.<br><br>The model will respond more quickly if you set this value to a lower number, but it might cut off the last part of the speech. If you set this value to a higher number, the model will wait longer to detect the end of speech, but it might take longer to respond. |
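The turn detection properties above could be assembled and range-checked along these lines. This is a sketch under stated assumptions: `build_turn_detection` is a hypothetical helper, the millisecond values are passed as integers even though the table lists their type as string, and how the resulting object is sent to the service is not shown.

```python
from typing import Optional


def build_turn_detection(
    threshold: float = 0.5,
    prefix_padding_ms: int = 300,
    silence_duration_ms: Optional[int] = None,
) -> dict:
    """Build a server VAD turn detection object (hypothetical helper).

    Defaults for threshold and prefix_padding_ms follow the table above.
    silence_duration_ms is left out unless given, because the text above
    does not state a default for it.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if prefix_padding_ms < 0:
        raise ValueError("prefix_padding_ms must not be negative")

    config = {
        "type": "server_vad",  # the only allowed value listed in the table
        "threshold": threshold,
        "prefix_padding_ms": prefix_padding_ms,
    }
    if silence_duration_ms is not None:
        if silence_duration_ms < 0:
            raise ValueError("silence_duration_ms must not be negative")
        # Lower values mean faster responses but a higher risk of cutting
        # off the end of the user's speech.
        config["silence_duration_ms"] = silence_duration_ms
    return config
```

In a noisy environment, a caller might raise the threshold, for example `build_turn_detection(threshold=0.7)`, to reduce false positives as the table describes.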