Improve the types of RealtimeSession configuration #96
seratch
commented
Jun 12, 2025
- inputAudioTranscription
- turnDetection
🦋 Changeset detected. Latest commit: 7d248d3. The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
| turnDetection: { | ||
| type: 'semantic_vad', | ||
| eagerness: 'medium', | ||
| create_response: true, |
changed the example but either works!
| export type RealtimeInputAudioTranscriptionConfig = { | ||
| language?: string; | ||
| model?: 'gpt-4o-transcribe' | 'gpt-4o-mini-transcribe' | 'whisper-1' | string; |
I allowed this property to accept any other string as well, since we may release new models in the future (and alpha users may use different ones).
We should use `string & {}` here for better type autocomplete.
@dkundel-openai thanks, updated!
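For readers unfamiliar with the `string & {}` suggestion, here is a minimal sketch of the pattern. The type names `KnownTranscriptionModel`, `TranscriptionModel`, and `InputAudioTranscriptionConfigSketch` are hypothetical and only mirror the shape shown in the diff; they are not the SDK's actual declarations.

```typescript
// Hypothetical names for illustration; not the SDK's actual declarations.
type KnownTranscriptionModel =
  | 'gpt-4o-transcribe'
  | 'gpt-4o-mini-transcribe'
  | 'whisper-1';

// A plain `| string` would make the whole union collapse to `string`,
// so editors stop suggesting the known literals. `string & {}` still
// accepts any string but keeps the literal members visible.
type TranscriptionModel = KnownTranscriptionModel | (string & {});

type InputAudioTranscriptionConfigSketch = {
  language?: string;
  model?: TranscriptionModel;
};

// A known model name: offered by autocomplete.
const known: InputAudioTranscriptionConfigSketch = { model: 'whisper-1' };

// An arbitrary (made-up) model name: still type-checks.
const custom: InputAudioTranscriptionConfigSketch = {
  model: 'my-alpha-model',
};
```

Both assignments compile; the difference is only in editor tooling, which is why the change does not affect runtime behavior.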
| return { | ||
| type: c.type, | ||
| create_response: | ||
| 'createResponse' in c |
If both the snake_case and camelCase keys are present, the camelCase one is preferred; happy to change this order.
ed4b4de to
f86d0fc
Compare
| // The Realtime API accepts snake_cased keys, so when using this type, the SDK converts the keys to snake_case before passing them to the API | ||
| export type RealtimeTurnDetectionConfigCamelCase = { | ||
| type?: 'semantic_vad'; |
Type can also be server_vad so we should support both.
| prefixPaddingMs?: number; | ||
| silenceDurationMs?: number; | ||
| threshold?: number; | ||
| }; |
Can we allow these two settings to still accept other properties? Theoretically, you could roll your own Realtime Transport Layer right now with a different session config. But it's also fine to guide people to providerData for that and have them override this entire property.
thanks, updated
dkundel-openai
left a comment
See open comments
f86d0fc to
7d248d3
Compare
| prefixPaddingMs?: number; | ||
| silenceDurationMs?: number; | ||
| threshold?: number; | ||
| }; |
thanks, updated
| threshold, | ||
| ...rest, | ||
| }; | ||
| // Remove undefined values from the config |
When I verified the behavior, undefined values could affect connection establishment, so I added this logic. If my observation is wrong or I'm missing something, please feel free to adjust this part.
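As a rough illustration of the conversion discussed in this thread, the sketch below maps camelCase turn detection keys to snake_case, passes unknown properties through, and then drops `undefined` values. The type `TurnDetectionSketch` and the function `toSnakeCaseTurnDetection` are hypothetical names, and the field list is limited to keys visible in the diff excerpts; the PR's actual implementation may differ (for example, it also checks for snake_case input keys).

```typescript
// Hypothetical sketch; names and field list are assumptions based on the diff.
type TurnDetectionSketch = {
  type?: 'semantic_vad' | 'server_vad';
  createResponse?: boolean;
  eagerness?: string;
  prefixPaddingMs?: number;
  silenceDurationMs?: number;
  threshold?: number;
};

function toSnakeCaseTurnDetection(
  c: TurnDetectionSketch & Record<string, unknown>,
): Record<string, unknown> {
  const {
    type,
    createResponse,
    eagerness,
    prefixPaddingMs,
    silenceDurationMs,
    threshold,
    ...rest // any other properties are passed through unchanged
  } = c;

  const converted: Record<string, unknown> = {
    type,
    create_response: createResponse,
    eagerness,
    prefix_padding_ms: prefixPaddingMs,
    silence_duration_ms: silenceDurationMs,
    threshold,
    ...rest,
  };

  // Remove undefined values so they are never sent as explicit keys.
  for (const key of Object.keys(converted)) {
    if (converted[key] === undefined) {
      delete converted[key];
    }
  }
  return converted;
}

const payload = toSnakeCaseTurnDetection({
  type: 'semantic_vad',
  eagerness: 'medium',
  createResponse: true,
  customFlag: 1, // made-up extra property to show the pass-through
});
```

Here `payload` contains only `type`, `create_response`, `eagerness`, and `customFlag`; the unset keys are absent rather than present with an `undefined` value.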
| const item = event.response.output[event.response.output.length - 1]; | ||
| const textOutput = getLastTextFromAudioOutputMessage(item) ?? ''; | ||
| const itemId = item.id ?? ''; | ||
| const itemId = item?.id ?? ''; |
This is an unrelated existing bug I found while running tests.
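A small sketch of why the optional chaining matters, assuming the failure case is an empty `output` array (the thread does not say which case triggered the bug). `OutputItemSketch` and `lastItemId` are hypothetical names used only for this illustration.

```typescript
// Hypothetical minimal shape; the real event type has more fields.
type OutputItemSketch = { id?: string };

function lastItemId(output: OutputItemSketch[]): string {
  // For an empty array this index is -1, so `item` is undefined at runtime.
  const item = output[output.length - 1];
  // `item.id` would throw a TypeError here; `item?.id` yields undefined instead.
  return item?.id ?? '';
}

const emptyResult = lastItemId([]);
const lastResult = lastItemId([{ id: 'item_a' }, { id: 'item_b' }]);
```

With the guard in place, an empty output falls back to an empty string instead of throwing.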