Description
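The 200 response for /audio/transcriptions defines a discriminated union with task as the discriminator property. Of the three variants, only CreateTranscriptionResponseDiarizedJson defines this property, so the discriminator cannot select CreateTranscriptionResponseJson or CreateTranscriptionResponseVerboseJson.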
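To make the problem concrete, here is a minimal sketch of how a strict, discriminator-driven deserializer might treat this union. It is not the behavior of any particular code generator: the `VARIANTS` table is a hand-reduced copy of the property names in the schemas quoted in this issue, and `build_dispatch_table` and `resolve` are hypothetical helpers written only for illustration.

```python
# Property names copied from the three schemas quoted in this issue.
# Only the `task` enum is kept; every other detail is omitted for brevity.
VARIANTS = {
    "CreateTranscriptionResponseJson": {
        "properties": {"text": {}, "logprobs": {}, "usage": {}},
    },
    "CreateTranscriptionResponseDiarizedJson": {
        "properties": {
            "task": {"enum": ["transcribe"]},
            "duration": {},
            "text": {},
            "segments": {},
            "usage": {},
        },
    },
    "CreateTranscriptionResponseVerboseJson": {
        "properties": {
            "language": {}, "duration": {}, "text": {},
            "words": {}, "segments": {}, "usage": {},
        },
    },
}

DISCRIMINATOR = "task"


def build_dispatch_table(variants, discriminator):
    """Map discriminator values to variant names (hypothetical helper).

    Returns (table, unreachable), where `unreachable` lists the variants
    that never declare the discriminator property.
    """
    table, unreachable = {}, []
    for name, schema in variants.items():
        prop = schema["properties"].get(discriminator)
        if prop is None:
            unreachable.append(name)
            continue
        for value in prop.get("enum", []):
            table[value] = name
    return table, unreachable


def resolve(payload, table, discriminator=DISCRIMINATOR):
    """Pick a variant from the discriminator value alone (strict dispatch)."""
    if discriminator not in payload:
        raise ValueError(f"payload has no {discriminator!r} property to dispatch on")
    return table[payload[discriminator]]


if __name__ == "__main__":
    table, unreachable = build_dispatch_table(VARIANTS, DISCRIMINATOR)
    print("dispatch table:", table)
    print("unreachable variants:", unreachable)
```

Under these assumptions the dispatch table contains a single entry for `transcribe`, and the plain JSON and verbose JSON variants are reported as unreachable. Calling `resolve` on a payload that has only `text` raises `ValueError`, because there is no `task` value to dispatch on.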
The definitions:
/audio/transcriptions:
  post:
    operationId: createTranscription
    tags:
      - Audio
    summary: Create transcription
    requestBody:
      required: true
      content:
        multipart/form-data:
          schema:
            $ref: '#/components/schemas/CreateTranscriptionRequest'
    responses:
      '200':
        description: OK
        content:
          application/json:
            schema:
              anyOf:
                - $ref: '#/components/schemas/CreateTranscriptionResponseJson'
                - $ref: '#/components/schemas/CreateTranscriptionResponseDiarizedJson'
                  x-stainless-skip:
                    - go
                - $ref: '#/components/schemas/CreateTranscriptionResponseVerboseJson'
              discriminator:
                propertyName: task
          text/event-stream:
            schema:
              $ref: '#/components/schemas/CreateTranscriptionResponseStreamEvent'

CreateTranscriptionResponseJson:
  type: object
  description: Represents a transcription response returned by model, based on the provided input.
  properties:
    text:
      type: string
      description: The transcribed text.
    logprobs:
      type: array
      optional: true
      description: >
        The log probabilities of the tokens in the transcription. Only returned with the models
        `gpt-4o-transcribe` and `gpt-4o-mini-transcribe` if `logprobs` is added to the `include` array.
      items:
        type: object
        properties:
          token:
            type: string
            description: The token in the transcription.
          logprob:
            type: number
            description: The log probability of the token.
          bytes:
            type: array
            items:
              type: number
            description: The bytes of the token.
    usage:
      type: object
      description: Token usage statistics for the request.
      anyOf:
        - $ref: '#/components/schemas/TranscriptTextUsageTokens'
          title: Token Usage
        - $ref: '#/components/schemas/TranscriptTextUsageDuration'
          title: Duration Usage
      discriminator:
        propertyName: type
  required:
    - text

CreateTranscriptionResponseDiarizedJson:
  type: object
  description: >
    Represents a diarized transcription response returned by the model, including the combined transcript
    and speaker-segment annotations.
  properties:
    task:
      type: string
      description: The type of task that was run. Always `transcribe`.
      enum:
        - transcribe
      x-stainless-const: true
    duration:
      type: number
      description: Duration of the input audio in seconds.
    text:
      type: string
      description: The concatenated transcript text for the entire audio input.
    segments:
      type: array
      description: Segments of the transcript annotated with timestamps and speaker labels.
      items:
        $ref: '#/components/schemas/TranscriptionDiarizedSegment'
    usage:
      type: object
      description: Token or duration usage statistics for the request.
      discriminator:
        propertyName: type
      anyOf:
        - $ref: '#/components/schemas/TranscriptTextUsageTokens'
          title: Token Usage
        - $ref: '#/components/schemas/TranscriptTextUsageDuration'
          title: Duration Usage

CreateTranscriptionResponseVerboseJson:
  type: object
  description: Represents a verbose json transcription response returned by model, based on the provided input.
  properties:
    language:
      type: string
      description: The language of the input audio.
    duration:
      type: number
      description: The duration of the input audio.
    text:
      type: string
      description: The transcribed text.
    words:
      type: array
      description: Extracted words and their corresponding timestamps.
      items:
        $ref: '#/components/schemas/TranscriptionWord'
    segments:
      type: array
      description: Segments of the transcribed text and their corresponding details.
      items:
        $ref: '#/components/schemas/TranscriptionSegment'
    usage:
      $ref: '#/components/schemas/TranscriptTextUsageDuration'
  required:
    - language
    - duration
    - text