articles/ai-services/speech-service/voice-live.md
Lines changed: 6 additions & 3 deletions
@@ -176,9 +176,12 @@ You're charged separately for the training and model hosting of:
 
 Tokens are the units that generative AI models use to process input and generate output.
 
-You can estimate the cost with Voice Live API based on audio length as follows:
-- Each second of input audio is approximately 10 tokens.
-- Each second of output audio is approximately 20 tokens.
+You can estimate token usage for different model families with the Voice Live API based on audio length. The following token calculations apply to each model family:
+
+| Model family | Input audio (tokens per second) | Output audio (tokens per second) |
+| ----- | ----- | ----- |
+| Azure OpenAI models |~10 tokens |~20 tokens |
+| Phi models |~12.5 tokens |~20 tokens |
 
 You're also charged for cached audio and text inputs, including the prompt and the context of the conversations.
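
As a rough illustration of the per-model-family rates introduced in this change, the following Python sketch estimates audio token counts from audio duration. The function name `estimate_audio_tokens` and the dictionary keys are hypothetical and are not part of the Voice Live API. The rates are the approximate values from the table, so the results are estimates, not billing figures, and they exclude cached audio and text inputs.

```python
# Approximate tokens per second of audio, taken from the table in this change.
# The keys "azure_openai" and "phi" are illustrative labels, not API identifiers.
TOKENS_PER_SECOND = {
    "azure_openai": {"input": 10.0, "output": 20.0},
    "phi": {"input": 12.5, "output": 20.0},
}


def estimate_audio_tokens(model_family, input_seconds, output_seconds):
    """Return estimated input and output audio tokens for a model family."""
    rates = TOKENS_PER_SECOND[model_family]
    return {
        "input_tokens": input_seconds * rates["input"],
        "output_tokens": output_seconds * rates["output"],
    }


if __name__ == "__main__":
    # One minute of user speech and 30 seconds of generated speech.
    print(estimate_audio_tokens("phi", 60, 30))
    print(estimate_audio_tokens("azure_openai", 60, 30))
```

With these assumed rates, the same 60 seconds of input audio comes to roughly 750 tokens for Phi models and roughly 600 tokens for Azure OpenAI models, while the output estimate is the same for both.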