Transcription Errors When Using Nova-3 #1500
-
|
Hi, we have a voice agent service built in Python that uses Deepgram’s Speech-to-Text WebSocket API for real-time transcription. During live sessions, the user’s speech is transcribed via Flux. After the session ends, we send the session’s recording in an HTTP request to get a transcription of the whole session, using the Nova-3 model.

All our recordings have 2 channels: the user and the AI agent. The agent channel contains a synthetic voice generated by a Text-to-Speech model, and the user channel is organic speech.

Our question concerns the second step, the transcription of the whole recording. In the sample whose request ID is given below, at around 1:20, the user says “Okay. I would like that.”, but Nova-3 transcribes it as “Okay, I won't let you do it.”

Request ID: a6697888-b156-4b55-a2e1-c7bfd17d3b94

We would appreciate your help and any insight into why this might be happening and how we can improve such cases. Thanks in advance. |
Replies: 3 comments
-
|
Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently. |
-
|
Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com - being verified through this process will allow our team to help you in a much more streamlined fashion. |
-
|
I have a similar recommendation here as I did with #1499, specifically around confidence scores. What I see is that this has a relatively low confidence score. It is higher than my initial recommendation of 0.65 but, just as an example, we usually provide scores > 0.90. My suggestion for improving this is to verify what the user said whenever a low-confidence transcript comes through. |
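To make the suggestion above concrete, here is a minimal sketch of how low-confidence words could be picked out of a pre-recorded transcription response so they can be verified. It assumes the response has already been parsed into a Python dict and follows the `results.channels[].alternatives[].words[]` layout, with a `confidence` value on each word. The function name, the threshold constant, and the sample values are hypothetical and only illustrate the idea; they are not taken from the request discussed in this thread.

```python
from typing import Any

# Assumption: 0.90 mirrors the "usually > 0.90" remark in the reply above.
# Tune this value against your own recordings.
REVIEW_THRESHOLD = 0.90


def low_confidence_words(
    response: dict[str, Any], threshold: float = REVIEW_THRESHOLD
) -> list[dict[str, Any]]:
    """Collect words whose confidence falls below `threshold`, per channel."""
    flagged = []
    channels = response.get("results", {}).get("channels", [])
    for channel_index, channel in enumerate(channels):
        alternatives = channel.get("alternatives", [])
        if not alternatives:
            continue
        # Only the first (top-ranked) alternative is inspected here.
        for word in alternatives[0].get("words", []):
            if word.get("confidence", 0.0) < threshold:
                flagged.append(
                    {
                        "channel": channel_index,
                        "word": word.get("word"),
                        "start": word.get("start"),
                        "confidence": word.get("confidence"),
                    }
                )
    return flagged


# Hypothetical response fragment with invented values, for illustration only.
sample_response = {
    "results": {
        "channels": [
            {  # channel 0: user
                "alternatives": [
                    {
                        "words": [
                            {"word": "okay", "start": 80.1, "confidence": 0.98},
                            {"word": "i", "start": 80.6, "confidence": 0.95},
                            {"word": "won't", "start": 80.9, "confidence": 0.71},
                        ]
                    }
                ]
            },
            {  # channel 1: agent
                "alternatives": [
                    {"words": [{"word": "great", "start": 82.0, "confidence": 0.99}]}
                ]
            },
        ]
    }
}

flagged = low_confidence_words(sample_response)
```

In a voice agent, the flagged entries on the user channel could trigger a confirmation prompt during the session or be queued for manual review afterwards. Which of the two fits better depends on the product.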