Korean transcription missing word spacing in Nova-3 (regression from Nova-2) #1452
Replies: 5 comments 1 reply
Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com. Being verified through this process will allow our team to help you in a much more streamlined fashion.
It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?
I'm also experiencing the same issue.
+1, same issue here. Interestingly, the pre-recorded API returns proper spacing, but the streaming API returns no spaces at all.
Problem
Nova-3 Korean transcription is returning text without spaces between words, while Nova-2 correctly included spacing. The word-level timestamps in the API response show that words are being correctly segmented internally, but this spacing is not reflected in the final transcript text.
Expected Behavior
Korean transcripts should include spaces between words, as they did in Nova-2.
Current Behavior
Nova-3: Returns Korean text without any spacing
Nova-2: Correctly returned Korean text with proper spacing
Evidence of Internal Word Segmentation
The words array in the API response shows that Nova-3 is correctly identifying word boundaries: each word entry has its own start and end timestamps. This suggests the spacing logic works internally but isn't being applied to the final transcript string.
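Until this is fixed, a client-side workaround might be to rebuild the transcript from the word-level entries. The snippet below is only a sketch: it assumes the response follows the usual pre-recorded layout (results, channels, alternatives, words), that each entry has a word field and optionally punctuated_word, and the helper name rebuild_transcript and the sample data are made up for illustration rather than taken from real API output.

```python
from typing import Any


def rebuild_transcript(response: dict[str, Any], channel: int = 0) -> str:
    """Join word-level entries with single spaces to recover word spacing.

    Hypothetical helper; the response layout is an assumption, not confirmed
    by this thread.
    """
    alternative = response["results"]["channels"][channel]["alternatives"][0]
    words = alternative.get("words", [])
    # Prefer the punctuated form when present; fall back to the raw word.
    tokens = [w.get("punctuated_word") or w["word"] for w in words]
    return " ".join(tokens)


# Hand-written sample (not real API output) that mimics the reported symptom:
# the transcript has no spaces, but the words array is segmented.
sample = {
    "results": {"channels": [{"alternatives": [{
        "transcript": "안녕하세요만나서반갑습니다",
        "words": [
            {"word": "안녕하세요", "start": 0.0, "end": 0.6},
            {"word": "만나서", "start": 0.7, "end": 1.1},
            {"word": "반갑습니다", "start": 1.1, "end": 1.8},
        ],
    }]}]}
}

spaced = rebuild_transcript(sample)
print(spaced)
```

This only restores spacing at the boundaries the model already reports, so it is a stopgap and not a substitute for the transcript string being built correctly on the server side.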
API Details
Model: nova-3
Language: Korean (ko)
Request
Please apply the same word spacing logic to Nova-3's Korean transcripts that was used in Nova-2. Since the word segmentation is already working (as evidenced by the timestamps), this should hopefully be a straightforward fix to include spaces in the concatenated transcript text.