I'm testing the new "nvidia/multitalker-parakeet-streaming-0.6b-v1" on example audio files. Is there a built-in way for the utterances of different speakers to be interleaved in the seglst output? With an audio file of two people speaking back and forth, the output .json has two entries (one per speaker), while I would like the entries to be in utterance order, split at each speaker change. My current workaround is to modify … Thanks for any help.
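For concreteness, the ordering I'm after could be produced by post-processing along the lines of the sketch below. The field names (`start_time`, `speaker`, `words`) are my assumption based on the usual SegLST convention and may not match the actual model output exactly:

```python
import json

# Rough sketch of the desired interleaving: sort seglst entries
# chronologically so each speaker turn appears in utterance order.
# (This only helps once the output contains one entry per utterance
# rather than one entry per speaker.)
with open("output.seglst.json") as f:
    segments = json.load(f)

segments.sort(key=lambda seg: float(seg["start_time"]))

for seg in segments:
    print(f"[{float(seg['start_time']):.2f}s] {seg['speaker']}: {seg['words']}")
```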
Replies: 1 comment
Hi.
If you want to see frequent speaker changes in the transcription, check out the following line:
NeMo/examples/asr/asr_cache_aware_streaming/speech_to_text_multitalker_streaming_infer.py (line 97 at commit 66ffb38)
Setting a very small value there, e.g. `sent_break_sec=0.1`, will break the sentences very often, and you will then see more interleaved transcriptions. However, often there is no "speaker change" at all, because real-life conversations contain lots of overlapped speech; that is why the multitalker ASR model does not use the concept of a "speaker change".
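To illustrate what this reply describes, here is a minimal sketch (not the actual NeMo implementation) of how a sentence-break threshold like `sent_break_sec` behaves: a new segment starts whenever the speaker changes or the silence between consecutive words exceeds the threshold, so a small value yields many short, interleavable segments. The word-tuple layout here is purely hypothetical:

```python
# Minimal sketch, NOT the actual NeMo code: split a timestamped word
# stream into segments at speaker changes or at pauses longer than
# the sent_break_sec threshold.
SENT_BREAK_SEC = 0.1  # very small value, as suggested above

def split_into_segments(words, sent_break_sec=SENT_BREAK_SEC):
    """words: list of (start, end, speaker, token), sorted by start time.
    The tuple layout is hypothetical, for illustration only."""
    segments = []
    current = None
    for start, end, speaker, token in words:
        new_segment = (
            current is None
            or speaker != current["speaker"]
            or start - current["end_time"] > sent_break_sec
        )
        if new_segment:
            current = {"speaker": speaker, "start_time": start,
                       "end_time": end, "words": token}
            segments.append(current)
        else:
            current["end_time"] = end
            current["words"] += " " + token
    return segments

# Example: two alternating speakers; a 0.1 s threshold produces one
# segment per turn instead of one blob per speaker.
words = [
    (0.0, 0.4, "spk0", "hello"), (0.5, 0.9, "spk0", "there"),
    (1.2, 1.6, "spk1", "hi"), (2.0, 2.3, "spk0", "how's"),
    (2.35, 2.7, "spk0", "it"), (2.75, 3.0, "spk0", "going"),
]
for seg in split_into_segments(words):
    print(seg)
```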