Description
Hi,
I’m using the following command to transcribe from a live microphone:
.\whisper-stream.exe -c 1 -t 8 --step 2000 --length 8000 --keep 0 -mt 64 -ac 0 -bs -1 -vth 0.60 -fth 80.0 -l en -m ggml-large-v3-turbo.bin
The real-time transcription displayed on the CMD screen looks accurate, for example:
[Start speaking]
top of the English Premier League on a landmark day for Bukayo Saka and manager Mikel Arteta.
Saka marked his 200th league appearance with a goal in this 2-0 win over West Ham. The victory
We saw Arsenal finish the day a point clear of reigning champions Liverpool. It was Arteta's 300th game
in charge of the London club. I wanted to celebrate with a win. I've got it.
However, when I use the -f argument to save the transcript to a file, the output becomes repetitive, like this:
top of the English primary league.
top of the English Premier League on a landmark day for Bukayo Saka.
top of the English Premier League on a landmark day for Bukayo Saka and manager Mikel Arteta.
Sokka marked his 200th
Saka marked his 200th league appearance with a goal in this 2-0 win.
Saka marked his 200th league appearance with a goal in this 2-0 win over West Ham. The victory
We saw Arsenal finish the day a point clear
We saw Arsenal finish the day a point clear of reigning champions Liverpool.
We saw Arsenal finish the day a point clear of reigning champions Liverpool. It was Arteta's 300th game.
in charge of the London club.
in charge of the London club. I wanted to celebrate the
in charge of the London club. I wanted to celebrate with a win. I've got it.
Could you please advise why the output file contains duplicate or partial lines while the live CMD output remains clean and accurate?
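In the meantime, one possible stopgap is to post-process the saved file. The file appears to receive every intermediate pass over the same audio window, while the console seems to overwrite the current line in place, so keeping only the last line of each run of near-identical beginnings should approximate what the screen shows. The Python sketch below is not part of whisper.cpp; the function names and the 0.75 similarity threshold are assumptions chosen for illustration, and the fuzzy comparison is a heuristic that may need tuning for other recordings.

```python
from difflib import SequenceMatcher


def is_revision(prev: str, cur: str, threshold: float = 0.75) -> bool:
    """Guess whether `cur` is a re-transcription of the same audio as `prev`.

    The threshold is an assumed value, not something whisper-stream defines.
    """
    k = min(len(prev), len(cur))
    if k == 0:
        return False
    # Compare only the overlapping leading part, ignoring case, so that small
    # corrections ("primary league" -> "Premier League") still count as a match.
    a, b = prev[:k].lower(), cur[:k].lower()
    return SequenceMatcher(None, a, b).ratio() >= threshold


def collapse_revisions(lines):
    """Keep only the last line of each run of successive revisions."""
    kept = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if kept and is_revision(kept[-1], line):
            kept[-1] = line  # newer pass replaces the older partial line
        else:
            kept.append(line)  # a new utterance starts here
    return kept
```

For example, `collapse_revisions(open("transcript.txt", encoding="utf-8"))` would return the cleaned lines, where `transcript.txt` is a placeholder for whatever path was passed to -f. A fuzzy match is used instead of a strict prefix test because later passes sometimes correct earlier words ("Sokka" becoming "Saka"), so the earlier line is not always an exact prefix of the later one.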
Also, is it possible to enable or specify a separate VAD model or parameter in whisper-stream for more accurate speech boundary detection?
Best regards,