[Bug] --suppress_tokens: default value wreaking havoc #1488
Replies: 3 comments 2 replies
-
@misutoneko I did not understand, can you please explain it to me in a simpler way?
-
Well, basically you just add --suppress_tokens "" to your command line. That's the workaround that fixes it for me. Here's an example I've used for testing:
If you're not using the command line, I guess the same thing can be accomplished by changing the default value of suppress_tokens from -1 to "" in the code.
-
I have tried that in Python, and it does not work. Passing suppress_tokens="" as in model.transcribe(path, beam_size=4, suppress_tokens="") blows up. It turns out that what it needs is a list (to iterate through), as shown in the message: suppress_tokens: Optional[List[int]] = [-1]. So -1 is the default, but the parameter takes a list of integers. Replacing it with suppress_tokens=[0] does not blow up, but it still hallucinates 'Thank you' every second. I also tried [1]; it goes for a bit and then produces 'Thank you for watching' and 'Thank you' until Input Overflowed. I will try the CLI next for comparison.
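One way to picture the mismatch described here: the command line hands over a string, while the Python signature quoted above expects a list of ints. The following is only a minimal sketch, assuming the CLI value is a comma-separated list of token ids. `parse_suppress_tokens` is a hypothetical helper written for illustration, not a function from the library.

```python
from typing import List


def parse_suppress_tokens(value: str) -> List[int]:
    """Turn a CLI-style string such as "-1" or "1,2,7" into a list of ints.

    Hypothetical helper (an assumption, not library code): an empty
    string maps to an empty list, i.e. "suppress nothing".
    """
    # Skip empty pieces so that "" yields [] instead of failing on int("").
    return [int(part) for part in value.split(",") if part.strip()]


print(parse_suppress_tokens("-1"))      # [-1]
print(parse_suppress_tokens(""))        # []
print(parse_suppress_tokens("1, 2,7"))  # [1, 2, 7]
```

If that assumption holds, the Python counterpart of --suppress_tokens "" would be suppress_tokens=[] rather than suppress_tokens="" or suppress_tokens=[0]. This has not been checked against the library itself.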
-
Hi,
Continuation from #928
I've been wondering about various timing issues and hallucinations, such as the ones listed there.
It looks like these kinds of problems can happen whenever the sample starts with a non-speech region.
Btw my samples are very short (most are under 30s) and some samples may not have speech at all.
Anyway, I think I've finally found a workaround, which is to always use this option:
--suppress_tokens ""
If this option isn't present, it falls back to its default value of -1,
which doesn't seem to work as intended.
No idea how to fix it, though :(
Do note that some postprocessing may be needed with this option (as nothing is suppressed).
So the default -1 would be nicer from that pov. Well, if it worked, that is...
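As a rough idea of what such postprocessing could look like: with nothing suppressed, the output may contain non-speech annotations. This is a minimal sketch, assuming those annotations show up as text in square brackets or parentheses, or as music-note symbols. `strip_non_speech` and its patterns are illustrative assumptions, not part of Whisper.

```python
import re

# Assumed annotation shapes: [Music], (laughs), and runs of music notes.
_ANNOTATION = re.compile(r"\[[^\]]*\]|\([^)]*\)|[♪♫]+")


def strip_non_speech(text: str) -> str:
    """Remove bracketed or parenthesised tags and music notes from a transcript."""
    cleaned = _ANNOTATION.sub(" ", text)
    # Collapse the whitespace left behind by the removed annotations.
    return re.sub(r"\s+", " ", cleaned).strip()


print(strip_non_speech("[Music] Hello there (laughs) ♪♪"))
```

Segments that come back empty after this step could then be dropped. Note that the sketch also removes parentheses that belong to real speech, so the patterns would need tuning per use case.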