-
If you haven't had success transcribing noisy audio with the large-v2 model, using a tool such as demucs to separate the vocals from 'everything else' before transcribing may help. Spleeter is another option (I haven't used it). demucs is headlined as providing "music source separation", so I don't know how well it would perform on your particular audio with FX noise; it's worth experimenting.
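For illustration, here is a rough Python sketch of that "separate first, then transcribe" idea. It is not the exact command from the original reply. It assumes the `demucs` CLI and the `openai-whisper` package are installed, and that demucs writes its output to `separated/<model name>/<track name>/vocals.wav` (the `htdemucs` folder name is an assumption and may differ between demucs versions). The helper names and `episode01.mp3` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_demucs_command(audio_path, out_dir="separated"):
    """Build a demucs call that splits the input into vocals and 'everything else'."""
    return ["demucs", "--two-stems=vocals", "-o", str(out_dir), str(audio_path)]


def expected_vocals_path(audio_path, out_dir="separated", model_name="htdemucs"):
    """Guess where demucs puts the vocals stem.

    Assumption: output layout is <out_dir>/<model_name>/<track name>/vocals.wav.
    Check the folder demucs actually creates on your machine.
    """
    return Path(out_dir) / model_name / Path(audio_path).stem / "vocals.wav"


def separate_and_transcribe(audio_path, whisper_model="large-v2"):
    """Run demucs, then transcribe only the separated vocals with Whisper."""
    subprocess.run(build_demucs_command(audio_path), check=True)
    vocals = expected_vocals_path(audio_path)

    import whisper  # imported here so the helpers above work without it installed

    model = whisper.load_model(whisper_model)
    result = model.transcribe(str(vocals), language="en")
    return result["text"]


if __name__ == "__main__":
    # Hypothetical file name; replace with your own audio drama episode.
    print(build_demucs_command("episode01.mp3"))
    print(expected_vocals_path("episode01.mp3"))
```

Calling `separate_and_transcribe("episode01.mp3")` would run both steps. It is worth transcribing the original file as well and comparing the two results, since separation can also remove or distort parts of the speech.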
-
Hey there!
As my hearing is not so great, the voices usually get lost amid various sound effects (gunshots, street noise, background distortion, etc.), and I can't follow the context of certain audio dramas. It's just a big mess for me sometimes.
That's why I'm asking for the community's help: which model and settings should I use for this type of transcription, when the source is not clean, includes a lot of extra sound effects, and can be hard to make out even for a native English speaker?
Has anyone figured out the optimal usage in that case?
I appreciate all the help here and thank you so much in advance!