Replies: 1 comment 2 replies
-
You should preprocess the audio files with noise reduction or suppression before passing them to Whisper. There is no perfect pipeline, but good options are demucs https://github.com/facebookresearch/demucs for separating tracks, noisereduce https://github.com/timsainb/noisereduce for noise suppression, and voicefixer https://github.com/haoheliu/voicefixer for speech enhancement.
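As a rough illustration of how these steps could be chained, here is a minimal sketch, not a tuned pipeline. It assumes `demucs`, `noisereduce`, `soundfile`, and `openai-whisper` are installed. The function names, the `htdemucs` model choice, the `base` Whisper model, and the assumed Demucs output layout (`<out_dir>/<model>/<track name>/vocals.wav`) are illustrative assumptions, and voicefixer is left out to keep it short.

```python
import subprocess
from pathlib import Path


def demucs_command(audio_path: str, out_dir: str, model: str = "htdemucs") -> list:
    """Build the Demucs CLI call that splits a file into vocals and everything else."""
    return ["demucs", "--two-stems=vocals", "-n", model, "-o", out_dir, audio_path]


def vocals_path(audio_path: str, out_dir: str, model: str = "htdemucs") -> Path:
    """Where the vocals stem is expected to land (assumed Demucs output layout)."""
    return Path(out_dir) / model / Path(audio_path).stem / "vocals.wav"


def transcribe_cleaned(audio_path: str, out_dir: str = "separated",
                       whisper_model: str = "base") -> str:
    # Third-party imports are local so the helpers above work without them.
    import noisereduce as nr
    import soundfile as sf
    import whisper

    # 1. Separate the vocals from music and other background sources.
    subprocess.run(demucs_command(audio_path, out_dir), check=True)
    vocals = vocals_path(audio_path, out_dir)

    # 2. Suppress the remaining noise; downmix to mono first so the array is 1-D.
    data, rate = sf.read(str(vocals))
    if data.ndim > 1:
        data = data.mean(axis=1)
    reduced = nr.reduce_noise(y=data, sr=rate)
    cleaned = vocals.with_name("vocals_denoised.wav")
    sf.write(str(cleaned), reduced, rate)

    # 3. Transcribe the cleaned file with Whisper.
    model = whisper.load_model(whisper_model)
    return model.transcribe(str(cleaned))["text"]
```

Calling `transcribe_cleaned("meeting.wav")` (a hypothetical file name) would run all three steps. It is worth comparing the result against Whisper on the raw file, since aggressive denoising can sometimes remove speech detail as well.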
-
Dear
I am working on converting the speech of a specific individual into text from audio that contains both background noise and speech from other people. Could you advise on the appropriate pipeline to achieve clear and accurate transcription? Should I perform steps such as noise reduction and speaker isolation, or can advanced models handle the raw audio data effectively?
Best,
Payam