Whisper as an automatic corpus annotation tool in Lhotse #209
pzelasko started this conversation in Show and tell
In the latest release of Lhotse, we added an option to use Whisper to segment and transcribe unlabeled recordings and save the results as a Lhotse CutSet manifest. We also support forced alignment with torchaudio's pretrained Wav2Vec2 ASR models to get word-level timestamps.
Benefits for Whisper users: you get access to all of Lhotse's capabilities for data preparation (mixing or truncating multiple examples into one, merging multiple datasets), augmentation (noise mixing, speed perturbation, reverberation, SpecAugment, etc.), and data sampling and dataloading for PyTorch model training.
Benefits for Lhotse users: a familiar interface for using Whisper to annotate your data.
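To make the annotation idea concrete, here is a minimal, standard-library-only sketch of the core step: turning Whisper-style segment output into per-recording segment annotations. It is an illustration, not Lhotse's actual implementation. The `Segment` class and the `segments_from_whisper` helper are hypothetical names invented for this example, and the input dict only mimics the `language` and `segments` (`start`, `end`, `text`) fields that Whisper's transcription result contains.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Segment:
    """Hypothetical stand-in for a supervision segment (not Lhotse's class)."""

    id: str
    recording_id: str
    start: float      # segment start, in seconds
    duration: float   # segment length, in seconds
    text: str
    language: str


def segments_from_whisper(recording_id: str, result: Dict[str, Any]) -> List[Segment]:
    """Convert a Whisper-style result dict into a list of Segment objects.

    Segments with empty text or non-positive duration are skipped.
    """
    segments = []
    for idx, seg in enumerate(result["segments"]):
        text = seg["text"].strip()
        duration = round(seg["end"] - seg["start"], 3)
        if not text or duration <= 0:
            continue
        segments.append(
            Segment(
                id=f"{recording_id}-{idx:06d}",
                recording_id=recording_id,
                start=seg["start"],
                duration=duration,
                text=text,
                language=result["language"],
            )
        )
    return segments


# Hand-written example input that imitates the shape of a Whisper result.
example_result = {
    "language": "en",
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " Hello world."},
        {"start": 2.5, "end": 2.5, "text": " "},  # empty segment, skipped
        {"start": 3.0, "end": 5.25, "text": " This is a test."},
    ],
}

for segment in segments_from_whisper("rec1", example_result):
    print(segment)
```

In this sketch each segment keeps the start time and duration relative to its recording, so the annotations can be attached back to the audio they came from.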
Want to learn more about using Lhotse? See our tutorials here.