Whisper as an automatic corpus annotation tool in Lhotse #209

pzelasko · 2022-09-30T19:17:45Z

In the latest release of Lhotse, we added an option to use Whisper to segment and transcribe unlabeled recordings and save the results as a Lhotse CutSet manifest. We also support forced alignment with torchaudio's pretrained Wav2Vec2 ASR models to get word-level timestamps.
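
As a rough illustration of the workflow described above, here is a minimal Python sketch. It assumes that `lhotse` and `openai-whisper` are installed and that the workflow functions are exposed as `annotate_with_whisper` and `align_with_torchaudio`, with `model_name` and `device` keyword arguments; check these against the Lhotse version you have. The helper `annotate_directory`, the directory path, and the output file name are hypothetical and used only for this example.

```python
from pathlib import Path


def annotate_directory(audio_dir, output_path="cuts.jsonl.gz",
                       model_name="base", device="cpu", align=True):
    """Transcribe every WAV file under audio_dir and write a CutSet manifest.

    Hypothetical helper for illustration; returns the number of cuts written.
    """
    # Imported lazily so this module can be loaded without Lhotse installed.
    # The workflow function names below are assumptions about the Lhotse API.
    from lhotse import (
        CutSet,
        RecordingSet,
        align_with_torchaudio,
        annotate_with_whisper,
    )

    # Build a recording manifest from the unlabeled audio files.
    recordings = RecordingSet.from_dir(Path(audio_dir), pattern="*.wav")

    # Whisper segments and transcribes each recording, yielding cuts lazily.
    cuts = annotate_with_whisper(recordings, model_name=model_name, device=device)

    # Optionally add word-level timestamps via Wav2Vec2 forced alignment.
    if align:
        cuts = align_with_torchaudio(cuts)

    # Write cuts one by one so a long run never holds everything in memory.
    num_written = 0
    with CutSet.open_writer(output_path) as writer:
        for cut in cuts:
            writer.write(cut)
            num_written += 1
    return num_written
```

Calling `annotate_directory("/path/to/recordings")` would then produce a `cuts.jsonl.gz` manifest that can be loaded back with `CutSet.from_file`. Writing incrementally is a deliberate choice here: annotation of a large corpus can take hours, and a streaming writer keeps partial results on disk.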

Benefits for Whisper users: you gain access to all of Lhotse's capabilities for data preparation (mixing or truncating multiple examples into one, merging multiple datasets), augmentation (noise mixing, speed perturbation, reverberation, SpecAugment, etc.), and data sampling and dataloading for PyTorch model training.

Benefits for Lhotse users: a familiar interface for using Whisper to annotate your data.

Want to learn more about using Lhotse? See our tutorials here.
