r50206v/COMSE6998-Speech-Recognition

  1. Install Kaldi and the Montreal Forced Aligner.
  2. Create these folders in the repository root directory: org_audio, segmented_audio, train-result.
  3. Run bash pipeline-train.sh.
  4. The trained model is zipped into model.zip in the repository root directory.
  5. Run bash pipeline-test.sh.
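
The steps above can be sketched as one script, run from the repository root. This is a minimal sketch, not part of the repository: it assumes Kaldi and the Montreal Forced Aligner are already installed, and it skips any pipeline script it cannot find instead of failing.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Step 2: create the working folders in the repository root.
# mkdir -p is a no-op for folders that already exist.
mkdir -p org_audio segmented_audio train-result

# Steps 3-5: run training first (it produces model.zip), then testing.
# Assumption: both scripts sit in the current directory and take no arguments.
for script in pipeline-train.sh pipeline-test.sh; do
    if [ -f "$script" ]; then
        bash "$script"
    else
        echo "skipping $script (not found in $(pwd))" >&2
    fi
done
```

Because of `set -e`, a failure in pipeline-train.sh stops the script before pipeline-test.sh runs, so testing never starts against a missing or stale model.zip.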

background music: piano music played by Tim Shevlyakov

rock music: Always there for you

country music: Mama tried

training set: TED-LIUM, prepared with the Kaldi recipe tedlium/s5_r3

Note: the lexicon dictionary comes from kaldi/egs/tedlium/s5_r3/data/local/lang_nosp/align_lexicon.txt.

Note: the kaldi-scp/*.scp files come from kaldi/egs/tedlium/s5_r3/data/train, kaldi/egs/tedlium/s5_r3/data/test, and kaldi/egs/tedlium/s5_r3/data/dev.

Note: the text files come from kaldi/egs/tedlium/s5_r3/data/train/text, kaldi/egs/tedlium/s5_r3/data/dev/text, and kaldi/egs/tedlium/s5_r3/data/test/text.

Note: only partial data (under org_audio, segmented_audio, train-result and text) is uploaded. To get the full dataset, run kaldi/egs/tedlium/s5_r3/run.sh, then run prepare_convert_to_wav.py to convert the sph files to wav.
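
A hedged sketch of that data-preparation step follows. `KALDI_ROOT` is a hypothetical variable introduced here for illustration (defaulting to a `kaldi` checkout next to this repository), and the sketch assumes prepare_convert_to_wav.py takes no arguments; check the script before relying on that.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: KALDI_ROOT points at your Kaldi checkout; defaults to ./kaldi.
KALDI_ROOT="${KALDI_ROOT:-kaldi}"
RECIPE_DIR="$KALDI_ROOT/egs/tedlium/s5_r3"

if [ -f "$RECIPE_DIR/run.sh" ]; then
    # Run the recipe from its own directory so its relative paths resolve.
    (cd "$RECIPE_DIR" && bash run.sh)
else
    echo "recipe not found at $RECIPE_DIR; set KALDI_ROOT" >&2
fi

# Assumption: the conversion script is run from the repository root
# and needs no command-line arguments.
if [ -f prepare_convert_to_wav.py ]; then
    python prepare_convert_to_wav.py
fi
```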
