Replies: 3 comments
-
Which model did you use? Could you explain how to do this? I want to do it for Japanese, because none of the Japanese wav2vec2 models I found work well; the English one works best. It would be helpful if you could share how you used the multilingual one.
-
You can check https://github.com/MahmoudAshraf97/ctc-forced-aligner
-
@empz I was trying to follow this, but it's not working. I am also trying to do multilingual transcription. Could you share a gist? Thank you!
-
I don't know much about ML, but I was able to use the following tutorial to do forced alignment on multilingual transcriptions. The only requirement is to romanize the transcript, which I did with the uroman package.
https://pytorch.org/audio/stable/tutorials/forced_alignment_for_multilingual_data_tutorial.html
According to that tutorial, it uses a Wav2Vec2 model to do this, and I successfully aligned multiple languages. There's an extra step involved in mapping each aligned (romanized) word back to the corresponding original, non-romanized word, but that's pretty much it.
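For the mapping step, here is a minimal sketch of one way to do it. It assumes the transcript is already split into words and that romanization keeps the word count and order (for Japanese, which has no spaces, you would need a tokenizer first). The names `normalize_romanized`, `map_back`, and `AlignedWord` are made up for illustration, the per-word normalization is a simplification of what the tutorial does, and the timestamps are placeholder values rather than real aligner output.

```python
import re
from dataclasses import dataclass


@dataclass
class AlignedWord:
    original: str    # word as it appears in the source transcript
    romanized: str   # normalized romanized form that was actually aligned
    start: float     # start time in seconds
    end: float       # end time in seconds


def normalize_romanized(word: str) -> str:
    """Lowercase and keep only a-z and apostrophes (simplified normalization)."""
    word = word.lower().replace("\u2019", "'")
    return re.sub(r"[^a-z']", "", word)


def map_back(original_words, romanized_words, spans):
    """Pair each aligned span with the original word it came from.

    Assumes romanized_words[i] is the romanization of original_words[i].
    Words whose normalized form is empty (e.g. bare punctuation) are skipped,
    because they were never passed to the aligner.
    """
    if len(original_words) != len(romanized_words):
        raise ValueError("original and romanized word counts differ")

    kept = []
    for orig, rom in zip(original_words, romanized_words):
        norm = normalize_romanized(rom)
        if norm:
            kept.append((orig, norm))

    if len(kept) != len(spans):
        raise ValueError("number of aligned spans does not match alignable words")

    return [
        AlignedWord(orig, rom, start, end)
        for (orig, rom), (start, end) in zip(kept, spans)
    ]


# Hypothetical inputs: romanized forms and timestamps are illustrative only.
original = ["Привет,", "-", "мир!"]
romanized = ["Privet,", "-", "mir!"]
spans = [(0.10, 0.52), (0.60, 0.95)]

result = map_back(original, romanized, spans)
for word in result:
    print(word)
```

Keeping the index-based pairing explicit makes it easy to spot the cases where the assumption breaks, since a count mismatch raises an error instead of silently shifting every timestamp by one word.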
Thoughts?