We haven't tried fine-tuning, but it could be a good avenue for research, evaluating Whisper models as a pretrained representation for unseen languages. We have observed some transfer between linguistically adjacent languages, such as Asturian <-> Spanish (Castilian) or Cebuano <-> Filipino (Tagalog). So if your language of interest has an adjacent language that works acceptably in Whisper, you could fine-tune on your dataset using that language token, as in the sketch below.
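Concretely, a minimal sketch of the "borrow an adjacent language's token" idea, using the Hugging Face `transformers` port of Whisper (one common route; the original `openai/whisper` codebase can be fine-tuned similarly). The model size, the Spanish-for-Asturian pairing, and the dummy audio/transcript are illustrative assumptions, not a recipe confirmed in this thread:

```python
import numpy as np
from transformers import WhisperProcessor, WhisperForConditionalGeneration

model_name = "openai/whisper-small"

# Borrow the adjacent language's token: Spanish stands in for Asturian here.
processor = WhisperProcessor.from_pretrained(
    model_name, language="spanish", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained(model_name)

# Force decoding to start with <|es|><|transcribe|>, so the unseen language
# is treated as Spanish at inference time as well.
model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(
    language="spanish", task="transcribe"
)

# One training example: log-Mel features from the audio, label ids from the
# reference transcript (the tokenizer prepends the language/task tokens).
audio = np.zeros(16000, dtype=np.float32)  # stand-in for a real 16 kHz Asturian clip
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer("reference transcript", return_tensors="pt").input_ids

loss = model(input_features=inputs.input_features, labels=labels).loss
loss.backward()  # plug this step into your usual training loop or Seq2SeqTrainer
```

For real training you would still do the standard label preprocessing (e.g. masking padding tokens with -100 in a data collator); this sketch only shows where the adjacent-language token enters the setup.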
