Hello, I am fine-tuning on a Czech-language dataset. The complication is that I have enough data in terms of total audio length, but it is of lower quality: the recordings come from the Czech Parliament, so they carry poor room acoustics. My idea is to run a first fine-tuning stage on this Parliament dataset (roughly 900 hours of preselected recordings) to teach the model pronunciation and prosody. Once that is done, I would run a second fine-tuning stage on a different, much smaller dataset (50-60 hours) with good audio quality to teach the model clean audio quality. Is it possible to do it this way? Can I hope for good results?
Thanks for your response.
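
To make the plan concrete, here is a minimal sketch of the two-stage schedule described above. Everything except the dataset sizes is an assumption for illustration: the checkpoint names, epoch counts, learning rates, and the 240 seconds of audio per batch are hypothetical placeholders, not values from any particular training script, and 55 hours is simply the midpoint of the 50-60 hour range.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Stage:
    name: str
    dataset_hours: float       # total audio available for this stage
    epochs: int                # hypothetical placeholder, tune for your setup
    learning_rate: float       # hypothetical placeholder, tune for your setup
    init_from: Optional[str]   # hypothetical checkpoint name to start from


def steps_per_epoch(dataset_hours: float, batch_seconds: float) -> int:
    """Approximate optimizer steps per epoch for a given audio budget per batch."""
    total_seconds = dataset_hours * 3600
    return max(1, round(total_seconds / batch_seconds))


def build_plan(batch_seconds: float = 240.0) -> List[Tuple[Stage, int]]:
    """Return each stage paired with its approximate total step count."""
    # Stage 1: large, noisier Parliament data for pronunciation and prosody.
    stage1 = Stage("parliament", 900, 1, 1e-4, "pretrained.pth")
    # Stage 2: small, clean data, starting from the stage 1 result,
    # with a lower learning rate so stage 1 knowledge is not overwritten.
    stage2 = Stage("clean", 55, 5, 1e-5, "stage1_final.pth")
    return [
        (s, s.epochs * steps_per_epoch(s.dataset_hours, batch_seconds))
        for s in (stage1, stage2)
    ]


if __name__ == "__main__":
    for stage, steps in build_plan():
        print(f"{stage.name}: ~{steps} steps, lr={stage.learning_rate}, "
              f"init_from={stage.init_from}")
```

The point of the sketch is only the ordering: the second stage starts from the first stage's checkpoint and sees far fewer steps, so its learning rate and epoch count are the main knobs I would expect to have to tune.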