training data to TTS voice model #48
Replies: 1 comment
-
An .npz file isn't a model; it's a file containing information about the voice (specifically the previous fine (EnCodec), coarse (EnCodec), and semantic (wav2vec) tokens). You can create a custom .npz from a short audio clip in the clone section of the Bark text-to-speech (TTS) page. Make sure you have a good clip for cloning, ideally in .wav format (you can use ffmpeg to convert it). serp-ai will soon release code for fine-tuning Bark as well; I'll implement it as soon as I can, and I'll see whether I can implement some other things to make fine-tuning easier.
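To see for yourself that an .npz is just a bundle of token arrays rather than model weights, you can open one with NumPy. The snippet below is a rough sketch, not the web UI's own code: the key names `semantic_prompt`, `coarse_prompt`, and `fine_prompt` and the array shapes are assumptions based on how Bark's bundled speaker prompts are laid out, and the file it inspects is a stand-in filled with random tokens. `describe_voice_npz` and `demo_voice.npz` are made-up names for illustration.

```python
import os
import tempfile

import numpy as np

# Key names assumed from Bark's bundled speaker prompts; check them against your own file.
EXPECTED_KEYS = ("semantic_prompt", "coarse_prompt", "fine_prompt")


def describe_voice_npz(path):
    """Return {array_name: shape} for every array stored in the .npz file."""
    with np.load(path) as data:
        return {name: data[name].shape for name in data.files}


# Build a stand-in file with random tokens so the sketch runs without a real voice clip.
# Shapes are assumptions: 1 semantic stream, 2 coarse codebooks, 8 fine codebooks.
rng = np.random.default_rng(0)
demo_path = os.path.join(tempfile.mkdtemp(), "demo_voice.npz")
np.savez(
    demo_path,
    semantic_prompt=rng.integers(0, 10_000, size=(100,)),
    coarse_prompt=rng.integers(0, 1024, size=(2, 150)),
    fine_prompt=rng.integers(0, 1024, size=(8, 150)),
)

shapes = describe_voice_npz(demo_path)
missing = [key for key in EXPECTED_KEYS if key not in shapes]
print(shapes)
print("missing keys:", missing)
```

Pointing `describe_voice_npz` at a real cloned voice file should list a handful of integer token arrays and nothing resembling network layers, which is why a .pth checkpoint can't simply be converted into one.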
-
I am trying to make TTS models from audio clips that I have on my PC. I can train a model in the tab, and from there I know I can generate speech with TTS and send it to RVC to get my voice generated. I was wondering if there is a way to make the data that I made work directly in TTS, or if there is a tool to turn those .pth files into an .npz model? If anyone has anything, it would be much appreciated!