Hello,
Thanks for open-sourcing this work. It's very valuable for understanding how denoising diffusion models behave in the audio domain.
I've started a new training experiment using the 'base_test' config. The only change I made was to use a 6-hour dataset scraped from YouTube.
What would be a recommended training time in terms of epochs?
So far I've trained for about 850 epochs, and the model still outputs only very noisy results, with nothing resembling the dataset at all. The loss is currently around 0.015.
Does it still make sense to continue this training run, or has the experiment collapsed?
Including some best practices and recommendations for training would be great!
Also, what dataset size would you recommend for training an unconditional generation model?
Thanks again!