Training time when training from scratch & recommended dataset size

Hello,

Thanks for open-sourcing this work. It is very valuable for understanding how denoising diffusion models behave in the audio domain.

I've started a new training experiment using the 'base_test' config. The only change I made was to train on a 6-hour dataset scraped from YouTube.

What would be a recommended training time in terms of epochs?

So far I've trained for about 850 epochs, and the model still outputs only very noisy results, with nothing resembling the dataset at all. The loss is currently around 0.015.
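Since an epoch count depends on dataset and batch size, the number of optimizer steps may be a more comparable figure. Below is a minimal sketch of that conversion, assuming the dataset is cut into non-overlapping fixed-length clips. The function name `optimizer_steps` is hypothetical, and the `clip_seconds` and `batch_size` values are placeholders for illustration, not the values from the 'base_test' config.

```python
import math


def optimizer_steps(dataset_seconds, clip_seconds, batch_size, epochs):
    """Approximate optimizer steps taken over a number of epochs.

    Assumes non-overlapping fixed-length clips and one optimizer step
    per batch (no gradient accumulation).
    """
    clips_per_epoch = math.floor(dataset_seconds / clip_seconds)
    steps_per_epoch = math.ceil(clips_per_epoch / batch_size)
    return steps_per_epoch * epochs


# 6 hours of audio and 850 epochs come from this issue;
# clip_seconds and batch_size are placeholder assumptions.
steps = optimizer_steps(
    dataset_seconds=6 * 3600,
    clip_seconds=5.0,
    batch_size=32,
    epochs=850,
)
print(steps)
```

Substituting the real clip length and batch size from the config would give a step count that can be compared against other training runs.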

Does it still make sense to keep training, or has the experiment collapsed?
Including some best practices and recommendations for training would be great!

Also, what dataset size would you recommend for training an unconditional generation model?

Thanks again!

Training time when training from scratch & recommended dataset size #13
