What does the loss graph / time (or iterations) look like for a working training setup? #38
Unanswered
StevenSchrembeck asked this question in Q&A

I'm training a SemanticTransformer with the pre-made trainer, and the loss graph isn't promising. It falls rapidly from 6 to ~5, then stays there even 1000 iterations later. That might not be nearly enough iterations to know, but I expected it to fall further, faster.

If you have a converging SemanticTransformer, what does your loss graph look like? Are you using an out-of-the-box dataset I can also test as a control?

Much appreciated!
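For reference, the pre-made trainer setup from the audiolm-pytorch README looks roughly like the sketch below; the checkpoint paths and hyperparameters are illustrative placeholders, not necessarily the exact values used in this run:

```python
# Sketch of the semantic stage, following the audiolm-pytorch README.
# Paths and hyperparameters are placeholders.
from audiolm_pytorch import HubertWithKmeans, SemanticTransformer, SemanticTransformerTrainer

# feature extraction: a pre-trained HuBERT checkpoint plus its k-means
# clustering model, used to produce the semantic token targets
wav2vec = HubertWithKmeans(
    checkpoint_path = './hubert/hubert_base_ls960.pt',
    kmeans_path = './hubert/hubert_base_ls960_L9_km500.bin'
)

semantic_transformer = SemanticTransformer(
    num_semantic_tokens = wav2vec.codebook_size,
    dim = 1024,
    depth = 6
)

# the pre-made trainer; it reports the training loss as it steps
trainer = SemanticTransformerTrainer(
    transformer = semantic_transformer,
    wav2vec = wav2vec,
    folder = '/path/to/audio/files',
    batch_size = 4,
    data_max_length = 320 * 32,
    num_train_steps = 1_000_000
)

trainer.train()
```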
Replies: 2 comments 2 replies
-
Hey, what dataset are you using? And what architectural details (number of heads, depth, etc.) and feature-extraction details (pre-trained model, k-means clustering model)? I got things working reasonably well (loss falling to more like ~2 and outputs starting to move towards what you'd expect, more details here) with LibriSpeech, which is quite a small dataset compared to the one used in AudioLM for speech.
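  Since the thread is about comparing loss curves: a simple, library-agnostic way to compare runs is to record per-step loss values yourself (e.g. from the trainer's printed output) and plot loss against step. A generic sketch, not part of audiolm-pytorch:

  ```python
  # Generic plotting sketch, not part of audiolm-pytorch: assumes you have
  # collected (step, loss) pairs yourself, e.g. by parsing the trainer's log.
  import matplotlib.pyplot as plt

  def plot_loss_curve(steps, losses, label):
      plt.plot(steps, losses, label=label)
      plt.xlabel('training step')
      plt.ylabel('loss')
      plt.title('SemanticTransformer training loss')
      plt.legend()
      plt.show()

  # example: a converging run should keep descending past the ~5 plateau
  # plot_loss_curve(steps, losses, label='LibriSpeech, depth 6')
  ```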