Replies: 3 comments 12 replies
-
I think before this commit I was also getting loss around 2 that wouldn't decrease, so I'd check that a) you have the latest version and b) you're able to intentionally overfit a tiny dataset. If the loss is still stuck at 2 even on a tiny dataset, that looks like a sign of a bug.

After the commit, I took audiolm_pytorch, trained it on a very tiny dataset made up of the same data file copied a bunch of times, and was able to intentionally overfit it so the loss was near (or sometimes exactly) 0, and the output sounded pretty similar to the input! Here are the original input and what I was able to generate with the overfitted network:

out_305380_mp4_format.mp4
original_sample_mp4_format.mp4

Since the dataset was intentionally so tiny, I didn't really need a ton of steps: I set 5000, but the loss went to 0 really fast, so that was probably unnecessary. Batch size and grad-accum-every were both 1, and I trained on a 1-second sample. The sample I used is at 24 kHz, since I'm using EnCodec and wanted to avoid any potential issues in case there's a bug in the resampling part (rather than the model itself).
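A minimal sketch of how such an overfitting dataset can be built, i.e. one audio file duplicated into a folder the trainer can point at (the function name, folder layout, and copy count here are my own illustration, not from the original post):

```python
import shutil
from pathlib import Path

def make_overfit_dataset(source_wav: str, out_dir: str, copies: int = 16) -> list:
    """Duplicate a single audio file `copies` times so the trainer sees
    the same sample over and over. If the model and data pipeline are
    healthy, the loss on this dataset should drop to ~0 quickly."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    src = Path(source_wav)
    paths = []
    for i in range(copies):
        dst = out / f"{src.stem}_{i}{src.suffix}"
        shutil.copyfile(src, dst)  # byte-for-byte copy of the same clip
        paths.append(dst)
    return paths
```

If the loss refuses to go near 0 even on a dataset like this, the problem is almost certainly in the code or setup rather than in the amount or quality of training data.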
-
Sorry for jumping in about the semantic transformer: if I want to train on a non-English language, should I first train HuBERT/wav2vec on that language?
-
On the same subject but from a different angle: which wav2vec model did you use? I'm using HuBERT with 500 k-means clusters, but in the paper they used w2v-BERT with 1000 clusters. Has anyone trained with 1000 clusters?
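For context on why the cluster count matters: the semantic tokens are just the ids of the k-means clusters fitted on the SSL model's features, so choosing 500 vs. 1000 clusters directly sets the semantic vocabulary size the transformer predicts over. A toy, pure-Python sketch of that quantization step (the centroids and features below are made up for illustration; real ones come from HuBERT/w2v-BERT features):

```python
def assign_semantic_tokens(frames, centroids):
    """Map each feature frame to the index of its nearest k-means
    centroid (squared Euclidean distance). With K centroids, the
    resulting token ids lie in [0, K), which is why K (500 or 1000)
    is the semantic vocabulary size."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(centroids)), key=lambda k: sq_dist(f, centroids[k]))
            for f in frames]

# Toy example: 2-D "features", K = 3 clusters.
centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
frames = [(0.1, -0.2), (0.9, 1.2), (4.8, 5.1), (1.1, 0.8)]
tokens = assign_semantic_tokens(frames, centroids)  # -> [0, 1, 2, 1]
```

More clusters give a finer-grained discretization of the speech features at the cost of a larger vocabulary for the semantic transformer to model.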
-
Hey, has anyone succeeded in training an AudioLM model from this repo, or even just the SemanticTransformer, and is willing to share the training params? How long did you train? Any other tips?
I'm using a batch size of 64 and 2 s per sample, with the other settings at their defaults. The loss drops to ~2 very quickly, but after that the model doesn't seem to get any better.
Thanks in advance!
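One detail worth double-checking when comparing setups: if I'm reading the repo correctly, the trainers' crop-length argument (`data_max_length`) is given in raw samples, so "2 sec per sample" means a different number depending on the codec's sample rate (24 kHz for EnCodec, 16 kHz input for HuBERT). A tiny sanity-check helper (the function name is mine, not from the library):

```python
def crop_length_in_samples(seconds: float, sample_rate: int) -> int:
    """Convert a clip duration to the raw-sample count expected by a
    `data_max_length`-style argument."""
    return int(seconds * sample_rate)

# 2-second crops at EnCodec's 24 kHz vs. HuBERT's 16 kHz input rate:
two_sec_encodec = crop_length_in_samples(2, 24_000)  # 48000
two_sec_hubert = crop_length_in_samples(2, 16_000)   # 32000
```

Passing a sample count computed at the wrong rate silently changes the effective clip length, which can make runs with nominally identical hyperparameters behave quite differently.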