FineTransformer Training with Multi-GPU using codec from Encodec #128
Liujingxiu23 started this conversation in General
Since training SoundStream can be very time-consuming (I only have several 16 GB V100s, no A100), I want to train the coarse-to-fine model at the same time to speed up my experiments. I use the codec from Facebook's Encodec to train a FineTransformer, so I changed the dataloader to return (coarse_code, fine_code), which are extracted beforehand and saved as numpy arrays, and revised some of the code to train the model.
The problem is that I cannot train the model with multiple GPUs, while training on one GPU works fine. The error occurs right at the Accelerate prepare stage.
The relevant code is:

(self.transformer, self.optim, self.dl, self.valid_dl) = self.accelerator.prepare(
    self.transformer, self.optim, self.dl, self.valid_dl
)
The key error info is (batch size = 48):
RuntimeError: The size of tensor a (64) must match the size of tensor b (0) at non-singleton dimension 2
And part of my code is:

class C2FDataset:
    ...
    def __getitem__(self, idx):
        return (coarse_codes, fine_codes)  # coarse [48, 6, 32], fine [48, 10, 32]
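For context, the dataset loads the pre-extracted Encodec codes from disk, roughly like the sketch below (the file layout, key names, and per-utterance dict format are only illustrative of my setup, not the exact code):

import numpy as np
import torch
from pathlib import Path
from torch.utils.data import Dataset

class C2FDataset(Dataset):
    # one .npy file per utterance, holding a dict with the pre-extracted
    # 'coarse' and 'fine' Encodec code arrays (illustrative layout)
    def __init__(self, code_dir):
        self.paths = sorted(Path(code_dir).glob('*.npy'))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        codes = np.load(self.paths[idx], allow_pickle=True).item()
        coarse = torch.from_numpy(codes['coarse']).long()
        fine = torch.from_numpy(codes['fine']).long()
        return coarse, fine

The DataLoader then collates these into the batched coarse/fine tensors with the shapes shown above.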
The training code is like this:

for _ in range(self.grad_accum_every):
    coarse, fine = next(self.dl_iter)
    data_kwargs = {"coarse_token_ids": coarse, "fine_token_ids": fine}
    loss = self.train_wrapper(**data_kwargs)
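For completeness, the rest of the step follows the pattern of the existing trainers in this repo, with Accelerate handling the backward (simplified sketch, the scaling and optimizer calls written from memory):

    # still inside the accumulation loop, right after computing the loss:
    self.accelerator.backward(loss / self.grad_accum_every)

# after the accumulation loop:
self.optim.step()
self.optim.zero_grad()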
Do you have any suggestions on how to solve this problem?