FineTransformer Training with Multi-GPU using codec from Encodec #128
Liujingxiu23 started this conversation in General
Since training SoundStream can be very time-consuming (I only have several 16 GB V100s, no A100), I want to train the coarse-to-fine model at the same time to speed up my experiments. I use the codec from Facebook's Encodec to train a FineTransformer, so I changed the dataloader to return (coarse_code, fine_code), which are extracted beforehand and saved as numpy arrays, and revised some of the code to train the model.
The problem is that I cannot train the model with multiple GPUs, while training on one GPU works fine. The error occurs right at the Accelerate prepare stage.
The relevant code is:

(self.transformer, self.optim, self.dl, self.valid_dl) = self.accelerator.prepare(
    self.transformer, self.optim, self.dl, self.valid_dl
)
The key error info is (batch size = 48):
RuntimeError: The size of tensor a (64) must match the size of tensor b (0) at non-singleton dimension 2
And part of my code is:

class C2FDataset:
    ...
    def __getitem__(self, idx):
        return (coarse_codes, fine_codes)  # coarse [48, 6, 32], fine [48, 10, 32]
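For context, the dataset loads the pre-extracted Encodec codes from disk, roughly like the sketch below (the file layout, key names, and per-utterance dict format are only illustrative of my setup, not the exact code):

import numpy as np
import torch
from pathlib import Path
from torch.utils.data import Dataset

class C2FDataset(Dataset):
    # one .npy file per utterance, holding a dict with the pre-extracted
    # 'coarse' and 'fine' Encodec code arrays (illustrative layout)
    def __init__(self, code_dir):
        self.paths = sorted(Path(code_dir).glob('*.npy'))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        codes = np.load(self.paths[idx], allow_pickle=True).item()
        coarse = torch.from_numpy(codes['coarse']).long()
        fine = torch.from_numpy(codes['fine']).long()
        return coarse, fine

The DataLoader then collates these into the batched coarse/fine tensors with the shapes shown above.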
The training code is like this:

for _ in range(self.grad_accum_every):
    coarse, fine = next(self.dl_iter)
    data_kwargs = {"coarse_token_ids": coarse, "fine_token_ids": fine}
    loss = self.train_wrapper(**data_kwargs)
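For completeness, the rest of the step follows the pattern of the existing trainers in this repo, with Accelerate handling the backward (simplified sketch, the scaling and optimizer calls written from memory):

    # still inside the accumulation loop, right after computing the loss:
    self.accelerator.backward(loss / self.grad_accum_every)

# after the accumulation loop:
self.optim.step()
self.optim.zero_grad()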
Do you have any suggestions on how to solve this problem?