What is the proper "batch_size" value to pass to self.log in DDP mode? #13522
Unanswered
dlnp2 asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Hello there, I came across a warning about passing `batch_size` to `self.log`, but I'm not sure what value to use. I searched for similar discussions but could not clearly work out what to do in my situation.

The code runs in DDP mode with `pl.Trainer(..., gpus=8, accumulate_grad_batches=16, replace_sampler_ddp=True)`, and the batch size specified in my DataModule is 2, which is in turn passed to each DataLoader. So, in my understanding, the per-step batch size over all processes is 2 * 8 = 16, and the effective batch size after gradient accumulation is 2 * 8 * 16 = 256. Should I pass 2, 16, 256, or another value?
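For concreteness, here is a minimal sketch of the setup (`ToyModel` and `ToyDataModule` are illustrative stand-ins, not my actual code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class ToyDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=2):
        super().__init__()
        self.batch_size = batch_size  # per-process (per-GPU) batch size

    def train_dataloader(self):
        data = TensorDataset(torch.randn(1024, 8), torch.randn(1024, 1))
        return DataLoader(data, batch_size=self.batch_size)


class ToyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.net(x), y)
        # Should batch_size here be 2 (per process), 16 (2 * 8 GPUs),
        # or 256 (2 * 8 * 16 accumulation steps)?
        self.log("train_loss", loss, batch_size=2)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=1e-2)


trainer = pl.Trainer(gpus=8, accumulate_grad_batches=16, replace_sampler_ddp=True)
trainer.fit(ToyModel(), datamodule=ToyDataModule(batch_size=2))
```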
Another question: the results seem to be exactly the same in both cases, whether `batch_size` in `self.log` is manually specified as 2 or inferred by Lightning (figure below: cyan for the former, pink for the latter). Is this the correct behavior? The loss is MSE, and the Lightning version is 1.6.4.
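Concretely, the two variants I compared look like this inside `training_step` (the metric name is illustrative):

```python
# Variant A: batch_size specified manually (per-process value)
self.log("train_loss", loss, batch_size=2)

# Variant B: batch_size left for Lightning to infer from the batch
self.log("train_loss", loss)
```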
Thank you.