Batch size 1 gradient accumulation #8807

mfarazi1991 · 2024-01-22T02:01:11Z

Hello,

I have an issue with variable-size graphs/meshes. I need to include some matrices in my PyG data loader, but I cannot batch them because they have different sizes (N by N, M by M, and so on). I tried batch size 1 with gradient accumulation, but the models I use overfit very quickly, which they should not (the learning rate is already handled). Any idea what the root cause could be? Gradient accumulation works fine for images in plain PyTorch, but maybe not for graphs?

rusty1s (Maintainer) · 2024-01-23T19:17:26Z

Is the question more related to how to use mini-batching here? I think you have two ways for that:

- Pad the dense matrices to equal shape
- Represent them as nested tensors (fully supported as part of our DataLoader)
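
For the padding option, a minimal sketch in plain PyTorch could look like the following. The helper name `pad_square_matrices` and the boolean mask are assumptions made for illustration and are not part of PyG; the sketch also assumes the matrices are square and that zero is an acceptable fill value for your model.

```python
import torch


def pad_square_matrices(matrices):
    """Zero-pad square matrices to a common size.

    Hypothetical helper (not a PyG API). Returns the padded batch of shape
    [num_matrices, max_n, max_n] and a boolean mask marking the real entries.
    """
    max_n = max(m.size(0) for m in matrices)
    batch = torch.zeros(len(matrices), max_n, max_n, dtype=matrices[0].dtype)
    mask = torch.zeros(len(matrices), max_n, max_n, dtype=torch.bool)
    for i, m in enumerate(matrices):
        n = m.size(0)
        batch[i, :n, :n] = m    # copy the real values into the top-left corner
        mask[i, :n, :n] = True  # remember which entries are not padding
    return batch, mask


# Two matrices of different size (2 x 2 and 3 x 3) become one [2, 3, 3] tensor.
mats = [torch.ones(2, 2), torch.ones(3, 3)]
padded, mask = pad_square_matrices(mats)
```

The mask lets later code ignore padded entries, for example when computing a loss or a reduction over the matrix.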

Besides that, it is hard to say why gradient accumulation wouldn't work here. I don't think this is necessarily an issue with graph-based data.
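
One thing that may be worth ruling out is the loss scaling in the accumulation loop. This is only a guess about the setup, not a diagnosis. The sketch below uses a toy `torch.nn.Linear` model and random tensors as stand-ins (both are assumptions, not the code from the question) to show that dividing each per-sample loss by the number of accumulation steps reproduces the gradient of a real mini-batch with a mean-reduced loss. Without that division, the accumulated gradient is `accum_steps` times larger.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in data: 8 samples, processed one at a time (batch size 1).
samples = [(torch.randn(4), torch.randn(1)) for _ in range(8)]
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
accum_steps = 4

# Gradient accumulated over 4 single samples, each loss scaled by 1 / accum_steps.
chunk = samples[:accum_steps]
optimizer.zero_grad()
for x, y in chunk:
    (F.mse_loss(model(x), y) / accum_steps).backward()
accum_grads = [p.grad.clone() for p in model.parameters()]

# Reference: gradient of the mean loss over the same 4 samples as one batch.
optimizer.zero_grad()
xs = torch.stack([x for x, _ in chunk])
ys = torch.stack([y for _, y in chunk])
F.mse_loss(model(xs), ys).backward()
batch_grads = [p.grad.clone() for p in model.parameters()]

grads_match = all(
    torch.allclose(a, b, atol=1e-6) for a, b in zip(accum_grads, batch_grads)
)

# Training loop: one optimizer step per accum_steps samples.
optimizer.zero_grad()
num_updates = 0
for step, (x, y) in enumerate(samples, start=1):
    loss = F.mse_loss(model(x), y) / accum_steps
    loss.backward()  # gradients add up in p.grad until zero_grad() is called
    if step % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        num_updates += 1
```

If the two gradients match in a check like this on the real model, the accumulation loop itself is probably not the cause of the overfitting.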
