I've discovered something new about how batches are formed when the number of images in a bucket doesn't divide exactly by the batch size. I had assumed that PyTorch or sd-scripts would distribute the images evenly across the batches, e.g. 16 images at batch size 5 split as [4, 4, 4, 4].
But now that I've put that assumption to the test, batches are actually full-sized until the final few images, so reality currently seems to be [5, 5, 5, 1].
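This matches stock PyTorch behaviour. A quick sanity check, assuming a plain `BatchSampler` with `drop_last=False` (sd-scripts' own bucket logic may differ in detail, but the batch shapes come out the same as in the test above):

```python
from torch.utils.data import BatchSampler, SequentialSampler

bucket = range(16)  # a hypothetical bucket of 16 images
batches = list(BatchSampler(SequentialSampler(bucket), batch_size=5, drop_last=False))
print([len(b) for b in batches])  # [5, 5, 5, 1]: full batches, then the remainder
```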
Flux doesn't seem to respond anywhere near as well to batch size 1 as it does to higher batch sizes, so I think the [4, 4, 4, 4] arrangement might offer higher-quality training than [5, 5, 5, 1]. I'll probably write a custom batch sampler function to distribute the images evenly into batches and see whether that yields further quality gains; a rough sketch is below.
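A minimal sketch of what that sampler could look like, assuming the standard PyTorch `Sampler` interface; the class name and the wiring into sd-scripts' bucket manager are hypothetical:

```python
import math
import random

from torch.utils.data import Sampler


class EvenBatchSampler(Sampler):
    """Yield index batches whose sizes differ by at most one image.

    For 16 indices at max_batch_size=5 this yields [4, 4, 4, 4]
    instead of the default [5, 5, 5, 1].
    """

    def __init__(self, indices, max_batch_size, shuffle=True):
        self.indices = list(indices)
        self.max_batch_size = max_batch_size
        self.shuffle = shuffle

    def __len__(self):
        # Same number of batches as the default greedy split.
        return math.ceil(len(self.indices) / self.max_batch_size)

    def __iter__(self):
        indices = self.indices[:]
        if self.shuffle:
            random.shuffle(indices)
        n_batches = len(self)
        # Split as evenly as possible: the first `extra` batches get one more image.
        base, extra = divmod(len(indices), n_batches)
        start = 0
        for i in range(n_batches):
            size = base + (1 if i < extra else 0)
            yield indices[start:start + size]
            start += size
```

Passed to a `DataLoader` via `batch_sampler=`, it replaces the loader's own `batch_size`/`drop_last` handling, so no batch ever shrinks to a single image.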