Aspect Ratio Bucketing

Understanding Aspect Ratio (AR) Bucketing: A Simple Guide

What you have:

Training Resolution: 1024 pixels
Batch size: 4
Pixel Limit (aka Budget): 1,048,576 pixels
(Because 1024×1024=1,048,576)

Handling aspect ratios properly leads to better image generations. OneTrainer does this with Aspect Ratio Bucketing. Here's how it works.

Creating Aspect Ratio Buckets

The program defines its buckets relative to the training resolution, using the all_possible_input_aspects list shown further down.

Reading Each Image

The program looks at every image in your dataset and notes down its width and height.

Finding the Best Bucket for Each Image

For every image, the program figures out which bucket is the closest fit to the image's aspect ratio.
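
The lookup described above can be sketched in a few lines. This is a minimal illustration, not OneTrainer's actual code: the function name `closest_bucket` is hypothetical, and measuring "closest" as the absolute difference between width/height ratios is an assumption. The aspect list is copied from the all_possible_input_aspects section of this guide.

```python
# Aspect pairs as (width, height), copied from this guide.
ALL_POSSIBLE_INPUT_ASPECTS = [
    (4.0, 1.0), (3.5, 1.0), (3.0, 1.0), (2.5, 1.0), (2.0, 1.0),
    (1.75, 1.0), (1.5, 1.0), (1.25, 1.0), (1.0, 1.0),
    (1.0, 1.25), (1.0, 1.5), (1.0, 1.75), (1.0, 2.0),
    (1.0, 2.5), (1.0, 3.0), (1.0, 3.5), (1.0, 4.0),
]


def closest_bucket(width, height):
    """Return the (width, height) aspect pair nearest to the image's AR.

    Hypothetical helper. Assumption: distance is the absolute difference
    between the two width/height ratios.
    """
    image_ar = width / height
    return min(
        ALL_POSSIBLE_INPUT_ASPECTS,
        key=lambda aspect: abs(aspect[0] / aspect[1] - image_ar),
    )


print(closest_bucket(1920, 1080))  # a 16:9 landscape image
print(closest_bucket(1080, 1920))  # the same image rotated to portrait
```

A different distance measure (for example, comparing ratios on a log scale) could pick a different bucket for images that sit roughly halfway between two entries.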

Adjusting the Image to Fit the Bucket

Scaling: If the image is over the pixel budget, it is shrunk (scaled down) proportionally to fit within the budget.
Cropping:
- If scaling alone cannot make the image fit the bucket, one dimension (width or height) is cropped evenly. The crop amount is limited in practice by the number of buckets and the training resolution (the resolution of each bucket is derived from the training resolution). If the crop jitter augmentation is enabled, the required cropping is distributed randomly in one or more dimensions.

In summary, the aim is to make the smallest possible adjustments to each image.
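
To make the difference between an even crop and a jittered crop concrete, here is a small sketch for a single axis. It rests on two assumptions that this guide does not spell out: that "evenly" means half of the crop comes off each side, and that crop jitter picks the split uniformly at random. The name `split_crop` is hypothetical and is not a OneTrainer function.

```python
import random


def split_crop(total, jitter=False, rng=None):
    """Split `total` pixels of cropping into (before, after) for one axis.

    Hypothetical helper. Without jitter the crop is centered; with jitter
    the split point is drawn uniformly (an assumption about the augmentation).
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if jitter:
        rng = rng or random.Random()
        before = rng.randint(0, total)  # randint is inclusive on both ends
    else:
        before = total // 2
    return before, total - before


left, right = split_crop(21)
print(left, right)  # centered crop of 21 pixels

left, right = split_crop(21, jitter=True, rng=random.Random(0))
print(left + right)  # the total cropped amount is unchanged by jitter
```

Either way the total number of cropped pixels is the same; only the position of the kept region changes.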

all_possible_input_aspects in (width, height)

(4.0, 1.0),
(3.5, 1.0),
(3.0, 1.0),
(2.5, 1.0),
(2.0, 1.0),
(1.75, 1.0), Closest to Common Widescreen (16:9 ≈ 1.78)
(1.5, 1.0),
(1.25, 1.0),
(1.0, 1.0), Square
(1.0, 1.25),
(1.0, 1.5), Common Portrait (2:3)
(1.0, 1.75),
(1.0, 2.0),
(1.0, 2.5),
(1.0, 3.0),
(1.0, 3.5),
(1.0, 4.0)
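
As a rough sketch of how an aspect pair could be turned into a concrete bucket resolution, the snippet below normalizes each pair so that width × height matches the pixel budget of the training resolution. The function name `bucket_resolution` and the `quantization` parameter (rounding each side to a multiple of 64) are assumptions made for illustration; this guide does not state how the program rounds bucket sizes.

```python
import math


def bucket_resolution(aspect_w, aspect_h, training_resolution=1024, quantization=64):
    """Turn an aspect pair into a (width, height) near the pixel budget.

    Hypothetical helper. The rounding step and its default of 64 are
    assumptions, not documented behaviour.
    """
    # Scale the pair so that width * height == training_resolution ** 2.
    norm = math.sqrt(aspect_w * aspect_h)
    width = aspect_w / norm * training_resolution
    height = aspect_h / norm * training_resolution
    # Round each side to the nearest multiple of `quantization`.
    q = quantization
    return round(width / q) * q, round(height / q) * q


for aspect in [(1.75, 1.0), (1.0, 1.0), (1.0, 1.5)]:
    print(aspect, bucket_resolution(*aspect))
```

Under these assumptions the 1.75:1 bucket comes out at 1344 × 768, which agrees with the cropped size in the 1920 x 1080 example in this guide (1365 − 21 = 1344).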

Example ‘Dataset’ of 4 Images

16:9
- 1280 x 720
- 1366 x 768
- 1600 x 900
- 1920 x 1080

Let's use the 1920 x 1080 image as an example.

16:9 - 1920 x 1080 (2.073M pixels) Example

Determine the AR: divide the image width by its height, 1920 / 1080 ≈ 1.78.
Looking at the available buckets, our closest match is 1.75:1; however, it's not an exact match.
The image is also over our pixel budget, so we proportionally scale it down to 1365 × 768 (1.048M pixels).
Now, to make it fit the 1.75:1 bucket, we must reduce its width to 768 × 1.75 = 1344, so we crop 21 pixels from the width.
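
The arithmetic in this example can be checked with a short script. It is only a sketch of the steps as described here: `scale_then_crop` is a hypothetical name, sizes are rounded to the nearest pixel (an assumption), and images already under the budget are left at their original size because this guide only describes scaling down.

```python
import math


def scale_then_crop(width, height, bucket_ar, budget=1024 * 1024):
    """Return (scaled_size, final_size, (crop_w, crop_h)) for one image.

    Hypothetical helper that mirrors the worked example, not OneTrainer code.
    """
    # Step 1: proportional downscale so that width * height fits the budget.
    scale = min(1.0, math.sqrt(budget / (width * height)))
    scaled_w = round(width * scale)
    scaled_h = round(height * scale)

    # Step 2: crop whichever axis is too long for the bucket's aspect ratio.
    target_w = round(scaled_h * bucket_ar)
    if target_w <= scaled_w:
        return (scaled_w, scaled_h), (target_w, scaled_h), (scaled_w - target_w, 0)
    target_h = round(scaled_w / bucket_ar)
    return (scaled_w, scaled_h), (scaled_w, target_h), (0, scaled_h - target_h)


scaled, final, crop = scale_then_crop(1920, 1080, bucket_ar=1.75)
print(scaled)  # size after scaling to the pixel budget
print(final)   # size after cropping to the 1.75:1 bucket
print(crop)    # pixels removed from (width, height)
```

For the 1920 x 1080 image this gives a scaled size of 1365 × 768, a final size of 1344 × 768, and a 21 pixel crop on the width, matching the numbers above.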

Conclusion

An aspect ratio bucket is an aspect ratio adjusted to the pixel budget.
Images are scaled / cropped to match the closest possible bucket.
During training, a batch can only be filled with images from the same bucket. This explains why images may be dropped when you use a batch size greater than 1 and your images fall into different buckets.
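
The effect of batch size on image drops can be illustrated with a small grouping sketch. It assumes that only full batches are used and that leftover images in each bucket are dropped; how OneTrainer actually handles leftovers is not specified here, and `make_batches` is a hypothetical name.

```python
from collections import defaultdict


def make_batches(image_buckets, batch_size):
    """Group (image_name, bucket) pairs into same-bucket batches.

    Hypothetical helper. Assumption: incomplete batches are dropped.
    Returns (batches, dropped_image_names).
    """
    by_bucket = defaultdict(list)
    for name, bucket in image_buckets:
        by_bucket[bucket].append(name)

    batches, dropped = [], []
    for bucket, names in by_bucket.items():
        full = len(names) - len(names) % batch_size
        for start in range(0, full, batch_size):
            batches.append((bucket, names[start:start + batch_size]))
        dropped.extend(names[full:])  # leftovers that cannot fill a batch
    return batches, dropped


images = [(f"wide_{i}.png", (1.75, 1.0)) for i in range(5)]
images.append(("square_0.png", (1.0, 1.0)))

batches, dropped = make_batches(images, batch_size=4)
print(len(batches), dropped)
```

With a batch size of 4, the five widescreen images fill one batch and leave one image over, and the single square image never fills a batch, so two images go unused in this sketch. With a batch size of 1 nothing is dropped.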
