BETA-Tuned Timestep Distribution #1225

Koratahiu · 2025-12-26T06:05:27Z

This PR implements the timestep distribution proposed in the paper:
Beta-Tuned Timestep Diffusion Model

This method aims to align timestep sampling with the diffusion model's forward diffusion process, resulting in faster convergence and improved training performance. The paper observes that the data distribution changes most significantly during the initial timesteps, which makes standard uniform sampling sub-optimal.

Usage

Select BETA timestep distribution.
Set Noising bias to 1 (corresponds to Beta in the paper; recommended: 1).
Set Noising weight to < 1 (corresponds to Alpha in the paper; recommended: 0.8).

Note: This is compatible with existing loss weighting strategies (e.g., Min-SNR, Debiased, etc.).
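
As a rough illustration of the settings above, the sketch below draws training timesteps from a Beta distribution using only the Python standard library. It is not the code in this PR: the function name `sample_beta_timesteps`, the `num_train_timesteps` default of 1000, and the mapping from the [0, 1] sample to an integer timestep index are assumptions made for the example.

```python
import random


def sample_beta_timesteps(batch_size, num_train_timesteps=1000,
                          alpha=0.8, beta=1.0, rng=None):
    """Draw integer timesteps in [0, num_train_timesteps - 1].

    Hypothetical helper: alpha plays the role of "Noising weight" and
    beta the role of "Noising bias" from the usage notes above.
    """
    rng = rng or random.Random()
    timesteps = []
    for _ in range(batch_size):
        # u lies in [0, 1]; alpha < 1 with beta == 1 puts more mass near 0,
        # i.e. on the early (low-noise) timesteps.
        u = rng.betavariate(alpha, beta)
        # Scale to an index and clamp so that u == 1.0 stays in range.
        t = min(int(u * num_train_timesteps), num_train_timesteps - 1)
        timesteps.append(t)
    return timesteps


if __name__ == "__main__":
    print(sample_beta_timesteps(8))
```

With alpha = 1 and beta = 1 the Beta distribution reduces to the uniform distribution, so the same helper covers the uniform baseline.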

dxqb · 2025-12-27T09:59:07Z

Does this apply to all models? Only diffusion models are Beta-sampled during inference. Flow matching models are sampled with linear sigmas and often with timestep-shifting ("Flux-shift").
This would mean that using a beta timestep distribution during training is equivalent to using (dynamic) timestep shifting during training for flow matching models, which we already have.

Is that correct? Did #1124 also apply only to diffusion, not to flow matching?
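
For reference, the timestep shift mentioned above is usually written as a simple map on a normalized timestep in [0, 1]. The sketch below shows that commonly used static form; the function name and the default `shift` value are assumptions for illustration, and it is not taken from this repository's implementation.

```python
def shift_timestep(t, shift=3.0):
    """Map a normalized timestep t in [0, 1] with a static shift.

    shift == 1 leaves t unchanged; shift > 1 pushes t toward 1 (more noise).
    """
    return shift * t / (1.0 + (shift - 1.0) * t)


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, shift_timestep(t))
```

Applying this map to uniformly drawn values of t skews the resulting training distribution toward one end, which is the effect being compared with Beta sampling here.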

Koratahiu · 2025-12-27T17:59:48Z

> Does this apply to all models? Only diffusion models are Beta-sampled during inference.

It’s a tunable distribution, but it’s specifically intended for diffusion models (SD, SDXL, etc.).
For flow-matching, we need to identify where the data distribution changes most significantly.

> Flow matching models are sampled with linear sigmas and often with timestep-shifting ("Flux-shift"). This would mean that using a beta timestep distribution during training is equivalent to using (dynamic) timestep shifting during training for flow matching models, which we already have.

Here are some examples:

(0.8, 1): the paper's J-shaped distribution.

(2, 2): very similar to the Chroma timestep distribution.

(1, 1.2): the reverse.
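
To get a feel for these shapes without a plot, the density can be evaluated at a few points. This is a minimal sketch that assumes each pair is ordered (alpha, beta), matching the recommended 0.8 and 1 from the usage notes; `beta_pdf` is a hypothetical helper, not something from the PR.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


if __name__ == "__main__":
    # Evaluate near the start, middle and end of the normalized timestep range.
    for a, b in [(0.8, 1.0), (2.0, 2.0), (1.0, 1.2)]:
        row = [round(beta_pdf(x, a, b), 3) for x in (0.05, 0.5, 0.95)]
        print((a, b), row)
```

A pair with both parameters equal, such as (2, 2), gives a density that is symmetric around 0.5, while unequal parameters tilt the mass toward one end.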

> Is that correct? Did #1124 also apply only to diffusion, not to flow matching?

The issue is that #1124 lacks a theoretical basis (it’s more of a heuristic method) but it functions similarly. Also, while it supports flow matching by accepting sigmas, requiring both betas and sigmas added too much code.

O-J1 · 2026-01-02T12:43:28Z

Do we have any results of our own showing this actually works on SD1.5 and SDXL, and not just on the paper's specific datasets? The paper only covers training at 32x32, 128x128 and 256x256, which are not resolutions either model can do.

Koratahiu · 2026-01-02T14:50:58Z

> Do we have any results of our own showing this actually works on SD1.5 and SDXL? The paper only covers training at 32x32, 128x128 and 256x256, which are not resolutions either model can do.

It is a known observation in diffusion papers that the later timesteps are relatively easy for the model (since most of the image is still noise), while the initial timesteps have near-infinite possibilities and are relatively hard (e.g., the issue mentioned in #1230).
I implemented the method from this paper as it was straightforward to do; it should provide similar benefits to those seen in #1124.

O-J1 · 2026-01-02T15:06:36Z

So we haven't tried it for any training at all?

Koratahiu · 2026-01-02T15:55:21Z

You mean testing? Yes, I tested it in my recent runs (SDXL - 1024) and they went very well.
I haven't done any direct comparisons yet, though I did run some tests using #1124, which was more stable and faster to train (in terms of validation loss) compared to uniform sampling.

Koratahiu added 2 commits December 26, 2025 07:58

initial

3016a4a

use torch.rand for alpha==1

e910566

Koratahiu mentioned this pull request Dec 26, 2025

SpeeD Timestep Sampling #1124

Closed

2 tasks

Koratahiu mentioned this pull request Dec 27, 2025

E-TSDM: Early Timestep-shared Diffusion Model #1230

Draft

3 tasks
