
Fix incorrect batch temporal IDs for cond_model_input in dreambooth flux2 img2img #13846

Open
Jefsky wants to merge 1 commit into huggingface:main from Jefsky:fix/dreambooth-batch-cond-ids

Conversation

Jefsky commented May 30, 2026

Problem

In train_dreambooth_lora_flux2_img2img.py, Flux2Pipeline._prepare_image_ids is designed for multiple reference images within a single sample, assigning different temporal embeddings (T=10, T=20, T=30...) to distinguish them.

However, cond_model_input has shape (B, C, H, W), where each batch element is an independent training sample with a single conditional image. The old code split the batch into individual items and passed them as if they were B reference images belonging to one sample:

cond_model_input_list = [cond_model_input[i].unsqueeze(0) for i in range(cond_model_input.shape[0])]
cond_model_input_ids = Flux2Pipeline._prepare_image_ids(cond_model_input_list)

This produced incorrect cross-sample temporal offsets:

  • sample 0 → T=10, sample 1 → T=20, sample 2 → T=30, ...
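
To make the offsets concrete, here is a minimal, torch-free sketch. `toy_prepare_image_ids` is a hypothetical stand-in that only mimics the behavior described above (the i-th list entry gets T = 10 * (i + 1)). Its name, the `scale` parameter, and the (t, h, w) tuple layout are assumptions for illustration, not the diffusers implementation.

```python
# Hypothetical stand-in for the ID layout described in this PR.
# Not the diffusers implementation: names and tuple layout are assumptions.
def toy_prepare_image_ids(image_shapes, scale=10):
    """Return (t, h, w) IDs; the i-th image in the list gets t = scale + scale * i."""
    ids = []
    for i, (height, width) in enumerate(image_shapes):
        t = scale + scale * i
        ids.extend((t, h, w) for h in range(height) for w in range(width))
    return ids


batch_size, height, width = 3, 2, 2
tokens_per_sample = height * width

# Old behavior: each batch element is passed as if it were another
# reference image of the same sample.
old_ids = toy_prepare_image_ids([(height, width)] * batch_size)

# Collect the distinct temporal values that end up attached to each sample.
old_t_per_sample = [
    sorted({t for t, _, _ in old_ids[i * tokens_per_sample:(i + 1) * tokens_per_sample]})
    for i in range(batch_size)
]
print(old_t_per_sample)  # [[10], [20], [30]]
```

In this toy setup the temporal value depends only on a sample's position in the batch, which is the cross-sample offset the PR describes.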

Fix

Generate temporal IDs for a single sample and expand across the batch dimension:

cond_model_input_ids = Flux2Pipeline._prepare_image_ids([cond_model_input[0:1]])
cond_model_input_ids = cond_model_input_ids.expand(cond_model_input.shape[0], -1, -1)

All batch elements now use the same temporal ID (T=10), which is correct because each sample has only one conditional image.
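
The same idea can be sketched without torch. As before, `toy_prepare_image_ids` is a hypothetical stand-in (its name, `scale`, and the (t, h, w) layout are assumptions), and sharing one list across the batch is only a rough analogy for `expand`, not the script's actual tensor code.

```python
# Hypothetical stand-in for the ID layout described in this PR.
# Not the diffusers implementation: names and tuple layout are assumptions.
def toy_prepare_image_ids(image_shapes, scale=10):
    """Return (t, h, w) IDs; the i-th image in the list gets t = scale + scale * i."""
    ids = []
    for i, (height, width) in enumerate(image_shapes):
        t = scale + scale * i
        ids.extend((t, h, w) for h in range(height) for w in range(width))
    return ids


batch_size, height, width = 3, 2, 2

# Fixed behavior: build IDs for ONE sample with one conditional image...
single_ids = toy_prepare_image_ids([(height, width)])

# ...then reuse the same IDs for every batch element. Each entry refers to
# the same list object, so nothing is copied.
batch_ids = [single_ids for _ in range(batch_size)]

t_per_sample = [sorted({t for t, _, _ in sample}) for sample in batch_ids]
print(t_per_sample)  # [[10], [10], [10]]
```

In PyTorch, `Tensor.expand` likewise returns a view without copying data, and passing `-1` for a dimension leaves that dimension's size unchanged, so the two-line fix should add no meaningful memory cost.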

Related PRs

Fixes #13811

…lux2 img2img

The _prepare_image_ids method assigns different temporal embeddings
(T=10, T=20, T=30...) to distinguish multiple reference images within
a single sample. However, in the training script, cond_model_input has
shape (B, C, H, W) where each batch element is an independent training
sample with only one conditional image.

The previous implementation split the batch into individual samples,
producing incorrect cross-sample temporal offsets (sample 0 -> T=10,
sample 1 -> T=20, etc.).

Fix: generate temporal IDs for one sample and expand across the batch
dimension, so all samples use the same temporal ID (T=10).

Development

Successfully merging this pull request may close these issues.

Fix incorrect batch handling in _prepare_image_ids usage in train_dreambooth_lora_flux2_img2img.py
