
Conversation

@linoytsaban
Collaborator

@linoytsaban linoytsaban commented Mar 18, 2025

fix remaining pending issues from #10313, #9476 in Flux LoRA training scripts

  • verify the optimizer is updating properly (transformer only ☑️, text encoder w/ clip ☑️, pivotal w/ clip, pivotal w/ clip & t5, ti) wip
  • accelerate error when running on multiple gpus
  • replace scheduler
  • log_validation with mixed precision
  • save intermediate embeddings when checkpointing is enabled (see the hedged sketch after this list)
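
A hedged sketch of what the last item amounts to (the helper name and call site are assumptions, not the script's exact code); the "clip_l" key mirrors the one used when loading the embeddings at inference:

from safetensors.torch import save_file

def save_intermediate_embeddings(text_encoder, new_token_ids, path):
    # new_token_ids: ids of the inserted tokens (e.g. <s0>, <s1>) in the CLIP tokenizer
    rows = text_encoder.get_input_embeddings().weight.data[new_token_ids]
    save_file({"clip_l": rows.detach().cpu().contiguous()}, path)

# e.g. from the checkpointing branch (paths/names are placeholders):
# save_intermediate_embeddings(unwrap_model(text_encoder_one), token_ids,
#                              f"{args.output_dir}/checkpoint-{global_step}/learned_embeds.safetensors")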

code snippets and output examples:

  • for running log_validation with mixed precision:
import os
os.environ['MODEL_NAME'] = "black-forest-labs/FLUX.1-dev"
os.environ['DATASET_NAME'] = "dog"
os.environ['OUTPUT_DIR'] = "flux-test-1"

!accelerate launch train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$DATASET_NAME \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="bf16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=1 \
  --optimizer="prodigy" \
  --learning_rate=1. \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --checkpointing_steps=250 \
  --validation_prompt="a photo of sks dog in a bucket"\
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub

validation output at step 380: [screenshot]

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@linoytsaban linoytsaban changed the title [Flux LoRA] fix issues in advanced script [Flux LoRA] fix issues in flux lora scripts Mar 18, 2025
@linoytsaban linoytsaban requested a review from sayakpaul March 18, 2025 21:40
@linoytsaban linoytsaban added bug Something isn't working training labels Mar 18, 2025
@luchaoqi
Contributor

luchaoqi commented Mar 19, 2025

Hi @linoytsaban, thanks for this prompt fix!

I believe the accelerator produces an error at the line here when running textual inversion specifically, following the blog here:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/playpen-nas-ssd/luchao/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced_linoy.py", line 2576, in <module>
[rank0]:     main(args)
[rank0]:   File "/playpen-nas-ssd/luchao/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced_linoy.py", line 2273, in main
[rank0]:     prompt_embeds, pooled_prompt_embeds, text_ids = encode_prompt(
[rank0]:                                                     ^^^^^^^^^^^^^^
[rank0]:   File "/playpen-nas-ssd/luchao/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced_linoy.py", line 1446, in encode_prompt
[rank0]:     dtype = text_encoders[0].dtype
[rank0]:             ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/playpen-nas-ssd/luchao/software/miniconda3/envs/diffuser/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1928, in __getattr__
[rank0]:     raise AttributeError(
[rank0]: AttributeError: 'DistributedDataParallel' object has no attribute 'dtype'. Did you mean: 'type'?
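
The immediate cause seems to be that accelerator.prepare wraps the text encoders in DistributedDataParallel on multi-GPU runs, which hides attributes that only exist on the underlying transformers model, such as the dtype property. A minimal sketch of the usual workaround (not necessarily the fix this PR lands on):

def unwrapped(model):
    # DistributedDataParallel keeps the real model under .module
    if hasattr(model, "module"):
        model = model.module
    # torch.compile keeps it under ._orig_mod
    if hasattr(model, "_orig_mod"):
        model = model._orig_mod
    return model

# inside encode_prompt:
# dtype = unwrapped(text_encoders[0]).dtype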

Also, is it possible to verify on your end that textual inversion works in the sks dog case as well? E.g. pure CLIP textual inversion as mentioned here:

  --train_text_encoder_ti \
  --train_text_encoder_ti_frac=1 \
  --train_transformer_frac=0

@linoytsaban
Collaborator Author

Hey @luchaoqi! Yes, I'm currently testing multiple configurations and will definitely test pivotal tuning with CLIP and pure textual inversion with CLIP.

Re: the accelerator error when running with multiple processes - adding it to the todo list :)

Member

@sayakpaul sayakpaul left a comment

Initial comments.

…n, fix accelerator.accumulate call in advanced script
@linoytsaban
Collaborator Author

@sayakpaul I noticed that in the scripts we sometimes call accelerator.unwrap_model directly, and sometimes use the unwrap_model helper:

    def unwrap_model(model):
        # strip accelerate's wrappers (e.g. DistributedDataParallel)
        model = accelerator.unwrap_model(model)
        # strip the torch.compile wrapper, which keeps the real module in _orig_mod
        model = model._orig_mod if is_compiled_module(model) else model
        return model

do you recall why it's not consistently one way or the other?

@sayakpaul
Member

Using the unwrap_model() function works. The ones that don't should be updated to do something similar. We added unwrap_model() for more consistency in cases involving torch.compile().
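
A self-contained toy illustration of the torch.compile part (not the training script itself): the compiled wrapper keeps the original module under _orig_mod, which is exactly what the helper reaches through.

import torch
from torch import nn

model = nn.Linear(4, 4)
compiled = torch.compile(model)

print(type(compiled).__name__)      # OptimizedModule
print(compiled._orig_mod is model)  # True: the helper returns this original module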

@linoytsaban
Collaborator Author

Hey @luchaoqi! Could you please check whether the accelerator now works fine with distributed training? I think it should be resolved now.

@luchaoqi
Contributor

Hi @linoytsaban, yes distributed training works as expected.

Pure textual inversion surfaces a new problem:

03/19/2025 10:20:03 - INFO - __main__ - Running validation...
 Generating 4 images with prompt: a photo of <s0><s1> person at 50 years old.
Traceback (most recent call last):
  File "/playpen-nas-ssd/luchao/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced_linoy.py", line 2408, in <module>
    main(args)
  File "/playpen-nas-ssd/luchao/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced_linoy.py", line 2055, in main
    text_encoder_one.train()
    ^^^^^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'text_encoder_one' where it is not associated with a value
[rank0]: Traceback (most recent call last):
[rank0]:   File "/playpen-nas-ssd/luchao/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced_linoy.py", line 2408, in <module>
[rank0]:     main(args)
[rank0]:   File "/playpen-nas-ssd/luchao/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced_linoy.py", line 2055, in main
[rank0]:     text_encoder_one.train()
[rank0]:     ^^^^^^^^^^^^^^^^
[rank0]: UnboundLocalError: cannot access local variable 'text_encoder_one' where it is not associated with a value
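
For context, a hypothetical minimal repro of the pattern behind this traceback (not the script's actual code): the variable is only bound on a branch that the pure textual inversion configuration never takes.

def main(train_text_encoder=False, train_text_encoder_ti=True):
    if train_text_encoder:
        text_encoder_one = object()  # stands in for the prepared CLIP encoder
    if train_text_encoder or train_text_encoder_ti:
        text_encoder_one.train()     # UnboundLocalError when only TI is enabled

main()  # raises the same UnboundLocalError as above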

@linoytsaban
Collaborator Author

@luchaoqi if you want to give it a try, the current version should be fixed.

@luchaoqi
Contributor

luchaoqi commented Apr 4, 2025

@linoytsaban thanks! I'll definitely try it out ASAP once I get some time.
Feel free to merge it if the other reviewers agree, cheers!

@linoytsaban linoytsaban requested a review from sayakpaul April 8, 2025 08:31
Member

@sayakpaul sayakpaul left a comment

Thanks! Left some comments. Let me know if they make sense.

  # run inference
  generator = torch.Generator(device=accelerator.device).manual_seed(args.seed) if args.seed is not None else None
- autocast_ctx = nullcontext()
+ autocast_ctx = torch.autocast(accelerator.device.type)
Member

I think this is only needed for the intermediate validation. Do we need to check for that?

Collaborator Author

@linoytsaban linoytsaban Apr 8, 2025

Yeah, I think you're right. I tested it now and it seems to work as expected; changed it.
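
For readers following along, a hedged sketch of the gating discussed here (is_final_validation is an assumed flag name for telling intermediate validation apart from the final one, not necessarily what the script uses):

from contextlib import nullcontext
import torch

def make_autocast_ctx(device_type: str, is_final_validation: bool):
    # intermediate validation runs under mixed precision and wants autocast;
    # the final validation on the reloaded pipeline can use a no-op context
    return torch.autocast(device_type) if not is_final_validation else nullcontext()

# usage in log_validation: with make_autocast_ctx(accelerator.device.type, is_final_validation): ...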

@linoytsaban linoytsaban requested a review from sayakpaul April 8, 2025 12:26
Member

@sayakpaul sayakpaul left a comment

Thanks a ton for handling this!

@sayakpaul
Member

@bot /style

@github-actions
Contributor

github-actions bot commented Apr 8, 2025

Style fixes have been applied. View the workflow run here.

@linoytsaban
Collaborator Author

Failing test is unrelated

@linoytsaban linoytsaban merged commit 71f34fc into huggingface:main Apr 8, 2025
8 of 9 checks passed
@linoytsaban linoytsaban deleted the flux_lora_advanced branch April 9, 2025 06:41
@luchaoqi
Contributor

@linoytsaban thanks for the fix earlier!
I got the chance to test the new script, but I'm wondering how to get results similar to the blog here, especially Fig. 1 without pivotal tuning. I tested both pure textual inversion (CLIP only) and CLIP + pivotal tuning with the commands below:

CLIP only:

accelerate launch --config_file "xxx.yaml" train_dreambooth_lora_flux_advanced.py \
  --pretrained_model_name_or_path="black-forest-labs/FLUX.1-dev" \
  --instance_data_dir="path to dataset containing images" \
  --output_dir="xxx" \
  --instance_prompt="a photo of TOK person" \
  --mixed_precision="bf16" \
  --resolution=512 \
  --train_batch_size=1 \
  --repeats=1 \
  --report_to="wandb"\
  --gradient_accumulation_steps=1 \
  --guidance_scale=1 \
  --learning_rate=1.0 \
  --text_encoder_lr=1.0 \
  --optimizer="prodigy" \
  --train_text_encoder_ti \
  --train_text_encoder_ti_frac=1 \
  --train_transformer_frac=0 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1000 \
  --checkpointing_steps=200 \
  --seed="0" \
  --validation_epochs=100 \
  --validation_prompt="a photo of TOK person at 50 years old"

CLIP + pivotal tuning:

accelerate launch --config_file "xxx.yaml" train_dreambooth_lora_flux_advanced.py \
  --pretrained_model_name_or_path="black-forest-labs/FLUX.1-dev" \
  --instance_data_dir="path to dataset containing images" \
  --output_dir="xxx" \
  --instance_prompt="a photo of TOK person" \
  --mixed_precision="bf16" \
  --resolution=512 \
  --train_batch_size=1 \
  --repeats=1 \
  --report_to="wandb"\
  --lora_layers="attn.to_k,attn.to_q,attn.to_v,attn.to_out.0" \
  --rank=16 \
  --gradient_accumulation_steps=1 \
  --guidance_scale=1 \
  --learning_rate=1.0 \
  --text_encoder_lr=1.0 \
  --optimizer="prodigy" \
  --train_text_encoder_ti \
  --train_text_encoder_ti_frac=0.25 \
  --train_transformer_frac=1 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1000 \
  --checkpointing_steps=200 \
  --seed="0" \
  --validation_epochs=100 \
  --validation_prompt="a photo of TOK person at 50 years old"

and for inference I ran the following, based on the code here:

import torch
from diffusers import AutoPipelineForText2Image
from safetensors.torch import load_file

pipe = AutoPipelineForText2Image.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to('cuda')

text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
tokenizers = [pipe.tokenizer, pipe.tokenizer_2]

embedding_path = "xxx_emb.safetensors"

state_dict = load_file(embedding_path)
# load embeddings of text_encoder 1 (CLIP ViT-L/14)
pipe.load_textual_inversion(state_dict["clip_l"], token=["<s0>", "<s1>"], text_encoder=pipe.text_encoder, tokenizer=pipe.tokenizer)
# load embeddings of text_encoder 2 (T5 XXL) - ignore this line if you didn't enable `--enable_t5_ti`
pipe.load_textual_inversion(state_dict["t5"], token=["<s0>", "<s1>"], text_encoder=pipe.text_encoder_2, tokenizer=pipe.tokenizer_2)


instance_token = "<s0><s1>"
prompt = f"a photo of {instance_token} person at 50 years old"

image = pipe(prompt=prompt, num_inference_steps=25).images[0]

It seems the model is not learning the concept from the training images at all and produces non-personalized results. Is there anything I might be missing here?

@luchaoqi
Contributor

luchaoqi commented May 1, 2025

Update: I found that the token_embedding barely changes during optimization when following the recommended hyperparameters here and here. I changed the optimizer to AdamW and the token_embedding starts to change, but it still fails to learn the concept (when doing pure textual inversion with CLIP only).
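
In case it helps others reproduce this check, a hedged sketch of one way to compare the saved embeddings against the base CLIP embedding table (paths are placeholders; the "clip_l" key matches the loading code above):

import torch
from safetensors.torch import load_file
from transformers import CLIPTextModel

state_dict = load_file("xxx_emb.safetensors")   # embeddings saved by the training run
trained = state_dict["clip_l"].float()          # (num_new_tokens, hidden_dim)

text_encoder = CLIPTextModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="text_encoder"
)
base = text_encoder.get_input_embeddings().weight.float()

print("trained token norms:", trained.norm(dim=-1))
print("mean norm of base vocabulary:", base.norm(dim=-1).mean().item())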

@Caselles

What is the current state of this script? I'm looking for Flux LoRA training options and I would love to use diffusers, but it seems there are a lot of disappointing results from what I'm reading (also in #10313). Is the script functional and working well, or does it need more work?

@sayakpaul
Member

@linoytsaban

@linoytsaban
Collaborator Author

linoytsaban commented Jun 16, 2025

Hey @luchaoqi! We added pure textual inversion as an experimental feature. It can potentially work for concepts that are very close to the model distribution, for people keen to experiment with very lightweight embeddings, or for anyone who wants to try textual inversion with Flux. Naturally, since T5 carries most of the weight with Flux, training CLIP embeddings alone (i.e. pure textual inversion with CLIP) will likely be insufficient for most concepts. With pivotal tuning the transformer is trained as well, but in that case it's also essential to modify the inference code to load the LoRA weights into the transformer. From your example I'm not sure whether the inference code you posted was only for the pure textual inversion run or also what you used for the pivotal tuning checkpoint; if the latter, could you try again and load the transformer weights too?
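
For reference, a hedged inference sketch for the pivotal tuning checkpoint (output paths are placeholders; the key difference from your snippet above is the extra load_lora_weights call that loads the transformer LoRA):

import torch
from diffusers import AutoPipelineForText2Image
from safetensors.torch import load_file

pipe = AutoPipelineForText2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# load the transformer LoRA produced by pivotal tuning ("xxx" is your output dir)
pipe.load_lora_weights("xxx")

# load the pivotal embeddings for the new tokens into CLIP
state_dict = load_file("xxx_emb.safetensors")
pipe.load_textual_inversion(
    state_dict["clip_l"], token=["<s0>", "<s1>"],
    text_encoder=pipe.text_encoder, tokenizer=pipe.tokenizer,
)

image = pipe("a photo of <s0><s1> person at 50 years old", num_inference_steps=25).images[0]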

@linoytsaban
Collaborator Author

Hey @Caselles! It should indeed be functional. We've addressed the bugs reported in the issue you mentioned and have successfully trained with this script on various concepts. Please let us know if you encounter any issues :)
