Description
Describe the bug
When I run the script train_dreambooth_lora_flux.py, it raises ValueError: unexpected save model: <class 'deepspeed.runtime.engine.DeepSpeedEngine'>. Is there a bug in save_model_hook?
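A minimal sketch of how this kind of error can arise, assuming the save hook checks the type of each model exactly as it is passed in, while DeepSpeed hands the hook a wrapper object rather than the bare model. The classes and helper names below (Transformer, EngineWrapper, unwrap, strict_hook, tolerant_hook) are hypothetical stand-ins, not the script's or DeepSpeed's actual code; the only property borrowed from DeepSpeed is that the engine exposes the wrapped model as `.module`.

```python
# Stand-in classes: no real DeepSpeed or diffusers imports are used here.
class Transformer:
    """Hypothetical placeholder for the model being fine-tuned."""


class EngineWrapper:
    """Hypothetical placeholder for a wrapper such as DeepSpeedEngine,
    which exposes the wrapped model as `.module`."""

    def __init__(self, module):
        self.module = module


def unwrap(model):
    # Follow `.module` until the innermost model is reached.
    while hasattr(model, "module"):
        model = model.module
    return model


def strict_hook(models):
    # Assumed failing pattern: type check on the object as passed in.
    for model in models:
        if not isinstance(model, Transformer):
            raise ValueError(f"unexpected save model: {model.__class__}")
    return [type(m).__name__ for m in models]


def tolerant_hook(models):
    # Unwrap first, then apply the same type check.
    names = []
    for model in models:
        inner = unwrap(model)
        if not isinstance(inner, Transformer):
            raise ValueError(f"unexpected save model: {inner.__class__}")
        names.append(type(inner).__name__)
    return names


wrapped = [EngineWrapper(Transformer())]
try:
    strict_hook(wrapped)
    strict_failed = False
except ValueError:
    strict_failed = True

tolerant_result = tolerant_hook(wrapped)
```

If the real hook behaves like strict_hook here, unwrapping each model before the isinstance check may be enough to avoid the exception. Whether the LoRA weights are then saved correctly under DeepSpeed is a separate question that this sketch does not address.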
Reproduction
accelerate launch train_dreambooth_lora_flux_custom.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="bf16" \
  --instance_prompt="bedroom, YF_CN style" \
  --resolution=1024 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=4 \
  --optimizer="prodigy" \
  --learning_rate=1. \
  --report_to="tensorboard" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_train_epochs=30 \
  --validation_prompt="bedroom, YF_CN style" \
  --validation_epochs=80 \
  --checkpointing_steps=500 \
  --seed="0" \
  --gradient_checkpointing \
  --use_8bit_adam \
  --rank=4
Logs
No response
System Info
torch==2.3.1
accelerate==0.34.2
deepspeed==0.15.1+8ac42ed7
diffusers==0.31.0.dev0
default_config.yaml is as follows:
compute_environment: LOCAL_MACHINE
debug: true
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
enable_cpu_affinity: false
machine_rank: 0
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: fals