Description
Describe the bug
env: MODEL_NAME=runwayml/stable-diffusion-v1-5
env: INSTANCE_DIR=/content/drive/MyDrive/Newfolder
env: HF_ENDPOINT=https://hf-mirror.com/
2024-08-18 08:46:08.308678: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-18 08:46:08.328601: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-18 08:46:08.334721: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-18 08:46:09.559880: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
08/18/2024 08:46:10 - INFO - main - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: no
{'dynamic_thresholding_ratio', 'variance_type', 'clip_sample_range', 'sample_max_value', 'thresholding', 'timestep_spacing', 'prediction_type', 'rescale_betas_zero_snr'} was not found in config. Values will be initialized to default values.
{'scaling_factor', 'latents_mean', 'use_quant_conv', 'latents_std', 'shift_factor', 'force_upcast', 'mid_block_add_attention', 'use_post_quant_conv'} was not found in config. Values will be initialized to default values.
{'transformer_layers_per_block', 'mid_block_type', 'addition_time_embed_dim', 'encoder_hid_dim_type', 'time_cond_proj_dim', 'dual_cross_attention', 'projection_class_embeddings_input_dim', 'num_attention_heads', 'reverse_transformer_layers_per_block', 'time_embedding_act_fn', 'mid_block_only_cross_attention', 'addition_embed_type', 'use_linear_projection', 'num_class_embeds', 'encoder_hid_dim', 'only_cross_attention', 'resnet_time_scale_shift', 'time_embedding_dim', 'cross_attention_norm', 'time_embedding_type', 'addition_embed_type_num_heads', 'conv_out_kernel', 'attention_type', 'dropout', 'class_embeddings_concat', 'timestep_post_act', 'class_embed_type', 'conv_in_kernel', 'upcast_attention', 'resnet_skip_time_act', 'resnet_out_scale_factor'} was not found in config. Values will be initialized to default values.
Resolving data files: 100% 18/18 [00:00<00:00, 150094.38it/s]
Generating train split: 9 examples [00:00, 377.47 examples/s]
Traceback (most recent call last):
File "/content/train_text_to_image_lora.py", line 979, in <module>
main()
File "/content/train_text_to_image_lora.py", line 625, in main
raise ValueError(
ValueError: --image_column' value 'image' needs to be one of: text
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 48, in main
args.func(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1097, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 703, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_text_to_image_lora.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--dataset_name=/content/drive/MyDrive/Newfolder', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--checkpointing_steps=100', '--learning_rate=1e-4', '--report_to=wandb', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=500', '--validation_prompt=forward trajectory', '--validation_epochs=50', '--seed=0', '--push_to_hub']' returned non-zero exit status 1.
I also hit a dependency conflict during installation and suspect this error is related to it:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires pyarrow<15.0.0a0,>=14.0.1, but you have pyarrow 17.0.0 which is incompatible.
ibis-framework 8.0.0 requires pyarrow<16,>=2, but you have pyarrow 17.0.0 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 2.21.0 requires pyarrow>=15.0.0, but you have pyarrow 14.0.1 which is incompatible.
pyarrow has conflicting version requirements across cudf-cu12 24.4.1, ibis-framework 8.0.0, and datasets 2.21.0.
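To make the conflict concrete, here is a minimal sketch that checks a few candidate pyarrow versions against the three constraints quoted in the pip output above. The helper names (`parse`, `unsatisfied`) are hypothetical, the candidate list is only illustrative, and `<15.0.0a0` is approximated as `<15.0.0` because the sketch does not handle pre-release tags.

```python
def parse(version):
    """Turn '14.0.1' into (14, 0, 1). Pre-release tags are not handled."""
    return tuple(int(part) for part in version.split("."))


# Constraints copied from the pip messages above.
# Assumption: '<15.0.0a0' is approximated as '<15.0.0'.
CONSTRAINTS = {
    "cudf-cu12 24.4.1": lambda v: parse("14.0.1") <= v < parse("15.0.0"),
    "ibis-framework 8.0.0": lambda v: parse("2") <= v < parse("16"),
    "datasets 2.21.0": lambda v: v >= parse("15.0.0"),
}


def unsatisfied(version):
    """Return the packages whose pyarrow constraint this version violates."""
    v = parse(version)
    return [name for name, ok in CONSTRAINTS.items() if not ok(v)]


for candidate in ("14.0.1", "15.0.0", "17.0.0"):
    print(candidate, "violates:", unsatisfied(candidate))
```

Under these constraints, cudf-cu12 needs pyarrow below 15 while datasets needs 15 or above, so no single version can satisfy both. Whether this conflict has anything to do with the `--image_column` error is not established by the logs.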
Reproduction
!pip install git+https://github.com/huggingface/diffusers
#!pip install accelerate
!pip install -r https://raw.githubusercontent.com/huggingface/diffusers/main/examples/text_to_image/requirements.txt
!pip install pyarrow==14.0.1
!accelerate config default
%env MODEL_NAME=runwayml/stable-diffusion-v1-5
%env INSTANCE_DIR=/content/drive/MyDrive/Newfolder
%env HF_ENDPOINT=https://hf-mirror.com
!accelerate launch train_text_to_image_lora.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--dataset_name=$INSTANCE_DIR \
--resolution=512 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 \
--checkpointing_steps=100 \
--learning_rate=1e-4 \
--report_to="wandb" \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--max_train_steps=500 \
--validation_prompt="forward trajectory" \
--validation_epochs=50 \
--seed="0" \
--push_to_hub
Logs
No response
System Info
- 🤗 Diffusers version: 0.31.0.dev0
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Running on Google Colab?: Yes
- Python version: 3.10.12
- PyTorch version (GPU?): 2.3.1+cu121 (True)
- Flax version (CPU?/GPU?/TPU?): 0.8.4 (gpu)
- Jax version: 0.4.26
- JaxLib version: 0.4.26
- Huggingface_hub version: 0.23.5
- Transformers version: 4.42.4
- Accelerate version: 0.32.1
- PEFT version: 0.7.0
- Bitsandbytes version: not installed
- Safetensors version: 0.4.4
- xFormers version: not installed
- Accelerator: Tesla T4, 15360 MiB
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
To add: my dataset folder contains image.png and image.txt.
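One possible reading of the log, offered as an assumption rather than a confirmed diagnosis: 18 files were resolved but only 9 examples with a single `text` column were generated, which suggests the folder was loaded as plain text instead of as images. A common way to pair images with captions is the `imagefolder` layout, where a `metadata.jsonl` file maps each `file_name` to its caption. The sketch below builds that file from same-named `.png`/`.txt` pairs. The function name `write_metadata` and the list of image suffixes are hypothetical, and the sketch has not been run against this dataset.

```python
import json
from pathlib import Path

# Assumption: these are the image extensions present in the folder.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_metadata(folder):
    """Write metadata.jsonl pairing each image with its same-named .txt caption."""
    folder = Path(folder)
    rows = []
    for image_path in sorted(folder.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.exists():
            continue  # skip images that have no caption file
        caption = caption_path.read_text(encoding="utf-8").strip()
        # "file_name" is the key the imagefolder loader looks for;
        # "text" matches the training script's default caption column.
        rows.append({"file_name": image_path.name, "text": caption})

    with (folder / "metadata.jsonl").open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
    return rows


# Example call (path taken from the report above):
# write_metadata("/content/drive/MyDrive/Newfolder")
```

With the metadata file in place, the launch command would pass `--train_data_dir=$INSTANCE_DIR` instead of `--dataset_name=$INSTANCE_DIR`, so the script loads the folder as an image dataset with `image` and `text` columns.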
