Error(s) in loading state_dict for UNet2DConditionModel #552
CliffordJordan asked this question in Q&A (unanswered).
I keep getting this error with Realistic Vision 2.0 and Paragon 1.0. However, when I use Realistic Vision 1.3, there is no such error and everything works well.
Here is the error (a diagnostic sketch follows the full log)...
prepare tokenizer
Use DreamBooth method.
prepare images.
found directory C:\ai\kohya_ss\Lora Training Data\deisegataD\image\200_deisegatad contains 29 image files
5800 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 4
resolution: (768, 768)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "C:\ai\kohya_ss\Lora Training Data\deisegataD\image\200_deisegatad"
image_count: 29
num_repeats: 200
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1
token_warmup_step: 0
is_reg: False
class_tokens: deisegatad
caption_extension: .caption
[Dataset 0]
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████| 29/29 [00:00<00:00, 4141.75it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (512, 512), count: 5800
mean ar error (without repeats): 0.0
prepare accelerator
C:\ai\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py:249: FutureWarning: `logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
warnings.warn(
Using accelerator 0.15.0 or above.
load args= Namespace(v2=True, v_parameterization=False, pretrained_model_name_or_path='C:/ai/stable-diffusion-webui/models/Stable-diffusion/Paragon_1.0_Beta.safetensors', tokenizer_cache_dir=None, train_data_dir='C:/ai/kohya_ss/Lora Training Data/deisegataD/image', shuffle_caption=False, caption_extension='.caption', caption_extention=None, keep_tokens=0, color_aug=False, flip_aug=False, face_crop_aug_range=None, random_crop=False, debug_dataset=False, resolution=(768, 768), cache_latents=True, vae_batch_size=1, cache_latents_to_disk=False, enable_bucket=True, min_bucket_reso=256, max_bucket_reso=1024, bucket_reso_steps=64, bucket_no_upscale=True, token_warmup_min=1, token_warmup_step=0, caption_dropout_rate=0.0, caption_dropout_every_n_epochs=0, caption_tag_dropout_rate=0.0, reg_data_dir=None, in_json=None, dataset_repeats=1, output_dir='C:/ai/kohya_ss/Lora Training Data/deisegataD/model', output_name='deisegata_d2', huggingface_repo_id=None, huggingface_repo_type=None, huggingface_path_in_repo=None, huggingface_token=None, huggingface_repo_visibility=None, save_state_to_huggingface=False, resume_from_huggingface=False, async_upload=False, save_precision='fp16', save_every_n_epochs=1, save_every_n_steps=None, save_n_epoch_ratio=None, save_last_n_epochs=None, save_last_n_epochs_state=None, save_last_n_steps=None, save_last_n_steps_state=None, save_state=False, resume=None, train_batch_size=4, max_token_length=None, mem_eff_attn=True, xformers=True, vae=None, max_train_steps=1450, max_train_epochs=None, max_data_loader_n_workers=0, persistent_data_loader_workers=False, seed=828459009, gradient_checkpointing=True, gradient_accumulation_steps=1, mixed_precision='fp16', full_fp16=False, clip_skip=None, logging_dir='C:/ai/kohya_ss/Lora Training Data/deisegataD/log', log_with=None, log_prefix=None, log_tracker_name=None, wandb_api_key=None, noise_offset=None, multires_noise_iterations=None, multires_noise_discount=0.3, adaptive_noise_scale=None, lowram=False, sample_every_n_steps=100, sample_every_n_epochs=None, sample_prompts='C:/ai/kohya_ss/Lora Training Data/deisegataD/model\sample\prompt.txt', sample_sampler='euler_a', config_file=None, output_config=False, prior_loss_weight=1.0, optimizer_type='AdamW8bit', use_8bit_adam=False, use_lion_optimizer=False, learning_rate=0.0001, max_grad_norm=1.0, optimizer_args=None, lr_scheduler_type='', lr_scheduler_args=None, lr_scheduler='cosine', lr_warmup_steps=145, lr_scheduler_num_cycles=1, lr_scheduler_power=1, dataset_config=None, min_snr_gamma=None, weighted_captions=False, no_metadata=False, save_model_as='safetensors', unet_lr=0.0001, text_encoder_lr=5e-05, network_weights=None, network_module='networks.lora', network_dim=8, network_alpha=1.0, network_args=None, network_train_unet_only=False, network_train_text_encoder_only=False, training_comment=None, dim_from_weights=False)
weight_dtype= torch.float16
accelerator= <accelerate.accelerator.Accelerator object at 0x0000021AEC950040>
loading model for process 0/1
load StableDiffusion checkpoint: C:/ai/stable-diffusion-webui/models/Stable-diffusion/Paragon_1.0_Beta.safetensors
Traceback (most recent call last):
File "C:\ai\kohya_ss\train_network.py", line 790, in
train(args)
File "C:\ai\kohya_ss\train_network.py", line 149, in train
text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype, accelerator)
File "C:\ai\kohya_ss\library\train_util.py", line 3023, in load_target_model
text_encoder, vae, unet, load_stable_diffusion_format = _load_target_model(
File "C:\ai\kohya_ss\library\train_util.py", line 2989, in _load_target_model
text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(args.v2, name_or_path, device)
File "C:\ai\kohya_ss\library\model_util.py", line 863, in load_models_from_stable_diffusion_checkpoint
info = unet.load_state_dict(converted_unet_checkpoint)
File "C:\ai\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1604, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel:
size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
size mismatch for up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
Traceback (most recent call last):
File "C:\Users\cliff.RICHARDFEYNMAN\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\cliff.RICHARDFEYNMAN\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "C:\ai\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in
File "C:\ai\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "C:\ai\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 923, in launch_command
simple_launcher(args)
File "C:\ai\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 579, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\ai\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--v2', '--enable_bucket', '--pretrained_model_name_or_path=C:/ai/stable-diffusion-webui/models/Stable-diffusion/Paragon_1.0_Beta.safetensors', '--train_data_dir=C:/ai/kohya_ss/Lora Training Data/deisegataD/image', '--resolution=768,768', '--output_dir=C:/ai/kohya_ss/Lora Training Data/deisegataD/model', '--logging_dir=C:/ai/kohya_ss/Lora Training Data/deisegataD/log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-05', '--unet_lr=0.0001', '--network_dim=8', '--output_name=deisegata_d2', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=145', '--train_batch_size=4', '--max_train_steps=1450', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers', '--bucket_no_upscale', '--sample_sampler=euler_a', '--sample_prompts=C:/ai/kohya_ss/Lora Training Data/deisegataD/model\sample\prompt.txt', '--sample_every_n_steps=100']' returned non-zero exit status 1.
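
The size mismatches are the key clue: every failing tensor is 768 wide in the checkpoint but 1024 wide in the model kohya_ss built. The launch args show `v2=True`, which makes the script construct an SD 2.x UNet (OpenCLIP text encoder, 1024-dim embeddings), while a 768-dim checkpoint carries SD 1.x weights (CLIP ViT-L/14). The "2.0" in Realistic Vision 2.0 and the "1.0" in Paragon are the models' own version numbers, not the Stable Diffusion base version; the shapes in this traceback show both checkpoints are SD 1.x. The likely fix is therefore to train them without `--v2` (in the kohya_ss GUI, uncheck the v2 box), i.e., rerun the same `accelerate launch` command from the log with `--v2` removed, presumably what already happened for Realistic Vision 1.3.

Here is a minimal diagnostic sketch (not from this thread) for checking a checkpoint before training. It assumes the standard LDM key layout used by `.safetensors` checkpoints; the path is the one from the log, and the key name is the conventional first UNet cross-attention projection, so verify it exists in your file:

```python
# Sketch: read one cross-attention weight to tell SD 1.x from SD 2.x.
# Requires the safetensors package (already a kohya_ss dependency).
from safetensors import safe_open

path = "C:/ai/stable-diffusion-webui/models/Stable-diffusion/Paragon_1.0_Beta.safetensors"
# First UNet cross-attention key projection; its second dimension is the
# text-encoder hidden size: 768 for SD 1.x, 1024 for SD 2.x.
# (Some converted checkpoints nest keys differently; adjust if missing.)
key = "model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight"

with safe_open(path, framework="pt", device="cpu") as f:
    context_dim = f.get_tensor(key).shape[1]

if context_dim == 1024:
    print("context dim 1024: SD 2.x base, train with --v2")
else:
    print(f"context dim {context_dim}: SD 1.x base, train without --v2")
```

If this prints "SD 1.x", the `--v2` flag is the mismatch, and removing it should let the state_dict load cleanly.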