No safetensors file is generated even after training #2233

@tanakahumihiko545

Description

I'm hoping someone here can help me with this; thank you in advance.
When I create a LoRA, only a TOML and a JSON file of a few KB are produced, and no safetensors file is ever generated. The train steps bar also stays at 0% and never advances, and I can't work out why. Is something in the output below failing? Also, the training images would only load when SDXL was selected as the base model. The training images were resized once beforehand.
My GPU is an RTX 5050 16GB. Any help would be greatly appreciated.
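Not part of the original report, but a small hypothetical diagnostic (the helper name `collect_env` is mine) that could be run inside the kohya_ss venv to attach the exact torch/xformers versions and GPU compute capability, which seem relevant to the xformers error below:

```python
# Hedged diagnostic sketch: report the versions and GPU that this venv sees.
# Nothing here is guaranteed to be installed; missing modules are reported
# rather than raising, so the script runs in any environment.
def collect_env():
    info = {}
    for mod in ("torch", "xformers"):
        try:
            m = __import__(mod)
            info[mod] = getattr(m, "__version__", "unknown")
        except ImportError:
            info[mod] = "not installed"
    try:
        import torch
        if torch.cuda.is_available():
            # Device name and compute capability, e.g. ('12', '0') class GPUs
            # need a CUDA build that actually targets them.
            info["gpu"] = torch.cuda.get_device_name(0)
            info["compute_capability"] = str(torch.cuda.get_device_capability(0))
    except ImportError:
        pass
    return info

if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Running this with the venv's `python.exe` and pasting the output into the issue would make the environment unambiguous.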

00:36:26-311331 INFO Training has ended.
00:37:38-060680 INFO Start training LoRA Standard ...
00:37:38-060680 INFO Validating lr scheduler arguments...
00:37:38-061680 INFO Validating optimizer arguments...
00:37:38-061680 INFO Validating M:/StabilityMatrix-win-x64/Data/Packages/kohya_ss/outputs existence and writability... SUCCESS
00:37:38-062680 INFO Validating M:/StabilityMatrix-win-x64/Data/Models/StableDiffusion/sd_xl_base_1.0_0.9vae.safetensors existence... SUCCESS
00:37:38-063681 INFO Validating M:/StabilityMatrix-win-x64/Data/Packages/train/nisimura existence... SUCCESS
00:37:38-064680 INFO Folder 10_nisimura: 10 repeats found
00:37:38-065681 INFO Folder 10_nisimura: 78 images found
00:37:38-066680 INFO Folder 10_nisimura: 78 * 10 = 780 steps
00:37:38-066680 INFO Regularization factor: 1
00:37:38-067680 INFO Train batch size: 1
00:37:38-067680 INFO Gradient accumulation steps: 1
00:37:38-068680 INFO Epoch: 2
00:37:38-068680 INFO Max train steps: 1000
00:37:38-069680 INFO stop_text_encoder_training = 0
00:37:38-069680 INFO lr_warmup_steps = 0.1
00:37:38-070680 INFO Effective Learning Rate Configuration (based on GUI settings):
00:37:38-071680 INFO - Main LR (for optimizer & fallback): 1.00e-04
00:37:38-071680 INFO - Text Encoder (Primary/CLIP) Effective LR: 1.00e-04 (Fallback to Main LR)
00:37:38-072679 INFO - Text Encoder (T5XXL, if applicable) Effective LR: 1.00e-04 (Fallback to Main LR)
00:37:38-073680 INFO - U-Net Effective LR: 1.00e-04 (Specific Value)
00:37:38-073680 INFO Note: These LRs reflect the GUI's direct settings. Advanced options in sd-scripts (e.g., block LRs, LoRA+) can further modify rates for specific layers.
00:37:38-074680 INFO Saving training config to M:/StabilityMatrix-win-x64/Data/Packages/kohya_ss/outputs\nishimura_v1_20251106-003738.json...
00:37:38-075680 INFO Executing command: M:\StabilityMatrix-win-x64\Data\Packages\kohya_ss\venv\Scripts\accelerate.EXE launch --dynamo_backend no --dynamo_mode default --mixed_precision bf16 --num_processes 1 --num_machines 1 --num_cpu_threads_per_process 2 M:/StabilityMatrix-win-x64/Data/Packages/kohya_ss/sd-scripts/sdxl_train_network.py --config_file M:/StabilityMatrix-win-x64/Data/Packages/kohya_ss/outputs/config_lora-20251106-003738.toml
2025-11-06 00:37:46 INFO Loading settings from M:/StabilityMatrix-win-x64/Data/Packages/kohya_ss/outputs/config_lora-20251106-003738.toml... train_util.py:4651
M:\StabilityMatrix-win-x64\Data\Packages\kohya_ss\venv\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
warnings.warn(
2025-11-06 00:37:46 INFO Using DreamBooth method. train_network.py:517
INFO prepare images. train_util.py:2072
INFO get image size from name of cache files train_util.py:1965
100%|██████████| 78/78 [00:00<?, ?it/s]
INFO set image size from cache files: 0/78 train_util.py:1995
INFO found directory M:\StabilityMatrix-win-x64\Data\Packages\train\nisimura\10_nisimura contains 78 image files train_util.py:2019
read caption: 100%|██████████| 78/78 [00:00<00:00, 19507.23it/s]
INFO 780 train images with repeats. train_util.py:2116
INFO 0 reg images with repeats. train_util.py:2120
WARNING no regularization images train_util.py:2125

INFO [Dataset 0] config_util.py:580
batch_size: 1
resolution: (512, 512)
resize_interpolation: None
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True

[Subset 0 of Dataset 0]
  image_dir: "M:\StabilityMatrix-win-x64\Data\Packages\train\nisimura\10_nisimura"
  image_count: 78
  num_repeats: 10
  shuffle_caption: False
  keep_tokens: 0
  caption_dropout_rate: 0.0
  caption_dropout_every_n_epochs: 0
  caption_tag_dropout_rate: 0.0
  caption_prefix: None
  caption_suffix: None
  color_aug: False
  flip_aug: False
  face_crop_aug_range: None
  random_crop: False
  token_warmup_min: 1,
  token_warmup_step: 0,
  alpha_mask: False
  resize_interpolation: None
  custom_attributes: {}
  is_reg: False
  class_tokens: nisimura
  caption_extension: .txt

                INFO     [Prepare dataset 0]             config_util.py:592
                INFO     loading image sizes.             train_util.py:987

100%|██████████| 78/78 [00:00<00:00, 25983.30it/s]
INFO make buckets train_util.py:1010
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically train_util.py:1027
INFO number of images (including repeats) train_util.py:1056
INFO bucket 0: resolution (512, 512), count: 780 train_util.py:1061
INFO mean ar error (without repeats): 2.5064541193573428e-05 train_util.py:1069
WARNING clip_skip will be unexpected (clip_skip has no effect in SDXL training) sdxl_train_util.py:349
INFO preparing accelerator train_network.py:580
accelerator device: cuda
INFO loading model for process 0/1 sdxl_train_util.py:32
INFO load StableDiffusion checkpoint: M:/StabilityMatrix-win-x64/Data/Models/StableDiffusion/sd_xl_base_1.0_0.9vae.safetensors sdxl_train_util.py:73
INFO building U-Net sdxl_model_util.py:198
2025-11-06 00:37:47 INFO loading U-Net from checkpoint sdxl_model_util.py:202
2025-11-06 00:37:48 INFO U-Net:
INFO building text encoders sdxl_model_util.py:211
INFO loading text encoders from checkpoint sdxl_model_util.py:264
INFO text encoder 1:
INFO text encoder 2:
INFO building VAE sdxl_model_util.py:285
2025-11-06 00:37:49 INFO loading VAE from checkpoint sdxl_model_util.py:290
INFO VAE:
INFO Enable xformers for U-Net train_util.py:3349
import network module: networks.lora
INFO [Dataset 0] train_util.py:2613
INFO caching latents with caching strategy. train_util.py:1115
INFO caching latents... train_util.py:1164
100%|██████████| 78/78 [00:22<00:00, 3.46it/s]
2025-11-06 00:38:12 INFO create LoRA network. base dim (rank): 64, alpha: 32 lora.py:935
INFO neuron dropout: p=None, rank dropout: p=None, module dropout: p=None lora.py:936
INFO create LoRA for Text Encoder 1: lora.py:1027
INFO create LoRA for Text Encoder 2: lora.py:1027
INFO create LoRA for Text Encoder: 264 modules. lora.py:1035
2025-11-06 00:38:13 INFO create LoRA for U-Net: 722 modules. lora.py:1043
INFO enable LoRA for U-Net: 722 modules lora.py:1089
prepare optimizer, data loader etc.
INFO use 8-bit AdamW optimizer | {} train_util.py:4804
running training
num train images * repeats: 780
num validation images * repeats: 0
num reg images: 0
num batches per epoch: 780
num epochs: 2
batch size per device: 1
gradient accumulation steps: 1
total optimization steps: 1000
2025-11-06 00:38:19 INFO unet dtype: torch.bfloat16, device: cuda:0 train_network.py:1323
INFO text_encoder [0] dtype: torch.bfloat16, device: cuda:0 train_network.py:1329
INFO text_encoder [1] dtype: torch.bfloat16, device: cuda:0 train_network.py:1329
steps: 0%| | 0/1000 [00:00<?, ?it/s]
epoch 1/2

2025-11-06 00:38:20 INFO epoch is incremented. current_epoch: 0, epoch: 1 train_util.py:779
CUDA error (C:/a/xformers/xformers/third_party/flash-attention/hopper\flash_fwd_launch_template.h:188): invalid argument
Traceback (most recent call last):
File "M:\StabilityMatrix-win-x64\Data\Assets\Python\cpython-3.10.18-windows-x86_64-none\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "M:\StabilityMatrix-win-x64\Data\Assets\Python\cpython-3.10.18-windows-x86_64-none\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "M:\StabilityMatrix-win-x64\Data\Packages\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 10, in <module>
sys.exit(main())
File "M:\StabilityMatrix-win-x64\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 50, in main
args.func(args)
File "M:\StabilityMatrix-win-x64\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1235, in launch_command
simple_launcher(args)
File "M:\StabilityMatrix-win-x64\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 823, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['M:\StabilityMatrix-win-x64\Data\Packages\kohya_ss\venv\Scripts\python.exe', 'M:/StabilityMatrix-win-x64/Data/Packages/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', 'M:/StabilityMatrix-win-x64/Data/Packages/kohya_ss/outputs/config_lora-20251106-003738.toml']' returned non-zero exit status 1.
00:38:22-446854 INFO Training has ended.
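For what it's worth, the step counts the log reports are internally consistent; a quick sketch of the arithmetic (variable names are mine, not from kohya_ss):

```python
# Reproduce the step arithmetic shown in the log above.
images = 78             # image files found in 10_nisimura
repeats = 10            # folder prefix "10_" sets num_repeats
epochs = 2              # "num epochs: 2"
batch_size = 1          # "batch size per device: 1"
max_train_steps = 1000  # GUI "Max train steps" setting

# 78 * 10 = 780, matching "num batches per epoch: 780"
steps_per_epoch = images * repeats // batch_size

# 780 * 2 = 1560, but the run is capped at max_train_steps,
# matching "total optimization steps: 1000"
total_steps = min(steps_per_epoch * epochs, max_train_steps)

print(steps_per_epoch, total_steps)  # → 780 1000
```

So the configuration itself is coherent; the run dies at step 0 inside xformers' flash-attention kernel, before any optimization step completes, which is why no safetensors file is ever written.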
