Description
Hello,
Training does not work on a fresh installation.
Installed with: uv (launched via ./gui-uv.sh)
System: Linux (CachyOS)
GPU: NVIDIA
`❯ ./gui-uv.sh
2025-11-15 17:33:28.928086: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1763224408.938454 12465 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1763224408.941816 12465 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1763224408.950823 12465 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1763224408.950838 12465 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1763224408.950840 12465 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1763224408.950842 12465 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
2025-11-15 17:33:28.953455: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
17:33:32-223151 WARNING Skipping requirements verification.
17:33:32-225060 INFO headless: False
17:33:32-225861 INFO Using shell=True when running external commands...
Running on local URL: http://127.0.0.1:7860/
To create a public link, set share=True in launch().
17:34:58-001817 INFO Copy /home/tobias/Schreibtisch/Training_images/ to
/home/tobias/Schreibtisch/Destination_training/img/200_h53d man...
17:34:58-003747 INFO Regularization images directory is missing... not copying regularisation images...
17:34:58-004437 INFO Done creating kohya_ss training folder structure at
/home/tobias/Schreibtisch/Destination_training/...
17:35:12-824582 INFO Start training Dreambooth...
17:35:12-825262 INFO Validating lr scheduler arguments...
17:35:12-825823 INFO Validating optimizer arguments...
17:35:12-826262 INFO Validating /home/tobias/Schreibtisch/Destination_training/log existence and writability...
SUCCESS
17:35:12-826749 INFO Validating /home/tobias/Schreibtisch/Destination_training/model existence and writability...
SUCCESS
17:35:12-827290 INFO Validating /home/tobias/AI/ComfyUI/models/checkpoints/lustifySDXLNSFW_endgame.safetensors
existence... SUCCESS
17:35:12-827766 INFO Validating /home/tobias/Schreibtisch/Destination_training/img existence... SUCCESS
17:35:12-828235 INFO Folder 200_h53d man: 200 repeats found
17:35:12-828681 INFO Folder 200_h53d man: 2 images found
17:35:12-829091 INFO Folder 200_h53d man: 2 * 200 = 400 steps
17:35:12-829520 INFO Regularization factor: 1
17:35:12-829912 INFO Total steps: 400
17:35:12-830302 INFO Train batch size: 6
17:35:12-830693 INFO Gradient accumulation steps: 1
17:35:12-831087 INFO Epoch: 1
17:35:12-831457 INFO Max train steps: 1600
17:35:12-831844 INFO lr_warmup_steps = 0.1
17:35:12-832520 INFO Saving training config to
/home/tobias/Schreibtisch/Destination_training/model/TEST_232_20251115-173512.json...
17:35:12-833205 INFO Executing command: /home/tobias/AI/kohya_ss/.venv/bin/accelerate launch --dynamo_backend no
--dynamo_mode default --mixed_precision fp16 --num_processes 1 --num_machines 1
--num_cpu_threads_per_process 2 /home/tobias/AI/kohya_ss/sd-scripts/sdxl_train.py --config_file
/home/tobias/Schreibtisch/Destination_training/model/config_dreambooth-20251115-173512.toml
ipex flag is deprecated, will be removed in Accelerate v1.10. From 2.7.0, PyTorch has all needed optimizations for Intel CPU and XPU.
/home/tobias/AI/kohya_ss/sd-scripts/library/deepspeed_utils.py:131: SyntaxWarning: "is not" with a literal. Did you mean "!="?
wrap_model_forward_with_torch_autocast = args.mixed_precision is not "no"
Traceback (most recent call last):
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/utils/import_utils.py", line 920, in _get_module
return importlib.import_module("." + module_name, self.__name__)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tobias/.local/share/uv/python/cpython-3.11.13-linux-x86_64-gnu/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
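File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1126, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed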
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/models/autoencoders/__init__.py", line 1, in <module>
from .autoencoder_asym_kl import AsymmetricAutoencoderKL
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/models/autoencoders/autoencoder_asym_kl.py", line 22, in <module>
from ..modeling_utils import ModelMixin
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/models/modeling_utils.py", line 35, in <module>
from ..quantizers import DiffusersAutoQuantizer, DiffusersQuantizer
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/quantizers/__init__.py", line 15, in <module>
from .auto import DiffusersAutoQuantizer
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/quantizers/auto.py", line 22, in <module>
from .bitsandbytes import BnB4BitDiffusersQuantizer, BnB8BitDiffusersQuantizer
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/quantizers/bitsandbytes/__init__.py", line 2, in <module>
from .utils import dequantize_and_replace, dequantize_bnb_weight, replace_with_bnb_linear
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/quantizers/bitsandbytes/utils.py", line 32, in <module>
import bitsandbytes as bnb
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/bitsandbytes/__init__.py", line 19, in <module>
from .nn import modules
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/bitsandbytes/nn/__init__.py", line 21, in <module>
from .triton_based_modules import (
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/bitsandbytes/nn/triton_based_modules.py", line 6, in <module>
from bitsandbytes.triton.dequantize_rowwise import dequantize_rowwise
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/bitsandbytes/triton/dequantize_rowwise.py", line 18, in <module>
@triton.autotune(
^^^^^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/runtime/autotuner.py", line 378, in decorator
return Autotuner(fn, fn.arg_names, configs, key, reset_to_zero, restore_value, pre_hook=pre_hook,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/runtime/autotuner.py", line 130, in __init__
self.do_bench = driver.active.get_benchmarker()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 23, in __getattr__
self._initialize_obj()
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
^^^^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 9, in _create_driver
return actives[0]()
^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 535, in __init__
self.utils = CudaUtils() # TODO: make static
^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 89, in __init__
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 71, in compile_module_from_src
mod = importlib.util.module_from_spec(spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ImportError: /home/tobias/.triton/cache/QLAEYTJR4KV5WSBGJKRUAKVP475DE47NW7P4XMI2RFXBOIE5TZ4Q/cuda_utils.so: undefined symbol: cuModuleGetFunction
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobias/AI/kohya_ss/sd-scripts/sdxl_train.py", line 20, in <module>
from library import deepspeed_utils, sdxl_model_util, strategy_base, strategy_sd, strategy_sdxl
File "/home/tobias/AI/kohya_ss/sd-scripts/library/sdxl_model_util.py", line 8, in <module>
from diffusers import AutoencoderKL, EulerDiscreteScheduler, UNet2DConditionModel
File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/utils/import_utils.py", line 911, in __getattr__
value = getattr(module, name)
^^^^^^^^^^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/utils/import_utils.py", line 910, in __getattr__
module = self._get_module(self._class_to_module[name])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/diffusers/utils/import_utils.py", line 922, in _get_module
raise RuntimeError(
RuntimeError: Failed to import diffusers.models.autoencoders.autoencoder_kl because of the following error (look up to see its traceback):
/home/tobias/.triton/cache/QLAEYTJR4KV5WSBGJKRUAKVP475DE47NW7P4XMI2RFXBOIE5TZ4Q/cuda_utils.so: undefined symbol: cuModuleGetFunction
Traceback (most recent call last):
File "/home/tobias/AI/kohya_ss/.venv/bin/accelerate", line 10, in <module>
sys.exit(main())
^^^^^^
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
args.func(args)
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 1199, in launch_command
simple_launcher(args)
File "/home/tobias/AI/kohya_ss/.venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 785, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/tobias/AI/kohya_ss/.venv/bin/python', '/home/tobias/AI/kohya_ss/sd-scripts/sdxl_train.py', '--config_file', '/home/tobias/Schreibtisch/Destination_training/model/config_dreambooth-20251115-173512.toml']' returned non-zero exit status 1.
17:35:18-079399 INFO Training has ended. `
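
The failing import ends in `undefined symbol: cuModuleGetFunction`. That function belongs to the CUDA driver API, which the NVIDIA driver library normally exports. The sketch below is one possible way to narrow this down, not a confirmed fix. It checks whether the driver library visible to the Python process loads and exports that symbol. The library names `libcuda.so.1` and `libcuda.so` are assumptions (the usual names on Linux), and `has_symbol` is a hypothetical helper written for this check.

```python
import ctypes


def has_symbol(library_name: str, symbol: str) -> bool:
    """Return True if the shared library can be loaded and exports `symbol`."""
    try:
        lib = ctypes.CDLL(library_name)
    except OSError:
        # The library was not found or could not be loaded.
        return False
    # ctypes raises AttributeError for a missing symbol, so hasattr is enough.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Assumed library names; adjust if the driver is installed elsewhere.
    for name in ("libcuda.so.1", "libcuda.so"):
        print(name, has_symbol(name, "cuModuleGetFunction"))
```

If this prints `False` for both names, the driver library is probably not visible to the virtual environment. If it prints `True`, the compiled `cuda_utils.so` under `/home/tobias/.triton/cache/` may have been built without linking against the driver library.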