Has anyone hit an error with the virtual try-on feature of the sd-webui EasyPhoto plugin when uploading a clothing template via cloth upload?
#15273
The following values were not passed to accelerate launch and had defaults used instead:
--num_processes was set to a value of 2
More than one GPU was found, enabling multi-GPU training.
If this was unintended please pass in --num_processes=1.
--num_machines was set to a value of 1
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
[2024-03-14 15:23:37,083] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
[W socket.cpp:663] [c10d] The client socket has failed to connect to [DESKTOP-L2OV9PU]:3456 (system error: 10049 - The requested address is not valid in its context.).
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
2024-03-14 15:23:54,211 - modelscope - INFO - PyTorch version 2.1.2+cu121 Found.
2024-03-14 15:23:54,219 - modelscope - INFO - TensorFlow version 2.16.1 Found.
2024-03-14 15:23:54,219 - modelscope - INFO - Loading ast index from C:\Users\lhcx\.cache\modelscope\ast_indexer
2024-03-14 15:23:54,376 - modelscope - INFO - Loading done! Current index file version is 1.9.3, with md5 b9af109f4bc3e4b3d70e7f2b23c9b24e and a total number of 943 components indexed
2024-03-14 15:23:54,998 - modelscope - INFO - PyTorch version 2.1.2+cu121 Found.
2024-03-14 15:23:55,004 - modelscope - INFO - TensorFlow version 2.16.1 Found.
2024-03-14 15:23:55,010 - modelscope - INFO - Loading ast index from C:\Users\lhcx\.cache\modelscope\ast_indexer
2024-03-14 15:23:55,210 - modelscope - INFO - Loading done! Current index file version is 1.9.3, with md5 b9af109f4bc3e4b3d70e7f2b23c9b24e and a total number of 943 components indexed
[W socket.cpp:663] [c10d] The client socket has failed to connect to [DESKTOP-L2OV9PU]:3456 (system error: 10049 - The requested address is not valid in its context.).
Traceback (most recent call last):
File "E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 1390, in <module>
main()
File "E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya\utils\gpu_info.py", line 195, in wrapper
result = func(*args, **kwargs)
File "E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 711, in main
accelerator = Accelerator(
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\accelerator.py", line 358, in __init__
self.state = AcceleratorState(
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\state.py", line 720, in __init__
PartialState(cpu, **kwargs)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\state.py", line 192, in __init__
torch.distributed.init_process_group(backend=self.backend, **kwargs)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\c10d_logger.py", line 74, in wrapper
func_return = func(*args, **kwargs)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1148, in init_process_group
default_pg, _ = _new_process_group_helper(
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1268, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
[W socket.cpp:663] [c10d] The client socket has failed to connect to [DESKTOP-L2OV9PU]:3456 (system error: 10049 - The requested address is not valid in its context.).
Traceback (most recent call last):
File "E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 1390, in <module>
main()
File "E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya\utils\gpu_info.py", line 195, in wrapper
result = func(*args, **kwargs)
File "E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 711, in main
accelerator = Accelerator(
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\accelerator.py", line 358, in __init__
self.state = AcceleratorState(
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\state.py", line 720, in __init__
PartialState(cpu, **kwargs)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\state.py", line 192, in __init__
torch.distributed.init_process_group(backend=self.backend, **kwargs)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\c10d_logger.py", line 74, in wrapper
func_return = func(*args, **kwargs)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1148, in init_process_group
default_pg, _ = _new_process_group_helper(
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1268, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
[2024-03-14 15:23:59,246] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 10016) of binary: E:\stable-diffusion-webui-1.8\venv\Scripts\python.exe
Traceback (most recent call last):
File "C:\Users\lhcx\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\lhcx\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\commands\launch.py", line 989, in <module>
main()
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\commands\launch.py", line 985, in main
launch_command(args)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\commands\launch.py", line 970, in launch_command
multi_gpu_launcher(args)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\accelerate\commands\launch.py", line 646, in multi_gpu_launcher
distrib_run.run(args)
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\run.py", line 797, in run
elastic_launch(
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\launcher\api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "E:\stable-diffusion-webui-1.8\venv\lib\site-packages\torch\distributed\launcher\api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py FAILED
Failures:
[1]:
time : 2024-03-14_15:23:59
host : DESKTOP-L2OV9PU
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 13156)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
time : 2024-03-14_15:23:59
host : DESKTOP-L2OV9PU
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 10016)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
2024-03-14 15:24:00,374 - EasyPhoto - Error executing the command: Command '['E:\stable-diffusion-webui-1.8\venv\Scripts\python.exe', '-m', 'accelerate.commands.launch', '--mixed_precision=fp16', '--main_process_port=3456', 'E:\stable-diffusion-webui-1.8\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py', '--pretrained_model_name_or_path=extensions\sd-webui-EasyPhoto\models\stable-diffusion-v1-5', '--pretrained_model_ckpt=models\Stable-diffusion\Chilloutmix-Ni-pruned-fp16-fix.safetensors', '--train_data_dir=outputs\easyphoto-cloth-id-infos\test_black_200_200\processed_images', '--caption_column=text', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=4', '--dataloader_num_workers=0', '--max_train_steps=200', '--checkpointing_steps=200', '--learning_rate=0.0001', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--train_text_encoder', '--seed=42', '--rank=128', '--network_alpha=64', '--output_dir=E:\stable-diffusion-webui-1.8\outputs/easyphoto-cloth-id-infos\test_black_200_200\user_weights', '--logging_dir=E:\stable-diffusion-webui-1.8\outputs/easyphoto-cloth-id-infos\test_black_200_200\user_weights', '--enable_xformers_memory_efficient_attention', '--mixed_precision=fp16', '--cach
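For anyone reading the log above: the launcher reports that it found more than one GPU and defaulted to `--num_processes=2`, and both workers then fail in `init_process_group` because this PyTorch build has no NCCL backend. The warning itself suggests passing `--num_processes=1`. Below is a minimal sketch of that idea in Python, not EasyPhoto's actual code. The helper names `nccl_available` and `force_single_process` are hypothetical, and the shortened command list in the usage lines is illustrative. Only the module name `accelerate.commands.launch` and the flag `--num_processes=1` are taken from the log.

```python
from typing import List

LAUNCHER = "accelerate.commands.launch"  # module name as it appears in the logged command


def nccl_available() -> bool:
    """Report whether the installed PyTorch build includes the NCCL backend.

    Returns False if torch is not importable at all.
    """
    try:
        import torch.distributed as dist
    except ImportError:
        return False
    return dist.is_available() and dist.is_nccl_available()


def force_single_process(cmd: List[str]) -> List[str]:
    """Return a copy of cmd with --num_processes=1 placed right after the launcher module.

    Hypothetical helper: the command is returned unchanged if it already sets
    --num_processes or if it does not invoke the accelerate launcher.
    """
    cmd = list(cmd)  # work on a copy, leave the caller's list untouched
    if any(arg.startswith("--num_processes") for arg in cmd):
        return cmd
    if LAUNCHER not in cmd:
        return cmd
    cmd.insert(cmd.index(LAUNCHER) + 1, "--num_processes=1")
    return cmd


if __name__ == "__main__":
    # Shortened, illustrative version of the command from the log.
    command = ["python.exe", "-m", LAUNCHER, "--mixed_precision=fp16", "train_lora.py"]
    if not nccl_available():
        command = force_single_process(command)
    print(command)
```

Whether the plugin exposes a setting for this is something I have not checked. Running `accelerate config` once and choosing a single-process setup, as the warning text also mentions, may have the same effect.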