Trying to fine-tune qwen3-vl-8b-instruct, but getting "Processor was not found". Does llama-factory not support this model yet? #10113
Replies: 1 comment
It's an issue with the `image` tag.
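To illustrate what the reply above likely means: in LLaMA-Factory-style multimodal datasets, each entry in a sample's `images` list is expected to be matched by an `<image>` placeholder in the conversation text. Below is a hypothetical sanity check (the sample structure and field names are assumptions for illustration, not the exact schema):

```python
# Hypothetical sanity check for a LLaMA-Factory-style multimodal sample:
# the number of <image> placeholders in the messages should match the
# number of entries in the "images" list.
IMAGE_PLACEHOLDER = "<image>"

def count_image_tags(sample):
    # Count <image> placeholders across all message contents.
    return sum(msg["content"].count(IMAGE_PLACEHOLDER)
               for msg in sample["messages"])

def check_sample(sample):
    # True when placeholders and image files are in one-to-one correspondence.
    return count_image_tags(sample) == len(sample.get("images", []))

sample = {
    "messages": [
        {"role": "user", "content": "<image>Describe this picture."},
        {"role": "assistant", "content": "A cat on a sofa."},
    ],
    "images": ["cat.png"],
}
print(check_sample(sample))  # True
```

Running such a check over the dataset before launching training can catch mismatched or missing `<image>` tags early, before the tokenizer workers fail mid-run.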
`
W0119 12:17:29.398000 3822638 site-packages/torch/distributed/run.py:803]
W0119 12:17:29.398000 3822638 site-packages/torch/distributed/run.py:803] *****************************************
W0119 12:17:29.398000 3822638 site-packages/torch/distributed/run.py:803] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0119 12:17:29.398000 3822638 site-packages/torch/distributed/run.py:803] *****************************************
[W119 12:17:36.057487562 ProcessGroupNCCL.cpp:924] Warning: TORCH_NCCL_AVOID_RECORD_STREAMS is the default now, this environment variable is thus deprecated. (function operator())
[W119 12:17:36.130012532 ProcessGroupNCCL.cpp:924] Warning: TORCH_NCCL_AVOID_RECORD_STREAMS is the default now, this environment variable is thus deprecated. (function operator())
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:36,716 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:36,717 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:36,717 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:36,717 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:36,717 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:36,717 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:36,717 >> loading file chat_template.jinja
[INFO|tokenization_utils_base.py:2337] 2026-01-19 12:17:37,081 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|image_processing_base.py:374] 2026-01-19 12:17:37,081 >> loading configuration file /data/model/Qwen3-VL-8B/preprocessor_config.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:37,082 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:37,082 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:37,082 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:37,082 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:37,082 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:37,082 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2066] 2026-01-19 12:17:37,083 >> loading file chat_template.jinja
[INFO|tokenization_utils_base.py:2337] 2026-01-19 12:17:37,459 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Converting format of dataset (num_proc=16): 0%| | 0/75 [00:00<?, ? examples/s]
Converting format of dataset (num_proc=16): 7%|▋ | 5/75 [00:00<00:09, 7.24 examples/s]
Converting format of dataset (num_proc=16): 100%|██████████| 75/75 [00:00<00:00, 124.93 examples/s]
Converting format of dataset (num_proc=16): 100%|██████████| 75/75 [00:00<00:00, 80.12 examples/s]
/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4876: UserWarning: barrier(): using the device under current context. You can specify device_id in init_process_group to mute this warning.
  warnings.warn(  # warn only once
[rank0]:[W119 12:17:39.512279645 ProcessGroupNCCL.cpp:5072] Guessing device ID based on global rank. This can cause a hang if rank to GPU mapping is heterogeneous. You can specify device_id in init_process_group()
Running tokenizer on dataset (num_proc=16): 0%| | 0/75 [00:00<?, ? examples/s]
[rank0]: multiprocess.pool.RemoteTraceback:
[rank0]: """
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/multiprocess/pool.py", line 125, in worker
[rank0]: result = (True, func(*args, **kwds))
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/utils/py_utils.py", line 586, in _write_generator_to_queue
[rank0]: for i, result in enumerate(func(**kwargs)):
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3674, in _map_single
[rank0]: for i, batch in iter_outputs(shard_iterable):
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3624, in iter_outputs
[rank0]: yield i, apply_function(example, i, offset=offset)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3547, in apply_function
[rank0]: processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/data/processor/supervised.py", line 99, in preprocess_dataset
[rank0]: input_ids, labels = self._encode_data_example(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/data/processor/supervised.py", line 43, in _encode_data_example
[rank0]: messages = self.template.mm_plugin.process_messages(prompt + response, images, videos, audios, self.processor)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/data/mm_plugin.py", line 1666, in process_messages
[rank0]: self._validate_input(processor, images, videos, audios)
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/data/mm_plugin.py", line 181, in _validate_input
[rank0]: raise ValueError("Processor was not found, please check and update your model file.")
[rank0]: ValueError: Processor was not found, please check and update your model file.
[rank0]: """
[rank0]: The above exception was the direct cause of the following exception:
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/launcher.py", line 185, in <module>
[rank0]: run_exp()
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/train/tuner.py", line 126, in run_exp
[rank0]: _training_function(config={"args": args, "callbacks": callbacks})
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/train/tuner.py", line 88, in _training_function
[rank0]: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/train/sft/workflow.py", line 52, in run_sft
[rank0]: dataset_module = get_dataset(template, model_args, data_args, training_args, stage="sft", **tokenizer_module)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/data/loader.py", line 318, in get_dataset
[rank0]: train_dict["train"] = _get_preprocessed_dataset(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/LlamaFactory/src/llamafactory/data/loader.py", line 255, in _get_preprocessed_dataset
[rank0]: dataset = dataset.map(
[rank0]: ^^^^^^^^^^^^
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 560, in wrapper
[rank0]: out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3309, in map
[rank0]: for rank, done, content in iflatmap_unordered(
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/utils/py_utils.py", line 626, in iflatmap_unordered
[rank0]: [async_result.get(timeout=0.05) for async_result in async_results]
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/datasets/utils/py_utils.py", line 626, in <listcomp>
[rank0]: [async_result.get(timeout=0.05) for async_result in async_results]
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/multiprocess/pool.py", line 774, in get
[rank0]: raise self._value
[rank0]: ValueError: Processor was not found, please check and update your model file.
Running tokenizer on dataset (num_proc=16): 0%| | 0/75 [00:00<?, ? examples/s]
[rank0]:[W119 12:17:44.453044828 ProcessGroupNCCL.cpp:1524] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
[rank0]:[W119 12:17:45.615043825 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
W0119 12:17:46.404000 3822638 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 3822728 closing signal SIGTERM
E0119 12:17:46.921000 3822638 site-packages/torch/distributed/elastic/multiprocessing/api.py:882] failed (exitcode: 1) local_rank: 0 (pid: 3822727) of binary: /home/user/anaconda3/envs/llama-factory/bin/python3.11
Traceback (most recent call last):
File "/home/user/anaconda3/envs/llama-factory/bin/torchrun", line 7, in <module>
sys.exit(main())
^^^^^^
File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 357, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/torch/distributed/run.py", line 936, in main
run(args)
File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/torch/distributed/run.py", line 927, in run
elastic_launch(
File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 156, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 293, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
/home/user/LlamaFactory/src/llamafactory/launcher.py FAILED
Failures:
<NO_OTHER_FAILURES>
Root Cause (first observed failure):
[0]:
time : 2026-01-19_12:17:46
host : gpu
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 3822727)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[W119 12:17:47.954984534 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
Traceback (most recent call last):
File "/home/user/anaconda3/envs/llama-factory/bin/llamafactory-cli", line 7, in <module>
sys.exit(main())
^^^^^^
File "/home/user/LlamaFactory/src/llamafactory/cli.py", line 24, in main
launcher.launch()
File "/home/user/LlamaFactory/src/llamafactory/launcher.py", line 115, in launch
process = subprocess.run(
^^^^^^^^^^^^^^^
File "/home/user/anaconda3/envs/llama-factory/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['torchrun', '--nnodes', '1', '--node_rank', '0', '--nproc_per_node', '2', '--master_addr', '127.0.0.1', '--master_port', '37243', '/home/user/LlamaFactory/src/llamafactory/launcher.py', 'saves/Qwen3-VL-8B-Instruct/lora/train_2026-01-19-11-46-59/training_args.yaml']' returned non-zero exit status 1.
[W119 12:17:48.995677198 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
`
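Reading the traceback: the `ValueError` is raised by a guard in LLaMA-Factory's `mm_plugin._validate_input`, which rejects samples carrying images/videos/audios when no multimodal processor was loaded (e.g. the template or installed transformers version does not provide a processor for Qwen3-VL). A minimal sketch of that guard, simplified from the frames above and not the exact upstream implementation:

```python
# Simplified sketch of the guard behind "Processor was not found"
# (based on the traceback's mm_plugin._validate_input; not upstream code).
def validate_input(processor, images, videos, audios):
    # A multimodal sample without a loaded processor cannot be encoded.
    if processor is None and (images or videos or audios):
        raise ValueError("Processor was not found, please check and update your model file.")

# With image inputs but no processor loaded, the guard fires:
try:
    validate_input(None, images=["img.png"], videos=[], audios=[])
except ValueError as e:
    print(e)  # Processor was not found, please check and update your model file.
```

So the fix is to make sure a processor actually loads for the model: use the multimodal template matching Qwen3-VL, verify the model directory contains `preprocessor_config.json`, and use a transformers release that supports the Qwen3-VL processor class.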