Update requirements.txt #1245

Farewell-CK · 2025-09-07T02:38:23Z

The latest release of the PaddleFormers library is 0.2.1. If the default install pulls in this latest version, training fails with the following error:

/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:717: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
W0907 09:45:14.021631  3142 gpu_resources.cc:114] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 12.8, Runtime API Version: 12.6
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:717: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
LAUNCH INFO 2025-09-07 09:45:17,460 -----------  Configuration  ----------------------
LAUNCH INFO 2025-09-07 09:45:17,460 auto_cluster_config: 0
LAUNCH INFO 2025-09-07 09:45:17,460 auto_parallel_config: None
LAUNCH INFO 2025-09-07 09:45:17,460 auto_tuner_json: None
LAUNCH INFO 2025-09-07 09:45:17,460 devices: 0
LAUNCH INFO 2025-09-07 09:45:17,460 elastic_level: -1
LAUNCH INFO 2025-09-07 09:45:17,460 elastic_timeout: 30
LAUNCH INFO 2025-09-07 09:45:17,460 enable_gpu_log: True
LAUNCH INFO 2025-09-07 09:45:17,460 gloo_port: 6767
LAUNCH INFO 2025-09-07 09:45:17,460 host: None
LAUNCH INFO 2025-09-07 09:45:17,460 ips: None
LAUNCH INFO 2025-09-07 09:45:17,460 job_id: default
LAUNCH INFO 2025-09-07 09:45:17,460 legacy: False
LAUNCH INFO 2025-09-07 09:45:17,460 log_dir: erniekit_dist_log
LAUNCH INFO 2025-09-07 09:45:17,460 log_level: INFO
LAUNCH INFO 2025-09-07 09:45:17,460 log_overwrite: False
LAUNCH INFO 2025-09-07 09:45:17,460 master: 127.0.0.1:8080
LAUNCH INFO 2025-09-07 09:45:17,460 max_restart: 3
LAUNCH INFO 2025-09-07 09:45:17,460 nnodes: 1
LAUNCH INFO 2025-09-07 09:45:17,460 nproc_per_node: None
LAUNCH INFO 2025-09-07 09:45:17,460 rank: -1
LAUNCH INFO 2025-09-07 09:45:17,460 run_mode: collective
LAUNCH INFO 2025-09-07 09:45:17,460 server_num: None
LAUNCH INFO 2025-09-07 09:45:17,460 servers: 
LAUNCH INFO 2025-09-07 09:45:17,460 sort_ip: False
LAUNCH INFO 2025-09-07 09:45:17,460 start_port: 6070
LAUNCH INFO 2025-09-07 09:45:17,460 trainer_num: None
LAUNCH INFO 2025-09-07 09:45:17,460 trainers: 
LAUNCH INFO 2025-09-07 09:45:17,461 training_script: /home/aistudio/ERNIE/erniekit/launcher.py
LAUNCH INFO 2025-09-07 09:45:17,461 training_script_args: ['train', '/home/aistudio/configs/ERNIE-4.5-0.3B/sft/run_sft_lora_32k.yaml']
LAUNCH INFO 2025-09-07 09:45:17,461 with_gloo: 1
LAUNCH INFO 2025-09-07 09:45:17,461 --------------------------------------------------
LAUNCH INFO 2025-09-07 09:45:17,461 Job: default, mode collective, replicas 1[1:1], elastic False
LAUNCH INFO 2025-09-07 09:45:17,462 Run Pod: wsqqoa, replicas 1, status ready
LAUNCH INFO 2025-09-07 09:45:17,474 Watching Pod: wsqqoa, replicas 1, status running
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:717: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
W0907 09:45:21.091935  3501 gpu_resources.cc:114] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 12.8, Runtime API Version: 12.6
[2025-09-07 09:45:21,425] [    INFO] - user has defined resume_from_checkpoint: None
[2025-09-07 09:45:21,426] [    INFO] - The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
[2025-09-07 09:45:21,426] [    INFO] - reset finetuning arguments global_batch_size to 8
[2025-09-07 09:45:21,426] [ WARNING] - eval_batch_size set to 1
[2025-09-07 09:45:21,457] [    INFO] - Tensor_parallel_degree = 1. Set sequence_parallel to False.
[2025-09-07 09:45:21,458] [   DEBUG] - ============================================================
[2025-09-07 09:45:21,458] [   DEBUG] -      Model Configuration Arguments      
[2025-09-07 09:45:21,458] [   DEBUG] - paddle commit id              : 0cd8342db88f96ef57398e888b050b5077ae6e73
[2025-09-07 09:45:21,458] [   DEBUG] - paddleformers commit id       : c8e46d9f40ac234baac500b0621873f2da7b9d47
[2025-09-07 09:45:21,458] [   DEBUG] - add_tail_layers               : False
[2025-09-07 09:45:21,458] [   DEBUG] - bos_token_id                  : 0
[2025-09-07 09:45:21,458] [   DEBUG] - continue_training             : True
[2025-09-07 09:45:21,458] [   DEBUG] - download_hub                  : None
[2025-09-07 09:45:21,458] [   DEBUG] - eos_token_id                  : 1
[2025-09-07 09:45:21,459] [   DEBUG] - fine_tuning                   : LoRA
[2025-09-07 09:45:21,459] [   DEBUG] - fuse_gate_detach_matmul       : True
[2025-09-07 09:45:21,459] [   DEBUG] - fuse_linear                   : False
[2025-09-07 09:45:21,459] [   DEBUG] - fuse_rms_norm                 : True
[2025-09-07 09:45:21,459] [   DEBUG] - fuse_rope                     : True
[2025-09-07 09:45:21,459] [   DEBUG] - fuse_softmax_mask             : False
[2025-09-07 09:45:21,459] [   DEBUG] - fuse_swiglu                   : True
[2025-09-07 09:45:21,459] [   DEBUG] - lora                          : True
[2025-09-07 09:45:21,459] [   DEBUG] - lora_alpha                    : -1
[2025-09-07 09:45:21,459] [   DEBUG] - lora_path                     : None
[2025-09-07 09:45:21,459] [   DEBUG] - lora_plus_scale               : 1.0
[2025-09-07 09:45:21,459] [   DEBUG] - lora_rank                     : 32
[2025-09-07 09:45:21,459] [   DEBUG] - loss_subbatch_seqlen          : 32768
[2025-09-07 09:45:21,459] [   DEBUG] - max_position_embeddings       : 4096
[2025-09-07 09:45:21,459] [   DEBUG] - model_name_or_path            : /home/aistudio/model/ERNIE-4.5-0.3B-Paddle
[2025-09-07 09:45:21,459] [   DEBUG] - moe_aux_loss_lambda           : 1e-05
[2025-09-07 09:45:21,459] [   DEBUG] - moe_gate                      : top2_fused
[2025-09-07 09:45:21,459] [   DEBUG] - moe_group                     : dummy
[2025-09-07 09:45:21,459] [   DEBUG] - moe_group_experts             : False
[2025-09-07 09:45:21,459] [   DEBUG] - moe_multimodal_dispatch_use_allgather: v2-alltoall-unpad
[2025-09-07 09:45:21,459] [   DEBUG] - moe_orthogonal_loss_lambda    : 0.0
[2025-09-07 09:45:21,459] [   DEBUG] - moe_use_aux_free              : None
[2025-09-07 09:45:21,460] [   DEBUG] - moe_use_hard_gate             : False
[2025-09-07 09:45:21,460] [   DEBUG] - moe_with_send_router_loss     : False
[2025-09-07 09:45:21,460] [   DEBUG] - moe_z_loss_lambda             : 0.0
[2025-09-07 09:45:21,460] [   DEBUG] - no_recompute_layers           : None
[2025-09-07 09:45:21,460] [   DEBUG] - num_nextn_predict_layers      : 0
[2025-09-07 09:45:21,460] [   DEBUG] - offload_recompute_inputs      : False
[2025-09-07 09:45:21,460] [   DEBUG] - pp_seg_method                 : layer:Ernie4_5_DecoderLayer|ErnieDecoderLayer|EmptyLayer
[2025-09-07 09:45:21,460] [   DEBUG] - recompute_granularity         : full
[2025-09-07 09:45:21,460] [   DEBUG] - recompute_use_reentrant       : True
[2025-09-07 09:45:21,460] [   DEBUG] - rope_3d                       : True
[2025-09-07 09:45:21,460] [   DEBUG] - rslora                        : False
[2025-09-07 09:45:21,460] [   DEBUG] - rslora_plus                   : False
[2025-09-07 09:45:21,460] [   DEBUG] - stage                         : SFT
[2025-09-07 09:45:21,460] [   DEBUG] - tensor_parallel_output        : True
[2025-09-07 09:45:21,460] [   DEBUG] - use_attn_mask_start_row_indices: True
[2025-09-07 09:45:21,460] [   DEBUG] - use_flash_attention           : True
[2025-09-07 09:45:21,460] [   DEBUG] - use_flash_attn_with_mask      : True
[2025-09-07 09:45:21,460] [   DEBUG] - use_fused_head_and_loss_fn    : False
[2025-09-07 09:45:21,460] [   DEBUG] - use_mem_eff_attn              : True
[2025-09-07 09:45:21,460] [   DEBUG] - use_recompute_loss_fn         : True
[2025-09-07 09:45:21,460] [   DEBUG] - use_recompute_moe             : False
[2025-09-07 09:45:21,460] [   DEBUG] - use_sparse_flash_attn         : True
[2025-09-07 09:45:21,460] [   DEBUG] - use_sparse_head_and_loss_fn   : True
[2025-09-07 09:45:21,461] [   DEBUG] - virtual_pp_degree             : 1
[2025-09-07 09:45:21,461] [   DEBUG] - vision_config                 : VisionArguments(attn_implementation='eager', attn_sep=True, depth=32, embed_dim=1280, hidden_act='quick_gelu', hidden_size=1280, in_channels=3, in_chans=3, mlp_ratio=4, model_type='DFNRope_vision_transformer', num_heads=16, patch_size=14, spatial_merge_size=2, spatial_patch_size=14, tensor_parallel_degree=4, use_recompute=True, vit_num_recompute_layers=10000)
[2025-09-07 09:45:21,461] [   DEBUG] - 
[2025-09-07 09:45:21,461] [   DEBUG] - ============================================================
[2025-09-07 09:45:21,461] [   DEBUG] -       Data Configuration Arguments      
[2025-09-07 09:45:21,461] [   DEBUG] - paddle commit id              : 0cd8342db88f96ef57398e888b050b5077ae6e73
[2025-09-07 09:45:21,461] [   DEBUG] - paddleformers commit id       : c8e46d9f40ac234baac500b0621873f2da7b9d47
[2025-09-07 09:45:21,461] [   DEBUG] - buffer_size                   : 500
[2025-09-07 09:45:21,461] [   DEBUG] - dataset_name                  : KnowledgeBasedSFTReader
[2025-09-07 09:45:21,461] [   DEBUG] - dataset_type                  : iterable
[2025-09-07 09:45:21,461] [   DEBUG] - eval_dataset_path             : /home/aistudio/datasets/val.jsonl
[2025-09-07 09:45:21,461] [   DEBUG] - eval_dataset_prob             : 1.0
[2025-09-07 09:45:21,461] [   DEBUG] - eval_dataset_type             : erniekit
[2025-09-07 09:45:21,461] [   DEBUG] - greedy_intokens               : True
[2025-09-07 09:45:21,461] [   DEBUG] - in_tokens_batching            : True
[2025-09-07 09:45:21,461] [   DEBUG] - mask_out_eos_token            : True
[2025-09-07 09:45:21,461] [   DEBUG] - max_prompt_len                : 2048
[2025-09-07 09:45:21,461] [   DEBUG] - max_seq_len                   : 32768
[2025-09-07 09:45:21,461] [   DEBUG] - num_comparisons               : 6
[2025-09-07 09:45:21,461] [   DEBUG] - num_samples_each_epoch        : 6000000
[2025-09-07 09:45:21,461] [   DEBUG] - offline_dataset_path          : None
[2025-09-07 09:45:21,462] [   DEBUG] - random_shuffle                : True
[2025-09-07 09:45:21,462] [   DEBUG] - text_dataset_path             : None
[2025-09-07 09:45:21,462] [   DEBUG] - text_dataset_prob             : None
[2025-09-07 09:45:21,462] [   DEBUG] - train_dataset_path            : /home/aistudio/datasets/train.jsonl
[2025-09-07 09:45:21,462] [   DEBUG] - train_dataset_prob            : 1.0
[2025-09-07 09:45:21,462] [   DEBUG] - train_dataset_type            : erniekit
[2025-09-07 09:45:21,462] [   DEBUG] - use_cls                       : True
[2025-09-07 09:45:21,462] [   DEBUG] - 
[2025-09-07 09:45:21,462] [    INFO] - The global seed is set to 23, local seed is set to 24 and random seed is set to 23.
[2025-09-07 09:45:21,462] [ WARNING] - Process rank: -1, device: gpu, world_size: 1, distributed training: False, 16-bits training: True
[2025-09-07 09:45:21,462] [    INFO] - Start to load model ...
[2025-09-07 09:45:21,463] [    INFO] - Using download source: huggingface
[2025-09-07 09:45:21,463] [    INFO] - Loading configuration file /home/aistudio/model/ERNIE-4.5-0.3B-Paddle/config.json
[2025-09-07 09:45:21,463] [ WARNING] - You are using a model of type ernie4_5 to instantiate a model of type ernie4_5_moe. This is not supported for all configurations of models and can yield errors.
[2025-09-07 09:45:21,464] [    INFO] - Loading weights file /home/aistudio/model/ERNIE-4.5-0.3B-Paddle/model.safetensors
[2025-09-07 09:45:21,870] [    INFO] - Loaded weights file from disk, setting weights to model.
Traceback (most recent call last):
  File "/home/aistudio/ERNIE/erniekit/launcher.py", line 46, in <module>
    launch()
  File "/home/aistudio/ERNIE/erniekit/launcher.py", line 34, in launch
    run_tuner()
  File "/home/aistudio/ERNIE/erniekit/train/tuner.py", line 76, in run_tuner
    _training_function(config={"args": args})
  File "/home/aistudio/ERNIE/erniekit/train/tuner.py", line 55, in _training_function
    run_sft(model_args, data_args, generating_args, finetuning_args)
  File "/home/aistudio/ERNIE/erniekit/train/sft/workflow.py", line 362, in run_sft
    model = model_class.from_pretrained(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/paddleformers/transformers/model_utils.py", line 2665, in from_pretrained
    model = cls(config, *init_args, **model_kwargs)
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/paddleformers/transformers/utils.py", line 290, in __impl__
    init_func(self, *args, **kwargs)
TypeError: Ernie4_5_MoeForCausalLM.__init__() got an unexpected keyword argument 'convert_from_hf'
LAUNCH INFO 2025-09-07 09:45:22,480 Pod failed
LAUNCH ERROR 2025-09-07 09:45:22,480 Container failed !!!
Container rank 0 status failed cmd ['/opt/conda/envs/python35-paddle120-env/bin/python', '-u', '/home/aistudio/ERNIE/erniekit/launcher.py', 'train', '/home/aistudio/configs/ERNIE-4.5-0.3B/sft/run_sft_lora_32k.yaml'] code 1 log erniekit_dist_log/workerlog.0
LAUNCH INFO 2025-09-07 09:45:22,480 ------------------------- ERROR LOG DETAIL -------------------------
[2025-09-07 09:45:21,461] [   DEBUG] - num_samples_each_epoch        : 6000000
[2025-09-07 09:45:21,461] [   DEBUG] - offline_dataset_path          : None
[2025-09-07 09:45:21,462] [   DEBUG] - random_shuffle                : True
[2025-09-07 09:45:21,462] [   DEBUG] - text_dataset_path             : None
[2025-09-07 09:45:21,462] [   DEBUG] - text_dataset_prob             : None
[2025-09-07 09:45:21,462] [   DEBUG] - train_dataset_path            : /home/aistudio/datasets/train.jsonl
[2025-09-07 09:45:21,462] [   DEBUG] - train_dataset_prob            : 1.0
[2025-09-07 09:45:21,462] [   DEBUG] - train_dataset_type            : erniekit
[2025-09-07 09:45:21,462] [   DEBUG] - use_cls                       : True
[2025-09-07 09:45:21,462] [   DEBUG] - 
[2025-09-07 09:45:21,462] [    INFO] - The global seed is set to 23, local seed is set to 24 and random seed is set to 23.
[2025-09-07 09:45:21,462] [ WARNING] - Process rank: -1, device: gpu, world_size: 1, distributed training: False, 16-bits training: True
[2025-09-07 09:45:21,462] [    INFO] - Start to load model ...
[2025-09-07 09:45:21,463] [    INFO] - Using download source: huggingface
[2025-09-07 09:45:21,463] [    INFO] - Loading configuration file /home/aistudio/model/ERNIE-4.5-0.3B-Paddle/config.json
[2025-09-07 09:45:21,463] [ WARNING] - You are using a model of type ernie4_5 to instantiate a model of type ernie4_5_moe. This is not supported for all configurations of models and can yield errors.
[2025-09-07 09:45:21,464] [    INFO] - Loading weights file /home/aistudio/model/ERNIE-4.5-0.3B-Paddle/model.safetensors
[2025-09-07 09:45:21,870] [    INFO] - Loaded weights file from disk, setting weights to model.
Traceback (most recent call last):
  File "/home/aistudio/ERNIE/erniekit/launcher.py", line 46, in <module>
    launch()
  File "/home/aistudio/ERNIE/erniekit/launcher.py", line 34, in launch
    run_tuner()
  File "/home/aistudio/ERNIE/erniekit/train/tuner.py", line 76, in run_tuner
    _training_function(config={"args": args})
  File "/home/aistudio/ERNIE/erniekit/train/tuner.py", line 55, in _training_function
    run_sft(model_args, data_args, generating_args, finetuning_args)
  File "/home/aistudio/ERNIE/erniekit/train/sft/workflow.py", line 362, in run_sft
    model = model_class.from_pretrained(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/paddleformers/transformers/model_utils.py", line 2665, in from_pretrained
    model = cls(config, *init_args, **model_kwargs)
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/paddleformers/transformers/utils.py", line 290, in __impl__
    init_func(self, *args, **kwargs)
TypeError: Ernie4_5_MoeForCausalLM.__init__() got an unexpected keyword argument 'convert_from_hf'
LAUNCH INFO 2025-09-07 09:45:22,480 Exit code 1
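
To tell whether an environment is likely to hit this failure before launching a job, a small version check can help. The following is a minimal sketch, not part of this PR. It assumes that the distribution is registered under the name `paddleformers` and that 0.2.1 is the first affected release; the description above implies the latter but does not confirm it. The helper names `parse_version`, `is_affected`, and `check_paddleformers` are hypothetical, and the parser is deliberately naive (it ignores pre-release suffixes).

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.2.1' into (0, 2, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_affected(installed, first_bad="0.2.1"):
    # Assumption: every release from `first_bad` onward triggers the error.
    return parse_version(installed) >= parse_version(first_bad)


def check_paddleformers():
    # Assumption: the installed distribution is named "paddleformers".
    try:
        installed = metadata.version("paddleformers")
    except metadata.PackageNotFoundError:
        return "paddleformers is not installed"
    if is_affected(installed):
        return f"paddleformers {installed} may hit the convert_from_hf TypeError"
    return f"paddleformers {installed} predates the reported failure"


if __name__ == "__main__":
    print(check_paddleformers())
```

Running the script prints a one-line verdict for the current environment. If it reports an affected version, constraining the version in requirements.txt is the kind of fix this PR's title points to.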

