Replies: 3 comments 1 reply
-
I'm running into the same problem. Could you share how it was resolved?
0 replies
-
Add this to enable input gradients:
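(The code snippet from this comment doesn't appear in this copy of the thread. The fix being referred to is most likely a call to `enable_input_require_grads()` on the base model before training. The sketch below is an assumption about where that call goes, with illustrative LoRA settings rather than the actual values from `configs/lora.yaml`.)

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the frozen base model (path taken from the command in this thread).
model = AutoModelForCausalLM.from_pretrained(
    "/mnt/workspace/chatglm3-6b", trust_remote_code=True
)

# With gradient checkpointing and a fully frozen base model, the embedding
# output has requires_grad=False, so no autograd graph is built through the
# checkpointed layers and loss.backward() fails with
# "element 0 of tensors does not require grad and does not have a grad_fn".
# enable_input_require_grads() registers a forward hook that marks the
# embedding output as requiring grad, restoring gradient flow to the LoRA
# adapters.
model.enable_input_require_grads()

# Illustrative LoRA config; the real values come from configs/lora.yaml.
peft_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)
```

This is also why the log below shows the warning "None of the inputs have requires_grad=True. Gradients will be None" right before the RuntimeError.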
1 reply
-
Hello, could you explain why this works?
0 replies
-
I want to fine-tune the chatglm3-6b model with conversation-format data on the Alibaba Cloud PAI platform. I followed the official steps one by one and everything went smoothly until I ran `python finetune_hf.py data/fix /mnt/workspace/chatglm3-6b configs/lora.yaml`; after the program checks the checkpoint and prepares to fine-tune, it throws an error. I'd really appreciate some help, thanks! The error log is attached below.
Num examples = 1
Num Epochs = 1,000
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 1
Gradient Accumulation steps = 1
Total optimization steps = 1,000
Number of trainable parameters = 1,949,696
0%| | 0/1000 [00:00<?, ?it/s]/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/torch/utils/checkpoint.py:90: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn(
Traceback (most recent call last):
File "/mnt/workspace/ChatGLM3/finetune_demo/finetune_hf.py", line 525, in
app()
File "/mnt/workspace/ChatGLM3/finetune_demo/finetune_hf.py", line 517, in main
trainer.train()
File "/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/transformers/trainer.py", line 1539, in train
return inner_training_loop(
File "/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/transformers/trainer.py", line 1869, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/transformers/trainer.py", line 2781, in training_step
self.accelerator.backward(loss)
File "/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/accelerate/accelerator.py", line 1966, in backward
loss.backward(**kwargs)
File "/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/torch/_tensor.py", line 522, in backward
torch.autograd.backward(
File "/opt/conda/envs/ChatGLM3-6b-finetunning/lib/python3.10/site-packages/torch/autograd/init.py", line 266, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
0%| | 0/1000 [00:02<?, ?it/s]
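(Not from the original thread: a hedged follow-up for anyone hitting the same traceback. Besides the `enable_input_require_grads()` fix mentioned in the comment above, the `use_reentrant` warning in this log points at a second option, sketched below under the assumption that `model` is the ChatGLM3 model object loaded by finetune_hf.py and that the checkpoint's remote modeling code routes through transformers' standard gradient-checkpointing hooks.)

```python
# Non-reentrant activation checkpointing does not require the checkpointed
# inputs to have requires_grad=True, so it avoids the
# "element 0 of tensors does not require grad" error and also silences the
# use_reentrant deprecation warning shown above.
model.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)

# Quick sanity check before calling trainer.train(): the LoRA adapter
# parameters should be the (only) trainable ones.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors, e.g. {trainable[:3]}")
```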