@anhuong anhuong commented Feb 11, 2025

In newer versions of transformers (past 4.46), num_items_in_batch is passed to the loss function, as seen here, which leads to the error TypeError: lce_forward() got an unexpected keyword argument 'num_items_in_batch' (full error below). To fix this, I pass the additional parameter through to lce_forward, which allows tuning to run through successfully.

ERROR:sft_trainer.py:Traceback (most recent call last):
  File "/home/tuning/.local/lib/python3.12/site-packages/tuning/sft_trainer.py", line 675, in main
    trainer, additional_train_info = train(
                                     ^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/tuning/sft_trainer.py", line 419, in train
    trainer.train(resume_from_checkpoint)
  File "/home/tuning/.local/lib/python3.12/site-packages/transformers/trainer.py", line 2171, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/transformers/trainer.py", line 2531, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/transformers/trainer.py", line 3675, in training_step
    loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/transformers/trainer.py", line 3731, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 863, in forward
    output = self._fsdp_wrapped_module(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/peft/peft_model.py", line 1644, in forward
    return self.base_model(
           ^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.12/site-packages/peft/tuners/tuners_utils.py", line 197, in forward
    return self.model.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: lce_forward() got an unexpected keyword argument 'num_items_in_batch'
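As a rough illustration of the kind of change involved (a hedged sketch only, not the actual patch in this PR; the parameter list and body here are assumptions), the patched lce_forward simply needs to accept the new keyword that Trainer.compute_loss now forwards into model(**inputs):

```python
# Hedged sketch only -- not the real lce_forward implementation. It shows the
# shape of the fix: accept the num_items_in_batch kwarg that transformers
# >= 4.46 forwards from Trainer.compute_loss into model(**inputs), instead of
# raising TypeError on the unexpected keyword.
from typing import Optional

import torch


def lce_forward(
    self,
    input_ids: Optional[torch.LongTensor] = None,
    attention_mask: Optional[torch.Tensor] = None,
    labels: Optional[torch.LongTensor] = None,
    num_items_in_batch: Optional[int] = None,  # new kwarg in transformers >= 4.46
    **kwargs,
):
    # ... run the base model and compute the fused linear cross-entropy loss
    # here; num_items_in_batch may be used for loss normalization or ignored.
    ...
```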

Successful run of tuning:

***************** Module Forwards Patching *************
INFO:framework.py:***************** Module Forwards Patching *************
Rule: llama-fused-lce Module:                           Class: LlamaForCausalLM Num:  1
INFO:framework.py:Rule: llama-fused-lce Module:                           Class: LlamaForCausalLM Num:  1
Rule: llama-rms       Module: input_layernorm           Class: LlamaRMSNorm    Num: 32
INFO:framework.py:Rule: llama-rms       Module: input_layernorm           Class: LlamaRMSNorm    Num: 32
Rule: llama-rms       Module: model                     Class: LlamaRMSNorm    Num:  1
INFO:framework.py:Rule: llama-rms       Module: model                     Class: LlamaRMSNorm    Num:  1
Rule: llama-rms       Module: post_attention_layernorm  Class: LlamaRMSNorm    Num: 32
INFO:framework.py:Rule: llama-rms       Module: post_attention_layernorm  Class: LlamaRMSNorm    Num: 32
Rule: llama-rope      Module:                           Class: LlamaForCausalLM Num:  1
INFO:framework.py:Rule: llama-rope      Module:                           Class: LlamaForCausalLM Num:  1

@anhuong anhuong requested a review from fabianlim as a code owner February 11, 2025 21:48
@anhuong anhuong changed the title from "fix: liger fail to run loss with new param" to "fix: liger fails to run loss with new param" on Feb 11, 2025
@willmj willmj left a comment

Nice, LGTM! Thanks Anh

@fabianlim fabianlim left a comment

LGTM

@fabianlim fabianlim merged commit f6116e6 into foundation-model-stack:main Feb 11, 2025
7 checks passed