
Upcast FSDP2 parameters only if requires_grad #3848

Merged
SunMarc merged 3 commits into huggingface:main from AlignmentResearch:oskar/avoid-fp32-upcast on Nov 26, 2025

Conversation

@ojh31 (Contributor) commented Nov 25, 2025

What does this PR do?

Fixes an unnecessary fp32 upcast of frozen parameters, which is particularly wasteful for LoRA finetuning, where most of the model's parameters are frozen.

Fixes #3844
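
Under mixed precision, only trainable parameters need an fp32 master copy for optimizer updates; frozen parameters (e.g. the LoRA base model weights) can stay in their low-precision dtype. A minimal sketch of that idea, using a hypothetical helper name rather than the actual accelerate code path:

```python
# Hypothetical sketch (not the exact accelerate implementation): upcast only
# trainable parameters to fp32, leaving frozen weights (e.g. the LoRA base
# model) in their original low-precision dtype to avoid the extra memory.
import torch
import torch.nn as nn

def upcast_trainable_params_to_fp32(model: nn.Module) -> None:
    for param in model.parameters():
        if (
            param.requires_grad
            and torch.is_floating_point(param)
            and param.dtype != torch.float32
        ):
            param.data = param.data.to(torch.float32)
```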

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Fully-Sharded Data Parallelism: @SunMarc @zach-huggingface

@ojh31 force-pushed the oskar/avoid-fp32-upcast branch from 365a47b to 3f8941e on November 25, 2025 02:59
@SunMarc (Member) left a comment


Thanks for fixing this! Just a nit

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc SunMarc merged commit d1c96ba into huggingface:main Nov 26, 2025
24 of 25 checks passed


Development

Successfully merging this pull request may close these issues.

FSDP2 with lora take more memory than FSDP

3 participants