Support grad_clip_norm_() for FSDP #20784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open: amorehead wants to merge 32 commits into Lightning-AI:master from amorehead:fsdp-grad-clip-by-norm.
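For context, the feature this PR adds can be illustrated with a plain-Python sketch of gradient norm clipping. This is only an illustration of the algorithm, not the PR's code: the real implementation lives in `torch.nn.utils.clip_grad_norm_`, and for FSDP-sharded parameters in `FullyShardedDataParallel.clip_grad_norm_`, which additionally all-reduces the norm across ranks.

```python
def clip_grad_norm_(grads, max_norm, norm_type=2.0):
    """Scale gradients in place so their total norm is at most max_norm.

    Plain-Python sketch of gradient norm clipping; `grads` stands in for a
    flat list of scalar gradient values.
    """
    # Total p-norm over all gradient values.
    total_norm = sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)
    # Scale factor; the small epsilon avoids division by zero.
    clip_coef = max_norm / (total_norm + 1e-6)
    if clip_coef < 1.0:
        for i, g in enumerate(grads):
            grads[i] = g * clip_coef
    return total_norm

grads = [3.0, 4.0]  # L2 norm = 5.0
total_norm = clip_grad_norm_(grads, max_norm=1.0)
# grads are rescaled in place so their L2 norm is (approximately) 1.0
```

The function returns the pre-clipping norm, mirroring the convention of `torch.nn.utils.clip_grad_norm_`.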
Commits (32):
- 2fc2fb7 Update fsdp.py (amorehead)
- aa6e482 Merge branch 'Lightning-AI:master' into fsdp-grad-clip-by-norm (amorehead)
- c36f40c Support gradient norm clipping for FSDP (amorehead)
- 8fad423 Update CHANGELOG.md (amorehead)
- 04fbaf1 Fix args for certain precisions (amorehead)
- bce69ca Standardize precision args (amorehead)
- 0df38f5 Guard for typing (amorehead)
- a42b974 Fix argument typing (amorehead)
- ed2fe05 Wrap AMP test module in FSDP (amorehead)
- 2f62a0a Simplify guard (amorehead)
- 7f7987e Remove FSDP traces in AMP precision unit test (amorehead)
- 0b9b2a3 Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- f98ce47 Merge branch 'master' into fsdp-grad-clip-by-norm (Borda)
- 5814091 Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- 75d6d9f Merge branch 'master' into fsdp-grad-clip-by-norm (Borda)
- 395c7fd Merge branch 'master' into fsdp-grad-clip-by-norm (Borda)
- 6f04f9c Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- dee2225 Apply suggestions from code review (Borda)
- 169e20c Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- 3d80102 Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- de84676 Update module.py (amorehead)
- 188ca22 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 7c829b6 Update amp.py (amorehead)
- 161241e Update deepspeed.py (amorehead)
- eea0a94 Update fsdp.py (amorehead)
- 181a355 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 759c4a2 Update precision.py (amorehead)
- 9bc3991 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 63e9d3a Update test_amp.py (amorehead)
- 46fb1b5 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 9d9cd47 Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- 6b631c3 Merge branch 'master' into fsdp-grad-clip-by-norm (justusschock)
This would be a breaking change; it has to go to the end of the arguments.
Are you referring to how other codebases like Fabric would call `clip_gradients`? As far as I can see from this PR's unit tests, no references in the Lightning codebase are broken by this change. And if you are, for clarification, would `module` have to be made `module: Optional[Module] = None` as the last argument in all of the modified functions below?

I am saying that if a user is calling with positional arguments, this will break for them.
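The positional-argument concern can be shown with a minimal sketch. The signatures below are hypothetical, not Lightning's actual `clip_gradients` API: inserting a new parameter before existing ones shifts every positional call site, while appending it as a trailing keyword with a default leaves old calls intact.

```python
from typing import Optional

# Hypothetical "before" signature: callers may pass everything positionally.
def clip_gradients_v1(optimizer, clip_val, algorithm):
    return (optimizer, clip_val, algorithm)

# Breaking change: inserting `module` first shifts all positional arguments,
# so an old call clip_gradients_v2("opt", 0.5, "norm") would now bind "opt"
# to `module` instead of `optimizer` and fail with a missing argument.
def clip_gradients_v2(module, optimizer, clip_val, algorithm):
    return (module, optimizer, clip_val, algorithm)

# Backward-compatible: `module` goes last with a default, so existing
# positional call sites keep their meaning unchanged.
def clip_gradients_v3(optimizer, clip_val, algorithm,
                      module: Optional[object] = None):
    return (optimizer, clip_val, algorithm, module)

old_call = ("opt", 0.5, "norm")
result = clip_gradients_v3(*old_call)  # binds exactly as v1 did
```

This is why the review asks for the new parameter to go at the end of the argument list.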
Great point. I've just made the new `module` argument fully optional by listing it as the last argument, `module: Optional[Module] = None`. Let me know if you see anything else that needs to be addressed.