[LoRA] refactor lora loading at the model-level #11719
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for refactoring this to reduce code duplication. I only have nits, overall LGTM.
Thanks, LGTM.
src/diffusers/utils/peft_utils.py
Outdated
@contextlib.contextmanager
def _lora_loading_context(_pipeline):
Makes sense as a context manager, but I don't see it being applied anywhere?
@BenjaminBossan and I decided to remove its use because we preferred more explicit code to handle offloading.
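For readers following the exchange above, here is a minimal sketch of the trade-off being discussed: wrapping a save/restore step in a context manager versus spelling it out inline. It is not the code from this PR. `ToyPipeline`, `offload_enabled`, `toy_lora_loading_context`, and both `load_*` functions are hypothetical names invented for illustration, and the "offloading" here is just a boolean flag.

```python
import contextlib


class ToyPipeline:
    """Hypothetical stand-in for a pipeline; not a diffusers class."""

    def __init__(self):
        self.offload_enabled = True  # assumed flag, for illustration only
        self.adapters = []


@contextlib.contextmanager
def toy_lora_loading_context(pipeline):
    """Temporarily turn off the toy offload flag and restore it on exit."""
    was_enabled = pipeline.offload_enabled
    pipeline.offload_enabled = False
    try:
        yield pipeline
    finally:
        # Runs even if loading raises, so the flag is always restored.
        pipeline.offload_enabled = was_enabled


def load_with_context(pipeline, name):
    # Context-manager variant: the save/restore logic is hidden in the helper.
    with toy_lora_loading_context(pipeline):
        pipeline.adapters.append(name)


def load_explicit(pipeline, name):
    # Explicit variant: the same steps, written out at the call site.
    was_enabled = pipeline.offload_enabled
    pipeline.offload_enabled = False
    try:
        pipeline.adapters.append(name)
    finally:
        pipeline.offload_enabled = was_enabled
```

Both variants leave the flag as they found it. The explicit form is longer but keeps every step visible where loading happens, which appears to be the preference expressed in the comment above.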
What does this PR do?
Currently, there's a significant overlap between these two methods:
diffusers/src/diffusers/loaders/peft.py
Line 121 in 8adc600
diffusers/src/diffusers/loaders/lora_base.py
Line 322 in 8adc600
This duplication makes it hard to incorporate changes, because the developer has to remember to apply them in both locations. Hence, this PR factors the logic common to both places out into utility functions. IMO, it also makes both loading functions a bit cleaner to read.