support for customizing LoRA multipliers through the sdapi #1982
wbruna wants to merge 9 commits into LostRuins:concedo_experimental
|
Does it have any implications on memory use or runtime file loading? |
|
For For |
|
Personally I have seen this request a few times. There is demand for it. If it's a bit slower during a switch, that is better than not having it at all. Just make sure nothing changes if it's not used. |
b0735b5 to 1ddd1a8
|
Got a first somewhat-working version. I've included code for the As suspected,
What do you think? |
|
By the way, it's also possible to support the |
8d4bc54 to f013f51
|
Should be ready enough for reviewing. As described before:
|
|
Cleaned up the code, and reorganized the commits. Tested with Klein 9b and SDXL. Probably needs some polishing on the launcher and config side, once we decide the zero-multiplier approach is OK. I'll leave this aside a bit, to focus on master-509-4cdfff5 🙂 |
|
The default behavior right now (before this PR) is that when one multiplier is provided (which is the current status quo of the launcher), all LoRAs are initialized at the same strength, which I think should stay the default. E.g. Then the API override should set it to a new value temporarily for that request (only adjustable for those LoRAs loaded at multiplier 0). Also I think |
Intentionally omitted, since it could be considered sensitive information. Usually, we'd have a root directory for all the LoRA files, then we could show subpaths under it. But all LoRAs now are specified by full path, so we can't know which part could be shown. (@LostRuins , a |
Alright, I'll adjust it later (and fix the bug @Riztard mentioned).
|
|
Rebased on top of #2006 to get a fix for zero-multiplier LoRAs getting stuck, and to be able to test both PRs at the same time; but I'll keep the branches separate. Also restored the behavior when a single multiplier is specified. Now:
|
ca2cced to 54cf43a
Also fix typo in the function name.
The `sdloramult` flag now accepts a list of multipliers, one for each LoRA. If all multipliers are non-zero, LoRAs load as before, with no extra VRAM usage or performance impact. If any LoRA has a multiplier of 0, we switch to `at_runtime` mode, and these LoRAs will be available to multiplier changes via the `lora` sdapi field and show up in the `sdapi/v1/loras` endpoint. All LoRAs are still preloaded on startup, and cached to avoid file reloads. A single multiplier (1.0 by default) is applied to all LoRAs, to keep it compatible with the previous behavior.
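The rules in this description can be sketched in Python as follows. This is only an illustration of the described behavior, not the code in this PR: `LoraPlan`, `plan_loras`, `request_multipliers` and the `"preapplied"` mode label are hypothetical names, and both rejecting a list whose length does not match the number of LoRAs and silently ignoring overrides for non-zero LoRAs are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LoraPlan:
    multipliers: List[float]  # startup multiplier for each LoRA, in load order
    mode: str                 # "at_runtime", or "preapplied" (hypothetical label)
    adjustable: List[int]     # indices of LoRAs that requests may re-weight


def plan_loras(num_loras: int, sdloramult: Optional[List[float]] = None) -> LoraPlan:
    """Resolve the startup multipliers and the loading mode."""
    # No value given: fall back to the single default multiplier of 1.0.
    values = list(sdloramult) if sdloramult else [1.0]
    if len(values) == 1:
        # A single multiplier applies to every LoRA (previous behavior).
        multipliers = values * num_loras
    elif len(values) == num_loras:
        multipliers = values
    else:
        # Assumption: any other length is treated as a configuration error.
        raise ValueError(
            "expected 1 or %d multipliers, got %d" % (num_loras, len(values))
        )
    # Only LoRAs loaded with multiplier 0 are exposed to per-request changes.
    adjustable = [i for i, m in enumerate(multipliers) if m == 0]
    mode = "at_runtime" if adjustable else "preapplied"
    return LoraPlan(multipliers, mode, adjustable)


def request_multipliers(plan: LoraPlan, overrides: Dict[int, float]) -> List[float]:
    """Apply per-request overrides without touching the startup plan."""
    result = list(plan.multipliers)
    for index, value in overrides.items():
        # Assumption: overrides for non-adjustable LoRAs are ignored.
        if index in plan.adjustable:
            result[index] = value
    return result
```

For example, `plan_loras(2, [1.0, 0.0])` selects `at_runtime` mode with only the second LoRA adjustable, while `plan_loras(3, [0.8])` keeps all three LoRAs at 0.8 with nothing exposed to the API.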



This is still just an idea! Since we just got support for multiple LoRAs, we could include LoRA customization on the API side, by:
- `/sdapi/v1/loras`
- `lora` field at `/sdapi/v1/txt2img` and `/sdapi/v1/img2img`

I recently implemented support on my Python client script for the mainline sd-server implementation, so I have a reasonable idea about how complicated that would be. I'm also aware that the sd.cpp C API would have to be adapted to allow changing LoRA weights without reloading the models.
Do you think this would be worth implementing?
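For illustration, a client-side request using the proposed `lora` field might be built roughly like this. It is a sketch under loud assumptions: the base URL, the helper name `build_txt2img_request`, and especially the shape of each `lora` entry (`name` and `multiplier` keys) are guesses that this thread does not confirm, so they should be checked against the actual implementation.

```python
import json
import urllib.request
from typing import Dict

# Assumption: address of a locally running server; adjust as needed.
BASE_URL = "http://localhost:5001"


def build_txt2img_request(
    prompt: str, loras: Dict[str, float], base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a txt2img request carrying LoRA multipliers."""
    payload = {
        "prompt": prompt,
        # Hypothetical entry shape: one object per LoRA to re-weight.
        "lora": [{"name": n, "multiplier": m} for n, m in loras.items()],
    }
    return urllib.request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The names to use could come from `/sdapi/v1/loras`, whose response format is likewise not specified here.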