Some of you may have read this blog post on SD optimisation: https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl. It covers quite a few techniques that can speed up inference considerably, but many of them are not in Forge yet.
I'm listing a few of the more interesting ones below.
1. Disabling CFG
Everyone knows that disabling CFG completely saves a huge amount of time, though on its own that's not very useful. But did you know that you can disable CFG after ~75% of the sampling steps are done without losing quality?
I don't see an extension or a built-in way to do this in Forge / A1111, but I'm guessing it shouldn't be too hard to implement; a rough sketch of the idea is below.
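To make the idea concrete, here is a minimal sketch using the diffusers SDXL pipeline (not Forge; the callback API is diffusers', and the 0.75 cutoff is just the blog's suggested value). Once 75% of the steps are done, it drops the negative half of each batched tensor and zeroes the guidance scale, so every remaining step runs a single UNet pass instead of two:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

def disable_cfg_callback(pipe, step_index, timestep, callback_kwargs):
    # After 75% of the steps, keep only the conditional half of each
    # batched tensor and switch guidance off for the remaining steps.
    if step_index == int(pipe.num_timesteps * 0.75):
        callback_kwargs["prompt_embeds"] = callback_kwargs["prompt_embeds"].chunk(2)[-1]
        callback_kwargs["add_text_embeds"] = callback_kwargs["add_text_embeds"].chunk(2)[-1]
        callback_kwargs["add_time_ids"] = callback_kwargs["add_time_ids"].chunk(2)[-1]
        pipe._guidance_scale = 0.0
    return callback_kwargs

image = pipe(
    "a photo of an astronaut riding a horse on the moon",
    callback_on_step_end=disable_cfg_callback,
    callback_on_step_end_tensor_inputs=["prompt_embeds", "add_text_embeds", "add_time_ids"],
).images[0]
```

Since the last ~25% of steps mostly refine fine detail, the missing guidance barely shows, and running them unbatched saves roughly an eighth of the total UNet compute.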
2. DeepCache
A huge increase in performance with a small drop in quality. There's an extension for A1111, but it works with neither ControlNet nor Forge: https://github.com/aria1th/sd-webui-deepcache-standalone. The upstream DeepCache package also hasn't been updated since December 2023. Its diffusers integration is sketched below.
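For reference, this is roughly how the upstream DeepCache package wraps a diffusers pipeline (following its README; the parameter values here are illustrative):

```python
import torch
from diffusers import StableDiffusionXLPipeline
from DeepCache import DeepCacheSDHelper

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# DeepCache caches the UNet's high-level (deep) features and reuses
# them, recomputing the full UNet only every `cache_interval` steps.
helper = DeepCacheSDHelper(pipe=pipe)
helper.set_params(cache_interval=3, cache_branch_id=0)
helper.enable()

image = pipe("a photo of an astronaut riding a horse on the moon").images[0]

helper.disable()  # restores the original UNet forward pass
```

A larger `cache_interval` trades more quality for more speed, which is why the quality drop stays small at the default settings.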
3. OneDiff
A much more flexible and faster `torch.compile` alternative that works with LoRAs etc. Good performance gains with no loss in quality. There's an extension for A1111, but it works with neither ControlNet nor Forge: https://github.com/siliconflow/onediff. A sketch of the basic diffusers usage follows.
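For context, the basic diffusers integration looks something like this (based on the onediff README; the import path may have changed in newer releases, so treat this as a sketch rather than the definitive API):

```python
import torch
from diffusers import StableDiffusionXLPipeline
from onediff.infer_compiler import oneflow_compile

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Compile the UNet with OneDiff's OneFlow backend. Unlike plain
# torch.compile, onediff advertises switching LoRAs on the compiled
# module without triggering a full recompile.
pipe.unet = oneflow_compile(pipe.unet)

# The first call is slow while compilation happens; subsequent calls
# run at the accelerated speed.
image = pipe("a photo of an astronaut riding a horse on the moon").images[0]
```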
Reply:

All great ideas (and a great optimization blog)! Sadly, Forge's lead developer seems to have abandoned the project. How about quantifying the projected speed gains? For example, publish a table of generation runtimes for a couple of test images side by side: Forge's times against Automatic1111's with these extensions enabled. That just might win back lllyasviel's attention.