That's currently not supported by the stable-diffusion.cpp backend, unfortunately. If you have enough system RAM, "Model CPU Offload" could be an alternative: it keeps the models loaded in RAM and moves only the currently active model to VRAM on demand, so peak VRAM usage drops to roughly the size of the largest single model.
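Conceptually, the pattern looks something like the minimal C++/CUDA sketch below. It is only an illustration of the idea, not stable-diffusion.cpp's actual implementation (which goes through ggml backends), and the names `Model`, `to_device`, `to_host`, and `run_stage` are made up:

```cpp
// Minimal sketch of the "Model CPU Offload" pattern, using the raw CUDA
// runtime API for clarity. Hypothetical names for illustration only; this
// is not how stable-diffusion.cpp is implemented internally.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

struct Model {
    std::vector<float> host_weights;  // always resident in system RAM
    float *dev_weights = nullptr;     // allocated only while this model is active
};

// Upload one model's weights to VRAM right before its stage runs.
static bool to_device(Model &m) {
    size_t bytes = m.host_weights.size() * sizeof(float);
    if (cudaMalloc((void **)&m.dev_weights, bytes) != cudaSuccess) return false;
    return cudaMemcpy(m.dev_weights, m.host_weights.data(), bytes,
                      cudaMemcpyHostToDevice) == cudaSuccess;
}

// Release the VRAM as soon as the stage is done; the RAM copy stays around.
static void to_host(Model &m) {
    cudaFree(m.dev_weights);
    m.dev_weights = nullptr;
}

int main() {
    Model clip, unet, vae;
    // ... load the checkpoints into host_weights (system RAM) here ...

    // Only one model's weights occupy VRAM at a time, so peak VRAM usage
    // is bounded by the largest single model rather than their sum.
    for (Model *m : {&clip, &unet, &vae}) {
        if (!to_device(*m)) { std::fprintf(stderr, "VRAM alloc failed\n"); return 1; }
        // run_stage(*m);  // hypothetical: run this model's part of the pipeline
        to_host(*m);
    }
    return 0;
}
```

The tradeoff is the PCIe transfer each time a model is swapped in, which is usually much cheaper than running that model on the CPU.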
Hi,
Thank you so much for all the work you do, and may this New Year be much better than the last one :)
I was wondering if it would be possible to allow picking a particular GPU for CLIP in Image Generation, instead of just offloading it to the CPU.
I have a dedicated computer just for image generation with two GPUs; if this were possible, I could use bigger quants for both the diffusion model and CLIP and gain a little bit of speed. It could be a dropdown like the one for text generation: CPU, GPU1, GPU2, GPU3, GPU4 - no "All" of course ;) (Maybe one day ;) )
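For example, something along these lines is what I have in mind on the backend side; every name here is made up, just to illustrate the idea:

```cpp
// Hypothetical sketch of a per-component device picker; none of these names
// exist in the project. It just mirrors the idea of one dropdown per model
// instead of a single "offload CLIP to CPU" toggle.
#include <map>
#include <string>

enum class Device { CPU, GPU1, GPU2, GPU3, GPU4 };

int main() {
    // One device choice per pipeline component, as the dropdowns would set it.
    std::map<std::string, Device> device_map = {
        {"diffusion", Device::GPU1},  // big quant on the larger card
        {"clip",      Device::GPU2},  // text encoder on the second card
        {"vae",       Device::GPU1},
    };
    (void)device_map;  // a real backend would route each model's tensors accordingly
    return 0;
}
```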
P.S.
Not related, but I am a huge fan of auto fit in the new Kobold. I have 4 GPUs for text generation with different amounts of VRAM, and it was a pain to manually find the optimal fit using just the Tensor Split option and eventually regex CPU offload, especially with big models that barely fit. Now it is so easy and so fast. Thank you so much for adding that option. <3