@ClashSAN Are you asking about only that model? Trying the optimizations mentioned in that long issue page, I have gained huge speed-ups. With my 4090, what was previously taking me minutes is now generating in seconds. I did a new build around a 4090, an i9 13000, and 64 GB DDR5. I was only getting 1-2 it/s, mostly around 1.05; now I'm hovering around 17.90. For example, generating a 50-step DDIM at 960x960 with batch size 8 (run on a custom model trained from the SD 2.1 model using Dreambooth, plus an embedding) was taking 6-7 minutes. Now it finishes in just under a minute.
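For a sanity check on those numbers, a simple steps-divided-by-throughput model (my own back-of-the-envelope sketch, not anything from the thread; it ignores model-load and VAE-decode overhead) matches the reported jump:

```python
# Rough wall-clock estimate from an iterations-per-second figure.
# The it/s values below are the ones quoted in the comment above.
def estimate_seconds(steps: int, it_per_s: float) -> float:
    """Time for one sampling run of `steps` iterations."""
    return steps / it_per_s

before = estimate_seconds(50, 1.05)   # ~47.6 s per 50-step run
after = estimate_seconds(50, 17.90)   # ~2.8 s per 50-step run
print(f"speedup: {before / after:.1f}x")  # ~17x
```

A ~17x per-iteration speedup is consistent with an 8-image batch dropping from 6-7 minutes to under a minute.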
-
Dear 3090/4090 users:
According to @C43H66N12O12S2 here, a month ago he was getting 28 it/s on a 4090.
I think he is busy, but I would really like to bring attention to the speed optimizations he discussed in a long issue page.
If you have a 4090, please try to replicate; the commit hash is probably 66d038f.
I'm not sure whether he is getting big gains from increasing the batch size.
Anyway, if we use an SD model fine-tuned at 256x256, with a DPM++ 2M sampler at 6-8 steps, how many images would we be able to make?
See if `--medvram` in addition to `--xformers` will increase the speed, since it should raise your maximum batch size. There are also some extra Windows tips found at the bottom here: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations. Just don't try this on a very bloated system, or your results may not be good.
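To try those two flags together, they go in the webui launch script's `COMMANDLINE_ARGS`. A minimal sketch, assuming the standard `webui-user.sh` / `webui-user.bat` layout of the AUTOMATIC1111 repo (the flags themselves are real webui options; adapt the file name to your setup):

```shell
# webui-user.sh excerpt (Linux/macOS): pass both flags to the launcher.
export COMMANDLINE_ARGS="--xformers --medvram"

# On Windows, the equivalent line in webui-user.bat would be:
#   set COMMANDLINE_ARGS=--xformers --medvram
```

Note that `--medvram` trades some per-iteration speed for lower VRAM use, which is what frees room for the larger batch sizes being discussed, so it is worth benchmarking both with and without it.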