-
3060 Ti (8 GB VRAM) + 32 GB RAM: generating 768x768 with the flux1-dev-bnb-nf4-v2 model takes about 40 seconds at 20 steps, and about 1 minute 10 seconds at 30 steps. But if I try to load the full flux1-dev model (~24 GB) together with ae.vae, clip_l, and t5xxl_fp16, generation becomes extremely slow, 10 minutes or more (about 4-5 minutes of background preparation and another 4-5 minutes of generation, and that is at the fairly low resolution of 768x768). Sometimes it never starts at all, getting stuck loading forever, throwing a VRAM error, or producing a black screen. What could be the reason?
Replies: 2 comments
-
Flux1D-NF4 = ~12 GB, with the CLIP encoders and VAE baked in.
Flux1D-Full + AE + ClipL + T5 = ~34 GB total.
If I had to guess, you're short on RAM, so the system is falling back to your storage drive's page file, which is quite a bit slower. I'd recommend trying one of the lower quants, such as Q4_1, along with the GGUF Q8 of the T5 encoder.
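The size gap between those options is just parameter count times bits per weight. A rough sketch of the arithmetic for a ~12B-parameter model like Flux.1-dev (the bits-per-weight figures are approximations for illustration, and ignore per-tensor overhead):

```python
# Back-of-the-envelope checkpoint sizes for a ~12B-parameter transformer
# at different quantization levels. Bit-widths include approximate
# per-block scale overhead and are assumptions, not exact format specs.

PARAMS = 12e9  # Flux.1-dev's transformer is roughly 12 billion parameters

quants = {
    "fp16": 16.0,  # full-precision checkpoint
    "Q8_0": 8.5,   # GGUF 8-bit (extra bits for block scales)
    "Q4_1": 5.0,   # GGUF 4-bit with scale + min per block
    "NF4": 4.5,    # bitsandbytes 4-bit normal-float (plus quant constants)
}

for name, bits in quants.items():
    gb = PARAMS * bits / 8 / 1024**3
    print(f"{name:>5}: ~{gb:.1f} GB for the transformer alone")
```

The fp16 transformer alone comes out above 20 GB before you add the T5 encoder, which is why the full stack overflows both 8 GB of VRAM and 32 GB of RAM, while NF4 and Q4_1 fit comfortably.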
-
Use the GGUF Q8 model with the T5 FP16 encoder, and lower your GPU Weights setting to 3600 MB; it's quick.
CivitAI GGUF Q4.1
HuggingFace GGUF T5_v1.1 Encoder