add support for Qwen Image Pruning #874
Conversation
Quality seems a little worse than the Lightning model, with ~30% less peak VRAM usage and similar speed gains.
@leejet, it looks like a123e25 from the qwen_image_edit branch is enough to support the '13b' pruned model. Thanks! The '12b' variant still doesn't work, though, maybe because its layers are non-contiguous: I guess they kept the non-pruned layers with the same indices they have in the original model.
@wbruna I think it might be a problem with the GGUF quant, not the model. Look at the GGUF: https://huggingface.co/wsbagnsv1/Qwen-Image-Pruning-GGUF/tree/main?show_file_info=Qwen-Image-Pruning-12b-Q4_0.gguf (transformer block 18, [screenshots of the tensor listing]). I've reported it to the GGUF quant author, but he doesn't seem to get what I mean: https://huggingface.co/wsbagnsv1/Qwen-Image-Pruning-GGUF/discussions/1
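If you want to check the file yourself, here is a minimal sketch using ggml's C GGUF API (gguf.h) that lists the tensors of a single block together with their quantization types. The `transformer_blocks.18.` name prefix is an assumption about this file's naming scheme; adjust it to whatever the HF file viewer shows.

```cpp
// Sketch: dump the tensors of one transformer block from a GGUF file,
// with their quantization types, to spot oddly-quantized entries.
// Uses ggml's GGUF API (gguf.h; part of ggml.h in older ggml versions).
#include <cstdio>
#include <cstring>
#include "ggml.h"
#include "gguf.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }
    gguf_init_params params = { /*no_alloc =*/ true, /*ctx =*/ nullptr };
    gguf_context * ctx = gguf_init_from_file(argv[1], params);
    if (!ctx) {
        fprintf(stderr, "failed to read %s\n", argv[1]);
        return 1;
    }
    const char * prefix = "transformer_blocks.18.";  // block under suspicion (assumed name)
    for (int64_t i = 0; i < gguf_get_n_tensors(ctx); i++) {
        const char * name = gguf_get_tensor_name(ctx, i);
        if (strncmp(name, prefix, strlen(prefix)) == 0) {
            printf("%-60s %s\n", name, ggml_type_name(gguf_get_tensor_type(ctx, i)));
        }
    }
    gguf_free(ctx);
    return 0;
}
```

Comparing the types printed for block 18 against a neighbouring block should make any mis-quantized entries obvious.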
What do you mean by "not working anymore"? Is it still generating an image? It seems to work for me.
(Force-pushed from b9d7b2b to 9bc2e3c.)
This PR still works; I was referring to my previous comment, which mentioned that a123e25 (from the qwen_image_edit branch) was enough to support the '13b' pruned model.
Support for a dynamic number of Qwen Image transformer blocks is available in the qwen_image_edit branch.
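As a rough illustration of what a dynamic block count involves (a sketch under assumed tensor names, not the branch's actual code), the loader can derive the number of transformer blocks from the tensor names instead of hardcoding it:

```cpp
// Sketch: derive the transformer block count at load time from the
// tensor names, assuming blocks are named "transformer_blocks.<idx>.<suffix>".
#include <cctype>
#include <set>
#include <string>

int count_transformer_blocks(const std::set<std::string> & tensor_names) {
    const std::string prefix = "transformer_blocks.";
    int max_index = -1;
    for (const std::string & name : tensor_names) {
        if (name.compare(0, prefix.size(), prefix) != 0) continue;
        if (name.size() <= prefix.size() || !isdigit((unsigned char) name[prefix.size()])) continue;
        size_t end = name.find('.', prefix.size());
        if (end == std::string::npos) continue;
        int idx = std::stoi(name.substr(prefix.size(), end - prefix.size()));
        if (idx > max_index) max_index = idx;
    }
    return max_index + 1;  // indices are 0-based; a pruned model may still skip some
}
```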
I tested it with the same model I use to test this PR (the '13b'; they renamed it afterwards).
Please use the latest code from PR #877.
Oh, I see: I forgot that rev was on the qwen_image_edit branch. @LostRuins, btw: if you're already syncing with that one, feel free to drop my changes; they shouldn't be necessary.
I noticed that this PR was automatically closed after the qwen_image branch was deleted.
Thanks. I was keeping this open just to investigate the '12b' variant further, but I could just open another one instead.
For #851. Allow the model loading logic to tolerate missing layers, which is enough to run the 12B Pruning variant:
https://huggingface.co/OPPOer/Qwen-Image-Pruning
Tested with the Q4_K_M quant from https://huggingface.co/wsbagnsv1/Qwen-Image-Pruning-GGUF.
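To make the mechanism concrete, here is a hedged sketch of the tolerate-missing-layers idea, assuming the pruned checkpoint keeps the original (possibly non-contiguous) block indices; the probe tensor name and helper are illustrative, not the PR's actual code:

```cpp
// Sketch: instead of failing on the first missing tensor, probe each
// expected block and build the list of blocks actually present, so a
// pruned model with non-contiguous block indices can still be run.
#include <cstdio>
#include <set>
#include <string>
#include <vector>

std::vector<int> present_blocks(const std::set<std::string> & tensor_names,
                                int n_blocks_original) {
    std::vector<int> present;
    for (int i = 0; i < n_blocks_original; i++) {
        // one representative tensor per block; the name is an assumption
        std::string probe = "transformer_blocks." + std::to_string(i) + ".attn.to_q.weight";
        if (tensor_names.count(probe)) {
            present.push_back(i);
        } else {
            fprintf(stderr, "transformer block %d not found, skipping (pruned?)\n", i);
        }
    }
    return present;
}
```

The forward pass would then iterate over the returned indices only, so the surviving blocks keep their original indices for weight lookup while the removed ones are simply skipped.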