add support for Qwen Image Pruning #1779
Conversation
Force-pushed fe14626 → 9c4ec67
I'm gonna hold off on merging this one until it's merged upstream. I'm a little bit skeptical of the layer-overwriting logic, and I don't know enough about how the pruned models work or whether it triggers any false positives in regular models.
    
Well, the only thing this does is find the last numbered layer in the file and set the number of layers to that value — basically dodging a "broken model file because it only has 40 layers" error. Nothing too technical: I tested changing the 60 to 40 first, and it worked; then I wrote a loop to find the last numbered layer from the model tensors themselves.
    
Force-pushed 9c4ec67 → 24cebc9
What if it actually is a broken model, though? Wouldn't it be safer to simply whitelist Qwen Image pruned? At a quick glance, it seems like we can match against a model that has type = VERSION_QWEN_IMAGE with tensor_count = 1285 and exactly 41 instead of 60 transformer block layers. Wouldn't that be safer?
    
I got the model; let me mess around with it and try.
    
          
It would most likely just fail in a different way. It could be mistaken for "good" if the tensor I'm checking happens to be the only one missing in one of the last layers, but that's unlikely, since it's one of the first in each layer.

40, actually (0 to 39). And the type is already validated at this point. But checking a tensor count instead of tensor names would be more prone to false positives. What would be more robust is checking for the correct tensor names in every layer: then the model would fail to load for any incomplete layer. I don't want to hard-code the number, because the point of pruning is removing layers from the model: they could release another variant with a few fewer, or a few more, and we'd need to whitelist those too.
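The per-layer tensor-name check suggested here could look something like the sketch below. The `transformer_blocks.N.<suffix>` pattern and the `required` suffixes are hypothetical placeholders, not the real tensor names.

```python
import re

# Illustrative pattern: "<layer index>.<rest of tensor name>".
BLOCK_RE = re.compile(r"transformer_blocks\.(\d+)\.(.+)$")

def find_incomplete_layers(tensor_names, num_layers,
                           required=("attn.to_q.weight",)):
    # Map each layer index to the set of tensor-name suffixes seen for it.
    present = {}
    for name in tensor_names:
        m = BLOCK_RE.search(name)
        if m:
            present.setdefault(int(m.group(1)), set()).add(m.group(2))
    # A layer is incomplete if any required tensor is missing from it.
    return [i for i in range(num_layers)
            if not set(required) <= present.get(i, set())]

# Layer 2 is missing entirely, so it should be flagged.
names = [f"transformer_blocks.{i}.attn.to_q.weight" for i in (0, 1, 3)]
```

With a check like this, a model with any hole in its layers refuses to load, regardless of how many layers it declares.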
    
Force-pushed 24cebc9 → 4fbfee6
Meh, the '12b' variant doesn't have contiguous layers 😕 That would need bigger code changes. But I saw some people complaining that they got errors trying to load it on other backends, so maybe the model really is kind of broken. The older '13b' works fine, though; I've been using it for debugging for the last few days.
    
Anyway, I'll merge this first... though I think I will add a check to match exactly 40 or 41 layers before it overwrites the layer count. Later, if the 12B model gets fixed, it will have 41 layers.
    
bc09f34 works fine with the 13b.
    

From leejet/stable-diffusion.cpp#874.