          lora : improve compat with mergekit-extract-lora
          #11131
        
          
     Merged
      
        
      
            ngxson
  merged 8 commits into
  ggml-org:master
from
ngxson:xsn/mergekit_extract_lora_compat
  
      
      
   
  Jan 8, 2025 
      
    
                
              
    
  
  
    
    
            
compilade reviewed on Jan 8, 2025
                
            
            
          
          
I still can't figure out why the pyright CI failed. I made no changes to the reported files. Do you have any idea, @compilade?
Edit: never mind, there is a problem with the upstream safetensors package.
            
ggerganov approved these changes on Jan 8, 2025
                
            
            
          
          
            
compilade approved these changes on Jan 8, 2025
                
            
            
          
          
    
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request on Feb 13, 2025:
* (wip) support mergekit-extracted lora
* support mergekit-extract-lora
* use lora->get_scale
* correct comment
* correct norm name & condition
* add some hints
    
arthw pushed a commit to arthw/llama.cpp that referenced this pull request on Feb 26, 2025:
* (wip) support mergekit-extracted lora
* support mergekit-extract-lora
* use lora->get_scale
* correct comment
* correct norm name & condition
* add some hints
    
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request on Mar 8, 2025:
* (wip) support mergekit-extracted lora
* support mergekit-extract-lora
* use lora->get_scale
* correct comment
* correct norm name & condition
* add some hints
  
  
    
  
    
Motivation
A while ago, I released GGUF-my-LoRA, which aims to provide a better playground for users to make even more LoRA adapters.
However, I soon realized that most users (who have GPU power) still prefer to fine-tune the whole model instead of making a LoRA adapter. For example, mradermacher has a huge collection of fine-tuned models. Some reasons for which SFT is preferred are:
That made me think: can we use mergekit-extract-lora to convert a fine-tuned model to a LoRA adapter, then use it in llama.cpp?
An adapter weighs just a fraction of the whole model. Even with a small quality degradation, that's still a bargain!
Idea
mergekit-extract-lora produces a LoRA adapter by doing matrix decomposition. In the end, it leaves us with an adapter that includes both norm vectors and token_embd, neither of which we currently support.
Implementation
I made changes to convert_lora_to_gguf.py to keep these tensors in the output GGUF.
On the llama.cpp side, I added support for token_embd.
NOTE: norm is present in the GGUF, but is not used for now. Adding this should be trivial, but it requires modifying all the build_* functions, which would take me a lot of time, so I decided not to do it now. Also, even without it, most adapters that I tested still work fine.
Demo
To make an adapter, install mergekit and run mergekit-extract-lora, for example:
(Note: you can skip this step and download one of the pre-converted adapters that I made here: https://huggingface.co/collections/ngxson/extracted-lora-mergekit-677d5c3eea0b6a7661201846)
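For intuition about what the extraction step does, here is a minimal NumPy sketch of the "matrix decomposition" idea. It assumes a truncated SVD of the difference between the fine-tuned and base weights, which is one common way to obtain low-rank factors. The function name extract_lora, the shapes, and the rank are hypothetical and chosen only for illustration; this is not mergekit's actual code.

```python
import numpy as np

def extract_lora(w_base, w_tuned, rank):
    """Hypothetical helper: factor (w_tuned - w_base) into two low-rank matrices.

    Returns (lora_b, lora_a) with shapes (n_out, rank) and (rank, n_in),
    so that lora_b @ lora_a approximates the weight difference.
    """
    delta = w_tuned - w_base
    # Thin SVD: delta = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    lora_b = u[:, :rank] * s[:rank]  # scale each kept column by its singular value
    lora_a = vt[:rank, :]
    return lora_b, lora_a

# Toy data: a "fine-tuned" weight that differs from the base by a rank-4 update.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 48))
true_delta = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 48))
w_tuned = w_base + true_delta

lora_b, lora_a = extract_lora(w_base, w_tuned, rank=4)

# Largest reconstruction error of base + adapter against the fine-tuned weight.
err = np.abs(w_base + lora_b @ lora_a - w_tuned).max()
# Adapter size relative to the full matrix.
fraction = (lora_b.size + lora_a.size) / w_tuned.size
print(f"max abs error: {err:.2e}, adapter/full size: {fraction:.3f}")
```

In this toy case the update really is low-rank, so the reconstruction is essentially exact. For a real fine-tune the difference is only approximately low-rank, which is where the small quality degradation mentioned above would come from.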
Then, convert it to GGUF:
Now use it:
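As a rough picture of what token_embd support means once an adapter is loaded: a low-rank update to the embedding table can be applied at lookup time, without ever building the merged table. The NumPy sketch below is an assumption about the general technique, not a copy of the llama.cpp code; the names embed_with_lora, lora_a, lora_b, and scale, and the tensor layouts, are hypothetical.

```python
import numpy as np

def embed_with_lora(tok_embd, lora_a, lora_b, tokens, scale=1.0):
    """Hypothetical helper: embedding lookup with a low-rank adapter applied.

    tok_embd: (n_vocab, n_embd) base embedding table
    lora_a:   (n_vocab, rank)   per-token low-rank factor
    lora_b:   (rank, n_embd)    shared projection back to embedding size
    tokens:   (n_tokens,)       integer token ids
    """
    base = tok_embd[tokens]           # plain row lookup
    delta = lora_a[tokens] @ lora_b   # low-rank correction for the selected rows only
    return base + scale * delta

rng = np.random.default_rng(0)
n_vocab, n_embd, rank = 100, 16, 4
tok_embd = rng.standard_normal((n_vocab, n_embd))
lora_a = rng.standard_normal((n_vocab, rank))
lora_b = rng.standard_normal((rank, n_embd))
tokens = np.array([5, 42, 7])
scale = 0.5

out = embed_with_lora(tok_embd, lora_a, lora_b, tokens, scale)

# Reference: merge the adapter into the full table first, then look up.
merged = tok_embd + scale * (lora_a @ lora_b)
print(np.allclose(out, merged[tokens]))
```

Selecting rows before the matrix product keeps the extra work proportional to the number of tokens in the batch rather than to the vocabulary size.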