- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.4k
          llama : add llama_vocab, functions -> methods, naming
          #11110
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
287e8c2    to
    4d27597      
    Compare
  
    | I'm stumped by this error in the CI: https://github.com/ggerganov/llama.cpp/actions/runs/12654124026/job/35261257507?pr=11110#step:8:355 Not sure what is causing it. @slaren Do you have any guess what could be the issue? | 
| Looks like a compiler bug, but I cannot reproduce it locally. | 
| I think it's safe to assume that this is a compiler bug. It only happens with clang on Windows, and only on release builds. Changing the generator from "ninja multi-config" to "ninja" fixes it for the arm build, but not with hip or sycl. The destructors that it complains about seem to be the maps in  | 
| Thanks! | 
b7f2c02    to
    be9a25f      
    Compare
  
    403dee8    to
    aefcffa      
    Compare
  
    c89e808    to
    bfe781a      
    Compare
  
    1d1f264    to
    a857dc5      
    Compare
  
    ggml-ci Co-authored-by: Diego Devesa <[email protected]>
| This should be ready to merge. These are 5 PRs refactoring  This set of PRs, together with #11167 and #11174 all introduce API changes, so it would make sense to merge them together to avoid multiple breaking changes. After this is merged, I will attempt a similar refactoring for  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I think it is very good to settle on C++ OOP style rather than the current mix of C and C++. Some suggestions for further refactoring:
- Move the code from llama_model_load_from_fileto thellama_modelconstructor
- Move llama_vocab::loadto the constructor
- structwith private members is not very common, might be better to use- class
- I don't see the point of declaring empty destructors, they can be removed or explicitly set to default. Very few classes need a destructor.
- At some point we should abstract eveything needed to model an architecture to a single class (such that each architecture is a subclass of this class)
- After that, llm_typeshould probably be removed entirely, and each architecture should have its own enum if needed, with a function to return the type as a string (which by default could be"<arch> <params>")
* llama : update API names to use correct prefix ggml-ci * cont ggml-ci * cont ggml-ci * minor [no ci]
d8ded9f    to
    6540935      
    Compare
  
    llama_vocab, functions -> methods, naming
      | LLAMA_LOG_WARN( | ||
| "%s: Added a BOS token to the prompt as specified by the model but the prompt " | ||
| "also starts with a BOS token. So now the final prompt starts with 2 BOS tokens. " | ||
| "Are you sure this is what you want?\n", __FUNCTION__); | ||
| } | ||
| if (vocab.tokenizer_add_eos && output.size() >= 2 && *(output.end()-2) == vocab.special_eos_id) { | ||
| if (vocab.get_add_bos() && output.size() >= 2 && *(output.end()-2) == vocab.token_eos()) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vocab.get_add_bos()
probably a typo @ggerganov ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok - it means "get the add_bos flag". The "get" is necessary to disambiguate from the action of adding a BOS vocab.add_bos().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ggerganov  what I mean is that you are checking for eos, but using vocab.get_add_bos(). Should it not be vocab.get_add_eos() instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes. Thanks for spotting - fixing.
* llama : functions -> methods (ggml-org#11110) * llama : add struct llama_vocab to the API (ggml-org#11156) ggml-ci * hparams : move vocab params to llama_vocab (ggml-org#11159) ggml-ci * vocab : more pimpl (ggml-org#11165) ggml-ci * vocab : minor tokenization optimizations (ggml-org#11160) ggml-ci Co-authored-by: Diego Devesa <[email protected]> * lora : update API names (ggml-org#11167) ggml-ci * llama : update API names to use correct prefix (ggml-org#11174) * llama : update API names to use correct prefix ggml-ci * cont ggml-ci * cont ggml-ci * minor [no ci] * vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (ggml-org#11174) ggml-ci * vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (ggml-org#11174) ggml-ci --------- Co-authored-by: Diego Devesa <[email protected]>
* llama : functions -> methods (ggml-org#11110) * llama : add struct llama_vocab to the API (ggml-org#11156) ggml-ci * hparams : move vocab params to llama_vocab (ggml-org#11159) ggml-ci * vocab : more pimpl (ggml-org#11165) ggml-ci * vocab : minor tokenization optimizations (ggml-org#11160) ggml-ci Co-authored-by: Diego Devesa <[email protected]> * lora : update API names (ggml-org#11167) ggml-ci * llama : update API names to use correct prefix (ggml-org#11174) * llama : update API names to use correct prefix ggml-ci * cont ggml-ci * cont ggml-ci * minor [no ci] * vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (ggml-org#11174) ggml-ci * vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (ggml-org#11174) ggml-ci --------- Co-authored-by: Diego Devesa <[email protected]>
* llama : functions -> methods (ggml-org#11110) * llama : add struct llama_vocab to the API (ggml-org#11156) ggml-ci * hparams : move vocab params to llama_vocab (ggml-org#11159) ggml-ci * vocab : more pimpl (ggml-org#11165) ggml-ci * vocab : minor tokenization optimizations (ggml-org#11160) ggml-ci Co-authored-by: Diego Devesa <[email protected]> * lora : update API names (ggml-org#11167) ggml-ci * llama : update API names to use correct prefix (ggml-org#11174) * llama : update API names to use correct prefix ggml-ci * cont ggml-ci * cont ggml-ci * minor [no ci] * vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (ggml-org#11174) ggml-ci * vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (ggml-org#11174) ggml-ci --------- Co-authored-by: Diego Devesa <[email protected]>
This PR refactors
struct llama_modelandstruct llama_vocabrelated functionality. Moved the tensor data loading tosrc/llama-model.cpp. Thesrc/llama.cppnow contains primarily graph build logic.struct llama_vocabis now public in thellamaAPI and the respective calls use it instead ofstruct llama_model. Improved naming consistency in the public API.Sub-PRs
struct llama_vocabto the API #11156API changes
Multiple naming changes in the
llama_contextandllama_modelAPI. Old names have been deprecated.Add
struct llama_vocabto the APIllama_n_vocab()now acceptsstruct llama_vocabinstead ofstruct llama_modelllama_sampler_init_dry()now acceptsstruct llama_vocabinstead ofstruct llama_modelThe tokenization API now accepts
struct llama_vocabinstead ofstruct llama_modelThe sampler API now accepts
struct llama_vocabinstead ofstruct llama_modelUpdate API names for improved consistency:
Adapter API
Multiple naming changes in this API. Skipped the deprecation phase.
struct llama_control_vector->struct llama_adapter_cvecstruct llama_lora_adapter->struct llama_adapter_lorallama_lora_adapter_[verb](ctx, ...)->llama_[verb]_adapter_lora(ctx, ...)Migration instructions
Adapting user code to the changes is fairly straightforward:
llama_model_get_vocab(model)where the old API requiredllama_modeland the new API requiresllama_vocab