llama : add llama_vocab, functions -> methods, naming #11110
Conversation
I'm stumped by this error in the CI: https://github.com/ggerganov/llama.cpp/actions/runs/12654124026/job/35261257507?pr=11110#step:8:355 Not sure what is causing it. @slaren Do you have any guess what could be the issue?
Looks like a compiler bug, but I cannot reproduce it locally.
I think it's safe to assume that this is a compiler bug. It only happens with clang on Windows, and only on release builds. Changing the generator from "ninja multi-config" to "ninja" fixes it for the arm build, but not with hip or sycl. The destructors that it complains about seem to be the maps in
Thanks!
This should be ready to merge. These are 5 PRs refactoring `struct llama_model` and `struct llama_vocab`. This set of PRs, together with #11167 and #11174, all introduce API changes, so it would make sense to merge them together to avoid multiple breaking changes. After this is merged, I will attempt a similar refactoring for
Looks good, I think it is very good to settle on C++ OOP style rather than the current mix of C and C++. Some suggestions for further refactoring:

- Move the code from `llama_model_load_from_file` to the `llama_model` constructor
- Move `llama_vocab::load` to the constructor
- `struct` with private members is not very common, might be better to use `class`
- I don't see the point of declaring empty destructors, they can be removed or explicitly set to `default`. Very few classes need a destructor.
- At some point we should abstract everything needed to model an architecture into a single class, such that each architecture is a subclass of it (see the sketch after this list)
- After that, `llm_type` should probably be removed entirely, and each architecture should have its own enum if needed, with a function to return the type as a string (which by default could be `"<arch> <params>"`)
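A minimal sketch of what that per-architecture abstraction could look like; all names here (`llm_arch_base`, `llm_arch_llama`, `build_graph`) are hypothetical and not part of the current code:

```cpp
// Hypothetical sketch of the per-architecture abstraction suggested above.
// None of these names exist in llama.cpp today.
#include <string>

class llm_arch_base {
public:
    virtual ~llm_arch_base() = default;

    // each architecture would build its own compute graph here
    virtual void build_graph(/* build context */) = 0;

    // default type string: "<arch> <params>", overridable per architecture
    virtual std::string type_str() const { return arch_name() + " " + params_str(); }

protected:
    virtual std::string arch_name()  const = 0;
    virtual std::string params_str() const = 0;
};

class llm_arch_llama : public llm_arch_base {
public:
    void build_graph(/* build context */) override {
        // llama-specific graph construction would go here
    }

protected:
    std::string arch_name()  const override { return "llama"; }
    std::string params_str() const override { return "8B"; }  // illustrative only
};
```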
```diff
         LLAMA_LOG_WARN(
             "%s: Added a BOS token to the prompt as specified by the model but the prompt "
             "also starts with a BOS token. So now the final prompt starts with 2 BOS tokens. "
             "Are you sure this is what you want?\n", __FUNCTION__);
     }
-    if (vocab.tokenizer_add_eos && output.size() >= 2 && *(output.end()-2) == vocab.special_eos_id) {
+    if (vocab.get_add_bos() && output.size() >= 2 && *(output.end()-2) == vocab.token_eos()) {
```
`vocab.get_add_bos()` - probably a typo @ggerganov?
It's ok - it means "get the `add_bos` flag". The "get" is necessary to disambiguate from the action of adding a BOS, `vocab.add_bos()`.
@ggerganov what I mean is that you are checking for EOS, but using `vocab.get_add_bos()`. Should it not be `vocab.get_add_eos()` instead?
Oh yes. Thanks for spotting - fixing.
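Based on the exchange above, the fix presumably swaps the flag in the EOS branch:

```diff
-    if (vocab.get_add_bos() && output.size() >= 2 && *(output.end()-2) == vocab.token_eos()) {
+    if (vocab.get_add_eos() && output.size() >= 2 && *(output.end()-2) == vocab.token_eos()) {
```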
* llama : functions -> methods (ggml-org#11110)
* llama : add struct llama_vocab to the API (ggml-org#11156)
  ggml-ci
* hparams : move vocab params to llama_vocab (ggml-org#11159)
  ggml-ci
* vocab : more pimpl (ggml-org#11165)
  ggml-ci
* vocab : minor tokenization optimizations (ggml-org#11160)
  ggml-ci
  Co-authored-by: Diego Devesa <[email protected]>
* lora : update API names (ggml-org#11167)
  ggml-ci
* llama : update API names to use correct prefix (ggml-org#11174)
  * llama : update API names to use correct prefix
    ggml-ci
  * cont
    ggml-ci
  * cont
    ggml-ci
  * minor [no ci]
* vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (ggml-org#11174)
  ggml-ci
* vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (ggml-org#11174)
  ggml-ci
---------
Co-authored-by: Diego Devesa <[email protected]>
This PR refactors `struct llama_model` and `struct llama_vocab` related functionality:

- Moved the tensor data loading to `src/llama-model.cpp`. `src/llama.cpp` now contains primarily graph build logic.
- `struct llama_vocab` is now public in the `llama` API and the respective calls use it instead of `struct llama_model`.
- Improved naming consistency in the public API.

Sub-PRs

- llama : add `struct llama_vocab` to the API #11156
- hparams : move vocab params to `llama_vocab` #11159
- vocab : minor tokenization optimizations #11160
- vocab : more pimpl #11165

API changes
- Multiple naming changes in the `llama_context` and `llama_model` API. Old names have been deprecated.
- Add `struct llama_vocab` to the API.
- `llama_n_vocab()` now accepts `struct llama_vocab` instead of `struct llama_model`.
- `llama_sampler_init_dry()` now accepts `struct llama_vocab` instead of `struct llama_model`.
- The tokenization API now accepts `struct llama_vocab` instead of `struct llama_model`.
- The sampler API now accepts `struct llama_vocab` instead of `struct llama_model`.
- Update API names for improved consistency.
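For example, querying the vocab size under the changed API (a minimal sketch; the `vocab_size` helper is hypothetical):

```cpp
#include "llama.h"

// Sketch: llama_n_vocab() now takes the vocab, which is obtained from the model.
int32_t vocab_size(const llama_model * model) {
    const llama_vocab * vocab = llama_model_get_vocab(model);
    return llama_n_vocab(vocab);  // previously: llama_n_vocab(model)
}
```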
Adapter API

Multiple naming changes in this API. Skipped the deprecation phase.

- `struct llama_control_vector` -> `struct llama_adapter_cvec`
- `struct llama_lora_adapter` -> `struct llama_adapter_lora`
- `llama_lora_adapter_[verb](ctx, ...)` -> `llama_[verb]_adapter_lora(ctx, ...)`
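Applied to the `[verb]` pattern above, a call site would change roughly like this (a sketch with `set` and `rm` as example verbs; the `apply_lora` helper and the scale value are illustrative, and exact new names should be checked against the header):

```cpp
#include "llama.h"

// Sketch of the adapter rename pattern.
void apply_lora(llama_context * ctx, llama_adapter_lora * adapter) {
    // old: llama_lora_adapter_set(ctx, adapter, 1.0f);
    llama_set_adapter_lora(ctx, adapter, 1.0f);

    // old: llama_lora_adapter_remove(ctx, adapter);
    llama_rm_adapter_lora(ctx, adapter);
}
```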
Migration instructions

Adapting user code to the changes is fairly straightforward:

- Use `llama_model_get_vocab(model)` to obtain the `llama_vocab` where the old API required `llama_model` and the new API requires `llama_vocab`.
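For example, tokenizing with the new API (a minimal sketch using the post-refactor names; `model.gguf` is a placeholder path and error handling is omitted):

```cpp
#include "llama.h"
#include <vector>

int main() {
    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_model_load_from_file("model.gguf", mparams);

    // new: fetch the vocab once, then pass it where llama_model used to go
    const llama_vocab * vocab = llama_model_get_vocab(model);

    const char * text = "Hello";
    std::vector<llama_token> tokens(64);
    // previously: llama_tokenize(model, ...)
    const int n = llama_tokenize(vocab, text, /*text_len*/ 5,
                                 tokens.data(), (int32_t) tokens.size(),
                                 /*add_special*/ true, /*parse_special*/ false);

    llama_model_free(model);
    return n < 0 ? 1 : 0;
}
```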