Add support for Ling v2 #16028
Conversation
Thank you for the effort, but I already have a working version and will submit a PR soon. Unfortunately, I can tell that this PR is non-working.
Here are my test results. It runs perfectly. Command: llama-cli -m ./Ling-mini-2.0-Q4_K_M.gguf --temp 0.7
llama-cli logs:
Q & A (Ling-mini 2.0)
Speed test results (136.17 tokens per second with Q4_K_M quantization on Apple M4 Pro):
Sorry, but no, there are numerous issues with this implementation; I'll name just a few:
... You do realize you're talking to someone who works at/with Inclusion about their own model arch, right? No need for such needless passive aggression anyway, when everyone is trying to help :'(
Yes, and it is not passive aggression, simply stating facts.
I understand. I've now fixed issue 2 about "It needlessly splits and permutes Q/K/V".
You should not split them either, it's beneficial to have QKV fused.
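For readers following along: the point about fused QKV is that a single matmul over a combined attn_qkv weight replaces three separate projections, and Q/K/V are then taken as views into the result. Below is a minimal sketch of that pattern in the style other llama.cpp architectures use; the standalone helper and parameter names are illustrative, not this PR's actual code.

```cpp
#include "ggml.h"

// Sketch: build Q/K/V from one fused projection instead of three matmuls.
// Assumes F32 activations; the rows of wqkv are laid out as [Q | K | V].
static void build_fused_qkv(
        struct ggml_context * ctx0,
        struct ggml_tensor  * wqkv,   // fused weight: (n_embd) x (n_embd + 2*n_embd_gqa)
        struct ggml_tensor  * inp,    // layer input:  (n_embd) x (n_tokens)
        int64_t n_embd, int64_t n_embd_gqa, int64_t n_tokens,
        struct ggml_tensor ** Qcur,
        struct ggml_tensor ** Kcur,
        struct ggml_tensor ** Vcur) {
    // one matmul for all three projections: fewer graph nodes, better kernel utilization
    struct ggml_tensor * cur = ggml_mul_mat(ctx0, wqkv, inp);

    // Q/K/V are strided views into the fused result; ggml_cont materializes them
    *Qcur = ggml_cont(ctx0, ggml_view_2d(ctx0, cur, n_embd,     n_tokens, cur->nb[1], 0));
    *Kcur = ggml_cont(ctx0, ggml_view_2d(ctx0, cur, n_embd_gqa, n_tokens, cur->nb[1], sizeof(float) *  n_embd));
    *Vcur = ggml_cont(ctx0, ggml_view_2d(ctx0, cur, n_embd_gqa, n_tokens, cur->nb[1], sizeof(float) * (n_embd + n_embd_gqa)));
}
```

The views cost nothing until ggml_cont, so keeping the weight fused avoids both the extra matmuls and any split-and-permute work at graph-build time.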
When will this PR be merged? I want to deploy GGUF-format Ling models on my macOS. :)
|| t.first == "_<EOT>"
|| t.first == "<|end_of_text|>"
|| t.first == "<end_of_utterance>" // smoldocling
|| t.first == "<|role_end|>" // Ling v2
Just wondering why this was added? It's set as eos_token and special in the tokenizer, so this should not be necessary.
Hi, it's just because the llama-cli log told me "special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect".
Right, it's fine though, at that point it is added as EOG, so not an issue. :)
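For reference, the fallback being discussed looks roughly like this (a paraphrased sketch, not the exact llama-vocab.cpp source; the helper name and the logging call are illustrative):

```cpp
#include <cstdint>
#include <cstdio>
#include <set>

typedef int32_t llama_token;                    // mirrors llama.h
static const llama_token LLAMA_TOKEN_NULL = -1; // mirrors llama.h

// If the tokenizer declares an EOS token that is missing from the
// end-of-generation set, add it and warn -- this is the log line quoted above.
static void ensure_eos_in_eog(llama_token special_eos_id,
                              std::set<llama_token> & special_eog_ids) {
    if (special_eos_id != LLAMA_TOKEN_NULL && special_eog_ids.count(special_eos_id) == 0) {
        special_eog_ids.insert(special_eos_id);
        fprintf(stderr, "special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect\n");
    }
}
```

In other words, by that point the EOS token has already been folded into the EOG set, so the extra string match on <|role_end|> is redundant but harmless.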
Ok. Thank you.
Add support for Ling v2
Related issues: #15968
GitHub: https://github.com/inclusionAI/Ling-V2
Hugging Face: https://huggingface.co/collections/inclusionAI/ling-v2-68bf1dd2fc34c306c1fa6f86
ModelScope: https://modelscope.cn/collections/Ling-V2-01d8988fbf864d