model : add support for apple/DiffuCoder-7B-cpGRPO #14502
    # Read merges.txt
    merges_file = self.dir_model / "merges.txt"
    merges = []
    with open(merges_file, 'r', encoding='utf-8') as f:
        for i, line in enumerate(f):
            line = line.strip()
            if i == 0 and line.startswith("#version:"):
                continue
            if not line:
                continue
            merges.append(line)
Why is this needed? SpecialVocab should be able to handle this.
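
For reference, the hand-rolled reader in the diff above boils down to the standalone sketch below. The function name `read_merges` is hypothetical and is not part of the PR or of llama.cpp. It only illustrates what the snippet does, so it is easier to compare against what `SpecialVocab` already does: the converter normally builds it with `gguf.SpecialVocab(self.dir_model, load_merges=True)` and writes it out with `add_to_gguf(self.gguf_writer)`.

```python
from pathlib import Path
from typing import List


def read_merges(merges_file: Path) -> List[str]:
    """Return the BPE merge rules from a merges.txt file.

    Hypothetical helper mirroring the logic in the diff: skip a leading
    "#version:" header and any blank lines, keep everything else verbatim.
    """
    merges: List[str] = []
    with open(merges_file, "r", encoding="utf-8") as f:
        for i, line in enumerate(f):
            line = line.strip()
            # Only the very first line is treated as a possible version header.
            if i == 0 and line.startswith("#version:"):
                continue
            # Blank lines carry no merge rule.
            if not line:
                continue
            merges.append(line)
    return merges
```

If `SpecialVocab` picks up the merges for this model on its own, a helper like this is redundant, which is the point of the question.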
            layer.ffn_up_exps = create_tensor(tn(LLM_TENSOR_FFN_UP_EXPS, "weight", i), { n_embd, n_ff_exp, n_expert}, 0);
        }
    } break;
    case LLM_ARCH_DREAM:
This looks (unsurprisingly) identical to Qwen2, I don't think it deserves the duplication.

I'm also really puzzled that it's just a finetuned Qwen2.5 model while at the same time being based on Dream, which claims to be a diffusion model?!

According to the paper (PDF), they started from Qwen, but it's not just a finetune.

Yeah, I was up too late and skimmed through the DiffuCoder paper too fast. The Hugging Face model tree and section 3 of the paper made it seem like a normal Qwen2.5 finetune. I'll close this request for now.

This pull request adds initial support for Apple's new DiffuCoder model.
I'll be uploading the F16 gguf to https://huggingface.co/gabriellarson/DiffuCoder-7B-cpGRPO-GGUF