am17an
Collaborator

@am17an am17an commented Sep 15, 2025

Add support for https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct, an MoE diffusion model similar to OLMoE (except for the QK norm). Added two GGUFs: bf16 and q8_0.

Example command: `./llama-diffusion-cli -m llada-moe-7B-instruct-BF16.gguf -p "Write code to train MNIST in pytorch" -ngl 99 --diffusion-block-length 32 --diffusion-steps 256 -ub 256 --diffusion-algorithm 4 -fa 0 --temp 0 -sys "You are a helpful AI assistant"`

@am17an am17an requested a review from CISC September 15, 2025 08:50
@github-actions github-actions bot added examples python python script changes labels Sep 15, 2025
@am17an am17an merged commit 6d75883 into ggml-org:master Sep 16, 2025
51 of 52 checks passed
@am17an am17an deleted the llada_moe branch September 16, 2025 02:39
@CISC
Collaborator

CISC commented Oct 18, 2025

@am17an There's a preview of LLaDa2Moe available, and it uses the same expert group selection as in BailingMoeV2, so I made it generally available in 6dd223b in case you want to take a stab at it later. :)

@am17an
Collaborator Author

am17an commented Oct 19, 2025

@CISC Thanks! I saw that. From what I can see, it looks like their sampling doesn't change, so it should be straightforward to add this model. I will wait for them to release the full version first, though.
