Add LLaDA-7b-MoE diffusion model #16003

am17an · 2025-09-15T08:50:24Z

Adds support for https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct, an MoE diffusion model similar to OLMoE (except for the QK norm). Added two GGUFs: bf16 and q8_0.

Example command: ./llama-diffusion-cli -m llada-moe-7B-instruct-BF16.gguf -p "Write code to train MNIST in pytorch" -ngl 99 --diffusion-block-length 32 --diffusion-steps 256 -ub 256 --diffusion-algorithm 4 -fa 0 --temp 0 -sys "You are a helpful AI assistant"
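For scripted runs, the same invocation can be assembled as an argument list so that the prompt and system prompt are quoted safely. This is only a sketch: the flags and values are copied from the example command above, while the binary location and model path are assumptions about the local build tree.

```python
import shlex

# Flags and values mirror the example command in this PR description.
# The binary and model paths are assumptions; adjust them to your setup.
argv = [
    "./llama-diffusion-cli",
    "-m", "llada-moe-7B-instruct-BF16.gguf",
    "-p", "Write code to train MNIST in pytorch",
    "-ngl", "99",
    "--diffusion-block-length", "32",
    "--diffusion-steps", "256",
    "-ub", "256",
    "--diffusion-algorithm", "4",
    "-fa", "0",
    "--temp", "0",
    "-sys", "You are a helpful AI assistant",
]

# shlex.join quotes arguments that contain spaces, so the printed line
# can be pasted into a POSIX shell as-is.
command = shlex.join(argv)
print(command)
```

Passing `argv` directly to `subprocess.run(argv, check=True)` avoids shell quoting altogether.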

convert_hf_to_gguf.py

gguf-py/gguf/constants.py

src/llama-model.cpp

src/llama-vocab.cpp

CISC · 2025-10-18T14:35:05Z

@am17an There's a preview of LLaDa2Moe available, and it uses the same expert group selection as in BailingMoeV2, so I made it generally available in 6dd223b in case you want to take a stab at it later. :)

am17an · 2025-10-19T13:32:24Z

@CISC Thanks! I saw that. From what I can see, their sampling doesn't change, so it should be straightforward to add this model. I will wait for them to release the full version first, though.
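For readers unfamiliar with the expert group selection mentioned in the exchange above, here is a minimal sketch of group-limited top-k routing. It is not taken from commit 6dd223b or from the BailingMoeV2 code: the function name is hypothetical, and the rule of scoring each group by the sum of its two best experts is an assumption borrowed from one common formulation of this technique.

```python
def group_limited_topk(scores, n_group, topk_group, top_k):
    """Pick top_k expert indices, restricted to the topk_group best groups.

    Hypothetical helper for illustration; not the llama.cpp implementation.
    """
    n_expert = len(scores)
    assert n_expert % n_group == 0, "experts must split evenly into groups"
    per_group = n_expert // n_group

    # Score each group by the sum of its two best experts (assumed rule).
    group_scores = []
    for g in range(n_group):
        members = sorted(scores[g * per_group:(g + 1) * per_group], reverse=True)
        group_scores.append(sum(members[:2]))

    # Keep only the highest-scoring groups.
    ranked_groups = sorted(range(n_group), key=lambda g: group_scores[g], reverse=True)
    kept = set(ranked_groups[:topk_group])

    # Among experts in the kept groups, take the top_k by router score.
    candidates = [e for e in range(n_expert) if e // per_group in kept]
    chosen = sorted(candidates, key=lambda e: scores[e], reverse=True)[:top_k]
    return sorted(chosen)


# Eight experts in four groups of two; made-up router scores.
scores = [0.9, 0.1, 0.2, 0.3, 0.5, 0.6, 0.8, 0.0]
picked = group_limited_topk(scores, n_group=4, topk_group=2, top_k=2)
print(picked)
```

In this toy input, expert 6 has the second-highest score overall but is never a candidate, because its group ranks below the two groups that are kept. That is the behavioural difference from plain top-k routing.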

Add LLaDA-7b-MoE diffusion model

2c2d1e6

am17an requested a review from CISC September 15, 2025 08:50

github-actions bot added the examples and python (python script changes) labels Sep 15, 2025

Add convert_hf_to_gguf_update change

347b769

CISC approved these changes Sep 15, 2025

am17an force-pushed the llada_moe branch from f109414 to 2f3ab34 Compare September 15, 2025 13:58

Address review comments

bcf81ed

am17an force-pushed the llada_moe branch from 2f3ab34 to bcf81ed Compare September 15, 2025 14:13

am17an merged commit 6d75883 into ggml-org:master Sep 16, 2025
51 of 52 checks passed

am17an deleted the llada_moe branch September 16, 2025 02:39

tomasmcm mentioned this pull request Sep 24, 2025

Support LLaDA-MoE-7B-A1B lmstudio-ai/lmstudio-bug-tracker#1025

Open
