
Commit 6c8c73e

fix(validation): add validation for lora target linear with quantize experts (#3461)
* fix: add validation for lora target linear with quantize experts
* chore: fix lint
* chore: comment
* fix: missing link on readme
1 parent: a260d33

File tree

4 files changed: +21 −1 lines changed


README.md

Lines changed: 1 addition & 1 deletion

@@ -30,7 +30,7 @@
 ## 🎉 Latest Updates

 - 2026/03:
-  - New model support has been added in Axolotl for Qwen3.5, Qwen3.5 MoE, [GLM-4.7-Flash](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/glm47-flash), [GLM-4.6V](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/glm46v), and [GLM-4.5-Air](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/glm45).
+  - New model support has been added in Axolotl for [Qwen3.5, Qwen3.5 MoE](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/qwen3.5), [GLM-4.7-Flash](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/glm47-flash), [GLM-4.6V](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/glm46v), and [GLM-4.5-Air](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/glm45).
   - [MoE expert quantization](https://docs.axolotl.ai/docs/expert_quantization.html) support (via `quantize_moe_experts: true`) greatly reduces VRAM when training MoE models (FSDP2 compat).
 - 2026/02:
   - [ScatterMoE LoRA](https://github.com/axolotl-ai-cloud/axolotl/pull/3410) support. LoRA fine-tuning directly on MoE expert weights using custom Triton kernels.

docs/expert_quantization.qmd

Lines changed: 1 addition & 0 deletions

@@ -45,6 +45,7 @@ lora_target_parameters:

 ## Limitations

+- `lora_target_linear` is not compatible with `quantize_moe_experts`. See [Expert LoRA targeting](#expert-lora-targeting) instead.
 - `cpu_ram_efficient_loading` hangs / takes long time with FSDP2 + QLoRA.
 - Total model parameter count may display incorrectly (trainable param count is correct).
 - FSDP LoRA (8-bit) may have a large initial VRAM spike at the first 1-2 steps, which then drops. QLoRA does not exhibit this.
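Given the new limitation, a config that enables expert quantization must name expert parameters explicitly via `lora_target_parameters` rather than use `lora_target_linear`. A minimal sketch of such a config — the parameter paths below are illustrative assumptions, not taken from the docs; consult the expert quantization guide for the correct paths for your model:

```yaml
# Sketch: MoE expert quantization with explicit LoRA parameter targeting.
adapter: qlora
load_in_4bit: true
quantize_moe_experts: true

# lora_target_linear: true   # now rejected by validation when quantizing experts
lora_target_parameters:       # target expert weights directly instead
  - mlp.experts.gate_up_proj  # illustrative path, varies by architecture
  - mlp.experts.down_proj     # illustrative path, varies by architecture
```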

src/axolotl/utils/schemas/config.py

Lines changed: 5 additions & 0 deletions

@@ -1302,6 +1302,11 @@ def check_multigpu_lora_kernels(cls, data):
     @classmethod
     def check_quantize_moe_experts(cls, data):
         if data.get("quantize_moe_experts"):
+            if data.get("lora_target_linear"):
+                raise ValueError(
+                    "lora_target_linear is not compatible with quantize_moe_experts. "
+                    "Use lora_target_parameters to target expert weights instead."
+                )
             if data.get("adapter") not in ("lora", "qlora"):
                 raise ValueError("quantize_moe_experts requires adapter: lora or qlora")
             if not (data.get("load_in_4bit") or data.get("load_in_8bit")):
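The validator's logic can be sketched as a standalone function. This is a minimal reproduction of the check shown in the diff, not the actual Axolotl schema code; the function name and the final error message are assumptions (the diff is truncated after the `load_in_4bit`/`load_in_8bit` condition):

```python
def check_quantize_moe_experts_cfg(cfg: dict) -> dict:
    """Sketch of the new validation: reject incompatible LoRA settings
    when quantize_moe_experts is enabled."""
    if cfg.get("quantize_moe_experts"):
        # New check from this commit: lora_target_linear would try to wrap
        # the (quantized) expert linears, which is unsupported.
        if cfg.get("lora_target_linear"):
            raise ValueError(
                "lora_target_linear is not compatible with quantize_moe_experts. "
                "Use lora_target_parameters to target expert weights instead."
            )
        # Pre-existing checks retained from the surrounding code.
        if cfg.get("adapter") not in ("lora", "qlora"):
            raise ValueError("quantize_moe_experts requires adapter: lora or qlora")
        if not (cfg.get("load_in_4bit") or cfg.get("load_in_8bit")):
            # Message assumed; the diff ends at this condition.
            raise ValueError("quantize_moe_experts requires 4-bit or 8-bit loading")
    return cfg
```

A valid config passes through unchanged, while enabling `lora_target_linear` alongside `quantize_moe_experts` raises immediately, before any model loading.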

tests/utils/schemas/validation/test_moe_quant.py

Lines changed: 14 additions & 0 deletions

@@ -79,6 +79,20 @@ def test_false_skips_validation(self, min_base_cfg, gpu_caps, env_caps):
         result = validate_config(cfg, capabilities=gpu_caps, env_capabilities=env_caps)
         assert result["quantize_moe_experts"] is False

+    def test_rejects_lora_target_linear(self, min_base_cfg, gpu_caps, env_caps):
+        """quantize_moe_experts with lora_target_linear should fail."""
+        cfg = (
+            DictDefault(
+                quantize_moe_experts=True,
+                adapter="qlora",
+                load_in_4bit=True,
+                lora_target_linear=True,
+            )
+            | min_base_cfg
+        )
+        with pytest.raises(ValueError, match="lora_target_linear is not compatible"):
+            validate_config(cfg, capabilities=gpu_caps, env_capabilities=env_caps)
+
     def test_default_is_false(self, min_base_cfg, gpu_caps, env_caps):
         """quantize_moe_experts should default to false."""
         cfg = DictDefault({}) | min_base_cfg
