[fix] Fix visual layer ignore pattern for Qwen2.5-VL models (#1766)

Zhao-Dongyu · web-flow · commit d497b5aa5f5c · 2025-08-21T10:59:27.000-04:00
Qwen2.5-VL uses "model.visual.*" layer names while Qwen2-VL uses
"visual.*". Updated ignore patterns to handle both naming conventions
correctly.

## SUMMARY:

Updated visual layer ignore patterns to support both Qwen2-VL
(`visual.*`) and Qwen2.5-VL (`model.visual.*`) naming conventions,
ensuring proper exclusion of visual layers from quantization in both
model versions.

### Problem
Different Qwen VL model versions use different naming conventions for
visual layers:

- **Qwen2-VL**: Uses `visual.*` pattern (e.g.,
`visual.blocks.0.attn.qkv`)
- **Qwen2.5-VL**: Uses `model.visual.*` pattern (e.g.,
`model.visual.blocks.0.attn.qkv`)

The previous ignore pattern `"re:visual.*"` matches Qwen2-VL layer names
but not Qwen2.5-VL names, which begin with `model.`. As a result, the
Qwen2.5-VL visual layers were incorrectly included in quantization.
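
The mismatch can be reproduced with Python's standard `re` module. This is a minimal sketch, assuming that the part after the `re:` prefix is applied with `re.match`, which only matches from the start of the string. The helper name `matches_from_start` is hypothetical and is not part of the library.

```python
import re

# Example layer names taken from the description above.
QWEN2_VL_NAME = "visual.blocks.0.attn.qkv"
QWEN2_5_VL_NAME = "model.visual.blocks.0.attn.qkv"


def matches_from_start(pattern: str, name: str) -> bool:
    """Hypothetical helper: True if `pattern` matches at the start of `name`.

    Assumption: this mirrors how a "re:"-prefixed ignore entry is evaluated
    (prefix stripped, remainder passed to re.match).
    """
    return re.match(pattern, name) is not None


old_pattern = "visual.*"
print(matches_from_start(old_pattern, QWEN2_VL_NAME))    # True
print(matches_from_start(old_pattern, QWEN2_5_VL_NAME))  # False
```

Under that assumption, the Qwen2.5-VL name is rejected only because of its leading `model.` segment, which is what the added pattern accounts for.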

### Solution
Updated ignore patterns to handle both naming conventions:
- Keep `"re:visual.*"` for Qwen2-VL compatibility
- Add `"re:model.visual.*"` for Qwen2.5-VL compatibility

## TEST PLAN:

Verified that the `"re:model.visual.*"` pattern matches Qwen2.5-VL layer
names.
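
A check along these lines can be scripted without loading a model. The sketch below is illustrative only: `is_ignored` is a hypothetical stand-in for the library's matching logic, assuming that `re:` entries are matched from the start of the name and that plain entries are compared by exact name. The name `model.layers.0.self_attn.q_proj` is an assumed example of a non-visual layer and does not come from this PR.

```python
import re

# Ignore list from the updated example recipe.
IGNORE = ["lm_head", "re:visual.*", "re:model.visual.*"]


def is_ignored(name: str, ignore: list[str] = IGNORE) -> bool:
    """Hypothetical stand-in for the ignore check (not the library's code)."""
    for entry in ignore:
        if entry.startswith("re:"):
            # Assumption: regex entries are anchored at the start of the name.
            if re.match(entry[3:], name):
                return True
        elif entry == name:
            # Assumption: plain entries require an exact name match.
            return True
    return False


sample_names = [
    "visual.blocks.0.attn.qkv",         # Qwen2-VL visual layer
    "model.visual.blocks.0.attn.qkv",   # Qwen2.5-VL visual layer
    "lm_head",
    "model.layers.0.self_attn.q_proj",  # assumed non-visual layer name
]

for name in sample_names:
    print(f"{name}: ignored={is_ignored(name)}")
```

With this list, both visual naming conventions and `lm_head` are reported as ignored, while the assumed language-model layer is left as a quantization target.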
diff --git a/examples/quantization_w8a8_fp8/qwen_2_5_vl_example.py b/examples/quantization_w8a8_fp8/qwen_2_5_vl_example.py
@@ -17,7 +17,7 @@
 recipe = QuantizationModifier(
     targets="Linear",
     scheme="FP8_DYNAMIC",
-    ignore=["re:.*lm_head", "re:visual.*"],
+    ignore=["lm_head", "re:visual.*", "re:model.visual.*"],
 )
 
 # Apply quantization and save to disk in compressed-tensors format.
