
Commit d783c26

kylesayrs and dsikka authored
[VLM] Fix mllama targets (#1402)
## Purpose ##

* When #1389 landed, modules being skipped by `ignore` were no longer being skipped. However, this requires that the sequential targets list be correct. Mllama defaults to targeting vision layers, and hence the vision tower was being traced, leading to errors.

```python3
_no_split_modules = [
    "MllamaVisionEncoderLayer",
    "MllamaCrossAttentionDecoderLayer",
    "MllamaSelfAttentionDecoderLayer",
]
```

## Changes ##

* Target only the text decoder layers, not the vision decoder layers.

## Testing ##

* #1335 passes

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
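To make the relationship concrete, here is a minimal sketch in plain Python. The module names come from the `_no_split_modules` list above; the dict is a stand-in for the `GPTQModifier(...)` keyword arguments from the example script, shown as plain data so the shape is visible without the llmcompressor dependency:

```python
# Mllama's three no-split module classes, as listed above.
no_split_modules = [
    "MllamaVisionEncoderLayer",
    "MllamaCrossAttentionDecoderLayer",
    "MllamaSelfAttentionDecoderLayer",
]

# The fix: restrict sequential tracing to the text self-attention
# decoder layers so the vision tower is never traced.
sequential_targets = [
    m for m in no_split_modules if m == "MllamaSelfAttentionDecoderLayer"
]

# Stand-in for the GPTQModifier(...) keyword arguments in mllama_example.py.
gptq_kwargs = dict(
    targets="Linear",
    scheme="W4A16",
    sequential_targets=sequential_targets,
    ignore=["re:.*lm_head", "re:multi_modal_projector.*", "re:vision_model.*"],
)
```

With vision modules excluded from `sequential_targets`, the `ignore` patterns for the vision model no longer need to protect the tracer from the vision tower.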
1 parent 564140d commit d783c26

File tree

1 file changed: +1 −0 lines changed


examples/multimodal_vision/mllama_example.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -32,6 +32,7 @@ def data_collator(batch):
     GPTQModifier(
         targets="Linear",
         scheme="W4A16",
+        sequential_targets=["MllamaSelfAttentionDecoderLayer"],
         ignore=["re:.*lm_head", "re:multi_modal_projector.*", "re:vision_model.*"],
     ),
 ]
```
