Commit d1c5d19

minor
Signed-off-by: weimingc <[email protected]>
1 parent 13042fa commit d1c5d19

1 file changed, +0 -1 lines changed

modelopt/torch/export/quant_utils.py

Lines changed: 0 additions & 1 deletion
@@ -929,7 +929,6 @@ def all_items_same(item_list):
 # Mathematical equivalence:
 # Before: o_proj_out = [attn @ (v_proj_in @ v_proj.W^T)^T * scale] @ o_proj.W^T
 # After: o_proj_out = [attn @ (v_proj_in @ (v_proj.W * scale)^T)^T] @ o_proj.W^T
-# note: for GQA models, TODO:
 (["LlamaAttention", "Qwen3Attention", "Qwen3MoeAttention"], ("v_proj", "o_proj")),
 # MLP: Fuse down_proj's pre_quant_scale into up_proj's output dimension
 # Mathematical equivalence:
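
The equivalence stated in the attention comment can be checked numerically. The following is a minimal sketch, not code from `quant_utils.py`: it assumes a single attention head whose `v_proj` output width equals the `o_proj` input width (so no GQA head sharing), treats `scale` as one value per `o_proj` input channel, and uses hypothetical helper names (`matmul`, `transpose`, `rand_matrix`) and toy sizes. The final `@ o_proj.W^T` is identical on both sides, so it is left out.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def rand_matrix(rows, cols):
    """Build a rows x cols matrix of uniform random values in [-1, 1]."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
seq_len, d_in, d_v = 3, 4, 5  # toy sizes, chosen only for illustration

attn = rand_matrix(seq_len, seq_len)    # attention weights (the identity holds for any values)
v_proj_in = rand_matrix(seq_len, d_in)  # input to v_proj
v_weight = rand_matrix(d_v, d_in)       # v_proj.W, shape (out_features, in_features)
scale = [random.uniform(0.5, 2.0) for _ in range(d_v)]  # assumed per-channel scale on o_proj's input

# Before: the scale multiplies the attention output channel by channel.
v_out = matmul(v_proj_in, transpose(v_weight))
before = [[val * s for val, s in zip(row, scale)] for row in matmul(attn, v_out)]

# After: the scale is folded into the rows (output channels) of v_proj.W.
fused_weight = [[w * s for w in row] for row, s in zip(v_weight, scale)]
after = matmul(attn, matmul(v_proj_in, transpose(fused_weight)))

# Largest element-wise difference between the two formulations.
max_diff = max(abs(b - a) for rb, ra in zip(before, after) for b, a in zip(rb, ra))
print(max_diff < 1e-9)
```

The check works because a per-channel scale on the columns of the attention output commutes with the left multiplication by `attn`, so it can be moved onto the matching output rows of `v_proj.W`.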
