Commit 1f4ca3a

disable modeling_utils rewrite
1 parent fd455e1 commit 1f4ca3a

1 file changed, +4 -1 lines changed
onnx_diagnostic/torch_export_patches/patches/patch_transformers.py

Lines changed: 4 additions & 1 deletion
@@ -1901,7 +1901,10 @@ def get_placeholder_mask(
 try:
     import transformers.modeling_utils

-    patch_modeling_utils = True
+    # TODO(titaiwang): This is not ready yet.
+    # Using multi-turn conversation to export, we don't need to rewrite the attention
+    # as sequence_length is not restricted to 1.
+    patch_modeling_utils = False

     from transformers.integrations.sdpa_attention import use_gqa_in_sdpa, repeat_kv
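
The change above flips a module-level flag that sits inside an import guard. As a rough illustration of that pattern only, the sketch below shows one way a flag can depend on both an optional import succeeding and an explicit on/off switch. The helper `resolve_patch_flag` and its parameters are hypothetical and are not part of onnx_diagnostic or transformers; the module names are stand-ins chosen so the example runs with the standard library alone.

```python
import importlib


def resolve_patch_flag(module_name: str, enabled: bool) -> bool:
    """Hypothetical helper: True only if the module imports and the switch is on."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # The optional dependency is missing, so there is nothing to patch.
        return False
    # The dependency exists; the explicit switch decides whether to patch.
    return enabled


# "json" stands in for an importable dependency (assumption for illustration).
flag_enabled = resolve_patch_flag("json", enabled=True)
# Same dependency, but the rewrite is switched off, as in this commit.
flag_disabled = resolve_patch_flag("json", enabled=False)
# A module name that is assumed not to exist in the environment.
flag_missing = resolve_patch_flag("module_that_does_not_exist_xyz", enabled=True)
```

With this shape, turning the rewrite off is a one-line change to the switch, and a missing dependency disables the patch regardless of the switch.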
