
Commit 5307989

Update default attn_implementation to None for Biencoder
Signed-off-by: Oliver Holworthy <1216955+oliverholworthy@users.noreply.github.com>
1 parent ab1fa49 commit 5307989

File tree

1 file changed: +2, -2 lines changed


nemo_automodel/_transformers/auto_model.py

Lines changed: 2 additions & 2 deletions
@@ -734,7 +734,7 @@ def from_pretrained(
     share_encoder: bool = True,
     pooling: str = "avg",
     l2_normalize: bool = True,
-    attn_implementation: str = "flash_attention_2",
+    attn_implementation: Optional[str] = None,
     use_liger_kernel: bool = True,
     use_sdpa_patching: bool = True,
     sdpa_method: Optional[List[SDPBackend]] = None,
@@ -762,7 +762,7 @@ def from_pretrained(
     l2_normalize: Whether to L2 normalize embeddings.
     attn_implementation: Attention implementation to use (e.g.,
         ``"flash_attention_2"``, ``"sdpa"``, ``"eager"``).
-        Defaults to ``"flash_attention_2"``.
+        Defaults to ``None`` (uses the model/transformers default, typically sdpa).
     use_liger_kernel: Whether to apply Liger kernel optimizations.
     use_sdpa_patching: Whether to apply SDPA patching.
     sdpa_method: SDPA backend methods to use.
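The effect of this change can be sketched in a few lines. The helper below is a hypothetical illustration, not code from nemo_automodel: with a `None` default, the caller's choice is passed through when given, and otherwise the downstream library (transformers) falls back to its own default attention backend, typically SDPA. A hard-coded `"flash_attention_2"` default would instead force that backend even on hardware or models where it is unavailable.

```python
# Hypothetical sketch of None-as-default resolution; the function name and
# the "sdpa" fallback are assumptions for illustration, not the actual
# nemo_automodel implementation.
from typing import Optional


def resolve_attn_implementation(attn_implementation: Optional[str] = None) -> str:
    """Return the attention backend to use.

    None defers to the library/model default (stubbed here as "sdpa");
    any explicit string is passed through unchanged.
    """
    if attn_implementation is None:
        return "sdpa"  # stand-in for the transformers default
    return attn_implementation


print(resolve_attn_implementation())                     # -> sdpa
print(resolve_attn_implementation("flash_attention_2"))  # -> flash_attention_2
```

In the real API, a user who still wants FlashAttention simply passes `attn_implementation="flash_attention_2"` explicitly to `from_pretrained`.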
