
Commit 2cfea09

Change NeMoAutoModelBiencoder to use DEFAULT_ATTN_IMPLEMENTATION
Signed-off-by: Oliver Holworthy <1216955+oliverholworthy@users.noreply.github.com>
1 parent 5307989 commit 2cfea09

File tree

1 file changed: +3 −2 lines

nemo_automodel/_transformers/auto_model.py

Lines changed: 3 additions & 2 deletions
@@ -734,7 +734,7 @@ def from_pretrained(
         share_encoder: bool = True,
         pooling: str = "avg",
         l2_normalize: bool = True,
-        attn_implementation: Optional[str] = None,
+        attn_implementation: str = DEFAULT_ATTN_IMPLEMENTATION,
         use_liger_kernel: bool = True,
         use_sdpa_patching: bool = True,
         sdpa_method: Optional[List[SDPBackend]] = None,
@@ -762,7 +762,8 @@ def from_pretrained(
             l2_normalize: Whether to L2 normalize embeddings.
             attn_implementation: Attention implementation to use (e.g.,
                 ``"flash_attention_2"``, ``"sdpa"``, ``"eager"``).
-                Defaults to ``None`` (uses the model/transformers default, typically sdpa).
+                Defaults to ``DEFAULT_ATTN_IMPLEMENTATION``
+                (``"flash_attention_2"`` when flash-attn is installed, otherwise ``"sdpa"``).
             use_liger_kernel: Whether to apply Liger kernel optimizations.
             use_sdpa_patching: Whether to apply SDPA patching.
             sdpa_method: SDPA backend methods to use.
