
Commit cc45396

fix: enable cuda_use_flash_attention2 for PictureDescriptionVlmModel (#1496)

Signed-off-by: Zach Cox <zach.s.cox@gmail.com>
1 parent 976e92e commit cc45396

File tree

1 file changed (+4, -1)

docling/models/picture_description_vlm_model.py

Lines changed: 4 additions & 1 deletion
@@ -57,7 +57,10 @@ def __init__(
             artifacts_path,
             torch_dtype=torch.bfloat16,
             _attn_implementation=(
-                "flash_attention_2" if self.device.startswith("cuda") else "eager"
+                "flash_attention_2"
+                if self.device.startswith("cuda")
+                and accelerator_options.cuda_use_flash_attention2
+                else "eager"
             ),
         ).to(self.device)
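
After this change, the picture-description VLM only requests the flash_attention_2 implementation when the model is placed on a CUDA device and the user has opted in through the accelerator options; otherwise it falls back to eager attention. Below is a minimal sketch of how that opt-in could look from the pipeline side, assuming the docling import paths and the smolvlm_picture_description preset documented around this release; the exact names and the input file are illustrative assumptions, not part of this commit.

# A minimal sketch, assuming docling's documented pipeline API around this release;
# the import paths, the smolvlm_picture_description preset, and "report.pdf" are
# illustrative assumptions, not taken from this commit.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    AcceleratorDevice,
    AcceleratorOptions,
    PdfPipelineOptions,
    smolvlm_picture_description,
)
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.do_picture_description = True
pipeline_options.picture_description_options = smolvlm_picture_description
# Opt in to FlashAttention-2; it is only applied when the model actually runs on CUDA.
pipeline_options.accelerator_options = AcceleratorOptions(
    device=AcceleratorDevice.CUDA,
    cuda_use_flash_attention2=True,
)

converter = DocumentConverter(
    format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)}
)
result = converter.convert("report.pdf")  # hypothetical input file

With cuda_use_flash_attention2 left at its default of False, the model keeps using eager attention even on CUDA, which is the behavior this commit restores control over.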

0 commit comments