ComfyNodePRs/PR-comfyui-flex-attention-7610b4df

 
 

Flex Attention for ComfyUI

A ComfyUI node that patches model attention to use PyTorch Flex Attention (torch.nn.attention.flex_attention) with torch.compile. No extra dependencies — works on any GPU supported by torch.compile.

Requirements

  • PyTorch 2.5+ (flex_attention is built-in)
  • Any CUDA GPU supported by torch.compile

Usage

Place the Flex Attention node between your model/LoRA loader and the KSampler in your workflow.

Unlike Flash Attention 4 (which requires Blackwell GPUs), Flex Attention works on any modern NVIDIA GPU. It relies on PyTorch's native torch.compile to fuse the attention kernel, so the patched model runs without graph breaks.

How it works

  • Compiles torch.nn.attention.flex_attention via torch.compile
  • First call triggers compilation (expect a one-time warmup delay)
  • Subsequent calls use the compiled kernel
  • Falls back to PyTorch SDPA when attention masks are present
