A ComfyUI node that patches model attention to use PyTorch Flex Attention (torch.nn.attention.flex_attention) with torch.compile. No extra dependencies — works on any GPU supported by torch.compile.
- PyTorch 2.5+ (flex_attention is built-in)
- Any CUDA GPU supported by torch.compile
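To confirm your environment meets the PyTorch 2.5+ requirement, you can probe for the flex_attention module directly. The helper name `has_flex_attention` below is hypothetical, not part of the node; this is just a minimal sketch of the check:

```python
def has_flex_attention() -> bool:
    """Return True if this PyTorch build exposes flex_attention (2.5+)."""
    try:
        # The import only succeeds on PyTorch 2.5 or newer.
        from torch.nn.attention.flex_attention import flex_attention  # noqa: F401
    except ImportError:
        return False
    return True
```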
Place the Flex Attention node between your model/LoRA loader and the KSampler in your workflow.
Unlike Flash Attention 4 (which requires Blackwell GPUs), Flex Attention works on any modern NVIDIA GPU. It uses PyTorch's native torch.compile to fuse the attention kernel — no graph breaks required.
- Compiles torch.nn.attention.flex_attention via torch.compile
- First call triggers compilation (expect a one-time warmup delay)
- Subsequent calls use the compiled kernel
- Falls back to PyTorch SDPA when attention masks are present