Commit 7acd7da

Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block
1 parent c67f582

1 file changed: 2 additions, 1 deletion

src/diffusers/models/transformers/transformer_flux2.py (2 additions, 1 deletion)

@@ -321,7 +321,8 @@ def __init__(
         self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=eps)

         # Note that the MLP in/out linear layers are fused with the attention QKV/out projections, respectively; this
-        # is often called a "parallel" transformer block
+        # is often called a "parallel" transformer block. See the [ViT-22B paper](https://arxiv.org/abs/2302.05442)
+        # for a visual depiction of this type of transformer block.
         self.attn = Flux2Attention(
             query_dim=dim,
             dim_head=attention_head_dim,
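The comment touched by this commit describes a "parallel" transformer block, where the attention and MLP branches both read from the same normalized input and their outputs are combined, instead of running sequentially. The following is a minimal illustrative sketch of that structure in PyTorch; it is not the actual `Flux2Attention` implementation, and the class name, fusion layout, and hyperparameters here are assumptions for demonstration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ParallelTransformerBlock(nn.Module):
    """Hypothetical sketch of a "parallel" transformer block (as depicted in
    the ViT-22B paper), NOT the real Flux 2 single stream block.

    The input projection fuses the attention QKV with the MLP up-projection
    into one matmul, and the output projection fuses the attention output
    projection with the MLP down-projection.
    """

    def __init__(self, dim: int, num_heads: int, mlp_ratio: float = 4.0):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.mlp_dim = int(dim * mlp_ratio)
        self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
        # Fused input projection: Q, K, V, and the MLP hidden activations
        # are all produced by a single linear layer.
        self.in_proj = nn.Linear(dim, 3 * dim + self.mlp_dim)
        # Fused output projection: attention output and MLP hidden are
        # concatenated and mapped back to `dim` in one linear layer.
        self.out_proj = nn.Linear(dim + self.mlp_dim, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        h = self.norm(x)
        # Split the fused projection into the QKV part and the MLP part.
        qkv, mlp_h = self.in_proj(h).split([3 * d, self.mlp_dim], dim=-1)
        q, k, v = qkv.chunk(3, dim=-1)
        # Reshape to (batch, heads, tokens, head_dim) for attention.
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        attn = F.scaled_dot_product_attention(q, k, v)
        attn = attn.transpose(1, 2).reshape(b, n, d)
        # Fused output: both branches are combined in a single projection,
        # followed by one shared residual connection.
        return x + self.out_proj(torch.cat([attn, self.act(mlp_h)], dim=-1))
```

Compared with a sequential (pre-norm) block, this layout needs only one LayerNorm and two large matmuls per block, which is the efficiency motivation cited for parallel blocks.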
