Replies: 2 comments 1 reply
-
You also need to change the bounds of the loop so that each query iterates over all the keys.
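The suggestion above can be sketched in plain NumPy (a minimal illustration of the flash-attention-style blocked loop, not the tutorial's actual Triton kernel; the names `BLOCK_M`, `BLOCK_N`, and `N_CTX` mirror the tutorial but are assumptions here). The key point is the inner loop's upper bound: for the non-causal case it must be the full sequence length, not the causal prefix of the current query block.

```python
import numpy as np

def naive_attention(q, k, v):
    # Reference: full softmax(Q K^T / sqrt(d)) V, no mask.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def blocked_attention(q, k, v, BLOCK_M=16, BLOCK_N=16):
    # Flash-attention-style online softmax over key blocks.
    # For the NON-causal case the inner loop must cover ALL keys
    # (bound N_CTX), not just the causal prefix of the query block.
    N_CTX, d = q.shape
    out = np.zeros_like(q)
    for start_m in range(0, N_CTX, BLOCK_M):
        qb = q[start_m:start_m + BLOCK_M]
        m_i = np.full(qb.shape[0], -np.inf)   # running row max
        l_i = np.zeros(qb.shape[0])           # running row sum
        acc = np.zeros_like(qb)
        for start_n in range(0, N_CTX, BLOCK_N):  # full bound, no mask
            kb = k[start_n:start_n + BLOCK_N]
            vb = v[start_n:start_n + BLOCK_N]
            s = qb @ kb.T / np.sqrt(d)
            m_new = np.maximum(m_i, s.max(axis=-1))
            alpha = np.exp(m_i - m_new)          # rescale old stats
            p = np.exp(s - m_new[:, None])
            l_i = l_i * alpha + p.sum(axis=-1)
            acc = acc * alpha[:, None] + p @ vb
            m_i = m_new
        out[start_m:start_m + BLOCK_M] = acc / l_i[:, None]
    return out
```

With the full `N_CTX` bound, the blocked result matches the unblocked reference; with a causal bound it cannot, because later key blocks are never visited.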
-
@ptillet Thank you for your response! I changed the for-loop bound to
Thank you so much :)
-
I am interested in implementing a version of flash attention in Triton without masking.
I infer from the pytest in the tutorial that the flash attention example generates outputs masked with a triangular (causal) matrix, and I infer from the code that it is this specific line inside the kernel that applies the triangular mask.
However, when I remove both of these operations, the outputs from Triton and PyTorch do not match. Does anyone know how I can change the code to make this happen? Thank you!
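For context, here is a minimal NumPy sketch of what the triangular mask does in a plain (unblocked) attention computation; this is an illustration, not the tutorial's Triton code, and the `causal` flag is a name introduced here. It shows why removing only the mask line is not sufficient in the blocked kernel: the mask is what hides the keys beyond each query, so once it is gone, the loop itself must actually visit those keys.

```python
import numpy as np

def attention(q, k, v, causal=True):
    # Plain attention: softmax(Q K^T / sqrt(d)) V.
    s = q @ k.T / np.sqrt(q.shape[-1])
    if causal:
        # The triangular mask: query i only attends to keys j <= i.
        # This is the analogue of the tl.where(...) line in the kernel.
        mask = np.tril(np.ones(s.shape, dtype=bool))
        s = np.where(mask, s, -np.inf)
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v
```

Under the causal mask, the first query attends only to the first key, so its output is exactly `v[0]`; without the mask it mixes all values. In the blocked kernel the same change requires both dropping the mask and extending the inner-loop bound to the full sequence length.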