README.md
attention.cpp
attention_v1.cu
attention_v2.cu
attention_v3.cu
attention_v4.cu
attention_v5.cu
common.h
main.py

Attention

Resources:

Benchmark settings: bs=1, num_heads=8, len_query=4096, len_kv=8192, run on a 5090 @ 400W and compiled with CUDA 12.9.
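
To put these settings in perspective, here is a rough sketch of the matmul work they imply and of how a measured latency would convert to throughput. The head dimension is not stated above, so `HEAD_DIM = 128` is purely an assumption. The function names are hypothetical and not taken from `main.py`, and the 1 ms latency is a placeholder, not a measurement.

```python
def attention_flops(bs, num_heads, len_query, len_kv, head_dim):
    """Matmul FLOPs for one forward attention pass (softmax cost ignored)."""
    # Q @ K^T: one multiply and one add per (query, key, channel) triple.
    qk = 2 * len_query * len_kv * head_dim
    # P @ V has the same shape of work as Q @ K^T.
    pv = 2 * len_query * len_kv * head_dim
    return bs * num_heads * (qk + pv)


def achieved_tflops(flops, latency_ms):
    """Convert a FLOP count and a latency in milliseconds to TFLOPS."""
    return flops / (latency_ms * 1e-3) / 1e12


HEAD_DIM = 128  # ASSUMPTION: head_dim is not given in the settings above

flops = attention_flops(bs=1, num_heads=8, len_query=4096, len_kv=8192,
                        head_dim=HEAD_DIM)
print(flops)  # 137438953472, i.e. 2**37

# Placeholder latency of 1 ms, only to show the conversion.
print(achieved_tflops(flops, latency_ms=1.0))
```

With a different head dimension the FLOP count scales linearly, so the total is easy to adjust once the real value is known.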
