Early-stage C scaffold for experimenting with attention implementations and their efficiency trade-offs (KV-cache, IO, runtime). The project will grow into a small simulator and benchmark suite, written in C, that compares MHA, GQA/MQA, MLA, and FlashAttention-style tiling, with correctness tests built around a numerically stable softmax.
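To make the KV-cache trade-off concrete, here is a minimal sketch of the cache-size arithmetic for MHA, GQA, and MQA. The function name `kv_cache_bytes` and its parameters are hypothetical and not part of the scaffold. It assumes a plain per-head K/V cache, so MLA, which caches a compressed latent instead, is not covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the scaffold): bytes of KV cache needed
 * for one sequence. The factor 2 accounts for storing both keys and values.
 * MHA: n_kv_heads == number of query heads.
 * GQA: n_kv_heads is a divisor of the number of query heads.
 * MQA: n_kv_heads == 1. */
uint64_t kv_cache_bytes(uint64_t n_layers, uint64_t n_kv_heads,
                        uint64_t head_dim, uint64_t seq_len,
                        uint64_t bytes_per_elem)
{
    assert(bytes_per_elem > 0);
    return 2u * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem;
}
```

For an illustrative configuration of 32 layers, head dimension 128, 4096 cached tokens, and 2-byte elements, this gives 2 GiB with 32 KV heads (MHA), 512 MiB with 8 KV heads (GQA), and 64 MiB with 1 KV head (MQA).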
cmake -S . -B build
cmake --build build

This currently builds stub binaries (attn_bench, attn_tests) as placeholders for the upcoming implementations.