Commit ddee88d
[Neuron][Kernel] NKI-based flash-attention kernel with paged KV cache (vllm-project#11277)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
Co-authored-by: Jiangfei Duan <jfduan@outlook.com>
1 parent 823ab79 commit ddee88d
File tree
3 files changed (+1126, -1 lines)
- .buildkite
- tests/neuron
- vllm/attention/ops