Commit fb62f01

cann: update the docs CANN.md
1 parent 47f2c64 commit fb62f01

1 file changed

+9
-1
lines changed

docs/backend/CANN.md

Lines changed: 9 additions & 1 deletion
@@ -258,7 +258,15 @@ cmake --build build --config release
 ### **GitHub contribution**:
 Please add the **[CANN]** prefix/tag in issues/PRs titles to help the CANN-team check/address them without delay.
 
+## Updates
 ### Basic Flash Attention Support
 The basic FA kernel with aclnnops has been added in aclnn_ops.cpp.
 Currently, the FA only supports the cases with FP16 KV tensors and NO logit softcap.
-Since the aclnn interface for flash attention cannot support the logit softcap, we will only update the
+Since the aclnn interface for flash attention cannot support the logit softcap, we will only update the quantized version in the future.
+
+Authors from Peking University: Bizhao Shi ([email protected]), Yuxin Yang ([email protected]), Ruiyang Ma ([email protected]), and Guojie Luo ([email protected]).
+
+Thanks Tuo Dai and Shanni Li from Huawei Technologies Co., Ltd.
+
+## TODO
+- Support more models and d
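
The "Basic Flash Attention Support" text in this diff states that FA is supported only with FP16 KV tensors and no logit softcap. A minimal sketch of that condition as a support check follows. It is not the code from aclnn_ops.cpp: the `TensorType` enum, the `FlashAttnRequest` struct, and `fa_supported` are hypothetical names introduced for illustration, and treating a softcap value of `0.0f` as "disabled" is an assumption.

```cpp
// Hypothetical stand-ins for illustration; these are not ggml or CANN types.
enum class TensorType { F32, F16, Q8_0 };

struct FlashAttnRequest {
    TensorType k_type;        // element type of the K tensor
    TensorType v_type;        // element type of the V tensor
    float      logit_softcap; // assumption: 0.0f means softcap is disabled
};

// Returns true only for the case the doc describes as supported:
// both K and V are FP16 and no logit softcap is requested.
constexpr bool fa_supported(const FlashAttnRequest & req) {
    return req.k_type == TensorType::F16 &&
           req.v_type == TensorType::F16 &&
           req.logit_softcap == 0.0f;
}

// Compile-time checks of the three cases named in the doc text.
static_assert(fa_supported({TensorType::F16, TensorType::F16, 0.0f}),
              "FP16 KV without softcap is supported");
static_assert(!fa_supported({TensorType::Q8_0, TensorType::Q8_0, 0.0f}),
              "quantized KV is not supported yet");
static_assert(!fa_supported({TensorType::F16, TensorType::F16, 30.0f}),
              "logit softcap is not supported");
```

A backend would typically consult a predicate like this before dispatching an operation and fall back to another path when it returns false. How the actual CANN backend reports unsupported cases is not shown in this diff.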