tianyuzhou668 (Contributor) commented Sep 28, 2025

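Fixed a bf16 precision issue in the softmax kernel;
Added an adaptation for index_elementwise_put_kernel;
Added an OOM warning when GPU memory is insufficient;
Added support for passing both is_causal and mask to flash-attention at the same time, and fixed a precision issue with the minimum value in the mask;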
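
To illustrate the last item, here is a minimal sketch of combining a causal constraint with a user-supplied additive mask. The PR does not show the kernel code or say what exactly the mask precision issue was, so everything here is an assumption: `combine_masks`, `softmax_row`, and `FLOAT32_MIN` are hypothetical names, and clamping to a finite minimum is one common way to handle this situation, not necessarily the fix made in this PR. The sketch uses plain Python floats rather than bf16 or fp32 tensors.

```python
import math

# Hypothetical fill value for "masked" positions: the most negative finite
# float32 number, chosen only for illustration.
FLOAT32_MIN = -3.4028234663852886e38


def combine_masks(mask, is_causal):
    """Merge an additive mask (rows = queries, cols = keys) with a causal mask.

    Where both masks hide a position, the sum of two "minimum" values would
    overflow to -inf in float32. Python floats are float64, so the clamp below
    only mirrors what a float32 kernel would need to do.
    """
    combined = []
    for i, row in enumerate(mask):
        new_row = []
        for j, m in enumerate(row):
            causal = FLOAT32_MIN if (is_causal and j > i) else 0.0
            new_row.append(max(m + causal, FLOAT32_MIN))
        combined.append(new_row)
    return combined


def softmax_row(scores, bias):
    """Numerically stable softmax over scores + bias (max is subtracted first)."""
    logits = [s + b for s, b in zip(scores, bias)]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: the user mask hides key 0 for every query, and the causal mask hides
# future keys, so query 0 ends up with every key masked.
scores = [[0.5, 1.0, -0.2], [0.1, 0.3, 0.9], [1.2, -0.4, 0.0]]
mask = [[FLOAT32_MIN, 0.0, 0.0] for _ in range(3)]
combined = combine_masks(mask, is_causal=True)
probs = [softmax_row(s, b) for s, b in zip(scores, combined)]
```

With a finite fill value, the fully masked first row degrades to a uniform distribution instead of NaN. If `-math.inf` were used as the fill value, the same `softmax_row` would compute `-inf - (-inf)` for that row and return NaN.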
