Commit 504fb89

Author: Iwan Kawrakow
Revert "Fix race in the CUDA DeepSeek FA kernel (#406)"
This reverts commit 36e6e88. I should have tested. We get NaNs.
1 parent 36e6e88 commit 504fb89

1 file changed: 0 additions, 2 deletions

ggml/src/ggml-cuda/fattn-new-mma.cu

Lines changed: 0 additions & 2 deletions
@@ -898,8 +898,6 @@ static __device__ __forceinline__ void flash_attn_ext_f16_process_tile(
         KQ_crs += __shfl_xor_sync(0xFFFFFFFF, KQ_crs, offset, WARP_SIZE);
     }
 
-    __syncthreads();
-
     // Write back combined meta data:
 #pragma unroll
     for (int imeta = 0; imeta < nmeta; ++imeta) {