Commit d06d087

feat: Add assertion and comment about relationship between simd size and num simd groups
Branch: GraniteFourPerf
Signed-off-by: Gabe Goodhart <[email protected]>
1 parent 641276a commit d06d087

1 file changed: +7 -0 lines changed

ggml/src/ggml-metal/ggml-metal.m

Lines changed: 7 additions & 0 deletions
@@ -3022,6 +3022,13 @@ static bool ggml_metal_encode_node(
 const int64_t shmem_size = d_state / 32;
 GGML_ASSERT(shmem_size * 32 == d_state);
 
+// The final simd_sum won't work if the number of simd groups is
+// larger than the size of a single simd group. If this case is
+// hit at some point, the logic in the second simd_sum could be
+// expanded to handle this with one more sequential simd_sum to
+// collapse simd group sums another time.
+GGML_ASSERT(shmem_size <= 32);
+
 // One thread pre element in d_state
 GGML_ASSERT(d_state <= (int64_t)pipeline.maxTotalThreadsPerThreadgroup);
 
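
The constraint described in the added comment can be illustrated with a small CPU-side simulation. This is only a sketch, not the Metal kernel: the kernel is not part of this commit, so the sketch assumes it follows the two-stage pattern the comment implies (each simd group reduces its own lanes, writes one partial sum to threadgroup memory, and a single simd group then reduces those partials). The names `SIMD_WIDTH`, `simd_sum_sim`, and `two_stage_sum` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assumed lanes per simd group, matching the hard-coded 32 in the diff. */
#define SIMD_WIDTH 32

/* Hypothetical CPU stand-in for Metal's simd_sum: adds the values
 * held by the lanes of one simd group. */
static float simd_sum_sim(const float *lanes, int n_lanes) {
    float acc = 0.0f;
    for (int i = 0; i < n_lanes; ++i) {
        acc += lanes[i];
    }
    return acc;
}

/* Two-stage reduction over d_state values, one value per thread.
 * Returns 0 on success, or -1 if either precondition asserted in the
 * diff would fail. */
static int two_stage_sum(const float *vals, int64_t d_state, float *out) {
    /* Number of simd groups; this is shmem_size in the diff. */
    const int64_t n_groups = d_state / SIMD_WIDTH;

    /* Mirrors GGML_ASSERT(shmem_size * 32 == d_state). */
    if (n_groups * SIMD_WIDTH != d_state) {
        return -1;
    }
    /* Mirrors GGML_ASSERT(shmem_size <= 32): the second stage has only
     * SIMD_WIDTH lanes, so it can pick up at most SIMD_WIDTH partials. */
    if (n_groups > SIMD_WIDTH) {
        return -1;
    }

    /* Stage 1: each simd group reduces its own lanes into one slot. */
    float shmem[SIMD_WIDTH] = {0.0f};
    for (int64_t g = 0; g < n_groups; ++g) {
        shmem[g] = simd_sum_sim(vals + g * SIMD_WIDTH, SIMD_WIDTH);
    }

    /* Stage 2: one simd group loads one partial per lane (unused lanes
     * contribute 0) and reduces them in a single simd_sum. */
    *out = simd_sum_sim(shmem, SIMD_WIDTH);
    return 0;
}
```

Under these assumptions, a `d_state` of 1024 gives 32 simd groups and still fits in one final reduction, while 2048 would give 64 partial sums for 32 lanes and is rejected. Handling that case would need the extra sequential `simd_sum` pass the comment mentions.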
