Commit c1caadf

Merge pull request #35 from fslaborg/perf/fix-fold2-horizontal-reduction-bug-20251012-153313-63bc646eb8728c6a-2e2b5a2eff045062
Daily Perf Improver - Fix fold2 horizontal SIMD reduction bug
2 parents 00d6de1 + 24c6383 commit c1caadf

1 file changed: +4 -3 lines changed

src/FsMath/SpanPrimitives.fs

Lines changed: 4 additions & 3 deletions
@@ -641,9 +641,10 @@ type SpanINumberPrimitives =
             let vy = Numerics.Vector<'T>(y.Slice(yi, simdWidth))
             accVec <- fv accVec vx vy
 
-        let mutable acc = init
-        for i = 0 to Numerics.Vector<'T>.Count - 1 do
-            acc <- acc + accVec.[i]
+        // Horizontal reduction: combine all SIMD lanes
+        // For fold2 with operation f(acc, x, y), the accVec contains results from multiple (x,y) pairs
+        // We need to reduce these using just addition since they're independent accumulated results
+        let mutable acc = Numerics.Vector.Sum(accVec)
 
         for i = ceiling to length - 1 do
             acc <- f acc x.[xOffset + i] y.[yOffset + i]
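
The effect of the reduction change can be seen in isolation with a small sketch. It is not the FsMath code: `reduceOld`, `reduceNew`, the lane values, and `init = 10.0` are hypothetical names and data chosen for illustration, and it assumes .NET 6 or later, where `Vector.Sum` is available. It only shows that the two reductions differ by exactly `init`. Whether that extra `init` is a double count depends on how `accVec` is seeded, which this diff does not show.

```fsharp
open System.Numerics

/// Pre-change shape of the reduction: start from init, then add every lane.
let reduceOld (init: float) (accVec: Vector<float>) =
    let mutable acc = init
    for i = 0 to Vector<float>.Count - 1 do
        acc <- acc + accVec.[i]
    acc

/// Post-change shape of the reduction: sum the lanes only.
let reduceNew (accVec: Vector<float>) =
    Vector.Sum(accVec)

// Hypothetical lane contents 1.0, 2.0, ..., one value per SIMD lane.
// The lane count depends on the hardware, so the totals vary by machine.
let lanes = Array.init Vector<float>.Count (fun i -> float (i + 1))
let accVec = Vector<float>(lanes)
let init = 10.0 // hypothetical non-zero seed

let oldAcc = reduceOld init accVec
let newAcc = reduceNew accVec

printfn "old = %g, new = %g, difference = %g" oldAcc newAcc (oldAcc - newAcc)
```

With `init = 0` the two reductions agree, so a discrepancy of this kind would only surface in tests that use a non-zero seed.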
