     fast_findmin(dij, n)
 
 Find the minimum value and its index in the first `n` elements of the `dij`
-array. The use of `@turbo` macro gives a significiant performance boost.
+array.
 
 # Arguments
 - `dij`: An array of values.
@@ -133,14 +133,35 @@ array. The use of `@turbo` macro gives a significiant performance boost.
 - `dij_min`: The minimum value in the first `n` elements of the `dij` array.
 - `best`: The index of the minimum value in the `dij` array.
 """
-fast_findmin(dij, n) = begin
-    # findmin(@inbounds @view dij[1:n])
-    best = 1
-    @inbounds dij_min = dij[1]
-    @turbo for here in 2:n
-        newmin = dij[here] < dij_min
-        best = newmin ? here : best
-        dij_min = newmin ? dij[here] : dij_min
+function fast_findmin(x, n)
+    laneIndices = SIMD.Vec{8, Int64}((1, 2, 3, 4, 5, 6, 7, 8))
+    minvals = SIMD.Vec{8, Float64}(Inf)
+    min_indices = SIMD.Vec{8, Int64}(0)
+
+    n_batches, remainder = divrem(n, 8)
+    lane = VecRange{8}(0)
+    i = 1
+    @inbounds @fastmath for _ in 1:n_batches
+        predicate = x[lane + i] < minvals
+        minvals = vifelse(predicate, x[lane + i], minvals)
+        min_indices = vifelse(predicate, laneIndices, min_indices)
+
+        i += 8
+        laneIndices += 8
     end
-    dij_min, best
+
+    min_value = SIMD.minimum(minvals)
+    min_index = min_value == minvals[1] ? min_indices[1] : min_value == minvals[2] ? min_indices[2] :
+                min_value == minvals[3] ? min_indices[3] : min_value == minvals[4] ? min_indices[4] :
+                min_value == minvals[5] ? min_indices[5] : min_value == minvals[6] ? min_indices[6] :
+                min_value == minvals[7] ? min_indices[7] : min_indices[8]
+
+    @inbounds @fastmath for _ in 1:remainder
+        xi = x[i]
+        pred = x[i] < min_value
+        min_value = ifelse(pred, xi, min_value)
+        min_index = ifelse(pred, i, min_index)
+        i += 1
+    end
+    return min_value, min_index
 end
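
Review note: a plain scalar reference can help when checking the rewritten `fast_findmin` against the behaviour its docstring describes (minimum value and its index within the first `n` elements). The sketch below is only an illustration and is not part of this commit. The name `reference_findmin` is hypothetical, and it uses Base Julia only, so it needs neither SIMD.jl nor LoopVectorization.jl. It returns the first occurrence of the minimum. Reading the diff, the SIMD version resolves ties by checking lanes in order, so when the same minimum appears in different lanes it may report a later index. That seems worth covering in a test, along with inputs where `n < 8`.

```julia
# Hypothetical reference helper (not part of the commit): scalar scan over the
# first n elements, returning (minimum value, index of its first occurrence).
function reference_findmin(x, n)
    min_value = x[1]
    min_index = 1
    for i in 2:n
        # Strict comparison keeps the earliest index when values tie.
        if x[i] < min_value
            min_value = x[i]
            min_index = i
        end
    end
    return min_value, min_index
end

x = [3.0, 1.5, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0, 0.5, 7.0]

# A tie between indices 2 and 4: the first occurrence is reported.
@assert reference_findmin(x, 8) == (1.5, 2)

# The minimum lies beyond the first 8 elements, i.e. in what the SIMD
# version would handle in its scalar remainder loop.
@assert reference_findmin(x, 10) == (0.5, 9)

# Agrees with Base.findmin on the same prefix.
@assert reference_findmin(x, 8) == findmin(@view x[1:8])
```

Comparing `fast_findmin(x, n)` with `reference_findmin(x, n)` over random inputs of varying `n` would show whether the two agree on the index as well as the value.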