Commit 2aab316

Merge pull request #244 from forresti/fix-comment
fix comment
2 parents 922915a + f054c21 commit 2aab316

1 file changed: +1 -1 lines changed
src/core/NEON/kernels/NEDirectConvolutionLayerKernel.cpp

Lines changed: 1 addition & 1 deletion
@@ -1082,7 +1082,7 @@ class convolver_3x3
     the third thread [16,24] and the fourth thread [25,31].
 
     The algorithm outer loop iterates over Z, P, Y, X where P is the depth/3rd dimension of each kernel. This order is not arbitrary, the main benefit of this
-    is that we setup the neon registers containing the kernerl's values only once and then compute each XY using the preloaded registers as opposed as doing this for every XY value.
+    is that we setup the neon registers containing the kernel's values only once and then compute each XY using the preloaded registers as opposed as doing this for every XY value.
 
     The algorithm does not require allocating any additional memory amd computes the results directly in-place in two stages:
     1) Convolve plane 0 with kernel 0 and initialize the corresponding output plane with these values.
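
The loop order described in the comment above can be illustrated with a scalar sketch. This is not the NEON kernel from NEDirectConvolutionLayerKernel.cpp: the function name `direct_conv3x3_sketch`, the flat `std::vector` layouts, stride 1 and the absence of padding are all assumptions made for illustration, and the local array `k` only stands in for the preloaded NEON registers. The hunk shows stage 1 only, so the accumulation branch for planes after plane 0 is likewise an assumption about what the second stage does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar model of the Z, P, Y, X loop order (not the library's code).
// Assumed layouts:
//   input   : depth x in_h x in_w
//   weights : num_kernels x depth x 3 x 3
//   output  : num_kernels x (in_h - 2) x (in_w - 2)
// Assumes stride 1, no padding, in_h >= 3 and in_w >= 3, and no kernel flip.
void direct_conv3x3_sketch(const std::vector<float> &input,
                           const std::vector<float> &weights,
                           std::vector<float>       &output,
                           std::size_t num_kernels, std::size_t depth,
                           std::size_t in_h, std::size_t in_w)
{
    const std::size_t out_h = in_h - 2;
    const std::size_t out_w = in_w - 2;

    for(std::size_t z = 0; z < num_kernels; ++z)   // Z: one output plane per kernel
    {
        float *out = &output[z * out_h * out_w];

        for(std::size_t p = 0; p < depth; ++p)     // P: depth of the kernel
        {
            // Load the 3x3 kernel plane once per (z, p); this copy plays the
            // role of the preloaded registers and is reused for every XY.
            float k[9];
            for(std::size_t i = 0; i < 9; ++i)
            {
                k[i] = weights[(z * depth + p) * 9 + i];
            }

            const float *in = &input[p * in_h * in_w];

            for(std::size_t y = 0; y < out_h; ++y)       // Y
            {
                for(std::size_t x = 0; x < out_w; ++x)   // X
                {
                    float acc = 0.f;
                    for(std::size_t ky = 0; ky < 3; ++ky)
                    {
                        for(std::size_t kx = 0; kx < 3; ++kx)
                        {
                            acc += k[ky * 3 + kx] * in[(y + ky) * in_w + (x + kx)];
                        }
                    }

                    if(p == 0)
                    {
                        // Stage 1: plane 0 initializes the output plane.
                        out[y * out_w + x] = acc;
                    }
                    else
                    {
                        // Assumed stage 2: later planes accumulate in place.
                        out[y * out_w + x] += acc;
                    }
                }
            }
        }
    }
}
```

Because plane 0 overwrites the output rather than adding to it, the output buffer does not need to be zeroed beforehand and no temporary accumulation buffer is required, which matches the comment's claim that no additional memory is allocated.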