Fix typo (#463)

svretina · web-flow · commit 21b86bba4729 · 2023-01-26T15:44:37.000-05:00
temsor -> tensor
diff --git a/docs/src/examples/multithreading.md b/docs/src/examples/multithreading.md
@@ -228,7 +228,7 @@ Meaning we could fit three 886x886 matrices in our L2 cache by splitting them up
 
 Aside from the fact that LoopVectorization did much better than OpenBLAS--Julia's default library--over this size range, LoopVectorization's major advantage that it should perform similarly well for a wide variety of comparable operations and not just GEMM (GEneral Matrix-Matrix multiplication) specifically. GEMM has long been a motivating benchmark, as it's one of the best optimized routines available to compare against and get a sense of how well you're doing vs hand-tuned limits optimized in assembly.
 
-Because it is so well optimized, a standard trick for implementing more general optimized routines is to convert them into GEMM calls. For example, this is commonly done for temsor operations (see, e.g., [TensorOperations.jl](https://github.com/Jutho/TensorOperations.jl)) as well as for convolutions, e.g. in [NNlib](https://github.com/FluxML/NNlib.jl/blob/ca82fb23928c7ee7d08afb722718cf93be13f81c/src/impl/conv_im2col.jl#L25)'s `conv_im2col!`, their default optimized convolution function.
+Because it is so well optimized, a standard trick for implementing more general optimized routines is to convert them into GEMM calls. For example, this is commonly done for tensor operations (see, e.g., [TensorOperations.jl](https://github.com/Jutho/TensorOperations.jl)) as well as for convolutions, e.g. in [NNlib](https://github.com/FluxML/NNlib.jl/blob/ca82fb23928c7ee7d08afb722718cf93be13f81c/src/impl/conv_im2col.jl#L25)'s `conv_im2col!`, their default optimized convolution function.
 
 Lets take a look at convolutions as our next example. We create a batch of a hundred 256x256 images with 3 input channels, and convolve them with a 5x5 kernel producing 6 output channels.
 ```julia