content/learning-paths/cross-platform/sme2/4-outer-product.md
+14 -2 (14 additions & 2 deletions)
@@ -43,8 +43,11 @@ order. This means that loading row-data from memory is efficient as the memory
43
43
system operates efficiently with contiguous data. An example of this is where caches are loaded row by row, and data prefetching is simple - just load the data from ``current address + sizeof(data)``. This is not the case for loading column-data from memory though, as it requires more work from the memory system.
44
44
45
45
In order to further improve the effectiveness of the matrix multiplication, it
46
-
is therefore desirable to change the layout in memory of the left-hand side matrix, which is called ``matLeft`` in the code examples in this Learning Path, which essentially performs a matrix
47
-
transposition so that instead of loading column-data from memory, one loads row-data.
46
+
is therefore desirable to change the layout in memory of the left-hand side
47
+
matrix, which is called ``matLeft`` in the code examples in this Learning Path.
48
+
The improved layout would ensure that elements from the same column are located
49
+
next to each other in memory. This is essentially a matrix transposition,
50
+
which changes ``matLeft`` from row-major order to column-major order.
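The layout change described in the added lines can be illustrated with a minimal sketch. The function name ``transpose_to_column_major`` and its parameters are hypothetical and are not taken from the Learning Path's code; the sketch assumes an ``M`` x ``K`` matrix of ``float`` stored in row-major order and a separate destination buffer of the same size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the Learning Path's implementation):
 * copies a row-major M x K matrix `src` into `dst` so that elements
 * of the same column become contiguous, i.e. `dst` holds the same
 * matrix in column-major order. */
void transpose_to_column_major(size_t M, size_t K,
                               const float *src, float *dst)
{
    assert(src != dst); /* out-of-place only: buffers must not alias */

    for (size_t m = 0; m < M; m++) {
        for (size_t k = 0; k < K; k++) {
            /* Element (m, k): row-major index m*K + k in the source,
             * column-major index k*M + m in the destination. */
            dst[k * M + m] = src[m * K + k];
        }
    }
}
```

After this step, reading one column of the original matrix is a walk over ``M`` consecutive addresses in ``dst``, which is the contiguous access pattern the paragraph above describes as efficient for the memory system.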
48
51
49
52
{{% notice Important %}}
50
53
It is important to note here that this reorganizes the layout of the matrix in