Commit acaf479

[Matrix] Minor cleanups and fixes, add a paragraph on optimizations.
1 parent 140e08a commit acaf479

2 files changed

+79
-35
lines changed

content/learning-paths/cross-platform/matrix/3-code-1.md

Lines changed: 52 additions & 29 deletions
@@ -23,12 +23,13 @@ In the Matrix processing library, you implement two types of checks:
2323

2424
The idea here is to make the program fail in a noticeable way. Of course, in a real world application, the error should be caught and dealt with by the application, if it can. Error handling, and especially recovering from errors, can be a complex topic.
2525

26-
At the top of file `include/Matrix/Matrix.h`, include `<cassert>` to get the C-style assertions declarations for checks in `Debug` mode only:
26+
At the top of file `include/Matrix/Matrix.h`, include `<cassert>` to get the C-style assertion declarations for checks in `Debug` mode only, as well as `<cstddef>`, which provides standard C declarations such as `size_t`:
2727

2828
```CPP
2929
#pragma once
3030

3131
#include <cassert>
32+
#include <cstddef>
3233

3334
namespace MatComp {
3435
```
@@ -44,7 +45,7 @@ const Version &getVersion();
4445
/// and the EXIT_FAILURE error code. It will also print the file name (\p
4546
/// fileName) and line number (\p lineNumber) that caused that application to
4647
/// exit.
47-
[[noreturn]] void die(const char *fileName, std::size_t lineNumber,
48+
[[noreturn]] void die(const char *fileName, size_t lineNumber,
4849
const char *reason);
4950
5051
```
@@ -109,10 +110,11 @@ The Matrix data structure has the following private data members:
109110
Modern C++ offers language constructs to deal safely with memory; you will use `std::unique_ptr`, which guarantees that the Matrix class will be safe from a whole range of memory management errors.
110111

111112
Add the following includes at the top of `include/Matrix/Matrix.h`, right under
112-
the '<cassert>' include:
113+
the `<cstddef>` include:
113114

114115
```CPP
115116
#include <cassert>
117+
#include <cstddef>
116118
#include <cstring>
117119
#include <initializer_list>
118120
#include <iostream>
@@ -229,12 +231,6 @@ TEST(Matrix, defaultConstruct) {
229231
TEST(Matrix, booleanConversion) {
230232
EXPECT_FALSE(Matrix<int8_t>());
231233
EXPECT_FALSE(Matrix<double>());
232-
233-
EXPECT_TRUE(Matrix<int8_t>(1, 1));
234-
EXPECT_TRUE(Matrix<double>(1, 1));
235-
236-
EXPECT_TRUE(Matrix<int8_t>(1, 1, 1));
237-
EXPECT_TRUE(Matrix<double>(1, 1, 2.0));
238234
}
239235
```
240236

@@ -320,9 +316,9 @@ The tests should still pass, check for yourself.
320316
The next step is to be able to construct valid matrices, so add this constructor to the public section of class `Matrix` in `include/Matrix/Matrix.h`:
321317

322318
```CPP
323-
/// Construct a \p numRows x \p numColumns uninitialized Matrix
324-
Matrix(size_t numRows, size_t numColumns)
325-
: numRows(numRows), numColumns(numColumns), data() {
319+
/// Construct a \p numRows x \p numCols uninitialized Matrix
320+
Matrix(size_t numRows, size_t numCols)
321+
: numRows(numRows), numColumns(numCols), data() {
326322
allocate(getNumElements());
327323
}
328324
```
@@ -348,6 +344,17 @@ TEST(Matrix, uninitializedConstruct) {
348344
```
349345

350346
This constructs a valid `Matrix` (provided it contains elements), and the `uninitializedConstruct` test checks that two valid matrices of different types and dimensions can be constructed.
347+
You should also update the `booleanConversion` test in this file to check for boolean conversion for valid matrices so it now looks like:
348+
349+
```CPP
350+
TEST(Matrix, booleanConversion) {
351+
EXPECT_FALSE(Matrix<int8_t>());
352+
EXPECT_FALSE(Matrix<double>());
353+
354+
EXPECT_TRUE(Matrix<int8_t>(1, 1));
355+
EXPECT_TRUE(Matrix<double>(1, 1));
356+
}
357+
```
351358
352359
Compile and test again, all should pass:
353360
@@ -374,6 +381,35 @@ ninja check
374381
[ PASSED ] 4 tests.
375382
```
376383

384+
Another constructor that is missing is one that will create and initialize matrices to a known value. Let's add it to `Matrix` in `include/Matrix/Matrix.h`:
385+
386+
```CPP
387+
/// Construct a \p numRows x \p numCols Matrix with all elements
388+
/// initialized to value \p val.
389+
Matrix(size_t numRows, size_t numCols, Ty val) : Matrix(numRows, numCols) {
390+
allocate(getNumElements());
391+
for (size_t i = 0; i < getNumElements(); i++)
392+
data[i] = val;
393+
}
394+
```
395+
396+
Add boolean conversion tests for this new constructor by modifying `booleanConversion` in `tests/Matrix.cpp` so it looks like:
397+
398+
```CPP
399+
TEST(Matrix, booleanConversion) {
400+
EXPECT_FALSE(Matrix<int8_t>());
401+
EXPECT_FALSE(Matrix<double>());
402+
403+
EXPECT_TRUE(Matrix<int8_t>(1, 1));
404+
EXPECT_TRUE(Matrix<double>(1, 1));
405+
406+
EXPECT_TRUE(Matrix<int8_t>(1, 1, 1));
407+
EXPECT_TRUE(Matrix<double>(1, 1, 2.0));
408+
}
409+
```
410+
411+
You should be getting the pattern now: each new feature or method comes with tests.
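
As an illustration of the constructor delegation used above, here is a minimal, self-contained sketch. `MiniMatrix` is a hypothetical stand-in for the tutorial's `Matrix` class (it is not the library's code), reduced to the members needed to show that the value-initializing constructor can rely on the delegated-to constructor for allocation and only has to fill the elements:

```CPP
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for the tutorial's Matrix class, reduced to the parts
// needed to show constructor delegation.
template <typename Ty> class MiniMatrix {
public:
  /// Construct a numRows x numCols uninitialized matrix.
  MiniMatrix(std::size_t numRows, std::size_t numCols)
      : numRows(numRows), numColumns(numCols),
        data(new Ty[numRows * numCols]) {}

  /// Construct a numRows x numCols matrix with all elements set to val.
  /// Storage is allocated by the delegated-to constructor above.
  MiniMatrix(std::size_t numRows, std::size_t numCols, Ty val)
      : MiniMatrix(numRows, numCols) {
    for (std::size_t i = 0; i < getNumElements(); i++)
      data[i] = val;
  }

  std::size_t getNumElements() const { return numRows * numColumns; }

  /// A matrix is considered valid when it contains elements.
  explicit operator bool() const { return getNumElements() != 0; }

  /// Read the element at (row, col); bounds are checked in Debug mode only.
  const Ty &get(std::size_t row, std::size_t col) const {
    assert(row < numRows && col < numColumns && "index out of bounds");
    return data[row * numColumns + col];
  }

private:
  std::size_t numRows;
  std::size_t numColumns;
  std::unique_ptr<Ty[]> data;
};
```

With this sketch, `MiniMatrix<int> m(2, 3, 7);` yields a 2 x 3 matrix in which every element reads back as `7`.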
412+
377413
The `Matrix` class is missing two important methods:
378414
- A *getter*, to read the matrix element at (row, col).
379415
- A *setter*, to modify the matrix element at (row, col).
@@ -397,20 +433,6 @@ Add them now in the public section of `Matrix` in `include/Matrix/Matrix.h`:
397433
}
398434
```
399435
400-
Another constructor that is missing is one that will create and initialize matrices to a known value. Let's add it to `Matrix` in `include/Matrix/Matrix.h`:
401-
402-
```CPP
403-
/// Construct a \p numRows x \p numColumns Matrix with all elements
404-
/// initialized to value \p val.
405-
Matrix(size_t numRows, size_t numCols, Ty val) : Matrix(numRows, numCols) {
406-
allocate(getNumElements());
407-
for (size_t i = 0; i < getNumElements(); i++)
408-
data[i] = val;
409-
}
410-
```
411-
412-
You should be getting the pattern now.
413-
414436
Add tests for those 3 methods in `tests/Matrix.cpp`:
415437
416438
```CPP
@@ -503,7 +525,7 @@ The C++ `std::initializer_list` enables users to provide a list of literal
503525
values (in row major order) to use to initialize the matrix with:
504526

505527
```CPP
506-
/// Construct a \p numRows x \p numColumns Matrix with elements
528+
/// Construct a \p numRows x \p numCols Matrix with elements
507529
/// initialized from the values from \p il in row-major order.
508530
Matrix(size_t numRows, size_t numCols, std::initializer_list<Ty> il)
509531
: Matrix(numRows, numCols) {
@@ -862,7 +884,7 @@ ninja check
862884
[----------] 16 tests from Matrix (0 ms total)
863885
864886
[----------] Global test environment tear-down
865-
[==========] 16 tests from 3 test suites ran. (0 ms total)
887+
[==========] 16 tests from 1 test suite ran. (0 ms total)
866888
[ PASSED ] 16 tests.
867889
```
868890

@@ -941,6 +963,7 @@ Add these to the public section of `Matrix` in `include/Matrix/Matrix.h`:
941963
return false;
942964
return true;
943965
}
966+
944967
/// Returns true iff matrices do not compare equal.
945968
bool operator!=(const Matrix &rhs) const { return !(*this == rhs); }
946969
```
@@ -1069,4 +1092,4 @@ The compiler also catch a large number of type or misuse errors. With this core
10691092

10701093
You can refer to this chapter source code in
10711094
`code-examples/learning-paths/cross-platform/matrix/chapter-3` in the archive that
1072-
you have downloaded earlier.
1095+
you have downloaded earlier.

content/learning-paths/cross-platform/matrix/4-code-2.md

Lines changed: 27 additions & 6 deletions
@@ -92,7 +92,7 @@ makes them suitable for using in bigger algorithms and is a common pattern used
9292
9393
One point worth mentioning is related to the `Abs` class: depending on the type
9494
used at instantiation, the compiler selects an optimized implementation for
95-
unsigned types, and there is no need to compute the absolute value of an always
95+
unsigned types, as there is no need to compute the absolute value of an always
9696
positive value. This optimization is transparent to users.
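
One possible way to obtain this kind of compile-time selection is sketched below. `AbsSketch` is a hypothetical name used for illustration only, and the partial specialization on `std::is_unsigned` is an assumption about how such a selection could be written, not necessarily how the library's `Abs` class is implemented:

```CPP
#include <type_traits>

// Hypothetical sketch: the second template parameter is deduced from the
// element type, so users only ever write AbsSketch<Ty>.
template <typename Ty, bool IsUnsigned = std::is_unsigned<Ty>::value>
struct AbsSketch {
  // Generic version: negate negative values.
  constexpr Ty operator()(Ty v) const { return v < Ty(0) ? Ty(-v) : v; }
};

// Specialization selected for unsigned types: the value is never negative,
// so it is returned unchanged.
template <typename Ty> struct AbsSketch<Ty, true> {
  constexpr Ty operator()(Ty v) const { return v; }
};
```

Callers use `AbsSketch<int>()(-3)` and `AbsSketch<unsigned>()(5u)` in exactly the same way; the choice of implementation happens at instantiation time.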
9797
9898
Those operators are marked as `constexpr` so that the compiler can optimize the
@@ -203,7 +203,7 @@ type traits (from `<numeric_limit>`) such as `max` to get the maximum value
203203
representable for a given type.
204204

205205
As those tests have been added to a new source file, it needs to be known to the
206-
build system, so add it now to the matrix-test target in `CMakeLists.txt`:
206+
build system, so add it now to the `matrix-test` target in `CMakeLists.txt`:
207207

208208
```TXT
209209
add_executable(matrix-test tests/main.cpp
@@ -292,7 +292,7 @@ First, create a `applyEltWiseUnaryOp` helper routine in the public section of
292292
operation as follows:
293293

294294
```CPP
295-
/// Apply element wise unary scalar operator \p uOp to each element.
295+
/// Apply element wise unary scalar operator \p op to each element.
296296
template <template <typename> class uOp>
297297
Matrix &applyEltWiseUnaryOp(const uOp<Ty> &op) {
298298
static_assert(std::is_base_of<unaryOperation, uOp<Ty>>::value,
@@ -505,7 +505,7 @@ ninja check
505505
[----------] 4 tests from unaryOperator (0 ms total)
506506

507507
[----------] Global test environment tear-down
508-
[==========] 27 tests from 3 test suites ran. (0 ms total)
508+
[==========] 27 tests from 2 test suites ran. (0 ms total)
509509
[ PASSED ] 27 tests.
510510
```
511511

@@ -692,7 +692,7 @@ Add this `applyEltWiseBinaryOp` helper routine to the public section of `Matrix`
692692
in `include/Matrix/Matrix.h`:
693693

694694
```CPP
695-
/// Apply element wise binary scalar operator \p bOp to each element.
695+
/// Apply element wise binary scalar operator \p op to each element.
696696
template <template <typename> class bOp>
697697
Matrix &applyEltWiseBinaryOp(const bOp<Ty> &op, const Matrix &rhs) {
698698
static_assert(std::is_base_of<binaryOperation, bOp<Ty>>::value,
@@ -1116,6 +1116,27 @@ content.
11161116
- Resize: to be able to dynamically change a matrix dimensions.
11171117
- Extract: to be able to extract part of a matrix.
11181118

1119+
### Optimization
1120+
1121+
The code written so far is relatively high level and allows the compiler to
1122+
perform a large number of optimizations, from propagating constants to
1123+
unrolling loops, to name just a few of the most basic ones.
1124+
1125+
The `applyEltWiseUnaryOp` and `applyEltWiseBinaryOp` helper routines
1126+
from the Matrix library process one element at a time. The compiler
1127+
may make use of Arm specific SIMD (Single Instruction, Multiple Data)
1128+
instructions to process several elements at a time with one instruction. This
1129+
is an optimization named vectorization, that can either be done automatically
1130+
by the compiler (this is named *autovectorization*) or manually by the developer with the use of *intrinsic functions*.
1131+
You can learn more about the compiler's autovectorization capabilities with the
1132+
[Learn about Autovectorization](/learning-paths/cross-platform/loop-reflowing/)
1133+
learning path and about other vectorization tricks with the
1134+
[Optimize SIMD code with vectorization-friendly data layout](/learning-paths/cross-platform/vectorization-friendly-data-layout/)
1135+
learning path.
1136+
1137+
You can also learn how to
1138+
[Accelerate Matrix Multiplication Performance with SME2](/learning-paths/cross-platform/multiplying-matrices-with-sme2/).
1139+
11191140
## What have you achieved so far?
11201141

11211142
At this stage, the code structure looks like:
@@ -1153,4 +1174,4 @@ You can continue to add more functions, and more tests.
11531174

11541175
You can refer to this chapter source code in
11551176
`code-examples/learning-paths/cross-platform/matrix/chapter-4` in the archive that
1156-
you have downloaded earlier.
1177+
you have downloaded earlier.
