Commit 3fbd0dc

committed
touch up advanced guide
1 parent ca05d09 commit 3fbd0dc

1 file changed

+35
-16
lines changed

doc/advanced-user-guide.md

Lines changed: 35 additions & 16 deletions
@@ -10,7 +10,7 @@ When **aligator** is installed, the CMake configuration file (`aligatorConfig.cm
1010
Users can write an extension module in C++ for performance reasons when providing e.g. custom constraints, cost functions, dynamics, and so on.
1111

1212
The CMake function is called as follows:
13-
```cmake
13+
```{.cmake}
1414
aligator_create_python_extension(<name> [WITH_SOABI] <sources...>)
1515
```
1616

@@ -50,51 +50,70 @@ If you want to look at Eigen types such as vectors and matrices, you should look
5050

5151
### Hybrid debugging with Visual Studio Code
5252

53-
**TODO** Finish documenting this
53+
[TODO]
54+
55+
## Using **aligator**'s parallelization features
56+
57+
The `SolverProxDDP` solver is able to leverage multicore CPU architectures.
5458

55-
## Using parallel aligator & performance optimization
5659
### Inside your code
57-
Before calling the solver make sure to enable parallelization :
58-
``` python
59-
# valid in C++ or python
60+
61+
Before calling the solver make sure to enable parallelization as follows:
62+
63+
In Python:
64+
65+
```python
6066
solver.rollout_type = aligator.ROLLOUT_LINEAR
6167
solver.linear_solver_choice = aligator.LQ_SOLVER_PARALLEL
62-
solver.setNumThreads(<number of threads>)
68+
solver.setNumThreads(num_threads)
6369
```
6470

65-
### Bash setup for CPU core optimization
66-
Aligator uses OpenMP for parallelization which is setup using environment variables in your bash. The settings are local to your bash.
71+
And in C++:
72+
```cpp
73+
std::size_t num_threads = 4ul; // for example
74+
solver.rollout_type = aligator::RolloutType::LINEAR;
75+
solver.linear_solver_choice = aligator::LQSolverChoice::PARALLEL;
76+
solver.setNumThreads(num_threads);
77+
```
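The snippets above leave the value of `num_threads` to the user. As a rough sketch only (the helper `default_num_threads` and its `reserve` parameter are hypothetical and not part of **aligator**), one way to pick a value in Python is to count the CPUs the current process is allowed to run on:

```python
import os


def default_num_threads(reserve: int = 0) -> int:
    """Pick a thread count from the CPUs this process may run on.

    Hypothetical helper, not part of aligator. `reserve` is the number
    of CPUs to leave free for other work.
    """
    try:
        # Available on Linux: honours CPU affinity masks (e.g. taskset).
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS, Windows).
        available = os.cpu_count() or 1
    # Always return at least one thread.
    return max(1, available - reserve)


num_threads = default_num_threads()
```

The result counts logical CPUs, so on a machine with hyperthreading it may exceed the number of physical cores; whether that is the best setting for a given problem is something to measure.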
78+
79+
### Shell setup for CPU core optimization
80+
**Aligator** uses OpenMP for parallelization, which is configured through environment variables in your shell. These settings apply only to the shell session in which they are set.
6781

6882
#### Visualization
6983
Printing OpenMP parameters at launch:
7084
```bash
7185
export OMP_DISPLAY_ENV=VERBOSE
7286
```
73-
Prints when a thread is launched and with which affinity (CPU thread(s) on where it will try to run):
87+
Print a message when a thread is launched, along with its affinity (the CPU thread(s) it will try to run on):
7488
```bash
7589
export OMP_DISPLAY_AFFINITY=TRUE
7690
```
7791

78-
#### Core & thread assignation
79-
OpenMP operates with "**places**" that defines a CPU thread or core reserved for a thread. **Places** can be a CPU thread or an entire CPU core (composed of one or multiples threads).
92+
#### Core and thread assignment
93+
OpenMP operates with **places**: each place is a CPU thread or core reserved for one OpenMP thread. A place can be a single CPU thread or an entire CPU core (which has one thread, or several with hyperthreading).
8094

81-
##### Assigning places with CPU threads :
95+
##### Assigning places with CPU threads:
8296
```bash
8397
export OMP_PLACES="threads(n)" # Threads will run on the first n CPU threads, with one thread per CPU thread.
8498
```
8599
or
86100
```bash
87101
export OMP_PLACES="{0},{1},{2}" # Threads will run on CPU threads 0, 1, 2
88102
```
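The explicit list form can also be generated from Python. The following is a minimal sketch; the helper `places_from_cpu_ids` is hypothetical and not part of **aligator** or OpenMP. It assumes the variable is set before the OpenMP runtime starts, which in practice means before importing `aligator`, since OpenMP runtimes generally read these variables once at startup.

```python
import os


def places_from_cpu_ids(cpu_ids):
    """Format CPU thread ids as an explicit OMP_PLACES list.

    Hypothetical helper: [0, 1, 2] becomes "{0},{1},{2}".
    """
    ids = list(cpu_ids)
    if not ids:
        raise ValueError("at least one CPU id is required")
    if any(i < 0 for i in ids):
        raise ValueError("CPU ids must be non-negative")
    # One place per CPU thread, each wrapped in braces.
    return ",".join("{%d}" % i for i in ids)


# Assumption: this runs before the OpenMP runtime is initialized,
# i.e. before `import aligator`.
os.environ["OMP_PLACES"] = places_from_cpu_ids([0, 1, 2])
```

Setting the variable in the shell, as shown above, remains the simpler option; this form is mainly useful when the CPU ids are computed by a script.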
89-
##### Assigning places with CPU cores :
90-
threads will run on the first nth CPU cores, with one thread per core, even if the core has multiple threads
103+
##### Assigning places with CPU cores:
104+
105+
Threads will run on the first n CPU cores, with one thread per core, even if the core has multiple hardware threads:
91106
```bash
92107
export OMP_PLACES="cores(n)"
93108
```
94109

95110
For more info on places see [here](https://www.ibm.com/docs/en/xl-fortran-linux/16.1.0?topic=openmp-omp-places).
96111

97-
##### Using only performance cores
112+
##### Using only performance cores (Intel performance hybrid architectures)
113+
114+
Some modern CPUs have a mix of performance (P) and efficiency (E) cores. The E-cores are often slower, hence we should
115+
have OpenMP schedule threads on P-cores only.
116+
98117
Get your CPU model with
99118
```bash
100119
lscpu | grep -i "Model Name"
