Commit 03d5356

Merge pull request #377 from LouiseMsn/OpenMP-README
Adding aligator parallel + openMP documentation
2 parents 17f2f58 + 3fbd0dc commit 03d5356

4 files changed

+219
-88
lines changed


README.md

Lines changed: 10 additions & 6 deletions
@@ -96,13 +96,16 @@ cmake --build . -jNCPUS
9696

9797
Users can refer to [examples](https://github.com/Simple-Robotics/aligator/tree/main/examples) in either language to see how to build a trajectory optimization problem, create a solver instance (with parameters), and solve their problem.
9898

99-
For how to use **aligator** in CMake, including creation of a Python extension module in C++, please refer to the [developer's guide](doc/developers-guide.md).
99+
For how to use **aligator** in CMake, including creation of a Python extension module in C++, please refer to the [advanced user's guide](doc/advanced-user-guide.md).
100+
101+
### Aligator parallel & CPU optimizations
102+
Please see the [advanced user's guide](doc/advanced-user-guide.md#using-aligators-parallelization-features).
100103

101104
## Benchmarking
102105

103106
The repo [aligator-bench](https://github.com/Simple-Robotics/aligator-bench) provides a comparison of aligator against other solvers.
104107

105-
For developer info on benchmarking, see [doc/developers-guide.md](doc/developers-guide.md).
108+
For developer info on benchmarking, see [doc/advanced-user-guide.md](doc/advanced-user-guide.md).
106109

107110
## Citing aligator
108111

@@ -137,17 +140,18 @@ Please also consider citing the reference paper for the ProxDDP algorithm:
137140

138141
* [Antoine Bambade](https://bambade.github.io/) (Inria): mathematics and algorithms developer
139142
* [Justin Carpentier](https://jcarpent.github.io/) (Inria): project instructor
140-
* [Wilson Jallet](https://manifoldfr.github.io/) (Inria): main developer and manager of the project
141-
* [Sarah Kazdadi](https://github.com/sarah-ek/): linear algebra czar
143+
* [Wilson Jallet](https://manifoldfr.github.io/) (Inria): project lead and principal developer
144+
* [Sarah Kazdadi](https://github.com/sarah-ek/) (Inria): linear algebra czar
142145
* [Quentin Le Lidec](https://quentinll.github.io/) (Inria): feature developer
143146
* [Joris Vaillant](https://github.com/jorisv) (Inria): core developer
144147
* [Nicolas Mansard](https://gepettoweb.laas.fr/index.php/Members/NicolasMansard) (LAAS-CNRS): project coordinator
145148
* [Guilhem Saurel](https://github.com/nim65s) (LAAS-CNRS): core maintainer
146149
* [Fabian Schramm](https://github.com/fabinsch) (Inria): core developer
147-
* [Ludovic De Matteïs](https://github.com/LudovicDeMatteis) (LAAS-CNRS/Inria): feature developer
150+
* [Ludovic De Matteïs](https://github.com/LudovicDeMatteis) (LAAS-CNRS): feature developer
148151
* [Ewen Dantec](https://edantec.github.io/) (Inria): feature developer
149152
* [Antoine Bussy](https://github.com/antoine-bussy) (Aldebaran)
150-
* [Valentin Tordjman--Levavasseur](https://github.com/Tordjx) (Inria): feature developper
153+
* [Valentin Tordjman--Levavasseur](https://github.com/Tordjx) (Inria): feature developer
154+
* [Louise Manson](https://github.com/LouiseMsn) (Inria): documentation
151155

152156
## Acknowledgments
153157

doc/advanced-user-guide.md

Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
1+
# Developer and advanced user guide
2+
3+
When creating the CMake build, make sure to add the `-DCMAKE_EXPORT_COMPILE_COMMANDS=1` flag. See its documentation [here](https://cmake.org/cmake/help/latest/variable/CMAKE_EXPORT_COMPILE_COMMANDS.html).
4+
5+
A template project for using **aligator** with CMake and C++ can be found in the [aligator-cmake-example-project](https://github.com/Simple-Robotics/aligator-cmake-example-project) repository.
6+
7+
## Creating a Python extension module
8+
9+
When **aligator** is installed, the CMake configuration file (`aligatorConfig.cmake`) provides a CMake function to help users easily create a [Python extension module](https://docs.python.org/3/extending/extending.html).
10+
Users can write an extension module in C++ for performance reasons when providing e.g. custom constraints, cost functions, dynamics, and so on.
11+
12+
The CMake function is called as follows:
13+
```{.cmake}
14+
aligator_create_python_extension(<name> [WITH_SOABI] <sources...>)
15+
```
16+
17+
This will create a CMake `MODULE` target named `<name>` on which the user can set properties and add an `install` directive.
18+
19+
A usage example can be found in [this repo](https://github.com/Simple-Robotics/aligator-cmake-example-project).
20+
21+
## Debugging
22+
23+
### Debugging a C++ executable
24+
25+
This project builds some C++ examples and tests. Debugging them is fairly straightforward using GDB:
26+
27+
```bash
28+
gdb path/to/executable
29+
```
30+
31+
with the appropriate command-line arguments. Example binaries are placed under `build/examples`. Make sure to look at GDB's documentation.
32+
33+
If you want to catch `std::exception` instances thrown, enter the following command once in GDB:
34+
35+
```gdb
36+
(gdb) catch throw std::exception
37+
```
38+
39+
### Debugging a Python example or test
40+
41+
So that debug symbols are loaded and important variables are not optimized out, you will want to compile in `DEBUG` mode.
42+
43+
Then, you can run the module under `gdb` using
44+
45+
```bash
46+
gdb --args python example/file.py
47+
```
48+
49+
If you want to look at Eigen types such as vectors and matrices, you should look into the [`eigengdb`](https://github.com/dmillard/eigengdb) plugin for GDB.
50+
51+
### Hybrid debugging with Visual Studio Code
52+
53+
[TODO]
54+
55+
## Using **aligator**'s parallelization features
56+
57+
The `SolverProxDDP` solver is able to leverage multicore CPU architectures.
58+
59+
### Inside your code
60+
61+
Before calling the solver, make sure to enable parallelization as follows:
62+
63+
In Python:
64+
65+
```python
66+
solver.rollout_type = aligator.ROLLOUT_LINEAR
67+
solver.linear_solver_choice = aligator.LQ_SOLVER_PARALLEL
68+
solver.setNumThreads(num_threads)
69+
```
70+
71+
And in C++:
72+
```cpp
73+
std::size_t num_threads = 4ul; // for example
74+
solver.rollout_type = aligator::RolloutType::LINEAR;
75+
solver.linear_solver_choice = aligator::LQSolverChoice::PARALLEL;
76+
solver.setNumThreads(num_threads);
77+
```
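The value of `num_threads` is left to the user. One way to pick it is sketched below, assuming the set of CPUs the process is allowed to run on is a sensible upper bound. The helper `pick_num_threads` and its `reserve` and `cap` parameters are hypothetical and not part of aligator.

```python
import os
from typing import Optional


def pick_num_threads(reserve: int = 0, cap: Optional[int] = None) -> int:
    """Pick a thread count for the solver (hypothetical helper, not an aligator API)."""
    try:
        # CPUs this process is allowed to run on (available on Linux).
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        # Fallback for platforms without sched_getaffinity.
        available = os.cpu_count() or 1
    # Leave `reserve` CPUs free for the rest of the system, but keep at least one thread.
    n = max(1, available - reserve)
    if cap is not None:
        n = max(1, min(n, cap))
    return n


num_threads = pick_num_threads(reserve=1, cap=6)
print(num_threads)
# The result would then be passed on, e.g. solver.setNumThreads(num_threads)
```

Capping the count (here at 6, an arbitrary example value) is useful when only some of the cores should be used, as in the performance-core case.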
78+
79+
### Shell setup for CPU core optimization
80+
**Aligator** uses OpenMP for parallelization, which is configured through environment variables in your shell. These settings apply only to the current shell session.
81+
82+
#### Visualization
83+
Print the OpenMP parameters at launch:
84+
```bash
85+
export OMP_DISPLAY_ENV=VERBOSE
86+
```
87+
Print a message when each thread is launched, along with its affinity (the CPU thread(s) on which it will try to run):
88+
```bash
89+
export OMP_DISPLAY_AFFINITY=TRUE
90+
```
91+
92+
#### Core and thread assignment
93+
OpenMP operates with **places**: each place is a set of hardware resources on which an OpenMP thread may run. A place can be a single CPU thread or an entire CPU core (which has one thread, or several with hyperthreading).
94+
95+
##### Assigning places with CPU threads:
96+
```bash
97+
export OMP_PLACES="threads(n)"  # Threads will run on the first n CPU threads, with one OpenMP thread per CPU thread.
98+
```
99+
or
100+
```bash
101+
export OMP_PLACES="{0},{1},{2}"  # Threads will run on CPU threads 0, 1, 2
102+
```
103+
##### Assigning places with CPU cores:
104+
105+
Threads will run on the first n CPU cores, with one OpenMP thread per core, even if the core has multiple hardware threads:
106+
```bash
107+
export OMP_PLACES="cores(n)"
108+
```
109+
110+
For more info on places see [here](https://www.ibm.com/docs/en/xl-fortran-linux/16.1.0?topic=openmp-omp-places).
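Since these variables apply per process, they can also be set for a single run instead of being exported in the shell. The sketch below is one possible approach in Python. It relies on the usual behaviour that OpenMP runtimes read their environment variables once, at start-up, so the variables are set before the child process starts. The helper `openmp_env` is hypothetical, and the child command is a stand-in that only echoes the variable it received.

```python
import os
import subprocess
import sys


def openmp_env(places: str, display: bool = False) -> dict:
    """Return a copy of the current environment with OpenMP variables set."""
    env = dict(os.environ)
    env["OMP_PLACES"] = places
    if display:
        # Same variables as in the "Visualization" section.
        env["OMP_DISPLAY_ENV"] = "VERBOSE"
        env["OMP_DISPLAY_AFFINITY"] = "TRUE"
    return env


env = openmp_env("cores(4)", display=True)

# Stand-in child process: it only prints the OMP_PLACES value it was given.
# In practice this would be something like [sys.executable, "example/file.py"].
child = [sys.executable, "-c", "import os; print(os.environ['OMP_PLACES'])"]
result = subprocess.run(child, env=env, capture_output=True, text=True, check=True)
print(result.stdout.strip())
```

This leaves the parent shell untouched, which makes it easier to compare several place configurations from one script.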
111+
112+
##### Using only performance cores (Intel performance hybrid architectures)
113+
114+
Some modern CPUs have a mix of performance (P) and efficiency (E) cores. The E-cores are often slower, hence we should
115+
have OpenMP schedule threads on P-cores only.
116+
117+
Get your CPU model with
118+
```bash
119+
lscpu | grep -i "Model Name"
120+
```
121+
Get CPU core info with:
122+
```bash
123+
lscpu -e
124+
125+
# with an i7-13800H
126+
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ MHZ
127+
0 0 0 0 0:0:0:0 yes 5000.0000 400.0000 400.000
128+
1 0 0 0 0:0:0:0 yes 5000.0000 400.0000 400.000
129+
2 0 0 1 4:4:1:0 yes 5000.0000 400.0000 400.000
130+
3 0 0 1 4:4:1:0 yes 5000.0000 400.0000 400.000
131+
4 0 0 2 8:8:2:0 yes 5200.0000 400.0000 400.000
132+
5 0 0 2 8:8:2:0 yes 5200.0000 400.0000 5176.303
133+
6 0 0 3 12:12:3:0 yes 5200.0000 400.0000 1482.743
134+
7 0 0 3 12:12:3:0 yes 5200.0000 400.0000 400.000
135+
8 0 0 4 16:16:4:0 yes 5000.0000 400.0000 3485.561
136+
9 0 0 4 16:16:4:0 yes 5000.0000 400.0000 721.684
137+
10 0 0 5 20:20:5:0 yes 5000.0000 400.0000 1641.311
138+
11 0 0 5 20:20:5:0 yes 5000.0000 400.0000 400.000
139+
12 0 0 6 24:24:6:0 yes 4000.0000 400.0000 400.000
140+
13 0 0 7 25:25:6:0 yes 4000.0000 400.0000 2949.734
141+
14 0 0 8 26:26:6:0 yes 4000.0000 400.0000 2554.695
142+
15 0 0 9 27:27:6:0 yes 4000.0000 400.0000 3588.623
143+
16 0 0 10 28:28:7:0 yes 4000.0000 400.0000 400.000
144+
17 0 0 11 29:29:7:0 yes 4000.0000 400.0000 400.000
145+
18 0 0 12 30:30:7:0 yes 4000.0000 400.0000 400.000
146+
19 0 0 13 31:31:7:0 yes 4000.0000 400.0000 3610.068
147+
```
148+
A quick search of this CPU's specifications tells us that it has 6 performance cores and 8 efficiency cores, for a total of 20 threads (6 × 2 + 8). The `MAXMHZ` column is higher for cores 0 to 5 (CPUs 0 to 11): these are the performance cores. To use only performance cores on this CPU you would set:
149+
```bash
150+
export OMP_PLACES="cores(6)"
151+
# or
152+
export OMP_PLACES="threads(12)"
153+
```
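The reading of the `lscpu -e` table can also be automated. The sketch below is a rough heuristic, not a general detection method: it assumes that performance cores are exactly the CPUs whose `MAXMHZ` is above the lowest `MAXMHZ` in the table, which holds for the i7-13800H output shown above but may not hold on other machines. The function name `performance_cpus` is hypothetical, and the sample text is copied from that output.

```python
from typing import List

# Output of `lscpu -e` on the i7-13800H, copied from the listing above.
LSCPU_SAMPLE = """\
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ MHZ
0 0 0 0 0:0:0:0 yes 5000.0000 400.0000 400.000
1 0 0 0 0:0:0:0 yes 5000.0000 400.0000 400.000
2 0 0 1 4:4:1:0 yes 5000.0000 400.0000 400.000
3 0 0 1 4:4:1:0 yes 5000.0000 400.0000 400.000
4 0 0 2 8:8:2:0 yes 5200.0000 400.0000 400.000
5 0 0 2 8:8:2:0 yes 5200.0000 400.0000 5176.303
6 0 0 3 12:12:3:0 yes 5200.0000 400.0000 1482.743
7 0 0 3 12:12:3:0 yes 5200.0000 400.0000 400.000
8 0 0 4 16:16:4:0 yes 5000.0000 400.0000 3485.561
9 0 0 4 16:16:4:0 yes 5000.0000 400.0000 721.684
10 0 0 5 20:20:5:0 yes 5000.0000 400.0000 1641.311
11 0 0 5 20:20:5:0 yes 5000.0000 400.0000 400.000
12 0 0 6 24:24:6:0 yes 4000.0000 400.0000 400.000
13 0 0 7 25:25:6:0 yes 4000.0000 400.0000 2949.734
14 0 0 8 26:26:6:0 yes 4000.0000 400.0000 2554.695
15 0 0 9 27:27:6:0 yes 4000.0000 400.0000 3588.623
16 0 0 10 28:28:7:0 yes 4000.0000 400.0000 400.000
17 0 0 11 29:29:7:0 yes 4000.0000 400.0000 400.000
18 0 0 12 30:30:7:0 yes 4000.0000 400.0000 400.000
19 0 0 13 31:31:7:0 yes 4000.0000 400.0000 3610.068
"""


def performance_cpus(lscpu_text: str) -> List[int]:
    """Return the CPU ids whose MAXMHZ exceeds the lowest MAXMHZ in the table.

    Heuristic only: returns an empty list when all CPUs share the same MAXMHZ.
    """
    lines = [line.split() for line in lscpu_text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    cpu_col = header.index("CPU")
    max_col = header.index("MAXMHZ")
    max_mhz = {int(row[cpu_col]): float(row[max_col]) for row in rows}
    slowest = min(max_mhz.values())
    return sorted(cpu for cpu, mhz in max_mhz.items() if mhz > slowest)


cpus = performance_cpus(LSCPU_SAMPLE)
# Explicit place list, in the same "{0},{1},{2}" form used earlier.
places = ",".join("{%d}" % cpu for cpu in cpus)
print(places)
```

On a live machine the text could come from running `lscpu -e` through `subprocess.run`. The set of columns printed may differ between systems, which is why the sketch looks columns up by header name.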
154+
> [!IMPORTANT]
155+
> Put your PC in performance mode (usually found in the power settings).
156+
157+
## Profiling
158+
159+
We use [Google Benchmark](https://github.com/google/benchmark/tree/v1.5.0) to define C++ benchmarks
160+
which are able to aggregate data from runs, and [Flame Graphs](https://github.com/brendangregg/FlameGraph) to produce a breakdown of the various function calls and their importance as a proportion of the call stack.
161+
162+
If you have the Rust toolchain and `cargo` installed, we suggest you install [cargo-flamegraph](https://github.com/flamegraph-rs/flamegraph). Then, you can create a flame graph with the following command:
163+
164+
```bash
165+
flamegraph -o my_flamegraph.svg -- ./build/examples/example-croc-talos-arm
166+
```
167+
168+
169+
Here's Crocoddyl's flame graph:
170+
![croc-talos-arm](images/flamegraph-croc.svg)
171+
And here's the one for `aligator::SolverFDDP`:
172+
![prox-talos-arm](images/flamegraph-prox.svg)

doc/developers-guide.md

Lines changed: 0 additions & 70 deletions
This file was deleted.

doc/header.html

Lines changed: 37 additions & 12 deletions
@@ -1,65 +1,90 @@
1-
<!-- HTML header for doxygen 1.9.2-->
1+
<!-- HTML header for doxygen 1.13.2-->
22
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3-
<html xmlns="http://www.w3.org/1999/xhtml">
3+
<html xmlns="http://www.w3.org/1999/xhtml" lang="$langISO">
44
<head>
55
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
6-
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
6+
<meta http-equiv="X-UA-Compatible" content="IE=11"/>
77
<meta name="generator" content="Doxygen $doxygenversion"/>
88
<meta name="viewport" content="width=device-width, initial-scale=1"/>
99
<!--BEGIN PROJECT_NAME--><title>$projectname: $title</title><!--END PROJECT_NAME-->
1010
<!--BEGIN !PROJECT_NAME--><title>$title</title><!--END !PROJECT_NAME-->
11+
<!--BEGIN PROJECT_ICON-->
12+
<link rel="icon" href="$relpath^$projecticon" type="image/x-icon" />
13+
<!--END PROJECT_ICON-->
1114
<link href="$relpath^tabs.css" rel="stylesheet" type="text/css"/>
15+
<!--BEGIN DISABLE_INDEX-->
16+
<!--BEGIN FULL_SIDEBAR-->
17+
<script type="text/javascript">var page_layout=1;</script>
18+
<!--END FULL_SIDEBAR-->
19+
<!--END DISABLE_INDEX-->
1220
<script type="text/javascript" src="$relpath^jquery.js"></script>
1321
<script type="text/javascript" src="$relpath^dynsections.js"></script>
22+
<!--BEGIN DOX AWESOME-->
1423
<script type="text/javascript" src="$relpath^doxygen-awesome-darkmode-toggle.js"></script>
15-
<script type="text/javascript" src="$relpath^doxygen-awesome-fragment-copy-button.js"></script>
24+
<!-- <script type="text/javascript" src="$relpath^doxygen-awesome-fragment-copy-button.js"></script> -->
1625
<script type="text/javascript" src="$relpath^doxygen-awesome-interactive-toc.js"></script>
1726
<script type="text/javascript" src="$relpath^doxygen-awesome-tabs.js"></script>
1827
<script type="text/javascript">
1928
DoxygenAwesomeDarkModeToggle.init()
20-
DoxygenAwesomeFragmentCopyButton.init()
29+
// DoxygenAwesomeFragmentCopyButton.init()
2130
DoxygenAwesomeInteractiveToc.init()
2231
DoxygenAwesomeTabs.init()
2332
</script>
33+
<!--END DOX AWESOME-->
34+
<!--BEGIN COPY_CLIPBOARD-->
35+
<script type="text/javascript" src="$relpath^clipboard.js"></script>
36+
<!--END COPY_CLIPBOARD-->
2437
$treeview
2538
$search
2639
$mathjax
40+
$darkmode
2741
<link href="$relpath^$stylesheet" rel="stylesheet" type="text/css" />
2842
$extrastylesheet
2943
</head>
3044
<body>
45+
<!--BEGIN DISABLE_INDEX-->
46+
<!--BEGIN FULL_SIDEBAR-->
47+
<div id="side-nav" class="ui-resizable side-nav-resizable"><!-- do not remove this div, it is closed by doxygen! -->
48+
<!--END FULL_SIDEBAR-->
49+
<!--END DISABLE_INDEX-->
3150

3251
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
3352

3453
<!--BEGIN TITLEAREA-->
3554
<div id="titlearea">
3655
<table cellspacing="0" cellpadding="0">
3756
<tbody>
38-
<tr style="height: 56px;">
57+
<tr id="projectrow">
3958
<!--BEGIN PROJECT_LOGO-->
40-
<td id="projectlogo"><img alt="Logo" src="$relpath^$projectlogo"/></td>
59+
<td id="projectlogo"><img alt="Logo" src="$relpath^$projectlogo"$logosize/></td>
4160
<!--END PROJECT_LOGO-->
4261
<!--BEGIN PROJECT_NAME-->
43-
<td id="projectalign" style="padding-left: 0.5em;">
44-
<div id="projectname">$projectname
45-
<!--BEGIN PROJECT_NUMBER-->&#160;<span id="projectnumber">$projectnumber</span><!--END PROJECT_NUMBER-->
62+
<td id="projectalign">
63+
<div id="projectname">$projectname<!--BEGIN PROJECT_NUMBER--><span id="projectnumber">&#160;$projectnumber</span><!--END PROJECT_NUMBER-->
4664
</div>
4765
<!--BEGIN PROJECT_BRIEF--><div id="projectbrief">$projectbrief</div><!--END PROJECT_BRIEF-->
4866
</td>
4967
<!--END PROJECT_NAME-->
5068
<!--BEGIN !PROJECT_NAME-->
5169
<!--BEGIN PROJECT_BRIEF-->
52-
<td style="padding-left: 0.5em;">
70+
<td>
5371
<div id="projectbrief">$projectbrief</div>
5472
</td>
5573
<!--END PROJECT_BRIEF-->
5674
<!--END !PROJECT_NAME-->
5775
<!--BEGIN DISABLE_INDEX-->
5876
<!--BEGIN SEARCHENGINE-->
59-
<td>$searchbox</td>
77+
<!--BEGIN !FULL_SIDEBAR-->
78+
<td>$searchbox</td>
79+
<!--END !FULL_SIDEBAR-->
6080
<!--END SEARCHENGINE-->
6181
<!--END DISABLE_INDEX-->
6282
</tr>
83+
<!--BEGIN SEARCHENGINE-->
84+
<!--BEGIN FULL_SIDEBAR-->
85+
<tr><td colspan="2">$searchbox</td></tr>
86+
<!--END FULL_SIDEBAR-->
87+
<!--END SEARCHENGINE-->
6388
</tbody>
6489
</table>
6590
</div>
