Add section about multithreaded linear algebra to performance tips (#50124)
* Add section about multithreaded linear algebra to performance tips
* Mention linear algebra backends
---------
Co-authored-by: Viral B. Shah <[email protected]>
doc/src/manual/performance-tips.md
29 additions & 0 deletions
@@ -1631,3 +1631,32 @@ will not require this degree of programmer annotation to attain performance.
In the mean time, some user-contributed packages like
[FastClosures](https://github.com/c42f/FastClosures.jl) automate the
insertion of `let` statements as in `abmult3`.

## [Multithreading and linear algebra](@id man-multithreading-linear-algebra)
This section applies to multithreaded Julia code that performs linear algebra operations in each thread.
These operations involve BLAS / LAPACK calls, which are themselves multithreaded.
In this case, one must ensure that cores are not oversubscribed by the two different types of multithreading.

Julia compiles and uses its own copy of OpenBLAS for linear algebra. The number of OpenBLAS threads is controlled by the environment variable `OPENBLAS_NUM_THREADS`.
It can be set in the environment when launching Julia, or the number of threads can be changed during the Julia session with `BLAS.set_num_threads(N)` (the submodule `BLAS` is exported by `LinearAlgebra`, so it is available after `using LinearAlgebra`).
The current number of threads can be queried with `BLAS.get_num_threads()`.
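As a minimal sketch of the two calls mentioned above, the following snippet queries the BLAS thread count and then restricts BLAS to a single thread. The variable name `initial_threads` is only illustrative, and the value printed first depends on the machine and the Julia version.

```julia
using LinearAlgebra  # makes the `BLAS` submodule available

# Query how many threads the BLAS backend currently uses.
initial_threads = BLAS.get_num_threads()
println("BLAS threads at startup: ", initial_threads)

# Restrict BLAS to a single thread for the rest of the session.
BLAS.set_num_threads(1)
println("BLAS threads now: ", BLAS.get_num_threads())
```

To set the value at launch instead, a POSIX shell invocation could look like `OPENBLAS_NUM_THREADS=1 julia script.jl`, where `script.jl` is a hypothetical file name.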

When the user does not specify anything, Julia tries to choose a reasonable number of OpenBLAS threads (based, for example, on the platform and the Julia version).
However, it is generally recommended to check and set the value manually.
The OpenBLAS behavior is as follows:

* If `OPENBLAS_NUM_THREADS=1`, OpenBLAS uses the calling Julia thread(s), i.e. it "lives in" the Julia thread that runs the computation.
* If `OPENBLAS_NUM_THREADS=N` with `N>1`, OpenBLAS creates and manages its own pool of threads (`N` in total). There is just one OpenBLAS thread pool shared among all Julia threads.

When you start Julia in multithreaded mode with `JULIA_NUM_THREADS=X`, it is generally recommended to set `OPENBLAS_NUM_THREADS=1`.
Given the behavior described above, increasing the number of BLAS threads to `N>1` can very easily lead to worse performance, in particular when `N` is much smaller than `X`, because all `X` Julia threads then share a single pool of only `N` OpenBLAS threads.
However, this is just a rule of thumb, and the best way to set each number of threads is to experiment on your specific application.
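One possible way to run such an experiment is sketched below: it times the same threaded workload under a few BLAS thread counts. The function `threaded_matmuls`, the matrix size `n`, the number of products `k`, and the candidate thread counts are all assumptions chosen for illustration, and a single `@elapsed` measurement is only a rough indicator.

```julia
using LinearAlgebra
using Base.Threads: @threads, nthreads

# Hypothetical workload: each loop iteration performs one matrix product.
function threaded_matmuls(As, Bs)
    Cs = [similar(A) for A in As]
    @threads for i in eachindex(As)
        mul!(Cs[i], As[i], Bs[i])  # BLAS call issued from a Julia thread
    end
    return Cs
end

n, k = 200, 16  # matrix size and number of products (arbitrary choices)
As = [rand(n, n) for _ in 1:k]
Bs = [rand(n, n) for _ in 1:k]

original = BLAS.get_num_threads()
threaded_matmuls(As, Bs)  # warm-up run so that compilation is not timed

timings = Dict{Int,Float64}()
for nblas in unique((1, 2, original))  # candidate BLAS thread counts
    BLAS.set_num_threads(nblas)
    timings[nblas] = @elapsed threaded_matmuls(As, Bs)
    println("Julia threads = ", nthreads(), ", BLAS threads = ", nblas,
            ": ", timings[nblas], " s")
end

BLAS.set_num_threads(original)  # restore the initial setting
```

Running the script with different values of `JULIA_NUM_THREADS` and comparing the printed timings gives a first impression of which combination suits a given machine. For reliable numbers, repeated measurements on the real application are preferable.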

## [Alternative linear algebra backends](@id man-backends-linear-algebra)

As an alternative to OpenBLAS, there exist several other backends that can help with linear algebra performance.
Prominent examples include [MKL.jl](https://github.com/JuliaLinearAlgebra/MKL.jl) and [AppleAccelerate.jl](https://github.com/JuliaMath/AppleAccelerate.jl).

These are external packages, so we will not discuss them in detail here.
Please refer to their respective documentation, especially because they behave differently from OpenBLAS with respect to multithreading.
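To check which backend is active in a session, the loaded BLAS / LAPACK libraries can be listed. The sketch below assumes Julia 1.7 or later, where `BLAS.get_config()` is available. It uses no external package, so on a default installation it should list Julia's bundled OpenBLAS.

```julia
using LinearAlgebra

# List the BLAS / LAPACK libraries currently loaded in this session.
config = BLAS.get_config()
for lib in config.loaded_libs
    println(lib.libname, " (interface: ", lib.interface, ")")
end
```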