Commit 373da67

paru user guide
1 parent ba2bd76 commit 373da67

3 files changed: 8 additions, 7 deletions

ParU/Doc/ChangeLog

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-TODO, 2025: version 1.1.0
+Nov 1, 2025: version 1.1.0
 
 * performance: better detection and control of threading in the BLAS

ParU/Doc/paru_user_guide.pdf

239 Bytes
Binary file not shown.

ParU/Doc/paru_user_guide.tex

Lines changed: 7 additions & 6 deletions
@@ -1835,7 +1835,7 @@ \subsection{Using ParU in MATLAB}
 \end{itemize}
 
 %-------------------------------------------------------------------------------
-\section{Requirements and Availability}
+\section{Requirements, Availability, and Performance Considerations}
 \label{summary}
 %-------------------------------------------------------------------------------

@@ -1851,17 +1851,18 @@ \section{Requirements and Availability}
 nested parallelism, by making parallel calls to the BLAS, each of which can
 themselves be parallel. This is supported by the Intel MKL BLAS library with
 an Intel-specific method (\verb'mkl_set_num_threads_local') and by an OpenBLAS
-method (\verb'openblas_set_num_threads_local', in v0.3.27 or late) to tell the
+method (\verb'openblas_set_num_threads_local', in v0.3.27 or later) to tell the
 BLAS how many threads to use. As a result, for best performance, the Intel MKL
-BLAS or OpenBLAS (v0.3.27 or later) is required. However, do NOT use the
+BLAS or OpenBLAS (v0.3.27 or later) is required.
+If a different BLAS library is used, it should be a single-threaded library,
+since ParU will call it from many of its own threads, simultaneously.
+
+Do NOT use the
 \verb'OMP_PLACES' environment variable if using the Intel MKL BLAS. Leave it
 unset. Otherwise, using it with ParU will lead to only a single thread
 employed inside the BLAS, even when ParU expects it to be multithreaded,
 and performance will suffer as a result.
 
-If a different BLAS library is used, it should be a single-threaded library,
-since ParU will call it from many of its own threads, simultaneously.
-
 ParU also relies heavily on OpenMP tasking to factorize multiple frontal
 matrices at the same time, where each frontal matrix can also be factorized by
 multiple threads. If tasking is not available, each frontal matrix is
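
Reviewer note: the nested parallelism described in this hunk implies that each concurrently running task needs to tell the BLAS how many threads it may use. The sketch below is only an illustration of that budgeting idea, not ParU's actual policy. The helper name `blas_threads_per_task` and the even-split rule are assumptions made for this example; in a real build the resulting count would presumably be handed to the per-thread BLAS setting named in the guide (`mkl_set_num_threads_local` or `openblas_set_num_threads_local`).

```c
/* Hypothetical helper (not part of ParU): divide a total thread budget
 * evenly among tasks that are running at the same time, so that each
 * task's BLAS call is given at least one thread. */
static int blas_threads_per_task(int total_threads, int active_tasks)
{
    /* Guard against nonsensical inputs by treating them as 1. */
    if (total_threads < 1) total_threads = 1;
    if (active_tasks < 1) active_tasks = 1;

    /* Integer division rounds down; never return fewer than 1 thread. */
    int per_task = total_threads / active_tasks;
    return (per_task < 1) ? 1 : per_task;
}
```

With 16 threads and 4 concurrent tasks, this helper gives each BLAS call 4 threads. With more tasks than threads, each BLAS call runs single-threaded, which matches the guide's advice for BLAS libraries that lack a per-thread setting.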
