Commit 226399b

author
ripley
committed
update BLAS/LAPACK documentation
git-svn-id: https://svn.r-project.org/R/trunk@87892 00db46b3-68df-0310-9c12-caf00c1e9a41
1 parent 817e711 commit 226399b

2 files changed

+102
-74
lines changed

doc/NEWS.Rd

Lines changed: 36 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -205,16 +205,40 @@
205205

206206
\item New datasets \code{penguins} and \code{penguins_raw} thanks to
207207
\I{Ella Kaye}, \I{Heather Turner}, and \I{Kristen Gorman}.
208+
}
209+
}
208210

211+
\subsection{BLAS and LAPACK}{
212+
\itemize{
209213
\item The bundled BLAS and LAPACK sources have been updated to
210-
those shipped with LAPACK 3.12.1.
211-
212-
This is mainly bug fixes, but includes a handful of new ancillary
213-
routines, including two new BLAS routines \code{dgemmtr} and
214-
\code{zgemmtr} which are now used by LAPACK routines. So an
215-
external BLAS to be used with the internal LAPACK (unusual) needs
216-
to provide those routines, and if an external LAPACK is 3.12.1 or
217-
later, the BLAS used must contain the 2025 additions.
214+
those shipped as part of January 2025's LAPACK 3.12.1.
215+
216+
\item It is intended that this will be the last update to BLAS and
217+
LAPACK in the \R sources. Those building \R from source are
218+
encouraged to use external BLAS and LAPACK and this will be required in
219+
future.
220+
221+
\item This update was mainly bug fixes but contained a barely
222+
documented major change. The set of BLAS routines has been
223+
unchanged since 1988, so throughout \R's history. This update
224+
introduced two new BLAS routines \code{dgemmtr} and
225+
\code{zgemmtr} which are now used by LAPACK routines. This means
226+
that BLAS implementations are no longer interchangeable.
227+
228+
\item To work around this, \R can be configured with option
229+
\option{--with-2025blas} which arranges for the 2025 additions to
230+
be compiled into \code{libRlapack} (the internal LAPACK, not built
231+
if an external LAPACK is used).
232+
233+
This option allows the continuation of swapping the BLAS in use by
234+
symlinking \file{lib/libRblas.*}. It has the disadvantage of
235+
using the reference BLAS version of the 2025 routines whereas an
236+
enhanced BLAS might have an optimized version (\I{OpenBLAS} does as
237+
from version 0.3.29).
238+
239+
\item Windows builds use the internal LAPACK and by default the
240+
internal BLAS: notes on how to swap the latter in \file{Rblas.dll}
241+
are in file \file{src/extra/blas/Makefile.win}.
218242
}
219243
}
220244

@@ -237,7 +261,7 @@
237261
Intel\sspace{}2024.2 (and 2022.2 should).
238262

239263
Current binary distributions on macOS use Apple
240-
\command{clang}\sspace{}14 and so do not use C23.
264+
\command{clang}\sspace{}14 and so do not use C23.
241265

242266
\item The minimum \command{autoconf} requirement for a maintainer
243267
build has been increased to \command{autoconf}\sspace{}2.72.
@@ -254,11 +278,6 @@
254278
\item There is now support for installing the debug symbols for
255279
recommended packages on macOS: see \samp{REC_INSTALL_OPT} in file
256280
\file{config.site}.
257-
258-
\item There is a new \command{configure} option
259-
\option{--with-2025blas} which will compile the 2025 BLAS
260-
additions in the internal LAPACK to allow an external BLAS which
261-
lacks them to be used.
262281
}
263282
}
264283

@@ -656,9 +675,10 @@
656675
\item Many arguments which should be length-1 logical are checked
657676
more thoroughly. The most commonly seen errors are in
658677
\code{unlink(, recursive)}, \code{tempdir()} and the \code{na.rm}
659-
argument of \code{max()}, \code{min()}, \code{sum()}, \dots.
678+
arguments of \code{max()}, \code{min()}, \code{sum()}, \dots.
660679

661-
\code{grep()}, \code{strsplit()} and similar took non-\code{TRUE}
680+
\code{grep()}, \code{strsplit()} and similar took
681+
non-\code{TRUE}/\code{FALSE}
662682
values of their logical arguments as \code{FALSE}, but these were
663683
almost always mistakes and are now reported as \code{NA}.
664684
}
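
Reviewer note on the \option{--with-2025blas} entry above: the choice it describes can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `blas_configure_args` is hypothetical and not part of R, and `openblas` is just an example value for `--with-blas`. The only facts taken from the NEWS text are that an external BLAS lacking `dgemmtr` and `zgemmtr` needs `--with-2025blas` when the internal LAPACK is used.

```shell
#!/bin/sh
# Hypothetical helper (not part of R): print configure arguments for an
# external BLAS used together with R's internal LAPACK.
#   $1  value passed to --with-blas (example value: openblas)
#   $2  "yes" if that BLAS already provides dgemmtr and zgemmtr
blas_configure_args() {
    if [ "$2" = yes ]; then
        # The BLAS has the 2025 additions: nothing extra is needed.
        printf '%s\n' "--with-blas=$1"
    else
        # Ask R to compile the 2025 additions into libRlapack instead.
        printf '%s\n' "--with-blas=$1 --with-2025blas"
    fi
}
```

It might be used as `./configure $(blas_configure_args openblas no)`. The unquoted substitution relies on word splitting, so it suits only a `--with-blas` value without spaces.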

doc/manual/R-admin.texi

Lines changed: 66 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -3581,24 +3581,24 @@ these variables.
35813581
The linear algebra routines in @R{} make use of @acronym{BLAS} (Basic
35823582
Linear Algebra Subprograms, @uref{https://netlib.org/blas/faq.html})
35833583
routines, and most make use of routines from @acronym{LAPACK} (@I{Linear
3584-
Algebra PACKage}, @uref{https://netlib.org/lapack/}). The @R{}
3585-
sources contain reference (Fortran) implementations of these, but they
3586-
can be replaced by external libraries, usually those tuned for speed on
3587-
specific CPUs. These libraries normally contain all of the BLAS
3588-
routines and some tuned LAPACK routines and perhaps the rest of LAPACK
3589-
from the reference implementation. Because of the way linking works,
3590-
using an external BLAS library may necessitate using the version of
3591-
LAPACK it contains.
3584+
Algebra PACKage}, @uref{https://netlib.org/lapack/}). The @R{} sources
3585+
contain reference (Fortran) implementations of these, but they can be
3586+
replaced by external libraries, usually those tuned for speed on
3587+
specific CPUs. These libraries normally contain all of the pre-2025
3588+
BLAS routines and some tuned LAPACK routines and perhaps the rest of
3589+
LAPACK from the reference implementation. Because of the way linking
3590+
works, using an external BLAS library may necessitate using the version
3591+
of LAPACK it contains.
35923592

35933593
Note that the alternative implementations will not give identical
3594-
numeric results. Some differences may be benign (such the signs of @abbr{SVD}s
3595-
and eigenvectors), but the optimized routines can be less accurate and
3596-
(particularly for LAPACK) can be from older versions with fewer
3597-
corrections. However, @R{} relies on
3594+
numeric results.  Some differences may be benign (such as the signs of
3595+
@abbr{SVD}s and eigenvectors), but the optimized routines can be less
3596+
accurate and (particularly for LAPACK) can be from older versions with
3597+
fewer corrections. Moreover, @R{} relies on
35983598
@acronym{ISO}/@acronym{IEC}@tie{}60559 compliance. This can be broken
35993599
if for example the code assumes that terms with a zero factor are always
36003600
zero and do not need to be computed---whereas @code{x*0} can be
3601-
@code{NaN}. The internal BLAS has been extensively patched to avoid
3601+
@code{NaN}. The internal BLAS used to be extensively patched to avoid
36023602
this whereas @I{MKL}'s documentation has warned
36033603
@quotation
36043604
LAPACK routines assume that input matrices do not contain IEEE 754
@@ -3627,6 +3627,10 @@ results or segfaults.
36273627
There is a tendency for re-distributors of @R{} to use `enhanced' linear
36283628
algebra libraries without explaining their downsides.
36293629

3630+
@R{} is moving away from bundling the reference (`netlib')
3631+
implementations of LAPACK and BLAS (which is nowadays distributed as
3632+
part of LAPACK): it has been announced that version 3.12.1 included in
3633+
@R{}@tie{}4.5.0 is expected to be the last update.
36303634

36313635
@node BLAS
36323636
@subsection BLAS
@@ -3671,12 +3675,12 @@ checked. Also, the BLAS can be switched after configure is run, either
36713675
as a symbolic link or by the mechanisms mentioned below, and this can
36723676
defeat the completeness check.
36733677

3674-
Some enhanced @acronym{BLAS}es are compiler-system-specific
3675-
(@code{Accelerate} on macOS, @code{sunperf} on Solaris@footnote{Using
3676-
the Oracle Developer Studio @command{cc} and @command{f95} compilers},
3677-
@code{libessl} on IBM). The correct incantation for these is often
3678-
found @emph{via} @option{--with-blas} with no value on the appropriate
3679-
platforms.
3678+
@c Some enhanced @acronym{BLAS}es are compiler-system-specific
3679+
@c (@code{Accelerate} on macOS, @code{sunperf} on Solaris@footnote{Using
3680+
@c the Oracle Developer Studio @command{cc} and @command{f95} compilers},
3681+
@c @code{libessl} on IBM). The correct incantation for these is often
3682+
@c found @emph{via} @option{--with-blas} with no value on the appropriate
3683+
@c platforms.
36803684

36813685
Note that under Unix (but not under Windows) if @R{} is compiled against
36823686
a non-default @acronym{BLAS} and @option{--enable-BLAS-shlib} is
@@ -3690,16 +3694,17 @@ in use: Build @R{} with @option{--with-blas} to select the OS version of
36903694
the reference BLAS, and then use @command{update-alternatives} to switch
36913695
between the available BLAS libraries. See
36923696
@uref{https://wiki.debian.org/DebianScience/LinearAlgebraLibraries}.
3693-
3694-
Fedora 33 and later offer `@I{FlexiBLAS}', a similar mechanism for switching
3695-
the BLAS in use
3696-
(@uref{https://www.mpi-magdeburg.mpg.de/projects/flexiblas}). However,
3697-
rather than overriding @code{libblas}, this requires configuring @R{}
3698-
with option @option{--with-blas=flexiblas}. `Backend' wrappers are
3699-
available for the reference BLAS, ATLAS and serial, threaded and @abbr{OpenMP}
3700-
builds of @I{OpenBLAS} and @I{BLIS}, and perhaps others@footnote{for example,
3701-
Intel @I{MKL} not packaged by Fedora.}. This can be controlled from a
3702-
running @R{} session by package @CRANpkg{flexiblas}.
3697+
(ATLAS, MKL and OpenBLAS alternatives are currently available.)
3698+
3699+
Fedora 33 and later offer `@I{FlexiBLAS}', a similar mechanism for
3700+
switching the BLAS in use
3701+
(@uref{https://www.mpi-magdeburg.mpg.de/projects/flexiblas}). Rather
3702+
than overriding @code{libblas}, this requires configuring @R{} with
3703+
option @option{--with-blas=flexiblas}. `Backend' wrappers are available
3704+
for the reference BLAS, ATLAS and serial, threaded and @abbr{OpenMP}
3705+
builds of @I{OpenBLAS} and @I{BLIS}, and perhaps others@footnote{for
3706+
example, Intel @I{MKL} not packaged by Fedora.}. This can be controlled
3707+
from a running @R{} session by package @CRANpkg{flexiblas}.
37033708

37043709
BLAS implementations which use parallel computations can be
37053710
non-deterministic: this is known for ATLAS.
@@ -3709,7 +3714,11 @@ non-deterministic: this is known for ATLAS.
37093714
@subsubsection ATLAS
37103715

37113716
ATLAS (@uref{https://math-atlas.sourceforge.net/}) is a ``tuned''
3712-
@acronym{BLAS} that runs on a wide range of Unix-alike platforms.
3717+
@acronym{BLAS} that runs on a wide range of Unix-alike Intel platforms and can
3718+
be used on Windows. At the time of writing it had last been updated in
3719+
2018 (in an unreleased `developer (unstable)' branch).
3720+
@c https://sourceforge.net/p/math-atlas/mailman/math-atlas-announce/?viewmonth=201810
3721+
37133722
Unfortunately it is built by default as a static library that on some
37143723
platforms may not be able to be used with shared objects such as are
37153724
used in @R{} packages. Be careful when using pre-built versions of
@@ -3735,11 +3744,11 @@ or, as on @cputype{x86_64} Fedora where a path needs to be specified,
37353744
@end example
37363745
@noindent
37373746
Distributed ATLAS libraries cannot be tuned to your machine and so are a
3738-
compromise: for example Fedora tunes@footnote{The only way to see
3739-
exactly which CPUs the distributed libraries have been tuned for is to
3740-
read the @file{atlas.spec} file.} @cputype{x86_64} @abbr{RPM}s for CPUs with
3741-
SSE3 extensions, and separate @abbr{RPM}s may be available for specific CPU
3742-
families.
3747+
compromise: for example, when checked Fedora tuned@footnote{The only way
3748+
to see exactly which CPUs the distributed libraries have been tuned for
3749+
is to read the @file{atlas.spec} file.} @cputype{x86_64} @abbr{RPM}s for
3750+
CPUs with SSE3 extensions, and separate @abbr{RPM}s may be available for
3751+
specific CPU families.
37433752

37443753
Note that building @R{} on Linux against distributed shared libraries
37453754
may need @samp{-devel} or @samp{-dev} packages installed.
@@ -3766,16 +3775,14 @@ virtual core per physical CPU. (For the Fedora libraries the
37663775
compile-time flag specifies 4 threads.)
37673776
@c https://math-atlas.sourceforge.net/atlas_install/node21.html
37683777

3769-
ATLAS appears no longer to be under development: at the time of writing
3770-
the latest release was from 2016,
3771-
37723778
@node OpenBLAS and BLIS
37733779
@subsubsection @I{OpenBLAS} and @I{BLIS}
37743780

3775-
@I{Dr Kazushige Goto} wrote a tuned @acronym{BLAS} for several processors
3776-
and OSes, which was frozen in 2010. @I{OpenBLAS}
3777-
(@uref{https://www.openblas.net/}) is a descendant project with support
3778-
for some later CPUs.
3781+
@I{Dr Kazushige Goto} wrote a tuned @acronym{BLAS} for several
3782+
processors and OSes, which was frozen in 2010. @I{OpenBLAS}
3783+
(@uref{http://www.openmathlib.org/OpenBLAS/}) is a descendant project
3784+
with support for some later CPUs, covering some from Intel/AMD, Arm,
3785+
MIPS and RISC-V.
37793786

37803787
This can be used by configuring @R{} with something like
37813788

@@ -3785,7 +3792,7 @@ This can be used by configuring @R{} with something like
37853792

37863793
@noindent
37873794
See @pxref{Shared BLAS} for an alternative (and in many ways preferable)
3788-
way to use them.
3795+
way to use it.
37893796

37903797
Some platforms provide multiple builds of @I{OpenBLAS}: for example Fedora
37913798
has @abbr{RPM}s@footnote{(and more, e.g.@: for 64-bit ints and static versions).}
@@ -3876,8 +3883,9 @@ MKL="-L$@{MKL_LIB_PATH@} -lmkl_gf_lp64 -lmkl_core -lmkl_sequential"
38763883
The option @option{--with-lapack} is used since @I{MKL} contains a tuned
38773884
copy of LAPACK (often older than the current version) as well as the
38783885
@acronym{BLAS} (@pxref{LAPACK}). Also, it does not at the time of
3879-
writing contain the BLAS functions such as @code{dgemmtr} and so cannot
3880-
be used with LAPACK 3.12.1 included in @R{} 4.5.0 and later.
3886+
writing contain the BLAS functions such as @code{dgemmtr} and so to
3887+
be used with LAPACK 3.12.1 included in @R{} 4.5.0 and later needs @R{}
3888+
configured with @option{--with-2025blas}.
38813889

38823890
Threaded @I{MKL} may be used by replacing the line defining the variable
38833891
@code{MKL} by
@@ -3988,7 +3996,8 @@ changed in @file{@var{R_HOME}/etc/ldpaths}.
39883996

39893997
This became less easy in 2025: swapping the BLAS is only possible to
39903998
one compatible with the LAPACK in use. For the LAPACK shipped with @R{}
3991-
4.5.0 that means one containing @code{dgemmtr} and @code{zgemmtr}.
3999+
4.5.0 that means one containing @code{dgemmtr} and @code{zgemmtr}, or
4000+
configuring @R{} with @option{--with-2025blas}.
39924001
@end itemize
39934002

39944003
Another option to change the @acronym{BLAS} in use is to symlink a
@@ -4038,7 +4047,7 @@ It is assumed that @code{-llapack} is the reference LAPACK library but
40384047
on Debian/Ubuntu it can be switched, including after @R{} is installed.
40394048
On such a platform it is better to use @option{--without-lapack} or
40404049
@option{--with-blas --with-lapack} (see below) explicitly. The known
4041-
examples@footnote{ATLAS, @I{OpenBLAS} and @I{Accelerate}.} of a
4050+
examples@footnote{ATLAS, @I{MKL}, @I{OpenBLAS} and @I{Accelerate}.} of a
40424051
non-reference LAPACK library found at installation all contain BLAS
40434052
routines so are not used by a default @command{configure} run.
40444053

@@ -4075,7 +4084,7 @@ practice its main uses are without a value,
40754084

40764085
@itemize
40774086
@item
4078-
with an `enhanced' BLAS such as ATLAS, @I{FlexiBLAS}, @I{MKL} or @I{OpenBLAS} which
4087+
with an `enhanced' BLAS such as ATLAS, @I{MKL} or @I{OpenBLAS} which
40794088
contains a full LAPACK (to avoid possible conflicts), or
40804089

40814090
@item
@@ -4092,7 +4101,7 @@ If building LAPACK from its @I{Netlib} sources, be aware that @command{make}
40924101
with its supplied @file{Makefile} will make a @emph{static} library and
40934102
@R{} requires a shared/dynamic one. To get one, use @command{cmake} as
40944103
documented briefly in @file{README.md}. Something like (to build only
4095-
the double and double complex subroutines with 32-bit array indices),
4104+
the double and double complex subroutines with 32-bit array indices):
40964105

40974106
@example
40984107
mkdir build
@@ -4111,13 +4120,13 @@ make -j10
41114120
This builds the reference BLAS and the reference LAPACK linked to it.
41124121

41134122
Note that @command{cmake} files do not provide an @code{uninstall}
4114-
target, but @file{build/install_manifest.txt} is a list of the files
4123+
target, but file @file{build/install_manifest.txt} lists the files
41154124
installed, so you can remove them @emph{via} shell commands or from
41164125
@R{}.
41174126

4118-
If using @option{--with-lapack} to get a generic LAPACK (or allowing the
4119-
default to select one), consider also using @option{--with-blas} (with a
4120-
path if an enhanced BLAS is installed).
4127+
If using @option{--with-lapack} to get a reference (`generic') LAPACK
4128+
(or allowing the default to select one), consider also using
4129+
@option{--with-blas} (with a path if an enhanced BLAS is installed).
41214130

41224131

41234132
@node Caveats
@@ -5551,7 +5560,7 @@ can be used @emph{via} the configuration option
55515560
@end example
55525561

55535562
@noindent
5554-
@c LAPACK in Accelerate was still 3.2.1 in macOS 13.6 and 14
5563+
@c LAPACK in Accelerate was still 3.2.1 in macOS 15
55555564
to provide potentially higher-performance versions of the @acronym{BLAS}
55565565
and LAPACK routines.@footnote{It has been reported that for some
55575566
non-Apple toolchains @code{CPPFLAGS} needed to contain
@@ -5576,10 +5585,9 @@ BLAS and LAPACK calls, configure with
55765585
@option{--with-newAccelerate=lapack}. These options cannot be used with
55775586
others such as @option{--with-blas} and @option{--with-lapack}.
55785587

5579-
Not that none of the Accelerate distributions contain the BLAS routines
5580-
added in LAPACK 3.12.1 so cannot be used with the internal LAPACK as
5581-
from @R{}@tie{}4.5.0. They can be used with the (old) LAPACK they
5582-
contain, or with an external LAPACK 3.12.0 or earlier.
5588+
As from @R{}@tie{}4.5.0, specifying @option{--with-newAccelerate} also
5589+
requires the option @option{--with-2025blas}. (Using
5590+
@option{--with-newAccelerate=lapack} does not.)
55835591

55845592
@c https://developer.apple.com/documentation/accelerate/veclib
55855593
Threading in @I{Accelerate} is controlled by `Grand Central
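
Reviewer note: several hunks above turn on whether a given BLAS contains `dgemmtr` and `zgemmtr`. A minimal sketch of how that could be checked from a symbol listing follows. The function name `has_2025_blas` is hypothetical, and the matching assumes the common Fortran convention of lower-case names with an optional leading or trailing underscore; builds that add other suffixes (for example for 64-bit integers) would not match.

```shell
#!/bin/sh
# Hypothetical helper (not part of R): read a symbol listing on stdin
# (one symbol per line, name in the last field, as printed by nm) and
# succeed only if both 2025 BLAS routines are present.
has_2025_blas() {
    syms=$(cat)
    for routine in dgemmtr zgemmtr; do
        # Accept an optional leading/trailing underscore around the name.
        printf '%s\n' "$syms" |
            grep -Eq "(^|[[:space:]])_?${routine}_?\$" || return 1
    done
    return 0
}
```

With GNU `nm` this might be used as `nm -D --defined-only /path/to/libblas.so | has_2025_blas && echo "2025 BLAS routines found"`, where the library path is a placeholder. If the check fails, the text above indicates that R would need to be configured with `--with-2025blas` to use that BLAS with the internal LAPACK.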
