Commit fe2dbe6

Document more solvers
1 parent aeedba4 commit fe2dbe6

2 files changed: +22 -7 lines changed

docs/src/advanced/custom.md

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ interface to be easily extendable by users. To that end, the linear solve algori
 `LinearSolveFunction()` accepts a user-defined function for handling the solve. A
 user can pass in their custom linear solve function, say `my_linsolve`, to
 `LinearSolveFunction()`. A contrived example of solving a linear system with a custom solver is below.
+
 ```julia
 using LinearSolve, LinearAlgebra
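
The example continues past the hunk shown here. For context, a minimal sketch of what such a custom solve might look like; the function body and argument list are assumptions based on the surrounding text and LinearSolve's documented custom-function interface, not the literal committed example:

```julia
using LinearSolve, LinearAlgebra

# Sketch of a user-defined solve function. The argument list follows
# LinearSolve's custom-function interface (an assumption here; check the
# docs for your installed version). This contrived one just calls backslash.
function my_linsolve(A, b, u, p, newA, Pl, Pr, solverdata; verbose = true, kwargs...)
    u = A \ b
    return u
end

prob = LinearProblem(Matrix(1.0I, 4, 4), rand(4))
sol = solve(prob, LinearSolveFunction(my_linsolve))
sol.u
```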

docs/src/solvers/solvers.md

Lines changed: 21 additions & 7 deletions
@@ -13,18 +13,19 @@ more precision is necessary, `QRFactorization()` and `SVDFactorization()` are
 the best choices, with SVD being the slowest but most precise.
 
 For efficiency, `RFLUFactorization` is the fastest for dense LU-factorizations.
+`FastLUFactorization` will be faster than `LUFactorization`, which is the Base.LinearAlgebra
+(`\` default) implementation of LU factorization. `SimpleLUFactorization` will be fast
+on very small matrices.
+
 For sparse LU-factorizations, use `KLUFactorization` if there is less structure
 to the sparsity pattern and `UMFPACKFactorization` if there is more structure.
 Pardiso.jl's methods are also known to be very efficient sparse linear solvers.
 
 As sparse matrices get larger, iterative solvers tend to become more efficient than
 factorization methods if a lower tolerance of the solution is required.
 
-IterativeSolvers.jl uses a low-rank Q update in its GMRES so it tends to be
-faster than Krylov.jl for CPU-based arrays, but it's only compatible with
-CPU-based arrays while Krylov.jl is more general and will support accelerators
-like CUDA. Krylov.jl works with CPUs and GPUs and tends to be more efficient than other
-Krylov-based methods.
+Krylov.jl generally outperforms IterativeSolvers.jl and KrylovKit.jl, is compatible
+with both CPUs and GPUs, and is thus the generally preferred choice for Krylov methods.
 
 Finally, a user can pass a custom function for handling the linear solve using
 `LinearSolveFunction()` if existing solvers are not optimally suited for their application.
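
To make the recommendations in this hunk concrete, here is a hedged sketch of choosing a dense factorization versus a Krylov method through the common `solve` interface; the sizes, densities, and tolerances are illustrative assumptions, not benchmarks:

```julia
using LinearSolve, LinearAlgebra, SparseArrays

# Dense system: RFLUFactorization is the fast dense LU recommended above.
A = rand(200, 200)
b = rand(200)
sol_dense = solve(LinearProblem(A, b), RFLUFactorization())

# Large sparse system with a loose tolerance: a Krylov.jl method.
# KrylovJL_GMRES is LinearSolve's wrapper over Krylov.jl's GMRES.
n = 10_000
As = sprand(n, n, 1e-3) + 10I   # diagonal shift keeps the system well-conditioned
bs = rand(n)
sol_iter = solve(LinearProblem(As, bs), KrylovJL_GMRES(); reltol = 1e-4)
```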
@@ -83,6 +84,19 @@ LinearSolve.jl contains some linear solvers built in.
 
 - `SimpleLUFactorization`: a simple LU-factorization implementation without BLAS. Fast for small matrices.
 
+### FastLapackInterface.jl
+
+FastLapackInterface.jl is a package that provides a lower-level interface to the LAPACK
+calls, allowing workspaces to be preallocated to decrease the overhead of the wrappers.
+LinearSolve.jl wraps these routines so that an initialized solver has a non-allocating
+LU factorization. In theory, this post-initialization solve should always be faster
+than the Base.LinearAlgebra version.
+
+- `FastLUFactorization`: the `FastLapackInterface` version of the LU factorization. Notably,
+  this version does not allow for choice of pivoting method.
+- `FastQRFactorization(pivot=NoPivot(),blocksize=32)`: the `FastLapackInterface` version of
+  the QR factorization.
+
 ### SuiteSparse.jl
 
 By default, the SuiteSparse.jl solvers are implemented for efficiency by caching the
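
As a usage sketch for the non-allocating workflow described under FastLapackInterface.jl above, using LinearSolve's caching `init`/`solve!` interface; the sizes are illustrative, and whether the second solve is truly allocation-free depends on the installed version, as the text itself hedges:

```julia
using LinearSolve

A = rand(100, 100)
b = rand(100)

# init factorizes A and preallocates the LAPACK workspace once.
cache = init(LinearProblem(A, b), FastLUFactorization())
sol = solve!(cache)

# A new right-hand side reuses the factorization and the workspace.
cache.b = rand(100)
sol = solve!(cache)
```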
@@ -117,7 +131,7 @@ MKLPardisoIterate(;kwargs...) = PardisoJL(;solve_phase=Pardiso.NUM_FACT_SOLVE_RE
                                            kwargs...)
 ```
 
-The full set of keyword arguments for `PardisoJL` are:
+The full set of keyword arguments for `PardisoJL` are:
 
 ```julia
 Base.@kwdef struct PardisoJL <: SciMLLinearSolveAlgorithm
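
For completeness, the Pardiso algorithms go through the same `solve` interface as the other methods. A hedged sketch, assuming Pardiso.jl and an MKL Pardiso installation are available and with an illustrative matrix setup:

```julia
using LinearSolve, Pardiso, SparseArrays, LinearAlgebra

A = sprand(1_000, 1_000, 1e-2) + 10I
b = rand(1_000)

# MKLPardisoFactorize performs a full factorize-then-solve, while
# MKLPardisoIterate (defined above) sets solve_phase to include
# iterative refinement.
sol = solve(LinearProblem(A, b), MKLPardisoFactorize())
```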
@@ -140,7 +154,7 @@ The following are non-standard GPU factorization routines.
 !!! note
 
     Using this solver requires adding the package LinearSolveCUDA.jl
-
+
 - `CudaOffloadFactorization()`: An offloading technique used to GPU-accelerate CPU-based
   computations. Requires a sufficiently large `A` to overcome the data transfer
   costs.
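
A minimal sketch of that offloading workflow, assuming a CUDA-capable GPU and LinearSolveCUDA.jl installed; the matrix size is an illustrative guess at where the transfer costs are overcome:

```julia
using LinearSolve, LinearSolveCUDA

# A and b stay on the CPU; the factorization work is offloaded to the GPU.
A = rand(Float32, 4_000, 4_000)
b = rand(Float32, 4_000)

sol = solve(LinearProblem(A, b), CudaOffloadFactorization())
```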
