# Strip type parameters from an algorithm type, e.g. `RFLUFactorization{...}` -> `RFLUFactorization`
# (relies on a `__parameterless_type` helper assumed to be defined elsewhere in the benchmark script)
parameterless_type(::Type{T}) where {T} = __parameterless_type(T)

# Plot GFLOPs against matrix size N for the first algorithm
p = plot(ns, res[1]; ylabel = "GFLOPs", xlabel = "N", title = "GFLOPs for NxN LU Factorization",
    label = string(Symbol(parameterless_type(algs[1]))), legend = :outertopright)
`docs/src/solvers/solvers.md` (60 additions, 5 deletions)
@@ -7,15 +7,37 @@ Solves for ``Au=b`` in the problem defined by `prob` using the algorithm

## Recommended Methods

### Dense Matrices

The default algorithm `nothing` is good for picking an algorithm that will work,
but one may need to change this to receive more performance or precision. If
more precision is necessary, `QRFactorization()` and `SVDFactorization()` are
the best choices, with SVD being the slowest but most precise.
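
For example, a minimal sketch of switching between these choices (the matrix and right-hand side below are placeholders):

```julia
using LinearSolve

A = rand(4, 4)
b = rand(4)
prob = LinearProblem(A, b)

sol = solve(prob)                      # default `nothing`: LinearSolve picks a working algorithm
sol = solve(prob, QRFactorization())   # more precise than LU
sol = solve(prob, SVDFactorization())  # slowest, but the most precise
sol.u                                  # the solution vector
```
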

For efficiency, `RFLUFactorization` is the fastest for dense LU-factorizations until around
150x150 matrices, though this can be dependent on the exact details of the hardware. After this
point, `MKLLUFactorization` is usually faster on most hardware. Note that on Mac computers
`AppleAccelerateLUFactorization` is generally always the fastest. `LUFactorization` will
use your base system BLAS, which can be fast or slow depending on the hardware configuration.
`SimpleLUFactorization` will be fast only on very small matrices but can cut down on compile times.
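
As an illustration, here is a sketch of selecting among the dense LU backends mentioned above; the sizes and which backend wins depend on your hardware:

```julia
using LinearSolve

A = rand(120, 120)
b = rand(120)
prob = LinearProblem(A, b)

sol = solve(prob, RFLUFactorization())        # typically fastest below ~150x150
# For larger dense matrices, vendor BLAS-backed methods usually win:
# sol = solve(prob, MKLLUFactorization())              # MKL-backed LU
# sol = solve(prob, AppleAccelerateLUFactorization())  # Apple hardware
sol = solve(prob, SimpleLUFactorization())    # tiny matrices, minimal compile time
```
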

For very large dense factorizations, offloading to the GPU can be preferred. Metal.jl can be used
on Mac hardware to offload, and has a cutoff point of being faster at around size 20,000 x 20,000
matrices (and only supports Float32). `CudaOffloadFactorization` can be more efficient at a
much smaller cutoff, possibly around size 1,000 x 1,000 matrices, though this is highly dependent
on the chosen GPU hardware. `CudaOffloadFactorization` requires a CUDA-compatible NVIDIA GPU.
CUDA offload supports Float64, but most consumer GPU hardware will be much faster on Float32
(many are >32x faster for Float32 operations than Float64 operations), and thus for most hardware
this is only recommended for Float32 matrices.
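
A sketch of CUDA offloading, assuming CUDA.jl is installed and an NVIDIA GPU is available (the size here is illustrative only):

```julia
using LinearSolve, CUDA  # requires a CUDA-compatible NVIDIA GPU

n = 4_000
A = rand(Float32, n, n)  # Float32: most consumer GPUs are far faster in single precision
b = rand(Float32, n)
prob = LinearProblem(A, b)

sol = solve(prob, CudaOffloadFactorization())  # the factorization is offloaded to the GPU
```
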

!!! note

    Performance details for dense LU-factorizations can be highly dependent on the hardware configuration.
    For details see [this issue](https://github.com/SciML/LinearSolve.jl/issues/357).
    If one is looking to best optimize their system, we suggest running the performance
    tuning benchmark.

### Sparse Matrices

For sparse LU-factorizations, use `KLUFactorization` if there is less structure
to the sparsity pattern and `UMFPACKFactorization` if there is more structure.
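
For instance, a minimal sparse sketch (the random pattern below is only for illustration):

```julia
using LinearSolve, SparseArrays, LinearAlgebra

n = 1_000
A = sprand(n, n, 5 / n) + 10I  # an arbitrary sparse test system
b = rand(n)
prob = LinearProblem(A, b)

sol = solve(prob, KLUFactorization())      # less structured sparsity patterns
sol = solve(prob, UMFPACKFactorization())  # more structured sparsity patterns
```
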
@@ -31,12 +53,25 @@ As sparse matrices get larger, iterative solvers tend to get more efficient than
factorization methods if a lower tolerance of the solution is required.

Krylov.jl generally outperforms IterativeSolvers.jl and KrylovKit.jl, and is compatible
with CPUs and GPUs, and thus is the generally preferred form for Krylov methods. The
choice of Krylov method should be the one most constrained to the type of operator one
has: for example, use `KrylovJL_CG()` if the operator is positive definite, but
`KrylovJL_GMRES()` if it has no good properties.
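
For example (a rough sketch with placeholder matrices):

```julia
using LinearSolve, LinearAlgebra

n = 200
b = rand(n)

# Symmetric positive definite operator: CG is the most constrained choice
Aspd = let M = rand(n, n)
    M' * M + n * I
end
sol_cg = solve(LinearProblem(Aspd, b), KrylovJL_CG())

# No useful structure: fall back to GMRES
Agen = rand(n, n) + n * I
sol_gmres = solve(LinearProblem(Agen, b), KrylovJL_GMRES())
```
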

Finally, a user can pass a custom function for handling the linear solve using
`LinearSolveFunction()` if existing solvers are not optimally suited for their application.
The interface is detailed [here](@ref custom).
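
A minimal sketch of this route; the positional-argument list follows the custom-function interface documented at the link above and is assumed here, so check that page for the authoritative signature:

```julia
using LinearSolve

# A trivial "solver" that simply calls the dense backslash.
# The argument list is the custom linear-solve function interface (assumed here).
function my_linsolve(A, b, u, p, newA, Pl, Pr, solverdata; kwargs...)
    return A \ b
end

A = rand(4, 4)
b = rand(4)
sol = solve(LinearProblem(A, b), LinearSolveFunction(my_linsolve))
```
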

### Lazy SciMLOperators

If the linear operator is given as a lazy non-concrete operator, such as a `FunctionOperator`,
then using a Krylov method is preferred in order to not concretize the matrix. As above,
Krylov.jl is generally the preferred backend, and the Krylov method should be the one most
constrained to the type of operator one has: `KrylovJL_CG()` for positive definite operators,
and `KrylovJL_GMRES()` when the operator has no good properties.
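
A rough sketch of the matrix-free route; the `FunctionOperator` constructor form and calling convention used below are assumptions, so consult the SciMLOperators documentation for the exact arguments:

```julia
using LinearSolve, SciMLOperators

n = 100
b = rand(n)

# A lazy operator applying A*u = 2u without ever materializing a matrix.
# The out-of-place `(u, p, t)` calling form and the
# `FunctionOperator(op, input, output)` constructor are assumed here.
apply_A(u, p, t) = 2.0 .* u
A = FunctionOperator(apply_A, zeros(n), zeros(n))

prob = LinearProblem(A, b)
sol = solve(prob, KrylovJL_GMRES())  # the Krylov iteration never concretizes A
```
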

## Full List of Methods

### RecursiveFactorization.jl
@@ -121,6 +156,26 @@ KrylovJL
MKLLUFactorization
```

### AppleAccelerate.jl

!!! note

    Using this solver requires a Mac with Apple Accelerate. This should come standard in most "modern" Mac computers.

```@docs
AppleAccelerateLUFactorization
```
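
Usage is the same as any other dense factorization; a quick sketch, assuming a supported Mac:

```julia
using LinearSolve  # on macOS, Apple Accelerate ships with the operating system

A = rand(500, 500)
b = rand(500)
sol = solve(LinearProblem(A, b), AppleAccelerateLUFactorization())
```
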

### Metal.jl

!!! note

    Using this solver requires adding the package Metal.jl, i.e. `using Metal`. This package is only compatible with Mac M-Series computers with a Metal-compatible GPU.
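
A usage sketch, assuming Metal.jl is installed and that the Metal-backed solver is exposed as `MetalLUFactorization` (the solver name is an assumption here, not stated above):

```julia
using LinearSolve, Metal  # Metal.jl required; Apple M-series GPU

n = 10_000
A = rand(Float32, n, n)  # the Metal offload path only supports Float32
b = rand(Float32, n)
sol = solve(LinearProblem(A, b), MetalLUFactorization())  # solver name assumed
```
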