Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 12 additions & 11 deletions docs/src/tutorials/gpu.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,20 +2,20 @@

LinearSolve.jl provides two ways to GPU accelerate linear solves:

* Offloading: offloading takes a CPU-based problem and automatically transforms it into a
GPU-based problem in the background, and returns the solution on CPU. Thus using
offloading requires no change on the part of the user other than to choose an offloading
solver.
* Array type interface: the array type interface requires that the user defines the
`LinearProblem` using an `AbstractGPUArray` type and chooses an appropriate solver
(or uses the default solver). The solution will then be returned as a GPU array type.
- Offloading: offloading takes a CPU-based problem and automatically transforms it into a
GPU-based problem in the background, and returns the solution on CPU. Thus using
offloading requires no change on the part of the user other than to choose an offloading
solver.
- Array type interface: the array type interface requires that the user defines the
`LinearProblem` using an `AbstractGPUArray` type and chooses an appropriate solver
(or uses the default solver). The solution will then be returned as a GPU array type.

The offloading approach has the advantage of being simpler and requiring no change to
existing CPU code, while having the disadvantage of having more overhead. In the following
sections we will demonstrate how to use each of the approaches.

!!! warn

GPUs are not always faster! Your matrices need to be sufficiently large in order for
GPU accelerations to actually be faster. For offloading it's around 1,000 x 1,000 matrices
and for Array type interface it's around 100 x 100. For sparse matrices, it is highly
Expand Down Expand Up @@ -74,7 +74,7 @@ to return it to the CPU. This setup does no automated memory transfers and will
move things to CPU on command.

!!! warn

Many GPU functionalities, such as `CUDA.cu`, have a built-in preference for `Float32`.
Generally it is much faster to use 32-bit floating point operations on GPU than 64-bit
operations, and thus this is generally the right choice if going to such platforms.
Expand Down Expand Up @@ -118,7 +118,7 @@ sol = LS.solve(prob, LS.LUFactorization())
```

!!! note

For now, CUDSS only supports CuSparseMatrixCSR type matrices.

Note that `KrylovJL` methods also work with sparse GPU arrays:
Expand All @@ -130,7 +130,8 @@ sol = LS.solve(prob, LS.KrylovJL_GMRES())
Note that CUSPARSE also has some GPU-based preconditioners, such as a built-in `ilu`. However:

```julia
sol = LS.solve(prob, LS.KrylovJL_GMRES(precs = (A, p) -> (CUDA.CUSPARSE.ilu02!(A, 'O'), LA.I)))
sol = LS.solve(
prob, LS.KrylovJL_GMRES(precs = (A, p) -> (CUDA.CUSPARSE.ilu02!(A, 'O'), LA.I)))
```

However, right now CUSPARSE is missing the right `ldiv!` implementation for this to work
Expand Down
26 changes: 14 additions & 12 deletions docs/src/tutorials/linear.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,11 +103,11 @@ The Matrix-Free operators are provided by the [SciMLOperators.jl interface](http
For example, for the matrix `A` defined via:

```@example linsys1
A = [-2.0 1 0 0 0
1 -2 1 0 0
0 1 -2 1 0
0 0 1 -2 1
0 0 0 1 -2]
A = [-2.0 1 0 0 0
1 -2 1 0 0
0 1 -2 1 0
0 0 1 -2 1
0 0 0 1 -2]
```

We can define the `FunctionOperator` that does the `A*v` operations, without using the matrix `A`. This is done by defining
Expand All @@ -116,18 +116,18 @@ operator. See the [SciMLOperators.jl documentation](https://docs.sciml.ai/SciMLO
non-constant operators, operator algebras, and many more features). This is done by:

```@example linsys1
function Afunc!(w,v,u,p,t)
function Afunc!(w, v, u, p, t)
w[1] = -2v[1] + v[2]
for i in 2:4
w[i] = v[i-1] - 2v[i] + v[i+1]
w[i] = v[i - 1] - 2v[i] + v[i + 1]
end
w[5] = v[4] - 2v[5]
nothing
end

function Afunc!(v,u,p,t)
function Afunc!(v, u, p, t)
w = zeros(5)
Afunc!(w,v,u,p,t)
Afunc!(w, v, u, p, t)
w
end

Expand Down Expand Up @@ -159,6 +159,8 @@ mfopA * sol.u - b
```

!!! note
Note that not all methods can use a matrix-free operator. For example, `LS.LUFactorization()` requires a matrix. If you use an
invalid method, you will get an error. The methods particularly from KrylovJL are the ones preferred for these cases
(and are defaulted to).


Note that not all methods can use a matrix-free operator. For example, `LS.LUFactorization()` requires a matrix. If you use an
invalid method, you will get an error. The methods particularly from KrylovJL are the ones preferred for these cases
(and are defaulted to).
11 changes: 7 additions & 4 deletions ext/LinearSolveCUDAExt.jl
Original file line number Diff line number Diff line change
@@ -1,14 +1,17 @@
module LinearSolveCUDAExt

using CUDA
using LinearSolve: LinearSolve, is_cusparse, defaultalg, cudss_loaded, DefaultLinearSolver,
DefaultAlgorithmChoice, ALREADY_WARNED_CUDSS, LinearCache, needs_concrete_A,
error_no_cudss_lu, init_cacheval, OperatorAssumptions, CudaOffloadFactorization,
using LinearSolve: LinearSolve, is_cusparse, defaultalg, cudss_loaded, DefaultLinearSolver,
DefaultAlgorithmChoice, ALREADY_WARNED_CUDSS, LinearCache,
needs_concrete_A,
error_no_cudss_lu, init_cacheval, OperatorAssumptions,
CudaOffloadFactorization,
SparspakFactorization, KLUFactorization, UMFPACKFactorization
using LinearSolve.LinearAlgebra, LinearSolve.SciMLBase, LinearSolve.ArrayInterface
using SciMLBase: AbstractSciMLOperator

function LinearSolve.is_cusparse(A::Union{CUDA.CUSPARSE.CuSparseMatrixCSR, CUDA.CUSPARSE.CuSparseMatrixCSC})
function LinearSolve.is_cusparse(A::Union{
CUDA.CUSPARSE.CuSparseMatrixCSR, CUDA.CUSPARSE.CuSparseMatrixCSC})
true
end

Expand Down
8 changes: 4 additions & 4 deletions ext/LinearSolveEnzymeExt.jl
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
module LinearSolveEnzymeExt

using LinearSolve: LinearSolve, SciMLLinearSolveAlgorithm, init, solve!, LinearProblem,
LinearCache, AbstractKrylovSubspaceMethod, DefaultLinearSolver,
using LinearSolve: LinearSolve, SciMLLinearSolveAlgorithm, init, solve!, LinearProblem,
LinearCache, AbstractKrylovSubspaceMethod, DefaultLinearSolver,
defaultalg_adjoint_eval, solve
using LinearSolve.LinearAlgebra
using EnzymeCore
Expand Down Expand Up @@ -205,9 +205,9 @@ function EnzymeRules.augmented_primal(

cache = (copy(res.u), resvals, cachesolve, dAs, dbs)

_res = EnzymeRules.needs_primal(config) ? res : nothing
_res = EnzymeRules.needs_primal(config) ? res : nothing
_dres = EnzymeRules.needs_shadow(config) ? dres : nothing

return EnzymeRules.AugmentedReturn(_res, _dres, cache)
end

Expand Down
19 changes: 8 additions & 11 deletions ext/LinearSolveForwardDiffExt.jl
Original file line number Diff line number Diff line change
Expand Up @@ -150,26 +150,29 @@ function SciMLBase.init(
dual_type = get_dual_type(prob.b)
end

alg isa LinearSolve.DefaultLinearSolver ? real_alg = LinearSolve.defaultalg(primal_prob.A, primal_prob.b) : real_alg = alg
alg isa LinearSolve.DefaultLinearSolver ?
real_alg = LinearSolve.defaultalg(primal_prob.A, primal_prob.b) : real_alg = alg

non_partial_cache = init(
primal_prob, real_alg, assumptions, args...;
alias = alias, abstol = abstol, reltol = reltol,
maxiters = maxiters, verbose = verbose, Pl = Pl, Pr = Pr, assumptions = assumptions,
sensealg = sensealg, u0 = new_u0, kwargs...)
return DualLinearCache(non_partial_cache, dual_type, ∂_A, ∂_b, !isnothing(∂_b) ? zero.(∂_b) : ∂_b, A, b, zeros(dual_type, length(b)))
return DualLinearCache(non_partial_cache, dual_type, ∂_A, ∂_b,
!isnothing(∂_b) ? zero.(∂_b) : ∂_b, A, b, zeros(dual_type, length(b)))
end

function SciMLBase.solve!(cache::DualLinearCache, args...; kwargs...)
solve!(cache, cache.alg, args...; kwargs...)
solve!(cache, cache.alg, args...; kwargs...)
end

function SciMLBase.solve!(cache::DualLinearCache, alg::SciMLLinearSolveAlgorithm, args...; kwargs...)
function SciMLBase.solve!(
cache::DualLinearCache, alg::SciMLLinearSolveAlgorithm, args...; kwargs...)
sol,
partials = linearsolve_forwarddiff_solve(
cache::DualLinearCache, cache.alg, args...; kwargs...)
dual_sol = linearsolve_dual_solution(sol.u, partials, cache.dual_type)

if cache.dual_u isa AbstractArray
cache.dual_u[:] = dual_sol
else
Expand All @@ -192,7 +195,6 @@ function Base.setproperty!(dc::DualLinearCache, sym::Symbol, val)
setproperty!(dc.linear_cache, sym, val)
end


# Update the partials if setting A or b
if sym === :A
setfield!(dc, :dual_A, val)
Expand Down Expand Up @@ -221,8 +223,6 @@ function Base.getproperty(dc::DualLinearCache, sym::Symbol)
end
end



# Enhanced helper functions for Dual numbers to handle recursion
get_dual_type(x::Dual{T, V, P}) where {T, V <: AbstractFloat, P} = typeof(x)
get_dual_type(x::Dual{T, V, P}) where {T, V <: Dual, P} = typeof(x)
Expand All @@ -241,7 +241,6 @@ nodual_value(x::Dual{T, V, P}) where {T, V <: AbstractFloat, P} = ForwardDiff.va
nodual_value(x::Dual{T, V, P}) where {T, V <: Dual, P} = x.value # Keep the inner dual intact
nodual_value(x::AbstractArray{<:Dual}) = map(nodual_value, x)


function partials_to_list(partial_matrix::AbstractVector{T}) where {T}
p = eachindex(first(partial_matrix))
[[partial[i] for partial in partial_matrix] for i in p]
Expand All @@ -263,6 +262,4 @@ function partials_to_list(partial_matrix)
return res_list
end


end

3 changes: 2 additions & 1 deletion ext/LinearSolveRecursiveFactorizationExt.jl
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
module LinearSolveRecursiveFactorizationExt

using LinearSolve: LinearSolve, userecursivefactorization, LinearCache, @get_cacheval, RFLUFactorization
using LinearSolve: LinearSolve, userecursivefactorization, LinearCache, @get_cacheval,
RFLUFactorization
using LinearSolve.LinearAlgebra, LinearSolve.ArrayInterface, RecursiveFactorization
using SciMLBase: SciMLBase, ReturnCode

Expand Down
48 changes: 29 additions & 19 deletions ext/LinearSolveSparseArraysExt.jl
Original file line number Diff line number Diff line change
@@ -1,12 +1,15 @@
module LinearSolveSparseArraysExt

using LinearSolve: LinearSolve, BLASELTYPES, pattern_changed, ArrayInterface,
@get_cacheval, CHOLMODFactorization, GenericFactorization, GenericLUFactorization,
KLUFactorization, LUFactorization, NormalCholeskyFactorization, OperatorAssumptions,
QRFactorization, RFLUFactorization, UMFPACKFactorization, solve
@get_cacheval, CHOLMODFactorization, GenericFactorization,
GenericLUFactorization,
KLUFactorization, LUFactorization, NormalCholeskyFactorization,
OperatorAssumptions,
QRFactorization, RFLUFactorization, UMFPACKFactorization, solve
using ArrayInterface: ArrayInterface
using LinearAlgebra: LinearAlgebra, I, Hermitian, Symmetric, cholesky, ldiv!, lu, lu!, QR
using SparseArrays: SparseArrays, AbstractSparseArray, AbstractSparseMatrixCSC, SparseMatrixCSC,
using SparseArrays: SparseArrays, AbstractSparseArray, AbstractSparseMatrixCSC,
SparseMatrixCSC,
nonzeros, rowvals, getcolptr, sparse, sprand
using SparseArrays.UMFPACK: UMFPACK_OK
using Base: /, \, convert
Expand Down Expand Up @@ -107,23 +110,25 @@ function LinearSolve.init_cacheval(
alg::LUFactorization, A::AbstractSparseArray{T, Int64}, b, u,
Pl, Pr,
maxiters::Int, abstol, reltol,
verbose::Bool, assumptions::OperatorAssumptions) where {T<:BLASELTYPES}
verbose::Bool, assumptions::OperatorAssumptions) where {T <: BLASELTYPES}
if LinearSolve.is_cusparse(A)
ArrayInterface.lu_instance(A)
else
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int64}(zero(Int64), zero(Int64), [Int64(1)], Int64[], T[]))
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int64}(
zero(Int64), zero(Int64), [Int64(1)], Int64[], T[]))
end
end

function LinearSolve.init_cacheval(
alg::LUFactorization, A::AbstractSparseArray{T, Int32}, b, u,
Pl, Pr,
maxiters::Int, abstol, reltol,
verbose::Bool, assumptions::OperatorAssumptions) where {T<:BLASELTYPES}
verbose::Bool, assumptions::OperatorAssumptions) where {T <: BLASELTYPES}
if LinearSolve.is_cusparse(A)
ArrayInterface.lu_instance(A)
else
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int32}(zero(Int32), zero(Int32), [Int32(1)], Int32[], T[]))
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int32}(
zero(Int32), zero(Int32), [Int32(1)], Int32[], T[]))
end
end

Expand All @@ -140,7 +145,6 @@ function LinearSolve.init_cacheval(
maxiters::Int, abstol,
reltol,
verbose::Bool, assumptions::OperatorAssumptions)

PREALLOCATED_UMFPACK
end

Expand All @@ -156,16 +160,18 @@ function LinearSolve.init_cacheval(
alg::UMFPACKFactorization, A::AbstractSparseArray{T, Int64}, b, u,
Pl, Pr,
maxiters::Int, abstol, reltol,
verbose::Bool, assumptions::OperatorAssumptions) where {T<:BLASELTYPES}
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int64}(zero(Int64), zero(Int64), [Int64(1)], Int64[], T[]))
verbose::Bool, assumptions::OperatorAssumptions) where {T <: BLASELTYPES}
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int64}(
zero(Int64), zero(Int64), [Int64(1)], Int64[], T[]))
end

function LinearSolve.init_cacheval(
alg::UMFPACKFactorization, A::AbstractSparseArray{T, Int32}, b, u,
Pl, Pr,
maxiters::Int, abstol, reltol,
verbose::Bool, assumptions::OperatorAssumptions) where {T<:BLASELTYPES}
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int32}(zero(Int32), zero(Int32), [Int32(1)], Int32[], T[]))
verbose::Bool, assumptions::OperatorAssumptions) where {T <: BLASELTYPES}
SparseArrays.UMFPACK.UmfpackLU(SparseMatrixCSC{T, Int32}(
zero(Int32), zero(Int32), [Int32(1)], Int32[], T[]))
end

function SciMLBase.solve!(
Expand Down Expand Up @@ -197,7 +203,8 @@ function SciMLBase.solve!(
F = LinearSolve.@get_cacheval(cache, :UMFPACKFactorization)
if F.status == UMFPACK_OK
y = ldiv!(cache.u, F, cache.b)
SciMLBase.build_linear_solution(alg, y, nothing, cache; retcode = ReturnCode.Success)
SciMLBase.build_linear_solution(
alg, y, nothing, cache; retcode = ReturnCode.Success)
else
SciMLBase.build_linear_solution(
alg, cache.u, nothing, cache; retcode = ReturnCode.Infeasible)
Expand Down Expand Up @@ -236,7 +243,8 @@ function LinearSolve.init_cacheval(
maxiters::Int, abstol,
reltol,
verbose::Bool, assumptions::OperatorAssumptions)
KLU.KLUFactorization(SparseMatrixCSC{Float64, Int32}(0, 0, [Int32(1)], Int32[], Float64[]))
KLU.KLUFactorization(SparseMatrixCSC{Float64, Int32}(
0, 0, [Int32(1)], Int32[], Float64[]))
end

# TODO: guard this against errors
Expand Down Expand Up @@ -266,14 +274,15 @@ function SciMLBase.solve!(cache::LinearSolve.LinearCache, alg::KLUFactorization;
F = LinearSolve.@get_cacheval(cache, :KLUFactorization)
if F.common.status == KLU.KLU_OK
y = ldiv!(cache.u, F, cache.b)
SciMLBase.build_linear_solution(alg, y, nothing, cache; retcode = ReturnCode.Success)
SciMLBase.build_linear_solution(
alg, y, nothing, cache; retcode = ReturnCode.Success)
else
SciMLBase.build_linear_solution(
alg, cache.u, nothing, cache; retcode = ReturnCode.Infeasible)
end
end

const PREALLOCATED_CHOLMOD = cholesky(sparse(reshape([1.0],1,1)))
const PREALLOCATED_CHOLMOD = cholesky(sparse(reshape([1.0], 1, 1)))

function LinearSolve.init_cacheval(alg::CHOLMODFactorization,
A::Union{SparseMatrixCSC{T, Int}, Symmetric{T, SparseMatrixCSC{T, Int}}}, b, u,
Expand All @@ -290,7 +299,7 @@ function LinearSolve.init_cacheval(alg::CHOLMODFactorization,
maxiters::Int, abstol, reltol,
verbose::Bool, assumptions::OperatorAssumptions) where {T <:
BLASELTYPES}
cholesky(sparse(reshape([one(T)],1,1)))
cholesky(sparse(reshape([one(T)], 1, 1)))
end

function LinearSolve.init_cacheval(alg::CHOLMODFactorization,
Expand Down Expand Up @@ -368,7 +377,8 @@ function LinearSolve.init_cacheval(
nothing
end

function LinearSolve.init_cacheval(alg::QRFactorization, A::SparseMatrixCSC{Float64, <:Integer}, b, u, Pl, Pr,
function LinearSolve.init_cacheval(
alg::QRFactorization, A::SparseMatrixCSC{Float64, <:Integer}, b, u, Pl, Pr,
maxiters::Int, abstol, reltol, verbose::Bool,
assumptions::OperatorAssumptions)
ArrayInterface.qr_instance(convert(AbstractMatrix, A), alg.pivot)
Expand Down
6 changes: 3 additions & 3 deletions ext/LinearSolveSparspakExt.jl
Original file line number Diff line number Diff line change
Expand Up @@ -20,16 +20,16 @@ function LinearSolve.init_cacheval(
::SparspakFactorization, A::AbstractSparseMatrixCSC{Tv, Ti}, b, u, Pl, Pr, maxiters::Int, abstol,
reltol,
verbose::Bool, assumptions::OperatorAssumptions) where {Tv, Ti}

if size(A,1) == size(A,2)
if size(A, 1) == size(A, 2)
A = convert(AbstractMatrix, A)
if A isa SparseArrays.AbstractSparseArray
return sparspaklu(
SparseMatrixCSC{Tv, Ti}(size(A)..., getcolptr(A), rowvals(A),
nonzeros(A)),
factorize = false)
else
return sparspaklu(SparseMatrixCSC{Tv, Ti}(zero(Ti), zero(Ti), [one(Ti)], Ti[], eltype(A)[]),
return sparspaklu(
SparseMatrixCSC{Tv, Ti}(zero(Ti), zero(Ti), [one(Ti)], Ti[], eltype(A)[]),
factorize = false)
end
else
Expand Down
Loading
Loading