more docs

vpuri3 · vpuri3 · commit dd821bcd509f · 2023-06-04T20:24:16.000-04:00
diff --git a/README.md b/README.md
@@ -11,11 +11,11 @@
 [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac)
 [![SciML Code Style](https://img.shields.io/static/v1?label=code%20style&message=SciML&color=9558b2&labelColor=389826)](https://github.com/SciML/SciMLStyle)
 
-`SciMLOperators` is a package for managing linear, nonlinear, and
-time-dependent operators acting on vectors, (or column-vectors of matrices).
-We provide wrappers for matrix-free operators, fast tensor-product
-evaluations, pre-cached mutating evaluations, as well as `Zygote`-compatible
-non-mutating evaluations.
+`SciMLOperators` is a package for managing linear, nonlinear,
+time-dependent, and parameter-dependent operators acting on vectors
+(or column-vectors of matrices). We provide wrappers for matrix-free
+operators, fast tensor-product evaluations, pre-cached mutating
+evaluations, as well as `Zygote`-compatible non-mutating evaluations.
 
 The lazily implemented operator algebra allows the user to update the
 operator state by passing in an update function that accepts arbitrary
@@ -37,8 +37,8 @@ julia> Pkg.add("SciMLOperators")
 
 ## Examples
 
-Let `M`, `D`, `F` be matrix, diagonal, and function-based `SciMLOperators`
-respectively.
+Let `M`, `D`, `F` be matrix, diagonal matrix, and function-based
+`SciMLOperators` respectively.
 
 ```julia
 N = 4
@@ -52,37 +52,56 @@ F = FunctionOperator(f, zeros(N), zeros(N))
 Then, the following code just works.
 
 ```julia
-L1 = 2M + 3F + LinearAlgebra.I
+L1 = 2M + 3F + LinearAlgebra.I + rand(N, N)
 L2 = D * F * M'
 L3 = kron(M, D, F)
 L4 = M \ D
 L5 = [M; D]' * [M F; F D] * [F; D]
 ```
 
-Each `L#` can be applied to vectors of appropriate sizes:
+Each `L#` can be applied to `AbstractVector`s of appropriate sizes:
 
 ```julia
+p = nothing # parameter struct
+t = 0.0     # time
+
 u = rand(N)
-v = zeros(N)
-u_kron = rand(N ^ 3)
+v = L1(u, p, t) # == L1 * u
 
-v = L1 * u
-v_kron = L3(u_kron, p, t)
+u_kron = rand(N ^ 3)
+v_kron = L3(u_kron, p, t) # == L3 * u_kron
 ```
 
 For mutating operator evaluations, call `cache_operator` to generate
 in-place cache so the operation is nonallocating.
 
 ```julia
+α, β = rand(2)
+
 # allocate cache
 L2 = cache_operator(L2, u)
 L4 = cache_operator(L4, u)
 
 # allocation-free evaluation
-mul!(v, L2, u)
-L4(v, u, p, t)
+L2(v, u, p, t) # == mul!(v, L2, u)
+L4(v, u, p, t, α, β) # == mul!(v, L4, u, α, β)
 ```
 
+The calling signature `L(u, p, t)` for out-of-place evaluations is
+equivalent to `L * u`, and the in-place evaluation `L(v, u, p, t, args...)`
+is equivalent to `LinearAlgebra.mul!(v, L, u, args...)`, where the arguments
+`p, t` are passed to `L` to update its state. More details are provided
+in the operator update section below. While overloads to `Base.*`
+and `LinearAlgebra.mul!` are available, where a `SciMLOperator` behaves
+like an `AbstractMatrix`, we recommend sticking with the
+`L(u, p, t)`, `L(v, u, p, t)`, `L(v, u, p, t, α, β)` calling signatures,
+as these update the operator state internally before evaluating.
+
+The `(u, p, t)` calling signature is standardized over the `SciML`
+ecosystem and is flexible enough to support use cases such as time-evolution
+in ODEs, as well as sensitivity computation with respect to the parameter
+object `p`.
+
 Thanks to overloads defined for evaluation methods and traits in
 `Base`, `LinearAlgebra`, the behaviour of a `SciMLOperator` is
 indistinguishable from an `AbstractMatrix`. These operators can be
@@ -94,8 +113,120 @@ interface include, but are not limited, the following:
 - `LinearAlgebra: mul!, ldiv!, lmul!, rmul!, factorize, issymmetric, ishermitian, isposdef`
 - `SparseArrays: sparse, issparse`
 
+## Multidimensional arrays and batching
+
+`SciMLOperator`s can also be applied to `AbstractMatrix` subtypes, in which
+case the operator is evaluated column-wise.
+
+```julia
+K = 10
+u_mat = rand(N, K)
+
+v_mat = F(u_mat, p, t) # == F * u_mat
+size(v_mat) == (N, K) # true
+```
+
+`L#` can also be applied to `AbstractArray`s that are not
+`AbstractVecOrMat`s so long as their size in the first dimension is appropriate
+for matrix-multiplication. Internally, a `SciMLOperator` reshapes a
+multidimensional array to an `AbstractMatrix`, and applies the operator via
+matrix-multiplication.
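+
+As a rough illustration of the reshaping described above, the following
+sketch uses only `Base` arrays. The dense matrix `A` is a hypothetical
+stand-in for an operator and is not part of the `SciMLOperators` API; the
+package's internal implementation may differ.
+
+```julia
+N = 4
+A = rand(N, N)        # hypothetical stand-in for an operator
+u_arr = rand(N, 3, 2) # 3-dimensional array with leading size N
+
+# flatten trailing dimensions, multiply, and restore the original shape
+u_mat = reshape(u_arr, N, :) # size (N, 6)
+v_arr = reshape(A * u_mat, size(u_arr))
+
+# every slice along the first dimension is transformed independently
+v_arr[:, 2, 1] ≈ A * u_arr[:, 2, 1] # true
+```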
+
 ## Operator update
 
+This package can also be used to write time-dependent and
+parameter-dependent operators, whose state can be updated by
+a user-defined function.
+The updates can be done in-place, i.e. by mutating the object,
+or out-of-place, i.e. in a non-mutating, `Zygote`-compatible way.
+
+For example,
+
+```julia
+u = rand(N)
+p = rand(N)
+t = rand()
+
+# out-of-place update
+mat_update_func = (A, u, p, t) -> t * (p * p')
+sca_update_func = (a, u, p, t) -> t * sum(p)
+
+M = MatrixOperator(zeros(N, N); update_func = mat_update_func)
+α = ScalarOperator(zero(Float64); update_func = sca_update_func)
+
+L = α * M
+L = cache_operator(L, u)
+
+# L is initialized with zero state
+L * u == zeros(N) # true
+
+# update operator state with `(u, p, t)`
+L = update_coefficients(L, u, p, t)
+# and multiply
+L * u != zeros(N) # true
+
+# updates state and evaluates L at (u, p, t)
+L(u, p, t) != zeros(N) # true
+```
+
+The out-of-place evaluation function `L(u, p, t)` calls
+`update_coefficients` under the hood, which recursively calls
+the `update_func` for each component `SciMLOperator`.
+Therefore the out-of-place evaluation function is equivalent to
+calling `update_coefficients` followed by `Base.*`. Notice that
+the out-of-place evaluation does not return the updated operator.
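+
+The equivalence stated above can be checked with a small sketch. It assumes
+the `L(u, p, t)` calling convention and the `update_coefficients` behaviour
+described in this README, so treat the commented results as expectations
+under those assumptions rather than guarantees.
+
+```julia
+using SciMLOperators
+
+N = 4
+u, p, t = rand(N), rand(N), rand()
+
+M = MatrixOperator(zeros(N, N); update_func = (A, u, p, t) -> t * (p * p'))
+
+v1 = M(u, p, t)                          # update and evaluate in one call
+v2 = update_coefficients(M, u, p, t) * u # explicit update, then multiply
+
+v1 ≈ v2           # expected to hold
+M * u == zeros(N) # expected to hold: `M` itself is not updated
+```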
+
+On the other hand, the in-place evaluation function, `L(v, u, p, t)`,
+mutates `L`, and is equivalent to calling `update_coefficients!`
+followed by `mul!`. The in-place update behaviour works the same way
+with a few `!`s appended here and there. For example,
+
+```julia
+v = rand(N)
+u = rand(N)
+p = rand(N)
+t = rand()
+
+# in-place update
+_A = rand(N, N)
+_d = rand(N)
+mat_update_func!  = (A, u, p, t) -> (copy!(A, _A); lmul!(t, A); nothing)
+diag_update_func! = (diag, u, p, t) -> copy!(diag, _d)
+
+M = MatrixOperator(zeros(N, N); update_func! = mat_update_func!)
+D = DiagonalOperator(zeros(N); update_func! = diag_update_func!)
+
+L = D * M
+L = cache_operator(L, u)
+
+# L is initialized with zero state
+L * u == zeros(N) # true
+
+# update L in-place
+update_coefficients!(L, u, p, t)
+# and multiply
+mul!(v, L, u) != zeros(N) # true
+
+# updates L in-place, and evaluates at (u, p, t)
+L(v, u, p, t) != zeros(N) # true
+```
+
+The update behaviour makes this package flexible enough to be used
+in `OrdinaryDiffEq`. As the parameter object `p` is often reserved
+for sensitivity computation via automatic-differentiation, a user may
+prefer to pass in state information via other arguments. For that
+reason, we allow for update functions with arbitrary keyword arguments.
+
+```julia
+mat_update_func = (A, u, p, t; scale = 0.0) -> scale * (p * p')
+
+M = MatrixOperator(zeros(N, N); update_func = mat_update_func,
+                   accepted_kwargs = (:scale,))
+
+M(u, p, t) == zeros(N) # true
+M(u, p, t; scale = 1.0) != zeros(N) # true
+```
+
 ## Features
 
 * Matrix-free operators with `FunctionOperator`
@@ -143,9 +274,9 @@ Some packages providing similar functionality are
 ## Interoperability and extended Julia ecosystem
 
 `SciMLOperators.jl` overloads the `AbstractMatrix` interface for
-`AbstractSciMLOperator`s, here allowing seamless compatibility with
+`AbstractSciMLOperator`s, allowing seamless compatibility with
 linear solves, and nonlinear solvers. Further, due to the update functionality,
-`AbstractSciMLOperato`s can represent an `ODEFunction` in `OrdinaryDiffEq.jl`,
+`AbstractSciMLOperator`s can represent an `ODEFunction` in `OrdinaryDiffEq.jl`,
 and downstream packages. See the tutorials for examples of usage with
 `OrdinaryDiffEq.jl`, `LinearSolve.jl`, `NonlinearSolve.jl`.
 
@@ -157,7 +288,7 @@ An example of `Zygote.jl` usage with
 [`Lux.jl`](https://github.com/LuxDL/Lux.jl) is also provided in the tutorials.
 
 Please make an issue [here](https://github.com/SciML/SciMLOperators.jl/issues)
-if you come across an unexpected issue using `SciMLOperator`
+if you come across an unexpected issue while using `SciMLOperators`.
 
 We provide below a list of packages that make use of `SciMLOperators`.
 If you are using `SciMLOperators` in your work, feel free to create a PR
@@ -190,4 +321,3 @@ and add your package to this list.
     - [JuliaDiffEq](https://gitter.im/JuliaDiffEq/Lobby) on Gitter
     - On the Julia Discourse forums (look for the [modelingtoolkit tag](https://discourse.julialang.org/tag/modelingtoolkit))
     - See also [SciML Community page](https://sciml.ai/community/)
-
diff --git a/src/matrix.jl b/src/matrix.jl
@@ -1,12 +1,60 @@
 #
 """
-    MatrixOperator(A; [update_func, update_func!, accepted_kwargs])
+$SIGNATURES
 
-Represents a time-dependent linear operator given by an AbstractMatrix. The
-update function is called by `update_coefficients!` and is assumed to have
-the following signature:
+Represents a linear operator given by an `AbstractMatrix` that may be
+applied to an `AbstractVecOrMat`. Its state is updated by the user-provided
+`update_func` during operator evaluation (`L([v,], u, p, t)`), or by calls
+to `update_coefficients[!](L, u, p, t)`. Both recursively call the
+update function, `update_func`, which is assumed to have the signature
 
-    update_func(A::AbstractMatrix,u,p,t; <accepted kwargs>) -> [modifies A]
+    update_func(A::AbstractMatrix, u, p, t; <accepted kwargs>) -> newA
+or
+    update_func!(A::AbstractMatrix, u, p, t; <accepted kwargs>) -> [modifies A]
+
+The set of keyword-arguments accepted by `update_func[!]` must be provided
+to `MatrixOperator` via the kwarg `accepted_kwargs` as a tuple of `Symbol`s.
+`kwargs` cannot be passed down to `update_func[!]` if `accepted_kwargs`
+is not provided.
+
+$(UPDATE_COEFFS_WARNING)
+
+# Interface
+
+Lazy matrix algebra is defined for `AbstractSciMLOperator`s. The interface
+supports lazy addition, subtraction, multiplication, inversion,
+adjoints, and transposes.
+
+# Example
+
+```
+u = rand(4)
+p = rand(4, 4)
+t = rand()
+
+mat_update = (A, u, p, t; scale = 0.0) -> scale * t * p
+M = MatrixOperator(zeros(4, 4); update_func = mat_update, accepted_kwargs = (:scale,))
+
+L = M * M + 3I
+L = cache_operator(L, u)
+
+# update L and evaluate
+v = L(u, p, t; scale = 1.0)
+```
+
+```
+v = zeros(4)
+u = rand(4)
+p = rand(4, 4) # matrix copied into `A` by the update function
+t = rand()
+
+mat_update! = (A, u, p, t; scale = 0.0) -> (copy!(A, p); lmul!(scale * t, A))
+M = MatrixOperator(zeros(4, 4); update_func! = mat_update!, accepted_kwargs = (:scale,))
+L = M * M + 3I
+L = cache_operator(L, u) # allocate cache for in-place evaluation
+
+# update L in-place and evaluate
+L(v, u, p, t; scale = 1.0)
+```
 """
 struct MatrixOperator{T,AT<:AbstractMatrix{T},F,F!} <: AbstractSciMLOperator{T}
     A::AT
@@ -135,22 +183,37 @@ LinearAlgebra.ldiv!(v::AbstractVecOrMat, L::MatrixOperator, u::AbstractVecOrMat)
 LinearAlgebra.ldiv!(L::MatrixOperator, u::AbstractVecOrMat) = ldiv!(L.A, u)
 
 """
-    DiagonalOperator(diag; [update_func, update_func!, accepted_kwargs])
-
-Represents a time-dependent elementwise scaling (diagonal-scaling) operation.
-The update function is called by `update_coefficients!` and is assumed to have
-the following signature:
+$SIGNATURES
 
-    update_func(diag::AbstractVector,u,p,t; <accepted kwargs>) -> [modifies diag]
-
-When `diag` is an `AbstractVector` of length N, `L=DiagonalOpeator(diag, ...)`
-can be applied to `AbstractArray`s with `size(u, 1) == N`. Each column of the `u`
-will be scaled by `diag`, as in `LinearAlgebra.Diagonal(diag) * u`.
+Represents an elementwise scaling (diagonal-scaling) operation that may
+be applied to an `AbstractVecOrMat`. When `diag` is an `AbstractVector`
+of length N, `L = DiagonalOperator(diag, ...)` can be applied to
+`AbstractArray`s with `size(u, 1) == N`. Each column of `u` will be
+scaled by `diag`, as in `LinearAlgebra.Diagonal(diag) * u`.
 
 When `diag` is a multidimensional array, `L = DiagonalOperator(diag, ...)` forms
 an operator of size `(N, N)` where `N = size(diag, 1)` is the leading length of `diag`.
 `L` then is the elementwise-scaling operation on arrays of `length(u) = length(diag)`
 with leading length `size(u, 1) = N`.
+
+Its state is updated by the user-provided `update_func` during operator
+evaluation (`L([v,], u, p, t)`), or by calls to
+`update_coefficients[!](L, u, p, t)`. Both recursively call the
+update function, `update_func`, which is assumed to have the signature
+
+    update_func(diag::AbstractVecOrMat, u, p, t; <accepted kwargs>) -> new_diag
+or
+    update_func!(diag::AbstractVecOrMat, u, p, t; <accepted kwargs>) -> [modifies diag]
+
+The set of keyword-arguments accepted by `update_func[!]` must be provided
+to `DiagonalOperator` via the kwarg `accepted_kwargs` as a tuple of `Symbol`s.
+`kwargs` cannot be passed down to `update_func[!]` if `accepted_kwargs`
+is not provided.
+
+$(UPDATE_COEFFS_WARNING)
+
+# Example
+
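+A minimal sketch, assuming the out-of-place `update_func` signature described
+above and a vector-valued `diag`; the name `diag_update` is illustrative only.
+
+```julia
+using SciMLOperators
+
+u = rand(4)
+p = rand(4)
+t = rand()
+
+# return a new diagonal computed from the parameters and time
+diag_update = (diag, u, p, t) -> t * p
+D = DiagonalOperator(zeros(4); update_func = diag_update)
+
+# update D and evaluate; expected to match `(t * p) .* u`
+v = D(u, p, t)
+```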
 """
 function DiagonalOperator(diag::AbstractVector;
                           update_func = DEFAULT_UPDATE_FUNC,
diff --git a/src/scalar.jl b/src/scalar.jl