Start moving termination conditions to solve kwargs

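`termination_condition` is no longer a field of `NewtonRaphson`; it is now passed as a keyword to `solve`/`init`, so `solve(prob, NewtonRaphson(; termination_condition))` becomes `solve(prob, NewtonRaphson(); termination_condition)`. The tolerance precedence that `_init_termination_elements` applies can be sketched in plain Julia as follows. This is a simplified illustration rather than the package code: `ToyCondition` and `resolve_tolerances` are hypothetical names, and the real helper also returns the termination condition (building a default one when none is given).

```julia
# Hypothetical stand-in for a termination condition that carries tolerances.
struct ToyCondition
    abstol::Union{Nothing, Float64}
    reltol::Union{Nothing, Float64}
end

# Same fallback rule as `_get_tolerance` in src/utils.jl:
# keyword value first, then the condition's value, then eps(T)^(4/5).
function get_tolerance(η, tc_η, ::Type{T}) where {T}
    fallback_η = real(oneunit(T)) * (eps(real(one(T))))^(4 // 5)
    return η !== nothing ? η : (tc_η !== nothing ? tc_η : fallback_η)
end

# Hypothetical, reduced version of the tolerance handling: returns only
# the resolved (abstol, reltol) pair.
function resolve_tolerances(abstol, reltol, tc, ::Type{T}) where {T}
    if tc !== nothing
        # A keyword tolerance must agree with the one stored in the condition.
        if abstol !== nothing && abstol != tc.abstol
            error("Incompatible absolute tolerances found.")
        end
        if reltol !== nothing && reltol != tc.reltol
            error("Incompatible relative tolerances found.")
        end
        return get_tolerance(abstol, tc.abstol, T), get_tolerance(reltol, tc.reltol, T)
    end
    # No condition supplied: use keyword values, else the eltype-based default.
    return get_tolerance(abstol, nothing, T), get_tolerance(reltol, nothing, T)
end

tc = ToyCondition(1e-8, nothing)
atol, rtol = resolve_tolerances(nothing, nothing, tc, Float64)
```

Here `atol` is taken from the condition and `rtol` falls back to the element-type default. Passing a keyword tolerance that differs from the one stored in the condition raises an error instead of silently picking one.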
utkarsh530 · utkarsh530 · commit 1c5bb596a8a6 · 2023-10-25T17:28:28.000-04:00
diff --git a/docs/pages.jl b/docs/pages.jl
@@ -2,8 +2,7 @@
 
 pages = ["index.md",
     "Getting Started with Nonlinear Rootfinding in Julia" => "tutorials/getting_started.md",
-    "Tutorials" => Any[
-        "Code Optimization for Small Nonlinear Systems" => "tutorials/code_optimization.md",
+    "Tutorials" => Any["Code Optimization for Small Nonlinear Systems" => "tutorials/code_optimization.md",
         "Handling Large Ill-Conditioned and Sparse Systems" => "tutorials/large_systems.md",
         "Symbolic System Definition and Acceleration via ModelingToolkit" => "tutorials/modelingtoolkit.md",
         "tutorials/small_compile.md",
diff --git a/docs/src/tutorials/code_optimization.md b/docs/src/tutorials/code_optimization.md
@@ -34,7 +34,7 @@ Take for example a prototypical small nonlinear solver code in its out-of-place
 ```@example small_opt
 using NonlinearSolve
 
-function f(u, p) 
+function f(u, p)
     u .* u .- p
 end
 u0 = [1.0, 1.0]
@@ -54,7 +54,7 @@ using BenchmarkTools
 Note that this way of writing the function is a shorthand for:
 
 ```@example small_opt
-function f(u, p) 
+function f(u, p)
     [u[1] * u[1] - p, u[2] * u[2] - p]
 end
 ```
@@ -69,25 +69,25 @@ In order to avoid this issue, we can use a non-allocating "in-place" approach. W
 this looks like:
 
 ```@example small_opt
-function f(du, u, p) 
+function f(du, u, p)
     du[1] = u[1] * u[1] - p
     du[2] = u[2] * u[2] - p
     nothing
 end
 
 prob = NonlinearProblem(f, u0, p)
-@btime  sol = solve(prob, NewtonRaphson())
+@btime sol = solve(prob, NewtonRaphson())
 ```
 
 Notice how much faster this already runs! We can make this code even simpler by using
 the `.=` in-place broadcasting.
 
 ```@example small_opt
-function f(du, u, p) 
+function f(du, u, p)
     du .= u .* u .- p
 end
 
-@btime  sol = solve(prob, NewtonRaphson())
+@btime sol = solve(prob, NewtonRaphson())
 ```
 
 ## Further Optimizations for Small Nonlinear Solves with Static Arrays and SimpleNonlinearSolve
@@ -140,7 +140,7 @@ want to use the out-of-place allocating form, but this time we want to output
 a static array. Doing it with broadcasting looks like:
 
 ```@example small_opt
-function f_SA(u, p) 
+function f_SA(u, p)
     u .* u .- p
 end
 u0 = SA[1.0, 1.0]
@@ -153,7 +153,7 @@ Note that only change here is that `u0` is made into a StaticArray! If we needed
 for a more complex nonlinear case, then we'd simply do the following:
 
 ```@example small_opt
-function f_SA(u, p) 
+function f_SA(u, p)
     SA[u[1] * u[1] - p, u[2] * u[2] - p]
 end
 
@@ -170,4 +170,4 @@ which are designed for these small-scale static examples. Let's now use `SimpleN
 @btime solve(prob, SimpleNewtonRaphson())
 ```
 
-And there we go, around 100ns from our starting point of almost 6μs!
+And there we go, around 100ns from our starting point of almost 6μs!
diff --git a/docs/src/tutorials/large_systems.md b/docs/src/tutorials/large_systems.md
@@ -1,8 +1,8 @@
 # [Efficiently Solving Large Sparse Ill-Conditioned Nonlinear Systems in Julia](@id large_systems)
 
-This tutorial is for getting into the extra features of using NonlinearSolve.jl. Solving ill-conditioned nonlinear systems 
-requires specializing the linear solver on properties of the Jacobian in order to cut down on the ``\mathcal{O}(n^3)`` 
-linear solve and the ``\mathcal{O}(n^2)`` back-solves. This tutorial is designed to explain the advanced usage of 
+This tutorial is for getting into the extra features of using NonlinearSolve.jl. Solving ill-conditioned nonlinear systems
+requires specializing the linear solver on properties of the Jacobian in order to cut down on the ``\mathcal{O}(n^3)``
+linear solve and the ``\mathcal{O}(n^2)`` back-solves. This tutorial is designed to explain the advanced usage of
 NonlinearSolve.jl by solving the steady state stiff Brusselator partial differential equation (BRUSS) using NonlinearSolve.jl.
 
 ## Definition of the Brusselator Equation
@@ -182,8 +182,8 @@ nothing # hide
 
 Notice that this acceleration does not require the definition of a sparsity
 pattern, and can thus be an easier way to scale for large problems. For more
-information on linear solver choices, see the 
-[linear solver documentation](https://docs.sciml.ai/DiffEqDocs/stable/features/linear_nonlinear/#linear_nonlinear). 
+information on linear solver choices, see the
+[linear solver documentation](https://docs.sciml.ai/DiffEqDocs/stable/features/linear_nonlinear/#linear_nonlinear).
 `linsolve` choices are any valid [LinearSolve.jl](https://linearsolve.sciml.ai/dev/) solver.
 
 !!! note
diff --git a/docs/src/tutorials/modelingtoolkit.md b/docs/src/tutorials/modelingtoolkit.md
@@ -120,4 +120,4 @@ sol[u5]
 
 If you're interested in building models in a component or block based form, such as seen in systems like Simulink or Modelica,
 take a deeper look at [ModelingToolkit.jl's documentation](https://docs.sciml.ai/ModelingToolkit/stable/) which goes into
-detail on such features.
+detail on such features.
diff --git a/docs/src/tutorials/small_compile.md b/docs/src/tutorials/small_compile.md
@@ -19,18 +19,18 @@ sol = solve(prob, SimpleNewtonRaphson())
 
 However, there are a few downsides to SimpleNonlinearSolve's `SimpleX` style algorithms to note:
 
-1. SimpleNonlinearSolve.jl's methods are not hooked into the LinearSolve.jl system, and thus do not have
-   the ability to specify linear solvers, use sparse matrices, preconditioners, and all of the other features
-   which are required to scale for very large systems of equations.
-2. SimpleNonlinearSolve.jl's methods have less robust error handling and termination conditions, and thus
-   these methods are missing some flexibility and give worse hints for debugging.
-3. SimpleNonlinearSolve.jl's methods are focused on out-of-place support. There is some in-place support,
-   but it's designed for simple smaller systems and so some optimizations are missing.
+ 1. SimpleNonlinearSolve.jl's methods are not hooked into the LinearSolve.jl system, and thus do not have
+    the ability to specify linear solvers, use sparse matrices, preconditioners, and all of the other features
+    which are required to scale for very large systems of equations.
+ 2. SimpleNonlinearSolve.jl's methods have less robust error handling and termination conditions, and thus
+    these methods are missing some flexibility and give worse hints for debugging.
+ 3. SimpleNonlinearSolve.jl's methods are focused on out-of-place support. There is some in-place support,
+    but it's designed for simple smaller systems and so some optimizations are missing.
 
 However, the major upsides of SimpleNonlinearSolve.jl are:
 
-1. The methods are optimized and non-allocating on StaticArrays
-2. The methods are minimal in compilation
+ 1. The methods are optimized and non-allocating on StaticArrays
+ 2. The methods are minimal in compilation
 
 As such, you can use the code as shown above to have very low startup with good methods, but for more scaling and debuggability
 we recommend the full NonlinearSolve.jl. But that said,
@@ -51,4 +51,4 @@ is not only sufficient but optimal.
 
 Julia has tools for building small binaries via static compilation with [StaticCompiler.jl](https://github.com/tshort/StaticCompiler.jl).
 However, these tools are currently limited to type-stable non-allocating functions. That said, SimpleNonlinearSolve.jl's solvers are
-precisely the subset of NonlinearSolve.jl which are compatible with static compilation.
+precisely the subset of NonlinearSolve.jl which are compatible with static compilation.
diff --git a/src/NonlinearSolve.jl b/src/NonlinearSolve.jl
@@ -26,7 +26,7 @@ const AbstractSparseADType = Union{ADTypes.AbstractSparseFiniteDifferences,
     ADTypes.AbstractSparseForwardMode, ADTypes.AbstractSparseReverseMode}
 
 abstract type AbstractNonlinearSolveAlgorithm <: AbstractNonlinearAlgorithm end
-abstract type AbstractNewtonAlgorithm{CJ, AD, TC} <: AbstractNonlinearSolveAlgorithm end
+abstract type AbstractNewtonAlgorithm{CJ, AD} <: AbstractNonlinearSolveAlgorithm end
 
 abstract type AbstractNonlinearSolveCache{iip} end
 
diff --git a/src/gaussnewton.jl b/src/gaussnewton.jl
@@ -36,7 +36,7 @@ for large-scale and numerically-difficult nonlinear least squares problems.
     Jacobian-Free version of `GaussNewton` doesn't work yet, and it forces jacobian
     construction. This will be fixed in the near future.
 """
-@concrete struct GaussNewton{CJ, AD, TC} <: AbstractNewtonAlgorithm{CJ, AD, TC}
+@concrete struct GaussNewton{CJ, AD, TC} <: AbstractNewtonAlgorithm{CJ, AD}
     ad::AD
     linsolve
     precs
diff --git a/src/levenberg.jl b/src/levenberg.jl
@@ -75,7 +75,7 @@ numerically-difficult nonlinear systems.
     `DᵀD` to prevent the damping from being too small. Defaults to `1e-8`.
 """
 @concrete struct LevenbergMarquardt{CJ, AD, T, TC <: NLSolveTerminationCondition} <:
-                 AbstractNewtonAlgorithm{CJ, AD, TC}
+                 AbstractNewtonAlgorithm{CJ, AD}
     ad::AD
     linsolve
     precs
@@ -157,7 +157,7 @@ end
 end
 
 function SciMLBase.__init(prob::Union{NonlinearProblem{uType, iip},
-    NonlinearLeastSquaresProblem{uType, iip}}, alg_::LevenbergMarquardt,
+        NonlinearLeastSquaresProblem{uType, iip}}, alg_::LevenbergMarquardt,
     args...; alias_u0 = false, maxiters = 1000, abstol = nothing, reltol = nothing,
     internalnorm = DEFAULT_NORM,
     linsolve_kwargs = (;), kwargs...) where {uType, iip}
diff --git a/src/raphson.jl b/src/raphson.jl
@@ -30,31 +30,26 @@ for large-scale and numerically-difficult nonlinear systems.
     which means that no line search is performed. Algorithms from `LineSearches.jl` can be
     used here directly, and they will be converted to the correct `LineSearch`.
 """
-@concrete struct NewtonRaphson{CJ, AD, TC <: NLSolveTerminationCondition} <:
-                 AbstractNewtonAlgorithm{CJ, AD, TC}
+@concrete struct NewtonRaphson{CJ, AD} <:
+                 AbstractNewtonAlgorithm{CJ, AD}
     ad::AD
     linsolve
     precs
     linesearch
-    termination_condition::TC
 end
 
 function set_ad(alg::NewtonRaphson{CJ}, ad) where {CJ}
     return NewtonRaphson{CJ}(ad, alg.linsolve, alg.precs, alg.linesearch)
 end
 
 function NewtonRaphson(; concrete_jac = nothing, linsolve = nothing,
-    linesearch = LineSearch(), precs = DEFAULT_PRECS,
-    termination_condition = NLSolveTerminationCondition(NLSolveTerminationMode.NLSolveDefault;
-        abstol = nothing,
-        reltol = nothing), adkwargs...)
+    linesearch = LineSearch(), precs = DEFAULT_PRECS, adkwargs...)
     ad = default_adargs_to_adtype(; adkwargs...)
     linesearch = linesearch isa LineSearch ? linesearch : LineSearch(; method = linesearch)
     return NewtonRaphson{_unwrap_val(concrete_jac)}(ad,
         linsolve,
         precs,
-        linesearch,
-        termination_condition)
+        linesearch)
 end
 
 @concrete mutable struct NewtonRaphsonCache{iip} <: AbstractNonlinearSolveCache{iip}
@@ -79,11 +74,13 @@ end
     prob
     stats::NLStats
     lscache
+    termination_condition
     tc_storage
 end
 
 function SciMLBase.__init(prob::NonlinearProblem{uType, iip}, alg_::NewtonRaphson, args...;
     alias_u0 = false, maxiters = 1000, abstol = nothing, reltol = nothing,
+    termination_condition = nothing,
     internalnorm = DEFAULT_NORM,
     linsolve_kwargs = (;), kwargs...) where {uType, iip}
     alg = get_concrete_algorithm(alg_, prob)
@@ -93,27 +90,28 @@ function SciMLBase.__init(prob::NonlinearProblem{uType, iip}, alg_::NewtonRaphso
     uf, linsolve, J, fu2, jac_cache, du = jacobian_caches(alg, f, u, p, Val(iip);
         linsolve_kwargs)
 
-    tc = alg.termination_condition
-    mode = DiffEqBase.get_termination_mode(tc)
+    abstol, reltol, termination_condition = _init_termination_elements(abstol,
+        reltol,
+        termination_condition,
+        eltype(u))
 
-    atol = _get_tolerance(abstol, tc.abstol, eltype(u))
-    rtol = _get_tolerance(reltol, tc.reltol, eltype(u))
+    mode = DiffEqBase.get_termination_mode(termination_condition)
 
     storage = mode ∈ DiffEqBase.SAFE_TERMINATION_MODES ? NLSolveSafeTerminationResult() :
               nothing
 
     return NewtonRaphsonCache{iip}(f, alg, u, copy(u), fu1, fu2, du, p, uf, linsolve, J,
-        jac_cache, false, maxiters, internalnorm, ReturnCode.Default, atol, rtol, prob,
+        jac_cache, false, maxiters, internalnorm, ReturnCode.Default, abstol, reltol, prob,
         NLStats(1, 0, 0, 0, 0), LineSearchCache(alg.linesearch, f, u, p, fu1, Val(iip)),
-        storage)
+        termination_condition, storage)
 end
 
 function perform_step!(cache::NewtonRaphsonCache{true})
     @unpack u, u_prev, fu1, f, p, alg, J, linsolve, du = cache
     jacobian!!(J, cache)
 
     tc_storage = cache.tc_storage
-    termination_condition = cache.alg.termination_condition(tc_storage)
+    termination_condition = cache.termination_condition(tc_storage)
 
     # u = u - J \ fu
     linres = dolinsolve(alg.precs, linsolve; A = J, b = _vec(fu1), linu = _vec(du),
@@ -140,7 +138,7 @@ function perform_step!(cache::NewtonRaphsonCache{false})
     @unpack u, u_prev, fu1, f, p, alg, linsolve = cache
 
     tc_storage = cache.tc_storage
-    termination_condition = cache.alg.termination_condition(tc_storage)
+    termination_condition = cache.termination_condition(tc_storage)
 
     cache.J = jacobian!!(cache.J, cache)
     # u = u - J \ fu
@@ -169,7 +167,9 @@ function perform_step!(cache::NewtonRaphsonCache{false})
 end
 
 function SciMLBase.reinit!(cache::NewtonRaphsonCache{iip}, u0 = cache.u; p = cache.p,
-    abstol = cache.abstol, maxiters = cache.maxiters) where {iip}
+    abstol = cache.abstol, reltol = cache.reltol,
+    termination_condition = cache.termination_condition,
+    maxiters = cache.maxiters) where {iip}
     cache.p = p
     if iip
         recursivecopy!(cache.u, u0)
@@ -179,7 +179,14 @@ function SciMLBase.reinit!(cache::NewtonRaphsonCache{iip}, u0 = cache.u; p = cac
         cache.u = u0
         cache.fu1 = cache.f(cache.u, p)
     end
+
+    termination_condition = _get_reinit_termination_condition(cache,
+        abstol,
+        reltol,
+        termination_condition)
     cache.abstol = abstol
+    cache.reltol = reltol
+    cache.termination_condition = termination_condition
     cache.maxiters = maxiters
     cache.stats.nf = 1
     cache.stats.nsteps = 1
diff --git a/src/trustRegion.jl b/src/trustRegion.jl
@@ -149,7 +149,7 @@ for large-scale and numerically-difficult nonlinear systems.
     Support for the OOP version is planned!
 """
 @concrete struct TrustRegion{CJ, AD, MTR, TC <: NLSolveTerminationCondition} <:
-                 AbstractNewtonAlgorithm{CJ, AD, TC}
+                 AbstractNewtonAlgorithm{CJ, AD}
     ad::AD
     linsolve
     precs
diff --git a/src/utils.jl b/src/utils.jl
@@ -213,3 +213,55 @@ function _get_tolerance(η, tc_η, ::Type{T}) where {T}
     fallback_η = real(oneunit(T)) * (eps(real(one(T))))^(4 // 5)
     return ifelse(η !== nothing, η, ifelse(tc_η !== nothing, tc_η, fallback_η))
 end
+
+function _init_termination_elements(abstol,
+    reltol,
+    termination_condition,
+    ::Type{T}) where {T}
+    if termination_condition !== nothing
+        abstol !== nothing ?
+        (abstol != termination_condition.abstol ?
+         error("Incompatible absolute tolerances found. The tolerance supplied as a keyword argument and the one supplied in the termination condition must be the same.") :
+         nothing) : nothing
+        reltol !== nothing ?
+        (reltol != termination_condition.reltol ?
+         error("Incompatible relative tolerances found. The tolerance supplied as a keyword argument and the one supplied in the termination condition must be the same.") :
+         nothing) : nothing
+        abstol = _get_tolerance(abstol, termination_condition.abstol, T)
+        reltol = _get_tolerance(reltol, termination_condition.reltol, T)
+        return abstol, reltol, termination_condition
+    else
+        abstol = _get_tolerance(abstol, nothing, T)
+        reltol = _get_tolerance(reltol, nothing, T)
+        termination_condition = NLSolveTerminationCondition(NLSolveTerminationMode.NLSolveDefault;
+            abstol,
+            reltol)
+        return abstol, reltol, termination_condition
+    end
+end
+
+function _get_reinit_termination_condition(cache, abstol, reltol, termination_condition)
+    if termination_condition != cache.termination_condition
+        if abstol != cache.abstol
+            if abstol != termination_condition.abstol
+                error("Incompatible absolute tolerances found")
+            end
+        end
+
+        if reltol != cache.reltol
+            if reltol != termination_condition.reltol
+                error("Incompatible relative tolerances found")
+            end
+        end
+        return termination_condition
+    else
+        # Build the termination_condition with new abstol and reltol
+        return NLSolveTerminationCondition{
+            DiffEqBase.get_termination_mode(termination_condition),
+            eltype(abstol),
+            typeof(termination_condition.safe_termination_options),
+        }(abstol,
+            reltol,
+            termination_condition.safe_termination_options)
+    end
+end
diff --git a/test/basictests.jl b/test/basictests.jl
@@ -135,7 +135,7 @@ end
         termination_condition = NLSolveTerminationCondition(mode; abstol = nothing,
             reltol = nothing)
         probN = NonlinearProblem(quadratic_f, u0, 2.0)
-        @test all(solve(probN, NewtonRaphson(; termination_condition)).u .≈ sqrt(2.0))
+        @test all(solve(probN, NewtonRaphson(); termination_condition).u .≈ sqrt(2.0))
     end
 end