Skip to content

Conversation

@termi-official
Copy link
Contributor

@termi-official termi-official commented Oct 6, 2025

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes were updated
  • The new code follows the
    contributor guidelines, in particular the SciML Style Guide and
    COLPRAC.
  • Any new documentation only uses public API

Additional context

  • Switch from hard-coded SimpleNonlinearSolve to any NonlinearSolve algorithm which supports the iterator interface.
  • Add adaptive homotopy path increments via modified Deuflhard's estimates

function empty(u_next, u, p, t)
nothing
end
# @testset "Handle nothing in u0" begin
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have trouble understanding the purpose of this test. What exactly is the practical scenario here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use u=nothing to represent length 0 u in a type stable manner (as op0osed to a Vector which is length 0 at runtime)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I do not understand the point - do you imply here that a zero-length solution vector not type stable and this is why we resort to "nothing"?

When I try to get this test running, I get errors from an alias analysis function not being dispatched for nothing in NonlinearSolve (https://github.com/SciML/NonlinearSolve.jl/blob/ac9344f9359833282e443c4479427ad9ce3311dd/lib/NonlinearSolveFirstOrder/src/solve.jl#L157).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inplace functions with empty return types are also not correctly handled.

using NonlinearSolveFirstOrder
function f(u, p)
    nothing
end
prob = NonlinearProblem{false}(f, Float64[], nothing)
iter = init(prob, NewtonRaphson())

errors when the termination cache is built, because the increment type cannot be derived.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the point is that zero length u tends to cause problems (e.g. solvers will try to look at the first element of the state to find a type), so this way you can skip the solve process since nothing interesting can happen if you don't have any state. See https://github.com/SciML/DiffEqBase.jl/blob/3667bdbdc85489f7b296316df7f4c440519e82f6/src/solve.jl#L31 for how this gets handled for ODEs/DAEs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense from a software engineering stand point to replace such isa statements with dispatchable functions to make the code more extensible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quite possibly. DiffEqBase is not my favorite code organization.

Comment on lines +23 to +43
# @concrete struct ConvergenceRateTracing
# inner_tracing
# end

# @concrete struct ConvergenceRateTraceTrick
# incrementL2norms
# residualL2norms
# trace_wrapper
# end

# function NonlinearSolveBase.init_nonlinearsolve_trace(
# prob, alg::IDSolve, u, fu, J, δu;
# trace_level::ConvergenceRateTracing, kwargs... # This kind of dispatch does not work. Need to figure out a different way.
# )
# inner_trace = NonlinearSolveBase.init_nonlinearsolve_trace(
# prob, alg, u, fu, J, δu;
# trace_level.inner_tracing, kwargs...
# )

# return ConvergenceRateTraceTrick(eltype(δu)[], eltype(fu)[], inner_trace)
# end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, it should be possible to use NonlinearSolveBase.init_nonlinearsolve_trace to query convergence rate estimates. However, in the current design I cannot add a new dispatch. Should I make a PR to pull the trace level before the kwargs (i.e. NonlinearSolveBase.init_nonlinearsolve_trace(prob, alg, u, fu, J, δu, trace_level; kwargs...) for custom dispatches?

state = ImplicitDiscreteState(isnothing(u) ? nothing : zero(u), p, t)
IDSolveCache(u, uprev, state, nothing)
state = ImplicitDiscreteState(zero(u), p, t)
f_nl = (resid, u_next, p) -> f(resid, u_next, p.u, p.p, p.t)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reasoning here to include the current u in the signature of this function, but no information on dt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dt is just a parameter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my question is simply, why is $dt$ (or tprev) not a parameter, but $uprev$ is part of the function signature?

@@ -0,0 +1,52 @@
Base.@kwdef struct KantorovichTypeController <: OrdinaryDiffEqCore.AbstractController
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO reference and documentation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not sure what this is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry. I will write the docs, no worries. I just left it here so I do not forget before merging. This is a controller derived from a posteriori estimates on how much the convergence radius in the Newton-Kantorovich theorem changes for some increment $dt_n$ and a solution given at $t_n$.

else # :constant
cache.z .= integrator.u
end
state = ImplicitDiscreteState(cache.z, p, t+dt)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On master we solve at time $t$. From my understanding, we should solve at $t+dt$ to obtain the next solution. Can someone confirm or reject this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the somewhat complicated part here is that we want to match a DiscreteSolve. I forget which is the right way here. @jClugstor might remember.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell from the docs https://docs.sciml.ai/DiffEqDocs/stable/types/discrete_types/
if you put in u_n, p, t_(n+1) to the function you get u_(n+1) out, so in this case you would get u_(t + dt) out I guess. So if you want u_(t + dt) I think that's correct. Not entirely sure though.

@termi-official termi-official marked this pull request as ready for review October 6, 2025 21:31
@oscardssmith oscardssmith self-requested a review October 7, 2025 05:54

isfsal(alg::IDSolve) = false
alg_order(alg::IDSolve) = 0
alg_order(alg::IDSolve) = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 1 here?

Copy link
Contributor Author

@termi-official termi-official Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from the analysis shown in Deuflhard, Newton Methods for Nonlinear Problems (Section 5.1.1, see Equation 5.6 and the surrounding definition).

The algorithms here have an associated order in the sense that for a given $dt_n = t_{n+1} - t_n$ we have for some solution $\hat{u}(t_n+1)$ derived from an initial guess given at $u({t_n})$. Now we can define an associated ODE (Davidenko differential equation*) for each nonlinear problem with a "time parameter" by taking the time derivative of the time parameter, which has an analytical solution $\bar{u}(t_{n+1})$, given the same initial guess ($u({t_n})$). The solver now has order $p$ if we have $$||\hat{u}(t_{n+1}) - \bar{u}(t_{n+1})|| \leq C dt^p_n . $$
Does that explain it?

*The Davidenko differential equation for $F(u,t)$ is simply $du/dt = - dF/dx (u,t)^{-1} * dF/dt (u,t)$.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm confused. It's a discrete time problem, it's exact?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let me try to explain it differently then. We have the parametric function $F(u,t)$ and want to find the solution of $F(u_2,t_2) = 0$ given a $u_1$ such that $F(u_1, t_1) = 0$. With with $t_1 &lt; t_2$ we want to find some initial guess for $u^0_2$ given $u_1$, such that the initial guess $u^0_2$ is inside the convergence radius of a Newton method to solve $F(u_2,t_2) = 0$. The obvious choice is that we can simply say our initial guess is simply $u_1$, but this is typically not really great, as $t_2 - t_1$ is often quite small. However, we can use additional information contained in $F(u,t) = 0$. To be specific the derivative with respect to the parameter $t$ contains some extra information which we can use to improve the initial guess. Here we can observe that analytically solving the associated Davidenko differential equation with initial condition $u_1$ on $[t_1, t_2]$ is equivalent to solving $F(u_2,t_2) = 0$. Furthermore, we can exploit this information to inform how large $t_2$ can be chosen, such that the Newton method is guaranteed converge for a given initial guess. Now, the order of this extrapolation polynomial is directly related to the order of the implicit discrete solver. I hope that helps.

resize!(Θks, 0)
residualnormprev = zero(eltype(u))
while NonlinearSolveBase.not_terminated(nlcache)
step!(nlcache)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it also be trying jacobian reuse in here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. However, shoudln't the reuse strategy should be part of the NonlinearSolve algorithm instead of some intermediate layer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you expand out the iterator then you're taking over the iteration strategy.

Copy link
Contributor Author

@termi-official termi-official Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I would like to remove that later. Right now I cannot use the high level API because I cannot access the convergence rates which I need in the controller.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants