Add SUNNonlinearSolver_Auto module that automatically switches between Newton and fixed-point based on stiffness #854
Some plots demonstrating that the algorithm works. This uses the KPR problem with b = 0 and varying a (stiffness). With a = -1e-2, fixed point is mostly used, as one might expect.
With a = -1e2 we see a mix of fixed point and Newton, where Newton is used during the periods of rapid change. Finally, with a = -1e6, Newton is used exclusively, as one might expect.
Side note: Codex wrote all of the code to make these plots. I told it to use the logs from SUNLogger and the tools in suntools to make them, and it did so in one shot. Also, thanks to @Steven-Roberts for the discussions.
Here are some stats with a = -1e2, the case where the most switching takes place. The fastest option is the Newton nonlinear solver, the second fastest is the automatic solver, and the third is fixed-point. I think a good takeaway is that the automatic solver may be worth trying when you are unsure whether your problem is stiff (or how stiff it is). However, it's not clear how likely it is that the automatic solver will be the best option with respect to time to solution. As in this case, I suspect it might often be the second best.
Another thing to note: the performance of the switching solver depends (quite a bit) on the values used for the parameters in the algorithm.
I did go ahead and try this with ARKODE, just letting each stage decide whether to switch, and the results are not great.
Something else to think about: this moves a norm computation from within the integrator into the solver modules. In standalone uses of the SUNNonlinearSolver_Newton and SUNNonlinearSolver_FixedPoint modules, it's possible this is an extra cost. That said, I imagine most convergence tests would need it, so it may matter only in some edge cases.
Pull request overview
This PR introduces a new SUNNonlinearSolver_Auto implementation that can switch between Newton and fixed-point nonlinear iterations based on stiffness-related runtime metrics, and wires that capability into CVODE/CVODES (and partially ARKODE).
Changes:
- Added `SUNNonlinearSolver_Auto` module (C + Fortran interface, SWIG, and sundials4py bindings) and integrated it into build/test targets.
- Extended the nonlinear solver API with an optional `getdelnrm` op (`SUNNonlinSolGetDelNrm`) and propagated `SUN_NLS_SWITCH` to integrators to support switching.
- Added docs and examples demonstrating the new solver and switching behavior.
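As a rough illustration of what an optional op means in this kind of design, the sketch below shows a dispatcher that checks whether an implementation filled in a function pointer before calling it. All type, function, and constant names are hypothetical stand-ins rather than the SUNDIALS definitions, and returning an error code when the op is missing is an assumption, not necessarily what this PR does.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the SUNDIALS types or return codes. */
enum { DEMO_SUCCESS = 0, DEMO_ILL_INPUT = -1, DEMO_OP_MISSING = -2 };

typedef struct DemoSolver DemoSolver;

typedef struct
{
  /* Optional op: an implementation may leave this pointer NULL. */
  int (*getdelnrm)(DemoSolver* solver, double* delnrm);
} DemoSolverOps;

struct DemoSolver
{
  DemoSolverOps ops;
  double last_delnrm; /* update norm stored by the most recent solve */
};

/* Example implementation of the optional op: report the stored norm. */
int demo_getdelnrm_impl(DemoSolver* solver, double* delnrm)
{
  *delnrm = solver->last_delnrm;
  return DEMO_SUCCESS;
}

/* Dispatcher: forwards to the op only if the implementation provides it. */
int demo_get_delnrm(DemoSolver* solver, double* delnrm)
{
  if (solver == NULL || delnrm == NULL) { return DEMO_ILL_INPUT; }
  if (solver->ops.getdelnrm == NULL) { return DEMO_OP_MISSING; }
  return solver->ops.getdelnrm(solver, delnrm);
}
```

With this shape, a caller can distinguish a solver that does not supply the op from one that does and can fall back to computing the norm itself.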
Reviewed changes
Copilot reviewed 71 out of 71 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/sunnonlinsol/auto/sunnonlinsol_auto.c` | Implements the new auto-switching nonlinear solver and switching logic. |
| `include/sunnonlinsol/sunnonlinsol_auto.h` | Public API for constructing/configuring the auto solver. |
| `src/sundials/sundials_nonlinearsolver.c` | Adds `SUNNonlinSolGetDelNrm` API entry point and ops initialization. |
| `include/sundials/sundials_nonlinearsolver.h` | Extends the ops struct; adds `SUNNONLINEARSOLVER_HYBRID` and `SUN_NLS_SWITCH`. |
| `src/sunnonlinsol/newton/sunnonlinsol_newton.c` | Stores update norm and stiffness metric; supports switch return code. |
| `src/sunnonlinsol/fixedpoint/sunnonlinsol_fixedpoint.c` | Adds convergence-rate estimate, update norm storage, and switch return code. |
| `src/cvode/*`, `src/cvodes/*`, `src/arkode/*` | Wires hybrid solver type, uses `SUNNonlinSolGetDelNrm`, handles `SUN_NLS_SWITCH`. |
| `doc/shared/sunnonlinsol/*` | Documents new API and the Auto solver; adds references. |
| `examples/*` | Adds new examples demonstrating Auto solver usage. |
| `bindings/sundials4py/*` | Adds Auto solver bindings and exposes new enums/functions. |
Comments suppressed due to low confidence (7)

`src/sunnonlinsol/newton/sunnonlinsol_newton.c:1`
- This overwrites `content->delnrm` (documented/used as the update norm) with a residual norm, and computes `stiffr` as `resnorm / delnrm` without guarding against `delnrm == 0`. This can (a) break `SUNNonlinSolGetDelNrm*` semantics and integrator usage, and (b) trigger division-by-zero when the correction/update norm is zero. Consider keeping `content->delnrm` as the update norm and storing the residual norm in a separate local variable (or dedicated field), and guard the ratio computation when `delnrm` is 0.

`src/sunnonlinsol/fixedpoint/sunnonlinsol_fixedpoint.c:1`
- `delnrmp` may be 0 (or uninitialized if `FP_CONTENT(NLS)->delnrm` is never initialized before the first solve/iteration), making `delnrm/delnrmp` a division-by-zero hazard and potentially producing inf/NaN in `crate`. Additionally, the newly-added `delnrm`/`crate` fields should be explicitly initialized in the solver constructor to avoid reading uninitialized memory. Suggested fix: initialize `content->delnrm` and `content->crate` during creation, and guard the ratio when `delnrmp <= 0` (or when it is not finite).

`src/sunnonlinsol/auto/sunnonlinsol_auto.c:1`
- These setters only forward linear-solver callbacks when Newton is currently active. If the Auto solver starts in fixed-point and later switches to Newton, `newton_solver` may never receive `LSetupFn`/`LSolveFn`, leading to a broken Newton path after switching. For an auto-switching solver, these callbacks should be set on the Newton sub-solver regardless of the currently-active type (or stored and applied when switching). Similar concerns likely apply to other setters that currently forward only to the active sub-solver (e.g., max iters / system function), since switching changes which sub-solver needs the configuration.

`src/sunnonlinsol/auto/sunnonlinsol_auto.c:1`
- `SUNNonlinSolGetNumIters` returns the total iterations across all solves for that solver; adding that total into `*_niters_total` accumulators will overcount (the total is repeatedly added after each solve). This will make `SUNNonlinSolGetNumItersByType_Auto` report incorrect values. A concrete fix is to remove the manual accumulation and have `SUNNonlinSolGetNumItersByType_Auto` query `SUNNonlinSolGetNumIters(fp_solver, ...)` and `SUNNonlinSolGetNumIters(newton_solver, ...)` directly, or track the previous totals and only add per-solve deltas.

`src/sunnonlinsol/auto/sunnonlinsol_auto.c:1`
- `nsolves_since_switch` is incremented even when the underlying solve returns `SUN_NLS_SWITCH`. Since the switching logic resets `nsolves_since_switch = 0` inside the convergence test right before returning `SUN_NLS_SWITCH`, this increment makes the post-switch state `1` (off-by-one), which undermines delay gating. Consider incrementing only when `retval != SUN_NLS_SWITCH` (and possibly only when the solve completes successfully).

`src/sunnonlinsol/auto/sunnonlinsol_auto.c:1`
- The constructor does not check whether `SUNNonlinSol_FixedPoint` or `SUNNonlinSol_Newton` returned `NULL` (allocation failure, etc.). Returning an Auto solver with a NULL sub-solver will lead to null dereferences later (and also leaks `NLS`/`content` when creation fails partway). Suggested fix: validate both sub-solver pointers, and on failure free any partially-created resources and return `NULL` (consistent with other SUNDIALS constructors).

`src/sunnonlinsol/auto/sunnonlinsol_auto.c:1`
- The switching behavior (threshold/delay gating + `SUN_NLS_SWITCH` propagation) is new, user-visible behavior that can materially affect solver outcomes. Given the repo already has unit test infrastructure for ARKODE/CVODE/CVODES, please add unit tests covering the Auto solver's switching decision logic (including delay gating and threshold boundaries) and that integrators correctly recover/retry after `SUN_NLS_SWITCH`.
Steven-Roberts left a comment
I took a pass through the docs.
I think the auto solver will need to be added to https://sundials.readthedocs.io/en/latest/sundials/Install_link.html#nonlinear-solvers
Co-authored-by: Steven Roberts <sroberts994@gmail.com>



This implements the algorithm from https://doi.org/10.1007/BF01933714 and enables its use from CVODE/CVODES. Usage from ARKODE is a bit more complicated if we want to be true to what the paper does (use the max convergence rate estimate across all stages).
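For the ARKODE case, the aggregation mentioned above (taking the maximum convergence rate estimate across all stages) could look roughly like the sketch below. The function name and the idea of collecting per-stage estimates into an array are assumptions made for illustration; nothing here is part of the PR or the ARKODE API.

```c
#include <stddef.h>

/* Hypothetical helper: returns the largest per-stage convergence-rate
   estimate, or 0.0 when there are no stages. NaN entries are skipped
   because comparisons against NaN are false. */
double max_stage_rate(const double* stage_rates, size_t nstages)
{
  double max_rate = 0.0;
  for (size_t i = 0; i < nstages; i++)
  {
    if (stage_rates[i] > max_rate) { max_rate = stage_rates[i]; }
  }
  return max_rate;
}
```

A switching decision would then be made once per step from this single value, instead of letting each stage decide independently.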
The results with respect to the benefits of using this module are mixed, and we may still need to explore it more. I am opening the PR now so the implementation can be reviewed and we can decide whether it should be merged.