Conversation

@DanielDoehring DanielDoehring requested a review from jlchan July 29, 2025 12:19
@github-actions (Contributor)

Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope

  • The PR has a single goal that is clear from the PR title and/or description.
  • All code changes represent a single set of modifications that logically belong together.
  • No more than 500 lines of code are changed or there is no obvious way to split the PR into multiple PRs.

Code quality

  • The code can be understood easily.
  • Newly introduced names for variables etc. are self-descriptive and consistent with existing naming conventions.
  • There are no redundancies that can be removed by simple modularization/refactoring.
  • There are no leftover debug statements or commented code sections.
  • The code adheres to our conventions and style guide, and to the Julia guidelines.

Documentation

  • New functions and types are documented with a docstring or top-level comment.
  • Relevant publications are referenced in docstrings (see example for formatting).
  • Inline comments are used to document longer or unusual code sections.
  • Comments describe intent ("why?") and not just functionality ("what?").
  • If the PR introduces a significant change or new feature, it is documented in NEWS.md with its PR number.

Testing

  • The PR passes all tests.
  • New or modified lines of code are covered by tests.
  • New or modified tests run in less than 10 seconds.

Performance

  • There are no type instabilities or memory allocations in performance-critical parts.
  • If the PR intent is to improve performance, before/after time measurements are posted in the PR.

Verification

  • The correctness of the code was verified using appropriate tests.
  • If new equations/methods are added, a convergence test has been run and the results are posted in the PR.

Created with ❤️ by the Trixi.jl community.

@codecov (bot) commented Jul 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.76%. Comparing base (aa192cc) to head (8b819e6).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2505      +/-   ##
==========================================
+ Coverage   93.36%   96.76%   +3.41%     
==========================================
  Files         513      515       +2     
  Lines       42395    42430      +35     
==========================================
+ Hits        39578    41056    +1478     
+ Misses       2817     1374    -1443     
Flag        Coverage Δ
unittests   96.76% <100.00%> (+3.41%) ⬆️

@DanielDoehring DanielDoehring added the example Adding/changing examples (elixirs) label Jul 29, 2025
@ranocha (Member) commented Aug 21, 2025

Nice idea! CC @vchuravy as inspiration for Ariadne.jl.

Do you have some comparisons with other methods (either implicit or explicit)?

@DanielDoehring (Member Author)

Do you have some comparisons with other methods (either implicit or explicit)?

I think for now these methods are very inefficient, as most of the runtime and the allocations actually occur in the integrators:

https://github.com/trixi-framework/Trixi.jl/actions/runs/17121368313/job/48562848330?pr=2505#step:7:1468

https://github.com/trixi-framework/Trixi.jl/actions/runs/17121368313/job/48562848330?pr=2505#step:7:11486

My main motivation was to provide elixirs that actually use the SplitODEProblem.
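
For reference, a minimal sketch of what a SplitODEProblem looks like (toy right-hand sides, not the Trixi semidiscretization): IMEX methods treat f1 implicitly and f2 explicitly.

using OrdinaryDiffEq

# Toy split ODE: f1 is the stiff part (treated implicitly by IMEX methods),
# f2 is the nonstiff part (treated explicitly).
f1(u, p, t) = -100 .* u          # stiff linear part
f2(u, p, t) = sin(t) .+ 0 .* u   # nonstiff forcing
prob = SplitODEProblem(f1, f2, [1.0], (0.0, 1.0))
sol = solve(prob, KenCarp47())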

@JoshuaLampert (Member)

Is this bad performance related to the performance issues of IMEX methods we observed in DispersiveShallowWater.jl, @ranocha?
xref: NumericalMathematics/DispersiveShallowWater.jl#207

@DanielDoehring (Member Author)

I think if you run the simulation a second time, i.e., after all data structures/caches are initialized, these numbers should vastly improve. At least that is what I noticed for other stuff from the SciML ecosystem like NonlinearSolve etc.
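
As a sketch of what I mean (timing the second run of one of the elixirs from this PR, after the first run has paid compilation and setup costs):

using Trixi

# First run: pays compilation and cache-initialization costs.
trixi_include("examples/tree_1d_dgsem/elixir_diffusion_ldg_newton_krylov.jl")
# Second run: measures the actual solver cost.
@time trixi_include("examples/tree_1d_dgsem/elixir_diffusion_ldg_newton_krylov.jl")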

@ranocha (Member) commented Aug 22, 2025

Did you experience this specifically with split ODE solvers, too? I mainly have bad experience with their current state.

@DanielDoehring (Member Author)

Hm, so the scenario for DIRK methods from PR #2508 looks kinda similar, see the comment by Marco:

#2508 (comment)

@ChrisRackauckas

Did you experience this specifically with split ODE solvers, too? I mainly have bad experience with their current state.

Hey, it is a good time to talk about this. It was one of my goals this summer to get the PDE space up and revived (for the Thunderbolt.jl project we have going, but of course there are a lot of knock-on effects to that as well), and so there has been a lot of motion here:

  1. I ended up doing SciMLOperators v1 myself. I put a big bounty on it and made it a preferred student project, but no one wanted to do it. It's done now, so SciMLOperators is fixed. For a long time this part of the system was in an "I don't want to do that yet because I want to make sure SciMLOperators v1 is done first" state, and 🎉 that blocker is out of the way.
  2. LinearSolve.jl got a bunch of better auto-tuning https://docs.sciml.ai/LinearSolve/stable/tutorials/autotune/, and for sparse auto-tuning I'm trying to spin up general sparse matrix benchmarks https://docs.sciml.ai/SciMLBenchmarksOutput/stable/LinearSolve/MatrixDepot/, and we just got more compute power online to help us power through that.
  3. Debugged the shit out of ExponentialUtilities.jl, got it snappy with precompilation, separate repos, etc.

And with that, we just got a bunch of new benchmarks up on this (live today!):

What's going on here needs a little bit of an explanation.

Exponential Integrators

First of all, the results are all very discouraging for exponential integrators, by a few orders of magnitude in their own territory. At first I thought that might be pessimistic? Is ExponentialUtilities.jl really slow? But other benchmarks against things like ExpoKit show that it's fine, actually pretty great. I think it's just fundamental to exponential integrators. If you think about it, Rosenbrock methods are like exponential integrators where you only keep the first N terms of the expansion Av + A^2v + ..., whereas the exponential operator is trying to approximate the limit. But if you're already truncating for order reasons, due to how error propagates in the ODE... why are you trying to get this exponential so exactly? Especially if the operator A is actually A(t) or something. It is just a ton of work that isn't really needed.
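
In formulas (a standard Taylor series, spelled out here just for illustration): the exponential integrator is chasing the full limit

$$e^{hA} v = v + hAv + \frac{(hA)^2}{2!} v + \frac{(hA)^3}{3!} v + \dots,$$

whereas a truncated method keeps only the first few terms, consistent with the order of accuracy the time integration targets anyway.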

Now even if you do a Krylov subspace method for the exponential, it is competing against the performance of doing a Krylov subspace method for linear solving. But exponentials have all sorts of quirks that make them harder to solve than a linear system, which is kind of obvious because one is a linear system and the other isn't. These details are things like the fact that high condition numbers mean the exponential calculation needs to be split into chunks ("adaptive timestepping", since the exponential is solving a linear ODE), doing multiple different Krylov subspaces, and then multiplying the solutions together. And then there's the fact that for structured A like a banded matrix, a linear solve has all sorts of tricks while there aren't any for the matrix exponential. Again, this is because the matrix exponential is the limit of the series Av + A^2v + ..., so having just one off-diagonal band means A^2 has two, A^3 has three, etc., which means A^n at some point is dense. The matrix exponential effectively always becomes dense, so you always lose structure. So in the end it's just always more expensive to do... that doesn't mean it's worse. Like in RK methods, if you can take larger steps then you're good. But the exponential integrators end up having the same order as the nonlinear solver methods. Nonlinear solver methods do have an issue in that they need to guess the Newton solution, but with the other numerical details in KIOPS, that's balanced out. So in the end the internal processes are about equally numerically stable and the methods achieve the same order, so KenCarp4 and ETDRK4 step with about the same error at the same dt, but ETDRK4 takes a lot more work. To me, now that we've gotten all of this optimized, it's not a "maybe this can get more optimizations" but a "this looks fundamental to the method, I think I understand why it's like this".

So I have 0 hope in the exponential integrators 😅. I think all of the literature out there touting them is academic without good enough software to back it up, and I think this is the definitive software attempt to say "I have beaten all of the implementations out there that I could find, and so this looks like the supreme being of exponential integrators, and it's still losing by 2 orders of magnitude". Nail in the coffin.

ARKODE... wins?

So ARKODE actually does pretty well, and there's a funny story to this.

[benchmark figure]

If you go to any of the stiff ODE benchmarks, you'll notice ARKODE fails:

https://docs.sciml.ai/SciMLBenchmarksOutput/stable/StiffODE/Orego/#Omissions-and-Tweaking

So basically Sundials has set up the defaults for ARKODE in such a way that "most" stiff ODEs fail with it: the "semi-stiff" ODEs you get from PDE semi-discretizations (which can be better to do with implicit solvers, but are much less stiff than things like biological stiff ODEs with condition numbers of around 10^10) work, while the genuinely stiff problems just absolutely fail to even solve with ARKODE a good chunk of the time. And when it does work on a stiff ODE, like in POLLU, it doesn't do so well:

[POLLU benchmark figure]

So the time stepper seems to be set up to be extremely aggressive, in a way that's not a good default for a stiff ODE solver, but actually works quite well in this context of PDEs.

Should we make KenCarp4 change its default controller behavior when you put it into IMEX mode? I don't think so... but without changing the step controller it won't be competitive. So maybe it just needs something like a KenCarp4PDE that's just an alias with some re-tuned defaults? I've known this could be a thing for years but haven't put the time into actually putting together those defaults, but if you want to match the ARKODE result in pure Julia, that's the one major bit to change.
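
A hedged sketch of what such an alias could amount to from the user side (the controller gains and step-growth limit below are made-up placeholders, not the settings ARKODE uses):

using OrdinaryDiffEq

# Toy stiff problem standing in for a PDE semidiscretization.
prob = ODEProblem((du, u, p, t) -> (du .= -1000 .* u), ones(8), (0.0, 1.0))
# Placeholder re-tuning: more aggressive controller gains and step growth.
sol = solve(prob, KenCarp4();
            controller = PIController(0.14, 0.08),  # placeholder gains
            qmax = 20.0)                            # allow aggressive step growth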

Current State and Next Steps

So with all of that said:

  1. The current state is: test a bit with ARKODE, as that will be the fastest right now.
  2. We need to get some alias modes for the KenCarps to set them up with better defaults for PDEs, and that would then make them match.
  3. The new mixed precision solvers in LinearSolve, especially with GPU offloading, should definitely get tested here.
  4. There are a few things to do with SciMLOperators still, in particular its GMRES support (Oscar might've already gotten to that?) and the SplitODEProblem and SciMLOperators error SciML/OrdinaryDiffEq.jl#2078, but for the most part it should be handled by now.
  5. These kinds of results should put an end to ExponentialUtilities.jl as a focus of this IMEX work, which will save us a ton of time because we invested a lot there. IMO that effort seems wasted, and it is better spent making the big SDIRK IMEX methods better.
  6. Getting the SBDF methods to the same level of support as the FBDF class of methods would probably actually be the winner. There is no literature on adaptive time stepping for them, but we know what to do from making QNDF and FBDF; that would be a great student project... and probably actually the best method to use here.

@ChrisRackauckas

Oh, and not only did we have to do the SciMLOperators.jl v1 but also the DifferentiationInterface.jl change 😅. So I'm going to have to scramble and see if any of the allocations reported in NumericalMathematics/DispersiveShallowWater.jl#207 are related to that... I'd say we should be ready for more large-scale banging away at this in October.

If anyone had any time to donate, I would say the best help would be getting each of the steppers set up with JET and AllocCheck tests like https://github.com/SciML/OrdinaryDiffEq.jl/blob/master/lib/OrdinaryDiffEqTsit5/test/allocation_tests.jl. The introduction of DifferentiationInterface was kind of a nightmare for all of this, and we could've handled it much more elegantly if we had compiler-enforced 0 allocations for every step! call, so now is better than never.
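
A minimal sketch of such a test (modeled on the linked Tsit5 allocation tests; the actual test harness may differ):

using OrdinaryDiffEq, AllocCheck, Test

prob = ODEProblem((du, u, p, t) -> (du .= -u), ones(4), (0.0, 1.0))
integrator = init(prob, Tsit5(); save_everystep = false)
step!(integrator)  # warm up so all caches are initialized
# Compiler-enforced check: step! must contain no allocation sites.
@test isempty(AllocCheck.check_allocs(step!, (typeof(integrator),)))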

@ranocha (Member) commented Aug 25, 2025

Did you experience this specifically with split ODE solvers, too? I mainly have bad experience with their current state.

Hey, it is a good time to talk about this. It was one of my goals this summer to get the PDE space up and revived (for the Thunderbolt.jl project we have going, but of course there are a lot of knock-on effects to that as well), and so there has been a lot of motion here

That's great to hear 🥳 Thanks a lot for improving the state of the art in this field!

So I have 0 hope in the exponential integrators 😅. I think all of the literature out there touting them is academic without good enough software to back it up, and I think this is the definitive software attempt to say "I have beaten all of the implementations out there that I could find, and so this looks like the supreme being of exponential integrators, and it's still losing by 2 orders of magnitude". Nail in the coffin.

There may be one more idea to try: For something like the KdV example with spectral methods (like in https://docs.sciml.ai/SciMLBenchmarksOutput/stable/SimpleHandwrittenPDE/kdv_spectral_wpd/), the exponential could be computed much more efficiently using an FFT. The same also holds for other methods (fully explicit or implicit in the linear stiff part only). Right now, the stiff third-derivative discretization uses a dense matrix, so there is room for improvement for many classes of methods.
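
As a sketch of the FFT idea for the linear part u_t = -u_xxx on a periodic domain (sign and scaling conventions here are my own, not taken from the benchmark setup):

using FFTW

# The operator is diagonal in Fourier space: for u_t = -u_xxx we get
# d/dt û = i k^3 û, so exp(h L) is a pointwise multiplication by exp(i h k^3).
function apply_exp_linear(u, h, domain_length)
    k = (2π / domain_length) .* fftfreq(length(u), length(u))  # wavenumbers
    û = fft(u)
    @. û *= cis(h * k^3)  # cis(x) = exp(i x)
    return real(ifft(û))
end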

Is there an option to tell OrdinaryDiffEq.jl and SciMLOperators.jl something like "Hey, this $$T$$ is a linear operator that is constant in time and I do know how to solve a system with the operator $$I - \alpha T$$ efficiently"?
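
For context, SciMLOperators.jl can at least carry such information: FunctionOperator accepts an op_inverse. The sketch below uses hypothetical fast kernels and quotes the v1 API from memory, so details may need adjusting, and whether the implicit steppers actually route their $$I - \alpha T$$ solves through such an inverse is exactly the open question here.

using SciMLOperators

# Placeholder action of T and its (hypothetical) fast inverse.
T_apply(v, u, p, t) = 2 .* v
T_solve(v, u, p, t) = v ./ 2
T = FunctionOperator(T_apply, zeros(8);
                     op_inverse = T_solve, islinear = true, isconstant = true)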

@ranocha (Member) commented Aug 25, 2025

Should we make KenCarp4 change its default controller behavior when you put it into IMEX mode? I don't think so... but without changing the step controller it won't be competitive. So maybe it just needs something like a KenCarp4PDE that's just an alias with some re-tuned defaults? I've known this could be a thing for years but haven't put the time into actually putting together those defaults, but if you want to match the ARKODE result in pure Julia, that's the one major bit to change.

That's really interesting! Do you have an idea how to change the current KenCarp Julia implementations to make them always at least as good as ARKODE?

Comment on lines 167 to 171
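# Note: AutoFiniteDiff comes from ADTypes.jl and KrylovJL_GMRES from LinearSolve.jl
# (both assumed to be loaded earlier in the elixir).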
sol = solve(ode,
# Use (diagonally) implicit Runge-Kutta method with Jacobian-free (!) Newton-Krylov (GMRES) solver
# See https://docs.sciml.ai/DiffEqDocs/stable/tutorials/advanced_ode_example/#Using-Jacobian-Free-Newton-Krylov
KenCarp47(autodiff = AutoFiniteDiff(), linsolve = KrylovJL_GMRES());
ode_default_options()..., callback = callbacks)
Member

Could you please check how the ARKODE methods perform compared to the KenCarp methods implemented in OrdinaryDiffEq.jl?

@DanielDoehring (Member Author) commented Aug 26, 2025

Hm, the summary callback seems not to be compatible with the more abstract DifferentialEquations options:

ERROR: type DEOptions has no field adaptive
Stacktrace:
  [1] getproperty
    @ ./Base.jl:37 [inlined]
  [2] initialize_summary_callback(cb::DiscreteCallback{…}, u::Vector{…}, t::Float64, integrator::Sundials.ARKODEIntegrator{…}; reset_threads::Bool)
    @ Trixi ~/git/Trixi.jl/src/callbacks_step/summary.jl:197
  [3] initialize_summary_callback
    @ ~/git/Trixi.jl/src/callbacks_step/summary.jl:148 [inlined]
  [4] initialize
    @ ~/git/Trixi.jl/src/callbacks_step/summary.jl:20 [inlined]
  [5] initialize!
    @ ~/.julia/packages/DiffEqBase/AfBQT/src/callbacks.jl:18 [inlined]
  [6] initialize!
    @ ~/.julia/packages/DiffEqBase/AfBQT/src/callbacks.jl:7 [inlined]
  [7] initialize_callbacks!
    @ ~/.julia/packages/Sundials/ixqpi/src/common_interface/solve.jl:1574 [inlined]
  [8] initialize_callbacks!(integrator::Sundials.ARKODEIntegrator{…})
    @ Sundials ~/.julia/packages/Sundials/ixqpi/src/common_interface/solve.jl:1569
  [9] __init(prob::ODEProblem{…}, alg::ARKODE{…}, timeseries::Vector{…}, ts::Vector{…}, ks::Vector{…}; verbose::Bool, callback::CallbackSet{…}, abstol::Float64, reltol::Float64, saveat::Vector{…}, tstops::Vector{…}, d_discontinuities::Vector{…}, maxiters::Int64, dt::Nothing, dtmin::Float64, dtmax::Float64, timeseries_errors::Bool, dense_errors::Bool, save_everystep::Bool, save_idxs::Nothing, dense::Bool, save_on::Bool, save_start::Bool, save_end::Bool, save_timeseries::Nothing, progress::Bool, progress_steps::Int64, progress_name::String, progress_message::typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), progress_id::Symbol, advance_to_tstop::Bool, stop_at_next_tstop::Bool, userdata::Nothing, alias_u0::Bool, kwargs::@Kwargs{})
    @ Sundials ~/.julia/packages/Sundials/ixqpi/src/common_interface/solve.jl:951
 [10] __solve(prob::ODEProblem{…}, alg::ARKODE{…}, timeseries::Vector{…}, ts::Vector{…}, ks::Vector{…}, recompile::Type{…}; calculate_error::Bool, kwargs::@Kwargs{})
    @ Sundials ~/.julia/packages/Sundials/ixqpi/src/common_interface/solve.jl:15
 [11] __solve (repeats 5 times)
    @ ~/.julia/packages/Sundials/ixqpi/src/common_interface/solve.jl:3 [inlined]
 [12] solve_call(_prob::ODEProblem{…}, args::ARKODE{…}; merge_callbacks::Bool, kwargshandle::Nothing, kwargs::@Kwargs{})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/AfBQT/src/solve.jl:657
 [13] solve_call
    @ ~/.julia/packages/DiffEqBase/AfBQT/src/solve.jl:614 [inlined]
 [14] #solve_up#45
    @ ~/.julia/packages/DiffEqBase/AfBQT/src/solve.jl:1211 [inlined]
 [15] solve_up
    @ ~/.julia/packages/DiffEqBase/AfBQT/src/solve.jl:1188 [inlined]
 [16] #solve#43
    @ ~/.julia/packages/DiffEqBase/AfBQT/src/solve.jl:1083 [inlined]
 [17] top-level scope
    @ ~/git/Trixi.jl/examples/tree_1d_dgsem/elixir_diffusion_ldg_newton_krylov.jl:64
Some type information was truncated. Use `show(err)` to see complete types.

for

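# ARKODE is provided by Sundials.jl (assumed to be loaded).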
sol = solve(ode,
            ARKODE();
            ode_default_options()..., callback = callbacks)

@DanielDoehring (Member Author)

Without callbacks the simulation runs, but usefulness is questionable.

Member

but usefulness is questionable.

What do you mean by that?


The error seems to say you're doing something with the non-public API, likely something like integrator.opts.adaptive. Sundials cannot be mutated to being non-adaptive between steps, as its stepper cannot be non-adaptive in general.

@DanielDoehring (Member Author)

What do you mean by that?

Callbacks are somewhat essential for meaningful simulations, right? So what I mean is that unless we rewrite the callbacks to be Sundials-ready, simulations with Sundials are relatively useless.

@DanielDoehring (Member Author)

I added tolerances for the GMRES solver, which greatly reduces the runtime without significantly impacting the outcome. From my side this is ready to go 👍
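
For illustration, passing such tolerances might look like this (keyword names follow Krylov.jl's atol/rtol convention and the values are placeholders, not necessarily what this PR uses):

sol = solve(ode,
            KenCarp47(autodiff = AutoFiniteDiff(),
                      linsolve = KrylovJL_GMRES(atol = 1.0e-6, rtol = 1.0e-5));
            ode_default_options()..., callback = callbacks)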

@ranocha (Member) left a comment

Thanks! How did you choose the ODE solver tolerances?

@DanielDoehring (Member Author) commented Sep 1, 2025

Thanks! How did you choose the ODE solver tolerances?

Such that the error is space-dominated, i.e., stricter tolerances do not lead to (significantly) smaller errors.
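
Schematically, the check looks like this (assuming the ode, callbacks, and error reporting from the elixir; the tolerance values are examples):

for tol in (1.0e-4, 1.0e-6, 1.0e-8)
    sol = solve(ode,
                KenCarp47(autodiff = AutoFiniteDiff(), linsolve = KrylovJL_GMRES());
                ode_default_options()..., abstol = tol, reltol = tol,
                callback = callbacks)
    # compare the errors reported by the analysis callback across runs
end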

@ranocha (Member) left a comment

Thanks

@ranocha ranocha merged commit 1c29507 into trixi-framework:main Sep 1, 2025
34 of 35 checks passed
@DanielDoehring DanielDoehring deleted the ExamplesNewtonKrylov branch September 1, 2025 15:35