Skip to content

Conversation

@termi-official
Copy link
Contributor

@termi-official termi-official commented Oct 26, 2025

Resolves #2883 .

TODO

  • Upgrade non-standard controllers to new interface
  • Introduce multi-controller for algorithms with stiffness switching
  • Add appropriate documentation of the full API

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes were updated
  • The new code follows the
    contributor guidelines, in particular the SciML Style Guide and
    COLPRAC.
  • Any new documentation only uses public API

Additional context

Add any other context about the problem here.

Comment on lines +585 to +600
# If the error estimate is zero, we can increase the step size as much as
# desired. This additional check fixes problems of the code below when the
# error estimates become zero
# -> err1, err2, err3 become Inf
# -> err1^positive_number * err2^negative_number becomes NaN
# -> dt becomes NaN
#
# `EEst_min` is smaller than PETSC_SMALL used in the equivalent logic in PETSc.
# For example, `eps(Float64) ≈ 2.2e-16` but `PETSC_SMALL ≈ 1.0e-10` for `double`.
EEst_min = eps(typeof(EEst))
# The code below is a bit more robust than
# ```
# if iszero(EEst)
# EEst = eps(typeof(EEst))
# end
# ```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment makes no sense to me...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part exactly?

err1, err2, err3 = cache.err

k = min(alg_order(alg), alg_adaptive_order(alg)) + 1
dt_factor = err1^(beta1 / k) * err2^(beta2 / k) * err3^(beta3 / k)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use fastpow and inv(k) to save computation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have some evidence in Trixi.jl CI tests that fastpow (via Float32) reduces reproducibility of computational results across different systems, while the impact on computational efficiency is minimal (if at all noticeable). The PID controllers are used only by a few algorithms designed for large-scale PDE discretizations, where the runtime performance of computing the time step size is negligible. In contrast, stability and reproducibility are essential to us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also a bit confused. Even without fast powers etc the controller code should never become the bottleneck if we have more than a few nonlinear ODEs, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PDEs is the best case scenario here so it's not a good testing case. The worst case scenario is a small non-stiff ODE. Test again with something like a 2 ODE one compartment model. What is the overhead? At least last it was timed at around 5x.

I would be curious to know what you mean by "reproducibility" here because there is nothing that is platform-dependent and no system libraries used

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lorenz equations or other similarly sized chaotic systems are also good examples to test here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We typically test many examples: run an ODE solver for a few time steps and record the norms of the solution minus a reference (e.g., the initial condition). With fastpow, we observed that the results seem to differ more significantly when running on different systems (likely due to the reduced precision during the computation).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the default methods uses a PIDController, which is the only one not using fastpow at the moment. I would like to keep this state for backward compatibility in Trixi.jl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR definitely shouldn't change any code. It's a refactor? Any substance changes would need to go anywhere else. I didn't see that this wasn't to the PI

We typically test many examples: run an ODE solver for a few time steps and record the norms of the solution minus a reference (e.g., the initial condition). With fastpow, we observed that the results seem to differ more significantly when running on different systems (likely due to the reduced precision during the computation).

That's definitely odd. I used test have some checks on chaotic systems that you'd reproduce across OS's and chips . @oscardssmith can you look into this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will investigate

erracc::QT
dtacc::tType
# TODO ^^^ remove these
controller_cache::CC
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at some of the algorithms, we even might want to think moving the controller into the algorithm cache together with some helpers to construct standard controllers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Interaction between controller api and time loop

4 participants