@franckgaga
Member

@franckgaga franckgaga commented Nov 1, 2025

The new keyword argument allows enabling and customizing the computation of the Hessian of the Lagrangian function. Note that this is only supported with the oracle=true option. Changing the Hessian backend will throw an error if oracle=false.

The default is hessian=false, meaning that the Hessian is not computed and the quasi-Newton approximation provided by the optimizer is used instead (e.g. L-BFGS for Ipopt.jl). The other options are:

  • hessian=true: it will select an appropriate AD backend for the Hessian, which is:
    AutoSparse(
        AutoForwardDiff(); 
        sparsity_detector  = TracerSparsityDetector(), 
        coloring_algorithm = GreedyColoringAlgorithm()
    )
  • hessian=backend: use backend::AbstractADType for the computation of the Hessian.
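
As an illustration of the three settings, here is a minimal sketch. The toy model, its dimensions and the horizons are hypothetical, and the NonLinModel/NonLinMPC call signatures are assumed from the package documentation rather than taken from this PR:

```julia
using ModelPredictiveControl
using ADTypes: AutoForwardDiff, AutoSparse
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm

# Hypothetical discrete-time toy model: 1 input, 2 states, 1 output.
f(x, u, _, _) = [0.9x[1] + 0.1x[2]^2 + u[1], 0.8x[2] + 0.05x[1]]
h(x, _, _) = [x[1]]
# Assumption: solver=nothing declares f as a discrete-time state update.
model = NonLinModel(f, h, 0.1, 1, 2, 1; solver=nothing)

# Default: no exact Hessian, the optimizer uses its quasi-Newton approximation.
mpc_default = NonLinMPC(model; Hp=20, Hc=2, oracle=true, hessian=false)

# Let the package select the sparse ForwardDiff backend listed above.
mpc_exact = NonLinMPC(model; Hp=20, Hc=2, oracle=true, hessian=true)

# Or pass any AbstractADType explicitly.
backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector  = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm(),
)
mpc_custom = NonLinMPC(model; Hp=20, Hc=2, oracle=true, hessian=backend)
```

Whether an exact Hessian pays off depends on the problem: it costs extra AD passes per iteration but may reduce the number of iterations.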

Also, pretty-printing a NonLinMPC now shows the Hessian backend:

NonLinMPC controller with a sample time Ts = 0.1 s:
├ estimator: UnscentedKalmanFilter
├ model: NonLinModel
├ optimizer: Uno 
├ transcription: MultipleShooting
├ gradient: AutoForwardDiff
├ jacobian: AutoSparse (AutoForwardDiff, TracerSparsityDetector, GreedyColoringAlgorithm)
├ hessian: AutoSparse (AutoForwardDiff, TracerSparsityDetector, GreedyColoringAlgorithm)
└ dimensions:
  ├ 20 prediction steps Hp
  ├  2 control steps Hc
  ├  0 slack variable ϵ (control constraints)
  ├  1 manipulated inputs u (1 integrating states)
  ├  3 estimated states x̂
  ├  1 measured outputs ym (0 integrating states)
  ├  0 unmeasured outputs yu
  └  0 measured disturbances d

@franckgaga franckgaga changed the title from "added: hessian argument in NonLinMPC" to "added: hessian keyword argument in NonLinMPC" Nov 1, 2025
@codecov-commenter

codecov-commenter commented Nov 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.48%. Comparing base (b9f325f) to head (df1bb37).
⚠️ Report is 20 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #275      +/-   ##
==========================================
+ Coverage   98.45%   98.48%   +0.03%     
==========================================
  Files          28       28              
  Lines        4659     4764     +105     
==========================================
+ Hits         4587     4692     +105     
  Misses         72       72              

@github-actions

github-actions bot commented Nov 1, 2025

Benchmark Results (Julia v1)

Time benchmarks
Columns: benchmark, main, df1bb37..., ratio main / df1bb37...
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/DAQP/SingleShooting 5.34 ± 0.5 ms 4.86 ± 0.51 ms 1.1 ± 0.15
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/Ipopt/MultipleShooting 0.313 ± 0.0042 s 0.31 ± 0.0042 s 1.01 ± 0.019
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/Ipopt/SingleShooting 0.225 ± 0.012 s 0.224 ± 0.011 s 1.01 ± 0.07
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/OSQP/MultipleShooting 10.2 ± 0.56 ms 8.94 ± 0.52 ms 1.14 ± 0.091
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/OSQP/SingleShooting 1.73 ± 0.065 ms 1.64 ± 0.064 ms 1.05 ± 0.057
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/DAQP/SingleShooting 5.29 ± 0.49 ms 5.15 ± 0.51 ms 1.03 ± 0.14
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/Ipopt/MultipleShooting 0.279 ± 0.0024 s 0.275 ± 0.002 s 1.01 ± 0.011
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/Ipopt/SingleShooting 0.24 ± 0.0014 s 0.238 ± 0.0016 s 1.01 ± 0.0089
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/OSQP/MultipleShooting 7.16 ± 0.43 ms 6.13 ± 0.39 ms 1.17 ± 0.1
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/OSQP/SingleShooting 1.85 ± 0.072 ms 1.74 ± 0.068 ms 1.06 ± 0.059
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/DAQP/SingleShooting 8.51 ± 1.4 ms 8.54 ± 1.4 ms 0.996 ± 0.23
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/Ipopt/MultipleShooting 0.308 ± 0.036 s 0.3 ± 0.02 s 1.03 ± 0.14
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/Ipopt/SingleShooting 0.157 ± 0.0021 s 0.156 ± 0.0012 s 1 ± 0.015
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/OSQP/MultipleShooting 0.101 ± 0.0082 s 0.1 ± 0.011 s 1.01 ± 0.14
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/OSQP/SingleShooting 11.7 ± 1.5 ms 11.4 ± 1.4 ms 1.03 ± 0.18
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Custom constraints/Ipopt/MultipleShooting 0.699 ± 0.034 s 0.693 ± 0.036 s 1.01 ± 0.072
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Custom constraints/Ipopt/SingleShooting 1.82 ± 0.016 s 1.75 ± 0.081 s 1.04 ± 0.05
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Custom constraints/Ipopt/TrapezoidalCollocation 0.701 ± 0.037 s 0.697 ± 0.024 s 1 ± 0.063
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/Ipopt/MultipleShooting 0.356 ± 0.023 s 0.354 ± 0.0086 s 1 ± 0.069
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/Ipopt/SingleShooting 0.517 ± 0.037 s 0.488 ± 0.012 s 1.06 ± 0.08
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/Ipopt/TrapezoidalCollocation 0.331 ± 0.018 s 0.328 ± 0.0096 s 1.01 ± 0.063
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/MadNLP/SingleShooting 0.146 ± 0.0063 s 0.145 ± 0.0059 s 1 ± 0.06
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/MultipleShooting 0.338 ± 0.014 s 0.342 ± 0.0054 s 0.988 ± 0.044
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/MultipleShooting (threaded) 0.364 ± 0.028 s 0.367 ± 0.03 s 0.992 ± 0.11
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/SingleShooting 0.512 ± 0.035 s 0.491 ± 0.011 s 1.04 ± 0.074
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/TrapezoidalCollocation 0.326 ± 0.012 s 0.324 ± 0.011 s 1.01 ± 0.049
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/TrapezoidalCollocation (threaded) 0.357 ± 0.027 s 0.353 ± 0.029 s 1.01 ± 0.11
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/MadNLP/SingleShooting 0.122 ± 0.0055 s 0.124 ± 0.0052 s 0.986 ± 0.061
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/DAQP/Current form 0.0391 ± 0.0037 s 0.0374 ± 0.0034 s 1.04 ± 0.14
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/DAQP/Prediction form 0.0327 ± 0.0077 s 31.4 ± 5.5 ms 1.04 ± 0.31
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/Ipopt/Current form 0.186 ± 0.038 s 0.188 ± 0.03 s 0.989 ± 0.25
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/Ipopt/Prediction form 0.166 ± 0.022 s 0.159 ± 0.024 s 1.04 ± 0.21
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/OSQP/Current form 0.0347 ± 0.0045 s 0.033 ± 0.0038 s 1.05 ± 0.18
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/OSQP/Prediction form 27.1 ± 4.8 ms 25.9 ± 0.68 ms 1.05 ± 0.19
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/Ipopt/Current form 9.67 ± 0.041 s 9.67 ± 0.038 s 1 ± 0.0058
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/Ipopt/Prediction form 3.62 ± 0.022 s 3.54 ± 0.014 s 1.02 ± 0.0074
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/MadNLP/Current form 2.56 ± 0.024 s 2.44 ± 0.044 s 1.05 ± 0.021
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/MadNLP/Prediction form 1.45 ± 0.014 s 1.42 ± 0.013 s 1.02 ± 0.014
UNIT TESTS/PredictiveController/ExplicitMPC/moveinput! 3.81 ± 0.03 μs 4.25 ± 0.04 μs 0.896 ± 0.011
UNIT TESTS/PredictiveController/LinMPC/moveinput!/MultipleShooting 0.128 ± 0.0078 ms 0.109 ± 0.0071 ms 1.18 ± 0.11
UNIT TESTS/PredictiveController/LinMPC/moveinput!/SingleShooting 16.9 ± 0.35 μs 15.5 ± 0.33 μs 1.09 ± 0.032
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/LinModel/MultipleShooting 2.63 ± 0.22 ms 2.6 ± 0.32 ms 1.01 ± 0.15
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/LinModel/SingleShooting 1.94 ± 0.23 ms 1.87 ± 0.18 ms 1.03 ± 0.16
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/NonLinModel/MultipleShooting 3.06 ± 0.11 ms 3.08 ± 0.15 ms 0.995 ± 0.059
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/NonLinModel/SingleShooting 1.78 ± 0.11 ms 1.76 ± 0.066 ms 1.01 ± 0.075
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/NonLinModel/TrapezoidalCollocation 2.19 ± 0.13 ms 2.2 ± 0.12 ms 0.999 ± 0.081
UNIT TESTS/SimModel/LinModel/evaloutput 0.16 ± 0.009 μs 0.14 ± 0.01 μs 1.14 ± 0.1
UNIT TESTS/SimModel/LinModel/updatestate! 0.21 ± 0.01 μs 0.21 ± 0.01 μs 1 ± 0.067
UNIT TESTS/SimModel/NonLinModel/evaloutput 0.421 ± 0 μs 0.431 ± 0.01 μs 0.977 ± 0.023
UNIT TESTS/SimModel/NonLinModel/linearize! 2.01 ± 0.02 μs 2.05 ± 0.021 μs 0.981 ± 0.014
UNIT TESTS/SimModel/NonLinModel/updatestate! 0.491 ± 0.01 μs 0.491 ± 0.01 μs 1 ± 0.029
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/evaloutput/LinModel 0.551 ± 0.001 μs 0.561 ± 0.01 μs 0.982 ± 0.018
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/evaloutput/NonLinModel 1.79 ± 0.019 μs 1.82 ± 0.011 μs 0.984 ± 0.012
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/preparestate!/LinModel 0.28 ± 0.01 μs 0.301 ± 0.001 μs 0.93 ± 0.033
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/preparestate!/NonLinModel 1.41 ± 0.001 μs 1.43 ± 0.001 μs 0.986 ± 0.00098
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/updatestate!/LinModel 3.91 ± 0.051 μs 4.04 ± 0.06 μs 0.968 ± 0.019
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/updatestate!/NonLinModel 8.93 ± 0.081 μs 9.06 ± 0.07 μs 0.986 ± 0.012
UNIT TESTS/StateEstimator/InternalModel/evaloutput/LinModel 0.24 ± 0.001 μs 0.241 ± 0.01 μs 0.996 ± 0.042
UNIT TESTS/StateEstimator/InternalModel/evaloutput/NonLinModel 0.571 ± 0 μs 0.581 ± 0.009 μs 0.983 ± 0.015
UNIT TESTS/StateEstimator/InternalModel/preparestate!/LinModel 0.281 ± 0.001 μs 0.3 ± 0.01 μs 0.937 ± 0.031
UNIT TESTS/StateEstimator/InternalModel/preparestate!/NonLinModel 0.732 ± 0.01 μs 0.731 ± 0.001 μs 1 ± 0.014
UNIT TESTS/StateEstimator/InternalModel/updatestate!/LinModel 0.411 ± 0.001 μs 0.411 ± 0.001 μs 1 ± 0.0034
UNIT TESTS/StateEstimator/InternalModel/updatestate!/NonLinModel 0.912 ± 0.01 μs 0.922 ± 0.001 μs 0.989 ± 0.011
UNIT TESTS/StateEstimator/KalmanFilter/evaloutput 0.261 ± 0.01 μs 0.27 ± 0.01 μs 0.967 ± 0.052
UNIT TESTS/StateEstimator/KalmanFilter/preparestate! 0.14 ± 0 μs 0.14 ± 0.01 μs 1 ± 0.071
UNIT TESTS/StateEstimator/KalmanFilter/updatestate! 2.52 ± 0.03 μs 2.6 ± 0.04 μs 0.973 ± 0.019
UNIT TESTS/StateEstimator/Luenberger/evaloutput 0.24 ± 0.001 μs 0.25 ± 0.011 μs 0.96 ± 0.042
UNIT TESTS/StateEstimator/Luenberger/preparestate! 0.26 ± 0.001 μs 0.261 ± 0.011 μs 0.996 ± 0.042
UNIT TESTS/StateEstimator/Luenberger/updatestate! 0.351 ± 0.01 μs 0.361 ± 0.011 μs 0.972 ± 0.041
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/LinModel/Current form 3.4 ± 0.19 ms 3.48 ± 0.21 ms 0.978 ± 0.081
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/LinModel/Prediction form 0.481 ± 0 μs 0.481 ± 0.01 μs 1 ± 0.021
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/NonLinModel/Current form 0.34 ± 0.018 ms 0.317 ± 0.018 ms 1.07 ± 0.084
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/NonLinModel/Prediction form 1.33 ± 0.011 μs 1.32 ± 0.01 μs 1.01 ± 0.011
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/LinModel/Current form 7.62 ± 2.2 μs 7.77 ± 2.2 μs 0.98 ± 0.39
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/LinModel/Prediction form 2.93 ± 0.12 ms 2.96 ± 0.12 ms 0.99 ± 0.058
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/NonLinModel/Current form 16.8 ± 0.38 μs 16.7 ± 0.31 μs 1 ± 0.029
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/NonLinModel/Prediction form 0.34 ± 0.018 ms 0.317 ± 0.019 ms 1.07 ± 0.084
UNIT TESTS/StateEstimator/SteadyKalmanFilter/evaloutput 0.27 ± 0.01 μs 0.271 ± 0.01 μs 0.996 ± 0.052
UNIT TESTS/StateEstimator/SteadyKalmanFilter/preparestate! 0.29 ± 0.01 μs 0.281 ± 0.01 μs 1.03 ± 0.051
UNIT TESTS/StateEstimator/SteadyKalmanFilter/updatestate! 0.381 ± 0.001 μs 0.391 ± 0.01 μs 0.974 ± 0.025
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/evaloutput/LinModel 0.301 ± 0.001 μs 0.331 ± 0.01 μs 0.909 ± 0.028
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/evaloutput/NonLinModel 0.942 ± 0.01 μs 0.942 ± 0.011 μs 1 ± 0.016
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/preparestate!/LinModel 3.68 ± 0.07 μs 3.7 ± 0.029 μs 0.995 ± 0.02
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/preparestate!/NonLinModel 4.86 ± 0.03 μs 4.89 ± 0.04 μs 0.994 ± 0.01
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/updatestate!/LinModel 3.51 ± 0.031 μs 3.52 ± 0.03 μs 0.997 ± 0.012
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/updatestate!/NonLinModel 6.46 ± 0.12 μs 6.38 ± 0.04 μs 1.01 ± 0.02
time_to_load 3.41 ± 0.03 s 3.45 ± 0.015 s 0.987 ± 0.0098
Memory benchmarks
Columns: benchmark, main, df1bb37..., ratio main / df1bb37...
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/DAQP/SingleShooting 0.0424 M allocs: 1.7 MB 0.0424 M allocs: 1.7 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/Ipopt/MultipleShooting 0.162 M allocs: 9.02 MB 0.162 M allocs: 9.02 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/Ipopt/SingleShooting 0.0538 M allocs: 2.4 MB 0.0538 M allocs: 2.4 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/OSQP/MultipleShooting 0.0758 M allocs: 1.95 MB 0.0758 M allocs: 1.95 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/With feedforward/OSQP/SingleShooting 7.85 k allocs: 0.249 MB 7.85 k allocs: 0.249 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/DAQP/SingleShooting 0.0422 M allocs: 1.7 MB 0.0422 M allocs: 1.7 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/Ipopt/MultipleShooting 0.127 M allocs: 7.19 MB 0.127 M allocs: 7.19 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/Ipopt/SingleShooting 0.0555 M allocs: 2.44 MB 0.0555 M allocs: 2.44 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/OSQP/MultipleShooting 0.0532 M allocs: 1.39 MB 0.0532 M allocs: 1.39 MB 1
CASE STUDIES/PredictiveController/CSTR/LinMPC/Without feedforward/OSQP/SingleShooting 7.7 k allocs: 0.243 MB 7.7 k allocs: 0.243 MB 1
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/DAQP/SingleShooting 0.107 M allocs: 5.91 MB 0.107 M allocs: 5.91 MB 1
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/Ipopt/MultipleShooting 4.65 M allocs: 0.25 GB 4.65 M allocs: 0.25 GB 1
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/Ipopt/SingleShooting 0.108 M allocs: 6.34 MB 0.108 M allocs: 6.34 MB 1
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/OSQP/MultipleShooting 4.68 M allocs: 0.254 GB 4.68 M allocs: 0.254 GB 1
CASE STUDIES/PredictiveController/Pendulum/LinMPC/Successive linearization/OSQP/SingleShooting 0.119 M allocs: 7.46 MB 0.119 M allocs: 7.46 MB 1
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Custom constraints/Ipopt/MultipleShooting 0.353 M allocs: 27.4 MB 0.354 M allocs: 20.7 MB 1.33
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Custom constraints/Ipopt/SingleShooting 0.351 M allocs: 0.0509 GB 0.357 M allocs: 17.2 MB 3.03
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Custom constraints/Ipopt/TrapezoidalCollocation 0.536 M allocs: 0.0354 GB 0.537 M allocs: 29.2 MB 1.24
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/Ipopt/MultipleShooting 0.27 M allocs: 20.9 MB 0.271 M allocs: 15.6 MB 1.34
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/Ipopt/SingleShooting 0.0962 M allocs: 18 MB 0.0988 M allocs: 4.16 MB 4.32
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/Ipopt/TrapezoidalCollocation 0.38 M allocs: 25.9 MB 0.38 M allocs: 20.6 MB 1.25
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Economic/MadNLP/SingleShooting 0.297 M allocs: 0.0649 GB 0.297 M allocs: 0.0649 GB 1
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/MultipleShooting 0.229 M allocs: 17.5 MB 0.23 M allocs: 13.2 MB 1.33
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/MultipleShooting (threaded) 0.247 M allocs: 25.4 MB 0.248 M allocs: 21 MB 1.21
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/SingleShooting 0.0758 M allocs: 13.6 MB 0.0778 M allocs: 3.26 MB 4.16
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/TrapezoidalCollocation 0.32 M allocs: 21.6 MB 0.321 M allocs: 17.3 MB 1.25
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/Ipopt/TrapezoidalCollocation (threaded) 0.338 M allocs: 29.6 MB 0.339 M allocs: 25.3 MB 1.17
CASE STUDIES/PredictiveController/Pendulum/NonLinMPC/Noneconomic/MadNLP/SingleShooting 0.255 M allocs: 0.056 GB 0.255 M allocs: 0.056 GB 1
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/DAQP/Current form 0.762 M allocs: 0.0804 GB 0.762 M allocs: 0.0804 GB 1
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/DAQP/Prediction form 0.682 M allocs: 0.0587 GB 0.682 M allocs: 0.0587 GB 1
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/Ipopt/Current form 0.647 M allocs: 0.0787 GB 0.647 M allocs: 0.0787 GB 1
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/Ipopt/Prediction form 0.601 M allocs: 0.0564 GB 0.601 M allocs: 0.0564 GB 1
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/OSQP/Current form 0.63 M allocs: 0.0766 GB 0.63 M allocs: 0.0766 GB 1
CASE STUDIES/StateEstimator/CSTR/MovingHorizonEstimator/OSQP/Prediction form 0.585 M allocs: 0.0555 GB 0.585 M allocs: 0.0555 GB 1
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/Ipopt/Current form 14.4 M allocs: 2.56 GB 14.4 M allocs: 2.56 GB 1
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/Ipopt/Prediction form 2.16 M allocs: 0.371 GB 2.16 M allocs: 0.371 GB 1
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/MadNLP/Current form 15.2 M allocs: 2.8 GB 15.2 M allocs: 2.8 GB 1
CASE STUDIES/StateEstimator/Pendulum/MovingHorizonEstimator/MadNLP/Prediction form 8.88 M allocs: 1.63 GB 8.88 M allocs: 1.63 GB 1
UNIT TESTS/PredictiveController/ExplicitMPC/moveinput! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/PredictiveController/LinMPC/moveinput!/MultipleShooting 0.994 k allocs: 25.5 kB 0.994 k allocs: 25.5 kB 1
UNIT TESTS/PredictiveController/LinMPC/moveinput!/SingleShooting 0.088 k allocs: 2.23 kB 0.088 k allocs: 2.23 kB 1
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/LinModel/MultipleShooting 2.91 k allocs: 0.192 MB 2.92 k allocs: 0.149 MB 1.29
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/LinModel/SingleShooting 0.511 k allocs: 0.0542 MB 0.519 k allocs: 19.6 kB 2.83
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/NonLinModel/MultipleShooting 3.15 k allocs: 0.27 MB 3.16 k allocs: 0.168 MB 1.6
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/NonLinModel/SingleShooting 0.545 k allocs: 0.114 MB 0.553 k allocs: 21.2 kB 5.52
UNIT TESTS/PredictiveController/NonLinMPC/moveinput!/NonLinModel/TrapezoidalCollocation 2.27 k allocs: 0.16 MB 2.27 k allocs: 0.102 MB 1.56
UNIT TESTS/SimModel/LinModel/evaloutput 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/SimModel/LinModel/updatestate! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/SimModel/NonLinModel/evaloutput 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/SimModel/NonLinModel/linearize! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/SimModel/NonLinModel/updatestate! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/evaloutput/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/evaloutput/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/preparestate!/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/preparestate!/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/updatestate!/LinModel 4 allocs: 0.0938 kB 4 allocs: 0.0938 kB 1
UNIT TESTS/StateEstimator/ExtendedKalmanFilter/updatestate!/NonLinModel 4 allocs: 0.0938 kB 4 allocs: 0.0938 kB 1
UNIT TESTS/StateEstimator/InternalModel/evaloutput/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/InternalModel/evaloutput/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/InternalModel/preparestate!/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/InternalModel/preparestate!/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/InternalModel/updatestate!/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/InternalModel/updatestate!/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/KalmanFilter/evaloutput 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/KalmanFilter/preparestate! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/KalmanFilter/updatestate! 4 allocs: 0.0938 kB 4 allocs: 0.0938 kB 1
UNIT TESTS/StateEstimator/Luenberger/evaloutput 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/Luenberger/preparestate! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/Luenberger/updatestate! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/LinModel/Current form 0.0754 M allocs: 14.6 MB 0.0754 M allocs: 14.6 MB 0.998
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/LinModel/Prediction form 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/NonLinModel/Current form 0.973 k allocs: 25.4 kB 0.973 k allocs: 25.4 kB 1
UNIT TESTS/StateEstimator/MovingHorizonEstimator/preparestate!/NonLinModel/Prediction form 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/LinModel/Current form 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/LinModel/Prediction form 7.04 k allocs: 0.445 MB 7.04 k allocs: 0.445 MB 1
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/NonLinModel/Current form 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/MovingHorizonEstimator/updatestate!/NonLinModel/Prediction form 0.973 k allocs: 25.4 kB 0.973 k allocs: 25.4 kB 1
UNIT TESTS/StateEstimator/SteadyKalmanFilter/evaloutput 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/SteadyKalmanFilter/preparestate! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/SteadyKalmanFilter/updatestate! 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/evaloutput/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/evaloutput/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/preparestate!/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/preparestate!/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/updatestate!/LinModel 0 allocs: 0 B 0 allocs: 0 B
UNIT TESTS/StateEstimator/UnscentedKalmanFilter/updatestate!/NonLinModel 0 allocs: 0 B 0 allocs: 0 B
time_to_load 0.149 k allocs: 11.2 kB 0.159 k allocs: 11.6 kB 0.964

@franckgaga franckgaga merged commit 6fdbce5 into main Nov 2, 2025
5 checks passed
@franckgaga franckgaga deleted the exact_hessian branch November 2, 2025 01:11
@gdalle
Contributor

gdalle commented Nov 4, 2025

Hi @franckgaga,
@amontoison alerted me to this PR, and I have a couple of questions/suggestions:

  • If the bulk of the time is spent in actual autodiff, you could try GreedyColoringAlgorithm(; decompression=:substitution, postprocessing=true) to see if it reduces the number of colors (aka the number of autodiff passes, up to ForwardDiff chunking).
  • If the bulk of the time is spent in decompression, it might be because your function fill_lowertriangle! is not optimized for SparseMatrixCSC (individual indexing is very slow on such matrices). I suggest you keep track of the issue I opened about this: Decompress into a single triangle gdalle/SparseMatrixColorings.jl#280
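
To check the first point without running the whole controller, the number of colors can be compared directly on a sparsity pattern. This is only a sketch: the tridiagonal pattern below is a hypothetical stand-in for the real Hessian pattern, and the result for the actual problem may differ.

```julia
using SparseArrays
using SparseMatrixColorings

# Hypothetical symmetric sparsity pattern (tridiagonal) standing in for the
# Hessian of the Lagrangian.
n = 200
S = spdiagm(0 => ones(n), 1 => ones(n - 1), -1 => ones(n - 1))

# A Hessian is symmetric, so a symmetric coloring problem is used.
problem = ColoringProblem(; structure=:symmetric, partition=:column)

algo_direct = GreedyColoringAlgorithm(; decompression=:direct)
algo_subst  = GreedyColoringAlgorithm(; decompression=:substitution, postprocessing=true)

# Each color corresponds to one group of columns evaluated together.
n_direct = ncolors(coloring(S, problem, algo_direct))
n_subst  = ncolors(coloring(S, problem, algo_subst))

println("direct: ", n_direct, " colors, substitution: ", n_subst, " colors")
```

If both counts are the same on the real pattern, the substitution option is unlikely to reduce the number of autodiff passes.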

@franckgaga
Member Author

franckgaga commented Nov 4, 2025

Many thanks for the suggestions @gdalle. I tried to distinguish the actual autodiff from the decompression using @profview, but I got a bit lost in the call stack. The profile is attached in case you can decode it better than I can; see "thread 1 (interactive)": profile.html

If the bulk of the time is spent in actual autodiff, you could try GreedyColoringAlgorithm(; decompression=:substitution, postprocessing=true) to see if it reduces the number of colors (aka the number of autodiff passes, up to ForwardDiff chunking).

I benchmarked my case study with these two options and it changes essentially nothing in the results, compared to GreedyColoringAlgorithm().

If the bulk of the time is spent in decompression, it might be because your function fill_lowertriangle! is not optimized for SparseMatrixCSC (individual indexing is very slow on such matrices). I suggest you keep track of the issue I opened about this: gdalle/SparseMatrixColorings.jl#280

Alright, thanks, I'm following it. The JuMP.@operator docstring says that I must fill in-place only the non-zero lower-triangular entries. Do you have any suggestions for a more efficient implementation of the fill_lowertriangle! function?

@gdalle
Contributor

gdalle commented Nov 4, 2025

I tried to distinguish the actual autodiff from the decompression using @profview, but I got a bit lost in the call stack. The profile is attached in case you can decode it better than I can; see "thread 1 (interactive)": profile.html

This doesn't look too bad, it's all blue and you're spending most of your time in the actual differentiation (gradients and hessians).

Do you have any suggestions for a more efficient implementation of the fill_lowertriangle! function?

Here's a better implementation, which can be a thousand times faster (although it doesn't seem to impact your profiling). It's similar to what I'd like to do in SMC, but I'm still figuring out the API design for filling only one triangle.

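using LinearAlgebra, SparseArrays, StableRNGs

# Naive version: visits every (i, j) pair of the lower triangle and uses
# individual indexing, which is slow on SparseMatrixCSC.
function copyto_lowertriangle_naive!(T::SparseMatrixCSC, A::SparseMatrixCSC)
    for j in axes(A, 2)
        for i in axes(A, 1)
            if i >= j
                T[i, j] = A[i, j]
            end
        end
    end
    return T
end

# Fast version: walks the stored entries of A column by column and writes
# straight into the nonzero buffer of T. It assumes that the sparsity pattern
# of T is exactly the lower triangle of the pattern of A.
function copyto_lowertriangle!(T::SparseMatrixCSC, A::SparseMatrixCSC)
    @assert size(T) == size(A)
    kT = 0  # running index into the stored entries of T
    rvA, rvT = rowvals(A), rowvals(T)
    nzA, nzT = nonzeros(A), nonzeros(T)
    for j in axes(A, 2)
        for kA in nzrange(A, j)
            i = rvA[kA]
            if i >= j
                kT += 1
                @assert i == rvT[kT]  # patterns of T and A must agree
                nzT[kT] = nzA[kA]
            end
        end
    end
    return T
end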

Benchmark and test:

julia> n, p = 1_000, 0.005;

julia> A = sparse(Symmetric(sprand(n, n, p)));

julia> T = similar(sparse(LowerTriangular(A)));

julia> copyto_lowertriangle!(T, A)

julia> T == LowerTriangular(A)
true

julia> using Chairmarks

julia> @b (T, A) copyto_lowertriangle_naive!(_[1], _[2])
7.881 ms

julia> @b (T, A) copyto_lowertriangle!(_[1], _[2])
12.958 μs

@franckgaga
Member Author

franckgaga commented Nov 5, 2025

Impressive gains!

I just understood that MathOptInterface will call the in-place ∇²f function of @operator with the first argument as H::MathOptInterface.Nonlinear.ReverseAD._UnsafeLowerTriangularMatrixView, instead of a Matrix or SparseMatrixCSC. Its docstring is:

  _UnsafeLowerTriangularMatrixView(x, N)

  Lightweight unsafe view that converts a vector x into the lower-triangular component of a symmetric N-by-N matrix.

  Motivation
  ==========

  _UnsafeLowerTriangularMatrixView is needed as an allocation-free equivalent of view. Other alternatives, like reshape(view(x, 1:N^2), N, N) or a struct like

  struct _SafeView{T}
      x::Vector{T}
      len::Int
  end

  will allocate so that x can be tracked by Julia's GC. _UnsafeLowerTriangularMatrixView relies on the fact that the use-cases of _UnsafeLowerTriangularMatrixView only temporarily wrap a
  long-lived vector like d.jac_storage so that we don't have to worry about the GC removing d.jac_storage while _UnsafeLowerTriangularMatrixView exists. This lets us use a Ptr{T} and
  create a struct that is isbitstype and therefore does not allocate.

  Unsafe behavior
  ===============

  _UnsafeLowerTriangularMatrixView is unsafe because it assumes that the vector x remains valid during the usage of _UnsafeLowerTriangularMatrixView.

  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  _UnsafeLowerTriangularMatrixView(x::Vector{Float64}, N::Int)

  Create a new _UnsafeLowerTriangularMatrixView from x, zero the elements in x, and resize x if needed to ensure it has a length of at least N * (N + 1) / 2.

  Unsafe behavior
  ===============

  In addition to the usafe behavior of _UnsafeLowerTriangularMatrixView, this constructor is additionally unsafe because it may resize x. Only call it if you are sure that the usage of
  _UnsafeLowerTriangularMatrixView(x, N) is short-lived, and that there are no other views to x while the returned value is within scope.

That's probably why individual indexing of H does not seem to be slow here. This tutorial provides some additional details.

That being said, I just noticed that I filled all the entries of the lower-triangular part, and the docstring of @operator says that I "must fill in the non-zero lower-triangular entries only". I'm opening a PR to correct this.
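
For reference, the pattern described in that docstring looks roughly like the following sketch, adapted from the Rosenbrock example in the JuMP documentation. The function and operator names are illustrative and are not the ones used in this package:

```julia
using JuMP, Ipopt

# Rosenbrock function with hand-written derivatives.
f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

# In-place gradient: fill g and return nothing.
function ∇f(g::AbstractVector{T}, x::T...) where {T}
    g[1] = 400 * x[1]^3 - 400 * x[1] * x[2] + 2 * x[1] - 2
    g[2] = 200 * (x[2] - x[1]^2)
    return
end

# In-place Hessian: fill only the non-zero lower-triangular entries of H.
function ∇²f(H::AbstractMatrix{T}, x::T...) where {T}
    H[1, 1] = 1200 * x[1]^2 - 400 * x[2] + 2
    # H[1, 2] is in the upper triangle, so it is deliberately left untouched.
    H[2, 1] = -400 * x[1]
    H[2, 2] = 200.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x[1:2])
# Register the operator with its gradient and Hessian callbacks.
@operator(model, op_rosenbrock, 2, f, ∇f, ∇²f)
@objective(model, Min, op_rosenbrock(x[1], x[2]))
optimize!(model)
```

Entries that are structurally zero are simply never written, which is the difference from filling the whole lower triangle.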

@franckgaga
Member Author

Come to think of it, the destination matrix is not a SparseMatrixCSC, but the source A matrix will still be a SparseMatrixCSC, and I'm indexing to read the non-zero entries.

I need to do some benchmarks to verify if this part is slow...
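
One way to avoid indexing into the sparse source altogether is to iterate over its stored entries, as in the sketch below. The helper name is hypothetical, and a dense matrix stands in for the view that MOI actually passes as the destination:

```julia
using LinearAlgebra, SparseArrays

# Hypothetical helper: copy the stored lower-triangular entries of a sparse
# matrix A into any destination H that supports H[i, j] = v.
function fill_lowertriangle_from_sparse!(H::AbstractMatrix, A::SparseMatrixCSC)
    @assert size(H) == size(A)
    rows, vals = rowvals(A), nonzeros(A)
    for j in axes(A, 2)
        # nzrange(A, j) visits only the stored entries of column j, so no
        # A[i, j] lookup (a search within the column) is needed.
        for k in nzrange(A, j)
            i = rows[k]
            if i >= j
                H[i, j] = vals[k]
            end
        end
    end
    return H
end

A = sparse(Symmetric(sprand(100, 100, 0.05)))
H = zeros(100, 100)  # dense stand-in for the destination passed to ∇²f
fill_lowertriangle_from_sparse!(H, A)
```

Only the non-zero lower-triangular entries are written, and the cost scales with the number of stored entries of A rather than with the square of its size.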

@franckgaga
Member Author

franckgaga commented Nov 5, 2025

Okay, I did some more benchmarks. The indexing of A::SparseMatrixCSC for reading the lower-triangular non-zero entries is not the bottleneck here. As a matter of fact, I don't think there is much more room for improvement on my side, either in the DI.jl computations or in interfacing them with the VectorNonlinearOracle/@operator API.

I'm not sure why it did not appear in my first profile above, but I ran the profiler multiple times and now, systematically, a fair amount of time is spent in MOI matrix coloring algorithms. See under thread "1 (interactive)" and under initialize in the new profile here: profile.html. Is this expected, @odow? It feels odd that MOI spends so much time in its own coloring code when I'm delegating Hessian matrix coloring to DifferentiationInterface.jl.

edit: maybe also @amontoison can answer my question.

@gdalle
Contributor

gdalle commented Nov 5, 2025

Maybe MOI does the coloring regardless of whether it actually has to compute a hessian?

@odow
Contributor

odow commented Nov 5, 2025

Are there other scalar nonlinear constraints in the model? Our AD probably needs to kick in to compute the Hessian for those parts (or the objective, etc)

@franckgaga
Member Author

franckgaga commented Nov 5, 2025

What's in the model:

  • 1 @operator with gradient and Hessian functions for the objective.
  • 1 VectorNonlinearOracle with a Jacobian and a Hessian of the Lagrangian functions for nonlinear equality constraints
  • 40 AffExpr in MOI.LessThan{Float64} for the linear inequality constraints

@franckgaga
Member Author

franckgaga commented Nov 5, 2025

I provide the Hessian/Hessian of the Lagrangian function for all the nonlinear ingredients. I know that MOI still needs to sum the 2 Hessians in the end, but AD should not be required for this, no?

It is also worth mentioning that the Hessian sparsity pattern is different in the @operator and in the VectorNonlinearOracle. Could that be the cause?

@odow
Contributor

odow commented Nov 5, 2025

1 @operator with gradient and Hessian functions for the objective.

The AD in MOI runs for this. We could probably look at improving the performance of initialize if it's a bottleneck? It hasn't typically been an issue because most time is spent in the iterations.
