added: hessian keyword argument in NonLinMPC #275
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##             main     #275    +/-   ##
========================================
+ Coverage   98.45%   98.48%   +0.03%
========================================
  Files          28       28
  Lines        4659     4764    +105
========================================
+ Hits         4587     4692    +105
  Misses         72       72
```

☔ View full report in Codecov by Sentry.
Benchmark Results (Julia v1): time benchmarks and memory benchmarks.
Hi @franckgaga,
Many thanks for the suggestions @gdalle. I tried to separate the actual autodiff from the decompression using

I benchmarked my case study with these two options and it essentially changes nothing in the results, compared to
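For reference, here is a sketch of how the preparation step (sparsity detection and coloring) can be timed separately from the Hessian evaluation itself with DifferentiationInterface.jl. The backend choices below are assumptions for illustration, not necessarily the ones used in this PR:

```julia
# Sketch (assumed setup): separating sparse-Hessian preparation from execution
# with DifferentiationInterface.jl. The backend combination is illustrative.
using DifferentiationInterface
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm
import ForwardDiff

f(x) = sum(abs2, diff(x)) # toy objective with a tridiagonal Hessian

backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm(),
)
x = rand(100)
prep = prepare_hessian(f, backend, x) # detection + coloring happen once, here
H = hessian(f, prep, backend, x)      # compressed differentiation + decompression
```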
Alright, thanks, I'm following it. The
This doesn't look too bad: it's all blue and you're spending most of your time in the actual differentiation (gradients and Hessians).
Here's the better implementation, which can be a thousand times faster (although it doesn't seem to impact your profiling). That's similar to what I'd like to do in SMC, but I'm still figuring out the API design for filling only one triangle.

```julia
using LinearAlgebra, SparseArrays, StableRNGs

function copyto_lowertriangle_naive!(T::SparseMatrixCSC, A::SparseMatrixCSC)
    # Scalar indexing: each T[i, j] = A[i, j] searches column j's stored rows.
    for j in axes(A, 2)
        for i in axes(A, 1)
            if i >= j
                T[i, j] = A[i, j]
            end
        end
    end
    return
end

function copyto_lowertriangle!(T::SparseMatrixCSC, A::SparseMatrixCSC)
    @assert size(T) == size(A)
    kT = 0
    rvA, rvT = rowvals(A), rowvals(T)
    nzA, nzT = nonzeros(A), nonzeros(T)
    # Walk only the stored nonzeros of A, column by column, and copy the
    # lower-triangular ones straight into T's nonzero buffer.
    for j in axes(A, 2)
        for kA in nzrange(A, j)
            i = rvA[kA]
            if i >= j
                kT += 1
                @assert i == rvT[kT] # T's pattern must be exactly A's lower triangle
                nzT[kT] = nzA[kA]
            end
        end
    end
    return T
end
```

Benchmark and test:

```julia
julia> n, p = 1_000, 0.005;

julia> A = sparse(Symmetric(sprand(n, n, p)));

julia> T = similar(sparse(LowerTriangular(A)));

julia> copyto_lowertriangle!(T, A);

julia> T == LowerTriangular(A)
true

julia> using Chairmarks

julia> @b (T, A) copyto_lowertriangle_naive!(_[1], _[2])
7.881 ms

julia> @b (T, A) copyto_lowertriangle!(_[1], _[2])
12.958 μs
```
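The speedup comes from avoiding scalar `setindex!` on the `SparseMatrixCSC`: each `T[i, j] = v` must search the stored row indices of column `j` (and would shift the storage buffers if the entry were not already structural), whereas the fast version streams through the nonzero buffers exactly once.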
Impressive gains! I just understood that

That's probably why individual indexing of

That being said, I just noticed that I filled all the entries of the lower-triangular part, and the docstring of
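For context on why scalar indexing of a `SparseMatrixCSC` is slow: the matrix stores its entries in flat buffers (column pointers, row indices, nonzero values), so a single `(i, j)` lookup requires a search within the column. A minimal illustration using only standard-library accessors:

```julia
# Sketch: how SparseMatrixCSC stores its entries (toy example).
using SparseArrays

A = sparse([1, 3, 2], [1, 1, 2], [10.0, 30.0, 20.0], 3, 2)

# Column j's stored entries live at positions nzrange(A, j) in these buffers:
rows = rowvals(A)  # row index of each stored entry
vals = nonzeros(A) # value of each stored entry
for j in axes(A, 2)
    for k in nzrange(A, j)
        println("A[", rows[k], ", ", j, "] = ", vals[k])
    end
end

# Scalar A[i, j] must search `rows` within nzrange(A, j), and setindex! on a
# position outside the pattern must shift both buffers to insert it.
```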
Come to think of it, the destination matrix is not a

I need to do some benchmarks to verify whether this part is slow...
Okay, I did some more benchmarks. The indexing of

I'm not sure why it did not appear in my first profile above, but I ran the profiling multiple times and now, systematically, a fair amount of time is spent in

edit: maybe @amontoison can also answer my question.
Maybe MOI does the coloring regardless of whether it actually has to compute a Hessian?
Are there other scalar nonlinear constraints in the model? Our AD probably needs to kick in to compute the Hessian for those parts (or the objective, etc.).
What's in the model:
I provide the Hessian / Hessian of the Lagrangian function for all the nonlinear ingredients. I know that MOI still needs to sum the two Hessians in the end, but AD should not be required for this, no? It is also worth mentioning that the Hessian sparsity pattern is different in the
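For reference, the MOI oracle being discussed is the Hessian-of-the-Lagrangian pair `MOI.hessian_lagrangian_structure` / `MOI.eval_hessian_lagrangian`, where MOI sums the objective and constraint Hessians weighted by `σ` and `μ`. A sketch on a made-up minimal model:

```julia
# Sketch: MOI's Hessian-of-the-Lagrangian oracle on a tiny model (assumed example).
import MathOptInterface as MOI

nlp = MOI.Nonlinear.Model()
x = MOI.VariableIndex.(1:2)
MOI.Nonlinear.set_objective(nlp, :(sin($(x[1])) * $(x[2])))
MOI.Nonlinear.add_constraint(nlp, :($(x[1])^2 + $(x[2])^2), MOI.LessThan(1.0))

evaluator = MOI.Nonlinear.Evaluator(nlp, MOI.Nonlinear.SparseReverseMode(), x)
MOI.initialize(evaluator, [:Hess])

# Lower-triangular sparsity pattern, as (row, col) pairs:
pattern = MOI.hessian_lagrangian_structure(evaluator)

# Evaluate σ*∇²f(x) + Σᵢ μᵢ*∇²gᵢ(x) into the structural nonzeros:
H = zeros(length(pattern))
MOI.eval_hessian_lagrangian(evaluator, H, [0.1, 0.2], 1.0, [0.5])
```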
The AD in MOI runs for this. We could probably look at improving the performance of
AD runs if there are any scalar nonlinear components, even if the result of the AD is just "query the operator, fill in the Hessian". We still need an algorithm for that. There's no special handling for exploiting the case where the constraint or objective body is a single user-defined operator.
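As a concrete illustration of "query the operator, fill in the Hessian": JuMP lets a user-defined operator supply its own Hessian, which MOI's AD then places into the Lagrangian. The function below is a made-up example, not from this PR:

```julia
# Sketch: a user-defined operator with an explicit Hessian in JuMP (toy example).
using JuMP, Ipopt

f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

function ∇f(g::AbstractVector, x...)
    g[1] = -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2)
    g[2] = 200 * (x[2] - x[1]^2)
    return
end

function ∇²f(H::AbstractMatrix, x...)
    # Only the lower triangle needs to be filled.
    H[1, 1] = 2 - 400 * x[2] + 1200 * x[1]^2
    H[2, 1] = -400 * x[1]
    H[2, 2] = 200.0
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x[1:2])
@operator(model, op_rosenbrock, 2, f, ∇f, ∇²f)
@objective(model, Min, op_rosenbrock(x[1], x[2]))
optimize!(model)
```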
Why is
No parameters. There are 35 solves in the profile above. I'm not sure if it's being called at each
Okay, funnily enough,
MWE?
Here's what I'm benchmarking/profiling on Julia v1.12.1 and the latest stable releases of the packages:

```julia
using ModelPredictiveControl, JuMP
using UnoSolver

N = 35 # number of solves/optimize! calls

# Pendulum dynamics: continuous-time state-space functions.
function f!(ẋ, x, u, _, p)
    g, L, K, m = p    # [m/s²], [m], [kg/s], [kg]
    θ, ω = x[1], x[2] # [rad], [rad/s]
    τ = u[1]          # [Nm]
    ẋ[1] = ω
    ẋ[2] = -g/L*sin(θ) - K/m*ω + τ/m/L^2
    return nothing
end
h!(y, x, _, _) = (y[1] = 180/π*x[1]; nothing) # [°]

p = [9.8, 0.4, 1.2, 0.3]
nu, nx, ny, Ts = 1, 2, 1, 0.1
model = NonLinModel(f!, h!, Ts, nu, nx, ny; p)

# Plant with a +25% mismatch on the friction coefficient K:
p_plant = copy(p); p_plant[3] = p[3]*1.25
plant = NonLinModel(f!, h!, Ts, nu, nx, ny; p=p_plant)

Hp, Hc, Mwt, Nwt = 20, 2, [0.5], [2.5]
α=0.01; σQ=[0.1, 1.0]; σR=[5.0]; nint_u=[1]; σQint_u=[0.1]
σQint_ym = zeros(0)
umin, umax = [-1.5], [+1.5]
nc = Hp+1
transcription = MultipleShooting()
optim = Model(()->UnoSolver.Optimizer(preset="filtersqp"));
oracle = true
hessian = true
nmpc = NonLinMPC(model;
    Hp, Hc, Mwt, Nwt, Cwt=Inf, transcription, oracle, hessian, optim,
    α, σQ, σR, nint_u, σQint_u, σQint_ym
)
nmpc = setconstraint!(nmpc; umin, umax)
unset_time_limit_sec(nmpc.optim)

sim!(nmpc, N, [180.0]; plant=plant, x_0=[0, 0], x̂_0=[0, 0, 0]) # warm-up/compilation
@time sim!(nmpc, N, [180.0]; plant=plant, x_0=[0, 0], x̂_0=[0, 0, 0])
@profview sim!(nmpc, N, [180.0]; plant=plant, x_0=[0, 0], x̂_0=[0, 0, 0])
```
Erratum: I made some improvements in the DI.jl part of MPC.jl and I forgot to register them. The MWE was run on MPC.jl v1.13.1. I'm registering it right now, but you can `dev` the package in the meantime.
There are a number of things going on here:
We could fix this entirely in Uno (@amontoison) by supporting modification of the RHS. Or we could improve MOI somewhat by trying to detect and exploit dense Hessians during the colouring.
I can support the modification of the RHS in

However, I am quite reluctant to change things in the nonlinear API of MOI. It works, and all the coloring and sparsity-pattern detection were done back when
Your take. I think it would make sense to support modifying the RHS of affine constraints in
Oh, but the RHS of affine / quadratic constraints is independent of the sparsity pattern, right?
Yes. I'm updating the RHS of affine inequality constraints with new constants. Or, in other words, updating the

About this, I think it's already supported in
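For what it's worth, updating the RHS of an existing affine constraint in place is already exposed at the JuMP level via `set_normalized_rhs`. A minimal sketch (the solver and constraint are made up):

```julia
# Sketch: modifying the RHS of an affine constraint without rebuilding the model.
using JuMP, HiGHS

model = Model(HiGHS.Optimizer)
@variable(model, x >= 0)
@constraint(model, con, 2x <= 1.0)
optimize!(model)

# Update the normalized RHS constant and re-solve; the sparsity pattern of the
# problem is untouched, only the constant in the constraint set changes.
set_normalized_rhs(con, 2.0)
optimize!(model)
```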
The MOI interface in

The new keyword argument allows enabling and customizing the computation of the Hessian of the Lagrangian function. Note that this is only supported with the `oracle=true` option. Changing the Hessian backend will throw an error if `oracle=false`.

The default is `hessian=false`, meaning the Hessian is not computed, and the quasi-Newton approximation provided by the optimizer will be used (e.g. L-BFGS for `Ipopt.jl`). The other options are:

- `hessian=true`: it will select an appropriate AD backend for the Hessian, which is:
- `hessian=backend`: use `backend::AbstractADType` for the computation of the Hessian.

Also, pretty-printing `NonLinMPC` will now show the `hessian` backend:
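A usage sketch of the new keyword, assuming the pendulum `model` from the MWE above; the dense ForwardDiff backend here is an illustrative choice, not necessarily the package's default:

```julia
# Sketch: enabling an explicit Hessian backend in NonLinMPC (illustrative backend).
using ModelPredictiveControl
using ADTypes: AutoForwardDiff
import ForwardDiff

# `model` is a NonLinModel, e.g. the pendulum from the MWE above.
nmpc_auto = NonLinMPC(model; oracle=true, hessian=true)              # auto-selected backend
nmpc_fwd  = NonLinMPC(model; oracle=true, hessian=AutoForwardDiff()) # explicit backend
```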