-
Notifications
You must be signed in to change notification settings - Fork 98
refined hessian sparsity detection #2882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Following #2527 (comment): using NLPModels, NLPModelsJuMP, JuMP, SparseArrays
nlp = Model()
@variable(nlp, x[i = 1:10])
@constraint(nlp, sum(x[i] for i = 1:10) / x[1] == 0)
@objective(nlp, Min, x[1]^4)
nlp2 = MathOptNLPModel(nlp)
rows, cols = hess_structure(nlp2)
nnz_nlp = length(rows)
vals = ones(Int, nnz_nlp)
H = sparse(rows, cols, vals, 10, 10)Before: After: |
|
@gdalle do you have any pointers for how we can rerun the comparisons in adrhill/SparseConnectivityTracer.jl#83? This change should reduce/eliminate discrepancies for the structural non-zeros off the diagonal. |
|
I think everything should be in those two files: https://github.com/adrhill/SparseConnectivityTracer.jl/blob/main/test/nlpmodels.jl Ping @adrhill, nice to know SCT is being used as reference for JuMP tests 😉 |
|
I ran on all the PureJuMP instances in OptimizationModels. Sampled at a random point, all hessians agreed up to numerical tolerances. Only two instances had a different number of nonzeros in the hessian:
Based on adrhill/SparseConnectivityTracer.jl#83 (comment) it seems like there might still be room for improvement in a future PR (specific small examples would be useful, I'm unlikely to dig in myself). |
|
I ran all the comparisons and here are the results: ┌ Warning: Inconsistencies were detected
┌ Warning: Inconsistency for Jacobian of hs117: SCT (75 nz) ⊃ JuMP (62 nz)
┌ Warning: Inconsistency for Jacobian of lincon: SCT (19 nz) ⊃ JuMP (17 nz)
┌ Warning: Inconsistency for Hessian of argauss: SCT (8 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs5: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs6: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of britgas: SCT (1087 nz) ⊂ JuMP (1111 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of chain: SCT (75 nz) ⊂ JuMP (100 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of channel: SCT (696 nz) ⊃ JuMP (672 nz)
┌ Warning: Inconsistency for Hessian of dixmaane: SCT (493 nz) ⊃ JuMP (297 nz)
┌ Warning: Inconsistency for Hessian of dixmaani: SCT (493 nz) ⊃ JuMP (297 nz)
┌ Warning: Inconsistency for Hessian of dixmaanm: SCT (493 nz) ⊃ JuMP (297 nz)
┌ Warning: Inconsistency for Hessian of hs114: SCT (19 nz) ⊂ JuMP (21 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs119: SCT (256 nz) ⊃ JuMP (76 nz)
┌ Warning: Inconsistency for Hessian of hs250: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs251: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs36: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs37: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs40: SCT (15 nz) ⊂ JuMP (16 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs41: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs45: SCT (20 nz) ⊂ JuMP (25 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs56: SCT (10 nz) ⊂ JuMP (13 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs68: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs69: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs87: SCT (9 nz) ⊂ JuMP (11 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs93: SCT (34 nz) ⊂ JuMP (36 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon1: SCT (550 nz) ⊂ JuMP (600 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon2: SCT (350 nz) ⊂ JuMP (400 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of robotarm: SCT (252 nz) ⊂ JuMP (276 nz) [diagonal difference only]We have a few small problems with less than 20 nnz in the Jacobian / Hessian of the Lagrangian. |
|
@adrhill do you know why there are some cases where the SCT pattern is a superset of the JuMP pattern? |
|
It is because some |
|
Oh nice, looks like there's no evidence of more to optimize on the JuMP side except the diagonal terms. |
|
@amontoison if you have the code and environment ready, could you re-run the comparison for |
|
@gdalle I tested with the local tracer and we have only two specific problems to investigate ( ┌ Warning: Inconsistencies were detected
┌ Warning: Inconsistency for Hessian of argauss: SCT (8 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs5: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs6: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of britgas: SCT (1087 nz) ⊂ JuMP (1111 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of chain: SCT (75 nz) ⊂ JuMP (100 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of channel: SCT (696 nz) ⊃ JuMP (672 nz)
┌ Warning: Inconsistency for Hessian of hs114: SCT (19 nz) ⊂ JuMP (21 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs250: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs251: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs36: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs37: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs40: SCT (15 nz) ⊂ JuMP (16 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs41: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs45: SCT (20 nz) ⊂ JuMP (25 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs56: SCT (10 nz) ⊂ JuMP (13 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs68: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs69: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs87: SCT (9 nz) ⊂ JuMP (11 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs93: SCT (34 nz) ⊂ JuMP (36 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon1: SCT (450 nz) ⊂ JuMP (600 nz)
┌ Warning: Inconsistency for Hessian of polygon2: SCT (350 nz) ⊂ JuMP (400 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of robotarm: SCT (252 nz) ⊂ JuMP (276 nz) [diagonal difference only] |
odow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a couple of minor formatting changes. LGTM from my side. Do we want to wait to dig into the comparisons with SCT?
|
@adrhill do you have some bandwidth to figure this out? I most definitely don't |
|
I don't think digging into the comparisons with SCT should be blocking for merging. |
Unfortunately not, as NeurIPS is coming up next week. Happy to revisit this afterwards. |
|
I will do a new release of ┌ Warning: Inconsistency for Jacobian of hs117: SCT (75 nz) ⊃ JuMP (62 nz)
┌ Warning: Inconsistency for Hessian of argauss: SCT (8 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs5: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs6: SCT (9 nz) ⊂ JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of britgas: SCT (1087 nz) ⊂ JuMP (1111 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of chain: SCT (75 nz) ⊂ JuMP (100 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of channel: SCT (696 nz) ⊃ JuMP (672 nz)
┌ Warning: Inconsistency for Hessian of dixmaane: SCT (493 nz) ⊃ JuMP (297 nz)
┌ Warning: Inconsistency for Hessian of dixmaani: SCT (493 nz) ⊃ JuMP (297 nz)
┌ Warning: Inconsistency for Hessian of dixmaanm: SCT (493 nz) ⊃ JuMP (297 nz)
┌ Warning: Inconsistency for Hessian of hs114: SCT (19 nz) ⊂ JuMP (21 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs250: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs251: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs36: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs37: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs40: SCT (15 nz) ⊂ JuMP (16 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs41: SCT (6 nz) ⊂ JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs45: SCT (20 nz) ⊂ JuMP (25 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs56: SCT (10 nz) ⊂ JuMP (13 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs68: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs69: SCT (9 nz) ⊂ JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs87: SCT (9 nz) ⊂ JuMP (11 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs93: SCT (34 nz) ⊂ JuMP (36 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon1: SCT (550 nz) ⊂ JuMP (600 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon2: SCT (350 nz) ⊂ JuMP (400 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of robotarm: SCT (252 nz) ⊂ JuMP (276 nz) [diagonal difference only]The following script was used: using Dates: now
using LinearAlgebra
using OptimizationProblems
using SparseArrays
using Test
using SparseConnectivityTracerBenchmarks.Optimization:
compute_jac_and_hess_sparsity_sct,
compute_jac_and_hess_sparsity_and_value_jump,
optimization_problem_names
function compare_patterns(
truth::AbstractMatrix{<:Real}; sct::AbstractMatrix{Bool}, jump::AbstractMatrix{Bool}
)
difference = jump - sct
if nnz(difference) > 0
# test that all pattern differences are local zeros in the ground truth
I, J, _ = findnz(difference)
coeffs = [truth[i, j] for (i, j) in zip(I, J)]
@test maximum(abs, coeffs) < 1.0e-7
end
nnz_sct, nnz_jump = nnz(sct), nnz(jump)
diagonal = (difference == Diagonal(difference)) ? "[diagonal difference only]" : ""
message = if all(>(0), nonzeros(difference))
"SCT ($nnz_sct nz) ⊂ JuMP ($nnz_jump nz) $diagonal"
elseif all(<(0), nonzeros(difference))
"SCT ($nnz_sct nz) ⊃ JuMP ($nnz_jump nz) $diagonal"
else
"SCT ($nnz_sct nz) ≠ JuMP ($nnz_jump nz) $diagonal"
end
return message
end
#=
Please look at the warnings displayed at the end.
=#
jac_inconsistencies = []
hess_inconsistencies = []
@testset "$name" for name in optimization_problem_names()
in(name, [:catmix, :gasoil, :glider, :methanol, :minsurf, :pinene, :rocket, :steering, :torsion]) && continue
@info "$(now()) - $name"
(jac_sparsity_sct, hess_sparsity_sct) = compute_jac_and_hess_sparsity_sct(name)
((jac_sparsity_jump, jac), (hess_sparsity_jump, hess)) = compute_jac_and_hess_sparsity_and_value_jump(
name
)
@testset verbose = true "Jacobian comparison" begin
if jac_sparsity_sct == jac_sparsity_jump
@test jac_sparsity_sct == jac_sparsity_jump
else
@test_broken jac_sparsity_sct == jac_sparsity_jump
message = compare_patterns(jac; sct = jac_sparsity_sct, jump = jac_sparsity_jump)
@warn "Inconsistency for Jacobian of $name: $message"
push!(jac_inconsistencies, (name, message))
end
end
@testset verbose = true "Hessian comparison" begin
if hess_sparsity_sct == hess_sparsity_jump
@test hess_sparsity_sct == hess_sparsity_jump
else
@test_broken hess_sparsity_sct == hess_sparsity_jump
message = compare_patterns(hess; sct = hess_sparsity_sct, jump = hess_sparsity_jump)
@warn "Inconsistency for Hessian of $name: $message"
push!(hess_inconsistencies, (name, message))
end
end
yield()
end;
if !isempty(jac_inconsistencies) || !isempty(hess_inconsistencies)
@warn "Inconsistencies were detected"
for (name, message) in jac_inconsistencies
@warn "Inconsistency for Jacobian of $name: $message"
end
for (name, message) in hess_inconsistencies
@warn "Inconsistency for Hessian of $name: $message"
end
end@gdalle @adrhill With the local tracer, only one problem with SCT gives a different sparsity pattern: Warning: Inconsistency for Hessian of channel: SCT (696 nz) ⊃ JuMP (672 nz) I used |
Rewrites the hessian sparsity detection code to be aware that, for some nonlinear operations, not all combinations of operands generate hessian terms. For example: x*y (no diagonal terms) and x/y (no (x,x) term).
Fixes #2527
CC @amontoison