Skip to content

Conversation

@mlubin
Copy link
Member

@mlubin mlubin commented Nov 19, 2025

Rewrites the hessian sparsity detection code to be aware that, for some nonlinear operations, not all combinations of operands generate hessian terms. For example: x*y (no diagonal terms) and x/y (no (x,x) term).

Fixes #2527
CC @amontoison

@mlubin mlubin marked this pull request as ready for review November 19, 2025 21:29
@mlubin
Copy link
Member Author

mlubin commented Nov 19, 2025

Following #2527 (comment):

using NLPModels, NLPModelsJuMP, JuMP, SparseArrays

nlp = Model()

@variable(nlp,  x[i = 1:10])
@constraint(nlp, sum(x[i] for i = 1:10) / x[1]  == 0)
@objective(nlp, Min, x[1]^4)

nlp2 = MathOptNLPModel(nlp)
rows, cols = hess_structure(nlp2)
nnz_nlp = length(rows)
vals = ones(Int, nnz_nlp)
H = sparse(rows, cols, vals, 10, 10)

Before:

10×10 SparseMatrixCSC{Int64, Int64} with 55 stored entries:
 2  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  1  1  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  1  1  1  ⋅  ⋅  ⋅
 1  1  1  1  1  1  1  1  ⋅  ⋅
 1  1  1  1  1  1  1  1  1  ⋅
 1  1  1  1  1  1  1  1  1  1

After:

10×10 SparseMatrixCSC{Int64, Int64} with 19 stored entries:
 2  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅
 1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1

@odow
Copy link
Member

odow commented Nov 19, 2025

@mlubin mlubin requested a review from odow November 20, 2025 01:52
@mlubin
Copy link
Member Author

mlubin commented Nov 20, 2025

@gdalle do you have any pointers for how we can rerun the comparisons in adrhill/SparseConnectivityTracer.jl#83? This change should reduce/eliminate discrepancies for the structural non-zeros off the diagonal.

@gdalle
Copy link

gdalle commented Nov 20, 2025

I think everything should be in those two files:

https://github.com/adrhill/SparseConnectivityTracer.jl/blob/main/benchmark/SparseConnectivityTracerBenchmarks/src/nlpmodels.jl

https://github.com/adrhill/SparseConnectivityTracer.jl/blob/main/test/nlpmodels.jl

Ping @adrhill, nice to know SCT is being used as reference for JuMP tests 😉

@mlubin
Copy link
Member Author

mlubin commented Nov 21, 2025

I ran on all the PureJuMP instances in OptimizationModels. Sampled at a random point, all hessians agreed up to numerical tolerances. Only two instances had a different number of nonzeros in the hessian:

  • hs114 reduced from 23 to 21
  • britgas reduced from 1415 to 1111

Based on adrhill/SparseConnectivityTracer.jl#83 (comment) it seems like there might still be room for improvement in a future PR (specific small examples would be useful, I'm unlikely to dig in myself).

@amontoison
Copy link
Contributor

amontoison commented Nov 21, 2025

I ran all the comparisons and here are the results:

┌ Warning: Inconsistencies were detected
┌ Warning: Inconsistency for Jacobian of hs117: SCT (75 nz)  JuMP (62 nz)
┌ Warning: Inconsistency for Jacobian of lincon: SCT (19 nz)  JuMP (17 nz) 
┌ Warning: Inconsistency for Hessian of argauss: SCT (8 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs5: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs6: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of britgas: SCT (1087 nz)  JuMP (1111 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of chain: SCT (75 nz)  JuMP (100 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of channel: SCT (696 nz)  JuMP (672 nz) 
┌ Warning: Inconsistency for Hessian of dixmaane: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of dixmaani: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of dixmaanm: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of hs114: SCT (19 nz)  JuMP (21 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs119: SCT (256 nz)  JuMP (76 nz) 
┌ Warning: Inconsistency for Hessian of hs250: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs251: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs36: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs37: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs40: SCT (15 nz)  JuMP (16 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs41: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs45: SCT (20 nz)  JuMP (25 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs56: SCT (10 nz)  JuMP (13 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs68: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs69: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs87: SCT (9 nz)  JuMP (11 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs93: SCT (34 nz)  JuMP (36 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon1: SCT (550 nz)  JuMP (600 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon2: SCT (350 nz)  JuMP (400 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of robotarm: SCT (252 nz)  JuMP (276 nz) [diagonal difference only]

We have a few small problems with less than 20 nnz in the Jacobian / Hessian of the Lagrangian.

@gdalle
Copy link

gdalle commented Nov 21, 2025

@adrhill do you know why there are some cases where the SCT pattern is a superset of the JuMP pattern?

@amontoison
Copy link
Contributor

It is because some ADNLPProblems are badly implemented and we force some structural zeros to be non-zeros: JuliaSmoothOptimizers/OptimizationProblems.jl#397

@mlubin
Copy link
Member Author

mlubin commented Nov 21, 2025

Oh nice, looks like there's no evidence of more to optimize on the JuMP side except the diagonal terms.

@gdalle
Copy link

gdalle commented Nov 21, 2025

@amontoison if you have the code and environment ready, could you re-run the comparison for TracerLocalSparsityDetector() instead of TracerSparsityDetector() on the SCT side? It might help us diagnose the remaining mismatches.

@amontoison
Copy link
Contributor

@gdalle I tested with the local tracer and we have only two specific problems to investigate (channel and polygon1).

┌ Warning: Inconsistencies were detected
┌ Warning: Inconsistency for Hessian of argauss: SCT (8 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs5: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs6: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of britgas: SCT (1087 nz)  JuMP (1111 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of chain: SCT (75 nz)  JuMP (100 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of channel: SCT (696 nz)  JuMP (672 nz) 
┌ Warning: Inconsistency for Hessian of hs114: SCT (19 nz)  JuMP (21 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs250: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs251: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs36: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs37: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs40: SCT (15 nz)  JuMP (16 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs41: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs45: SCT (20 nz)  JuMP (25 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs56: SCT (10 nz)  JuMP (13 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs68: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs69: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs87: SCT (9 nz)  JuMP (11 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs93: SCT (34 nz)  JuMP (36 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon1: SCT (450 nz)  JuMP (600 nz) 
┌ Warning: Inconsistency for Hessian of polygon2: SCT (350 nz)  JuMP (400 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of robotarm: SCT (252 nz)  JuMP (276 nz) [diagonal difference only]

Copy link
Member

@odow odow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a couple of minor formatting changes. LGTM from my side. Do we want to wait to dig into the comparisons with SCT?

@gdalle
Copy link

gdalle commented Nov 24, 2025

@adrhill do you have some bandwidth to figure this out? I most definitely don't

@mlubin
Copy link
Member Author

mlubin commented Nov 25, 2025

I don't think digging into the comparisons with SCT should be blocking for merging.

@odow odow merged commit a8aa0b0 into master Nov 25, 2025
31 checks passed
@odow odow deleted the ml/sparsity branch November 25, 2025 20:17
@adrhill
Copy link

adrhill commented Nov 25, 2025

@adrhill do you have some bandwidth to figure this out? I most definitely don't

Unfortunately not, as NeurIPS is coming up next week. Happy to revisit this afterwards.

@amontoison
Copy link
Contributor

amontoison commented Dec 8, 2025

I will do a new release of OptimizationProblems.jl tonight that contains a better implementation of a few problems.
I confirm that only the diagonal of the Hessian can be further improved on the JuMP / MOI side.
We will gain (almost) nothing in terms of storage but we can have a gain for the computation with a better symmetric coloring.
I used the global tracer.

┌ Warning: Inconsistency for Jacobian of hs117: SCT (75 nz)  JuMP (62 nz)
┌ Warning: Inconsistency for Hessian of argauss: SCT (8 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs5: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of biggs6: SCT (9 nz)  JuMP (12 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of britgas: SCT (1087 nz)  JuMP (1111 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of chain: SCT (75 nz)  JuMP (100 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of channel: SCT (696 nz)  JuMP (672 nz) 
┌ Warning: Inconsistency for Hessian of dixmaane: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of dixmaani: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of dixmaanm: SCT (493 nz)  JuMP (297 nz) 
┌ Warning: Inconsistency for Hessian of hs114: SCT (19 nz)  JuMP (21 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs250: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs251: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs36: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs37: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs40: SCT (15 nz)  JuMP (16 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs41: SCT (6 nz)  JuMP (9 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs45: SCT (20 nz)  JuMP (25 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs56: SCT (10 nz)  JuMP (13 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs68: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs69: SCT (9 nz)  JuMP (10 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs87: SCT (9 nz)  JuMP (11 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of hs93: SCT (34 nz)  JuMP (36 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon1: SCT (550 nz)  JuMP (600 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of polygon2: SCT (350 nz)  JuMP (400 nz) [diagonal difference only]
┌ Warning: Inconsistency for Hessian of robotarm: SCT (252 nz)  JuMP (276 nz) [diagonal difference only]

The following script was used:

using Dates: now
using LinearAlgebra
using OptimizationProblems
using SparseArrays
using Test
using SparseConnectivityTracerBenchmarks.Optimization:
    compute_jac_and_hess_sparsity_sct,
    compute_jac_and_hess_sparsity_and_value_jump,
    optimization_problem_names

function compare_patterns(
        truth::AbstractMatrix{<:Real}; sct::AbstractMatrix{Bool}, jump::AbstractMatrix{Bool}
    )
    difference = jump - sct
    if nnz(difference) > 0
        # test that all pattern differences are local zeros in the ground truth
        I, J, _ = findnz(difference)
        coeffs = [truth[i, j] for (i, j) in zip(I, J)]
        @test maximum(abs, coeffs) < 1.0e-7
    end

    nnz_sct, nnz_jump = nnz(sct), nnz(jump)
    diagonal = (difference == Diagonal(difference)) ? "[diagonal difference only]" : ""
    message = if all(>(0), nonzeros(difference))
        "SCT ($nnz_sct nz) ⊂ JuMP ($nnz_jump nz) $diagonal"
    elseif all(<(0), nonzeros(difference))
        "SCT ($nnz_sct nz) ⊃ JuMP ($nnz_jump nz) $diagonal"
    else
        "SCT ($nnz_sct nz) ≠ JuMP ($nnz_jump nz) $diagonal"
    end
    return message
end

#=
Please look at the warnings displayed at the end.
=#

jac_inconsistencies = []
hess_inconsistencies = []

@testset "$name" for name in optimization_problem_names()
    in(name, [:catmix, :gasoil, :glider, :methanol, :minsurf, :pinene, :rocket, :steering, :torsion]) && continue
    @info "$(now()) - $name"

    (jac_sparsity_sct, hess_sparsity_sct) = compute_jac_and_hess_sparsity_sct(name)
    ((jac_sparsity_jump, jac), (hess_sparsity_jump, hess)) = compute_jac_and_hess_sparsity_and_value_jump(
        name
    )

    @testset verbose = true "Jacobian comparison" begin
        if jac_sparsity_sct == jac_sparsity_jump
            @test jac_sparsity_sct == jac_sparsity_jump
        else
            @test_broken jac_sparsity_sct == jac_sparsity_jump
            message = compare_patterns(jac; sct = jac_sparsity_sct, jump = jac_sparsity_jump)
            @warn "Inconsistency for Jacobian of $name: $message"
            push!(jac_inconsistencies, (name, message))
        end
    end

    @testset verbose = true "Hessian comparison" begin
        if hess_sparsity_sct == hess_sparsity_jump
            @test hess_sparsity_sct == hess_sparsity_jump
        else
            @test_broken hess_sparsity_sct == hess_sparsity_jump
            message = compare_patterns(hess; sct = hess_sparsity_sct, jump = hess_sparsity_jump)
            @warn "Inconsistency for Hessian of $name: $message"
            push!(hess_inconsistencies, (name, message))
        end
    end
    yield()
end;

if !isempty(jac_inconsistencies) || !isempty(hess_inconsistencies)
    @warn "Inconsistencies were detected"
    for (name, message) in jac_inconsistencies
        @warn "Inconsistency for Jacobian of $name: $message"
    end
    for (name, message) in hess_inconsistencies
        @warn "Inconsistency for Hessian of $name: $message"
    end
end

@gdalle @adrhill With the local tracer, only one problem with SCT gives a different sparsity pattern:

Warning: Inconsistency for Hessian of channel: SCT (696 nz)  JuMP (672 nz) 

I used x0 = 1e-8 * ones(n) instead of a vector of zeros in that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

[Nonlinear] sparsity pattern of Hessian with :(x * y)

6 participants