Description
Explain what you would like to see improved and how.
When I provide my cost function with an analytical Hessian, I would expect the same/equivalent behavior as for the "numerical path" (just with fewer function calls).
This seems to be the case for MnHesse, where both analytical and numerically approximated Hessians are made positive-definite. The behavior at minimizer seeding, however, is different (and inconsistent, in my view):
When an analytical Hessian is provided and it is not positive-definite at the initial parameter values, it is left non-positive-definite. As a result, the minimizer initially steps away from the optimal solution, which can even lead to unexpected minimization failures in scenarios where the "numerical path" succeeds.
I think the analytical Hessian should be made positive-definite for the minimizer seeding as well. But maybe I am missing something?!
Simple example:
Use MIGRAD to minimize the sum of squared residuals between the following exponential decay model and toy data (generated from the model using amplitude p[0] = 3, rate p[1] = 2, offset p[2] = 1 and adding random normal noise with a standard deviation of 0.01):
modelFunc(x, p) = p[0] * exp(-p[1] * x) + p[2]
modelGrad(x, p) = [exp(-p[1] * x), -p[0] * x * exp(-p[1] * x), 1]
modelHess(x, p) = [0, -x * exp(-p[1] * x), 0, -x * exp(-p[1] * x), p[0] * x * x * exp(-p[1] * x), 0, 0, 0, 0]
x = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9]
y = [3.99, 3.45, 3.02, 2.65, 2.36, 2.11, 1.9, 1.75, 1.6, 1.49, 1.41, 1.32, 1.28, 1.22, 1.19, 1.15, 1.14, 1.09, 1.08, 1.07, 1.03, 1.05, 1.05, 1.04, 1.03, 1.02, 1.03, 1.02, 1.02, 1.01]
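As a sanity check on the derivatives above, here is a minimal Python/NumPy sketch (not ROOT code) that compares the analytical gradient and Hessian of the model with central finite differences. The snake_case function names, the helper `central_diff`, and the test point are assumptions made purely for illustration.

```python
import numpy as np

def model_func(x, p):
    # p[0] = amplitude, p[1] = rate, p[2] = offset
    return p[0] * np.exp(-p[1] * x) + p[2]

def model_grad(x, p):
    e = np.exp(-p[1] * x)
    return np.array([e, -p[0] * x * e, 1.0])

def model_hess(x, p):
    # Same entries as the flat 9-element list above, arranged as a 3x3 matrix.
    e = np.exp(-p[1] * x)
    return np.array([
        [0.0,    -x * e,           0.0],
        [-x * e, p[0] * x * x * e, 0.0],
        [0.0,    0.0,              0.0],
    ])

def central_diff(f, p, h=1e-5):
    # Hypothetical helper: central finite differences of f with respect to
    # each parameter, stacked along the last axis.
    p = np.asarray(p, dtype=float)
    cols = []
    for k in range(p.size):
        step = np.zeros_like(p)
        step[k] = h
        cols.append((np.asarray(f(p + step)) - np.asarray(f(p - step))) / (2 * h))
    return np.stack(cols, axis=-1)

# Arbitrary test point, chosen only for this check.
x_test = 0.7
p_test = np.array([3.0, 2.0, 1.0])

grad_fd = central_diff(lambda p: model_func(x_test, p), p_test)
hess_fd = central_diff(lambda p: model_grad(x_test, p), p_test)

grad_ok = np.allclose(grad_fd, model_grad(x_test, p_test), atol=1e-7)
hess_ok = np.allclose(hess_fd, model_hess(x_test, p_test), atol=1e-7)
print("gradient matches:", grad_ok, "| Hessian matches:", hess_ok)
```

If both checks pass, the analytical derivatives of the model are consistent with each other, so any difference between the analytical and numerical paths should come from how the Hessian is used rather than from a derivative error.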
For the initial parameter values p0 = [2, 1, 0], the Hessian of the least squares cost function is not positive-definite, and consequently the minimizer initially steps far into the negative range for the rate parameter. If I use 0 as a lower limit for the rate, the minimization terminates far from the optimal solution.
When I do not provide an analytical Hessian, the minimization succeeds with ease.
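The non-positive-definiteness at p0 can be checked independently of ROOT. The following Python/NumPy sketch assumes the cost is the plain sum of squared residuals, so that its Hessian is 2 * (J^T J + sum_i r_i * modelHess(x_i, p)); the variable names are illustrative, and nothing here reproduces Minuit2's internal seeding.

```python
import numpy as np

x = np.linspace(0.0, 2.9, 30)
y = np.array([3.99, 3.45, 3.02, 2.65, 2.36, 2.11, 1.9, 1.75, 1.6, 1.49,
              1.41, 1.32, 1.28, 1.22, 1.19, 1.15, 1.14, 1.09, 1.08, 1.07,
              1.03, 1.05, 1.05, 1.04, 1.03, 1.02, 1.03, 1.02, 1.02, 1.01])
p0 = np.array([2.0, 1.0, 0.0])

e = np.exp(-p0[1] * x)
residuals = p0[0] * e + p0[2] - y  # model minus data at p0

# Jacobian of the model: one row per data point, one column per parameter.
J = np.column_stack([e, -p0[0] * x * e, np.ones_like(x)])

# Residual-weighted sum of the per-point model Hessians.
# Only the (0,1), (1,0) and (1,1) entries of modelHess are nonzero.
second_order = np.zeros((3, 3))
second_order[0, 1] = second_order[1, 0] = np.sum(residuals * (-x * e))
second_order[1, 1] = np.sum(residuals * p0[0] * x * x * e)

# Hessian of the cost sum_i residuals_i**2.
cost_hess = 2.0 * (J.T @ J + second_order)
eigvals = np.linalg.eigvalsh(cost_hess)  # ascending order

print("rate-rate entry:", cost_hess[1, 1])
print("eigenvalues:", eigvals)
print("positive-definite:", bool(eigvals[0] > 0))
```

At p0 all residuals are negative, so the residual-weighted curvature term outweighs the Gauss-Newton term in the rate-rate entry. That diagonal entry comes out negative, which already rules out positive-definiteness.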
ROOT version
latest
Installation method
build from source
Operating system
Windows
Additional context
No response