Compute JVP in line searches #1210
Conversation
this one also has conflicts for the same reason as the AD PR I suppose
Force-pushed from e601dc9 to 33f43b0
I see you're hitting all the wrappers
Force-pushed from 2e8626a to 28b8899
```julia
elseif !NLSolversBase.isfinite_value(d)
    TerminationCode.ObjectiveNotFinite
elseif !NLSolversBase.isfinite_gradient(d)
    TerminationCode.GradientNotFinite
```
@pkofod a random bug I came across when fixing these lines
I'm wondering if I had that f_calls there for a reason
The logic is not changed, it's just moved to NLSolversBase to avoid having to expose jvp(obj) (IMO such calls are quite unsafe in general, so I'd like to avoid having to introduce new ones at least).
My point is just that currently the termination code is GradientNotFinite when the objective function is not finite.
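For illustration, a minimal sketch of the intended mapping (it assumes the `isfinite_value`/`isfinite_gradient` checks and the `TerminationCode` values quoted in the diff above; the surrounding control flow is hypothetical): each non-finite quantity should select its own code instead of both cases ending up as `GradientNotFinite`.

```julia
# Sketch only; `d` is the objective wrapper from the diff above.
code = if !NLSolversBase.isfinite_value(d)
    TerminationCode.ObjectiveNotFinite   # objective value is NaN or Inf
elseif !NLSolversBase.isfinite_gradient(d)
    TerminationCode.GradientNotFinite    # gradient contains NaN or Inf
else
    nothing                              # nothing non-finite detected; other checks apply
end
```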
I understood the part about the wrong code but had missed the part about the new function. Great
```julia
initial_x = copy(initial_x)
retract!(method.manifold, initial_x)

value_gradient!!(d, initial_x)
```
@pkofod why would we ever want to call the !! methods explicitly? If the input is different from the one that the value and gradient were evaluated with, the value/gradient will be recomputed anyway.
An alternative is to create new instances of OnceDifferentiable, but the "problem" is Fminbox, which has outer and inner loops. In the previous inner loop you may evaluate the objective at x, but then Fminbox updates a parameter that essentially changes the function, and then the stored value is no longer correct.
Without getting too specific here, you can think of it as the EBEs in Laplace's method. If you have an outer objective that performs an EBE optimization on the inside, then a OnceDifferentiable for the outer objective of the marginal log-likelihood cannot know whether we changed the inner EBEs or not.
As I said above, maybe a better approach is to construct a new OnceDifferentiable per outer iteration.
I believe other users have relied on it in the past as well, because they are doing the same: allocating/constructing the OnceDifferentiable once and then solving a sequence of optimize calls with some parameter that's updated between the calls. I'm not completely sure if it's documented or not, but the idea is that each initial evaluation and each reset! for Fminbox forces an evaluation even if x is not updated.
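A rough sketch of the caching behaviour being discussed, using NLSolversBase directly (the toy objective below is made up for illustration): the single-bang calls skip re-evaluation when `x` matches the cached evaluation point, while the double-bang calls always re-evaluate, which is what Fminbox relies on when a parameter changes between inner solves.

```julia
using NLSolversBase

f(x) = sum(abs2, x)            # toy objective, illustration only
g!(G, x) = (G .= 2 .* x)

d = OnceDifferentiable(f, g!, zeros(2))

value_gradient!(d, [1.0, 2.0])   # evaluates f and the gradient at x
value_gradient!(d, [1.0, 2.0])   # cache hit: x is unchanged, nothing is recomputed
value_gradient!!(d, [1.0, 2.0])  # forced re-evaluation even though x is unchanged
```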
> In the previous inner loop you may evaluate the objective at x, but then Fminbox updates a parameter that essentially changes the function, and then the stored value is no longer correct.
It sounds like we should just use NLSolversBase.clear! instead of only resetting the number of calls in the loop of the algorithm?
Yes, we can call clear! before value!, gradient!, etc. are called in those places
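A hedged sketch of that suggestion in the Fminbox outer loop (it assumes `NLSolversBase.clear!` invalidates the cached evaluation point in addition to resetting the call counters; `dfbox.obj` is the wrapped objective referenced further down):

```julia
# Hypothetical outer-loop fragment, not the actual Fminbox code.
NLSolversBase.clear!(dfbox.obj)   # drop cached value/gradient and reset call counters
value_gradient!(dfbox.obj, x)     # a plain single-bang call now re-evaluates at x
```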
```julia
bw.Fb = value(bw.b, x)
bw.Ftotal = bw.mu * bw.Fb
NLSolversBase.value(obj::BarrierWrapper) = obj.Ftotal
function NLSolversBase.value!(bw::BarrierWrapper, x)
```
I didn't check CI yet, but I think these may cause failures because the multiplier could have been updated
I guess you can rewrite these to check if mu was updated in the same manner...
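For illustration, one way that check could look (a sketch only: the cached fields `x_f` and `mu_f` are hypothetical, and the real `BarrierWrapper` layout may differ): re-evaluate whenever either the point or the multiplier has changed since the last call.

```julia
# Illustration only; mirrors the fragment quoted above, not the actual implementation.
function NLSolversBase.value!(bw::BarrierWrapper, x)
    if x != bw.x_f || bw.mu != bw.mu_f        # recompute if x or the multiplier changed
        bw.Fb = NLSolversBase.value(bw.b, x)  # barrier term
        bw.Ftotal = bw.mu * bw.Fb
        copyto!(bw.x_f, x)
        bw.mu_f = bw.mu
    end
    return bw.Ftotal
end
```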
Force-pushed from 170a47b to aa176a1
```julia
results = optimize(dfbox, x, _optimizer, options, state)
stopped_by_callback = results.stopped_by.callback
dfbox.obj.f_calls[1] = 0
# TODO: Reset everything? Add upstream API for resetting call counters?
```
As discussed above, clear! should suffice
It didn't, but maybe I did something wrong. I spent a day trying to fix the errors but then decided to revert to not touching any of the !! etc. functionality in this PR. The problem (and why I wanted to get rid of them in the first place) is that the JVP function breaks any such implicit assumptions. There's no guarantee anymore that during/after the line search the gradient will be available, or that a new gradient is used at all in the line search. IMO one should (at some point, but maybe not in this PR) get rid of the value(d) etc. API and any such implicit assumptions completely. If a specific gradient etc. has to be passed, it should be passed explicitly; and IMO the objective function structs should be treated as black boxes that just perform some optimisations by caching evaluations.
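For illustration, a hypothetical contrast between the two styles (the function names below are made up): the implicit style reads whatever gradient happens to be cached on the objective, while the explicit style receives the gradient it should use as an argument, so no caching assumption is needed.

```julia
using LinearAlgebra, NLSolversBase

# Implicit style: assumes `d` still caches the gradient from the last evaluation.
directional_derivative_implicit(d, s) = dot(NLSolversBase.gradient(d), s)

# Explicit style: the gradient actually used is passed in directly.
directional_derivative_explicit(gr, s) = dot(gr, s)
```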
Force-pushed from 6525ce4 to 6d92ce1
Force-pushed from b5c9710 to 6394bf3
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #1210      +/-   ##
==========================================
- Coverage   86.79%   86.36%    -0.44%
==========================================
  Files          44       44
  Lines        3552     3594       +42
==========================================
+ Hits         3083     3104       +21
- Misses        469      490       +21
```

☔ View full report in Codecov by Sentry.
Locally (macOS aarch64) and on macOS (aarch64) in CI all tests pass, but on Ubuntu and Windows 4 tests fail...
Force-pushed from 473d75c to a5a9636
Force-pushed from cc9f818 to 536eaa9
Benchmark Results (Julia vlts): time and memory benchmarks. A plot of the benchmark results has been uploaded as an artifact at .
Benchmark Results (Julia v1): time and memory benchmarks. A plot of the benchmark results has been uploaded as an artifact at .
It runs locally, so I think it's just one of those numerical things seen earlier.
Yes, these test issues are architecture-specific and have persisted for a long time in this PR: #1210 (comment)
As I mentioned earlier, they are generally run with "tuned" step sizes / learning parameters as well. I think the tests are not appropriate, really. There's no rule that for the given input the specified "accuracy" should be met. I need to rethink this sometime in the future. I turned off the Rosenbrock test for AGD for now.
I suppose the JET errors only show up after the combined change to initial_state as well as upgrading to v8 of NLSolversBase?
No, I don't think the

I pushed 9fa280e, a minimal set of changes that fixed the JET errors locally.
Based on JuliaNLSolvers/NLSolversBase.jl#168 and JuliaNLSolvers/LineSearches.jl#187.
The diff would be cleaner if #1195 (which depends on #1209) were available on the master branch. Based on #1212.
Currently, tests are failing due to missing definitions of NLSolversBase.value_jvp! etc. for ManifoldObjective.
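As a very rough sketch, the missing methods might simply delegate to the wrapped objective (the `value_jvp!` signature is an assumption based on the linked NLSolversBase PR, and the real methods may additionally need to handle the manifold retraction or tangent projection):

```julia
# Hypothetical delegation; the signature and semantics of value_jvp! are assumed, not confirmed.
function NLSolversBase.value_jvp!(obj::ManifoldObjective, x, v)
    # Assumed to return the objective value and the gradient-direction product at x.
    return NLSolversBase.value_jvp!(obj.inner_obj, x, v)
end
```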