Fix LBFGS/BFGS callback receiving Dual numbers instead of scalar loss values #1075
… values
Fixes SciML#1073
When using LBFGS or BFGS with bounds, Optim.jl wraps the optimizer in Fminbox, which may use ForwardDiff internally for gradient computation. This resulted in the callback receiving ForwardDiff.Dual numbers instead of scalar loss values, causing incorrect (sometimes negative) values to be reported.
Changes:
- Added ForwardDiff as a dependency in OptimizationOptimJL
- Added _scalar_value() utility function to extract scalar values from Dual numbers
- Updated all three _cb callback functions to extract scalar values before passing to user callbacks
- Added comprehensive test case verifying callbacks receive correct scalar non-negative values
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Investigation and Fix Process
Initial Analysis
Root Cause Analysis
The issue occurs in this flow:
Solution Implementation
Testing
Ran full test suite locally:
The new test specifically validates:
Code Formatting
Applied JuliaFormatter with SciMLStyle to all changed files per project conventions.
Ready for review! All tests passing locally.
Setting 1-hour timer for CI completion as per CLAUDE.md instructions...
Note: Local tests passed successfully (6870/6870 tests). Spell check failure is unrelated to this PR (pre-existing typo in OptimizationManopt).
CI Test Results Analysis
All CI test failures are unrelated to this PR:
Mooncake Precompilation Failures
All test failures show the same root cause:
This is a compatibility issue between the Mooncake package and Julia 1.12, completely unrelated to my changes, which only modify OptimizationOptimJL callback handling.
Spell Check Failure
The typo "procide" → "provide" is in OptimizationManopt.
My Changes
Local Test Results
All tests passed successfully locally:
The PR is ready for review. The CI failures are pre-existing infrastructure issues unrelated to this fix.
… values
Fixes SciML#1073
When using LBFGS or BFGS with bounds, Optim.jl wraps the optimizer in Fminbox, which may use ForwardDiff internally for gradient computation. This resulted in the callback receiving ForwardDiff.Dual numbers instead of scalar loss values, causing incorrect (sometimes negative) values to be reported.
Changes:
- Use SciMLBase.value() to extract scalar values from potentially Dual-valued traces
- Updated all three _cb callback functions to use SciMLBase.value()
- Added comprehensive test case verifying callbacks receive correct scalar non-negative values
- No new dependencies required (SciMLBase already provides the functionality)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
5d86e35 to e4721a1
Updated Implementation
Per feedback, I've updated the implementation to use SciMLBase.value().
Changes Made
Before:
After:
Code Changes
# Before
import ForwardDiff
_scalar_value(x) = x
_scalar_value(x::ForwardDiff.Dual) = ForwardDiff.value(x)
loss_val = _scalar_value(trace_state.value)
# After
loss_val = SciMLBase.value(trace_state.value)
Testing
All tests still pass:
This is a much cleaner solution that leverages the existing SciML ecosystem. Thank you for the suggestion!
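For readers unfamiliar with the failure mode, here is a minimal sketch of what the "Before" helper does to a Dual-valued loss. It assumes only that ForwardDiff is installed. The name `scalar_value` is hypothetical (it mirrors `_scalar_value` above), and the recursive unwrapping of nested Duals is an assumption of this sketch, not something the PR states.

```julia
import ForwardDiff

# Hypothetical helper mirroring `_scalar_value` from the "Before" code.
# Plain numbers pass through unchanged.
scalar_value(x) = x

# A Dual carries a primal value plus partial derivatives; keep only the primal.
# Recursing covers nested Duals (an assumption of this sketch).
scalar_value(x::ForwardDiff.Dual) = scalar_value(ForwardDiff.value(x))

# A loss of 2.5 that is being differentiated, carrying one partial of -4.0.
dual_loss = ForwardDiff.Dual(2.5, -4.0)

plain = scalar_value(dual_loss)  # primal part only, a Float64
same = scalar_value(0.75)        # already a scalar, returned as-is
```

The "After" version replaces this helper with `SciMLBase.value`, which removes the need for a direct ForwardDiff dependency in OptimizationOptimJL.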
Fixes #1073
Problem
When using LBFGS or BFGS with bounds, Optim.jl wraps the optimizer in Fminbox, which may use ForwardDiff internally for gradient computation. This resulted in the callback receiving ForwardDiff.Dual numbers instead of scalar loss values, causing incorrect (sometimes negative) values to be reported to the callback.
Root Cause
When bounds are specified with LBFGS/BFGS:
- The optimizer is wrapped in Optim.Fminbox (line 107)
- The trace values (trace_state.value and trace.value) can contain Dual numbers
Solution
- Added _scalar_value() utility function to safely extract scalar values from Dual numbers
- Updated all three _cb callback functions (lines 158-171, 279-298, 372-387) to extract scalar values before passing to user callbacks
Testing
Changes
- lib/OptimizationOptimJL/Project.toml: Added ForwardDiff to dependencies
- lib/OptimizationOptimJL/src/OptimizationOptimJL.jl: Added _scalar_value() utility function
- lib/OptimizationOptimJL/test/runtests.jl: Add test case for issue "(L-)BFGS with bounds reports negatives loss to callback" #1073
🤖 Generated with Claude Code
Co-Authored-By: Claude [email protected]
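The callback change and the kind of check described under Testing can be sketched as follows. This is an illustration only: it does not run Optim.jl, the trace values are simulated, and `wrap_callback`, `user_cb`, and `seen` are hypothetical names rather than identifiers from OptimizationOptimJL. It assumes ForwardDiff is installed.

```julia
import ForwardDiff

# Hypothetical wrapper: strips derivative information from a trace value
# before handing it to the user's callback.
function wrap_callback(user_cb)
    return function (state, trace_value)
        loss = trace_value isa ForwardDiff.Dual ?
               ForwardDiff.value(trace_value) : trace_value
        return user_cb(state, loss)
    end
end

# Record every loss the user callback sees; returning false means "keep going".
seen = Float64[]
user_cb = (state, loss) -> (push!(seen, loss); false)
cb = wrap_callback(user_cb)

# Simulated trace values: one plain Float64 and one Dual, standing in for
# what might leak out of a bounded (Fminbox) solve.
for v in (4.0, ForwardDiff.Dual(1.5, -3.0))
    cb(nothing, v)
end

# The properties the PR's test is described as verifying.
all_scalar = all(x -> x isa Float64, seen)
all_nonneg = all(>=(0), seen)
```

Unwrapping inside the wrapper, rather than asking each user callback to do it, keeps user code independent of which differentiation backend the bounded solver happens to use.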