Commit c1faa74

Document optimization state and callback functionality. Change initial_state to initial_x in the user-facing API and export initial_state.

1 parent 7934f7b

28 files changed: +324 −254 lines changed

docs/src/user/callbacks.md

Lines changed: 27 additions & 0 deletions

## Callbacks

Callbacks are functions that are called at certain points during the optimization process. They can be used to monitor progress, log information, or implement custom stopping criteria. Callbacks are called once per **iteration** of an algorithm, that is, each time the algorithm updates its current estimate of the solution and checks for convergence. This structure is not necessarily uniquely defined for all algorithms: we could, in principle, call the callback function within the line search algorithm, or for each sampled point in a derivative-free algorithm.

### Callback Function Example

We show a simple example of a callback function that prints the current objective value and state at each iteration.

```julia
using Optim

function my_callback(state)
    print("Objective Value: ", state.f_x)
    println(" at state x: ", state.x)
    return false # Return true to stop the optimization
end

function objective(x)
    return (x[1]-2)^2 + (x[2]-3)^2
end

initial_x = [0.0, 0.0]
method = BFGS()
options = Optim.Options(callback=my_callback)
d = OnceDifferentiable(objective, initial_x)

optstate = initial_state(method, options, d, initial_x)
result = optimize(d, initial_x, method, options, optstate)
```
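Callbacks can also implement a custom stopping criterion by returning `true`. A minimal sketch under the API shown above; the callback name and threshold are illustrative, not part of the committed docs:

```julia
using Optim

# Hypothetical early-stopping callback: halt once the objective
# drops below a threshold by returning true.
function early_stop_callback(state)
    return state.f_x < 1e-6
end

objective(x) = (x[1]-2)^2 + (x[2]-3)^2
options = Optim.Options(callback=early_stop_callback)
result = optimize(objective, [0.0, 0.0], BFGS(), options)
```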

docs/src/user/minimization.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -213,7 +213,7 @@ Defined for multivariate optimization:
 * `x_converged(res)`
 * `f_converged(res)`
 * `g_converged(res)`
-* `initial_state(res)`
+* `initial_x(res)`

 Defined for `NelderMead` with the option `trace_simplex=true`:
```

docs/src/user/optstate.md

Lines changed: 43 additions & 0 deletions

## Optimization State

Each algorithm in Optim.jl maintains an optimization state that encapsulates all relevant information about the current iteration of the optimization process. This state is represented by the subtypes of `Optim.OptimizationState` and contains various fields that provide insight into the progress of the optimization, as well as any information needed to maintain and update the search direction.

### Exceptions

Currently, there are two main exceptions to this structure:

- **SAMIN**: This algorithm is currently not written using the main `optimize` loop and does not maintain an `OptimizationState`.
- **Univariate Optimization Algorithms**: These algorithms do not use the `OptimizationState` structure, as they also do not use the main `optimize` loop.

These exceptions matter mostly for users who want to pre-allocate the `OptimizationState` for performance reasons; in those cases, users should check the documentation of the specific algorithm to see whether it supports pre-allocation. They also matter for users of the callback functionality, since callback functions receive the `OptimizationState` as an argument. If an algorithm does not use the `OptimizationState`, the callback instead receives a `NamedTuple` with the relevant information, so callback functions should not annotate their arguments with types from the `OptimizationState` hierarchy.
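Since some algorithms pass a `NamedTuple` rather than an `OptimizationState`, a portable callback can rely on property access alone. A sketch, assuming the `f_x` field used in the examples in these docs:

```julia
# Portable callback: no type annotation on the argument, so it accepts
# either an OptimizationState subtype or a NamedTuple with the same fields.
function portable_callback(state)
    println("current objective: ", state.f_x)
    return false  # never request early termination
end
```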
### Using the Optimization State

As mentioned above, the optimization state is passed to callback functions during the optimization process. Users can access various fields of the state to monitor progress or implement custom logic based on the current state of the optimization. It is also possible to pre-allocate the optimization state if users wish to re-use it across multiple optimization runs for performance reasons. This can be done using the `initial_state` function, which takes the optimization method, options, differentiable object, and initial parameters as arguments.

#### Initial State Example

```julia
using Optim

function objective(x)
    return (x[1]-2)^2 + (x[2]-3)^2
end

my_callback(state) = false  # trivial callback that never stops the run

initial_x = [0.0, 0.0]
method = BFGS()
options = Optim.Options(callback=my_callback)
d = OnceDifferentiable(objective, initial_x)

# Pre-allocate the optimization state
optstate = initial_state(method, options, d, initial_x)

# Verify that the state has the properties f_x and x
hasproperty(optstate, :f_x) # true
hasproperty(optstate, :x)   # true

result = optimize(d, initial_x, method, options, optstate)
```

After the optimization is complete, the state has been updated as part of the optimization process and contains information about the final iteration. Users can access fields of the state to retrieve information about the final state. For example, we can verify that the final objective value matches the value stored in the state.

```julia
@assert optstate.f_x == Optim.minimum(result)
```

src/Optim.jl

Lines changed: 1 addition & 1 deletion

```diff
@@ -85,9 +85,9 @@ export optimize,
     # Re-export constraint types from NLSolversBase
     TwiceDifferentiableConstraints,

-    # I don't think these should be here [pkofod]
     OptimizationState,
     OptimizationTrace,
+    initial_state,

     # Optimization algorithms
     ## Zeroth order methods (heuristics)
```

src/api.jl

Lines changed: 3 additions & 3 deletions

```diff
@@ -159,9 +159,9 @@ g_abstol(r::MultivariateOptimizationResults) = r.g_abstol
 g_residual(r::MultivariateOptimizationResults) = r.g_residual


-initial_state(r::OptimizationResults) =
-    error("initial_state is not implemented for $(summary(r)).")
-initial_state(r::MultivariateOptimizationResults) = r.initial_x
+initial_x(r::OptimizationResults) =
+    error("initial_x is not implemented for $(summary(r)).")
+initial_x(r::MultivariateOptimizationResults) = r.initial_x

 lower_bound(r::OptimizationResults) =
     error("lower_bound is not implemented for $(summary(r)).")
```
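After this rename, the starting point of a multivariate result is retrieved with `initial_x` rather than `initial_state`. A sketch re-using the example objective from the docs in this commit:

```julia
using Optim

objective(x) = (x[1]-2)^2 + (x[2]-3)^2
result = optimize(objective, [0.0, 0.0], BFGS())

# Renamed accessor: returns the stored starting point of the run.
Optim.initial_x(result)
```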

src/maximize.jl

Lines changed: 1 addition & 1 deletion

```diff
@@ -89,7 +89,7 @@ for api_method in (
     :rel_tol,
     :abs_tol,
     :iterations,
-    :initial_state,
+    :initial_x,
     :converged,
     :x_tol,
     :x_abstol,
```

src/multivariate/optimize/interface.jl

Lines changed: 36 additions & 36 deletions

```diff
@@ -39,7 +39,7 @@ fallback_method(d::OnceDifferentiable) = LBFGS()
 fallback_method(d::TwiceDifferentiable) = Newton()

 # promote the objective (tuple of callables or an AbstractObjective) according to method requirement
-promote_objtype(method, initial_x, autodiff::ADTypes.AbstractADType, inplace::Bool, args...) =
+promote_objtype(method, x0, autodiff::ADTypes.AbstractADType, inplace::Bool, args...) =
     error("No default objective type for $method and $args.")
 # actual promotions, notice that (args...) captures FirstOrderOptimizer and NonDifferentiable, etc
 promote_objtype(method::ZerothOrderOptimizer, x, autodiff::ADTypes.AbstractADType, inplace::Bool, args...) =
@@ -138,156 +138,156 @@ promote_objtype(
 # if no method or options are present
 function optimize(
     f,
-    initial_x::AbstractArray;
+    x0::AbstractArray;
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 )
     method = fallback_method(f)
-    d = promote_objtype(method, initial_x, autodiff, inplace, f)
+    d = promote_objtype(method, x0, autodiff, inplace, f)

     options = Options(; default_options(method)...)
-    optimize(d, initial_x, method, options)
+    optimize(d, x0, method, options)
 end
 function optimize(
     f,
     g,
-    initial_x::AbstractArray;
+    x0::AbstractArray;
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
     inplace::Bool = true,
 )

     method = fallback_method(f, g)

-    d = promote_objtype(method, initial_x, autodiff, inplace, f, g)
+    d = promote_objtype(method, x0, autodiff, inplace, f, g)

     options = Options(; default_options(method)...)
-    optimize(d, initial_x, method, options)
+    optimize(d, x0, method, options)
 end
 function optimize(
     f,
     g,
     h,
-    initial_x::AbstractArray;
+    x0::AbstractArray;
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 )
     method = fallback_method(f, g, h)
-    d = promote_objtype(method, initial_x, autodiff, inplace, f, g, h)
+    d = promote_objtype(method, x0, autodiff, inplace, f, g, h)

     options = Options(; default_options(method)...)
-    optimize(d, initial_x, method, options)
+    optimize(d, x0, method, options)
 end

 # no method supplied with objective
 function optimize(
     d::T,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     options::Options,
 ) where {T<:AbstractObjective}
-    optimize(d, initial_x, fallback_method(d), options)
+    optimize(d, x0, fallback_method(d), options)
 end
 # no method supplied with inplace and autodiff keywords becauase objective is not supplied
 function optimize(
     f,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     options::Options;
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 )
     method = fallback_method(f)
-    d = promote_objtype(method, initial_x, autodiff, inplace, f)
-    optimize(d, initial_x, method, options)
+    d = promote_objtype(method, x0, autodiff, inplace, f)
+    optimize(d, x0, method, options)
 end
 function optimize(
     f,
     g,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     options::Options;
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 )

     method = fallback_method(f, g)
-    d = promote_objtype(method, initial_x, autodiff, inplace, f, g)
-    optimize(d, initial_x, method, options)
+    d = promote_objtype(method, x0, autodiff, inplace, f, g)
+    optimize(d, x0, method, options)
 end
 function optimize(
     f,
     g,
     h,
-    initial_x::AbstractArray{T},
+    x0::AbstractArray{T},
     options::Options;
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 ) where {T}
     method = fallback_method(f, g, h)
-    d = promote_objtype(method, initial_x, autodiff, inplace, f, g, h)
+    d = promote_objtype(method, x0, autodiff, inplace, f, g, h)

-    optimize(d, initial_x, method, options)
+    optimize(d, x0, method, options)
 end

 # potentially everything is supplied (besides caches)
 function optimize(
     f,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     method::AbstractOptimizer,
     options::Options = Options(; default_options(method)...);
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 )
-    d = promote_objtype(method, initial_x, autodiff, inplace, f)
-    optimize(d, initial_x, method, options)
+    d = promote_objtype(method, x0, autodiff, inplace, f)
+    optimize(d, x0, method, options)
 end
 function optimize(
     f,
     c::AbstractConstraints,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     method::AbstractOptimizer,
     options::Options = Options(; default_options(method)...);
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 )

-    d = promote_objtype(method, initial_x, autodiff, inplace, f)
-    optimize(d, c, initial_x, method, options)
+    d = promote_objtype(method, x0, autodiff, inplace, f)
+    optimize(d, c, x0, method, options)
 end
 function optimize(
     f,
     g,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     method::AbstractOptimizer,
     options::Options = Options(; default_options(method)...);
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 )
-    d = promote_objtype(method, initial_x, autodiff, inplace, f, g)
+    d = promote_objtype(method, x0, autodiff, inplace, f, g)

-    optimize(d, initial_x, method, options)
+    optimize(d, x0, method, options)
 end
 function optimize(
     f,
     g,
     h,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     method::AbstractOptimizer,
     options::Options = Options(; default_options(method)...);
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,

 )
-    d = promote_objtype(method, initial_x, autodiff, inplace, f, g, h)
+    d = promote_objtype(method, x0, autodiff, inplace, f, g, h)

-    optimize(d, initial_x, method, options)
+    optimize(d, x0, method, options)
 end

 function optimize(
     d::D,
-    initial_x::AbstractArray,
+    x0::AbstractArray,
     method::SecondOrderOptimizer,
     options::Options = Options(; default_options(method)...);
     inplace::Bool = true,
     autodiff::ADTypes.AbstractADType = DEFAULT_AD_TYPE,
 ) where {D<:Union{NonDifferentiable,OnceDifferentiable}}
-    d = promote_objtype(method, initial_x, autodiff, inplace, d)
-    optimize(d, initial_x, method, options)
+    d = promote_objtype(method, x0, autodiff, inplace, d)
+    optimize(d, x0, method, options)
 end
```
